This guide will help you instrument your first AI application with OpenInference and visualize traces in Phoenix.
Prerequisites
Choose your language:
Python:
Python 3.9 or higher
An OpenAI API key (or another LLM provider)
JavaScript:
Node.js 18 or higher
npm or yarn
An OpenAI API key (or another LLM provider)
Java:
Java 11 or higher
Gradle or Maven
An OpenAI API key (or another LLM provider)
Installation
Install OpenInference instrumentation
Install the OpenInference instrumentation library for your framework.
Python:
pip install openinference-instrumentation-openai "openai>=1.26" arize-phoenix opentelemetry-sdk opentelemetry-exporter-otlp
Replace openinference-instrumentation-openai with the instrumentation for your framework:
LangChain: openinference-instrumentation-langchain
LlamaIndex: openinference-instrumentation-llama-index
Anthropic: openinference-instrumentation-anthropic
See all Python instrumentations
JavaScript:
npm install --save @arizeai/openinference-instrumentation-openai @opentelemetry/sdk-trace-node @opentelemetry/exporter-trace-otlp-http @opentelemetry/resources @opentelemetry/instrumentation openai
Replace @arizeai/openinference-instrumentation-openai with the instrumentation for your framework:
LangChain: @arizeai/openinference-instrumentation-langchain
Anthropic: @arizeai/openinference-instrumentation-anthropic
Bedrock: @arizeai/openinference-instrumentation-bedrock
See all JavaScript instrumentations
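Java (Gradle):
dependencies {
    implementation 'com.arize:openinference-instrumentation-langchain4j:0.1.+'
    implementation 'io.opentelemetry:opentelemetry-api:1.49.0'
    implementation 'io.opentelemetry:opentelemetry-sdk:1.49.0'
    implementation 'io.opentelemetry:opentelemetry-exporter-otlp:1.49.0'
}
Java (Maven):
<dependencies>
    <dependency>
        <groupId>com.arize</groupId>
        <artifactId>openinference-instrumentation-langchain4j</artifactId>
        <!-- Maven has no "0.1.+" syntax; this range is the equivalent of the Gradle version above -->
        <version>[0.1,0.2)</version>
    </dependency>
    <dependency>
        <groupId>io.opentelemetry</groupId>
        <artifactId>opentelemetry-api</artifactId>
        <version>1.49.0</version>
    </dependency>
    <dependency>
        <groupId>io.opentelemetry</groupId>
        <artifactId>opentelemetry-sdk</artifactId>
        <version>1.49.0</version>
    </dependency>
    <dependency>
        <groupId>io.opentelemetry</groupId>
        <artifactId>opentelemetry-exporter-otlp</artifactId>
        <version>1.49.0</version>
    </dependency>
</dependencies>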
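For the Python install, a quick way to confirm that the packages landed in the active environment is to look them up by distribution name. The sketch below uses only the standard library; the `missing_packages` helper and the `REQUIRED` list are illustrative names, and the list simply mirrors the pip command in this guide.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from the pip install command in this guide.
REQUIRED = [
    "openinference-instrumentation-openai",
    "openai",
    "arize-phoenix",
    "opentelemetry-sdk",
    "opentelemetry-exporter-otlp",
]


def missing_packages(names):
    """Return the distribution names that are not installed in this environment."""
    missing = []
    for name in names:
        try:
            version(name)  # raises PackageNotFoundError if the distribution is absent
        except PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("Missing:", ", ".join(missing) if missing else "none")
```

If anything is reported missing, re-run the install command inside the same virtual environment you use to run the app.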
Start Phoenix server
Phoenix is an open-source AI observability platform that runs entirely on your machine. Start the server to collect traces:
python -m phoenix.server.main serve
Phoenix will start on http://localhost:6006. Open this URL in your browser.
Phoenix requires Python even when instrumenting JavaScript or Java apps. Install it with pip install arize-phoenix.
The Phoenix server does not send data over the internet — all traces stay on your machine.
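Before running an instrumented app, it can help to confirm that something is listening on the Phoenix port, since spans sent to a closed port are not collected. The following is a minimal sketch using a plain TCP connect. The `phoenix_is_reachable` helper is a hypothetical name, and the check only proves that a process accepts connections on the port, not that it is Phoenix.

```python
import socket


def phoenix_is_reachable(host="127.0.0.1", port=6006, timeout=1.0):
    """Return True if a process accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if phoenix_is_reachable():
        print("Port 6006 is accepting connections")
    else:
        print("Nothing is listening on port 6006; start the Phoenix server first")
```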
Set your API key
Set your OpenAI API key as an environment variable (no spaces around the = sign):
export OPENAI_API_KEY="your-api-key-here"
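A missing key usually surfaces only when the first request fails, so a small guard at startup can give a clearer message. This is a sketch under the assumption that the key is read from the environment as described above; `require_env` is an illustrative helper, not part of OpenInference or the OpenAI SDK.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the app")
    return value


if __name__ == "__main__":
    try:
        require_env("OPENAI_API_KEY")
        print("OPENAI_API_KEY is set")
    except RuntimeError as err:
        print(err)
```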
Instrument your application
Create a file with the following code to instrument your first LLM call.
Create app.py (Python):
import openai
from openinference.instrumentation.openai import OpenAIInstrumentor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk import trace as trace_sdk
from opentelemetry.sdk.trace.export import SimpleSpanProcessor

# Configure OpenTelemetry to send traces to Phoenix
endpoint = "http://127.0.0.1:6006/v1/traces"
tracer_provider = trace_sdk.TracerProvider()
tracer_provider.add_span_processor(SimpleSpanProcessor(OTLPSpanExporter(endpoint)))

# Instrument OpenAI
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

if __name__ == "__main__":
    # Make an OpenAI call - it will be automatically traced
    client = openai.OpenAI()
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Write a haiku about observability."}],
        max_tokens=50,
    )
    print(response.choices[0].message.content)
Run the application:
python app.py
Create instrumentation.js (JavaScript):
const { NodeTracerProvider, SimpleSpanProcessor } = require('@opentelemetry/sdk-trace-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
const { OpenAIInstrumentation } = require('@arizeai/openinference-instrumentation-openai');
const { registerInstrumentations } = require('@opentelemetry/instrumentation');

// Configure OpenTelemetry to send traces to Phoenix
const provider = new NodeTracerProvider();
provider.addSpanProcessor(
  new SimpleSpanProcessor(
    new OTLPTraceExporter({
      url: 'http://127.0.0.1:6006/v1/traces',
    })
  )
);
provider.register();

// Register OpenAI instrumentation
registerInstrumentations({
  instrumentations: [new OpenAIInstrumentation()],
});
Create app.js:
const OpenAI = require('openai');

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

async function main() {
  const response = await client.chat.completions.create({
    model: 'gpt-3.5-turbo',
    messages: [{ role: 'user', content: 'Write a haiku about observability.' }],
    max_tokens: 50,
  });
  console.log(response.choices[0].message.content);
}

main();
Run the application:
node -r ./instrumentation.js app.js
The instrumentation file must be required before your application code runs. Use the -r flag to ensure it loads first.
Create Main.java (Java):
import io.openinference.instrumentation.langchain4j.LangChain4jInstrumentor;
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.exporter.otlp.http.trace.OtlpHttpSpanExporter;
import io.opentelemetry.sdk.OpenTelemetrySdk;
import io.opentelemetry.sdk.trace.SdkTracerProvider;
import io.opentelemetry.sdk.trace.export.SimpleSpanProcessor;
import dev.langchain4j.model.openai.OpenAiChatModel;

public class Main {
    public static void main(String[] args) {
        // Configure OpenTelemetry to send traces to Phoenix
        SdkTracerProvider tracerProvider = SdkTracerProvider.builder()
            .addSpanProcessor(
                SimpleSpanProcessor.create(
                    OtlpHttpSpanExporter.builder()
                        .setEndpoint("http://127.0.0.1:6006/v1/traces")
                        .build()
                )
            )
            .build();

        OpenTelemetrySdk.builder()
            .setTracerProvider(tracerProvider)
            .buildAndRegisterGlobal();

        // Auto-instrument LangChain4j
        LangChain4jInstrumentor.instrument();

        // Make an LLM call - it will be automatically traced
        OpenAiChatModel model = OpenAiChatModel.builder()
            .apiKey(System.getenv("OPENAI_API_KEY"))
            .modelName("gpt-3.5-turbo")
            .build();

        String response = model.generate("Write a haiku about observability.");
        System.out.println(response);
    }
}
Run the application with your build tool (Gradle or Maven), using Main as the entry point.
View traces in Phoenix
Open Phoenix in your browser at http://localhost:6006. You’ll see:
Traces view: All LLM calls with timing, token counts, and costs
Span details: Input messages, output messages, model parameters
Timeline: Visual representation of the execution flow
What’s captured?
OpenInference automatically captures:
Messages: Full conversation history, including system prompts, user messages, and assistant responses
Token counts: Prompt tokens, completion tokens, cached tokens, and reasoning tokens
Model parameters: Temperature, max tokens, top-p, and other invocation parameters
Costs: Estimated costs for prompt and completion tokens in USD
Timing: Start time, end time, and duration with nanosecond precision
Errors: Exception messages and stack traces when calls fail
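To make the list above concrete, here is a rough sketch of how an LLM call might be flattened into span attributes. The attribute keys follow the OpenInference semantic conventions as commonly documented (for example `llm.model_name` and `llm.token_count.prompt`), but treat the exact set as an assumption and check the specification for your version. `llm_span_attributes` is a hypothetical helper written for illustration, not the instrumentor's actual code.

```python
import json


def llm_span_attributes(model, messages, invocation_parameters,
                        prompt_tokens, completion_tokens):
    """Flatten one chat call into a dict of span attributes (illustrative only)."""
    attrs = {
        "openinference.span.kind": "LLM",
        "llm.model_name": model,
        # Invocation parameters are stored as a single JSON string
        "llm.invocation_parameters": json.dumps(invocation_parameters),
        "llm.token_count.prompt": prompt_tokens,
        "llm.token_count.completion": completion_tokens,
        "llm.token_count.total": prompt_tokens + completion_tokens,
    }
    # Each input message gets its own indexed pair of keys
    for i, message in enumerate(messages):
        prefix = f"llm.input_messages.{i}.message"
        attrs[f"{prefix}.role"] = message["role"]
        attrs[f"{prefix}.content"] = message["content"]
    return attrs


if __name__ == "__main__":
    example = llm_span_attributes(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Write a haiku about observability."}],
        invocation_parameters={"max_tokens": 50},
        prompt_tokens=14,       # made-up numbers for the example
        completion_tokens=17,
    )
    for key, value in example.items():
        print(f"{key} = {value!r}")
```

Flat, dotted keys like these are what you see in the span details view, which is why nested structures such as message lists appear as indexed entries.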
Advanced example with context
Add session tracking, user IDs, and custom metadata to your traces:
Python:
from openinference.instrumentation import using_attributes

# `client` is the openai.OpenAI() instance created in app.py
with using_attributes(
    session_id="user-session-123",
    user_id="user-456",
    metadata={
        "environment": "production",
        "version": "1.0.0",
    },
    tags=["chat", "customer-support"],
):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "How do I reset my password?"}],
    )
JavaScript:
const { context, trace } = require('@opentelemetry/api');
const { SemanticConventions } = require('@arizeai/openinference-semantic-conventions');

const span = trace.getTracer('my-app').startSpan('chat');
span.setAttribute(SemanticConventions.SESSION_ID, 'user-session-123');
span.setAttribute(SemanticConventions.USER_ID, 'user-456');
span.setAttribute(SemanticConventions.METADATA, JSON.stringify({
  environment: 'production',
  version: '1.0.0',
}));
span.setAttribute(SemanticConventions.TAG_TAGS, JSON.stringify(['chat', 'customer-support']));

context.with(trace.setSpan(context.active(), span), async () => {
  try {
    const response = await client.chat.completions.create({
      model: 'gpt-3.5-turbo',
      messages: [{ role: 'user', content: 'How do I reset my password?' }],
    });
  } finally {
    // End the span even if the request throws
    span.end();
  }
});
Java:
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.context.Scope;
import io.openinference.semconv.trace.SpanAttributes;

Span span = GlobalOpenTelemetry.getTracer("my-app").spanBuilder("chat").startSpan();
span.setAttribute(SpanAttributes.SESSION_ID, "user-session-123");
span.setAttribute(SpanAttributes.USER_ID, "user-456");
span.setAttribute(SpanAttributes.METADATA, "{\"environment\": \"production\", \"version\": \"1.0.0\"}");

// Make the span current so the instrumented LLM call is recorded as its child
try (Scope scope = span.makeCurrent()) {
    String response = model.generate("How do I reset my password?");
    System.out.println(response);
} finally {
    span.end();
}
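The snippets in this section all aim to attach the same handful of attributes to a trace. As a rough illustration of the resulting key/value pairs, the sketch below builds them by hand. The keys (`session.id`, `user.id`, `metadata`, `tag.tags`) are assumed from the OpenInference semantic conventions, and `context_attributes` is a hypothetical helper, not a function exported by the library.

```python
import json


def context_attributes(session_id=None, user_id=None, metadata=None, tags=None):
    """Build the span attributes implied by session, user, metadata, and tags."""
    attrs = {}
    if session_id is not None:
        attrs["session.id"] = session_id
    if user_id is not None:
        attrs["user.id"] = user_id
    if metadata is not None:
        # Metadata is a nested mapping, so it is stored as one JSON string
        attrs["metadata"] = json.dumps(metadata)
    if tags is not None:
        attrs["tag.tags"] = list(tags)
    return attrs


if __name__ == "__main__":
    print(context_attributes(
        session_id="user-session-123",
        user_id="user-456",
        metadata={"environment": "production", "version": "1.0.0"},
        tags=["chat", "customer-support"],
    ))
```

Only the values you pass are set, so a call that supplies just a session ID yields a single `session.id` attribute.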
Streaming example
OpenInference supports streaming LLM responses:
Python:
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a story about AI."}],
    stream=True,
    stream_options={"include_usage": True},  # Required for token counts
)
for chunk in response:
    if chunk.choices and (content := chunk.choices[0].delta.content):
        print(content, end="")
Set stream_options={"include_usage": True} to capture token counts when streaming (requires openai>=1.26).
JavaScript:
const stream = await client.chat.completions.create({
  model: 'gpt-3.5-turbo',
  messages: [{ role: 'user', content: 'Write a story about AI.' }],
  stream: true,
});
for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) {
    process.stdout.write(content);
  }
}
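When usage reporting is enabled on a stream, the OpenAI API sends the token usage in a final chunk whose `choices` list is empty, which is why the Python loop above guards on `chunk.choices`. The sketch below shows one way to collect both the text and the usage from such a stream. It runs against stand-in chunk objects so that it needs no API key; `collect_stream` and `fake_chunk` are hypothetical helpers, and the token numbers are invented.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join streamed text deltas and return (text, usage), where usage may be None."""
    parts = []
    usage = None
    for chunk in chunks:
        # Content chunks carry a delta; the usage chunk has an empty choices list
        if chunk.choices and (content := chunk.choices[0].delta.content):
            parts.append(content)
        if getattr(chunk, "usage", None) is not None:
            usage = chunk.usage
    return "".join(parts), usage


def fake_chunk(content=None, usage=None):
    """Build a stand-in for a streamed chunk (test helper, not an SDK type)."""
    choices = []
    if content is not None:
        choices = [SimpleNamespace(delta=SimpleNamespace(content=content))]
    return SimpleNamespace(choices=choices, usage=usage)


if __name__ == "__main__":
    stream = [
        fake_chunk("Once upon "),
        fake_chunk("a time."),
        fake_chunk(usage=SimpleNamespace(prompt_tokens=12, completion_tokens=5, total_tokens=17)),
    ]
    text, usage = collect_stream(stream)
    print(text)
    print("total tokens:", usage.total_tokens if usage else "not reported")
```

With a real client, pass the object returned by `client.chat.completions.create(..., stream=True)` in place of the stand-in list.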
Next steps
Python instrumentations: Explore all 30+ Python instrumentation libraries
JavaScript instrumentations: Explore JavaScript/TypeScript instrumentations
Privacy controls: Configure data masking and PII protection
Concepts: Learn about traces, spans, and attributes