Quickstart - OpenInference

This guide will help you instrument your first AI application with OpenInference and visualize traces in Phoenix.

Prerequisites

Python 3.9 or higher
An OpenAI API key (or another LLM provider)

Installation

Install OpenInference instrumentation

Install the OpenInference instrumentation library for your framework:

Python:

pip install openinference-instrumentation-openai "openai>=1.26" arize-phoenix opentelemetry-sdk opentelemetry-exporter-otlp

Replace openinference-instrumentation-openai with the instrumentation for your framework:

LangChain: openinference-instrumentation-langchain
LlamaIndex: openinference-instrumentation-llama-index
Anthropic: openinference-instrumentation-anthropic
See all Python instrumentations

JavaScript:

npm install --save @arizeai/openinference-instrumentation-openai @opentelemetry/sdk-trace-node @opentelemetry/exporter-trace-otlp-http @opentelemetry/resources @opentelemetry/instrumentation openai

Replace @arizeai/openinference-instrumentation-openai with the instrumentation for your framework:

LangChain: @arizeai/openinference-instrumentation-langchain
Anthropic: @arizeai/openinference-instrumentation-anthropic
Bedrock: @arizeai/openinference-instrumentation-bedrock
See all JavaScript instrumentations

Java (Gradle):

dependencies {
    implementation 'com.arize:openinference-instrumentation-langchain4j:0.1.+'
    implementation 'io.opentelemetry:opentelemetry-api:1.49.0'
    implementation 'io.opentelemetry:opentelemetry-sdk:1.49.0'
    implementation 'io.opentelemetry:opentelemetry-exporter-otlp:1.49.0'
}

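Java (Maven):

<dependencies>
    <dependency>
        <groupId>com.arize</groupId>
        <artifactId>openinference-instrumentation-langchain4j</artifactId>
        <!-- Maven does not understand Gradle's "0.1.+" syntax; this range is intended to match 0.1.x releases -->
        <version>[0.1,0.2)</version>
    </dependency>
    <dependency>
        <groupId>io.opentelemetry</groupId>
        <artifactId>opentelemetry-api</artifactId>
        <version>1.49.0</version>
    </dependency>
    <dependency>
        <groupId>io.opentelemetry</groupId>
        <artifactId>opentelemetry-sdk</artifactId>
        <version>1.49.0</version>
    </dependency>
    <dependency>
        <groupId>io.opentelemetry</groupId>
        <artifactId>opentelemetry-exporter-otlp</artifactId>
        <version>1.49.0</version>
    </dependency>
</dependencies>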

Start Phoenix server

Phoenix is an open-source AI observability platform that runs entirely on your machine. Start the server to collect traces:

python -m phoenix.server.main serve

Phoenix will start on http://localhost:6006. Open this URL in your browser.

Phoenix requires Python even when instrumenting JavaScript or Java apps. Install it with pip install arize-phoenix.

The Phoenix server does not send data over the internet — all traces stay on your machine.
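
Before running an instrumented app, it can help to confirm that something is actually listening on the Phoenix port. The following is a minimal sketch using only the Python standard library. The helper name is_port_open is hypothetical and not part of Phoenix; the host and port are taken from the URL above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Host and port come from the Phoenix URL shown above.
    if is_port_open("127.0.0.1", 6006):
        print("Something is listening on port 6006")
    else:
        print("Nothing is listening on port 6006; start Phoenix first")
```

An open port only shows that a process is listening there, not that it is Phoenix, so treat this as a quick sanity check rather than a health check.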

Set your API key

Set your OpenAI API key as an environment variable:

export OPENAI_API_KEY="your-api-key-here"
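
A missing key usually surfaces only when the first request fails, so a small check at startup can give a clearer error. This is a sketch under the assumption that the key lives in OPENAI_API_KEY as shown above; require_api_key is a hypothetical helper, not part of the OpenAI or OpenInference libraries.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str], name: str = "OPENAI_API_KEY") -> str:
    """Return the key stored under `name`, or raise if it is missing or blank."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the app")
    return value


if __name__ == "__main__":
    try:
        require_api_key(os.environ)
        print("OPENAI_API_KEY is set")
    except RuntimeError as exc:
        print(exc)
```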

Instrument your application

Create a file with the following code to instrument your first LLM call:

Python:

Create app.py:

import openai
from openinference.instrumentation.openai import OpenAIInstrumentor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk import trace as trace_sdk
from opentelemetry.sdk.trace.export import SimpleSpanProcessor

# Configure OpenTelemetry to send traces to Phoenix
endpoint = "http://127.0.0.1:6006/v1/traces"
tracer_provider = trace_sdk.TracerProvider()
tracer_provider.add_span_processor(SimpleSpanProcessor(OTLPSpanExporter(endpoint)))

# Instrument OpenAI
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

if __name__ == "__main__":
    # Make an OpenAI call - it will be automatically traced
    client = openai.OpenAI()
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Write a haiku about observability."}],
        max_tokens=50,
    )
    print(response.choices[0].message.content)

Run the application:

python app.py
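
app.py hard-codes the Phoenix endpoint. If Phoenix may run elsewhere, one option is to read the endpoint from the environment and fall back to the local default. The sketch below assumes the standard OpenTelemetry variable name OTEL_EXPORTER_OTLP_TRACES_ENDPOINT; resolve_traces_endpoint is a hypothetical helper written for illustration.

```python
import os
from typing import Mapping

# Same value as the endpoint used in app.py.
DEFAULT_ENDPOINT = "http://127.0.0.1:6006/v1/traces"


def resolve_traces_endpoint(env: Mapping[str, str]) -> str:
    """Pick the OTLP traces endpoint from the environment, else the local default."""
    override = env.get("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT", "").strip()
    return override or DEFAULT_ENDPOINT


if __name__ == "__main__":
    print(resolve_traces_endpoint(os.environ))
```

The returned string could then be passed to OTLPSpanExporter in place of the literal endpoint.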

JavaScript:

Create instrumentation.js:

const { NodeTracerProvider, SimpleSpanProcessor } = require('@opentelemetry/sdk-trace-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
const { OpenAIInstrumentation } = require('@arizeai/openinference-instrumentation-openai');
const { registerInstrumentations } = require('@opentelemetry/instrumentation');

// Configure OpenTelemetry to send traces to Phoenix.
// Span processors are passed to the constructor because
// provider.addSpanProcessor() was removed in OpenTelemetry JS SDK 2.x.
const provider = new NodeTracerProvider({
  spanProcessors: [
    new SimpleSpanProcessor(
      new OTLPTraceExporter({
        url: 'http://127.0.0.1:6006/v1/traces',
      })
    ),
  ],
});
provider.register();

// Register OpenAI instrumentation
registerInstrumentations({
  instrumentations: [new OpenAIInstrumentation()],
});

Create app.js:

const OpenAI = require('openai');

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

async function main() {
  const response = await client.chat.completions.create({
    model: 'gpt-3.5-turbo',
    messages: [{ role: 'user', content: 'Write a haiku about observability.' }],
    max_tokens: 50,
  });
  console.log(response.choices[0].message.content);
}

main();

Run the application:

node -r ./instrumentation.js app.js

The instrumentation file must be required before your application code runs. Use the -r flag to ensure it loads first.
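
Load order matters because instrumentation generally works by replacing functions on a library at runtime, so code that grabbed a reference before the replacement keeps calling the original. The following Python sketch illustrates that general idea with a stand-in object. The names fake_sdk, traced, and calls are all hypothetical, and this is not how OpenInference itself is implemented.

```python
import types

calls = []  # records which prompts went through the wrapper

# Stand-in for a third-party SDK module (hypothetical).
fake_sdk = types.SimpleNamespace(complete=lambda prompt: f"echo: {prompt}")


def traced(fn):
    """Wrap fn so each call is recorded before delegating to the original."""
    def wrapper(prompt):
        calls.append(prompt)
        return fn(prompt)
    return wrapper


# A reference captured BEFORE instrumentation bypasses the wrapper.
early_ref = fake_sdk.complete

# "Instrument" the SDK by replacing the attribute.
fake_sdk.complete = traced(fake_sdk.complete)

# A reference captured AFTER instrumentation goes through the wrapper.
late_ref = fake_sdk.complete

early_ref("first")   # not recorded
late_ref("second")   # recorded
print(calls)         # prints ['second']
```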

Java:

Create Main.java:

import io.openinference.instrumentation.langchain4j.LangChain4jInstrumentor;
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.exporter.otlp.http.trace.OtlpHttpSpanExporter;
import io.opentelemetry.sdk.OpenTelemetrySdk;
import io.opentelemetry.sdk.trace.SdkTracerProvider;
import io.opentelemetry.sdk.trace.export.SimpleSpanProcessor;
import dev.langchain4j.model.openai.OpenAiChatModel;

public class Main {
    public static void main(String[] args) {
        // Configure OpenTelemetry to send traces to Phoenix
        SdkTracerProvider tracerProvider = SdkTracerProvider.builder()
            .addSpanProcessor(
                SimpleSpanProcessor.create(
                    OtlpHttpSpanExporter.builder()
                        .setEndpoint("http://127.0.0.1:6006/v1/traces")
                        .build()
                )
            )
            .build();

        OpenTelemetrySdk.builder()
            .setTracerProvider(tracerProvider)
            .buildAndRegisterGlobal();

        // Auto-instrument LangChain4j
        LangChain4jInstrumentor.instrument();

        // Make an LLM call - it will be automatically traced
        OpenAiChatModel model = OpenAiChatModel.builder()
            .apiKey(System.getenv("OPENAI_API_KEY"))
            .modelName("gpt-3.5-turbo")
            .build();

        String response = model.generate("Write a haiku about observability.");
        System.out.println(response);
    }
}

Run the application:

./gradlew run

View traces in Phoenix

Open Phoenix in your browser at http://localhost:6006. You’ll see:

Traces view: All LLM calls with timing, token counts, and costs
Span details: Input messages, output messages, model parameters
Timeline: Visual representation of the execution flow
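
The cost figures are estimates derived from token counts. As a rough sketch of the arithmetic, assuming per-million-token prices that you supply yourself: the prices below are made-up placeholders, not real provider pricing, and estimate_cost_usd is a hypothetical helper rather than a Phoenix API.

```python
def estimate_cost_usd(
    prompt_tokens: int,
    completion_tokens: int,
    prompt_price_per_million: float,
    completion_price_per_million: float,
) -> float:
    """Estimate the cost of one call from its token counts and per-million prices."""
    prompt_cost = prompt_tokens / 1_000_000 * prompt_price_per_million
    completion_cost = completion_tokens / 1_000_000 * completion_price_per_million
    return prompt_cost + completion_cost


if __name__ == "__main__":
    # Placeholder prices (USD per million tokens), for illustration only.
    cost = estimate_cost_usd(1200, 300, 0.50, 1.50)
    print(f"estimated cost: ${cost:.6f}")
```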

What’s captured?

OpenInference automatically captures:

Messages

Full conversation history including system prompts, user messages, and assistant responses

Token counts

Prompt tokens, completion tokens, cached tokens, and reasoning tokens

Model parameters

Temperature, max tokens, top-p, and other invocation parameters

Costs

Estimated costs for prompt and completion tokens in USD

Timing

Start time, end time, and duration with nanosecond precision

Errors

Exception messages and stack traces when calls fail
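
These values are stored as flat span attributes. The sketch below shows how a list of chat messages can be flattened into indexed keys in the style of the OpenInference semantic conventions (for example llm.input_messages.0.message.role). The function flatten_messages is hypothetical and written only for illustration; the instrumentation libraries do this for you.

```python
from typing import Dict, List


def flatten_messages(
    messages: List[Dict[str, str]],
    prefix: str = "llm.input_messages",
) -> Dict[str, str]:
    """Flatten chat messages into indexed, dot-separated attribute keys."""
    attributes: Dict[str, str] = {}
    for index, message in enumerate(messages):
        for field in ("role", "content"):
            if field in message:
                attributes[f"{prefix}.{index}.message.{field}"] = message[field]
    return attributes


if __name__ == "__main__":
    example = [
        {"role": "system", "content": "You are a poet."},
        {"role": "user", "content": "Write a haiku about observability."},
    ]
    for key, value in flatten_messages(example).items():
        print(f"{key} = {value}")
```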

Advanced example with context

Add session tracking, user IDs, and custom metadata to your traces:

Python:

from openinference.instrumentation import using_attributes

with using_attributes(
    session_id="user-session-123",
    user_id="user-456",
    metadata={
        "environment": "production",
        "version": "1.0.0",
    },
    tags=["chat", "customer-support"],
):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "How do I reset my password?"}],
    )

JavaScript:

const { context, trace } = require('@opentelemetry/api');
const { SemanticConventions } = require('@arizeai/openinference-semantic-conventions');

const span = trace.getTracer('my-app').startSpan('chat');
span.setAttribute(SemanticConventions.SESSION_ID, 'user-session-123');
span.setAttribute(SemanticConventions.USER_ID, 'user-456');
// Metadata is stored as a JSON string
span.setAttribute(SemanticConventions.METADATA, JSON.stringify({
  environment: 'production',
  version: '1.0.0',
}));
// Tags are a list of strings, so pass the array directly
span.setAttribute(SemanticConventions.TAG_TAGS, ['chat', 'customer-support']);

context.with(trace.setSpan(context.active(), span), async () => {
  try {
    const response = await client.chat.completions.create({
      model: 'gpt-3.5-turbo',
      messages: [{ role: 'user', content: 'How do I reset my password?' }],
    });
    console.log(response.choices[0].message.content);
  } finally {
    // End the span even if the request throws
    span.end();
  }
});

Java:

import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.context.Scope;
import io.openinference.semconv.trace.SpanAttributes;

Span span = GlobalOpenTelemetry.getTracer("my-app").spanBuilder("chat").startSpan();
span.setAttribute(SpanAttributes.SESSION_ID, "user-session-123");
span.setAttribute(SpanAttributes.USER_ID, "user-456");
span.setAttribute(SpanAttributes.METADATA, "{\"environment\": \"production\", \"version\": \"1.0.0\"}");

// Make the span current so the LLM call is recorded as its child
try (Scope scope = span.makeCurrent()) {
    String response = model.generate("How do I reset my password?");
    System.out.println(response);
} finally {
    span.end();
}
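
Context managers such as using_attributes apply attributes to work done inside the block and stop applying them when the block exits. The following is a sketch of that general pattern using the standard library's contextvars module. It is not the OpenInference implementation, and with_attributes and current_attributes are hypothetical names.

```python
import contextvars
from contextlib import contextmanager
from typing import Dict, Iterator

# Holds the attributes that apply to the current execution context.
_attributes = contextvars.ContextVar("attributes", default={})


@contextmanager
def with_attributes(**attrs: str) -> Iterator[None]:
    """Merge attrs into the active attributes for the duration of the block."""
    merged = {**_attributes.get(), **attrs}
    token = _attributes.set(merged)
    try:
        yield
    finally:
        # Restore whatever was active before the block was entered.
        _attributes.reset(token)


def current_attributes() -> Dict[str, str]:
    """Return a copy of the attributes active in the current context."""
    return dict(_attributes.get())


if __name__ == "__main__":
    with with_attributes(session_id="user-session-123"):
        with with_attributes(user_id="user-456"):
            print(current_attributes())  # both attributes are visible here
        print(current_attributes())      # only session_id remains
    print(current_attributes())          # empty again outside the block
```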

Streaming example

OpenInference supports streaming LLM responses:

Python:

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a story about AI."}],
    stream=True,
    stream_options={"include_usage": True},  # Required for token counts
)

for chunk in response:
    if chunk.choices and (content := chunk.choices[0].delta.content):
        print(content, end="")

Set stream_options={"include_usage": True} to capture token counts when streaming (requires openai>=1.26).

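JavaScript:

const stream = await client.chat.completions.create({
  model: 'gpt-3.5-turbo',
  messages: [{ role: 'user', content: 'Write a story about AI.' }],
  stream: true,
  stream_options: { include_usage: true }, // Required for token counts
});

for await (const chunk of stream) {
  // The final usage chunk has an empty choices array, hence the optional chaining
  const content = chunk.choices[0]?.delta?.content;
  if (content) {
    process.stdout.write(content);
  }
}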
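
When include_usage is enabled, token usage arrives on a final chunk whose choices list is empty, which is why streaming loops need to tolerate empty choices. The sketch below replays that shape with plain dictionaries so it runs without network access. The chunk contents are made up, and collect_stream is a hypothetical helper, not part of the OpenAI SDK.

```python
from typing import Dict, Iterable, Optional, Tuple


def collect_stream(chunks: Iterable[dict]) -> Tuple[str, Optional[Dict[str, int]]]:
    """Join streamed text deltas and return them with the usage block, if any."""
    parts = []
    usage = None
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if choices:
            content = choices[0].get("delta", {}).get("content")
            if content:
                parts.append(content)
        if chunk.get("usage"):
            # With include_usage, this appears on the last chunk only.
            usage = chunk["usage"]
    return "".join(parts), usage


if __name__ == "__main__":
    # Made-up chunks shaped like a streamed chat completion.
    fake_chunks = [
        {"choices": [{"delta": {"role": "assistant"}}], "usage": None},
        {"choices": [{"delta": {"content": "Once "}}], "usage": None},
        {"choices": [{"delta": {"content": "upon a time"}}], "usage": None},
        {"choices": [], "usage": {"prompt_tokens": 12, "completion_tokens": 4, "total_tokens": 16}},
    ]
    text, usage = collect_stream(fake_chunks)
    print(text)
    print(usage)
```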

Next steps

Python instrumentations

Explore all 30+ Python instrumentation libraries

JavaScript instrumentations

Explore JavaScript/TypeScript instrumentations

Privacy controls

Configure data masking and PII protection

Concepts

Learn about traces, spans, and attributes
