# Instructor (library)

> Source: https://aiwiki.ai/wiki/instructor_library
> Updated: 2026-07-16
> Categories: Developer Tools, Large Language Models, Open Source AI
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
> From AI Wiki (https://aiwiki.ai), the free encyclopedia of artificial intelligence. Reuse freely with attribution to "AI Wiki (aiwiki.ai)".

**Instructor** is an open-source Python library that returns type-safe, Pydantic-validated structured outputs from large language model APIs. It works by wrapping (originally "patching") a provider's client SDK so that any chat completion call accepts an extra `response_model=` argument whose value is a `pydantic.BaseModel` subclass, and the call returns a fully validated instance of that model instead of a raw string. The library was created in June 2023 by Jason Liu and has since grown into the most widely used open-source structured-output utility for LLMs, with roughly 3 million monthly downloads on PyPI and more than 13,000 GitHub stars by mid-2026.[^1][^2][^3] Instructor is distributed under the MIT License and is part of a small family of ports that share the same API ideas in TypeScript, Ruby, Go, Elixir, and Rust.[^1][^4]

| Field | Value |
| --- | --- |
| Original author | Jason Liu |
| Initial release | June 2023[^5] |
| Latest stable version (as of writing) | 1.15.1 (3 April 2026)[^3] |
| Language | Python (>=3.9, <4.0) |
| License | MIT[^3] |
| Repository | github.com/567-labs/instructor (formerly jxnl/instructor) |
| Core dependency | Pydantic v2 |
| Primary entry points | `instructor.from_openai`, `instructor.from_anthropic`, `instructor.from_provider`, `instructor.patch` |
| Sister projects | instructor-js (TypeScript), and ports in Ruby, Go, Elixir, Rust[^4] |
| Supported providers (selection) | OpenAI, Azure OpenAI, [Anthropic](/wiki/anthropic), [Google Gemini](/wiki/gemini), [Google Vertex AI](/wiki/google_vertex_ai), [Amazon Bedrock](/wiki/amazon_bedrock), [Mistral AI](/wiki/mistral), [Cohere](/wiki/cohere), [Groq](/wiki/groq_hardware), [Cerebras](/wiki/cerebras), [Ollama](/wiki/ollama), llama-cpp-python, [Together AI](/wiki/together_ai), [Fireworks AI](/wiki/fireworks_ai), [DeepSeek](/wiki/deepseek), xAI, Writer, Perplexity, SambaNova, OpenRouter, LiteLLM[^6] |

## Motivation

Before tool-use APIs were widespread, applications that needed structured data from an LLM commonly executed code resembling `response.choices[0].message.content`, then ran an ad-hoc regular expression or a JSON parser on the resulting string. This approach fails frequently in production: the model may add prose around the JSON, omit fields, include trailing commas, hallucinate enum values, or change the schema between calls. Each of those errors must be caught and retried by hand, and the retry prompt must somehow communicate to the model exactly what went wrong without re-explaining the entire schema.[^7][^2]

Instructor's stated motivation is to remove that boilerplate. The library's documentation frames the problem in three parts: unreliability ("LLM outputs work 90% of the time, but failures are costly"), provider fragmentation (each vendor's [function-calling](/wiki/function_calling) surface looks slightly different), and the awkwardness of writing nested JSON schemas by hand instead of in idiomatic Python.[^7] In the September 2023 blog post that introduced the project, Liu argued that Pydantic was already an obvious bridge between Python developers and language models, because Pydantic models double as validators and as self-documenting schemas. He positioned Instructor as the equivalent of `requests` for structured LLM calls: a thin layer that handles the boring parts and otherwise stays out of the way.[^5][^8]

The result is that, in the canonical usage, a developer never sees a JSON schema string at all. They write a Pydantic `BaseModel`, pass it as `response_model`, and either receive an instance of that model or, if all retries are exhausted, an exception carrying the validation errors (a Pydantic `ValidationError`, surfaced in recent releases through Instructor's `InstructorRetryException`) that they can catch like any other Python exception.[^2][^9]

A second, less obvious motivation is testability. Because the returned object is a real Python instance, application logic that consumes it can be unit-tested with synthetic Pydantic instances rather than mock JSON blobs, and contract changes in the schema surface as type errors at edit time. In practice this means LLM-using code can adopt the same review and linting practices as ordinary backend code, which was difficult when the contract between application and model was an unstructured string.[^2][^7]

## History

Liu was an ML engineer at Stitch Fix for five years before taking a sabbatical at South Park Commons in New York and moving into full-time AI consulting. He has described initially dismissing language models as impractical and returning to them only after a wrist injury and the release of GPT-3.5 in late 2022 made the technology unavoidable.[^8]

The first commits of Instructor were written, by Liu's own account, on the Shinkansen during a trip to Japan in June 2023, shortly after [OpenAI](/wiki/openai) introduced function calling for the GPT-3.5 and GPT-4 chat models.[^8] At the time, alternative libraries such as Guardrails, Marvin, and the earliest versions of [LangChain](/wiki/langchain) tended to use XML-style annotations, custom prompt templates, or full agent frameworks to coerce structure. Liu's design instead treated function calling as a JSON-schema delivery mechanism and used Pydantic as the schema source. The library's first public form was distributed as the package `openai_function_call` before being renamed.[^5]

Adoption grew quickly: by late 2023 Instructor had moved from a personal repository under Liu's `jxnl` username to the `instructor-ai` organization, and again later to `567-labs/instructor`. A TypeScript port, `instructor-js`, was started in 2023 by Liu together with Dimitri Kennedy and reached v1.7.0 by January 2025.[^4] Community-maintained ports followed in Ruby, Go, Elixir, and Rust.[^1]

Major Python milestones include the v1.0 release in early 2024, which introduced the `instructor.from_openai(OpenAI())` factory style and deprecated the global monkey-patching `instructor.patch()` call in favor of explicit per-client wrapping. Other milestones are the gradual addition of multimodal helpers (`Image`, `Audio`, `PDF`) across 2024 and 2025; partial-streaming and iterable-streaming primitives; provider-specific extras for Anthropic, Cohere, Google GenAI and Vertex, Mistral, Bedrock, and Groq; and an internal hook system that exposes events for completion start, completion end, retry, and parse error. Release v1.13.0 added the `py.typed` marker so that downstream users get proper type inference, and v1.14 added a uniform provider factory and broadened Bedrock support. By v1.15.1 (3 April 2026), the library had added xAI support, security hardening for Bedrock image inputs, retry tracking inside hooks, and an Anthropic User-Agent header.[^10][^3]

The package name has also evolved: early users knew it as `openai_function_call`, then briefly as `openai-function-call-pydantic`, before the name `instructor` was cemented after the OpenAI rebrand made the original name misleading. The migration from `instructor.patch()` to factory functions was driven by user reports that monkey-patching at module import time interacted badly with applications that maintain multiple OpenAI clients in the same process (for example, one client for embeddings and one for chat with different timeouts). Wrapping a specific client instance avoids that global-state pitfall.[^5][^9]

## API surface

### Initialising a client

In the modern style, an Instructor client is just a regular provider client that has been wrapped:

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

client = instructor.from_openai(OpenAI())

user = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=User,
    messages=[{"role": "user", "content": "Jason is 25."}],
)
# user is now a fully validated User instance
```

The same shape works for other providers via dedicated factories (`instructor.from_anthropic`, `instructor.from_cohere`, `instructor.from_mistral`, `instructor.from_genai`, etc.), or via the unified `instructor.from_provider("openai/gpt-4o")` string that selects a provider and model in a single call.[^2][^9][^11] The underlying client retains its normal type signature, which means existing tools such as IDE autocompletion, retry decorators, and request logging continue to work.

### Modes

Because every provider exposes structured output through a slightly different mechanism, Instructor introduces a `Mode` enum that selects the wire-level technique. The most common modes are:

| Mode | Mechanism | Typical providers |
| --- | --- | --- |
| `Mode.TOOLS` | Native function/tool calling | OpenAI, Anthropic, Mistral, Cohere, Groq |
| `Mode.JSON` | "JSON mode" with post-hoc parsing | OpenAI, DeepSeek, Together, vLLM endpoints |
| `Mode.JSON_SCHEMA` | Vendor-supplied JSON Schema enforcement | OpenAI (Structured Outputs), Gemini |
| `Mode.MD_JSON` | Markdown-fenced JSON block, regex-extracted | Ollama, older local models |
| `Mode.PARALLEL_TOOLS` | Multiple tool calls in one response | OpenAI, Anthropic; auto-selected when `response_model` is `Iterable[Union[...]]` |

The chosen mode determines how the Pydantic schema is serialised into the underlying provider request and how validation errors are translated back into a "reask" turn.[^11][^12]
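
To make the `Mode.MD_JSON` row concrete, the following is a minimal sketch of regex-based extraction from a Markdown-fenced block. It illustrates the general technique only and is not Instructor's actual parser; the helper name `extract_fenced_json` and the sample reply are assumptions made for this example.

```python
import json
import re

# The fence marker is built programmatically so that this example can itself
# live inside a fenced code block.
FENCE = "`" * 3
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*\n(.*?)" + FENCE, re.DOTALL)


def extract_fenced_json(text: str) -> dict:
    """Return the JSON object found inside the first fenced block of `text`."""
    match = FENCE_RE.search(text)
    if match is None:
        raise ValueError("no fenced JSON block found")
    return json.loads(match.group(1))


# A typical chatty reply: prose before and after the fenced payload.
reply = (
    "Sure, here is the data:\n"
    + FENCE + "json\n"
    + '{"name": "Jason", "age": 25}\n'
    + FENCE + "\nLet me know if you need anything else."
)

user = extract_fenced_json(reply)
print(user)
```

The brittleness of this approach is visible in the sketch: anything that breaks the fence pattern (a missing closing fence, two payloads, JSON emitted without a fence) falls through to an exception, which is why the tool-calling modes are usually preferred when a provider supports them.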

### response_model

`response_model` accepts any Pydantic `BaseModel` subclass, including models with nested submodels, lists, dictionaries, `Literal` enums, `Union` types, generics, and Pydantic's full `Field(...)` constraint vocabulary (`gt`, `lt`, `min_length`, regex patterns, custom validators). Two type wrappers are provided for streaming, `Partial[Model]` and `Iterable[Model]`, described below.[^2][^9]

Field validators are first-class citizens. Because Pydantic v2 supports arbitrary `@field_validator` and `@model_validator` callables, application authors can encode business rules ("phone number must match an E.164 regex," "end date must be after start date") and Instructor will treat any `ValidationError` they raise the same as a missing field: it will package the error message into the next prompt and retry. This is the mechanism that gives Instructor its de facto "self-healing" reputation.[^9][^11]
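
The two business rules quoted above can be sketched in plain Pydantic v2, with no LLM call involved. The `Booking` model, the simplified E.164 check, and the helper `error_text_for` are hypothetical and exist only for this illustration; the point is the error text, which is the kind of message a validation-feedback retry would send back to the model.

```python
from datetime import date
from typing import Optional

from pydantic import BaseModel, ValidationError, field_validator, model_validator


class Booking(BaseModel):  # hypothetical model, for illustration only
    phone: str
    start: date
    end: date

    @field_validator("phone")
    @classmethod
    def phone_is_e164(cls, value: str) -> str:
        # Simplified E.164 check: "+" followed by 1 to 15 digits.
        digits = value[1:]
        if not (value.startswith("+") and digits.isdigit() and 1 <= len(digits) <= 15):
            raise ValueError("phone number must be in E.164 format, e.g. +14155550123")
        return value

    @model_validator(mode="after")
    def end_after_start(self) -> "Booking":
        # Cross-field rule: runs once every individual field has validated.
        if self.end <= self.start:
            raise ValueError("end date must be after start date")
        return self


def error_text_for(**data) -> Optional[str]:
    """Return the ValidationError text for `data`, or None if it validates."""
    try:
        Booking(**data)
    except ValidationError as exc:
        return str(exc)
    return None


bad_phone = error_text_for(phone="415-555-0123", start="2026-03-01", end="2026-03-02")
bad_dates = error_text_for(phone="+14155550123", start="2026-03-02", end="2026-03-01")
print(bad_phone)
print(bad_dates)
```

Writing validator messages as plain, corrective sentences is a deliberate choice here: the message doubles as the instruction the model sees on the next attempt.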

### max_retries

The simplest retry interface is an integer:

```python
user = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=User,
    max_retries=3,
    messages=[...],
)
```

When validation fails on attempt N, Instructor constructs an additional turn that includes both the offending output and the `ValidationError` text, then resubmits. The reask turn is short and targeted, so the marginal token cost of a retry is small relative to the original generation.[^9]
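
The reask loop can be sketched without any network access by substituting a stand-in function for the model. This is an illustration of the pattern under stated assumptions, not Instructor's implementation: `create_with_reask`, `fake_model`, the `max_attempts` parameter, and the wording of the reask message are all invented for the example, and Instructor's own counting of attempts may differ.

```python
from typing import Callable

from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    age: int


Message = dict[str, str]


def create_with_reask(
    complete: Callable[[list[Message]], str],
    messages: list[Message],
    max_attempts: int = 3,
) -> User:
    """Call `complete`, validate its output, and reask on validation failure."""
    history = list(messages)
    for attempt in range(1, max_attempts + 1):
        raw = complete(history)
        try:
            return User.model_validate_json(raw)
        except ValidationError as exc:
            if attempt == max_attempts:
                raise  # budget exhausted: surface the last validation error
            # Reask turn: show the model its own output plus the error text.
            history.append({"role": "assistant", "content": raw})
            history.append(
                {"role": "user", "content": f"Validation failed, please correct the output:\n{exc}"}
            )
    raise ValueError("max_attempts must be at least 1")


def fake_model(history: list[Message]) -> str:
    # Stand-in for an LLM: a wrong type first, a corrected answer after the reask.
    if len(history) == 1:
        return '{"name": "Jason", "age": "twenty-five"}'
    return '{"name": "Jason", "age": 25}'


user = create_with_reask(fake_model, [{"role": "user", "content": "Jason is 25."}])
print(user)
```

Only the failed output and the error text are appended on each retry, which is why the added prompt stays short compared with the original request.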

For richer behaviour `max_retries` also accepts a Tenacity `Retrying` (sync) or `AsyncRetrying` (async) object, allowing callers to combine schema-level retries with exponential backoff, rate-limit-aware waits, jitter, or stop-after-deadline policies. The Instructor documentation recommends combining its built-in `max_retries` integer with Tenacity decorators only when network-level resilience is also needed, and always setting an explicit stop condition to avoid infinite loops.[^9]

### Hooks

Since v1.10, Instructor exposes a hook system that fires callbacks on `completion:kwargs`, `completion:response`, `completion:error`, `parse:error`, and `completion:last_attempt`. Hooks make it straightforward to log raw API traffic to observability platforms, attribute token usage to retries, or short-circuit retries on specific exceptions.[^10]

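```python
def on_parse_error(error, **kwargs):
    # `metrics` stands in for the application's own metrics client.
    # Read the model name defensively: handlers may be called with only the error.
    model = kwargs.get("model", "unknown")
    metrics.increment("instructor.parse_error", tags={"model": model})

client.on("parse:error", on_parse_error)
```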

Hooks are also a convenient place to record the full request and response payloads for evaluation datasets, since both pre-retry and post-retry artifacts pass through the same handlers. The v1.12 release expanded the hook surface so that retry counts and per-attempt latency are attached to the hook context, simplifying instrumentation.[^10]

## Partial and iterable streaming

Two streaming primitives ship in the box.

`create_partial()` returns a generator that yields successive partial states of the response model as tokens arrive. Internally Instructor rewrites the user-supplied `BaseModel` so that all fields are `Optional`, and as each field is fully decoded it switches from `None` to the concrete value. Application code can iterate the generator and re-render a UI on every yielded snapshot:[^13]

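```python
from datetime import datetime

from pydantic import BaseModel

class MeetingInfo(BaseModel):
    title: str
    attendees: list[str]
    starts_at: datetime

# `client` is the wrapped client created earlier with instructor.from_openai(OpenAI()).
stream = client.chat.completions.create_partial(
    model="gpt-4o-mini",
    response_model=MeetingInfo,
    messages=[{"role": "user", "content": "Schedule a 9 a.m. retro with Anna and Bo."}],
)
for snapshot in stream:
    # render() is the application's own UI callback.
    render(snapshot)  # snapshot is a MeetingInfo with possibly-None fields
```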

`create_iterable()` is the dual for "extract many objects." It is typed as `Iterable[Model]`, switches the underlying call to parallel tool calling when the provider supports it, and yields validated instances one at a time as the model emits them. This is the typical pattern for entity extraction over documents.[^13][^11]

Two caveats apply to partial streaming: Pydantic validators cannot run mid-stream because the model is still incomplete, and models with `Literal` fields need to opt in via `PartialLiteralMixin` so that intermediate values are accepted. Async iteration is supported via `async for` on the partial generator.[^13]
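
The "all fields become `Optional`" rewrite described above can be approximated in a few lines of plain Pydantic v2. This is a shallow sketch, assuming a flat model; the helper `make_partial` is hypothetical, and Instructor's own `Partial[...]` wrapper also handles nested models and incremental JSON parsing, which are omitted here.

```python
from datetime import datetime
from typing import Optional

from pydantic import BaseModel, create_model


class MeetingInfo(BaseModel):
    title: str
    attendees: list[str]
    starts_at: datetime


def make_partial(model: type[BaseModel]) -> type[BaseModel]:
    """Build a copy of `model` in which every field is Optional and defaults to None."""
    fields = {
        name: (Optional[info.annotation], None)
        for name, info in model.model_fields.items()
    }
    return create_model(f"Partial{model.__name__}", **fields)


PartialMeetingInfo = make_partial(MeetingInfo)

# Simulate three snapshots as progressively more of the payload has been decoded.
snapshots = [
    PartialMeetingInfo(),
    PartialMeetingInfo(title="Retro"),
    PartialMeetingInfo(title="Retro", attendees=["Anna", "Bo"]),
]
for snapshot in snapshots:
    print(snapshot)
```

Because every field is optional in the derived class, each snapshot validates even though the original `MeetingInfo` would reject it; this is also why validators that assume a complete object cannot meaningfully run until the final snapshot.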

## Multimodal inputs

Instructor offers provider-agnostic wrappers for non-text inputs so that the same Pydantic model can be reused across OpenAI, Anthropic, Google GenAI, Mistral, and Bedrock without rewriting the message construction.[^14]

`instructor.Image` accepts URLs, Google Cloud Storage URLs, local file paths, and base64 strings via class methods `from_url`, `from_gs_url`, `from_path`, `from_base64`, and an `autodetect` heuristic. It is supported across OpenAI, Anthropic, and Google GenAI. The library can also be configured with `autodetect_images=True` so that any string looking like a path or URL inside a message is converted automatically.[^14]

`instructor.Audio` exposes the same five constructors but is restricted to OpenAI (for GPT-4o audio inputs) and Gemini.[^14]

`instructor.PDF` has the widest coverage (OpenAI, Anthropic, Google GenAI, Mistral, Bedrock) and ships two specialised subclasses: `PDFWithCacheControl`, which adds Anthropic prompt-caching metadata so that a repeatedly-used document only pays its upload cost once, and `PDFWithGenaiFile`, which uploads via Google's Files API and references the uploaded file in the prompt.[^14]

A typical multimodal extraction reads:

```python
from instructor import Image
from pydantic import BaseModel

class Receipt(BaseModel):
    merchant: str
    total_usd: float
    line_items: list[str]

receipt = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Receipt,
    messages=[
        {"role": "user", "content": [
            "Extract this receipt:",
            Image.from_path("receipt.jpg"),
        ]},
    ],
)
```

The same code, with `client = instructor.from_anthropic(...)`, works against Claude without modification.[^14][^11]

## Provider integrations

Instructor advertises support for 15-plus providers; the integrations page enumerates more than 20 by mid-2026.[^6]

| Category | Providers |
| --- | --- |
| Frontier API providers | OpenAI, [Anthropic](/wiki/anthropic), [Google Gemini](/wiki/gemini), [Google GenAI](/wiki/gemini), [Vertex AI](/wiki/google_vertex_ai), xAI |
| Enterprise clouds | Azure OpenAI, [Amazon Bedrock](/wiki/amazon_bedrock), Writer, SambaNova |
| Specialty inference | [Groq](/wiki/groq_hardware), [Cerebras](/wiki/cerebras), [Fireworks AI](/wiki/fireworks_ai), [Together AI](/wiki/together_ai) |
| Other commercial models | [Cohere](/wiki/cohere), [Mistral AI](/wiki/mistral), [DeepSeek](/wiki/deepseek), Perplexity |
| Local and self-hosted | [Ollama](/wiki/ollama), llama-cpp-python, [vLLM](/wiki/vllm) (via OpenAI-compatible endpoint) |
| Routers | LiteLLM, OpenRouter |

In practice the easiest way to use any of these is `instructor.from_provider("provider/model")`; the library reads the provider segment, imports the relevant SDK (which must already be installed via the matching extra such as `instructor[anthropic]` or `instructor[google-genai]`), picks a default `Mode`, and returns a wrapped client.[^15][^6] Every integration supports `response_model`, validation-feedback retries, hooks, and at least basic streaming; some, notably Anthropic, also expose provider-specific options such as the `thinking` extended-reasoning parameter, prompt caching, and parallel tool calling.[^11]

## Internals

Although Instructor's public surface is small, its internal pipeline has several distinct stages, and understanding them helps explain why the library is positioned as a thin wrapper rather than a framework.[^11][^2]

1. **Schema generation.** When `response_model` is bound to a `BaseModel` subclass, Instructor calls Pydantic's `model_json_schema()` and post-processes the result so that it conforms to provider-specific quirks. OpenAI tool schemas require `additionalProperties: false` and disallow some keywords; Anthropic's tools accept a richer schema but reject certain `oneOf` constructs; Gemini has its own JSON-schema dialect. Each provider plugin owns its translation.
2. **Mode dispatch.** The chosen `Mode` (TOOLS, JSON, JSON_SCHEMA, MD_JSON, PARALLEL_TOOLS) determines whether the schema is sent as a tool definition, as a `response_format`, or as part of the system prompt.
3. **Completion.** The wrapped client makes the underlying provider call. Instructor is intentionally a pass-through here: streaming, async, function arguments, and provider-specific parameters such as `temperature`, `top_p`, `seed`, Anthropic `thinking`, and Gemini safety settings all flow through unchanged.
4. **Extraction.** The raw response is parsed into a candidate JSON string. For tool modes this is the tool call's arguments; for JSON mode this is the top-level message content; for MD_JSON mode the parser looks for a fenced code block.
5. **Validation.** The candidate string is passed to `Model.model_validate_json()`. If validation succeeds, the validated instance is returned. If validation raises a `ValidationError`, control jumps to the retry layer.
6. **Reask.** On validation failure, Instructor constructs an additional `user` (or in some modes `tool` and `assistant` chained) turn containing the previous output and the formatted error messages, and reissues the call. The retry budget is governed by `max_retries`.

Because each stage is independent, advanced users can replace individual pieces. For example, a custom `Mode` plugin can be registered for a provider that is not yet first-party supported, or the reask construction can be overridden to inject a different prompt template.[^11][^9]
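
The schema-generation stage can be illustrated with Pydantic alone. The sketch below turns a model into an OpenAI-style tool definition; the helper `to_tool_definition` is hypothetical and much simpler than the per-provider translation described in step 1, which also has to strip or rewrite unsupported keywords.

```python
import json

from pydantic import BaseModel, Field


class User(BaseModel):
    """A person mentioned in the text."""

    name: str = Field(description="Full name")
    age: int = Field(ge=0, description="Age in years")


def to_tool_definition(model: type[BaseModel]) -> dict:
    """Wrap a model's JSON Schema in an OpenAI-style function tool definition."""
    schema = model.model_json_schema()
    # The class docstring becomes the tool description when present.
    description = schema.pop("description", f"Return a {model.__name__} object")
    # Mirrors the `additionalProperties: false` requirement noted in step 1.
    schema["additionalProperties"] = False
    return {
        "type": "function",
        "function": {
            "name": model.__name__,
            "description": description,
            "parameters": schema,
        },
    }


tool = to_tool_definition(User)
print(json.dumps(tool, indent=2))
```

Field descriptions and constraints such as `ge=0` travel into the schema as `description` and `minimum`, so the Pydantic model remains the single place where the contract is written down.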

## Applications

Liu has repeatedly framed Instructor as targeting three application archetypes:[^8]

1. **Extraction.** Pulling structured records (people, dates, line items, claims) out of unstructured text or scanned documents. Pydantic models map naturally to database rows, so the extracted instances often go straight into a relational schema.
2. **Knowledge-graph construction.** Iterating over a corpus and asking the model to return `Iterable[Edge]` or similar, then upserting nodes and edges into a graph store. The combination of `create_iterable` with parallel tool calling makes this throughput-bound rather than logic-bound.
3. **Routing and query transformation.** Translating natural-language user input into a typed query object that a [retrieval-augmented generation](/wiki/retrieval_augmented_generation_rag) pipeline can execute, or into the parameters of an internal API. In this role Instructor competes directly with bespoke prompt-and-parse code.

Beyond Liu's framing, common community uses include classification (a `Literal[...]` field with a tight enum), structured chain-of-thought (a `reasoning` field followed by a `final_answer` field, both validated), evaluators (Pydantic models that capture rubric scores), and as the structured-data engine inside larger agent loops where each tool's input and output are Pydantic models. Observability vendors including Langfuse have shipped first-class integrations that record each Instructor call along with its retries and validation errors.[^16]
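
Two of these community patterns, classification with a reasoning field and rubric-style evaluation, reduce to small Pydantic models. The class names, labels, and score ranges below are assumptions chosen for illustration; the JSON strings stand in for model output so that the sketch runs without an API call.

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class TicketClassification(BaseModel):  # hypothetical classifier schema
    # Declared first so the model writes its justification before committing to a label.
    reasoning: str = Field(description="Brief justification for the label")
    label: Literal["billing", "bug", "feature_request"]


class RubricScore(BaseModel):  # hypothetical evaluator schema
    relevance: int = Field(ge=1, le=5)
    faithfulness: int = Field(ge=1, le=5)


good = TicketClassification.model_validate_json(
    '{"reasoning": "User was charged twice.", "label": "billing"}'
)
score = RubricScore.model_validate_json('{"relevance": 4, "faithfulness": 5}')

# A label outside the enum is rejected rather than silently accepted.
try:
    TicketClassification.model_validate_json(
        '{"reasoning": "Unclear.", "label": "complaint"}'
    )
    rejected = False
except ValidationError:
    rejected = True

print(good.label, score.relevance, rejected)
```

Either class can then be passed as `response_model`; a hallucinated label or an out-of-range score becomes a validation error that feeds the retry mechanism instead of reaching application code.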

A representative entity-extraction workflow demonstrates several of these patterns at once:

```python
import instructor
from openai import OpenAI
from pathlib import Path
from pydantic import BaseModel, Field
from typing import Iterable, Literal

class Mention(BaseModel):
    surface_form: str
    canonical_name: str
    kind: Literal["person", "org", "location"]
    confidence: float = Field(ge=0.0, le=1.0)

client = instructor.from_openai(OpenAI())

mentions: Iterable[Mention] = client.chat.completions.create_iterable(
    model="gpt-4o-mini",
    response_model=Mention,
    max_retries=2,
    messages=[
        {"role": "system", "content": "Extract named entities. One Mention per call."},
        # read_text() opens and closes the file in one step
        {"role": "user", "content": Path("article.txt").read_text()},
    ],
)

for m in mentions:
    upsert_entity(m)  # upsert_entity() is the application's own persistence function
```

A handful of design decisions are visible in this example. The `Literal` type narrows the LLM's choice of `kind` to three values, and a `confidence` field bounded by Pydantic's `ge`/`le` constraint provides a uniform calibration signal. Because `create_iterable` switches the underlying call to parallel tool calling, each `Mention` is yielded as soon as it is parsed rather than after the entire document has been processed, which substantially reduces tail latency on long inputs.[^11][^13]

## Comparison with other libraries

The structured-output ecosystem splits into two design families. Post-generation validators let the model emit tokens freely and then validate the result, retrying with feedback on failure; pre-generation constrainers modify token sampling so that only schema-conformant tokens can be produced. Instructor is the canonical post-generation validator.[^17][^18]

| Library | Approach | Validation engine | Provider coverage | Typical use case |
| --- | --- | --- | --- | --- |
| Instructor | Post-generation, with retries | Pydantic | 20+ via SDK wrapping | Generic structured extraction, multi-provider apps |
| [LangChain](/wiki/langchain) `with_structured_output` | Post-generation | Pydantic or JSON Schema | All LangChain-supported providers | Existing LangChain pipelines, agents |
| Pydantic AI | Post-generation, with agents | Pydantic | OpenAI, Anthropic, Gemini, Groq, Bedrock, others | Lightweight agents that also need structured outputs |
| Outlines | Pre-generation, constrained decoding | Pydantic, regex, CFG | Transformers, [vLLM](/wiki/vllm), llama.cpp; limited OpenAI | Local models where 100% valid output is required |
| Guidance | Pre-generation, programmatic prompts | Custom DSL plus Python control flow | Transformers, llama.cpp, OpenAI (limited) | Workflows that interleave model output with Python branching |
| BAML | Code-generation from `.baml` schema files | Compiled validator | OpenAI, Anthropic, others | Polyglot teams that want a single schema source |

The trade-offs are reasonably well established in practitioner write-ups:

- **vs. Outlines.** Outlines guarantees the output is schema-valid because invalid tokens are masked out of the sampler; there is no retry cost. However, it requires direct access to the sampling loop, which means it works best with [vLLM](/wiki/vllm) or the `transformers` library and has only partial OpenAI support. Instructor reverses the trade-off: it works with any provider that supports tool-use or JSON mode, at the cost of occasionally needing to retry.[^17][^18]
- **vs. Guidance.** Guidance is more expressive when an application needs to interleave model output with Python control flow ("ask for an enum, then conditionally ask follow-up questions based on the answer") and supports context-free grammars. Its learning curve is steeper, and it is constrained to local models for full functionality.[^17]
- **vs. [LangChain](/wiki/langchain) `with_structured_output`.** LangChain bundles structured output into a much larger framework that also handles chains, retrievers, agents, and callbacks. For teams already using LangChain, `with_structured_output` is a natural choice. For teams that want only structured output, Instructor is a much smaller dependency and stays closer to the underlying provider SDK.[^18][^17]
- **vs. Pydantic AI.** Pydantic AI, released in 2024 by the Pydantic team, also returns validated Pydantic objects from LLM calls but is positioned as a full agent framework: it bundles tool registration, dependency injection, message history, and Logfire integration. Instructor is purely an extraction layer; Pydantic AI is an alternative when an application also needs tools and conversational state but does not want the full LangChain ecosystem.[^19][^17]
- **vs. BAML.** BAML is a code-generation system that compiles `.baml` schema files into typed client libraries in multiple languages. It is the only entrant that is not a Python library at runtime; the comparison axis is roughly "single source of truth in a dedicated DSL" versus "single source of truth in Python types."[^17]
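
The token masking that distinguishes pre-generation constrainers can be shown with a toy example. This is not how Outlines or Guidance are implemented (real libraries operate over the tokenizer's full vocabulary and a compiled grammar or state machine); the function `constrained_argmax` and the scores are invented purely to illustrate the idea.

```python
import math


def constrained_argmax(logits: dict[str, float], allowed: set[str]) -> str:
    """Pick the highest-scoring token among those the schema currently allows."""
    # Disallowed tokens are masked to -inf so they can never be selected.
    masked = {tok: (score if tok in allowed else -math.inf) for tok, score in logits.items()}
    best = max(masked, key=masked.get)
    if masked[best] == -math.inf:
        raise ValueError("no allowed token in vocabulary")
    return best


# Toy next-token scores at the point where an enum value is expected.
logits = {"person": 1.2, "company": 2.5, "org": 2.1, "location": 0.3}
allowed = {"person", "org", "location"}  # values permitted by the schema

unconstrained = max(logits, key=logits.get)       # what free sampling would pick
constrained = constrained_argmax(logits, allowed)  # what masked sampling picks
print(unconstrained, constrained)
```

In the post-generation approach the unconstrained choice would be emitted, fail validation against the enum, and trigger a retry; in the pre-generation approach the invalid value is never produced, but the caller needs access to the logits, which hosted APIs generally do not expose.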

Liu has been candid that Instructor's positioning relies on staying small. His repeated message to interviewers has been that he is not building a framework; he is building "the boring stuff" so that downstream framework authors do not need to.[^8]

## Limitations and criticisms

Instructor's design has well-understood limits.

**Retry cost.** Because validation happens after generation, every validation failure costs at least one extra round-trip to the model. For high-volume pipelines on expensive frontier models, this can dominate the bill. Practitioners report that for sufficiently complex schemas a fraction of calls require one or two retries, and pre-generation tools such as Outlines avoid that overhead entirely at the price of running locally.[^17][^18]

**Schema complexity ceiling.** Very deeply nested Pydantic models translate into very large JSON schemas, which in turn consume input tokens on every call. Some providers also impose hard limits on schema depth or on the number of properties in a tool definition. In those cases the schema needs to be flattened, or split into multiple calls.

**Mode coverage gaps.** Not every provider supports every mode. Local models served through Ollama frequently fall back to `Mode.MD_JSON`, which is regex-based and brittle if the model emits anything outside the fenced block. Some providers' JSON Schema implementations do not honour every Pydantic constraint, so validation that nominally happens at the provider becomes a no-op and Instructor's client-side validator is the only safety net.[^11][^17]

**Streaming validators.** Partial streaming intentionally skips validators because intermediate states are by definition incomplete. Applications that rely on validators for correctness must run them on the final yielded snapshot, not on intermediate ones.[^13]

**Migration friction.** The transition from `instructor.patch()` (global monkey patching) to `instructor.from_openai()` (explicit wrapping) and then to `instructor.from_provider()` (string-based) introduced two waves of community example code that no longer reflects current best practice. Users following older tutorials sometimes pick up patterns that the modern docs explicitly deprecate.[^9]

**Not a sampling-time guarantee.** Instructor validates output after the fact; it does not steer generation. If the model emits 200 tokens of nonsense on every attempt, exhausting the retry budget does not turn that nonsense into valid output. Nor can Instructor prevent the model from confabulating field values that happen to satisfy the schema; Pydantic validators and downstream business logic remain necessary for semantic correctness.

**Latency.** Each retry inflates total wall-clock time. For latency-sensitive surfaces such as interactive copilots, applications often cap `max_retries` at one and fall back to a graceful error rather than risk a slow first byte. Streaming with `create_partial` partially mitigates the perceived latency by letting the UI render fields as they decode, but it does not reduce the upper bound on total time when the schema is large.[^13]

**Provider drift.** Vendor APIs change. When OpenAI introduced its "Structured Outputs" feature with strict JSON Schema enforcement in 2024, Instructor added a corresponding mode but had to accommodate the fact that the strict schema dialect rejects certain Pydantic constructs (such as `Union` types without discriminators). Similar small frictions appear whenever a provider extends or restricts its tool-call surface, and the burden of keeping up falls on the library's maintainers.[^12][^11]

## Community and ecosystem

By mid-2026 Instructor reports roughly 3 million monthly downloads, 13,000-plus GitHub stars, and over 100 contributors across the Python repository alone.[^1][^2][^3] The library and its docs are maintained under the `567-labs` GitHub organisation; Liu also runs an associated cohort-based course on applied LLM systems and a paid consulting practice, both of which have served as informal feedback channels for API design.[^8]

The ecosystem includes the TypeScript port `instructor-js` (also MIT, maintained by Liu and Dimitri Kennedy, latest tag v1.7.0 in January 2025) and community ports in Ruby, Go, Elixir, and Rust that share the same conceptual API.[^4][^1] Observability integrations exist for Langfuse and other tracing platforms, and Instructor calls are first-class in [DSPy](/wiki/dspy)-style pipelines when developers want to mix a structured-output layer into a larger prompt-optimisation workflow.[^16]

## Related work

- [Structured output](/wiki/structured_output) for an overview of the broader problem space.
- [Function calling](/wiki/function_calling) and [Tool use](/wiki/tool_use) for the underlying provider APIs Instructor builds on.
- [LangChain](/wiki/langchain) and [LlamaIndex](/wiki/llamaindex) for the closest large-framework comparisons.
- [DSPy](/wiki/dspy) for an alternative philosophy that learns prompts and signatures rather than declaring them.
- [Retrieval-augmented generation (RAG)](/wiki/retrieval_augmented_generation_rag) for the most common downstream consumer of Instructor's outputs.

## See also

- [Structured output](/wiki/structured_output)
- [Function calling](/wiki/function_calling)
- [Tool use](/wiki/tool_use)
- [OpenAI API](/wiki/openai_api)
- [Anthropic API](/wiki/anthropic_api)
- [LangChain](/wiki/langchain)
- [LlamaIndex](/wiki/llamaindex)
- [DSPy](/wiki/dspy)
- [Ollama](/wiki/ollama)
- [vLLM](/wiki/vllm)
- [Retrieval-Augmented Generation (RAG)](/wiki/retrieval_augmented_generation_rag)

## References

[^1]: Instructor maintainers, "Instructor: Multi-Language Library for Structured LLM Outputs", python.useinstructor.com, 2026. https://python.useinstructor.com/. Accessed 2026-05-20.
[^2]: 567-labs, "instructor: structured outputs for llms (README)", GitHub, 2026. https://github.com/567-labs/instructor. Accessed 2026-05-20.
[^3]: Python Packaging Authority, "instructor 1.15.1 on PyPI", PyPI, 2026-04-03. https://pypi.org/project/instructor/. Accessed 2026-05-20.
[^4]: instructor-ai, "instructor-js: structured extraction in TypeScript", GitHub, 2025-01-27. https://github.com/instructor-ai/instructor-js. Accessed 2026-05-20.
[^5]: Jason Liu, "Bridging Language Models with Python via Instructor, Pydantic, and OpenAI's function calling", Medium, 2023-09-09. https://medium.com/@jxnlco/bridging-language-model-with-python-with-instructor-pydantic-and-openais-function-calling-f32fb1cdb401. Accessed 2026-05-20.
[^6]: Instructor maintainers, "Integrations", python.useinstructor.com, 2026. https://python.useinstructor.com/integrations/. Accessed 2026-05-20.
[^7]: Instructor maintainers, "Why use Instructor?", python.useinstructor.com, 2026. https://python.useinstructor.com/why/. Accessed 2026-05-20.
[^8]: swyx and Alessio Fanelli, "High Agency Pydantic > VC Backed Frameworks: with Jason Liu of Instructor", Latent Space podcast, 2024. https://www.latent.space/p/instructor. Accessed 2026-05-20.
[^9]: Instructor maintainers, "Retrying", python.useinstructor.com, 2026. https://python.useinstructor.com/concepts/retrying/. Accessed 2026-05-20.
[^10]: 567-labs, "Releases for instructor", GitHub, 2026. https://github.com/567-labs/instructor/releases. Accessed 2026-05-20.
[^11]: Instructor maintainers, "Anthropic integration", python.useinstructor.com, 2026. https://python.useinstructor.com/integrations/anthropic/. Accessed 2026-05-20.
[^12]: Instructor maintainers, "OpenAI integration", python.useinstructor.com, 2026. https://python.useinstructor.com/integrations/openai/. Accessed 2026-05-20.
[^13]: Instructor maintainers, "Partial responses", python.useinstructor.com, 2026. https://python.useinstructor.com/concepts/partial/. Accessed 2026-05-20.
[^14]: Instructor maintainers, "Multimodal", python.useinstructor.com, 2026. https://python.useinstructor.com/concepts/multimodal/. Accessed 2026-05-20.
[^15]: Instructor maintainers, "Installing Instructor for LLM Structured Outputs", python.useinstructor.com, 2026. https://python.useinstructor.com/learning/getting_started/installation/. Accessed 2026-05-20.
[^16]: Langfuse, "Observability and Tracing for Instructor", langfuse.com, 2025. https://langfuse.com/integrations/frameworks/instructor. Accessed 2026-05-20.
[^17]: CodeCut, "5 Python Tools for Structured LLM Outputs: A Practical Comparison", codecut.ai, 2026. https://codecut.ai/structured-llm-outputs-tools-comparison/. Accessed 2026-05-20.
[^18]: Paul Simmering, "The best library for structured LLM output", simmering.dev, 2024-05. https://simmering.dev/blog/structured_output/. Accessed 2026-05-20.
[^19]: Mahadevan Varadhan, "PydanticAI vs. Instructor: Structured LLM-AI Outputs with Python Tools", Medium, 2025. https://medium.com/@mahadevan.varadhan/pydanticai-vs-instructor-structured-llm-ai-outputs-with-python-tools-c7b7b202eb23. Accessed 2026-05-20.