Generation Schemas

generation_schema guides the model to respond in a predefined structured format. When you define a schema with msgspec.Struct, the agent constrains the model's output to that structure, so responses are type-safe and predictable.

Performance

msgspec is among the fastest validation and serialization libraries for Python, which is why it was chosen. See the benchmarks.

Example
# pip install msgflux[openai]
from msgspec import Struct
import msgflux as mf
import msgflux.nn as nn

# mf.set_envs(OPENAI_API_KEY="...")

class ContentCheck(Struct):
    reason: str
    is_safe: bool

class Moderator(nn.Agent):
    model = mf.Model.chat_completion("openai/gpt-4.1-mini")
    generation_schema = ContentCheck
    config = {"verbose": True}

agent = Moderator()
result = agent(
    "Analyze this message: 'You are amazing and I appreciate your help!'"
)
print(result.is_safe)  # True/False
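
To make the "constrains the model's output" step more concrete, the sketch below shows the kind of check a schema implies: a raw JSON reply is decoded into a typed object, and anything that does not match is rejected. It uses only the standard library, with a dataclass standing in for msgspec.Struct. The decode_as helper is hypothetical and not part of msgFlux, whose actual decoding path may differ.

```python
import json
from dataclasses import dataclass, fields
from typing import get_type_hints


@dataclass
class ContentCheck:
    # Stand-in for the msgspec.Struct from the example above.
    reason: str
    is_safe: bool


def decode_as(cls, raw: str):
    """Hypothetical helper: parse JSON, then check field names and types."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise TypeError("expected a JSON object")
    expected = {f.name for f in fields(cls)}
    if set(data) != expected:
        raise ValueError(f"expected fields {sorted(expected)}, got {sorted(data)}")
    for name, typ in get_type_hints(cls).items():
        # Strict check: only handles plain types such as str, bool, int.
        if type(data[name]) is not typ:
            raise TypeError(f"{name}: expected {typ.__name__}")
    return cls(**data)


check = decode_as(ContentCheck, '{"reason": "Friendly praise.", "is_safe": true}')
print(check.is_safe)  # True
```

A reply such as '{"reason": "x", "is_safe": "yes"}' raises TypeError here instead of silently passing a string where a bool is expected. With msgspec installed, msgspec.json.decode(raw, type=...) performs this kind of typed decoding directly.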

Reasoning Schemas

msgFlux provides built-in generation schemas that implement common reasoning strategies. These schemas guide the model through structured thinking patterns before producing a response.

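All reasoning schemas expose a final_answer field containing the model's concluded response. It is a str by default; in ReAct it is str | None, because it stays unset until the loop has gathered enough information.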


Chain of Thought

Wei et al., 2022 — Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Chain of Thought (CoT) is the simplest and most widely used reasoning schema. It prompts the model to articulate its reasoning step-by-step before committing to a final answer. By making the thinking process explicit and structured, the model is less likely to jump to incorrect conclusions — especially on math, logic, and multi-step problems.

The schema adds a single reasoning field whose description hint ("Let's think step by step in order to") nudges the model to elaborate before responding.

                    Input
┌─────────────────────────────────────────────┐
│  reasoning:                                 │
│    "Step 1: Subtract 7 from both sides...   │
│     Step 2: Divide both sides by 8...       │
│     Step 3: x = -3.75"                      │
└──────────────────────┬──────────────────────┘
              [ final_answer: "-3.75" ]

Schema fields:

| Field | Type | Description |
|---|---|---|
| reasoning | str | Step-by-step thinking that precedes the final answer |
| final_answer | str | The concluded response based on the reasoning chain |

# pip install msgflux[openai]
import msgflux as mf
import msgflux.nn as nn
from msgflux.generation.reasoning import ChainOfThought

# mf.set_envs(OPENAI_API_KEY="...")

class Solver(nn.Agent):
    model = mf.Model.chat_completion("openai/gpt-4.1-mini")
    generation_schema = ChainOfThought
    config = {"verbose": True}

agent = Solver()
result = agent("Solve: 8x + 7 = -23")

print(result.reasoning)    # "Step 1: Subtract 7... Step 2: Divide by 8..."
print(result.final_answer) # "x = -3.75"
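
The description hint on the reasoning field reaches the model as part of the schema it is asked to fill in. The sketch below illustrates that idea with the standard library only: field descriptions are stored as metadata and rendered into a JSON-Schema-like dict. ChainOfThoughtSketch and to_json_schema are hypothetical names, and the exact schema msgFlux sends to the model is an assumption here.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ChainOfThoughtSketch:
    # Declared first, so the model is asked for reasoning before the answer.
    reasoning: str = field(
        metadata={"description": "Let's think step by step in order to"}
    )
    final_answer: str = field(
        metadata={"description": "The concluded response based on the reasoning chain"}
    )


def to_json_schema(cls) -> dict:
    """Hypothetical helper: build a JSON-Schema-like dict from a dataclass."""
    type_names = {str: "string", int: "integer", float: "number", bool: "boolean"}
    properties = {
        f.name: {
            "type": type_names[f.type],
            "description": f.metadata.get("description", ""),
        }
        for f in fields(cls)
    }
    return {
        "type": "object",
        "properties": properties,
        "required": [f.name for f in fields(cls)],
    }


schema = to_json_schema(ChainOfThoughtSketch)
print(list(schema["properties"]))  # ['reasoning', 'final_answer']
```

Field order is part of the design: because reasoning is declared before final_answer, a model that fills the schema in order writes out its steps before committing to an answer.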

When to use

CoT works best for problems that benefit from explicit decomposition: algebra, logic puzzles, multi-step reasoning, comparisons, or any task where the path to the answer matters as much as the answer itself.


ReAct

Yao et al., 2022 — ReAct: Synergizing Reasoning and Acting in Language Models

ReAct (Reasoning + Acting) is a dynamic schema designed for agents that need to use tools. Instead of reasoning in a single pass, the model interleaves thought (internal planning), actions (tool calls), and observations (tool results) in an iterative loop — repeating until it has enough information to produce a final_answer.

                    Input
┌─────────────────────────────────────────────┐
│  thought:  "I need to look up the current   │
│             Python release on python.org"   │
│  actions:  [{name: "web_fetch",             │
│              args: [{name: "url",           │
│                      value: "..."}]}]       │
└──────────────────────┬──────────────────────┘
               [ Tool Execution ]
┌─────────────────────────────────────────────┐
│  observations: [{tool: "web_fetch",         │
│                  result: "Python 3.14..."}] │
└──────────────────────┬──────────────────────┘
┌─────────────────────────────────────────────┐
│  thought:  "I now have the information I    │
│             need to answer the question"    │
└──────────────────────┬──────────────────────┘
                  Task complete?
                 /              \
               No               Yes
                │                 │
          Next cycle     [ final_answer: "3.14.x" ]

Schema fields:

| Field | Type | Description |
|---|---|---|
| thought | str | The agent's internal reasoning and plan |
| actions | List[Action] \| None | Tool calls to execute in this step |
| final_answer | str \| None | Set once the agent has all needed information |

Each Action contains:

| Field | Type | Description |
|---|---|---|
| name | str | The tool function to call |
| arguments | dict[str, Any] | Named arguments passed to the tool |
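
The Thought, Action, Observation cycle can be sketched in a few lines. This is a minimal illustration under stated assumptions, not msgFlux's implementation: the model is replaced by a scripted function, and run_react, Step, and the lookup tool are hypothetical names. Only the field names (thought, actions, final_answer, name, arguments) come from the tables above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Action:
    name: str
    arguments: dict[str, Any]


@dataclass
class Step:
    thought: str
    actions: Optional[list[Action]] = None
    final_answer: Optional[str] = None


def run_react(
    model: Callable[[list[dict]], Step],
    tools: dict[str, Callable[..., Any]],
    max_steps: int = 5,
) -> str:
    """Hypothetical driver: loop until the model sets final_answer."""
    observations: list[dict] = []
    for _ in range(max_steps):
        step = model(observations)
        if step.final_answer is not None:
            return step.final_answer
        # Execute every requested tool call and record its result.
        for action in step.actions or []:
            result = tools[action.name](**action.arguments)
            observations.append({"tool": action.name, "result": result})
    raise RuntimeError("no final answer within max_steps")


def scripted_model(observations: list[dict]) -> Step:
    # Stand-in for an LLM: first asks for a tool call, then answers.
    if not observations:
        return Step(
            thought="I need to look up the release.",
            actions=[Action("lookup", {"key": "python"})],
        )
    return Step(
        thought="I have what I need.",
        final_answer=str(observations[-1]["result"]),
    )


tools = {"lookup": lambda key: {"python": "3.14.x"}[key]}
answer = run_react(scripted_model, tools)
print(answer)  # 3.14.x
```

The max_steps bound is a choice made for this sketch. It guards against a model that never sets final_answer.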

Tools are serialized as text

Unlike standard tool calling, ReAct injects tool schemas into the system prompt as text descriptions rather than passing function definitions to the model's native tools parameter. This makes the loop more portable across models and providers, but changes how tools are represented internally.
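
As a rough illustration of what "serialized as text" can mean, the sketch below renders a Python function into a plain-text description suitable for a system prompt, using only inspect from the standard library. The output format, the describe_tool helper, and the web_fetch signature are all assumptions for illustration. The text msgFlux actually injects is not shown in this guide.

```python
import inspect


def web_fetch(url: str, timeout: int = 10) -> str:
    """Fetch a web page and return its text content."""
    # Placeholder body: only the signature and docstring are inspected.
    raise NotImplementedError


def describe_tool(fn) -> str:
    """Hypothetical helper: render a function as prompt-ready text."""
    signature = inspect.signature(fn)
    lines = [
        f"Tool: {fn.__name__}",
        f"Description: {inspect.getdoc(fn)}",
        "Arguments:",
    ]
    for name, param in signature.parameters.items():
        type_name = getattr(param.annotation, "__name__", str(param.annotation))
        if param.default is inspect.Parameter.empty:
            suffix = "required"
        else:
            suffix = f"default={param.default!r}"
        lines.append(f"  - {name} ({type_name}, {suffix})")
    return "\n".join(lines)


tool_text = describe_tool(web_fetch)
print(tool_text)
```

Because the result is ordinary text, it can be placed in the prompt of any chat model, including ones without a native tools parameter.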

# pip install msgflux[openai]
import msgflux as mf
import msgflux.nn as nn
from msgflux.generation.reasoning import ReAct
from msgflux.tools.builtin import WebFetch

# mf.set_envs(OPENAI_API_KEY="...")

class WebResearcher(nn.Agent):
    model = mf.Model.chat_completion("openai/gpt-4.1-mini")
    generation_schema = ReAct
    tools = [WebFetch]
    config = {"verbose": True}

agent = WebResearcher()
result = agent("What is the latest Python version from python.org?")

print(result.thought)       # "I need to fetch python.org to get the version..."
print(result.final_answer)  # "Python 3.14.x"

Default system_message

ReAct ships with a built-in system_message that instructs the model to follow the Thought → Action → Observation loop. You can inspect it with ReAct.system_message. It can be overridden by setting system_message on the agent, though that is generally not recommended — the default prompt is carefully tuned to keep the loop stable.

When to use

ReAct is the right choice when the agent needs external information to answer a question — web searches, API calls, database lookups, file reads, or any task requiring multi-turn tool interactions before an answer can be formed.


Self Consistency

Wang et al., 2022 — Self-Consistency Improves Chain of Thought Reasoning in Language Models

Self-Consistency reduces reasoning errors by generating multiple independent reasoning paths and selecting the most frequent answer through majority voting. Instead of relying on a single chain of thought, the model explores different approaches to the same problem and cross-checks its own conclusions — surface-level errors in one path get cancelled out by the others.

        Input
          ├──▶ Path 1: "Distance ÷ Time = 120 ÷ 2..."  → answer: "60 km/h"
          ├──▶ Path 2: "v = d/t, so 120/2 = ..."       → answer: "60 km/h"
          └──▶ Path 3: "Speed formula: s × t = d..."   → answer: "65 km/h"
                           Majority Vote
                         ("60 km/h" wins 2/3)
                     [ final_answer: "60 km/h" ]

Schema fields:

| Field | Type | Description |
|---|---|---|
| paths | List[ReasoningPath] | Set of multiple independent reasoning paths |
| final_answer | str | Answer chosen by majority vote across all paths |

Each ReasoningPath contains:

| Field | Type | Description |
|---|---|---|
| reasoning | str | A single chain of thought for this path |
| answer | str | The answer derived from this path |

# pip install msgflux[openai]
import msgflux as mf
import msgflux.nn as nn
from msgflux.generation.reasoning import SelfConsistency

# mf.set_envs(OPENAI_API_KEY="...")

class Solver(nn.Agent):
    model = mf.Model.chat_completion("openai/gpt-4.1-mini")
    generation_schema = SelfConsistency
    config = {"verbose": True}

agent = Solver()
result = agent("If a train travels 120km in 2 hours, what is its speed?")

for i, path in enumerate(result.paths, 1):
    print(f"Path {i}: {path.reasoning!r} -> {path.answer!r}")

print(result.final_answer)  # "60 km/h"
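
Since the model produces both the paths and the voted final_answer in one response, it can be useful to recompute the vote locally as a cross-check. The sketch below is one way to do that with collections.Counter. The majority_vote helper and its normalization (strip and lowercase) are assumptions of this sketch, not part of msgFlux.

```python
from collections import Counter


def majority_vote(answers: list[str]) -> str:
    """Hypothetical helper: return the most frequent answer."""
    # Normalize so that "60 km/h" and "60 KM/H " count as the same vote.
    normalized = [a.strip().lower() for a in answers]
    winner, _ = Counter(normalized).most_common(1)[0]
    # Return the first original spelling that matches the winning form.
    return next(a for a, n in zip(answers, normalized) if n == winner)


answers = ["60 km/h", "60 km/h", "65 km/h"]
print(majority_vote(answers))  # 60 km/h

# With a real result: majority_vote([p.answer for p in result.paths])
```

If the recomputed winner differs from result.final_answer, that is a signal to inspect the individual paths.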

When to use

Self-Consistency shines when accuracy is critical and the problem has a verifiable correct answer — math, science, logic, or any domain where multiple approaches can independently converge on the right result.

Token usage

Self-Consistency generates multiple reasoning paths in a single response, which increases output token consumption compared to CoT. The model decides how many paths to produce based on the complexity of the question.


Extending Reasoning Schemas

You can extend any reasoning schema by inheriting from it and redefining the final_answer field with a custom type.

  ┌──────────────────────┐            ┌───────────────────────┐
  │   ChainOfThought     │            │    NumericAnswer      │
  ├──────────────────────┤   ─────▶   ├───────────────────────┤
  │ reasoning: str       │            │ reasoning: str        │  ← inherited
  │ final_answer: str    │            │ final_answer: int     │  ← overridden
  └──────────────────────┘            └───────────────────────┘

  ┌──────────────────────┐            ┌─────────────────────────────────────┐
  │   ChainOfThought     │            │       ReasonedDecision              │
  ├──────────────────────┤   ─────▶   ├─────────────────────────────────────┤
  │ reasoning: str       │            │ reasoning: str          ← inherited │
  │ final_answer: str    │            │ final_answer: Decision  ← overridden│
  └──────────────────────┘            └────────────────┬────────────────────┘
                                              ┌────────┴────────┐
                                              │    Decision     │
                                              ├─────────────────┤
                                              │ approved: bool  │
                                              │confidence: float│
                                              │justification:str│
                                              └─────────────────┘

Example

# pip install msgflux[openai]
import msgflux as mf
import msgflux.nn as nn
from msgflux.generation.reasoning import ChainOfThought

# mf.set_envs(OPENAI_API_KEY="...")

class NumericAnswer(ChainOfThought):
    final_answer: int

class Calculator(nn.Agent):
    model = mf.Model.chat_completion("openai/gpt-4.1-mini")
    generation_schema = NumericAnswer
    config = {"verbose": True}

agent = Calculator()
result = agent("What is 25 + 17?")
print(result.final_answer)  # 42 (int)

# pip install msgflux[openai]
import msgflux as mf
import msgflux.nn as nn
from msgspec import Struct
from msgflux.generation.reasoning import ChainOfThought

# mf.set_envs(OPENAI_API_KEY="...")

class Decision(Struct):
    approved: bool
    confidence: float
    justification: str

class ReasonedDecision(ChainOfThought):
    final_answer: Decision

class Reviewer(nn.Agent):
    model = mf.Model.chat_completion("openai/gpt-4.1-mini")
    generation_schema = ReasonedDecision
    config = {"verbose": True}

agent = Reviewer()
result = agent("Should we approve this budget request for $5000?")
print(result.final_answer.approved)      # True/False
print(result.final_answer.confidence)    # 0.85
print(result.final_answer.justification) # "The request is within budget limits..."
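
The override mechanics can be checked without calling a model. The sketch below uses standard-library dataclasses as a stand-in for the msgspec-based schemas (the Sketch class names are hypothetical) to show that redeclaring final_answer in a subclass keeps the inherited reasoning field and its position, and changes only the type.

```python
from dataclasses import dataclass, fields
from typing import get_type_hints


@dataclass
class ChainOfThoughtSketch:
    reasoning: str
    final_answer: str


@dataclass
class NumericAnswerSketch(ChainOfThoughtSketch):
    # Same field name as the parent, narrower type.
    final_answer: int


hints = get_type_hints(NumericAnswerSketch)
field_names = [f.name for f in fields(NumericAnswerSketch)]

print(field_names)  # ['reasoning', 'final_answer']
print(hints["final_answer"])  # <class 'int'>
```

Keeping reasoning ahead of final_answer matters for the same reason as in the base schema: the steps are produced before the typed answer.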