Install

LLM calls deserve a home.

ActiveAI does for LLM calls what ActionMailer does for email — gives them a declared, testable home with provider-agnostic abstractions, a built-in agentic loop, streaming support, and full Rails instrumentation. Stop reinventing the plumbing.

Status: active development · Ruby ≥ 3.2 · Rails ≥ 7.0
Before

LLM calls don't have a home in Rails. Yet.

Without a convention, they end up wherever they fit — controllers, service objects, background jobs — each reinventing message formatting, tool dispatch, provider routing, and streaming. The same four bugs appear in every codebase.

Before — scattered, reinvented
# app/controllers/documents_controller.rb
def summarize
  client = Anthropic::Client.new
  response = client.messages.create(
    model: "claude-sonnet-4-6",
    max_tokens: 1024,
    system: "You are a writing assistant.",
    messages: [{ role: "user",
                 content: "Summarize: #{@document.body}" }]
  )
  # no streaming, no tools, no instrumentation,
  # no error handling, provider-specific API…
  render json: { result: response.content.first.text }
end

# Same thing copied into PostsController,
# CommentsController, AiReviewJob…
After — declared, testable, instrumented
# app/ai/agents/writing_agent.rb
class WritingAgent < ApplicationAgent
  tools  ActiveAI::Tools::WebSearch
  skills ToneSkill

  def initialize(document:, **kwargs)
    @document = document
    super(**kwargs)
  end
end

# Blocking call — full agentic loop
response = WritingAgent.new(
  document: @doc,
  message:  "Summarize this."
).complete

# Streaming — same agent, yields chunks
WritingAgent.new(document: @doc, message: "…")
  .stream { |e| send_to_client(e) if e.is_a?(String) }
Rails

If you know ActionMailer, you already understand this.

ActionMailer didn't make sending email easier by wrapping SMTP differently. It made it easier by giving email a place in your app. ActiveAI gives LLM calls the same thing: generators, config, conventions, and a class hierarchy the whole team can reason about.

Rails / ActionMailer                  ActiveAI
ActionMailer::Base                    ActiveAI::Agent::Base
ApplicationMailer                     ApplicationAgent
UserMailer < ApplicationMailer        WritingAgent < ApplicationAgent
SMTP delivery method                  ActiveAI::Provider::Anthropic
config/database.yml                   config/ai.yml
config.action_mailer                  ActiveAI.configure
rails g mailer User                   rails g active_ai:agent Writing
ActionMailer::Base.deliveries (test)  ActiveAI::TestHelper + stubs
ActionMailer::Base.delivery_method    provider :anthropic / :openai / :xai
mail.deliver_later                    agent.stream { |event| … }

If you've onboarded someone to ActionMailer, the same mental model transfers. The agent class is where LLM configuration lives; the instance is one conversation; the provider is the delivery mechanism.
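
The split between class-level configuration and per-conversation instances can be sketched in plain Ruby. This is a minimal illustration of the pattern, not ActiveAI's source: `SketchAgent`, its subclasses, and the `config` helper are hypothetical names, and the real DSL may store its settings differently.

```ruby
# Hypothetical sketch: class macros store configuration on the class,
# subclasses inherit it, and each instance holds one conversation.
class SketchAgent
  class << self
    # With an argument: set the provider. Without: read it, falling back
    # to the superclass so application-wide defaults are inherited.
    def provider(name = nil)
      @provider = name if name
      @provider || (superclass.provider if superclass.respond_to?(:provider))
    end

    def model(name = nil, **options)
      @model = { name: name, **options } if name
      @model || (superclass.model if superclass.respond_to?(:model))
    end
  end

  attr_reader :messages

  # One instance is one conversation.
  def initialize(message:)
    @messages = [{ role: "user", content: message }]
  end

  def config
    { provider: self.class.provider, model: self.class.model }
  end
end

class SketchApplicationAgent < SketchAgent
  provider :anthropic
  model "claude-sonnet-4-6", max_tokens: 1024
end

class SketchWritingAgent < SketchApplicationAgent
  # Inherits provider and model from SketchApplicationAgent.
end

class SketchResearchAgent < SketchApplicationAgent
  model "claude-sonnet-4-6", max_tokens: 8096 # per-class override
end
```

Because the settings live on the class, two `SketchWritingAgent` instances share the same provider and model while keeping separate message histories.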

ActiveAI vs. other Ruby AI libraries
Library          Rails generators  config/ai.yml  Agentic loop  Workflows & orchestration      ActiveSupport instrumentation
ActiveAI         Yes               Yes            Yes           Yes (Workflow + Orchestrator)  Yes (10+ events)
langchainrb      No                No             Partial       No                             No
ruby_llm         No                No             No            No                             No
Direct SDK gems  No                No             Manual        No                             No

langchainrb (~2k GitHub stars) works in any Ruby context but is not Rails-first: no generators, no config/ai.yml, no ActiveSupport::Notifications. ruby_llm is Rails-friendly for single-turn chat but has no agentic loop, workflows, or memory pipeline. Direct SDK gems (ruby-openai, anthropic) are low-level API wrappers — you write the loop, tool dispatch, and error handling yourself. ActiveAI gives you all five conventions together.

Agent

What the library handles vs. what you implement

[Diagram: Build Messages → Build System Prompt → Stream Loop → Tool Dispatch → Final Response; iterates until done]

Single agent call — library manages the loop

Library handles
  • Agentic loop: stream → tool dispatch → continue
  • Provider routing, streaming, token counting
  • complete (accumulates the stream into one response) and the run class method
  • Tool execution and error recovery inside the loop
You implement
  • Agent class inheriting ApplicationAgent
  • Class DSL: provider, model, tools, skills
  • initialize when injecting domain objects
  • instance_tools when tools need per-call context
Minimal agent class
class WritingAgent < ApplicationAgent
  provider :anthropic
  model    "claude-sonnet-4-6", max_tokens: 8096
  tools    ActiveAI::Tools::WebSearch
  skills   ToneSkill
  description "Drafts, edits, and structures written content."

  def initialize(document:, **kwargs)
    @document = document
    super(**kwargs)
  end

  def instance_tools
    return [] unless @document
    [
      WriteDocumentSectionTool.new(document: @document),
      InsertDocumentSectionTool.new(document: @document)
    ]
  end
end

# Inspect what will be sent — no API call
params = WritingAgent.new(document: @doc, message: "Rewrite section 3.").to_canonical_params
params[:system]    # => resolved system prompt with injected skills
params[:tools]     # => [{ name: "web_search", … }, { name: "write_document_section", … }]
Loop

Stream → tool call → continue → repeat until done

Every agent call runs the same cycle. The provider streams back either text chunks or tool calls. Tool calls are executed immediately and their results sent back. The loop continues until the provider returns a final text response with no pending tool calls.

[Diagram: Request → Provider Stream → Tool Call → Execute → Continue → Final Response. Edge labels: call, stream, tool calls, results, continue, text only. MAX_TOOL_ITERATIONS = 20; raises ToolLoopError if exceeded.]

Agentic loop. Blue = LLM steps. Amber = tool execution.
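
The cycle described above can be sketched in a few lines of plain Ruby. This is an illustration under stated assumptions, not ActiveAI's implementation: `ScriptedProvider`, `run_loop`, and `SketchToolLoopError` are hypothetical names, and the provider is a scripted stand-in that makes no network calls.

```ruby
# Hypothetical sketch of stream -> tool call -> continue -> repeat.
class SketchToolLoopError < StandardError; end

# Scripted stand-in for a provider: each call to `stream` yields the events
# of the next scripted turn (Strings are text, Hashes are tool calls).
class ScriptedProvider
  def initialize(turns)
    @turns = turns.dup
  end

  def stream(_messages)
    (@turns.shift || []).each { |event| yield event }
  end
end

def run_loop(provider, tools, message, max_iterations: 20)
  messages = [{ role: "user", content: message }]
  text = +""

  max_iterations.times do
    tool_calls = []
    provider.stream(messages) do |event|
      case event
      when String then text << event                   # text chunk
      when Hash   then tool_calls << event[:tool_call] # pending tool call
      end
    end

    return text if tool_calls.empty? # final response: nothing left to run

    # Execute each tool and send its result back for the next turn.
    tool_calls.each do |call|
      result = tools.fetch(call[:name]).call(call[:input])
      messages << { role: "tool", tool_call_id: call[:id], content: result }
    end
  end

  raise SketchToolLoopError, "exceeded #{max_iterations} tool iterations"
end

provider = ScriptedProvider.new([
  [{ tool_call: { id: "t1", name: "add", input: { a: 2, b: 3 } } }],
  ["The sum ", "is 5."]
])
tools = { "add" => ->(input) { (input[:a] + input[:b]).to_s } }

run_loop(provider, tools, "What is 2 + 3?") # => "The sum is 5."
```

The iteration cap is what turns a model that never stops requesting tools into an explicit error instead of an unbounded loop.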

Stream events
agent.stream do |event|
  case event
  when String
    # text chunk — write to SSE client
    send_to_client(event)
  when Hash
    # { tool_call: { id:, name:, input: } }
    # dispatched automatically
  end
end
Loop guard & token usage
class ResearchAgent < ApplicationAgent
  # Override per agent when needed
  MAX_TOOL_ITERATIONS = 40
end

agent.last_usage
# => {
#      input_tokens:      412,
#      output_tokens:      89,
#      cache_read_tokens: 100,
#      stop_reason:   "end_turn"
#    }
Route

Fixed rails or dynamic dispatch — pick the right tool

Two routing modes. A Workflow coordinates agents in a fixed sequence you define — deterministic, predictable, easy to test. An Orchestrator lets the LLM decide which agent or workflow to call at runtime based on the input.

[Diagram, left: Workflow (deterministic · step() chain): input → ResearchAgent (step 1) → DraftingAgent (step 2) → ReviewAgent (step 3). Diagram, right: Orchestrator (LLM-routed · dynamic dispatch): input → LLM Orchestrator, which routes via meta-tools and picks at runtime among Writing Agent, Research Agent, and Publish Workflow.]

Left: step() chain — you control the order. Right: LLM picks the target at runtime.

Workflow — fixed sequence
class ResearchAndDraftWorkflow < ApplicationWorkflow
  def run(input)
    research = step(ResearchAgent, message: input)
    draft    = step(DraftingAgent,
                   message: "Write based on:\n#{research}")
    step(ReviewAgent,
         message: "Review and improve:\n#{draft}")
  end
end

# Parallel steps — concurrent, ordered results
parallel_step(
  [ResearchAgent,  { message: "Find AI news." }],
  [FactCheckAgent, { message: "Verify: #{claims}" }]
)
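
One way to get "concurrent, ordered results" is to start a thread per step and collect the values in declaration order. The sketch below assumes that approach; `parallel_sketch` and the two lambdas are hypothetical stand-ins, and ActiveAI's parallel_step may be implemented differently.

```ruby
# Hypothetical sketch: run steps concurrently, return results in the order
# the steps were declared.
def parallel_sketch(*steps)
  threads = steps.map do |callable, kwargs|
    Thread.new { callable.call(**kwargs) }
  end
  # Thread#value joins the thread and returns its block's result, so mapping
  # over the threads in order preserves the declared order. It also
  # re-raises any exception the thread hit.
  threads.map(&:value)
end

slow = ->(message:) { sleep 0.05; "slow: #{message}" }
fast = ->(message:) { "fast: #{message}" }

results = parallel_sketch(
  [slow, { message: "Find AI news." }],
  [fast, { message: "Verify claims." }]
)
# results => ["slow: Find AI news.", "fast: Verify claims."]
```

The slower step still comes first in the result because ordering follows the argument list, not completion time.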
Orchestrator — LLM routes
class WritingOrchestrator < ApplicationOrchestrator
  # Agents register via `description`
  # Orchestrator sees them as meta-tools
  # and picks the right one at runtime
end

class WritingAgent < ApplicationAgent
  description "Drafts and edits written content."
  # ^ registers in ActiveAI.registry
end

result = WritingOrchestrator.new(
  message: "Edit the intro for clarity."
).complete
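
The idea of agents appearing to the orchestrator as meta-tools can be illustrated without any LLM in the loop. In this sketch the registry, the `call_` prefix, and the input schema are all assumptions made for illustration; the routing decision itself would come from the model and is not simulated here.

```ruby
# Hypothetical sketch: registered agent descriptions become tool definitions,
# and a chosen tool call is dispatched back to the registered agent.
SKETCH_REGISTRY = {}

def register_agent(name, description, &handler)
  SKETCH_REGISTRY[name] = { description: description, handler: handler }
end

# Each registered agent becomes one meta-tool definition.
def meta_tools
  SKETCH_REGISTRY.map do |name, entry|
    {
      name: "call_#{name}",
      description: entry[:description],
      input_schema: {
        type: "object",
        properties: { message: { type: "string" } },
        required: ["message"]
      }
    }
  end
end

# Route a tool call picked by the model to the matching agent.
def dispatch_meta_tool(tool_name, input)
  name = tool_name.delete_prefix("call_")
  SKETCH_REGISTRY.fetch(name)[:handler].call(input[:message])
end

register_agent("writing_agent", "Drafts and edits written content.") do |message|
  "WritingAgent handled: #{message}"
end

meta_tools.map { |tool| tool[:name] } # => ["call_writing_agent"]
dispatch_meta_tool("call_writing_agent", { message: "Edit the intro." })
# => "WritingAgent handled: Edit the intro."
```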
Memory

Behavior injection and cross-session recall

The memory pipeline is a four-stage background processing system that converts completed agent conversations into searchable, tiered memory:
  • PersistJob summarizes each session immediately after a turn.
  • EmbedJob generates vector embeddings for semantic recall.
  • TierJob ages memories from warm to cold storage on a daily schedule.
  • ConsolidateJob detects relationships between memory pairs on a weekly schedule.
Agents opt in per class using recall_memory and receive relevant memories as prepended system-prompt context.

Skills — behavioral instruction blocks

Reusable instruction blocks injected into an agent's system prompt. Define once, attach to any agent.

class ToneSkill < ApplicationSkill
  skill_name "tone_guidelines"

  def content
    # Squiggly heredoc strips the leading indentation from each line
    <<~TEXT
      Always use active voice. Be concise.
      Prefer concrete examples over abstractions.
      Do not use marketing language.
    TEXT
  end
end

class WritingAgent < ApplicationAgent
  skills ToneSkill
  skills [BlogStructureSkill, CitationStyleSkill]
  # or inline string skills
end
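
To make "injected into an agent's system prompt" concrete, here is a minimal sketch of how skill blocks and inline strings could be joined onto a base prompt. `SketchSkill`, `build_system_prompt`, and the `## name` heading layout are assumptions for illustration; the prompt ActiveAI actually produces may be laid out differently.

```ruby
# Hypothetical sketch of skill injection into a system prompt.
SketchSkill = Struct.new(:name, :content)

def build_system_prompt(base, skills)
  blocks = skills.map do |skill|
    # Accept either a skill object or an inline string skill.
    skill.is_a?(String) ? skill : "## #{skill.name}\n#{skill.content}"
  end
  ([base] + blocks).join("\n\n")
end

tone = SketchSkill.new("tone_guidelines", "Always use active voice. Be concise.")
puts build_system_prompt("You are a writing assistant.", [tone, "Cite sources inline."])
```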

Memory recall — opt-in per agent

Past conversations distilled into a compact block and prepended to the system prompt. Three strategies: :warm, :cold, :hybrid.

class WritingAgent < ApplicationAgent
  recall_memory strategy: :warm, token_budget: 600

  private

  def memory_recall_context
    # scope memories to current document
    { subject: @document }
  end
end

# Injected context looks like:
# "Historical context (soft signal):
#  Decisions: Use short paragraphs; active voice
#  Open threads: Title still undecided"

[Diagram: Conversation ends → PersistJob (summarize session, after each turn) → EmbedJob (generate vector, auto-enqueued) → TierJob (warm → cold, daily, scheduled) → ConsolidateJob (weekly, scheduled). Blue = event-driven. Amber = scheduled maintenance.]

Stages 3–4 run on a configurable schedule.

Speculative: ConsolidateJob (relationship detection between memory pairs) is architecturally complete but described in the docs as under active development. PersistJob and EmbedJob are further along. Production memory requires replacing the default zero-vector stub with a real embedding provider.
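
Why a zero-vector stub cannot support semantic recall follows from how similarity ranking works. The sketch below uses cosine similarity, a common choice for embedding search; whether ActiveAI uses cosine or another metric is an assumption here, and `cosine_similarity` and `rank` are illustrative helpers, not library methods.

```ruby
# Sketch of cosine-similarity ranking (standard math, not ActiveAI code).
def cosine_similarity(a, b)
  dot    = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero? # a zero vector has no direction
  dot / (norm_a * norm_b)
end

# Sort memory texts by similarity to the query vector, best match first.
def rank(query, memories)
  memories.sort_by { |_text, vector| -cosine_similarity(query, vector) }.map(&:first)
end

real = { "prefers short paragraphs" => [0.9, 0.1], "title undecided" => [0.1, 0.9] }
stub = real.keys.to_h { |text| [text, [0.0, 0.0]] }

rank([1.0, 0.0], real)
# => ["prefers short paragraphs", "title undecided"]
stub.values.map { |vector| cosine_similarity([1.0, 0.0], vector) }
# => [0.0, 0.0]
```

With real embeddings the query separates the two memories; with zero vectors every memory scores the same, so the ranking carries no information.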

Observe

10+ ActiveSupport::Notifications events, zero configuration

Every agent turn, tool dispatch, workflow step, skill resolution, and orchestrator route fires an event. The flight recorder is always on. Attach subscribers only when you need them.

Subscribe in a Rails initializer
# config/initializers/active_ai_instrumentation.rb

ActiveSupport::Notifications.subscribe("agent_complete.active_ai") do |event|
  AgentLog.create!(
    agent:         event.payload[:agent_class],
    model:         event.payload[:model],
    input_tokens:  event.payload.dig(:usage, :input_tokens),
    output_tokens: event.payload.dig(:usage, :output_tokens),
    duration_ms:   event.duration.round
  )
end

# Caller context propagates automatically through nested calls:
# workflow_step.active_ai  caller_type: :workflow
# agent_complete.active_ai caller_type: :workflow
# tool_call.active_ai      caller_type: :agent
ActiveAI::Instrumentation.current_caller
# => { type: :agent, name: "WritingAgent" }
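
Nested caller context of this kind is often tracked with a per-execution stack. The sketch below shows that general technique only: `SketchCallerContext` and `with_caller` are hypothetical, and nothing here is taken from ActiveAI::Instrumentation's internals.

```ruby
# Hypothetical sketch of nested caller tracking with a stack.
module SketchCallerContext
  KEY = :sketch_caller_stack

  # Push a caller for the duration of the block, then pop it even if the
  # block raises. Thread.current[] storage is local to the current fiber.
  def self.with_caller(type:, name:)
    stack = (Thread.current[KEY] ||= [])
    stack.push({ type: type, name: name })
    yield
  ensure
    stack&.pop
  end

  # The innermost active caller, or nil outside any with_caller block.
  def self.current
    (Thread.current[KEY] || []).last
  end
end

SketchCallerContext.with_caller(type: :workflow, name: "ResearchAndDraftWorkflow") do
  SketchCallerContext.with_caller(type: :agent, name: "WritingAgent") do
    SketchCallerContext.current # => { type: :agent, name: "WritingAgent" }
  end
  SketchCallerContext.current   # => { type: :workflow, name: "ResearchAndDraftWorkflow" }
end
```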
All 10 events
Event                             Description
agent_complete.active_ai          full agent turn incl. tool iterations
agent_stream.active_ai            raw stream loop timing
tool_call.active_ai               each tool invocation inside the loop
skill_resolve.active_ai           each skill content resolved
workflow_run.active_ai            Workflow.run(input)
workflow_step.active_ai           each step() call within a workflow
workflow_parallel_step.active_ai  entire parallel_step batch
orchestrator_route.active_ai      routing decision + dispatched targets
orchestrator_dispatch.active_ai   each meta-tool invocation
memory_persist.active_ai          PersistJob + EmbedJob completion
Providers

Provider-agnostic framework, custom tool DSL

The framework is provider-agnostic at the routing level. Switching providers is one line in the agent DSL. Whether prompts translate cleanly across models is a separate concern.

Anthropic
Claude family. Native streaming, tool use. Gem: anthropic
OpenAI
GPT family. Same streaming contract. Gem: ruby-openai
xAI
Grok models. OpenAI-compatible API. Gem: ruby-openai
Switching is one line
class WritingAgent < ApplicationAgent
  provider :anthropic   # → change to :openai or :xai — one line
  model    "claude-sonnet-4-6"   # when switching, pick a model the new provider serves
end

# Custom provider — implement the adapter interface
class MyProvider < ActiveAI::Provider::Base
  def stream(params, &block)
    # …
  end
end
ActiveAI.register_provider(:my_provider, MyProvider)
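
The adapter-plus-registry pattern behind provider switching can be sketched in plain Ruby. Everything named here (`SketchProviders`, `EchoProvider`, and the assumption that `stream` yields text chunks as Strings) is illustrative; the actual ActiveAI::Provider::Base contract is not spelled out in this page.

```ruby
# Hypothetical sketch of a provider registry and a trivial adapter.
module SketchProviders
  REGISTRY = {}

  def self.register(name, klass)
    REGISTRY[name.to_sym] = klass
  end

  def self.resolve(name)
    REGISTRY.fetch(name.to_sym) { raise ArgumentError, "unknown provider: #{name}" }
  end
end

# Assumed adapter contract: `stream(params)` yields text chunks as Strings.
class EchoProvider
  def stream(params)
    params[:messages].last[:content].split.each { |word| yield "#{word} " }
  end
end

SketchProviders.register(:echo, EchoProvider)

chunks = []
params = { messages: [{ role: "user", content: "hello there" }] }
SketchProviders.resolve(:echo).new.stream(params) { |chunk| chunks << chunk }
# chunks => ["hello ", "there "]
```

Because callers only see `stream`, swapping the registered class changes the backend without touching the agent code that consumes the chunks.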
Built-in tools
WebSearch
Searches the web via configured provider. Requires search_provider in initializer.
WebPageReader
Fetches and extracts text from a URL. No configuration required.
Generate

Rails-native scaffolding for every component

Every piece of the AI layer has a generator. Run rails g active_ai:install once; then scaffold agents, tools, skills, workflows, orchestrators, and prompts as the app grows.

Install — run once
rails generate active_ai:install
# → app/ai/agents/application_agent.rb
# → app/ai/tools/application_tool.rb
# → app/ai/skills/application_skill.rb
# → app/ai/workflows/application_workflow.rb
# → app/ai/orchestrators/application_orchestrator.rb
# → config/ai.yml
# → config/initializers/active_ai.rb
Agent + Tool + Skill
rails generate active_ai:agent Writing
# → app/ai/agents/writing_agent.rb

rails generate active_ai:tool PriceCheck
# → app/ai/tools/price_check_tool.rb
# → test/ai/tools/price_check_tool_test.rb

rails generate active_ai:skill Tone
# → app/ai/skills/tone_skill.rb
# → test/ai/skills/tone_skill_test.rb
Workflow + Orchestrator + Prompt
rails generate active_ai:workflow ResearchAndDraft
# → app/ai/workflows/research_and_draft_workflow.rb

rails generate active_ai:orchestrator Writing
# → app/ai/orchestrators/writing_orchestrator.rb

rails generate active_ai:prompt agent writing
# → app/ai/agents/prompts/writing.md.erb
Memory system
rails generate active_ai:memory:install
rails db:migrate
# → 4 migrations (memories, embeddings, correlations, flags)
# → 3 memory agents
# → 4 background jobs

# With pgvector for semantic recall:
rails generate active_ai:memory:install --vector=pgvector

Each generator declares its namespace explicitly (for example, namespace "active_ai:agent") to work around Rails converting ActiveAI → active_a_i in path lookups. This is handled internally; you run the generators as documented.

Origin

Extracted from real production use, not designed in the abstract

ActiveAI was built in and extracted from writer-v3, a personal writing application built around a human-first philosophy: the writer owns the work; AI assists with editing, suggests improvements, and explains its reasoning, but never drives.

The writer-v3 premise is that AI is most useful when it makes a writer better, not when it writes for them. That means the AI's role is to surface what isn't working, propose specific changes, and explain why — so the writer learns from every session rather than just accepting output.

The patterns here — streaming responses, skills as behavioral constraints, memory that accumulates understanding over sessions, orchestrators that route to the right specialist — all emerged from real use in a writing context and were generalized as they proved their value. These are not theoretical patterns. They are the answer to "what does it actually take to run LLM calls in a Rails app over weeks of use?"

"The writer owns the work; AI assists with editing, suggests improvements, and explains its reasoning — but never drives."

writer-v3 design premise
Start

Three steps to your first agent

1. Add to Gemfile
gem "active_ai", github: "revans/active_ai"
2. Install
$ bundle install
$ rails generate active_ai:install
3. Generate your first agent
$ rails generate active_ai:agent Writing
Active development
APIs, DSLs, generator output, and memory pipeline internals are subject to breaking changes between versions. Production deployment is at your own risk.