ActiveAI does for LLM calls what ActionMailer does for email — gives them a declared, testable home with provider-agnostic abstractions, a built-in agentic loop, streaming support, and full Rails instrumentation. Stop reinventing the plumbing.
Without a convention, LLM calls end up wherever they fit (controllers, service objects, background jobs), and each call site reinvents message formatting, tool dispatch, provider routing, and streaming. The same four bugs appear in every codebase.
# app/controllers/documents_controller.rb
def summarize
  client = Anthropic::Client.new
  response = client.messages.create(
    model: "claude-sonnet-4-6",
    max_tokens: 1024,
    system: "You are a writing assistant.",
    messages: [{ role: "user",
                 content: "Summarize: #{@document.body}" }]
  )
  # no streaming, no tools, no instrumentation,
  # no error handling, provider-specific API…
  render json: { result: response.content.first.text }
end
# Same thing copied into PostsController,
# CommentsController, AiReviewJob…
# app/ai/agents/writing_agent.rb
class WritingAgent < ApplicationAgent
  tools ActiveAI::Tools::WebSearch
  skills ToneSkill

  def initialize(document:, **kwargs)
    @document = document
    super(**kwargs)
  end
end

# Blocking call: full agentic loop
response = WritingAgent.new(
  document: @doc,
  message: "Summarize this."
).complete

# Streaming: same agent, yields chunks
WritingAgent.new(document: @doc, message: "…")
  .stream { |e| send_to_client(e) if e.is_a?(String) }
ActionMailer didn't make sending email easier by wrapping SMTP differently. It made it easier by giving email a place in your app. ActiveAI gives LLM calls the same thing: generators, config, conventions, and a class hierarchy the whole team can reason about.
| Rails / ActionMailer | ActiveAI |
|---|---|
| ActionMailer::Base | ActiveAI::Agent::Base |
| ApplicationMailer | ApplicationAgent |
| UserMailer < ApplicationMailer | WritingAgent < ApplicationAgent |
| SMTP delivery method | ActiveAI::Provider::Anthropic |
| config/database.yml | config/ai.yml |
| config.action_mailer | ActiveAI.configure |
| rails g mailer User | rails g active_ai:agent Writing |
| ActionMailer::Base.deliveries (test) | ActiveAI::TestHelper + stubs |
| ActionMailer::Base.delivery_method | provider :anthropic / :openai / :xai |
| mail.deliver_later | agent.stream { \|event\| … } |
If you've onboarded someone to ActionMailer, the same mental model transfers. The agent class is where LLM configuration lives; the instance is one conversation; the provider is the delivery mechanism.
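To make the class/instance split concrete, here is a minimal plain-Ruby sketch of how class-level declarations could be stored and inherited while each instance holds one conversation. The `Sketch*` classes, their `provider` and `model` macros, and the model name "some-other-model" are hypothetical stand-ins written for illustration; they are not ActiveAI's implementation.

```ruby
# Hypothetical sketch: configuration lives on the class,
# one conversation lives on each instance.
class SketchAgent
  class << self
    # With an argument this acts as a setter; without one it reads the
    # value, falling back to the parent class when nothing is set here.
    def provider(name = nil)
      @provider = name if name
      @provider || (superclass.provider if superclass.respond_to?(:provider))
    end

    def model(name = nil)
      @model = name if name
      @model || (superclass.model if superclass.respond_to?(:model))
    end
  end

  attr_reader :messages

  def initialize(message:)
    @messages = [{ role: "user", content: message }]
  end
end

class SketchApplicationAgent < SketchAgent
  provider :anthropic
  model "claude-sonnet-4-6"
end

class SketchWritingAgent < SketchApplicationAgent
  model "some-other-model" # hypothetical name; overrides only the model
end

a = SketchWritingAgent.new(message: "Summarize this.")
b = SketchWritingAgent.new(message: "Rewrite section 3.")
```

In this sketch `SketchWritingAgent.provider` resolves to `:anthropic` through inheritance, while `a` and `b` carry separate message histories.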
| Library | Rails generators | config/ai.yml | Agentic loop | Workflows & orchestration | ActiveSupport instrumentation |
|---|---|---|---|---|---|
| ActiveAI | Yes | Yes | Yes | Yes (Workflow + Orchestrator) | Yes (10+ events) |
| langchainrb | No | No | Partial | No | No |
| ruby_llm | No | No | No | No | No |
| Direct SDK gems | No | No | Manual | No | No |
langchainrb (~2k GitHub stars) works in any Ruby context but is not Rails-first: no generators, no config/ai.yml, no ActiveSupport::Notifications. ruby_llm is Rails-friendly for single-turn chat but has no agentic loop, workflows, or memory pipeline. Direct SDK gems (ruby-openai, anthropic) are low-level API wrappers — you write the loop, tool dispatch, and error handling yourself. ActiveAI gives you all five conventions together.
Single agent call — library manages the loop
agentic loop: stream → tool dispatch → continue
complete accumulation, run class method
ApplicationAgent
provider, model, tools, skills
initialize when injecting domain objects
instance_tools when tools need per-call context

class WritingAgent < ApplicationAgent
  provider :anthropic
  model "claude-sonnet-4-6", max_tokens: 8096
  tools ActiveAI::Tools::WebSearch
  skills ToneSkill
  description "Drafts, edits, and structures written content."

  def initialize(document:, **kwargs)
    @document = document
    super(**kwargs)
  end

  def instance_tools
    return [] unless @document
    [
      WriteDocumentSectionTool.new(document: @document),
      InsertDocumentSectionTool.new(document: @document)
    ]
  end
end
# Inspect what will be sent — no API call
params = WritingAgent.new(document: @doc, message: "Rewrite section 3.").to_canonical_params
params[:system] # => resolved system prompt with injected skills
params[:tools] # => [{ name: "web_search", … }, { name: "write_document_section", … }]
Every agent call runs the same cycle. The provider streams back either text chunks or tool calls. Tool calls are executed immediately and their results sent back. The loop continues until the provider returns a final text response with no pending tool calls.
Agentic loop diagram: LLM steps alternate with tool execution.
agent.stream do |event|
  case event
  when String
    # text chunk: write to SSE client
    send_to_client(event)
  when Hash
    # { tool_call: { id:, name:, input: } }
    # dispatched automatically
  end
end

class ResearchAgent < ApplicationAgent
  # Override per agent when needed
  MAX_TOOL_ITERATIONS = 40
end
agent.last_usage
# => {
# input_tokens: 412,
# output_tokens: 89,
# cache_read_tokens: 100,
# stop_reason: "end_turn"
# }
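The cycle described above (stream, dispatch tools, continue until no tool calls remain) can be sketched in plain Ruby. This is a minimal illustration under stated assumptions: `FakeProvider`, `run_loop`, the `"tool"` message role, and the default cap of 10 iterations are all hypothetical, and the provider replays scripted turns instead of calling an LLM.

```ruby
# Hypothetical sketch of an agentic loop. FakeProvider replays scripted
# turns; a real provider would call an LLM API instead.
class FakeProvider
  def initialize(turns)
    @turns = turns.dup
  end

  # Yields each event of the next scripted turn: a String is a text
  # chunk, a Hash is a tool call.
  def stream(_messages)
    @turns.shift.each { |event| yield event }
  end
end

# max_iterations is an assumed safety cap so a model that keeps asking
# for tools cannot loop forever.
def run_loop(provider, tools, messages, max_iterations: 10)
  text = +""
  max_iterations.times do
    tool_calls = []
    provider.stream(messages) do |event|
      case event
      when String then text << event
      when Hash   then tool_calls << event[:tool_call]
      end
    end
    return text if tool_calls.empty? # final answer: no pending tool calls

    # Execute each tool and append its result for the next turn.
    tool_calls.each do |call|
      result = tools.fetch(call[:name]).call(**call[:input])
      messages << { role: "tool", tool_call_id: call[:id], content: result }
    end
  end
  raise "tool iteration limit reached"
end

provider = FakeProvider.new([
  [{ tool_call: { id: "t1", name: "add", input: { a: 2, b: 3 } } }],
  ["The sum ", "is 5."]
])
tools = { "add" => ->(a:, b:) { (a + b).to_s } }
messages = [{ role: "user", content: "What is 2 + 3?" }]
answer = run_loop(provider, tools, messages)
```

The first scripted turn requests a tool, so the loop runs it and continues; the second turn is text only, so the loop returns the accumulated string.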
Two routing modes. A Workflow coordinates agents in a fixed sequence you define — deterministic, predictable, easy to test. An Orchestrator lets the LLM decide which agent or workflow to call at runtime based on the input.
Left: step() chain — you control the order. Right: LLM picks the target at runtime.
class ResearchAndDraftWorkflow < ApplicationWorkflow
  def run(input)
    research = step(ResearchAgent, message: input)
    draft = step(DraftingAgent,
                 message: "Write based on:\n#{research}")
    step(ReviewAgent,
         message: "Review and improve:\n#{draft}")
  end
end

# Parallel steps: concurrent, ordered results
parallel_step(
  [ResearchAgent, { message: "Find AI news." }],
  [FactCheckAgent, { message: "Verify: #{claims}" }]
)
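As a rough illustration of what "concurrent, ordered results" can mean, the sketch below runs steps in threads and collects values in the order the steps were given. `SketchWorkflow` and the lambda "agents" are hypothetical; how ActiveAI actually schedules parallel steps is not shown in the text above.

```ruby
# Hypothetical sketch: sequential steps and a parallel fan-out.
# Agents here are plain callables; real agents would call an LLM.
class SketchWorkflow
  # Runs one agent and returns its output.
  def step(agent, message:)
    agent.call(message)
  end

  # Runs each [agent, { message: }] pair in its own thread. Collecting
  # Thread#value in the original order keeps results ordered by input
  # position, regardless of which thread finishes first.
  def parallel_step(*pairs)
    pairs
      .map { |agent, opts| Thread.new { step(agent, **opts) } }
      .map(&:value)
  end
end

research = ->(msg) { "notes on #{msg}" }
draft    = ->(msg) { "draft from [#{msg}]" }
slow     = ->(msg) { sleep 0.05; "slow:#{msg}" }
fast     = ->(msg) { "fast:#{msg}" }

wf = SketchWorkflow.new
chained  = wf.step(draft, message: wf.step(research, message: "AI news"))
parallel = wf.parallel_step([slow, { message: "a" }], [fast, { message: "b" }])
```

Even though the `fast` step finishes first, `parallel` keeps the slow result in first position because values are read in submission order.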
class WritingOrchestrator < ApplicationOrchestrator
  # Agents register via `description`
  # Orchestrator sees them as meta-tools
  # and picks the right one at runtime
end

class WritingAgent < ApplicationAgent
  description "Drafts and edits written content."
  # ^ registers in ActiveAI.registry
end

result = WritingOrchestrator.new(
  message: "Edit the intro for clarity."
).complete
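The registration-and-routing idea can be sketched without an LLM. In the example below, `SketchRegistry`, `choose_tool`, and the research agent's description are hypothetical, and a simple word-overlap scorer stands in for the model's choice; a real orchestrator would ask the model to pick from the tool definitions.

```ruby
# Hypothetical sketch: agents register a description; a router picks one.
class SketchRegistry
  Entry = Struct.new(:name, :description, :handler)

  def initialize
    @entries = {}
  end

  def register(name, description, &handler)
    @entries[name] = Entry.new(name, description, handler)
  end

  # What an orchestrator could present to the model as meta-tools.
  def tool_definitions
    @entries.values.map { |e| { name: e.name, description: e.description } }
  end

  def dispatch(name, message)
    @entries.fetch(name).handler.call(message)
  end
end

# Stand-in for the LLM's decision: pick the description that shares the
# most words with the message.
def choose_tool(definitions, message)
  words = message.downcase.scan(/\w+/)
  best = definitions.max_by { |d| (d[:description].downcase.scan(/\w+/) & words).size }
  best[:name]
end

registry = SketchRegistry.new
registry.register("writing_agent", "Drafts and edits written content.") { |m| "edited: #{m}" }
registry.register("research_agent", "Finds and summarizes sources.") { |m| "sources for: #{m}" }

message = "Please edit this draft content."
target = choose_tool(registry.tool_definitions, message)
result = registry.dispatch(target, message)
```

The design point is that the router only ever sees names and descriptions, which is why a clear `description` matters for routing quality.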
The memory pipeline is a four-stage background processing system that converts completed agent conversations into searchable, tiered memory: PersistJob summarizes each session immediately after a turn, EmbedJob generates vector embeddings for semantic recall, TierJob ages memories from warm to cold storage on a daily schedule, and ConsolidateJob detects relationships between memory pairs on a weekly schedule. Agents opt in per class using recall_memory and receive relevant memories as prepended system-prompt context.
Reusable instruction blocks injected into an agent's system prompt. Define once, attach to any agent.
class ToneSkill < ApplicationSkill
  skill_name "tone_guidelines"

  def content
    <<~TEXT
      Always use active voice. Be concise.
      Prefer concrete examples over abstractions.
      Do not use marketing language.
    TEXT
  end
end

class WritingAgent < ApplicationAgent
  skills ToneSkill
  skills [BlogStructureSkill, CitationStyleSkill]
  # or inline string skills
end
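One way to picture skill injection is as string assembly: each skill contributes a named block that is appended to the base system prompt. The sketch below is an assumption about the general shape, not ActiveAI's resolver; `SketchSkill`, `build_system_prompt`, and the `## name` block format are hypothetical.

```ruby
# Hypothetical sketch: skills as named instruction blocks appended to a
# base system prompt.
class SketchSkill
  def self.skill_name(name = nil)
    @skill_name = name if name
    @skill_name
  end

  def content
    raise NotImplementedError, "#{self.class} must implement #content"
  end
end

class SketchToneSkill < SketchSkill
  skill_name "tone_guidelines"

  def content
    "Always use active voice. Be concise."
  end
end

# Accepts skill classes or inline strings, mirroring the two forms
# mentioned above.
def build_system_prompt(base, skills)
  blocks = skills.map do |skill|
    if skill.is_a?(String)
      skill
    else
      "## #{skill.skill_name}\n#{skill.new.content}"
    end
  end
  ([base] + blocks).join("\n\n")
end

prompt = build_system_prompt(
  "You are a writing assistant.",
  [SketchToneSkill, "Cite sources inline."]
)
```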
Past conversations distilled into a compact block and prepended to the system prompt.
Three strategies: :warm, :cold, :hybrid.
class WritingAgent < ApplicationAgent
  recall_memory strategy: :warm, token_budget: 600

  private

  def memory_recall_context
    # scope memories to current document
    { subject: @document }
  end
end
# Injected context looks like:
# "Historical context (soft signal):
# Decisions: Use short paragraphs; active voice
# Open threads: Title still undecided"
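A `token_budget` implies some selection step that decides which memories fit. The sketch below shows one plausible greedy approach: take memories in relevance order and skip any that would exceed the budget. `SketchMemory`, the relevance scores, and the four-characters-per-token estimate are assumptions for illustration, not ActiveAI's recall algorithm.

```ruby
# Hypothetical sketch: pack the most relevant memories into a token budget.
SketchMemory = Struct.new(:text, :score)

# Crude token estimate (about 4 characters per token). This is an
# assumption, not a real tokenizer.
def estimate_tokens(text)
  (text.length / 4.0).ceil
end

def recall_block(memories, token_budget:)
  used = 0
  # Highest score first; skip anything that would overflow the budget.
  picked = memories.sort_by { |m| -m.score }.select do |m|
    cost = estimate_tokens(m.text)
    next false if used + cost > token_budget
    used += cost
    true
  end
  return nil if picked.empty?
  "Historical context (soft signal):\n" + picked.map { |m| "- #{m.text}" }.join("\n")
end

memories = [
  SketchMemory.new("Use short paragraphs; active voice", 0.9),
  SketchMemory.new("Title still undecided", 0.7),
  SketchMemory.new("Long digression about formatting " * 5, 0.8) # too large for the budget
]
block = recall_block(memories, token_budget: 20)
```

With a budget of 20 estimated tokens, the long memory is skipped despite its higher score, and the two short ones are kept.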
Memory pipeline diagram: the pipeline starts when a conversation ends. Stages 3–4 run on a configurable schedule.
Speculative: ConsolidateJob (relationship detection between memory pairs) is architecturally complete, but the docs describe it as under active development. PersistJob and EmbedJob are further along. Production memory requires a real embedding provider in place of the default zero-vector stub.
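A short sketch of why a zero-vector stub cannot support semantic recall, assuming recall ranks memories by cosine similarity (an assumption; the similarity measure is not stated above). When either vector is all zeros the denominator is zero, so there is no meaningful score to rank by.

```ruby
# Sketch: cosine similarity between two equal-length vectors.
# Returns nil when either vector has zero length, because the
# similarity is undefined in that case.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  return nil if norm.zero?
  dot / norm
end

query = [0.6, 0.8]
real  = cosine_similarity(query, [0.6, 0.8]) # identical direction, close to 1.0
stub  = cosine_similarity(query, [0.0, 0.0]) # => nil
```

Every memory embedded by a zero-vector stub would hit the `nil` branch, so all memories look equally (un)related to any query.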
Every agent turn, tool dispatch, workflow step, skill resolution, and orchestrator route fires an event. The flight recorder is always on. Attach subscribers only when you need them.
Subscribe in a Rails initializer
# config/initializers/active_ai_instrumentation.rb
ActiveSupport::Notifications.subscribe("agent_complete.active_ai") do |event|
  AgentLog.create!(
    agent: event.payload[:agent_class],
    model: event.payload[:model],
    input_tokens: event.payload.dig(:usage, :input_tokens),
    output_tokens: event.payload.dig(:usage, :output_tokens),
    duration_ms: event.duration.round
  )
end
# Caller context propagates automatically through nested calls:
# workflow_step.active_ai caller_type: :workflow
# agent_complete.active_ai caller_type: :workflow
# tool_call.active_ai caller_type: :agent
ActiveAI::Instrumentation.current_caller
# => { type: :agent, name: "WritingAgent" }
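Caller-context propagation of this kind is commonly built on a stack kept in thread-local storage. The sketch below shows that general technique in plain Ruby; `SketchInstrumentation` and `with_caller` are hypothetical names, and nothing here is taken from ActiveAI's internals.

```ruby
# Hypothetical sketch: propagate caller context through nested calls.
# The stack is stored with Thread.current[], which Ruby scopes to the
# current fiber, so concurrent requests do not see each other's callers.
module SketchInstrumentation
  KEY = :sketch_caller_stack

  def self.stack
    Thread.current[KEY] ||= []
  end

  # The innermost active caller, or nil outside any call.
  def self.current_caller
    stack.last
  end

  # Pushes a caller for the duration of the block; `ensure` pops it
  # even if the block raises.
  def self.with_caller(type:, name:)
    stack.push({ type: type, name: name })
    yield
  ensure
    stack.pop
  end
end

seen = []
SketchInstrumentation.with_caller(type: :workflow, name: "ResearchAndDraftWorkflow") do
  seen << SketchInstrumentation.current_caller
  SketchInstrumentation.with_caller(type: :agent, name: "WritingAgent") do
    seen << SketchInstrumentation.current_caller
  end
  seen << SketchInstrumentation.current_caller
end
```

Inside the nested block the current caller is the agent; once it returns, the workflow is the current caller again.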
All 10 events
The framework is provider-agnostic at the routing level. Switching providers is one line in the agent DSL. Whether prompts translate cleanly across models is a separate concern.
Underlying SDK gems: anthropic, ruby-openai, ruby-openai

class WritingAgent < ApplicationAgent
  provider :anthropic # → change to :openai or :xai (one line)
  model "claude-sonnet-4-6"
end

# Custom provider: implement the adapter interface
class MyProvider < ActiveAI::Provider::Base
  def stream(params, &block)
    # …
  end
end
ActiveAI.register_provider(:my_provider, MyProvider)
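For readers who want to see the shape of a registry plus adapter interface, here is a self-contained sketch. `SketchProviders`, its `register` and `resolve` methods, and `EchoProvider` are hypothetical; the only detail borrowed from the text above is that an adapter exposes `stream` and yields chunks to a block.

```ruby
# Hypothetical sketch of a provider registry and adapter interface.
module SketchProviders
  class Base
    # Adapters yield text chunks (Strings) to the given block.
    def stream(params)
      raise NotImplementedError, "#{self.class} must implement #stream"
    end
  end

  @registry = {}

  def self.register(name, klass)
    raise ArgumentError, "#{klass} must inherit from Base" unless klass <= Base
    @registry[name] = klass
  end

  # Looks up the adapter class by symbol and returns a new instance.
  def self.resolve(name)
    @registry.fetch(name) { raise KeyError, "unknown provider: #{name.inspect}" }.new
  end
end

# A toy adapter that echoes the last message back word by word.
class EchoProvider < SketchProviders::Base
  def stream(params)
    params[:messages].last[:content].split.each { |word| yield word }
  end
end

SketchProviders.register(:echo, EchoProvider)

params = { messages: [{ role: "user", content: "hello there" }] }
chunks = []
SketchProviders.resolve(:echo).stream(params) { |chunk| chunks << chunk }
```

Because agents would refer to providers only by symbol, swapping `:echo` for another registered name changes the backend without touching agent code.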
Built-in tools
WebSearch: search_provider in initializer.
WebPageReader
Every piece of the AI layer has a generator. Run rails g active_ai:install once, then scaffold agents, tools, skills, workflows, orchestrators, and prompts as the app grows.
rails generate active_ai:install
# → app/ai/agents/application_agent.rb
# → app/ai/tools/application_tool.rb
# → app/ai/skills/application_skill.rb
# → app/ai/workflows/application_workflow.rb
# → app/ai/orchestrators/application_orchestrator.rb
# → config/ai.yml
# → config/initializers/active_ai.rb
rails generate active_ai:agent Writing
# → app/ai/agents/writing_agent.rb
rails generate active_ai:tool PriceCheck
# → app/ai/tools/price_check_tool.rb
# → test/ai/tools/price_check_tool_test.rb
rails generate active_ai:skill Tone
# → app/ai/skills/tone_skill.rb
# → test/ai/skills/tone_skill_test.rb
rails generate active_ai:workflow ResearchAndDraft
# → app/ai/workflows/research_and_draft_workflow.rb
rails generate active_ai:orchestrator Writing
# → app/ai/orchestrators/writing_orchestrator.rb
rails generate active_ai:prompt agent writing
# → app/ai/agents/prompts/writing.md.erb
rails generate active_ai:memory:install
rails db:migrate
# → 4 migrations (memories, embeddings, correlations, flags)
# → 3 memory agents
# → 4 background jobs
# With pgvector for semantic recall:
rails generate active_ai:memory:install --vector=pgvector
Each generator namespace is declared explicitly (for example, namespace "active_ai:agent") to work around Rails converting ActiveAI → active_a_i in path lookups. This is handled internally; you run the generators as documented.
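The `active_a_i` result is what you get from any snake-casing rule that inserts an underscore before every interior capital letter. The helper below is a hypothetical reimplementation of such a rule for illustration; it is not copied from Rails or Thor.

```ruby
# Sketch of a naive snake-casing rule: put "_" before every capital
# letter that is not at a word boundary, then downcase.
def naive_snake_case(name)
  name.gsub(/\B[A-Z]/) { |c| "_#{c}" }.downcase
end

naive_snake_case("ActiveAI") # => "active_a_i"
naive_snake_case("ActiveAi") # => "active_ai"
```

Each letter of the acronym "AI" is treated as its own word, which is why an explicit namespace declaration is the straightforward fix.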
ActiveAI is built by Robert Evans and extracted from writer-v3, a personal writing application built around a human-first philosophy: the writer owns the work; AI assists with editing, suggests improvements, and explains its reasoning — but never drives.
The writer-v3 premise is that AI is most useful when it makes a writer better, not when it writes for them. That means the AI's role is to surface what isn't working, propose specific changes, and explain why — so the writer learns from every session rather than just accepting output.
The patterns here — streaming responses, skills as behavioral constraints, memory that accumulates understanding over sessions, orchestrators that route to the right specialist — all emerged from real use in a writing context and were generalized as they proved their value. These are not theoretical patterns. They are the answer to "what does it actually take to run LLM calls in a Rails app over weeks of use?"
"The writer owns the work; AI assists with editing, suggests improvements, and explains its reasoning — but never drives."
writer-v3 design premise