A strict 12-week plan to go from zero AI to shipping a working coding agent — designed for a full-time software engineer who works 8–5, Mon–Fri, with limited time. No prior AI knowledge required.
May 23, 2026~25 min read~9 min listen12 weeks~5 hours/week
You're a staff-level software engineer. You can code. You understand architecture, debugging, and shipping software. The gap is that you have no AI/LLM-specific knowledge. This plan bridges that gap using only your existing engineering skills.
Core assumption — You have ~5 hours per week devoted to this. That's roughly 45 minutes on 3-4 evenings and one longer 3-hour block on Saturday. If you have more, accelerate; if less, extend the timeline proportionally. Stick to the plan — don't skip weeks, don't jump between topics.
What You'll Build
By the end of 12 weeks, you'll have a working coding agent — an AI-powered assistant that can read your codebase, propose changes, write patches, and optionally apply them. Along the way, you'll learn the patterns that scale to building a general-purpose agent orchestrator in the mold of Hermes, OpenClaw, or CrewAI — where multiple agents coordinate to solve complex tasks.
By the End, You'll Be Able To:
Design and implement agent architectures — single agent and multi-agent orchestrations
Write effective prompts — the #1 skill in AI development, and something you'll use every day
Integrate LLM APIs — OpenAI, Anthropic, local models via OpenAI-compatible endpoints
Build agents that use tools — file I/O, code execution, web search, databases
Ship a production-ready SaaS — either sell the agent as a service, white-label it, or sell your expertise building agents for clients
🎯 Final Milestone (Week 12)
A deployed agent that can take a feature request, read your codebase, design a solution, write the code, and apply the changes — all with human-in-the-loop approval.
The Three Pillars
Everything you need to learn falls into three buckets. Each week focuses on one pillar, in order:
🧠
Pillar 1: LLM Fundamentals
How models work, what prompts do, tokenization, temperature, contexts, costs, and limitations
🔧
Pillar 2: Agent Patterns
Prompt chains, tool use, function calling, structured output, multi-agent orchestration
🚀
Pillar 3: Ship & Monetize
API design, deployment, UX, pricing models, positioning, and go-to-market strategy
Weeks 1–2: LLM Foundations
Pillar 1
Your goal: understand how LLMs work at a practical level so you can reason about them like a system. No math required. Focus on intuition, engineering patterns, and cost awareness.
Sunday (Evening, 1h): What Is a Large Language Model?
Watch: "The spelled-out intro to LLMs" by 3Blue1Brown on YouTube (3Blue1Brown YouTube). ~2 hours if you want the full series, but the first two videos will give you the core intuition.
Read: "What are Large Language Models?" by Google — a high-level overview of how models are trained, what tokens are, and what they can generate.
Key takeaway: LLMs are autoregressive token predictors — they don't "reason," they predict the next tokens based on context. This shapes everything about how you design agents.
Do: OpenAI's ChatGPT and Claude. Play with the model. Try prompting it on things you'd normally code. Read its output critically. Notice what it gets wrong. This is your first data set.
Tuesday (Evening, 1h): Prompt Engineering Basics
Read: OpenAI's "Prompt Engineering" guide — the official documentation. OpenAI Prompt Guide
Key concepts: System prompts, few-shot examples, chain-of-thought, structured output, temperature, max tokens, stop sequences.
Do: Write 10 prompts for the same task (e.g., "explain a software concept") and compare outputs. Try zero-shot vs. few-shot vs. chain-of-thought. This is your most important hands-on exercise — you're learning the #1 skill in AI development.
Thursday (Evening, 1h): API & Tool Integration
Read: OpenAI API reference. Focus on the chat completions endpoint, message format (system/user/assistant), and response formats. API Reference
Do: Write a Python script that calls the OpenAI API using requests or openai Python SDK. Send a message. Get a response. Log it. This is a real API call — not using ChatGPT directly, but programmatically.
Saturday (3h block): Build a Prompt Runner
Build a Python CLI tool that: takes a prompt from stdin + a model name, calls the OpenAI API, returns the response.
Add support for temperature, max_tokens, system_prompt, and model as arguments.
This is your "Hello World" — a command-line prompt runner. You'll use this throughout the next weeks to experiment before building the full agent.
Deliverable: A working CLI tool. Commit it to GitHub.
You can call LLM APIs directly, write effective prompts, and understand the limits of what the model can do. You have a CLI tool you'll use throughout.
Weeks 3–4: Prompt Engineering & Structured Output
Pillar 1 → Pillar 2
Your goal: write prompts that produce reliable, structured, reproducible outputs. This is where 80% of AI development happens — prompt writing.
Sunday (Evening, 1h): Advanced Prompt Patterns
Read: "Chain of Thought Prompting" — OpenAI cookbook. OpenAI Techniques
Key concept: Models can call functions you define. They return arguments in a structured format. Your code executes the function and feeds the result back to the model. This is the foundation of agent tool use.
Do: Extend your CLI tool to support function calling. Define a function like read_file(path: str) -> str and have the model call it.
Read: Outlines library docs — how to force LLM output into Pydantic models / JSON schemas. Outlines docs
Do: Write a Python function that takes a prompt and a JSON schema, calls the OpenAI API with response_format={"type": "json_schema", ...}, and returns a parsed struct. Test with prompts like "list 5 file patterns to ignore in Python projects" and verify the output matches your schema.
Why this matters: Structured output is how you build agents — the model returns structured data you can use programmatically instead of raw text.
Saturday (3h block): Build a Code Review Agent v1
Build the simplest possible coding agent. It reads a file, sends it to the LLM with instructions like "review this code for bugs, style issues, and improvements," and returns a structured review as JSON.
Deliverable: Python script + function. Accepts a .py or .rs file path from CLI args. Returns JSON with {"bugs": [...], "suggestions": [...], "overall_score": 7}.
Ship it. Commit it. It should work end-to-end.
Resources for Weeks 3–4:
OpenAI Cookbook — function calling & structured output examples (GitHub)
You can write prompts that produce structured, tool-driven code outputs. Your code review agent works end-to-end. You understand function calling and structured output.
Weeks 5–6: Your First Real Agent
Pillar 2
Your goal: build an agent that can autonomously use tools (read/write files, execute code) in a loop — the core agent pattern.
Sunday (Evening, 1h): Learn Agent Frameworks
Read: Brief comparisons of agent frameworks. Focus on LangGraph (state-based graph, great for coding agents), OpenAI Agents SDK (minimal, opinionated), and CrewAI (multi-agent orchestration).
Recommendation: Start with LangGraph — it's the most flexible, most documented, and gives you the deepest understanding of agent mechanics because you build the graph yourself rather than hiding behind abstractions.
Why this matters: Frameworks handle the boilerplate (API calls, state management, retry logic, streaming). You'll use one throughout the rest of the plan.
Do: Follow the tutorial end-to-end. Build a simple graph with nodes for "user input," "agent," "tool," and "condition" (conditional edge for re-trying).
Key concept: A graph is a set of nodes (functions) connected by edges. Each node has state. Conditional edges decide which node runs next based on the model's output. This is the mental model for every agent.
This is a Tool-Using Agent — the model decides which tools to call based on the task.
Saturday (3h block): Build a File Organizer Agent
Build an agent that takes a directory path as input, reads all files, and organizes/restructures them based on rules the user provides.
Example: "Move all Python files into a src/ folder, test files into tests/, and create a proper package structure."
Deliverable: A LangGraph agent with file read/write tools that actually restructures a real directory. It reads the directory, the model plans the structure, and the agent executes the moves.
This is your first "real" agent — it observes state, reasons about it, takes action, and handles the loop (read → plan → act → verify).
Why a file organizer, not a code agent?
This is a stepping stone. A file organizer teaches you the full agent loop (observe → plan → act → verify) with files, which is 90% of what a coding agent does. Once this works, the coding agent is just the same pattern with better prompts and more sophisticated tools.
🎯 Milestone 3 (End of Week 6)
You have a working agent that uses tools, reads files, writes files, and executes commands. You understand the LangGraph state machine pattern. You can build any agent from this pattern.
Weeks 7–8: Multi-Agent Orchestration
Pillar 2
Your goal: move from a single agent to a multi-agent system — the kind that powers products like Hermes, OpenClaw, and CrewAI.
Sunday (Evening, 1h): Multi-Agent Architectures
Read: "Multi-Agent Architectures" in OpenAI's guidance docs and CrewAI documentation. Understand the difference between:
Swarm / sequential — agents pass work to each other in a pipeline
Supervisor / hierarchical — one agent orchestrates others (like a manager)
Graph-based — agents are nodes in a graph with conditional routing
Concept: One agent (the supervisor) breaks a task into subtasks, assigns each to a specialized agent, collects the results, and compiles a final output.
Example: User task: "Refactor this module to use async/await." Supervisor agent reads the module, breaks the task into: read files → analyze architecture → write changes → test → review. Each is a sub-agent with its own prompt.
Do: Build this in LangGraph. Two agents: one analyzer, one writer. Supervisor routes between them.
Thursday (Evening, 1h): Agent Communication Patterns
Read: How agents communicate — shared state, message passing, tool outputs. Focus on shared state (simplest, works well with LangGraph).
Read: CrewAI agent communication patterns for comparison. CrewAI docs
Do: Extend your multi-agent system so the analyzer agent writes its findings to shared state, and the writer agent reads them to inform its changes.
Saturday (3h block): Build a Multi-Agent Coding Agent
Build a multi-agent coding agent with at least two roles:
Planner Agent: reads the codebase, identifies what needs to change, writes a spec
Builder Agent: takes the spec and writes the actual code
Reviewer Agent (optional): reviews the changes before applying
Deliverable: Multi-agent LangGraph system. User provides a feature request and a code directory. The planner analyzes, the builder writes, the reviewer approves.
This is core production software — a multi-agent coding assistant.
🎯 Milestone 4 (End of Week 8)
You have a multi-agent system that plans and builds code changes from a feature request. This is the core of products like Cursor, OpenClaw, and Copilot Agents — now in your own codebase.
Weeks 9–10: Tool Integration & Reliability
Pillar 2
Your goal: make the agent production-ready. This is the difference between a toy and a product.
Sunday (Evening, 1h): Advanced Tool Use
Concept: Tools are your agent's hands. More/better tools = more capable agent. Think about what tools a coding agent needs:
Implement: A Git tool that can: diff, commit, create branch, show status, show diff — wrapped in a LangGraph tool.
Why this matters: Without git tools, the agent can't safely change code. With them, it can propose changes in a branch, create diffs, and let you review before merging. This is how Cursor/Claude Work operate.
Security: Always require human approval before any tool that modifies files or runs commands. This is "human-in-the-loop" — the agent proposes, you approve.
Read: "Evaluating LLM applications" — LangSmith (by LangChain) or Arize Phoenix for agent evaluation. Arize Phoenix
Concept: How do you test an agent? You need regression tests — prompts that should always produce correct results. Track metrics: token cost, response time, tool accuracy, correctness score.
Do: Write 5 test prompts for your coding agent. Each one specifies a code change and a success criterion (e.g., "the agent should write a function that reverses a list in Python. Test: the function exists and is syntactically valid").
Read: "LLM-as-a-Judge" pattern — using a strong model to evaluate another model's output.
Saturday (3h block): Integrate Web Search & Improve Tool Set
Give your agent web search capability (via an API — SearXNG, Tavily, or DuckDuckGo). Now when the agent encounters unfamiliar APIs or concepts, it can look them up.
Test your full pipeline on real codebases. Iterate on prompts until the agent consistently produces usable code changes.
Deliverable: A self-improving coding agent with 6+ tools (file read/write, git, shell, web search, structured output).
🎯 Milestone 5 (End of Week 10)
Your coding agent has real tools, can self-correct using web search, and you can systematically evaluate its outputs. This is no longer a toy — it's a product in beta.
Weeks 11–12: Ship It
Pillar 3
Your goal: deploy a usable product. Whether you sell it as a service, open-source it, or use it as a showcase of your skills — the agent goes live.
Sunday (Evening, 1h): Product Design
Decide your product form. Options:
SaaS: Web app where users submit feature requests and the agent writes the code (like Cursor or Replit Agent)
CLI tool: Command-line agent for developers to run locally (like Copilot CLI or Cursor CLI)
SDK/Library: Embeddable agent framework for other developers to build on (like LangChain or CrewAI)
Service: You use your agent to build coding agents for clients (your own service business)
Recommendation: Start with a CLI tool + API. Deploy the API, write a simple web UI. This gives you the most flexibility — you can later open-source or re-skin for clients.
Tuesday (Evening, 1h): API & Deployment
Build: Expose your agent as an API. FastAPI is the natural choice — same as your ThinkSmart stack. Endpoints: POST /chat (submit a task), GET /task/{id}/status (check progress), GET /task/{id}/result (get output).
Deploy: Deploy to a VPS (Hetzner, Linode, or AWS Lightsail — ~$5-10/month). You're a senior engineer; Docker + nginx on a VPS is well within your comfort zone.
Thursday (Evening, 1h): Frontend / UX
Build: A minimal web interface. Stream the agent's output (like ChatGPT — show tool calls, file changes, reasoning steps in real-time). This is critical UX — users need to see the agent thinking.
Use: FastHTML, Streamlit, or even plain HTML with WebSocket streaming. Your ThinkSmart knowledge of FastAPI + Jinja2 applies directly.
Key UX: Each API call should stream back: the agent's thought process, tool calls, file diffs, and final result. Make this visible and beautiful.
Deliverable: A public URL where anyone can submit a coding task and watch an AI agent work on it. You have a real product.
🎯 Milestone 6 (End of Week 12)
You have a deployed, working coding agent. People can submit tasks to it. You have a product you can sell, open-source, or use as your entry ticket into the AI agent space.
Framework Comparison
Here's a concise comparison of the leading agent frameworks, evaluated for your use case (coding agent, full-time engineer, learning path):
Framework
Language
Best For
Learning Curve
Starter Time
LangGraph
Python / JS
Multi-agent, custom workflows
Medium
~3h to graph
OpenAI Agents SDK
Python / JS
Simple single agents
Easy
~1h to hello-world
CrewAI
Python
Multi-agent orchestration
Easy
~2h to crew
AutoGen (Microsoft)
Python
Conversational multi-agent
Hard
~5h to configure
Google ADK
Python / JS
Google ecosystem apps
Medium
~3h to agent
Recommendation: Start with LangGraph (Python). It gives you maximum control, clear state management, and the deepest understanding of how agents actually work. You'll read less "magic" and more code you understand. If you later want rapid multi-agent setup, CrewAI is a good follow-up — it wraps similar concepts but hides the graph.
Monetization Paths
Once you've built the agent, here's how to make money from it:
Path 1: Sell as a SaaS
Product: Web app where users submit feature requests and the agent generates code
Pricing: $19-99/month per user (like Cursor at $20/mo, Replit Agent at $25/mo)
Positioning: "Your AI coding partner" — compete on UX, speed, and cost-effectiveness
Difficulty: Medium — requires a good UI, payment processing, and scaling infrastructure
Path 2: Sell as a Service
Product: You build coding agents for clients (your own service business)
Pricing: $2,000-10,000 per project (consulting pricing for AI agent development)
Positioning: "We build AI agents for your codebase" — position as a high-end dev shop with AI expertise
Difficulty: Easy — you're the product. Your agent is your proof-of-work.
Difficulty: Hard — requires community building, documentation, and ongoing maintenance
Path 4: Freelance Agent Engineering
Product: You use your agent to build agents for businesses
Pricing: $50-150/hour, or $5,000-50,000 per engagement
Positioning: "Staff engineer who builds AI agents" — rare combination of deep engineering + AI skills
Difficulty: Easiest — leverage your existing senior dev reputation, add AI agent skills as a differentiator
My recommendation: Start with Path 4 (freelance) to validate the market and build credibility. Move to Path 2 (service-based) once you have case studies. Consider Path 1 (SaaS) as a long-term play once the agent is polished enough to productize.
Common Pitfalls to Avoid
❌ Building without prompting first
Don't jump straight into building an agent framework before you've written hundreds of prompts. Prompt writing is the #1 skill in AI development. Spend Weeks 1-4 deliberately on this. Every hour spent on prompts is worth 10 hours spent on code.
❌ Using expensive models for everything
Not every task needs GPT-4o. Use cheaper models (GPT-4o-mini, Claude Haiku, OpenRouter) for simple tasks and reserve the expensive ones for complex reasoning. Track your token costs from Week 1.
❌ Over-engineering the tool set
Start with file read/write and shell execution. Don't add 20 tools upfront. Each tool adds complexity and cost. Add tools based on real agent failures, not hypothetical needs.
❌ Assuming the AI agent will be "right"
Agents make mistakes. They hallucinate, misread code, and produce buggy output. Build in verification at every step — the reviewer agent, syntax validation, test execution. Treat the agent as a junior developer, not a senior one.
❌ Chasing every new framework
Pick LangGraph and stick with it for 8+ weeks. Don't jump to CrewAI, then AutoGen, then LangChain. Master one framework deeply, then you can learn others in days.
Complete Timeline Summary
Weeks
Focus
Deliverable
1–2
LLM Fundamentals & Prompting
Prompt runner CLI tool
3–4
Structured Output & Function Calling
Code review agent
5–6
LangGraph & First Agent
File organizer agent
7–8
Multi-Agent Orchestration
Planner + Builder multi-agent system
9–10
Tools, Tool Use & Reliability
Full coding agent with 6+ tools
11–12
Ship & Monetize
Deployed, public-facing product
The path from engineer to agent builder: 12 weeks, 5 hours/week, 60 hours total. That's less time than most engineering bootcamps. Your existing senior-level software engineering skills mean you can skip 80% of the hard parts (data structures, API design, system architecture, testing). You're adding AI on top of a solid foundation — not building from zero.