Inside Agentic Coding

Reproducing Claude Code's core flow in 900 lines

Something You Use Every Day


$ claude "fix this bug"

Claude read 5 files
Claude edited 2 files
Claude ran the tests
Claude confirmed tests pass

✓ Bug fixed

What actually happens in between?

Goal for Today


Walk through the core flow of an agentic coding tool
and examine a minimal working implementation


Starting from running code, not slides full of abstractions.

Part 1

Live Demo

Pocket Code


Pocket Code — deepseek-chat @ https://api.deepseek.com
Type /help for commands, /exit to quit.

> check bug.py and fix it

[thinking] Let me read the file first...
[tool] read_file({"path":"bug.py"})
[result] def add(a, b): return a - b # bug here

[thinking] Found it — subtraction instead of addition...
[tool] edit_file({"path":"bug.py","old_string":"a - b","new_string":"a + b"})
[confirm] edit_file: bug.py (replace 5 chars) (y/n) y
[result] OK

[thinking] Fixed. Let me verify...
[tool] run_command({"command":"python bug.py"})
[confirm] run_command: python bug.py (y/n) y
[result] {"exitCode":0,"output":"3\n","timedOut":false}

[answer] Bug fixed. add(1, 2) now correctly returns 3.

⌨️

Switch to terminal — live run

Have Pocket Code ready with a buggy file

Part 2

What Just Happened

The Agentic Loop

User Input
    ↓
while (true):
    messages[] → LLM API: chatCompletion()
        ↓
    tool_calls?
    ├─ Yes → Confirm? (permissions.ts) → Execute Tool (tools.ts)
    │          → Result appended to messages[] → next iteration
    └─ No  → Print Answer → return

Message Flow


// Round 1
→ { role: "user", content: "fix bug.py" }
← { role: "assistant", tool_calls: [read_file("bug.py")] }
→ { role: "tool", content: "def add(a,b): return a-b" }

// Round 2
← { role: "assistant", tool_calls: [edit_file(...)] }
→ { role: "tool", content: "OK" }

// Round 3
← { role: "assistant", tool_calls: [run_command(...)] }
→ { role: "tool", content: '{"exitCode":0,...}' }

// Round 4 — no tool_calls, loop ends
← { role: "assistant", content: "Bug fixed." }

Chatbot vs Agent


Chatbot

  • User says → LLM replies
  • Single turn
  • Text generation only
  • No interaction with the outside

Agent

  • User says → LLM decides what to do
  • Loops until done
  • Can call tools
  • Read files, write code, run commands

At the API level, the difference is the tools parameter + a loop.
Real products, of course, involve far more than that.

Part 3

Layer by Layer

7 files, 920 lines, outside-in

Entry — index.ts

140 lines

  • Parse CLI args (--model, --base-url)
  • Load POCKET.md as system prompt
  • Initialize MCP servers
  • Start the REPL, handle slash commands

Analogy to Claude Code: POCKET.md is CLAUDE.md
index.ts — REPL

initReadline();

while (true) {
  const input = await question("> ");
  const trimmed = input.trim();
  if (!trimmed) continue;

  // Slash commands: /model, /clear, /help, /exit
  if (trimmed.startsWith("/")) {
    handleSlashCommand(trimmed, agent, config);
    continue;
  }

  // Regular input → hand off to the Agent
  await agent.run(trimmed);
}
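The slash-command handler itself isn't shown above. A minimal sketch might look like the following; the `Agent` and `Config` shapes here are assumptions for illustration, not the actual Pocket Code types:

```typescript
// Hypothetical sketch of handleSlashCommand — Agent/Config shapes are assumed.
interface Agent { clearHistory(): void; }
interface Config { model: string; }

function handleSlashCommand(input: string, agent: Agent, config: Config): string {
  const [cmd, ...rest] = input.split(/\s+/);
  switch (cmd) {
    case "/model":                 // e.g. "/model deepseek-chat"
      if (rest[0]) config.model = rest[0];
      return `model: ${config.model}`;
    case "/clear":                 // drop the conversation history
      agent.clearHistory();
      return "history cleared";
    case "/help":
      return "/model /clear /help /exit";
    case "/exit":
      process.exit(0);
    default:
      return `unknown command: ${cmd}`;
  }
}
```

Because every command returns a short status string, the REPL can print it and immediately `continue` to the next prompt.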
  

Core — agent.ts

123 lines · The key file

The agent loop lives here

agent.ts — the core loop (~30 lines)

async run(userInput: string) {
  this.messages.push({ role: "user", content: userInput });

  while (true) {
    // 1. Call the LLM
    const reply = await chatCompletion(
      this.config, this.messages, this.tools
    );

    // 2. No tool_calls → final answer, done
    if (!reply.tool_calls?.length) {
      printAnswer(reply.content);
      this.messages.push(reply);
      return;
    }

    // 3. Has tool_calls → execute each one
    printThinking(reply.content);  // [thinking]
    this.messages.push(reply);

    for (const tc of reply.tool_calls) {
      const name = tc.function.name;
      const args = JSON.parse(tc.function.arguments);
      printToolCall(name, args);        // [tool]

      if (needsConfirmation.has(name)) {
        if (!await confirm(name, args)) continue;  // [confirm]
      }

      const result = await executeTool(name, args);
      printResult(result);              // [result]
      this.messages.push({
        role: "tool", tool_call_id: tc.id, content: result
      });
    }
    // ↻ Back to while(true) — LLM sees tool results, decides next step
  }
}
  

Why while(true)?


How many rounds does it take to fix a bug?

Round 1: read_file → read the code
Round 2: edit_file → apply the fix
Round 3: run_command → run tests
Round 4: tests fail → error fed back to LLM
Round 5: edit_file → try again
Round 6: run_command → tests pass
Round 7: final answer

The LLM decides how many rounds it needs.
You don't write if/else to orchestrate steps — the LLM is the orchestrator.

LLM Wrapper — llm.ts

138 lines

  • OpenAI-compatible format — one POST request
  • Send messages + tools, receive message
  • Retry on 429, 120s timeout
  • Switch models by changing the base URL
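The retry-on-429 and timeout behavior can be sketched like this. The endpoint path, backoff schedule, and retry count are assumptions, not Pocket Code's actual values, and the fetch function is injectable so the sketch is testable:

```typescript
// Hypothetical sketch of llm.ts's retry/timeout wrapper (values assumed).
async function chatCompletion(
  baseUrl: string,
  apiKey: string,
  body: unknown,
  fetchFn: typeof fetch = fetch,
) {
  const maxRetries = 3;
  for (let attempt = 0; ; attempt++) {
    const res = await fetchFn(`${baseUrl}/chat/completions`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(body),
      signal: AbortSignal.timeout(120_000),   // 120s request timeout
    });
    if (res.status === 429 && attempt < maxRetries) {
      // Rate-limited: back off exponentially (1s, 2s, 4s) and retry.
      await new Promise((r) => setTimeout(r, 1000 * 2 ** attempt));
      continue;
    }
    if (!res.ok) throw new Error(`LLM API error ${res.status}`);
    const data = await res.json();
    return data.choices[0].message;           // OpenAI-compatible shape
  }
}
```

Switching providers really is just a different `baseUrl`, as long as the server speaks the OpenAI-compatible format.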
llm.ts — the request body

{
  "model": "deepseek-chat",
  "messages": [
    { "role": "system", "content": "You are a coding assistant..." },
    { "role": "user", "content": "fix bug.py" },
    { "role": "assistant", "tool_calls": [...] },
    { "role": "tool", "content": "def add(a,b): return a-b" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "read_file",
        "description": "Read file contents (max 2000 lines)",
        "parameters": {
          "type": "object",
          "properties": { "path": { "type": "string" } },
          "required": ["path"]
        }
      }
    }
  ]
}
  
This is function calling.
The LLM doesn't execute tools — it tells you what to call. You execute.

Tool System — tools.ts

249 lines · Largest file

Auto:    read_file · list_dir · search_files · ask_user
Confirm: write_file · edit_file · run_command

Basic rule: read-only operations run automatically, writes and commands need human confirmation.
Claude Code's permission model is more granular, but the starting point is the same.
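This split is the `needsConfirmation` set that the agent.ts loop consults. A plausible sketch (the exact shape is an assumption):

```typescript
// Hypothetical sketch: tools that require a y/n prompt before execution.
// Read-only tools (read_file, list_dir, search_files, ask_user) are absent
// from the set, so they run automatically.
export const needsConfirmation = new Set<string>([
  "write_file",
  "edit_file",
  "run_command",
]);
```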
tools.ts — definitions + execution

// 1. JSON Schema tells the LLM what tools are available
export const toolDefs: ToolDef[] = [
  {
    type: "function",
    function: {
      name: "read_file",
      description: "Read file contents (max 2000 lines)",
      parameters: {
        type: "object",
        properties: { path: { type: "string" } },
        required: ["path"],
      },
    },
  },
  // ... 6 more tools
];

// 2. A switch dispatches execution
export async function executeTool(name, args) {
  switch (name) {
    case "read_file":  return readFileTool(args.path);
    case "write_file": return writeFileTool(args.path, args.content);
    case "run_command": return runCommandTool(args.command);
    // ...
  }
}
  

Output Truncation


// What if the LLM reads a 100K-line log file?
read_file → max 2000 lines
search_files → max 100 matches
run_command → max 2000 lines + 30s timeout

[truncated, showing first 2000 of 98473 lines]

No truncation → context overflow → agent loop breaks
Every agent framework has to deal with this
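A truncation helper along these lines keeps tool output bounded; the function name is an assumption, but the marker format matches the one shown above:

```typescript
// Hypothetical sketch of output truncation for tool results.
function truncateLines(output: string, maxLines: number): string {
  const lines = output.split("\n");
  if (lines.length <= maxLines) return output;
  return (
    lines.slice(0, maxLines).join("\n") +
    `\n[truncated, showing first ${maxLines} of ${lines.length} lines]`
  );
}
```

Without a cap like this, a single `cat` of a large log file can blow past the model's context window and derail the loop.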

Permissions — permissions.ts

20 lines · Smallest file


export async function confirm(toolName, args) {
  // write_file: demo/index.html (2847 chars)
  // run_command: npm test
  const summary = summarize(toolName, args);
  const answer = await question(`[confirm] ${summary} (y/n) `);
  return answer.trim().toLowerCase() === "y";
}
  

Human in the loop.
The AI can't silently rm -rf /

MCP — mcp.ts

191 lines

Model Context Protocol — connect external tools to the agent

MCP Lifecycle


initialize → initialized → tools/list → Running → Shutdown


// pocket.json — declare MCP servers
{
  "mcpServers": {
    "weather": {
      "command": "node",
      "args": ["./my-weather-server.js"]
    }
  }
}
  

MCP tools and built-in tools are registered together.
The LLM doesn't care where a tool comes from — it only sees the JSON Schema.
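Merging MCP tools into the built-in list can be sketched as follows. The `McpTool` shape follows the MCP `tools/list` response; prefixing names with the server id is an assumption made here to avoid collisions, not necessarily what Pocket Code does:

```typescript
// Hypothetical sketch: convert an MCP tools/list entry into an
// OpenAI-style function definition the LLM can see.
interface McpTool {
  name: string;
  description?: string;
  inputSchema: object;   // JSON Schema, as returned by tools/list
}

function toToolDef(serverId: string, tool: McpTool) {
  return {
    type: "function" as const,
    function: {
      // Prefix with the server id so "weather.get_forecast" can't
      // collide with a built-in tool name.
      name: `${serverId}.${tool.name}`,
      description: tool.description ?? "",
      parameters: tool.inputSchema,
    },
  };
}
```

Once converted, these definitions go into the same `tools` array as the built-ins, which is why the LLM can't tell them apart.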

Terminal UI — ui.ts

59 lines

[thinking] Let me read this file...
[tool] read_file({"path":"bug.py"})
[confirm] edit_file: bug.py (replace 5 chars) (y/n)
[result] OK
[error] LLM API error 429
[answer] Bug fixed.

⠹ Thinking...

Each color = one phase of the agent loop
This is how Pocket Code makes every step visible

Part 4

Comparison & Reflection

Pocket Code vs Production


What Pocket Code has

  • Agentic loop
  • 7 built-in tools
  • Permission confirmation
  • MCP support
  • Multi-model switching
  • Project instruction file

What Claude Code adds

  • Streaming output
  • Context compression / overflow
  • Sandbox isolation
  • Parallel sub-agents
  • File rollback / checkpoints
  • Memory system

Where Are the Gaps?


  • Context management — conversation too long? Compression, summarization, sliding window
  • Streaming — users don't want to stare at a blank screen for 30 seconds
  • Security — y/n isn't enough: sandbox, path restrictions, command allowlists
  • Fault tolerance — broke a file? Checkpoints + rollback
  • Parallelism — multiple sub-agents reading files and running tasks concurrently

These gaps span both engineering effort and design philosophy.

Takeaway


3

core concepts


1. LLM as decision maker — it chooses which tool to call, not your hardcoded logic

2. Tools as capabilities — JSON Schema defines what's available, results feed back to the LLM

3. The loop as the backbone: while(true) lets the agent decide when it's done

In One Line



Agent = LLM + Tools + Loop


This is the minimal form. Production builds a lot more on top.

Q&A