BAML Toolkit Plugin for Claude Code

Comprehensive toolkit for BAML (Boundary ML) development - build type-safe LLM integrations faster with project scaffolding, code generation, testing, and framework integration.

What is BAML?

BAML is a domain-specific language for building LLM applications with:

  • Type-safe function definitions - No more JSON parsing headaches
  • Multi-language code generation - Python, TypeScript, Go, Ruby, Elixir
  • Built-in error handling - Retry logic, broken JSON fixing
  • IDE integration - VSCode autocomplete and playground

Learn more: BAML Documentation

Features

Slash Commands (User-Invoked)

  • /baml-toolkit:init - Initialize new BAML project with structure and configuration
  • /baml-toolkit:generate - Generate client code from BAML definitions (with watch mode)
  • /baml-toolkit:test - Run BAML function tests and display results
  • /baml-toolkit:create-function - Scaffold new BAML function with types and prompts
  • /baml-toolkit:create-type - Create BAML type definitions (class, enum, alias)
  • /baml-toolkit:create-client - Configure LLM client (OpenAI, Anthropic, Google, etc.)
  • /baml-toolkit:integrate - Integrate with frameworks (React, FastAPI, Express, Django, etc.)
  • /baml-toolkit:batch-gemini - NEW! Batch process multiple files through BAML functions using Gemini

Skills (AI-Invoked Autonomously)

Claude will automatically help with BAML when:

  • baml-scaffolder: You describe wanting to extract/classify/generate with LLMs
  • baml-generator: .baml files are modified (auto-regenerates client code)
  • baml-integrator: You ask about using BAML in your app framework
  • gemini-batch-processor: NEW! You need to process multiple files cost-effectively with Gemini

Auto-Generation Hook

The plugin automatically regenerates client code when you modify .baml files - no manual steps needed!
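
As a rough illustration of the decision such a hook has to make, the sketch below parses a hook payload and returns a regeneration command only for `.baml` edits. It is not the plugin's actual `maybe-generate.sh`. The payload shape (`tool_input.file_path`), the helper names `changed_file` and `regenerate_command`, and the choice of `baml-cli generate` as the command are all assumptions made for illustration.

```python
import json
from pathlib import PurePath
from typing import List, Optional


def changed_file(hook_payload: str) -> Optional[str]:
    """Extract the edited file path from a JSON hook payload, if present."""
    try:
        payload = json.loads(hook_payload)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    tool_input = payload.get("tool_input")
    if not isinstance(tool_input, dict):
        return None
    path = tool_input.get("file_path")
    return path if isinstance(path, str) else None


def regenerate_command(file_path: Optional[str]) -> Optional[List[str]]:
    """Return the command to run for a .baml edit, or None for any other file."""
    if file_path and PurePath(file_path).suffix == ".baml":
        return ["baml-cli", "generate"]  # assumed generation command
    return None


if __name__ == "__main__":
    sample = '{"tool_input": {"file_path": "baml_src/main.baml"}}'
    print(regenerate_command(changed_file(sample)))
```

Keeping the "should we regenerate?" check separate from actually running the command makes the rule easy to test without invoking the CLI.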

Installation

Prerequisites

  • Claude Code installed
  • BAML CLI installed:
    # Using npm (recommended)
    npm install -g @boundaryml/baml
    
    # Or using pip
    pip install baml-py
    
  • LLM API key (OpenAI, Anthropic, or Google)

Install Plugin

# Install from directory
claude plugin install /Users/orlandobruno/Documents/Dev/baml-toolkit --scope user

# Restart Claude Code to load the plugin

Verify Installation

# Check plugin is loaded
claude plugin list

# Should see "baml-toolkit" in the list

✨ What's New: Gemini Batch Processing

Process thousands of documents cost-effectively!

The plugin now includes powerful batch processing capabilities using Google's Gemini API:

  • 💰 Roughly 100x cheaper per token than GPT-4 Turbo (with Gemini Flash, at the list prices in the comparison table below)
  • 📄 2M token context - process entire documents at once
  • ⚡ Fast & efficient - batch hundreds of files in minutes
  • 🎯 Type-safe - BAML schemas ensure consistent results

# Process 500 invoices for ~$0.50
/baml-toolkit:batch-gemini ExtractInvoice invoices/*.pdf --output results.jsonl

# Classify 1000 reviews for ~$0.08
/baml-toolkit:batch-gemini ClassifyReview reviews.csv --column "text" --output classified.csv

See Batch Processing Examples below!

Quick Start

1. Initialize a BAML Project

/baml-toolkit:init my-llm-app

This creates:

my-llm-app/
├── baml_src/
│   ├── main.baml           # Your function definitions
│   └── clients.baml        # LLM client configurations
├── baml_client/            # Generated code (auto-created)
└── baml.config.json        # BAML configuration

2. Configure Your LLM Client

# Set your API key
export OPENAI_API_KEY="sk-..."
# Or for Gemini batch processing
export GOOGLE_API_KEY="your-gemini-key"

# Create a client configuration
/baml-toolkit:create-client "GPT4 client for OpenAI"
# Or use Gemini for cost-effective batch processing
/baml-toolkit:create-client "Gemini for batch processing"

3. Create Your First Function

/baml-toolkit:create-function "ExtractReceipt that takes an image and returns vendor, items, and total"

Or just describe what you need:

User: "I need to extract contact info from emails"
Claude: [Uses baml-scaffolder skill to create function automatically]

4. Generate Client Code

/baml-toolkit:generate --watch

The plugin automatically regenerates code when you edit .baml files!

5. Test Your Function

/baml-toolkit:test ExtractReceipt

6. Use in Your App

/baml-toolkit:integrate react
# or fastapi, express, django, etc.

Usage Examples

Example 1: Receipt Extraction

Create function (automatically done by Claude):

class Receipt {
  vendor string
  date string
  items ReceiptItem[]
  total float
}

class ReceiptItem {
  name string
  quantity int
  price float
}

function ExtractReceipt(image: image) -> Receipt {
  client GPT4Vision
  prompt #"
    Extract receipt information from this image.
    Return vendor, date, items, and total.

    {{ ctx.output_format }}

    {{ _.role('user') }}
    Image: {{ image }}
  "#
}

Use in Python:

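import asyncio

from baml_py import Image
from baml_client.async_client import b  # async client, since the call is awaited

async def main():
    # Image arguments are baml_py.Image values; the URL below is a placeholder
    image = Image.from_url("https://example.com/receipt.jpg")
    result = await b.ExtractReceipt(image=image)
    print(f"Vendor: {result.vendor}")
    print(f"Total: ${result.total}")
    for item in result.items:
        print(f"- {item.name}: ${item.price}")

asyncio.run(main())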

Use in TypeScript:

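import { Image } from '@boundaryml/baml';
import { b } from './baml_client';

// Generated functions take positional arguments; the URL below is a placeholder
const image = Image.fromUrl('https://example.com/receipt.jpg');
const result = await b.ExtractReceipt(image);
console.log(`Vendor: ${result.vendor}`);
console.log(`Total: $${result.total}`);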

Example 2: Email Classification

Describe your need:

User: "I want to classify customer emails as urgent, normal, or spam"
Claude: [Creates BAML function automatically]

Generated BAML:

enum EmailPriority {
  URGENT
  NORMAL
  SPAM
}

function ClassifyEmail(text: string) -> EmailPriority {
  client Claude
  prompt #"
    Classify this email's priority.
    Choose: URGENT, NORMAL, or SPAM

    {{ ctx.output_format }}

    Email: {{ text }}
  "#
}

Use immediately (code auto-generated):

from baml_client.types import EmailPriority  # generated enum

priority = await b.ClassifyEmail(text=email_content)  # run inside an async function
if priority == EmailPriority.URGENT:
    send_alert()

Example 3: React Integration

/baml-toolkit:integrate react

Generated hook:

import { useBAML } from '@boundaryml/baml-react';
import { b } from '../baml_client';

export function useExtractReceipt() {
  return useBAML(b.ExtractReceipt);
}

Use in component:

const { execute, data, loading, error } = useExtractReceipt();

const handleUpload = async (imageUrl: string) => {
  const result = await execute({ image: imageUrl });
  console.log(result.vendor, result.total);
};

Slash Commands Reference

/baml-toolkit:init [directory]

Initialize a new BAML project.

/baml-toolkit:init                    # Current directory
/baml-toolkit:init my-project         # New directory
/baml-toolkit:init ./backend/baml     # Specific path

/baml-toolkit:generate [options]

Generate client code from BAML definitions.

/baml-toolkit:generate                # Generate once
/baml-toolkit:generate --watch        # Watch mode (auto-regenerate)
/baml-toolkit:generate python         # Specify language

/baml-toolkit:test [function] [options]

Run BAML tests.

/baml-toolkit:test                    # Run all tests
/baml-toolkit:test ExtractReceipt     # Test specific function
/baml-toolkit:test --watch            # Watch mode
/baml-toolkit:test --verbose          # Detailed output

/baml-toolkit:create-function "<description>"

Create a new BAML function.

/baml-toolkit:create-function "ExtractReceipt from image returning vendor and total"
/baml-toolkit:create-function "ClassifyEmail as urgent/normal/spam"
/baml-toolkit:create-function "SummarizeArticle returning title and key points"

/baml-toolkit:create-type "<description>"

Create BAML type definition.

/baml-toolkit:create-type "Receipt class with vendor, date, items, total"
/baml-toolkit:create-type "Priority enum: LOW, MEDIUM, HIGH, URGENT"
/baml-toolkit:create-type "ContactInfo with optional email and phone"

/baml-toolkit:create-client "<description>"

Configure LLM client.

/baml-toolkit:create-client "GPT4 for OpenAI"
/baml-toolkit:create-client "Claude with vision support"
/baml-toolkit:create-client "Ollama for local Llama model"

/baml-toolkit:integrate <framework>

Integrate with application framework.

/baml-toolkit:integrate react         # React/Next.js
/baml-toolkit:integrate fastapi       # FastAPI (Python)
/baml-toolkit:integrate express       # Express (Node)
/baml-toolkit:integrate django        # Django (Python)
/baml-toolkit:integrate streamlit     # Streamlit (Python)

Autonomous AI Workflows

Claude will automatically help with BAML when you:

Describe LLM needs (baml-scaffolder skill):

User: "I need to extract product info from reviews"
Claude: [Creates BAML function with proper types automatically]

Modify BAML files (baml-generator skill):

User: [Edits function.baml]
Claude: [Detects change, auto-regenerates client code]
Plugin Hook: [Also regenerates via hook for double safety]

Ask about integration (baml-integrator skill):

User: "How do I use this in my FastAPI app?"
Claude: [Creates endpoint, error handling, full example]

Templates

Pre-made templates in templates/ directory:

  • function-template.baml - Generic function scaffold
  • extraction-template.baml - Data extraction pattern
  • classification-template.baml - Classification pattern
  • vision-template.baml - Image/video analysis
  • clients-template.baml - LLM client configurations

Best Practices

1. Project Structure

project/
├── baml_src/               # Source (commit to git)
│   ├── main.baml
│   ├── clients.baml
│   └── types.baml
├── baml_client/            # Generated (gitignore)
├── .env                    # API keys (gitignore)
└── baml.config.json

2. Environment Variables

Always use environment variables for API keys:

# .env file
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=AI...

Reference them in your client definition:

client<llm> GPT4 {
  provider openai
  options {
    model "gpt-4-turbo-preview"
    api_key env.OPENAI_API_KEY  // Reference env var
  }
}
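
Since a missing key only surfaces when the first LLM call fails, it can help to check the environment at startup. The following is a minimal sketch using only the standard library. The helper name `missing_api_keys` is hypothetical, and which keys count as required depends on the clients you have configured.

```python
import os
from typing import Iterable, List, Mapping, Optional


def missing_api_keys(
    required: Iterable[str], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    # Assumed requirement: only the OpenAI client is in use.
    missing = missing_api_keys(["OPENAI_API_KEY"])
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required API keys are set.")
```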

3. Type Safety

Leverage generated types:

// TypeScript - full autocomplete
const result: Receipt = await b.ExtractReceipt({ image: "..." });
result.vendor // Type-safe access

# Python - type hints
result: Receipt = await b.ExtractReceipt(image="...")
result.vendor  # IDE autocomplete works

4. Testing

Always include test cases:

test ExtractReceipt {
  functions [ExtractReceipt]
  args {
    image {
      file "./test_images/sample.jpg"
    }
  }
  @@assert(has_vendor, {{ this.vendor|length > 0 }})
  @@assert(positive_total, {{ this.total > 0 }})
  @@assert(has_items, {{ this.items|length > 0 }})
}

5. Watch Mode During Development

/baml-toolkit:generate --watch

Instant feedback on every change!

Batch Processing with Gemini

Process hundreds or thousands of files efficiently and cost-effectively using Gemini's API.

Why Gemini for Batch Processing?

Feature                     | Gemini Flash                      | Gemini Pro                           | GPT-4 Turbo
Input cost (per 1M tokens)  | $0.075                            | $1.25                                | $10.00
Output cost (per 1M tokens) | $0.30                             | $5.00                                | $30.00
Context window              | 2M tokens                         | 2M tokens                            | 128K tokens
Best for                    | Simple extraction, classification | Complex reasoning, detailed analysis | Premium tasks

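At these list prices, Gemini Flash costs roughly 100x less per token than GPT-4 Turbo (about 133x on input, 100x on output), which makes it a good fit for bulk processing.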
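
The price gap can be checked with a few lines of arithmetic. The sketch below hard-codes the per-million-token prices from the table above, which will drift over time. The names `PRICES` and `estimate_cost` are hypothetical, and the token counts in the example are illustrative assumptions rather than measurements.

```python
# Prices in USD per 1M tokens, copied from the table above (assumed current).
PRICES = {
    "gemini-flash": {"input": 0.075, "output": 0.30},
    "gemini-pro": {"input": 1.25, "output": 5.00},
    "gpt-4-turbo": {"input": 10.00, "output": 30.00},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a job with the given token counts."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical job: 6M input tokens and 250K output tokens in total.
    for model in PRICES:
        print(f"{model}: ${estimate_cost(model, 6_000_000, 250_000):.2f}")
```

Real bills also depend on prompt overhead per request and on any tiered pricing, so treat the result as a lower-bound estimate.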

Quick Start: Batch Process Invoices

1. Create BAML function (Claude does this automatically):

User: "I need to extract data from invoices - vendor, date, items, total"
Claude: [Creates complete BAML function]

2. Process all invoices:

/baml-toolkit:batch-gemini ExtractInvoice invoices/*.pdf \
  --output invoice_data.jsonl \
  --model gemini-1.5-flash \
  --batch-size 50

3. View results:

cat invoice_data.jsonl | jq '.result | {vendor, total}'

Example: Invoice Extraction

# Process 500 invoices
/baml-toolkit:batch-gemini ExtractInvoice invoices/**/*.pdf --output results.jsonl

# Output (JSONL format):
# {"file": "invoice_001.pdf", "status": "success", "result": {"vendor": "Acme Corp", "total": 1234.56}}
# {"file": "invoice_002.pdf", "status": "success", "result": {"vendor": "TechCo", "total": 987.65}}

Cost estimate: 500 invoices × 50KB each = ~$0.50 (with Gemini Flash)
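
Because each line of the output is an independent JSON object, the results can be post-processed with the standard library alone. This is a minimal sketch that assumes the record shape shown above (`file`, `status`, `result.total`). The helper name `summarize_results` is hypothetical, and any status other than "success" is counted as a failure.

```python
import json
from typing import Any, Dict, Iterable


def summarize_results(lines: Iterable[str]) -> Dict[str, Any]:
    """Count successes and failures and sum `result.total` over the successes."""
    summary: Dict[str, Any] = {"success": 0, "failed": 0, "grand_total": 0.0}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if record.get("status") == "success":
            summary["success"] += 1
            summary["grand_total"] += float(record["result"].get("total", 0.0))
        else:
            summary["failed"] += 1
    return summary


if __name__ == "__main__":
    # The two sample records from the output above.
    sample = [
        '{"file": "invoice_001.pdf", "status": "success", "result": {"vendor": "Acme Corp", "total": 1234.56}}',
        '{"file": "invoice_002.pdf", "status": "success", "result": {"vendor": "TechCo", "total": 987.65}}',
    ]
    print(summarize_results(sample))
```

To run it against a real output file, pass an open file handle instead of the sample list; iterating a text file yields one line at a time.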

Example: CSV Classification

# Classify 1000 customer reviews from CSV
/baml-toolkit:batch-gemini ClassifyReview reviews.csv \
  --column "review_text" \
  --output classified_reviews.csv

Input CSV:

id,review_text,date
1,"Great product!",2024-12-20
2,"Shipping was slow",2024-12-21

Output CSV:

id,review_text,date,sentiment,category,confidence
1,"Great product!",2024-12-20,POSITIVE,PRODUCT,0.95
2,"Shipping was slow",2024-12-21,NEGATIVE,SERVICE,0.88

Cost estimate: 1000 reviews × 100 chars each = ~$0.08 (with Gemini Flash)
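
The input-to-output transformation above amounts to appending classifier fields to each row. The sketch below shows that shape with the standard `csv` module. The function name `append_classification` and the stand-in classifier are hypothetical; in a real run the classifier would wrap a BAML function call, and this is not the plugin's own implementation.

```python
import csv
import io
from typing import Callable, Dict, Sequence, TextIO


def append_classification(
    source: TextIO,
    dest: TextIO,
    column: str,
    classify: Callable[[str], Dict[str, str]],
    extra_fields: Sequence[str],
) -> int:
    """Copy rows from `source` to `dest`, adding the fields `classify` returns.

    `classify` must return exactly the keys listed in `extra_fields`.
    Returns the number of data rows written.
    """
    reader = csv.DictReader(source)
    fieldnames = list(reader.fieldnames or []) + list(extra_fields)
    writer = csv.DictWriter(dest, fieldnames=fieldnames)
    writer.writeheader()
    count = 0
    for row in reader:
        row.update(classify(row[column]))
        writer.writerow(row)
        count += 1
    return count


if __name__ == "__main__":
    # Stand-in classifier for demonstration only (keyword match, no LLM call).
    def fake_classifier(text: str) -> Dict[str, str]:
        return {"sentiment": "POSITIVE" if "Great" in text else "NEGATIVE"}

    src = io.StringIO('id,review_text,date\n1,"Great product!",2024-12-20\n')
    out = io.StringIO()
    append_classification(src, out, "review_text", fake_classifier, ["sentiment"])
    print(out.getvalue())
```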

Using Python/TypeScript Utilities

Python:

from baml_client import b
from baml_toolkit.batch import GeminiBatchProcessor

processor = GeminiBatchProcessor(
    function=b.ExtractInvoice,
    gemini_model="gemini-1.5-flash",
    batch_size=50
)

results = await processor.process_files(
    pattern="invoices/**/*.pdf",
    output="results.jsonl",
    show_progress=True
)

# Results are typed according to BAML schema
for result in results:
    if result.status == "success":
        print(f"{result.data.vendor}: ${result.data.total}")

TypeScript:

import { b } from './baml_client';
import { GeminiBatchProcessor } from 'baml-toolkit/batch';

const processor = new GeminiBatchProcessor(b.ExtractInvoice, {
  geminiModel: 'gemini-1.5-flash',
  batchSize: 50
});

const results = await processor.processFiles({
  pattern: 'invoices/**/*.pdf',
  output: 'results.jsonl'
});

Batch Processing Best Practices

  1. Choose the right model:

    • Use Gemini Flash for simple tasks (classification, basic extraction)
    • Use Gemini Pro for complex reasoning or when accuracy is critical
  2. Set appropriate batch sizes:

    • Small files (< 10KB): 100-200 per batch
    • Medium files (10-100KB): 50-100 per batch
    • Large files (> 100KB): 10-25 per batch
  3. Test first:

    • Process 10-20 files to verify accuracy
    • Check results before processing thousands
  4. Monitor costs:

    • The plugin shows cost estimates before processing
    • Track actual spending in Google AI Studio
  5. Handle failures gracefully:

    • Failed items are automatically retried (up to 3 times)
    • Failures are logged for manual review
    • Continue processing even if some items fail
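
The batching and failure-handling practices above can be sketched in plain `asyncio`. This is an illustration under stated assumptions, not the plugin's implementation: the helper names (`chunked`, `call_with_retry`, `process_in_batches`) are hypothetical, "up to 3 times" is read as three attempts in total, and the short pause between attempts is an arbitrary choice.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Iterator, List, Sequence


def chunked(items: Sequence[Any], size: int) -> Iterator[List[Any]]:
    """Yield consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


async def call_with_retry(
    fn: Callable[[Any], Awaitable[Any]], item: Any, max_attempts: int = 3
) -> Dict[str, Any]:
    """Await `fn(item)`, trying up to `max_attempts` times before giving up."""
    error = ""
    for _ in range(max_attempts):
        try:
            return {"item": item, "status": "success", "result": await fn(item)}
        except Exception as exc:  # one bad item must not stop the batch
            error = str(exc)
            await asyncio.sleep(0.01)  # brief pause before the next attempt
    return {"item": item, "status": "failed", "error": error}


async def process_in_batches(
    fn: Callable[[Any], Awaitable[Any]], items: Sequence[Any], batch_size: int = 50
) -> List[Dict[str, Any]]:
    """Process `items` batch by batch, running each batch concurrently."""
    results: List[Dict[str, Any]] = []
    for batch in chunked(items, batch_size):
        results.extend(
            await asyncio.gather(*(call_with_retry(fn, item) for item in batch))
        )
    return results
```

Failed items come back as records with `status` set to "failed" instead of raising, so a run over thousands of files always completes and the failures can be reviewed afterwards.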

Common Use Cases

  • Invoice/Receipt Processing: Extract vendor, items, totals from PDFs
  • Document Classification: Categorize documents by type or content
  • Sentiment Analysis: Analyze customer reviews, feedback, support tickets
  • Data Extraction: Pull structured data from unstructured text
  • Content Summarization: Summarize articles, reports, meeting notes
  • Form Processing: Extract data from scanned forms or applications

See Complete Examples

Check out examples/ directory for full working examples:

  • invoice-extraction-batch.py - Extract data from hundreds of invoices
  • csv-classification-batch.py - Classify customer reviews from CSV

Framework Integration Examples

React/Next.js

'use client';

import { useExtractReceipt } from '@/lib/baml';

export default function ReceiptUploader() {
  const { execute, data, loading, error } = useExtractReceipt();

  return (
    <div>
      {loading && <p>Extracting...</p>}
      {data && <p>Vendor: {data.vendor}, Total: ${data.total}</p>}
      {error && <p>Error: {error.message}</p>}
    </div>
  );
}

FastAPI

from baml_py import Image
from fastapi import FastAPI, HTTPException
from baml_client.async_client import b  # async client, since the call is awaited

app = FastAPI()

@app.post("/extract-receipt")
async def extract_receipt(image_url: str):
    try:
        # Wrap the URL in a baml_py.Image before passing it to the BAML function
        result = await b.ExtractReceipt(image=Image.from_url(image_url))
        return {"vendor": result.vendor, "total": result.total}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

Express

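import express from 'express';
import { Image } from '@boundaryml/baml';
import { b } from './baml_client';

const app = express();
app.use(express.json());

app.post('/extract-receipt', async (req, res) => {
  try {
    // Generated functions take positional arguments
    const result = await b.ExtractReceipt(Image.fromUrl(req.body.imageUrl));
    res.json({ vendor: result.vendor, total: result.total });
  } catch (error) {
    // A caught value is not guaranteed to be an Error instance
    const message = error instanceof Error ? error.message : String(error);
    res.status(500).json({ error: message });
  }
});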

Troubleshooting

"baml command not found"

Install BAML CLI:

npm install -g @boundaryml/baml
# or
pip install baml-py

"OPENAI_API_KEY not found"

Set environment variable:

export OPENAI_API_KEY="sk-..."
# Add to ~/.zshrc or ~/.bashrc for persistence

"Parse error in main.baml"

Check syntax:

  • Missing braces {}
  • Typos in type names
  • Missing #" quote delimiters for prompts
  • Incorrect Jinja syntax {{ }}
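
Several of these mistakes come down to unbalanced delimiters, which a crude count can flag before you run the real parser. The sketch below is a heuristic only: the helper name `check_baml_delimiters` is hypothetical, it knows nothing about BAML grammar, and literal braces inside prompt text can cause false positives.

```python
from typing import List


def check_baml_delimiters(source: str) -> List[str]:
    """Return warnings for delimiter pairs whose open and close counts differ."""
    pairs = [
        ("{", "}", "braces"),
        ("{{", "}}", "Jinja expression markers"),
        ('#"', '"#', "prompt delimiters"),
    ]
    warnings = []
    for opener, closer, label in pairs:
        opened, closed = source.count(opener), source.count(closer)
        if opened != closed:
            warnings.append(
                f"unbalanced {label}: {opened} x '{opener}' vs {closed} x '{closer}'"
            )
    return warnings


if __name__ == "__main__":
    # Sample with its final closing brace deliberately left off.
    snippet = 'function F(x: string) -> string {\n  client C\n  prompt #"\n    {{ x }}\n  "#\n'
    for warning in check_baml_delimiters(snippet):
        print(warning)
```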

Generated code not updating

Run manual generation:

/baml-toolkit:generate

Or check hook is working:

# Edit a .baml file and save
# Should see auto-generation message

Plugin not loading

# Check plugin status
claude plugin list

# Reinstall if needed
claude plugin uninstall baml-toolkit
claude plugin install /path/to/baml-toolkit --scope user

File Structure

baml-toolkit/
├── .claude-plugin/
│   └── plugin.json              # Plugin manifest
├── commands/                    # Slash commands
│   ├── init.md
│   ├── generate.md
│   ├── test.md
│   ├── create-function.md
│   ├── create-type.md
│   ├── create-client.md
│   └── integrate.md
├── skills/                      # AI-invoked skills
│   ├── baml-scaffolder/
│   ├── baml-generator/
│   └── baml-integrator/
├── hooks/                       # Event handlers
│   └── hooks.json              # Auto-generation on .baml changes
├── templates/                   # BAML templates
│   ├── function-template.baml
│   ├── extraction-template.baml
│   ├── classification-template.baml
│   ├── vision-template.baml
│   └── clients-template.baml
├── scripts/
│   └── maybe-generate.sh       # Hook script for auto-generation
├── README.md                    # This file
├── CHANGELOG.md
└── LICENSE

Resources

Credits

Created by Orlando Bruno

Built with:

  • Claude Code Plugin Framework
  • BAML by Boundary ML
  • Love for type-safe LLM development

License

MIT License - see LICENSE file


Version: 1.0.0
Last Updated: 2025-12-24