Integrate Gemini workflows into Claude Code for advanced multimodal analysis and processing.

Gemini Workflows Plugin for Claude Code

Execute Gemini-specific workflows directly from Claude Code, leveraging Google's Gemini AI for multimodal analysis, long-context processing, and batch operations.

Features

Slash Commands (User-Invoked)

/gemini-workflows:analyze-image - Analyze images or videos using Gemini's multimodal capabilities
/gemini-workflows:process-long-context - Process very large documents using Gemini's 2M-token context window
/gemini-workflows:batch-process - Batch process multiple files through Gemini workflows

Skills (AI-Invoked Autonomously)

Claude will automatically use Gemini when appropriate for:

Visual Analysis: Image/video content analysis, OCR, object detection
Long Context Processing: Entire codebase analysis, very large documents
Batch Processing: Applying operations across multiple files

MCP Tools

The plugin provides these tools via the gemini-api MCP server:

analyze_visual - Analyze images/videos with Gemini's vision models
process_long_context - Handle up to 2M tokens in a single request
gemini_query - General-purpose Gemini API queries
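
For orientation, MCP clients invoke tools with JSON-RPC 2.0 `tools/call` requests. The sketch below builds one for `analyze_visual`. The argument names (`file_path`, `prompt`) and the `buildToolCall` helper are assumptions for illustration, not the plugin's confirmed schema; check scripts/gemini-server.js for the real input schema.

```javascript
// Build a JSON-RPC 2.0 "tools/call" request, the message shape the
// Model Context Protocol uses to invoke a tool.
function buildToolCall(id, toolName, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const request = buildToolCall(1, "analyze_visual", {
  file_path: "./screenshots/image.png", // assumed parameter name
  prompt: "What objects are visible?", // assumed parameter name
});

// Over the stdio transport, each message is sent as a single line of JSON.
console.log(JSON.stringify(request));
```

Claude Code normally builds these requests for you; the shape is mainly useful when debugging the server by hand.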

Installation

Prerequisites

Claude Code installed
Node.js 18+ and npm
Google Gemini API key (available from Google AI Studio)

Steps

Install the plugin:

# From the plugin directory
cd gemini-workflows
npm install

# Install in Claude Code
claude plugin install . --scope user

Set your API key:

# Persistent: add this line to your shell profile (~/.zshrc or ~/.bashrc)
export GEMINI_API_KEY="your-api-key-here"

# Session only: run the same command in your current shell instead
export GEMINI_API_KEY="your-api-key-here"

Restart Claude Code to load the plugin
Test it works:

/gemini-workflows:analyze-image path/to/image.jpg

Usage Examples

Analyze an Image

/gemini-workflows:analyze-image screenshot.png
/gemini-workflows:analyze-image photo.jpg "What objects are visible?"

Claude will also automatically use Gemini when you ask about images:

User: "What's in this screenshot.png?"
Claude: [Automatically uses visual-analysis skill with Gemini]

Process Long Documents

/gemini-workflows:process-long-context entire-codebase/**/*.py "Analyze architecture"
/gemini-workflows:process-long-context large-document.txt "Summarize key findings"

Or let Claude decide:

User: "Analyze all the Python files in this project for common patterns"
Claude: [Automatically uses long-context-processing skill with Gemini if content is large]

Batch Process Files

/gemini-workflows:batch-process "data/*.csv" "extract key insights"
/gemini-workflows:batch-process "images/*.jpg" "classify image content"
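
One way a batch workflow could fan out over many files is a small concurrency-limited map. This is a sketch under assumptions, not the plugin's implementation: `mapWithLimit` is a hypothetical helper, and the worker stands in for a per-file call to a tool such as `gemini_query`.

```javascript
// Run `worker` over every item with at most `limit` calls in flight.
// Failures are recorded per item so one bad file does not abort the batch.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  async function runner() {
    while (next < items.length) {
      const i = next++;
      try {
        results[i] = { ok: true, value: await worker(items[i], i) };
      } catch (err) {
        results[i] = { ok: false, error: String((err && err.message) || err) };
      }
    }
  }

  const count = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: count }, () => runner()));
  return results;
}

// Usage: the worker here is a placeholder for a real Gemini request.
mapWithLimit(["a.csv", "b.csv"], 2, async (file) => `insights for ${file}`)
  .then((results) => console.log(results));
```

Results come back in input order regardless of which call finishes first.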

Configuration

Model Selection

By default, the plugin uses:

gemini-1.5-flash for visual analysis and quick queries (fast, cost-effective)
gemini-1.5-pro for long context processing (best quality for large content)

You can override model selection in the slash commands or when calling MCP tools directly.
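
The default-plus-override behavior can be pictured with the sketch below. The task labels (`visual`, `query`, `long-context`) and the `selectModel` function are hypothetical names chosen for illustration; only the model names come from the defaults listed above.

```javascript
// Documented defaults, keyed by hypothetical task labels.
const DEFAULT_MODELS = new Map([
  ["visual", "gemini-1.5-flash"],
  ["query", "gemini-1.5-flash"],
  ["long-context", "gemini-1.5-pro"],
]);

// An explicit override wins; otherwise fall back to the default for the task.
function selectModel(task, override) {
  if (override) return override;
  const model = DEFAULT_MODELS.get(task);
  if (!model) throw new Error(`Unknown task type: ${task}`);
  return model;
}

console.log(selectModel("visual"));
console.log(selectModel("visual", "gemini-1.5-pro"));
```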

API Key Management

The plugin reads GEMINI_API_KEY from the environment. You can set it:

Shell profile (recommended for permanent use):

# Add to ~/.zshrc or ~/.bashrc
export GEMINI_API_KEY="your-key"

Claude Code launch (session-specific):

```
GEMINI_API_KEY="your-key" claude
```

Plugin configuration (not recommended, for security reasons): Edit .claude-plugin/plugin.json to add the key, and do not commit the file with the key in it.

How It Works

Architecture

User Request
    ↓
Slash Command (optional) → Claude Code
    ↓
Skill Detection (autonomous) → Claude decides if Gemini needed
    ↓
MCP Server (gemini-api) → Calls Gemini API
    ↓
Gemini Response → Formatted and returned to user

When Claude Uses Gemini Automatically

Claude will invoke Gemini skills when:

Visual Analysis Skill:

- User asks about image/video content
- Requests OCR or text extraction
- Needs object detection or scene understanding

Long Context Processing Skill:

- Content exceeds ~100K tokens
- User wants to analyze entire codebase
- Task requires comprehensive "see everything" context

Batch Processing Skill:

- User needs same operation on multiple files
- Classification, extraction, or analysis tasks
- Processing collections of similar items
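
To give a feel for the ~100K-token threshold, here is a rough size check. It is only an illustration: skill selection is presumably driven by Claude reading the SKILL.md descriptions rather than by code like this, and the 4-characters-per-token ratio is a common rule of thumb for English text, not a Gemini tokenizer result. Both function names are hypothetical.

```javascript
const LONG_CONTEXT_THRESHOLD = 100_000; // the ~100K figure from the list above

// Rule-of-thumb estimate (assumption): about 4 characters per token.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// True when the estimated size exceeds the long-context threshold.
function needsLongContext(text) {
  return estimateTokens(text) > LONG_CONTEXT_THRESHOLD;
}

console.log(needsLongContext("short note"));
console.log(needsLongContext("x".repeat(500_000)));
```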

MCP Server Details

The gemini-api MCP server runs as a subprocess and communicates via stdio using the Model Context Protocol. It:

Manages Gemini API authentication
Handles file encoding (images/videos to base64)
Formats requests and responses
Provides error handling and retry logic
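
Two of these responsibilities are easy to sketch. The code below is an assumption about how they might look, not a copy of scripts/gemini-server.js: `toInlinePart`, `backoffDelayMs`, and `withRetry` are hypothetical helpers, and the MIME table is deliberately short. The `inlineData` shape (base64 data plus a MIME type) follows the format the Gemini API uses for inline media.

```javascript
const MIME_TYPES = new Map([
  ["png", "image/png"],
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
  ["webp", "image/webp"],
  ["mp4", "video/mp4"],
]);

// Wrap raw file bytes as an inline-data part: base64 payload plus MIME type.
function toInlinePart(bytes, fileName) {
  const ext = fileName.split(".").pop().toLowerCase();
  const mimeType = MIME_TYPES.get(ext);
  if (!mimeType) throw new Error(`Unsupported file type: ${fileName}`);
  return { inlineData: { data: Buffer.from(bytes).toString("base64"), mimeType } };
}

// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt, baseMs = 500, maxMs = 8000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry `fn` up to `retries` extra times with exponential backoff.
// A production version would retry only transient errors (rate limits, 5xx).
async function withRetry(fn, { retries = 3, baseMs = 500, sleep = defaultSleep } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries) throw err;
      await sleep(backoffDelayMs(attempt, baseMs));
    }
  }
}
```

In use, the file bytes would come from reading the image from disk, and `fn` would be the function that sends the request to Gemini.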

Troubleshooting

"GEMINI_API_KEY environment variable is required"

Set the environment variable before launching Claude:

export GEMINI_API_KEY="your-key"
claude
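
This message most likely comes from a startup check in the MCP server. A minimal sketch of such a check, assuming a hypothetical `requireApiKey` helper (the real server code may differ):

```javascript
// Return the API key, or fail fast with the message quoted in this section.
function requireApiKey(env = process.env) {
  const key = (env.GEMINI_API_KEY || "").trim();
  if (!key) {
    throw new Error("GEMINI_API_KEY environment variable is required");
  }
  return key;
}

try {
  requireApiKey();
  console.log("API key found");
} catch (err) {
  console.error(err.message);
}
```

Because the server runs as a subprocess of Claude Code, the variable must be set in the environment that launches `claude`, not in a separate terminal.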

Plugin not loading

Check plugin installation:

claude plugin list

Reinstall if needed:

claude plugin uninstall gemini-workflows
claude plugin install /path/to/gemini-workflows --scope user

MCP server errors

Enable debug mode to see server logs:

claude --debug

Check that Node.js dependencies are installed:

cd gemini-workflows
npm install

"File not found" for images

Ensure you're using absolute paths or paths relative to your current working directory:

# Absolute path
/gemini-workflows:analyze-image /Users/you/Desktop/image.jpg

# Relative path (from current directory)
/gemini-workflows:analyze-image ./screenshots/image.png

Development

File Structure

gemini-workflows/
├── .claude-plugin/
│   └── plugin.json           # Plugin manifest
├── commands/                 # Slash commands
│   ├── analyze-image.md
│   ├── process-long-context.md
│   └── batch-process.md
├── skills/                   # AI-invoked skills
│   ├── visual-analysis/
│   │   └── SKILL.md
│   ├── long-context-processing/
│   │   └── SKILL.md
│   └── batch-processing/
│       └── SKILL.md
├── scripts/                  # MCP server and utilities
│   └── gemini-server.js
├── package.json              # Node.js dependencies
├── LICENSE
└── README.md

Adding New Workflows

Create slash command: Add .md file to commands/
Create skill (optional): Add directory with SKILL.md to skills/
Update MCP server (if needed): Add new tool definition to scripts/gemini-server.js
Test: Load plugin and test the new workflow
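
For the MCP server step, a tool definition in MCP carries a `name`, a `description`, and a JSON Schema `inputSchema`. The sketch below shows that shape for a hypothetical `summarize_text` tool. The tool, its parameters, and the `missingArguments` helper are invented for illustration; how definitions are registered in scripts/gemini-server.js is not shown here.

```javascript
// Hypothetical tool definition in the shape MCP uses when listing tools.
const summarizeTool = {
  name: "summarize_text", // hypothetical tool name
  description: "Summarize a block of text with Gemini",
  inputSchema: {
    type: "object",
    properties: {
      text: { type: "string", description: "Text to summarize" },
      max_words: { type: "number", description: "Optional length cap" },
    },
    required: ["text"],
  },
};

// List required arguments that a call left out.
function missingArguments(tool, args) {
  return (tool.inputSchema.required || []).filter((key) => !(key in args));
}

console.log(missingArguments(summarizeTool, { max_words: 50 }));
```

Checking required arguments before calling Gemini lets the server return a clear error instead of a failed API request.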

Contributing

Contributions welcome! Please:

Fork the repository
Create a feature branch
Test thoroughly
Submit a pull request

License

MIT License - see LICENSE file for details

Credits

Created by Orlando Bruno

Built with:

Claude Code Plugin Framework
Google Generative AI SDK
Model Context Protocol SDK

Version: 1.0.0
Last Updated: 2025-12-24