AutoGPT agent for Chrome control
Chrome-GPT is an experimental AutoGPT agent that leverages Langchain and Selenium to automate user interactions within a Chrome browser. It's designed for users who want to automate web-based tasks, such as form filling, data extraction, and navigating complex websites, by providing a natural language interface for controlling browser actions.
How It Works
The agent uses Selenium to control a Chrome browser instance programmatically, which lets it scroll, click elements, and type text into web forms. Langchain manages the agent logic and memory and handles calls to large language models (LLMs) such as GPT-3.5 and GPT-4, which interpret the task and generate browser commands. Together these allow complex, multi-step web automation driven by natural language prompts.
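As a rough illustration of the flow described above, the following Python sketch maps a JSON command, of the kind an LLM might emit, onto browser actions. It is not Chrome-GPT's actual code: the `Browser` interface, the `dispatch` function, the JSON command format, and the `RecordingBrowser` stand-in are all hypothetical names chosen for this example. `SeleniumBrowser` uses standard Selenium 4 calls and assumes the `selenium` package and a local Chrome install.

```python
import json
from typing import Protocol


class Browser(Protocol):
    """Actions the agent may request (hypothetical interface)."""

    def goto(self, url: str) -> str: ...
    def click(self, xpath: str) -> str: ...
    def type_text(self, xpath: str, text: str) -> str: ...
    def scroll(self, pixels: int) -> str: ...


class SeleniumBrowser:
    """Thin wrapper over Selenium WebDriver; needs `selenium` and Chrome."""

    def __init__(self) -> None:
        # Imported lazily so the rest of the sketch loads without Selenium.
        from selenium import webdriver
        from selenium.webdriver.common.by import By

        self._by = By
        self._driver = webdriver.Chrome()

    def goto(self, url: str) -> str:
        self._driver.get(url)
        return f"opened {url}"

    def click(self, xpath: str) -> str:
        self._driver.find_element(self._by.XPATH, xpath).click()
        return f"clicked {xpath}"

    def type_text(self, xpath: str, text: str) -> str:
        self._driver.find_element(self._by.XPATH, xpath).send_keys(text)
        return f"typed into {xpath}"

    def scroll(self, pixels: int) -> str:
        self._driver.execute_script("window.scrollBy(0, arguments[0]);", pixels)
        return f"scrolled {pixels}px"


class RecordingBrowser:
    """Stand-in that records calls, so dispatch can run without a browser."""

    def __init__(self) -> None:
        self.calls = []

    def goto(self, url: str) -> str:
        self.calls.append(("goto", url))
        return f"opened {url}"

    def click(self, xpath: str) -> str:
        self.calls.append(("click", xpath))
        return f"clicked {xpath}"

    def type_text(self, xpath: str, text: str) -> str:
        self.calls.append(("type_text", xpath, text))
        return f"typed into {xpath}"

    def scroll(self, pixels: int) -> str:
        self.calls.append(("scroll", pixels))
        return f"scrolled {pixels}px"


def dispatch(command: str, browser: Browser) -> str:
    """Run one JSON command, e.g. {"action": "click", "args": {"xpath": "//button"}}.

    The returned string is the observation that would be fed back to the LLM.
    """
    request = json.loads(command)
    handlers = {
        "goto": browser.goto,
        "click": browser.click,
        "type_text": browser.type_text,
        "scroll": browser.scroll,
    }
    action = request["action"]
    if action not in handlers:
        return f"unknown action: {action}"
    return handlers[action](**request.get("args", {}))
```

Passing a `RecordingBrowser` exercises the dispatch logic without opening a window; swapping in `SeleniumBrowser()` would send the same commands to a real Chrome instance.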
Quick Start & Requirements
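Install dependencies: poetry install
OpenAI API key (set via the OPENAI_API_KEY environment variable).
Run a task: python -m chromegpt -t "{your request}"
Run with an explicit agent and model: python -m chromegpt -v -a auto-gpt -m gpt-4 -t "{your request}"
Docker: docker-compose up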
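For scripted use, the commands above could be wrapped in a small Python launcher. This is only a sketch: `build_command` and `run` are hypothetical helper names, and the wrapper uses only the flags that appear in the commands above (`-t`, `-a`, `-m`, `-v`), without assuming anything about what other options the CLI accepts.

```python
import os
import subprocess
import sys


def build_command(task, agent=None, model=None, verbose=False):
    """Assemble the argument list for `python -m chromegpt`."""
    cmd = [sys.executable, "-m", "chromegpt"]
    if verbose:
        cmd.append("-v")
    if agent:
        cmd += ["-a", agent]
    if model:
        cmd += ["-m", model]
    cmd += ["-t", task]
    return cmd


def run(task, **options):
    """Check that the API key is present, then launch the agent."""
    if not os.environ.get("OPENAI_API_KEY"):
        raise RuntimeError("OPENAI_API_KEY is not set")
    # Passing a list (no shell) avoids quoting problems in the task text.
    return subprocess.run(build_command(task, **options), check=False).returncode
```

For example, `run("find the cheapest flight", agent="auto-gpt", model="gpt-4", verbose=True)` would launch the second command shown above, provided the package is installed in the current environment.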
Highlighted Details
Maintenance & Community
The project is experimental and primarily maintained by its creator, richardyc. Community engagement and development status are not explicitly detailed beyond the GitHub repository.
Licensing & Compatibility
The repository does not explicitly state a license. Users should verify licensing for commercial use or integration into closed-source projects.
Limitations & Caveats
The agent has limited web crawling capabilities and occasionally fails to identify buttons and input fields. Responses are slow, taking 1 to 10 seconds per action. Langchain agents are known to have trouble parsing GPT outputs; if such problems arise, trying a different agent type is suggested. The project is marked as experimental, so instability and unexpected behavior are possible.
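One generic way to soften the output-parsing problem is to extract the first valid JSON object from a model reply instead of requiring the whole reply to be well-formed. The sketch below is an assumption about how this could be handled, not the parser used by Langchain or Chrome-GPT; `extract_command` is a hypothetical helper and the command shape in the example is illustrative.

```python
import json
from typing import Optional


def extract_command(reply: str) -> Optional[dict]:
    """Return the first JSON object embedded in `reply`, or None if there is none."""
    decoder = json.JSONDecoder()
    start = reply.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores any trailing text after it.
            obj, _ = decoder.raw_decode(reply, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = reply.find("{", start + 1)
    return None
```

A caller that receives `None` could re-prompt the model or fall back to another agent type, rather than failing on the first malformed reply.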