AutoWebGLM by THUDM

LLM-based web navigating agent (KDD'24)

Created 1 year ago

926 stars

Top 39.2% on SourcePulse

Project Summary

AutoWebGLM is an LLM-based agent designed for efficient automated web navigation. It targets researchers and developers building AI agents that interact with the web, and it aims to improve webpage comprehension and task execution through an HTML simplification algorithm, a hybrid human-AI training approach, and reinforcement learning with rejection sampling.

How It Works

AutoWebGLM builds on the ChatGLM3-6B model. To make pages more digestible for the LLM, it applies an HTML simplification algorithm that mimics how humans browse. The model is trained on curated web browsing data using a hybrid human-AI approach, then further improved with reinforcement learning and rejection sampling to boost webpage comprehension, browser operation efficiency, and task decomposition.
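To make the idea of HTML simplification concrete, here is a minimal sketch using only the Python standard library. It is not AutoWebGLM's algorithm: the tag sets, the kept attributes, the numbering scheme, and the names `SimplifyingParser` and `simplify_html` are all assumptions chosen for illustration. It only shows the general shape of the technique, which is to drop non-visible markup, keep visible text, and give each interactive element a short index that an agent could refer to.

```python
from html.parser import HTMLParser

# Hypothetical choices for illustration only; not taken from AutoWebGLM.
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}
CONTAINER_TAGS = INTERACTIVE_TAGS - {"input"}  # interactive tags that wrap text
SKIPPED_TAGS = {"script", "style", "noscript"}
KEPT_ATTRS = ("type", "placeholder", "value", "aria-label")


class SimplifyingParser(HTMLParser):
    """Collects visible text and indexed interactive elements as short lines."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines = []
        self._skip_depth = 0    # > 0 while inside script/style/noscript
        self._next_id = 0       # index assigned to the next interactive element
        self._open_line = None  # line of the interactive element being read

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
            return
        if self._skip_depth or tag not in INTERACTIVE_TAGS:
            return
        attr_map = dict(attrs)
        kept = " ".join(
            f'{k}="{attr_map[k]}"' for k in KEPT_ATTRS if attr_map.get(k)
        )
        self.lines.append(f"[{self._next_id}] <{tag}{' ' + kept if kept else ''}>")
        self._next_id += 1
        if tag in CONTAINER_TAGS:
            self._open_line = len(self.lines) - 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in CONTAINER_TAGS:
            self._open_line = None

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if not text or self._skip_depth:
            return
        if self._open_line is not None:
            # Attach the text to the interactive element that contains it.
            self.lines[self._open_line] += " " + text
        else:
            self.lines.append(text)


def simplify_html(html: str) -> str:
    """Return a compact, line-based view of the page (illustrative only)."""
    parser = SimplifyingParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)
```

Calling `simplify_html(page_source)` on a raw page returns a few short lines instead of the full markup, which is the kind of compact input a 6B-parameter model can handle within its context window. A real system would also need to account for visibility, layout, and element hierarchy, none of which this sketch attempts.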

Quick Start & Requirements

Install/Run: Refer to the ChatGLM3-6B repository for inference code. Evaluation code is provided.
Prerequisites: Requires modifications to WebArena and MiniWob++ environments.
Resources: Evaluation datasets are available from AutoWebBench and Mind2Web.
Links: AutoWebBench, Mind2Web, ChatGLM3-6B, WebArena, MiniWob++

Highlighted Details

Introduces a novel HTML simplification algorithm inspired by human browsing patterns.
Uses a hybrid human-AI training strategy built on curated web browsing data.
Utilizes reinforcement learning and rejection sampling for enhanced agent performance.
Includes AutoWebBench, a bilingual benchmark for evaluating web navigation agents.
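The rejection sampling idea mentioned in the list above can be illustrated generically. The sketch below is not the project's training code: `rejection_sample`, `generate`, and `is_success` are hypothetical names, and the loop only shows the basic pattern of sampling several candidate outputs per task, keeping the first one that a checker accepts, and discarding the rest. The accepted pairs could then serve as additional fine-tuning data.

```python
import random
from typing import Callable, List, Sequence, Tuple


def rejection_sample(
    tasks: Sequence[str],
    generate: Callable[[str, random.Random], str],
    is_success: Callable[[str, str], bool],
    num_samples: int = 4,
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Keep at most one accepted (task, candidate) pair per task.

    `generate` stands in for sampling from a policy model and `is_success`
    for a verifier or environment check; both are placeholders.
    """
    rng = random.Random(seed)
    accepted = []
    for task in tasks:
        for _ in range(num_samples):
            candidate = generate(task, rng)
            if is_success(task, candidate):
                accepted.append((task, candidate))
                break  # stop sampling once one candidate passes
        # Tasks with no passing candidate contribute nothing.
    return accepted
```

In this pattern the quality of the result depends almost entirely on the checker: a permissive `is_success` lets flawed trajectories into the training set, while a strict one can leave hard tasks with no examples at all.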

Maintenance & Community

The project is maintained by THUDM and accompanies a KDD'24 publication. The maintainers invite users to star the repository to support further development.

Licensing & Compatibility

Licensed under Apache-2.0. Open-sourced data is for research purposes only.

Limitations & Caveats

The project requires modifications to existing environments (WebArena, MiniWob++), and the inference code depends on the separate ChatGLM3-6B repository.
