agent by 0xfreysa

Adversarial agent game where players try to convince an AI to release funds

Created 1 year ago

788 stars

Top 44.6% on SourcePulse

View on GitHub

1 Expert Loves This Project

Jasper Zhang

Cofounder of Hyperbolic

Project Summary

Freysa is an adversarial agent game in which participants try to convince an autonomous AI agent, also named Freysa, to release a growing prize pool. It targets AI safety researchers, white-hat hackers, and anyone curious about human-AI interaction and prompt engineering who wants a high-stakes challenge.

How It Works

Freysa operates as a chat-based interface where users pay an increasing fee in ETH on the Base blockchain for each message they send. The AI is governed by a public system prompt that forbids releasing funds, and it decides whether to transfer funds through LLM tool calls. It learns from all historical interactions, adapting its responses and defenses.
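The tool-calling decision described above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the tool names approve_transfer and reject_transfer, the ToolCall shape, and the should_release_funds helper are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    """Hypothetical shape of the tool call returned by the LLM."""
    name: str         # which tool the model chose to call
    explanation: str  # the model's stated reason for its choice


def should_release_funds(call: ToolCall) -> bool:
    """Map the model's tool call to a release / no-release decision.

    The tool names are assumed for this sketch; any other tool name
    is treated as an error rather than as an implicit approval.
    """
    if call.name == "approve_transfer":
        return True
    if call.name == "reject_transfer":
        return False
    raise ValueError(f"unexpected tool call: {call.name}")
```

Treating an unknown tool name as an error, rather than defaulting to either outcome, keeps the fund release tied to one explicit model decision.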

Quick Start & Requirements

Install/Run: Interact via the web UI.
Prerequisites: Crypto wallet (e.g., MetaMask) for ETH payments on the Base network.
Links: Official Game UI (implied by description).

Highlighted Details

Adversarial Game: Designed to test AI safety and human control over AGI.
Dynamic Pricing: Query fees start at $10 and increase exponentially (0.78% per message) up to $4500.
Growing Prize Pool: Starts at $3000, with 70% of query fees added, growing exponentially.
Open Source: The full game logic and Freysa's system prompt are public.
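The pricing and prize pool figures above can be sketched as a small calculation. It assumes the 0.78% increase compounds per message, that the first message (index 0) costs exactly $10, and that 70% of every fee goes to the pool. The function names and the 0-based indexing are illustrative choices, not taken from the project's code.

```python
START_FEE_USD = 10.0     # fee for the first message
GROWTH_RATE = 0.0078     # 0.78% increase per message (assumed to compound)
MAX_FEE_USD = 4500.0     # fee cap
START_POOL_USD = 3000.0  # initial prize pool
POOL_SHARE = 0.70        # share of each fee added to the pool


def query_fee(n: int) -> float:
    """Fee in USD for the n-th message (0-indexed), capped at MAX_FEE_USD."""
    return min(START_FEE_USD * (1 + GROWTH_RATE) ** n, MAX_FEE_USD)


def prize_pool(messages_sent: int) -> float:
    """Prize pool in USD after the given number of paid messages."""
    fees = sum(query_fee(i) for i in range(messages_sent))
    return START_POOL_USD + POOL_SHARE * fees
```

Under these assumptions, the fee reaches the $4500 cap after several hundred messages and stays flat from then on, so the pool grows linearly once the cap is hit.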

Maintenance & Community

The project is open-source, suggesting community contributions are possible. No specific community links or maintainer details are provided in the README.

Licensing & Compatibility

The README does not specify a license. The open-source nature implies potential for community review and contribution, but commercial use or closed-source linking compatibility is undetermined without a license.

Limitations & Caveats

The game has a time limit: once 1500 attempts have been made without a winner, a global timer starts, and the game ends if an hour passes without a new attempt. The system prompt is public, but the exact LLM and its configuration are not detailed, which makes its behavior harder to predict.
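The timer rule can be sketched as a single check. This is one reading of the rule and not the project's implementation: it assumes the timer is armed once the attempt count reaches 1500 and that idle time is measured from the most recent attempt. The game_over function and its parameters are hypothetical.

```python
ATTEMPT_THRESHOLD = 1500   # attempts before the global timer is armed
IDLE_LIMIT_SECONDS = 3600  # one hour without a new attempt ends the game


def game_over(attempt_count: int, last_attempt_ts: float, now_ts: float) -> bool:
    """Return True if the game has ended through inactivity.

    Timestamps are seconds since an arbitrary epoch. Before the attempt
    threshold is reached, the timer is not running and the game cannot
    end this way.
    """
    if attempt_count < ATTEMPT_THRESHOLD:
        return False
    return now_ts - last_attempt_ts > IDLE_LIMIT_SECONDS
```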

Health Check

Last Commit

10 months ago

Responsiveness

Inactive

Star History

1 star in the last 30 days