fjzzq2002
Semantic search engine for competitive programming problems
Top 88.1% on SourcePulse
This project provides a semantic search engine for competitive programming problems, enabling users to find similar problems based on natural language descriptions. It simplifies problem statements using LLMs, generates embeddings, and performs vector searches, making it useful for competitive programmers seeking to discover new problems or identify duplicates.
How It Works
The core approach involves using a Large Language Model (LLM) to simplify and paraphrase competitive programming problem statements, removing extraneous background information. These simplified texts are then embedded into vector representations. When a user queries the system, their query is also embedded, and a vector search is performed against the problem embeddings to find semantically similar problems. This method leverages recent advancements in LLM capabilities and affordability for effective document retrieval.
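The retrieval step described above can be sketched in a few lines. This is a minimal illustration, not the project's code: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, the three simplified statements are made up, and the LLM simplification step is assumed to have already run.

```python
from collections import Counter
import math


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: L2-normalised word counts."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (a plain dot product)."""
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


def search(query, index, top_k=3):
    """Embed the query and return the ids of the top_k most similar problems."""
    q = toy_embed(query)
    scored = sorted(((cosine(q, emb), pid) for pid, emb in index.items()), reverse=True)
    return [pid for _, pid in scored[:top_k]]


# Made-up simplified statements, keyed by a hypothetical problem id.
simplified = {
    "1000": "given two integers output their sum",
    "1001": "find the shortest path between two nodes in a weighted graph",
    "1002": "count the number of inversions in an array",
}

# Build the index once; every query is then compared against all stored vectors.
index = {pid: toy_embed(text) for pid, text in simplified.items()}
results = search("shortest path in a weighted graph", index, top_k=1)
print(results)
```

A real deployment would swap `toy_embed` for a learned embedding model and the linear scan for a vector index, but the flow stays the same: embed once at build time, embed the query at search time, rank by similarity.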
Quick Start & Requirements
Install dependencies: pip install -r requirements.txt
Problem data: place problems in the problems/ directory in JSON format (e.g., problems/1000.json).
Run, in order: python -m src.build_summary, python -m src.build_embedding, python -m src.build_locale, and finally python -m src.ui to start the server.
Highlighted Details
Maintenance & Community
@fstqwq are acknowledged.
Licensing & Compatibility
Limitations & Caveats
The project does not provide scraped vjudge problems or a vjudge scraper due to copyright concerns, and it does not process PDF statements. Users must acquire their own data or contribute it.