ML tool for entity recognition and disambiguation
Entity-fishing is a machine learning tool that recognizes entities and disambiguates them against Wikidata. It supports 15 languages and several input types: raw text, document-level PDF analysis, and search queries. It targets researchers and developers who need efficient, accurate entity linking, and is presented as a faster, lighter alternative to models such as BLINK on specific datasets.
How It Works
Entity-fishing employs a query DSL for disambiguation and draws on a large knowledge base derived from Wikidata, comprising millions of entities and their embeddings. The architecture is optimized for speed, sustaining high token-processing rates on a single server, and it includes an in-house Named Entity Recognizer for English and French.
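To make the query DSL more concrete, the sketch below builds a JSON disambiguation query for a piece of raw text. It is an illustration only: the helper name `build_query` is hypothetical, and the field names (`text`, `language`, `mentions`, `entities`) are assumptions about the query format that should be checked against the project's own documentation before use.

```python
import json


def build_query(text, lang="en", mentions=("ner", "wikipedia")):
    """Build a JSON disambiguation query for raw text.

    Hypothetical helper: the field names below are assumed,
    not confirmed by this summary.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    query = {
        "text": text,                # raw text to analyse
        "language": {"lang": lang},  # assumed: language code of the input
        "mentions": list(mentions),  # assumed: mention recognizers to apply
        "entities": [],              # assumed: no pre-annotated entities
    }
    # Keep non-ASCII characters readable in the serialized query.
    return json.dumps(query, ensure_ascii=False)


if __name__ == "__main__":
    print(build_query("Austria fought the Serbian army at the Battle of Cer."))
```

The resulting string would then be sent to a running entity-fishing server. The endpoint and transport are deliberately left out here because this summary does not specify them.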
Quick Start & Requirements
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The project is described as a "work-in-progress side project" and is at version 0.0.6. Benchmarks show strong performance, but the disambiguation F1-score is noted as needing improvement in future versions. Expect an initial server start-up time of 15-30 seconds.