ACE by Alibaba-NLP

Framework for automated embedding concatenation in structured prediction tasks

created 4 years ago
309 stars

Top 88.0% on sourcepulse

Project Summary

ACE (Automated Concatenation of Embeddings) is a framework that automates the search for, and concatenation of, word embeddings in structured prediction tasks in NLP, with the goal of state-of-the-art accuracy. It targets NLP researchers and practitioners who need to optimize embedding combinations for tasks such as Named Entity Recognition (NER), Part-of-Speech (POS) tagging, and dependency parsing. Its main benefit is that effective embedding combinations are discovered automatically, which reduces manual experimentation.

How It Works

ACE employs a reinforcement learning approach to explore various combinations and concatenations of pre-trained word embeddings. It treats the selection and combination of embeddings as a sequential decision-making process, learning a policy to construct optimal embedding representations for specific downstream tasks. This method allows for dynamic adaptation and discovery of synergistic embedding interactions that might not be obvious through manual selection.
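
The search loop described above can be illustrated with a toy sketch. This is not the ACE implementation: it assumes a simple controller that keeps one logit per candidate embedding, samples a keep/drop mask, and updates the logits with a REINFORCE-style rule. The names `sample_mask`, `concatenate`, `toy_reward`, and `search` are hypothetical, and `toy_reward` stands in for the expensive step of training a task model on the selected embeddings and measuring its accuracy.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_mask(logits, rng):
    """Sample one keep/drop decision per candidate embedding."""
    while True:
        mask = [1 if rng.random() < sigmoid(t) else 0 for t in logits]
        if any(mask):  # an empty concatenation is useless, so resample
            return mask


def concatenate(vectors, mask):
    """Concatenate the selected per-token vectors into one representation."""
    out = []
    for vec, keep in zip(vectors, mask):
        if keep:
            out.extend(vec)
    return out


def toy_reward(mask, useful=(1, 0, 1, 0)):
    """Hypothetical stand-in for the accuracy of a task model trained on
    the selected embeddings: the fraction of decisions matching `useful`."""
    matches = sum(1 for m, u in zip(mask, useful) if m == u)
    return matches / len(useful)


def search(num_embeddings=4, steps=300, lr=0.3, seed=0):
    """REINFORCE-style search over embedding selections (illustrative only)."""
    rng = random.Random(seed)
    logits = [0.0] * num_embeddings  # start at probability 0.5 per embedding
    baseline = 0.0                   # running average of observed rewards
    best_mask, best_reward = None, -1.0
    for _ in range(steps):
        mask = sample_mask(logits, rng)
        reward = toy_reward(mask)
        if reward > best_reward:
            best_mask, best_reward = mask, reward
        advantage = reward - baseline
        for i, keep in enumerate(mask):
            # Gradient of the Bernoulli log-probability w.r.t. the logit.
            logits[i] += lr * advantage * (keep - sigmoid(logits[i]))
        baseline = 0.9 * baseline + 0.1 * reward
    return best_mask, best_reward, logits
```

Calling `search()` returns the best mask seen, its reward, and the final logits. In a real setting the reward would come from a full training run per sampled mask, which is why the number of search steps matters far more than it does in this toy version.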

Quick Start & Requirements

  • Install dependencies with pip: pip install -r requirements.txt
  • Requires PyTorch 1.1+ and Python 3.6+.
  • transformers library version 3.0.0 is a key dependency.
  • Automatic download of most embeddings is supported; manual download and path configuration are required for some.
  • Official documentation and instructions for reproducing results are available.
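
Because the version requirements above are fairly strict, it can help to check an environment before training. The following is a minimal sketch under stated assumptions, not part of ACE: `parse_version` and `check_environment` are hypothetical helpers, and the check treats transformers 3.0.0 as an exact pin, which is one reading of the requirement.

```python
def parse_version(text):
    """Turn a version string such as '1.13.1+cu117' into (1, 13, 1)."""
    core = text.split("+")[0]  # drop local build suffixes
    parts = []
    for piece in core.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_environment(python_version, torch_version, transformers_version):
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []
    if parse_version(python_version) < (3, 6):
        problems.append("Python 3.6+ is required, found " + python_version)
    if parse_version(torch_version) < (1, 1):
        problems.append("PyTorch 1.1+ is required, found " + torch_version)
    # Assumption: the listed transformers version is treated as an exact pin.
    if parse_version(transformers_version) != (3, 0, 0):
        problems.append("transformers 3.0.0 is expected, found " + transformers_version)
    return problems
```

In an installed environment the three arguments could be taken from `platform.python_version()`, `torch.__version__`, and `transformers.__version__`.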

Highlighted Details

  • Achieved state-of-the-art results on multiple NER, POS tagging, and dependency parsing benchmarks.
  • Supports fine-tuning transformer embeddings and extracting document-level features for enhanced performance.
  • Offers flexibility in configuring datasets, embeddings, and training parameters via YAML files.
  • Provides pretrained models for NER and dependency parsing.

Maintenance & Community

The project is associated with Alibaba-NLP and an ACL-IJCNLP 2021 paper. Recent news in the README highlights the related projects AdaSeq and KB-NER. Contact information for questions is provided.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The code is based on an older version of flair (0.4.3) with significant modifications, so it may not be compatible with newer flair versions. Embeddings that have to be downloaded manually also need their paths configured by hand. The README does not detail specific limitations or known bugs.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 5 stars in the last 90 days
