Solution for sentiment analysis and topic recognition
This repository provides the winning solution for the BDCI 2018 Automotive Industry User Opinion Topic and Sentiment Recognition competition. It offers a pipeline for aspect-based sentiment analysis, aimed at researchers and practitioners in NLP and sentiment analysis. The solution ensembles multiple deep learning models with a fine-tuned BERT for both topic classification and sentiment polarity prediction.
How It Works
The approach is a two-stage pipeline. The first stage is multi-label topic classification, trained with a Binary Cross-Entropy loss. The second stage is multi-class sentiment polarity prediction, conditioned on the predicted topic. For topic classification, nine models (eight deep learning models with different embeddings, plus one fine-tuned BERT) are ensembled via stacking with Logistic Regression (LR). Sentiment analysis uses 13 models (three novel network designs with four embeddings each, plus one fine-tuned BERT), also stacked with LR. The ensemble combines the strengths of the different architectures and embeddings.
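The stacking step for the multi-label topic stage can be illustrated with a minimal sketch. This is not the repository's code: it assumes scikit-learn and NumPy are installed, replaces the nine base models with synthetic noisy probabilities, and uses a hypothetical count of four topics. One Logistic Regression meta-classifier is fitted per topic on the concatenated base-model probabilities.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# 9 base models matches the text; 4 topics and 200 samples are illustrative assumptions.
n_samples, n_models, n_topics = 200, 9, 4
n_train = 150

# Synthetic multi-label ground truth: one binary indicator per topic.
y = rng.integers(0, 2, size=(n_samples, n_topics))

# Stand-in for base-model outputs: noisy per-topic probabilities.
# In a real stacking setup these would be out-of-fold predictions of each base model.
noise = rng.normal(0.0, 0.25, size=(n_samples, n_models, n_topics))
base_probs = np.clip(0.2 + 0.6 * y[:, None, :] + noise, 0.0, 1.0)

# Meta-features: all base-model probabilities for a sample, flattened into one row.
X_meta = base_probs.reshape(n_samples, n_models * n_topics)

# One binary Logistic Regression meta-classifier per topic (multi-label stacking).
stacked = np.zeros((n_samples - n_train, n_topics))
for t in range(n_topics):
    meta = LogisticRegression(max_iter=1000)
    meta.fit(X_meta[:n_train], y[:n_train, t])
    stacked[:, t] = meta.predict_proba(X_meta[n_train:])[:, 1]

# Threshold each topic independently, as is usual for a BCE-trained multi-label output.
pred = (stacked >= 0.5).astype(int)
accuracy = (pred == y[n_train:]).mean()
print(f"held-out per-label accuracy: {accuracy:.3f}")
```

The 0.5 threshold is also an assumption; a competition pipeline would typically tune it on validation data. The same pattern applies to the sentiment stage, with a multi-class meta-classifier in place of the per-topic binary ones.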
Quick Start & Requirements
pip install -r requirements.txt
(Requirements are not explicitly listed, but are implied by imports such as skmulti-learn, tqdm, and hanlp.)
Highlighted Details
Maintenance & Community
Contact: sqfzf69(At)163.com. The README mentions potential future updates for code optimization and BERT compatibility.
Licensing & Compatibility
The repository does not explicitly state a license. The code is provided for competition purposes; commercial use or integration into closed-source projects would require clarification.
Limitations & Caveats
The code is noted as not optimized and may contain imperfections (e.g., lack of batching in some networks). The README highlights compatibility issues with newer Hugging Face BERT conversion scripts and recommends using the provided or older conversion scripts. The solution does not handle "UNK" (unknown, out-of-vocabulary) tokens, so it requires modification for real-world applications. Pre-trained model loading is tested on GPU only.
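One common way to add the missing "UNK" handling is to reserve a vocabulary id for unknown tokens and fall back to it at encoding time. The sketch below shows that general technique only; the names `build_vocab`, `encode`, `PAD`, and `UNK` are hypothetical and do not come from the repository.

```python
from collections import Counter

# Hypothetical special tokens; the repository's own vocabulary layout may differ.
PAD, UNK = "<pad>", "<unk>"


def build_vocab(tokenized_texts, min_freq=1):
    """Map each token seen at least min_freq times to an integer id."""
    counts = Counter(tok for text in tokenized_texts for tok in text)
    vocab = {PAD: 0, UNK: 1}
    for tok, freq in counts.most_common():
        if freq >= min_freq:
            vocab[tok] = len(vocab)
    return vocab


def encode(tokens, vocab):
    """Convert tokens to ids, mapping out-of-vocabulary tokens to the UNK id."""
    unk_id = vocab[UNK]
    return [vocab.get(tok, unk_id) for tok in tokens]


# Toy tokenized training sentences (illustrative data only).
train_texts = [["油耗", "很", "低"], ["空间", "很", "大"]]
vocab = build_vocab(train_texts)

# "高" never appeared in training, so it is encoded as the UNK id instead of raising KeyError.
ids = encode(["油耗", "很", "高"], vocab)
print(ids)
```

For the UNK embedding to be useful at inference time, some training tokens (for example, rare ones below `min_freq`) would also need to be mapped to UNK during training.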