NLP research paper for Twitter sentiment analysis
This repository provides a research-oriented implementation for Twitter sentiment analysis, exploring various feature sets and machine learning classifiers to identify optimal combinations. It is targeted at NLP researchers and practitioners interested in microblogging sentiment analysis. The project offers a modular approach to experiment with different preprocessing, stemming, and classification techniques.
How It Works
The project uses a modular architecture that separates feature extraction from classification. Preprocessing covers hashtags, mentions, URLs, emoticons, punctuation, and repeated characters, followed by stemming with the Porter stemmer. The feature sets explored are unigrams, bigrams, trigrams, and negation detection. Two classifiers are tested, Naive Bayes and Maximum Entropy, and the experiments compare a single-step approach (direct classification) with a two-step approach (subjective versus objective first, then positive versus negative).
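The pipeline described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the repository's code: the placeholder tokens (`URL`, `USER_MENTION`, `EMO_POS`, `EMO_NEG`), the `NOT_` prefix, the negation window, the `"objective"` label, and every function name are hypothetical. Stemming is left out to keep the sketch free of dependencies; a Porter stemmer could be applied to each token after `preprocess`.

```python
import re

# All names and placeholder tokens below are hypothetical, chosen for illustration.
URL_RE = re.compile(r"(https?://\S+|www\.\S+)")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")
REPEAT_RE = re.compile(r"(.)\1{2,}")  # any character repeated 3+ times
EMOTICONS = {":)": "EMO_POS", ":-)": "EMO_POS", ":D": "EMO_POS",
             ":(": "EMO_NEG", ":-(": "EMO_NEG"}
PLACEHOLDERS = {"URL", "USER_MENTION", "EMO_POS", "EMO_NEG"}
NEGATIONS = {"not", "no", "never", "cannot"}


def preprocess(tweet):
    """Normalize a tweet and return a list of tokens."""
    text = URL_RE.sub(" URL ", tweet)
    text = MENTION_RE.sub(" USER_MENTION ", text)
    text = HASHTAG_RE.sub(r" \1 ", text)       # keep the hashtag word, drop '#'
    for emoticon, tag in EMOTICONS.items():
        text = text.replace(emoticon, f" {tag} ")
    text = REPEAT_RE.sub(r"\1\1", text)        # "happyyyy" -> "happyy"
    tokens = []
    for raw in text.split():
        if raw in PLACEHOLDERS:
            tokens.append(raw)
            continue
        word = raw.lower().strip(".,!?;:\"'()[]")  # trim surrounding punctuation
        if word:
            tokens.append(word)
    return tokens


def mark_negation(tokens, window=2):
    """Prefix up to `window` tokens after a negation word with NOT_."""
    out, remaining = [], 0
    for tok in tokens:
        if tok in NEGATIONS or tok.endswith("n't"):
            out.append(tok)
            remaining = window
        elif remaining > 0:
            out.append("NOT_" + tok)
            remaining -= 1
        else:
            out.append(tok)
    return out


def ngrams(tokens, n):
    """Return all contiguous n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(tweet, max_n=3):
    """Unigram, bigram, and trigram features with negation marking."""
    tokens = mark_negation(preprocess(tweet))
    features = []
    for n in range(1, max_n + 1):
        features.extend(ngrams(tokens, n))
    return features


def classify_two_step(features, is_subjective, polarity):
    """Two-step scheme: subjectivity filter first, then polarity.

    `is_subjective` and `polarity` stand in for trained classifiers
    (for example Naive Bayes or Maximum Entropy models).
    """
    if not is_subjective(features):
        return "objective"
    return polarity(features)
```

A single-step setup would instead pass the output of `extract_features` to one classifier that predicts the final label directly; the two-step variant simply chains two binary decisions.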
Quick Start & Requirements
Highlighted Details
Maintenance & Community
The project author, Ayush Pareek, has sold the project to OnePanel Inc., which offers it as a commercial API. The code remains publicly hosted for the open-source community. No specific community channels (Discord, Slack) or active development signals are mentioned.
Licensing & Compatibility
The README does not state a license. Public hosting alone does not grant reuse rights, and because the project was sold to OnePanel Inc. and is offered as a commercial API, a permissive license should not be assumed. Verify the terms with the repository owner before any commercial use.
Limitations & Caveats
The README does not list specific limitations or known bugs. The project is research-focused and gives little guidance on running the code, so setting it up and reproducing the results may take significant effort. The best reported accuracy is 86.68%, which leaves room for improvement compared to state-of-the-art models.
4 years ago
Inactive