BettaFish  by 666ghj

Weibo analysis system for monitoring public sentiment

Created 1 year ago
6,602 stars

Top 7.7% on SourcePulse

Project Summary

This system monitors, analyzes, and predicts public opinion trends on social media, specifically Weibo. It targets governments, enterprises, and researchers needing to understand public sentiment, respond to events, and optimize decision-making by processing large volumes of social media data. The system offers real-time data collection, sentiment analysis, topic classification, and trend prediction.

How It Works

The system uses a modular architecture built on Flask. Scrapy collects data in real time by scraping the web. Preprocessing uses Jieba for Chinese word segmentation, followed by stop-word removal. SnowNLP performs sentiment analysis, and BERT models handle topic classification. Hosted models from OpenAI (GPT), Anthropic (Claude), and DeepSeek, accessed with API keys, provide further text analysis and prediction. Data is stored in a MySQL database, and results are visualized with Matplotlib.
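
To make the order of the preprocessing and sentiment stages concrete, here is a minimal sketch of segmentation, stop-word removal, and scoring. It is not the project's code: `whitespace_segment` and `lexicon_score` are hypothetical stand-ins for the roles the summary assigns to Jieba and SnowNLP, and the word lists are invented for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical word lists, invented for this sketch (not from the project).
STOP_WORDS = frozenset({"the", "a", "is", "was", "and"})
POSITIVE = frozenset({"good", "great", "love"})
NEGATIVE = frozenset({"bad", "terrible", "hate"})


def whitespace_segment(text: str) -> List[str]:
    """Stand-in tokenizer: lowercase, then split on whitespace."""
    return text.lower().split()


def remove_stop_words(tokens: List[str]) -> List[str]:
    """Drop tokens that carry little meaning on their own."""
    return [t for t in tokens if t not in STOP_WORDS]


def lexicon_score(tokens: List[str]) -> float:
    """Toy sentiment score in [0, 1]; 0.5 means neutral or unknown."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos + neg == 0:
        return 0.5
    return pos / (pos + neg)


@dataclass
class AnalyzedPost:
    text: str
    tokens: List[str]
    sentiment: float


def analyze(
    text: str,
    segment: Callable[[str], List[str]] = whitespace_segment,
    score: Callable[[List[str]], float] = lexicon_score,
) -> AnalyzedPost:
    """Run the stages in order: segment, filter stop words, score."""
    tokens = remove_stop_words(segment(text))
    return AnalyzedPost(text=text, tokens=tokens, sentiment=score(tokens))
```

The `segment` and `score` parameters are injectable so that a real tokenizer and a real sentiment model could replace the stand-ins without changing the stage order.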

Quick Start & Requirements

  • Install: git clone https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem.git followed by pip install -r requirements.txt.
  • Prerequisites: Python 3.7+, MySQL database, Conda (optional), a Weibo account for data collection.
  • AI API Keys: OpenAI, Anthropic, or DeepSeek API keys are required for AI analysis features.
  • Setup: Configure config.py for MySQL and set environment variables for AI API keys. Start the Flask app with python app.py.
  • Access: Navigate to http://localhost:5000.
  • Docs: English | Chinese documentation
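
The setup step mentions environment variables for the AI API keys without naming them. The sketch below shows one way an application might detect which providers are configured. The variable names `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, and `DEEPSEEK_API_KEY` are assumptions, as are both helper functions; check the repository's own documentation for the names it actually reads.

```python
import os
from typing import List, Mapping, Optional

# Assumed environment variable names; the project may use different ones.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
}


def configured_providers(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the providers whose key variable is set and non-blank."""
    env = os.environ if env is None else env
    return [
        name
        for name, var in PROVIDER_ENV_VARS.items()
        if env.get(var, "").strip()
    ]


def require_provider(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first configured provider, or fail with a clear message."""
    providers = configured_providers(env)
    if not providers:
        expected = ", ".join(PROVIDER_ENV_VARS.values())
        raise RuntimeError(f"No AI API key found; set one of: {expected}")
    return providers[0]
```

Passing a plain dictionary as `env` keeps the check testable without touching the real process environment; called with no argument, both functions read `os.environ`.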

Highlighted Details

  • Uses hosted AI models, including OpenAI GPT-3.5/4, Anthropic Claude-3, and DeepSeek-V3, for text analysis.
  • Integrates BERT for topic classification and SnowNLP for sentiment analysis.
  • Features real-time data collection via Scrapy and data visualization for insights.
  • Supports user management for personalized and secure services.

Maintenance & Community

The project welcomes contributions via pull requests. Contact is available through GitHub Issues or email (670939375@qq.com).

Licensing & Compatibility

Licensed under GPL-2.0. This is a copyleft license: derivative works that are distributed must also be released under GPL-2.0, which can complicate commercial or closed-source integration.

Limitations & Caveats

The system's reliance on a Weibo account for data collection and external AI API keys for advanced features introduces dependencies that could be subject to platform changes or costs. The GPL-2.0 license may restrict its use in proprietary software.

Health Check

  • Last commit: 6 hours ago
  • Responsiveness: Inactive
  • Pull requests (30d): 15
  • Issues (30d): 35
  • Star history: 8,306 stars in the last 30 days
