EduChat by ECNU-ICALK

Open-source educational chat model

Created 2 years ago

897 stars

Top 40.4% on SourcePulse

Project Summary

EduChat is an open-source educational chatbot developed by East China Normal University, targeting students, teachers, and parents. It aims to provide intelligent educational support, including automated question generation, homework grading, emotional support, and course tutoring, by leveraging large language models fine-tuned on diverse educational data.

How It Works

EduChat builds on foundation models such as LLaMA and Baichuan, fine-tuning them on a custom dataset, educhat-sft-002-data-osm, which contains over 4 million deduplicated Chinese and English instruction and dialogue examples. The project also provides CleanTool, a data cleaning utility for improving data quality. The models target educational scenarios: scenario-specific system prompts enable functions such as open-ended Q&A, emotional support, essay grading, and heuristic teaching.
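The scenario switching described above can be pictured as a system prompt that turns on one capability at a time. The sketch below is illustrative only: the function name build_system_prompt, the flag layout, and the prompt wording are all assumptions made for this example, not the format EduChat actually uses. The project's repository is the place to find the real prompts.

```python
# Hypothetical sketch: the names and prompt text here are assumptions,
# not EduChat's actual system prompt format.
SCENARIOS = (
    "open-ended Q&A",
    "emotional support",
    "essay grading",
    "heuristic teaching",
)


def build_system_prompt(active: str) -> str:
    """Return a system prompt that enables exactly one scenario."""
    if active not in SCENARIOS:
        raise ValueError(f"unknown scenario: {active!r}")
    # One line per scenario, marked True only for the active one.
    flags = "\n".join(f"- {name}: {name == active}" for name in SCENARIOS)
    return "You are an educational assistant.\nEnabled capabilities:\n" + flags


print(build_system_prompt("essay grading"))
```

Keeping the scenario list in one tuple means an unsupported mode fails loudly instead of silently producing a prompt with every capability disabled.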

Quick Start & Requirements

Install PyTorch first, then install the remaining dependency with: pip install transformers
Requires Python 3.8+. A GPU such as an A100 or A800 is recommended; FP16 inference uses about 15 GB of VRAM.
Local deployment examples are provided via Python scripts for Gradio demos and API services.
Official demos: https://www.educhat.top/ (internal testing), https://educhat.xiaoi.com/ (public testing).
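As a rough sanity check on the VRAM figure, and as a minimal sketch of local FP16 inference, the example below uses the standard Hugging Face transformers loading calls. Several things here are assumptions: the summary does not say which model size the ~15 GB refers to (a 7B model is assumed, whose weights alone come to roughly 13 GiB at 2 bytes per parameter), the model_id argument is a placeholder for whichever checkpoint you actually download, and trust_remote_code=True is assumed to be needed for Baichuan-style checkpoints. The project's own scripts may load the model differently.

```python
def fp16_weight_gib(n_params: float) -> float:
    """Approximate memory for the weights alone at 2 bytes per parameter."""
    return n_params * 2 / 2**30


def load_and_generate(model_id: str, prompt: str, max_new_tokens: int = 128) -> str:
    """Load a causal LM in FP16 on one GPU and return the generated continuation.

    model_id is a placeholder: pass the checkpoint name or local path you use.
    """
    # Imported lazily so the estimate above works without PyTorch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # half precision, as in the stated requirement
        trust_remote_code=True,     # assumption: checkpoint ships custom model code
    ).to("cuda")
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Weights only; activations and the KV cache add to this at run time.
print(f"7B weights in FP16: about {fp16_weight_gib(7e9):.1f} GiB")
```

The gap between the weight estimate and the quoted VRAM figure is plausibly runtime overhead, but that reading depends on the 7B assumption above.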

Highlighted Details

Offers multiple model sizes (1.8B, 7B, 13B, 14B, 32B) based on different architectures (Baichuan, Qwen1.5).
Includes a data cleaning tool (CleanTool) for improving dataset quality.
Provides example Python code for direct model inference and Gradio/API demos for interactive use.
Research paper available: https://arxiv.org/abs/2308.02773
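To make the idea of dataset deduplication concrete, here is a minimal sketch of exact-match deduplication after light normalization. It is not CleanTool's implementation, whose rules are not described in this summary; the field names instruction and response and the normalization steps are assumptions chosen for illustration.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so trivial variants compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


def deduplicate(examples):
    """Keep the first occurrence of each (instruction, response) pair.

    Each example is a dict with hypothetical keys 'instruction' and 'response'.
    """
    seen = set()
    unique = []
    for ex in examples:
        # A null byte separates the fields so ("ab", "c") != ("a", "bc").
        joined = normalize(ex["instruction"]) + "\x00" + normalize(ex["response"])
        key = hashlib.sha256(joined.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(ex)
    return unique


data = [
    {"instruction": "What is 2 + 2?", "response": "4"},
    {"instruction": "what is  2 + 2?", "response": "4"},  # differs only in case and spacing
    {"instruction": "Define photosynthesis.", "response": "How plants turn light into chemical energy."},
]
print(len(deduplicate(data)))
```

Hashing the normalized text keeps memory use bounded per example, which matters at the scale of millions of records; near-duplicate detection would need a different technique.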

Maintenance & Community

Developed by the EduNLP team at East China Normal University.
Acknowledges contributions from LLaMA, Baichuan, and Open Assistant.
Future plans include enhancing logical reasoning, personalization, and tool-calling capabilities.

Licensing & Compatibility

Code licensed under Apache 2.0.
Data licensed under CC BY-NC 4.0.
Restrictions: commercial use and any use that may cause societal harm are prohibited.

Limitations & Caveats

The model may generate factually incorrect or biased responses, and its capabilities in reasoning, coding, and multi-turn dialogue require further improvement. The project is intended for research purposes only.

