Chatito by rodrigopivi

DSL for chatbot dataset generation

Created 7 years ago
885 stars

Top 40.9% on SourcePulse

Project Summary

Chatito generates training and testing datasets for chatbots and NLP models. It targets developers building conversational AI systems and offers a simple domain-specific language (DSL) for defining sentence structures, variations, and entity annotations, which streamlines the creation of diverse training data.

How It Works

Chatito uses a custom DSL to describe possible sentence combinations and entity variations. The DSL lets users define intents, slots (entities), and their relationships, including synonyms and contextual rules. The core generation engine, implemented in TypeScript, parses the DSL and produces datasets in several formats. Generating a wide range of linguistic examples supports data augmentation and helps reduce model overfitting.
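
The expansion step can be pictured with a minimal Python sketch. It is not Chatito's TypeScript engine and does not use Chatito's DSL syntax: the `GRAMMAR` dictionary, the `slot:` and `alias:` prefixes, and the `expand` function are hypothetical names chosen for illustration. The sketch enumerates every combination a sentence template allows and records which pieces came from a slot.

```python
from itertools import product

# Hypothetical grammar, not Chatito syntax: one intent template plus the
# alias and slot definitions it refers to.
GRAMMAR = {
    "intent": ("greet", ["alias:hi", "slot:name"]),
    "alias": {"hi": ["hi", "hey"]},
    "slot": {"name": ["Janis", "Bob"]},
}


def expand(grammar):
    """Return every sentence the template can produce, with slot annotations."""
    intent, template = grammar["intent"]
    choices = []
    for part in template:
        kind, _, name = part.partition(":")
        # Each template part becomes a list of (text, slot-name-or-None) options.
        label = name if kind == "slot" else None
        choices.append([(text, label) for text in grammar[kind][name]])
    examples = []
    for combo in product(*choices):
        text = " ".join(piece for piece, _ in combo)
        entities = [(label, piece) for piece, label in combo if label]
        examples.append({"intent": intent, "text": text, "entities": entities})
    return examples


examples = expand(GRAMMAR)
for example in examples:
    print(example)
```

Two alias values times two slot values give four examples. Real grammars grow multiplicatively in the same way, which is why a generator typically caps how many examples it keeps.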

Quick Start & Requirements

Highlighted Details

  • Supports multiple output formats including Rasa, Flair (NER/Text Classification), LUIS, and Snips NLU.
  • The DSL includes features that help reduce overfitting through controlled data generation, and supports custom entity arguments.
  • VS Code syntax highlighting plugin available for the DSL.
  • Mentioned in "AI Blueprints: How to build and deploy AI business projects" for chatbot examples.
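
To make the idea of an output format concrete, here is a rough Python sketch that builds one example in the legacy Rasa NLU JSON layout (`rasa_nlu_data` with `common_examples`, each entity carrying `start`, `end`, `value`, and `entity`). Whether Chatito's Rasa adapter emits exactly this shape is an assumption, and `to_rasa_example` is a hypothetical helper, not part of Chatito.

```python
import json


def to_rasa_example(intent, parts):
    """Build one example from (text, entity-or-None) parts joined by spaces."""
    text = ""
    entities = []
    for piece, entity in parts:
        if text:
            text += " "
        # Character offsets are computed against the sentence built so far.
        start = len(text)
        text += piece
        if entity is not None:
            entities.append(
                {"start": start, "end": len(text), "value": piece, "entity": entity}
            )
    return {"text": text, "intent": intent, "entities": entities}


example = to_rasa_example("greet", [("hi", None), ("Janis", "name")])
dataset = {"rasa_nlu_data": {"common_examples": [example]}}
print(json.dumps(dataset, indent=2))
```

The entity offsets index into the final sentence, so `example["text"][3:8]` recovers the annotated value.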

Maintenance & Community

  • Maintained by Rodrigo Pimentel.
  • No explicit community links (Discord/Slack) or roadmap mentioned in the README.

Licensing & Compatibility

  • License not explicitly stated in the README.
  • Compatible with commercial chatbot platforms like DialogFlow, Wit.ai, and Watson.

Limitations & Caveats

  • The Flair adapter is available only in the Node.js npm CLI package, not in the online IDE.
  • According to the README, samples are not shuffled between intents by default, which makes review easier; some training pipelines may require manual shuffling.
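
Manual shuffling of this kind can be sketched in a few lines of Python. The input layout (a dictionary mapping intent names to lists of sentences) and the `shuffle_examples` helper are assumptions for illustration, not Chatito's actual output format or API.

```python
import random


def shuffle_examples(examples_by_intent, seed=0):
    """Flatten per-intent examples and shuffle them reproducibly."""
    flat = [
        {"intent": intent, "text": text}
        for intent, texts in examples_by_intent.items()
        for text in texts
    ]
    # A dedicated Random instance keeps the global random state untouched.
    random.Random(seed).shuffle(flat)
    return flat


grouped = {"greet": ["hi Janis", "hey Bob"], "bye": ["bye Janis", "see you Bob"]}
mixed = shuffle_examples(grouped, seed=42)
for row in mixed:
    print(row["intent"], "|", row["text"])
```

Fixing the seed keeps the shuffled order reproducible across runs, which helps when comparing training results.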

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 2 stars in the last 30 days
