second-brain-ai-assistant-course  by decodingml

Open-source course for building an AI assistant with LLMs, agents, and RAG

created 7 months ago
1,400 stars

Top 29.5% on sourcepulse

Project Summary

This open-source course teaches how to build a production-ready "Second Brain" AI assistant using LLMs, agents, and Retrieval-Augmented Generation (RAG). It is aimed at ML/AI engineers, data engineers, and data scientists who learn by building, and it teaches practical skills that go beyond typical notebook tutorials. The course also serves as a template for creating personalized GenAI applications.

How It Works

The course guides users through building an end-to-end agentic RAG system. It covers data ingestion from sources such as Notion or web crawls, data normalization, LLM-based quality scoring, dataset generation via distillation, fine-tuning open-source LLMs (e.g., Llama 3.1 8B) with tools like Unsloth, and deploying the fine-tuned models. It also implements advanced RAG techniques such as contextual or parent retrieval, builds agents with smolagents, and applies LLMOps practices for monitoring and evaluation using ZenML and Opik.
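As an illustration of the parent retrieval idea mentioned above, here is a minimal sketch in plain Python. It is not the course's implementation: the function names are hypothetical, and a simple token-overlap score stands in for the embedding similarity a real system would use. The point it shows is that small child chunks are matched against the query, while the larger parent document is what gets returned as context.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase word tokens; a stand-in for a real embedding model."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_children(parent: str, size: int = 8) -> list[str]:
    """Split a parent document into child chunks of at most `size` words."""
    words = parent.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def build_index(parents: list[str], size: int = 8) -> list[tuple[int, str]]:
    """Return (parent_id, child_chunk) pairs so each child points to its parent."""
    return [
        (pid, child)
        for pid, doc in enumerate(parents)
        for child in split_children(doc, size)
    ]


def overlap_score(query: str, chunk: str) -> int:
    """Count tokens shared by query and chunk (multiset intersection)."""
    q, c = Counter(tokenize(query)), Counter(tokenize(chunk))
    return sum((q & c).values())


def parent_retrieve(query: str, parents: list[str],
                    index: list[tuple[int, str]], top_k: int = 1) -> list[str]:
    """Rank child chunks, then return the distinct parents of the best ones."""
    ranked = sorted(index, key=lambda item: overlap_score(query, item[1]), reverse=True)
    seen: list[int] = []
    for pid, _ in ranked:
        if pid not in seen:
            seen.append(pid)
        if len(seen) == top_k:
            break
    return [parents[pid] for pid in seen]
```

Matching on small chunks keeps the search precise, while returning the parent gives the LLM enough surrounding text to answer. In practice the overlap score would be replaced by vector similarity from an embedding model.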

Quick Start & Requirements

  • Install/Run: Clone the repository and follow the setup instructions in the documentation under apps/second-brain-offline and apps/second-brain-online.
  • Prerequisites: Intermediate Python, beginner knowledge of ML/LLMs/RAG. A modern laptop is sufficient; GPU is optional (cloud alternatives provided).
  • Cost: Minimal ($1-$5) for API usage (OpenAI, optional Hugging Face endpoints). Reading the course material without running the code is free.
  • Resources: Code Repository, Lessons

Highlighted Details

  • Builds an end-to-end agentic RAG system with LLMOps best practices.
  • Covers fine-tuning LLMs with Unsloth and deploying via Hugging Face.
  • Utilizes tools like ZenML for orchestration, Opik for evaluation, and smolagents for agents.
  • Includes advanced RAG techniques like contextual and parent retrieval.
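To make the contextual retrieval item concrete, the sketch below shows one common reading of the technique: each chunk is prefixed with a short description of where it comes from before it is embedded and indexed. This is an assumption about the general approach, not the course's code. `Chunk`, `make_context`, and `contextualize` are hypothetical names, and `make_context` is a placeholder for what would normally be an LLM call that writes a chunk-specific summary.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_title: str
    text: str            # the raw chunk, shown to the user or LLM at answer time
    contextualized: str  # context + chunk, the string that would be embedded


def make_context(doc_title: str, chunk_text: str) -> str:
    """Hypothetical placeholder: a real pipeline would ask an LLM to describe
    how `chunk_text` fits into the whole document. Here only the title is used."""
    return f"This passage is from the document '{doc_title}'."


def contextualize(doc_title: str, body: str, size: int = 12) -> list[Chunk]:
    """Split `body` into chunks of at most `size` words and prefix each with context."""
    words = body.split()
    chunks = []
    for i in range(0, len(words), size):
        text = " ".join(words[i:i + size])
        context = make_context(doc_title, text)
        chunks.append(Chunk(doc_title, text, f"{context} {text}"))
    return chunks
```

The `contextualized` string is what would be sent to the embedding model, so a chunk that reads ambiguously on its own still carries a hint about its source.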

Maintenance & Community

The course is a collaboration among Decoding ML, MongoDB, Comet, Opik, Unsloth, and ZenML. Core contributors are listed in the repository. Users can get help via GitHub Issues.

Licensing & Compatibility

Licensed under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

Although cloud alternatives are provided, performance may be better with a GPU. The course focuses on a specific set of tools and techniques, so adapting it to a significantly different workflow might require substantial modification.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

394 stars in the last 90 days
