This Emacs package integrates generative AI models into the Org mode workflow, letting users call LLMs for text generation and diffusion models for image creation directly within their documents. It is aimed at Emacs users who want AI-assisted content creation, summarization, and code refactoring without leaving the editor.
How It Works
The package uses special #+begin_ai...#+end_ai blocks within Org mode to define AI tasks. Users can choose a provider or backend (OpenAI, Azure, Anthropic, Perplexity, Stable Diffusion, local LLMs via oobabooga) and set parameters such as the system prompt, temperature, and image dimensions. It supports text generation, image creation and variation, speech input via Whisper, and speech output.
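As an illustration, a chat block and an image block might look like the following. This is a minimal sketch rather than a definitive reference: the header arguments (:model, :temperature, :image, :size) and the [SYS]: and [ME]: role markers are assumed from the project's documentation, and the model name and prompt text are placeholders, so check the linked docs for what your installed version accepts.

```org
# Chat request; the model name and temperature are illustrative placeholders.
#+begin_ai :model gpt-4 :temperature 0.7
[SYS]: You are a concise technical editor.

[ME]: Summarize the paragraph above in two sentences.
#+end_ai

# Image request; :size is assumed to set the image dimensions.
#+begin_ai :image :size 256x256
A watercolor painting of a lighthouse at dusk
#+end_ai
```

With point inside a block, C-c C-c should send the request, and the response is inserted into the block.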
Quick Start & Requirements
- Installation: MELPA, Straight.el, or manual checkout.
- Prerequisites: Emacs and an OpenAI API key (or credentials for another supported service). Optional: Whisper.el, ffmpeg, Stable Diffusion WebUI, and oobabooga/text-generation-webui for local models.
- Setup: Basic OpenAI integration is quick. Setting up local models or speech requires additional installations and configuration.
- Docs: https://github.com/rksm/org-ai
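For a basic OpenAI setup, a literate Org configuration could contain something like the block below. It is a sketch under stated assumptions: use-package is available, MELPA is configured as a package archive, and the key string is a placeholder to replace with your own credential. Consult the linked docs for other providers and for ways to keep the key out of your config.

```org
#+begin_src emacs-lisp
;; Assumes use-package is available and MELPA is configured.
(use-package org-ai
  :ensure t
  :commands (org-ai-mode org-ai-global-mode)
  :init
  (add-hook 'org-mode-hook #'org-ai-mode) ; enable org-ai in Org buffers
  (org-ai-global-mode)                    ; global commands outside Org mode
  :config
  ;; Placeholder value: replace with your own key.
  (setq org-ai-openai-api-token "<YOUR OPENAI API KEY>"))
#+end_src
```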
Highlighted Details
- Seamless integration with Org mode for text and image generation.
- Support for multiple AI providers including OpenAI, Azure, Anthropic, Perplexity, Stable Diffusion, and local LLMs.
- Speech input via Whisper, plus speech output.
- Global commands for operating on regions, files, and projects outside of Org mode buffers.
- Noweb support for dynamic content generation and code evaluation within prompts.
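The Noweb feature in the list above can be sketched as follows, assuming the :noweb yes header argument is what enables expansion inside an AI block (verify against the linked docs). The block name sayhi and the prompt are made up for illustration. In standard Org noweb syntax, <<sayhi()>> inserts the result of evaluating the named block, while <<sayhi>> would insert its source text.

```org
#+name: sayhi
#+begin_src shell
echo "Hello there"
#+end_src

# The reference below is expanded before the prompt is sent.
#+begin_ai :noweb yes
[ME]: Suggest five other ways to say: <<sayhi()>>
#+end_ai
```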
Maintenance & Community
- Actively maintained by rksm.
- Community support via GitHub issues. Sponsorships are encouraged.
Licensing & Compatibility
- MIT License.
- Compatible with commercial and closed-source projects.
Limitations & Caveats
- Image variation currently requires curl to be installed.
- Perplexity.ai API integration does not currently provide references/links.
- macOS speech setup involves specific system permissions and microphone configuration.