picMenu  by Nutlope

AI app to visualize menus from photos

created 8 months ago
417 stars

Top 71.3% on sourcepulse

Project Summary

PicMenu is an AI-powered application that transforms restaurant menu photos into visually appealing dish images. It targets restaurant owners and diners seeking an efficient way to digitize and present menu items. The core benefit is the rapid generation of attractive visuals for menu items from a single photograph.

How It Works

The application runs a three-stage AI pipeline, with every model served through Together AI. First, Llama 3.2 Vision 90B extracts menu items from the uploaded image. Next, Llama 3.1 8B structures the extracted text as JSON. Finally, Flux Schnell generates an image for each dish from the structured data. Splitting the work this way lets each model handle the task it is suited to: a vision model reads the photo, a small text model formats the data, and an image model produces the dish pictures.
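
The three stages can be sketched as request payloads. This is a minimal sketch, not the project's actual code. It assumes Together AI's OpenAI-compatible chat and image request shapes. The model identifier strings, prompt wording, image dimensions, and function names are illustrative assumptions and should be checked against Together AI's documentation.

```typescript
// Assumed model identifiers; verify against Together AI's current model list.
const VISION_MODEL = "meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo";
const JSON_MODEL = "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo";
const IMAGE_MODEL = "black-forest-labs/FLUX.1-schnell";

type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface ChatRequest {
  model: string;
  messages: { role: "system" | "user"; content: string | ContentPart[] }[];
  response_format?: { type: "json_object" };
}

interface ImageRequest {
  model: string;
  prompt: string;
  width: number;
  height: number;
}

// Stage 1: ask the vision model to read the menu photo.
function buildExtractionRequest(menuImageUrl: string): ChatRequest {
  return {
    model: VISION_MODEL,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: "List every dish on this menu with its price and description." },
          { type: "image_url", image_url: { url: menuImageUrl } },
        ],
      },
    ],
  };
}

// Stage 2: ask the smaller model to restructure the raw text as JSON.
function buildStructuringRequest(rawMenuText: string): ChatRequest {
  return {
    model: JSON_MODEL,
    messages: [
      {
        role: "system",
        content: 'Return a JSON object with a "menuItems" array of {name, price, description}.',
      },
      { role: "user", content: rawMenuText },
    ],
    response_format: { type: "json_object" },
  };
}

// Stage 3: build one image request per dish.
function buildImageRequest(name: string, description: string): ImageRequest {
  return {
    model: IMAGE_MODEL,
    prompt: `A photo of ${name}: ${description}`,
    width: 1024,
    height: 768,
  };
}
```

Keeping the payload builders free of network calls makes each stage easy to inspect and test. The actual HTTP requests (and the API key header) would wrap these objects.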

Quick Start & Requirements

  • Install dependencies with npm install, then start the development server with npm run dev.
  • Prerequisites:
    • Together AI API key.
    • Configured S3 bucket with credentials.
    • Node.js environment.
  • Setup involves cloning the repository, configuring environment variables (.env file with TOGETHER_API_KEY and S3 credentials), and running the development server.
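
A startup check for the required environment variables might look like the sketch below. TOGETHER_API_KEY is the name given above. The S3_* names are placeholders, since the exact S3 variable names are not stated here, and the helper functions are hypothetical rather than part of the project.

```typescript
// TOGETHER_API_KEY comes from the setup notes above. The S3_* names are
// placeholders: substitute whatever names the project's .env actually uses.
const REQUIRED_VARS = [
  "TOGETHER_API_KEY",
  "S3_BUCKET",
  "S3_ACCESS_KEY",
  "S3_SECRET_KEY",
] as const;

// Returns the names of required variables that are unset or blank.
function missingEnvVars(
  env: Record<string, string | undefined>,
  required: readonly string[],
): string[] {
  return required.filter((name) => (env[name] ?? "").trim() === "");
}

// Throws a single error naming every missing variable.
function assertEnv(env: Record<string, string | undefined>): void {
  const missing = missingEnvVars(env, REQUIRED_VARS);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

In a Node.js process, calling assertEnv(process.env) once at startup surfaces a misconfigured .env file immediately instead of at the first API or upload call.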

Highlighted Details

  • Utilizes Llama 3.2 Vision 90B for robust menu item extraction.
  • Employs Llama 3.1 8B with JSON mode for structured data output.
  • Leverages Flux Schnell for image generation.
  • Built with Next.js, TypeScript, Shadcn UI, and Tailwind CSS.
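
Even with JSON mode, it is prudent to validate the model's output before using it. The sketch below parses a JSON string defensively. The MenuItem shape, the menuItems key, and the parseMenuItems helper are assumptions made for illustration, not the project's actual schema.

```typescript
// Hypothetical shape for one structured menu entry.
interface MenuItem {
  name: string;
  price: string;
  description: string;
}

// Parses model output and keeps only entries with a non-empty name.
// Returns an empty array for invalid JSON or an unexpected shape.
function parseMenuItems(modelOutput: string): MenuItem[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(modelOutput);
  } catch {
    return [];
  }
  if (typeof parsed !== "object" || parsed === null) return [];

  const items = (parsed as { menuItems?: unknown }).menuItems;
  if (!Array.isArray(items)) return [];

  const result: MenuItem[] = [];
  for (const item of items) {
    if (typeof item !== "object" || item === null) continue;
    const { name, price, description } = item as Record<string, unknown>;
    if (typeof name === "string" && name.trim() !== "") {
      result.push({
        name: name.trim(),
        // Missing or non-string fields fall back to an empty string.
        price: typeof price === "string" ? price : "",
        description: typeof description === "string" ? description : "",
      });
    }
  }
  return result;
}
```

Dropping malformed entries instead of throwing means one bad item does not block image generation for the rest of the menu.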

Maintenance & Community

The project is maintained by Nutlope. Further community engagement channels or roadmap details are not explicitly provided in the README.

Licensing & Compatibility

The repository is licensed under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

Several features are planned but not yet implemented, including better error handling, more realistic images, and tag-based filtering. Image generation can take up to 60 seconds per menu.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 21 stars in the last 90 days
