PetThoughts by liu-ziting

Image recognition app using Gemini Pro API

Created 1 year ago

345 stars

Top 80.2% on SourcePulse

Project Summary

This project provides a web application that uses Google's Gemini Pro API to analyze pet photos and infer the pets' thoughts and emotions. It targets pet owners and enthusiasts who want a fun way to understand their pets better, and it describes a pet's likely feelings and activities through image and natural language processing.

How It Works

The application leverages Gemini Pro Vision's multimodal capabilities to process uploaded pet images. It performs image recognition to identify the pet and analyze facial expressions and the surrounding environment. This analysis is then combined with natural language processing to generate text descriptions of the pet's inferred thoughts and emotional state, presented in a user-friendly interface.
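The flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names (`buildPetRequest`, `extractThought`, `inferPetThoughts`), the `HttpPost` transport type, and the prompt wording are all hypothetical. Only the request and response shapes follow Google's public `generateContent` REST endpoint, and the `gemini-pro-vision` model name is taken from the summary above and may no longer be served.

```typescript
// Hypothetical types: only the JSON field names mirror the public Gemini REST API.
interface TextPart { text: string }
interface InlineImagePart { inline_data: { mime_type: string; data: string } }
interface GenerateContentRequest {
  contents: { parts: (TextPart | InlineImagePart)[] }[];
}

// Minimal transport abstraction (an assumption) so the sketch runs without a network.
type HttpPost = (
  url: string,
  body: string
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Assumed prompt wording; the project's real prompt is not shown in the source.
const PET_PROMPT =
  "Identify the pet in this photo, read its facial expression and surroundings, " +
  "and describe in one short paragraph what it might be thinking and feeling.";

// Combine the text prompt and the base64-encoded image into one multimodal request.
function buildPetRequest(imageBase64: string, mimeType: string): GenerateContentRequest {
  return {
    contents: [
      {
        parts: [
          { text: PET_PROMPT },
          { inline_data: { mime_type: mimeType, data: imageBase64 } },
        ],
      },
    ],
  };
}

// Join the text parts of the first candidate; return "" if the response has none.
function extractThought(response: unknown): string {
  const r = response as {
    candidates?: { content?: { parts?: { text?: string }[] } }[];
  };
  const parts = r.candidates?.[0]?.content?.parts ?? [];
  return parts.map((p) => p.text ?? "").join("").trim();
}

// Send the image to the model and return the generated description.
async function inferPetThoughts(
  post: HttpPost,
  apiKey: string,
  imageBase64: string,
  mimeType: string,
  model: string = "gemini-pro-vision" // model name assumed from the summary
): Promise<string> {
  const url =
    "https://generativelanguage.googleapis.com/v1beta/models/" +
    `${model}:generateContent?key=${encodeURIComponent(apiKey)}`;
  const res = await post(url, JSON.stringify(buildPetRequest(imageBase64, mimeType)));
  if (!res.ok) {
    throw new Error(`Gemini request failed with status ${res.status}`);
  }
  return extractThought(await res.json());
}
```

In a runtime with a global `fetch`, the `post` argument could be a thin wrapper that issues a `POST` with a `Content-Type: application/json` header. Keeping the transport injectable is a choice made here so the payload and parsing logic can be checked without calling the API.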

Quick Start & Requirements

Install/Run: Deploy via Netlify or Vercel using provided buttons.
Prerequisites: Requires a GEMINI_API_KEY environment variable.
Links: Live demo, Environment Variables
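As a rough illustration of the prerequisite above, server-side code might check for the key at startup and fail with a clear message if it is missing. The helper name `requireGeminiKey` and the `Env` type are hypothetical and not taken from the project; only the variable name `GEMINI_API_KEY` comes from the summary.

```typescript
// Hypothetical helper: reads GEMINI_API_KEY from an environment-like object.
type Env = Record<string, string | undefined>;

function requireGeminiKey(env: Env): string {
  // Treat a missing, empty, or whitespace-only value as "not set".
  const key = env["GEMINI_API_KEY"]?.trim();
  if (!key) {
    throw new Error(
      "GEMINI_API_KEY is not set; add it to the deployment's environment variables."
    );
  }
  return key;
}
```

On Node-based hosts, `requireGeminiKey(process.env)` would be the typical call. On Netlify or Vercel the value is normally entered in the project's environment variable settings rather than committed to the repository.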

Highlighted Details

Utilizes Gemini Pro Vision for image and natural language processing.
Analyzes pet facial expressions and environment to infer thoughts.
Supports common pets like cats and dogs; accuracy may vary for other animals.
Front end generated with v0.dev; API provided by Google Gemini.

Maintenance & Community

Community support available via Discord.

Licensing & Compatibility

License: MIT.
Compatibility: The MIT license permits commercial use, but users must still comply with Google's Terms of Use and applicable laws.

Limitations & Caveats

The application is designed for common pets such as cats and dogs and may be inaccurate for other animals. Uploaded photos should be clear for the best results. The project disclaimer notes compliance with generative AI service regulations, particularly for use in China.


Explore Similar Projects

lens by ContextualAI

Multimodal-Sentiment-Analysis by YeexiaoZheng

Woodpecker by VITA-MLLM

ScreenAI by kyegomez

exbert by bhoov

Gemini by kyegomez

R1-Omni by HumanMLLM

ComfyUI-Gemini by ZHO-ZHO-ZHO

MM-REACT by microsoft

Gemini-API by HanaokaYuzu

Google-Gemini-Crash-Course by krishnaik06

bertviz by jessevig