have-fun-with-machine-learning by humphd

Beginner's guide for image classification using neural networks

Created 9 years ago

5,114 stars

Top 9.7% on SourcePulse

Project Summary

This repository is a hands-on, beginner-friendly guide to machine learning and image classification with Convolutional Neural Networks (CNNs). It is aimed at readers who can program but have no prior AI knowledge, and it demystifies neural networks through practical application rather than theory. The main benefit is that readers can build and deploy image classification models without advanced mathematics or specialized hardware.

How It Works

The guide relies on existing open-source tools, namely Caffe and NVIDIA's DIGITS. It walks through a practical workflow: preparing an image dataset, training a CNN from scratch, and, crucially, fine-tuning pre-trained models (AlexNet and GoogLeNet) for a specific classification task. Fine-tuning is a form of transfer learning: features already learned from a large-scale dataset are adapted to the new task, so a useful model can be built with a smaller dataset and less computational power.
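The dataset-preparation step above can be illustrated with a minimal sketch. It is not the guide's code: the function name `split_dataset`, the 25% validation fraction, and the file names are all hypothetical, chosen only to show how labelled images might be divided into training and validation sets before training.

```python
import random


def split_dataset(paths_by_label, val_fraction=0.25, seed=0):
    """Split each class's image paths into training and validation lists.

    `val_fraction` and `seed` are illustrative defaults, not values
    taken from the guide.
    """
    rng = random.Random(seed)
    train, val = {}, {}
    for label, paths in paths_by_label.items():
        shuffled = sorted(paths)  # sort first so the shuffle is reproducible
        rng.shuffle(shuffled)
        # Hold out at least one image per class for validation.
        n_val = max(1, round(len(shuffled) * val_fraction))
        val[label] = shuffled[:n_val]
        train[label] = shuffled[n_val:]
    return train, val


# Hypothetical file names, for illustration only.
images = {
    "class_a": [f"class_a/{i:03d}.jpg" for i in range(8)],
    "class_b": [f"class_b/{i:03d}.jpg" for i in range(8)],
}
train, val = split_dataset(images)
print(len(train["class_a"]), len(val["class_a"]))
```

Splitting per class keeps both sets balanced across labels, which matters when the dataset is as small as the one the guide describes.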

Quick Start & Requirements

Installation: Docker is recommended as the easier way to set up Caffe and DIGITS. Native installation is also possible, but the guide warns that it can be frustrating, especially on macOS.
Prerequisites: Caffe and DIGITS, both BSD licensed. The guide pins specific commits of each, so newer versions may not behave exactly as described.
Hardware: The guide emphasizes that powerful GPUs are not strictly necessary for fine-tuning, as the author successfully used a MacBook Pro with integrated Intel graphics.
Resources: Links to Caffe and DIGITS documentation are provided.

Highlighted Details

Demonstrates training a CNN from scratch using AlexNet, achieving 87.5% accuracy on a small dataset.
Details fine-tuning AlexNet and GoogLeNet, achieving 100% accuracy on the guide's small target dataset.
Provides Python code examples for using trained models to classify new images, including handling image preprocessing and network input/output.
Explains how to modify network architecture definitions (prototxt files) to adapt pre-trained models for new classification tasks.
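The prototxt edit mentioned in the last item can be sketched as a small text transformation. This is an illustration under stated assumptions, not the guide's own procedure: the helper `retarget_final_layer`, the layer names `fc8` and `fc8_retrain`, and the sample network fragment are hypothetical. The idea it shows is that the final classification layer is renamed, so Caffe does not copy the pre-trained weights into it (weights are matched by layer name), and its `num_output` is set to the new number of classes.

```python
import re


def retarget_final_layer(prototxt, old_name, new_name, num_classes):
    """Rename a layer and set its num_output in a prototxt string.

    Plain text processing only; it assumes braces are balanced and
    that each top-level block holds a single layer.
    """
    # Rename every quoted reference (name, top, bottom) to the layer.
    renamed = prototxt.replace(f'"{old_name}"', f'"{new_name}"')

    out, block, depth = [], [], 0
    for line in renamed.splitlines():
        block.append(line)
        depth += line.count("{") - line.count("}")
        if depth == 0:  # a top-level block (or a bare line) just ended
            text = "\n".join(block)
            if re.search(rf'name:\s*"{re.escape(new_name)}"', text):
                text = re.sub(r"num_output:\s*\d+",
                              f"num_output: {num_classes}", text)
            out.append(text)
            block = []
    out.extend(block)  # keep any unbalanced trailing lines untouched
    return "\n".join(out)


# Hypothetical fragment in Caffe's prototxt syntax, for illustration only.
SAMPLE = '''\
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  inner_product_param {
    num_output: 4096
  }
}
layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  inner_product_param {
    num_output: 1000
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}
'''

edited = retarget_final_layer(SAMPLE, "fc8", "fc8_retrain", 2)
print(edited)
```

Only the renamed layer's `num_output` changes; earlier layers keep their names, so their pre-trained weights can still be loaded during fine-tuning.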

Maintenance & Community

The project is hosted on GitHub and welcomes pull requests with corrections and improvements. The guide also points readers to the Caffe Users group and the DIGITS User Group.

Licensing & Compatibility

Caffe: BSD licensed.
DIGITS: BSD licensed.
Compatibility: The guide is written against specific, now older, commits of Caffe and DIGITS, so steps may differ on current versions. The BSD license generally permits commercial use and integration into closed-source projects.

Limitations & Caveats

The guide acknowledges that the provided dataset is small and contrived, and that real-world robustness requires significantly more data. It also notes that Caffe's documentation can be sparse and assumes prior knowledge, suggesting an opportunity for higher-level, more beginner-friendly tools. The installation process, particularly native installation, is highlighted as a potential hurdle.

Health Check

Last Commit: 4 years ago
Responsiveness: Inactive
Star History: 4 stars in the last 30 days