smile by haifengl

Java ML framework for various statistical learning tasks

created 10 years ago
6,229 stars

Top 8.4% on sourcepulse

Project Summary

Smile is a comprehensive Java-based machine learning framework offering APIs for Scala, Kotlin, and Clojure. It provides state-of-the-art performance across a wide spectrum of ML tasks, including deep learning, LLMs, classification, regression, clustering, and NLP, making it suitable for researchers and developers needing a robust, multi-language ML solution.

How It Works

Smile leverages advanced data structures and algorithms for high performance. Its core strength lies in its breadth of implementation, covering everything from fundamental algorithms like SVM and K-Means to cutting-edge features like native Llama 3.1 inference and OpenAI-compatible LLM APIs. It also includes extensive support for numerical analysis, linear algebra, and symbolic manipulation, integrated with visualization tools.
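To make the K-Means reference above concrete, here is a minimal plain-Java sketch of Lloyd's algorithm. It is not Smile's implementation and does not use Smile's API. The class name KMeansSketch, the cluster method, and the choice to seed centroids with the first k points are all assumptions made for illustration. A production library would typically use better seeding and optimized data structures.

```java
import java.util.Arrays;

/** Plain-Java sketch of Lloyd's k-means; hypothetical, not Smile's implementation. */
final class KMeansSketch {

    /** Returns the cluster index of each point after at most maxIter iterations. */
    static int[] cluster(double[][] points, int k, int maxIter) {
        int n = points.length;
        int d = points[0].length;

        // Assumption: seed the centroids with the first k points.
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone();
        }

        int[] labels = new int[n];
        Arrays.fill(labels, -1);

        for (int iter = 0; iter < maxIter; iter++) {
            boolean changed = false;

            // Assignment step: attach each point to its nearest centroid.
            for (int i = 0; i < n; i++) {
                int best = 0;
                double bestDist = Double.MAX_VALUE;
                for (int c = 0; c < k; c++) {
                    double dist = 0.0;
                    for (int j = 0; j < d; j++) {
                        double diff = points[i][j] - centroids[c][j];
                        dist += diff * diff;
                    }
                    if (dist < bestDist) {
                        bestDist = dist;
                        best = c;
                    }
                }
                if (labels[i] != best) {
                    labels[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // assignments are stable, so the algorithm has converged
            }

            // Update step: move each centroid to the mean of its assigned points.
            double[][] sums = new double[k][d];
            int[] counts = new int[k];
            for (int i = 0; i < n; i++) {
                counts[labels[i]]++;
                for (int j = 0; j < d; j++) {
                    sums[labels[i]][j] += points[i][j];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // leave the centroid of an empty cluster where it is
                }
                for (int j = 0; j < d; j++) {
                    centroids[c][j] = sums[c][j] / counts[c];
                }
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        double[][] points = { {0, 0}, {10, 10}, {0, 1}, {1, 0}, {10, 11}, {11, 10} };
        // prints [0, 1, 0, 0, 1, 1]
        System.out.println(Arrays.toString(cluster(points, 2, 100)));
    }
}
```

The sketch alternates an assignment step and an update step until no label changes, which is the core loop that optimized implementations build on.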

Quick Start & Requirements

  • Installation: Via Maven Central (smile-core, smile-deep, smile-nlp), sbt (smile-scala), Gradle (smile-kotlin), or Clojure dependencies.
  • Prerequisites: Java 8+, BLAS/LAPACK (OpenBLAS or MKL recommended for performance-critical algorithms).
  • Resources: Interactive shells are provided; memory usage can be configured via JVM options (e.g., -J-Xmx30G).
  • Docs: Official Documentation
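When tuning memory with an option such as -J-Xmx30G, it can help to confirm what heap limit the JVM actually received. The sketch below uses only the standard Runtime API. The class name HeapCheck and the helper toGiB are hypothetical names chosen for this example, not part of Smile.

```java
/** Hypothetical helper that reports the JVM's maximum heap size. */
final class HeapCheck {

    /** Converts a byte count to gibibytes. */
    static double toGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        // maxMemory() reports the most memory the JVM will try to use,
        // which reflects -Xmx when that option is set.
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.printf("Max heap: %.2f GiB%n", toGiB(maxHeap));
    }
}
```

Running this once with and once without the memory option shows whether the setting took effect.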

Highlighted Details

  • Native Java implementation of Llama 3.1 with an OpenAI-compatible inference server.
  • Supports deep learning on CPU and GPU, including EfficientNet.
  • Extensive visualization capabilities via smile-plot (Swing-based and Vega-Lite).
  • Model serialization for production deployment and integration with systems like Spark.
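Because the inference server is described as OpenAI-compatible, a client can presumably talk to it with a standard chat-completions request. The sketch below only builds such a request with the JDK's java.net.http API and does not send it. The base URL, the model name, and the /v1/chat/completions path are assumptions taken from the OpenAI convention rather than from Smile's documentation, and ChatRequestSketch is a hypothetical class name.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical client-side sketch for an OpenAI-style chat endpoint. */
final class ChatRequestSketch {

    /** Minimal JSON string escaping: handles backslashes and double quotes only. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Builds a chat-completions JSON body with a single user message. */
    static String chatBody(String model, String userMessage) {
        return "{\"model\":\"" + escape(model) + "\","
             + "\"messages\":[{\"role\":\"user\",\"content\":\""
             + escape(userMessage) + "\"}]}";
    }

    /** Builds (but does not send) a POST request to the assumed endpoint path. */
    static HttpRequest build(String baseUrl, String model, String userMessage) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/v1/chat/completions"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(chatBody(model, userMessage)))
                .build();
    }

    public static void main(String[] args) {
        // Assumed host, port, and model name; adjust to the actual server settings.
        HttpRequest request = build("http://localhost:8080", "llama-3.1", "Hello");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To send the request, pass it to java.net.http.HttpClient once the real host, port, and model identifier are known. A real client should also use a JSON library instead of the hand-rolled escaping shown here.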

Maintenance & Community

  • Actively maintained by Haifeng Li and Karl Li.
  • Support channels include GitHub Discussions and Stack Overflow.
  • Issue tracker available for bugs and feature requests.

Licensing & Compatibility

  • Dual-license: Commercial license available upon contact; open-source use is also supported. Specific license details are in the LICENSE file.
  • Compatible with commercial and closed-source projects under the terms of the commercial license.

Limitations & Caveats

  • Some algorithms require specific BLAS/LAPACK implementations, which need careful configuration via system properties or dependency management.
  • The Clojure artifact version (4.2.0) lags slightly behind the Java/Scala/Kotlin versions (4.3.0).

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 4

Star History

  • 77 stars in the last 90 days
