kherud/java-llama.cpp: Java SDK for local LLaMA model inference
Top 72.3% on SourcePulse
This project provides Java bindings for llama.cpp, enabling efficient inference of large language models like LLaMA and Gemma directly from Java applications. It targets Java developers seeking to integrate LLM capabilities without relying on external services or complex Python environments.
How It Works
The library uses JNI (Java Native Interface) to bridge Java code with the C/C++ core of llama.cpp, so model inference runs in-process on the CPU or GPU (via CUDA or Metal, depending on llama.cpp build flags). The architecture supports streaming output, context management, and configuration of inference parameters such as temperature and grammar-constrained sampling.
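For illustration, here is a minimal sketch of streaming generation, assuming the builder-style API from the project's documentation; method names such as setModel, setGpuLayers, and setTemperature differ between releases, so check the README for the version in use.

import de.kherud.llama.InferenceParameters;
import de.kherud.llama.LlamaModel;
import de.kherud.llama.LlamaOutput;
import de.kherud.llama.ModelParameters;

public class StreamingExample {
    public static void main(String[] args) {
        // Path is illustrative; any local GGUF model works
        ModelParameters modelParams = new ModelParameters()
                .setModel("models/model.gguf")
                .setGpuLayers(32); // only takes effect if the native build has GPU support

        // try-with-resources: LlamaModel is AutoCloseable and frees native memory on close
        try (LlamaModel model = new LlamaModel(modelParams)) {
            InferenceParameters inferParams = new InferenceParameters("Explain JNI in one sentence.")
                    .setTemperature(0.7f);

            // generate(...) streams tokens as llama.cpp produces them
            for (LlamaOutput output : model.generate(inferParams)) {
                System.out.print(output);
            }
        }
    }
}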
Quick Start & Requirements
Add the dependency from Maven Central:

<dependency>
    <groupId>de.kherud</groupId>
    <artifactId>llama</artifactId>
    <version>4.1.0</version>
</dependency>
GPU acceleration and custom builds require compiling llama.cpp with the appropriate flags (e.g., -DGGML_CUDA=ON).
Highlighted Details
- Custom builds can pass llama.cpp build arguments through to CMake.
- Models can be downloaded at runtime via ModelParameters#setModelUrl() (see the sketch below).
- LlamaModel implements AutoCloseable for proper native memory management.
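As a hedged sketch of the runtime-download feature noted above, assuming ModelParameters#setModelUrl() fetches the GGUF file before loading it (the URL is hypothetical, and the complete(...) call follows the same version caveat as the earlier example):

import de.kherud.llama.InferenceParameters;
import de.kherud.llama.LlamaModel;
import de.kherud.llama.ModelParameters;

public class DownloadExample {
    public static void main(String[] args) {
        // Hypothetical URL; the library downloads the model before loading it
        ModelParameters params = new ModelParameters()
                .setModelUrl("https://example.com/models/model.Q4_K_M.gguf");

        try (LlamaModel model = new LlamaModel(params)) {
            // complete(...) returns the full response as a single string
            String answer = model.complete(new InferenceParameters("Hello!"));
            System.out.println(answer);
        }
    }
}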
Maintenance & Community
The project is maintained by kherud. Community channels are not explicitly mentioned in the README.
Licensing & Compatibility
The project appears to be distributed under the MIT License, based on the llama.cpp dependency. This license is permissive and generally compatible with commercial and closed-source applications.
Limitations & Caveats
Custom builds or GPU acceleration require manual compilation of the native llama.cpp library, which can be complex. The README notes that llama.cpp allocates memory not managed by the JVM, necessitating careful use of AutoCloseable to prevent leaks. Android integration requires specific Gradle configurations.
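When a model's lifetime does not fit a single try-with-resources block (for example, a long-lived service), close() should still be called deterministically. A minimal sketch of one way to do this; LlamaService is a hypothetical wrapper, not part of the library:

import de.kherud.llama.LlamaModel;
import de.kherud.llama.ModelParameters;

// Hypothetical wrapper: keeps one model alive across requests and ties the
// native context's lifetime to the wrapper's own close().
public final class LlamaService implements AutoCloseable {
    private final LlamaModel model;

    public LlamaService(String modelPath) {
        this.model = new LlamaModel(new ModelParameters().setModel(modelPath));
    }

    public LlamaModel model() {
        return model;
    }

    @Override
    public void close() {
        model.close(); // releases memory allocated by llama.cpp outside the JVM heap
    }
}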