ml_things by gmihaila

Python library for speeding up ML workflows

created 7 years ago
261 stars

Top 98.0% on sourcepulse

Project Summary

This repository provides a lightweight Python library, ml-things, containing reusable functions and code snippets for machine learning, deep learning, and NLP tasks. It aims to accelerate the workflow of ML practitioners by offering utilities for array manipulation, plotting, text cleaning, and web-related operations, along with curated code snippets and tutorial notebooks.

How It Works

The library offers modular functions categorized by task. Key utilities include pad_array for handling variable-length arrays by padding them to a fixed size with a specified value, and batch_array for splitting lists into manageable chunks. Plotting functions like plot_array, plot_dict, and plot_confusion_matrix are optimized for quick visualization of ML data. Text processing is supported by clean_text, which removes noise and standardizes text.
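The padding and batching ideas described above can be sketched in plain Python. This is only an illustration of the technique, not the library's own code: the names pad_to_length, pad_nested, and iter_batches are hypothetical, and the choices to pad on the right and to truncate rows longer than the target length are assumptions, since the summary does not state how pad_array and batch_array behave in those cases.

```python
from typing import Any, Iterator, List, Sequence


def pad_to_length(values: Sequence[Any], length: int, pad_value: Any = 0) -> List[Any]:
    """Truncate or right-pad `values` so the result has exactly `length` items.

    Hypothetical helper; right-padding and truncation are assumptions.
    """
    clipped = list(values[:length])
    return clipped + [pad_value] * (length - len(clipped))


def pad_nested(rows: Sequence[Sequence[Any]], length: int, pad_value: Any = 0) -> List[List[Any]]:
    """Apply pad_to_length to every row of a nested (ragged) sequence."""
    return [pad_to_length(row, length, pad_value) for row in rows]


def iter_batches(items: Sequence[Any], batch_size: int) -> Iterator[List[Any]]:
    """Yield consecutive chunks of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Ragged rows become a rectangular 3 x 4 structure.
padded = pad_nested([[1, 2, 3], [4], [5, 6, 7, 8, 9]], length=4)

# Seven items split into batches of three; the last batch is shorter.
batches = list(iter_batches(list(range(7)), batch_size=3))
```

Here padded is [[1, 2, 3, 0], [4, 0, 0, 0], [5, 6, 7, 8]] and batches is [[0, 1, 2], [3, 4, 5], [6]]. A generator is used for batching so that large lists are not copied all at once.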

Quick Start & Requirements

  • Install from PyPI with pip install ml-things, or from the repository with pip install git+https://github.com/gmihaila/ml_things
  • Tested with Python 3.6+.
  • Official documentation and tutorials are available via Google Colab links.

Highlighted Details

  • pad_array: Handles single or nested arrays, with options for custom padding values and lengths.
  • batch_array: Efficiently splits lists into batches, useful for ML data loading.
  • plot_confusion_matrix: Includes normalization options and customizable plot parameters.
  • clean_text: Offers robust text cleaning, including punctuation removal and case standardization.
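To make the text cleaning and confusion-matrix normalization items above concrete, here is a minimal standard-library sketch. The function names clean_text_sketch and normalize_rows are hypothetical and do not reflect the signatures of clean_text or plot_confusion_matrix. The specific cleaning steps (lowercasing, stripping ASCII punctuation, collapsing whitespace) and the choice of row-wise normalization are assumptions made for illustration.

```python
import string
from typing import List


def clean_text_sketch(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse runs of whitespace.

    Hypothetical helper; the real library may apply different steps.
    """
    lowered = text.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return " ".join(no_punct.split())


def normalize_rows(matrix: List[List[int]]) -> List[List[float]]:
    """Divide each row of a confusion matrix by its row total.

    Assumes rows are true labels, so each row then sums to 1.
    Rows with no samples are returned as all zeros.
    """
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([count / total if total else 0.0 for count in row])
    return normalized


cleaned = clean_text_sketch("  Hello, World!!  This is   ML. ")
rates = normalize_rows([[8, 2], [1, 9]])
```

With these inputs, cleaned is "hello world this is ml" and rates is [[0.8, 0.2], [0.1, 0.9]]. Row-wise normalization makes each diagonal entry read as per-class recall, which is often easier to compare across imbalanced classes than raw counts.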

Maintenance & Community

The project is maintained by its author, gmihaila, though the last commit was 3 years ago and responsiveness is listed as inactive. Users are encouraged to open issues for bugs or suggestions. Links to the author's GitHub, website, and LinkedIn are provided for contact.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The library is a personal project and may not cover all edge cases or advanced functionalities. Some notebooks are not yet in a polished form. The lack of an explicit license could pose a risk for commercial adoption.

Health Check

  • Last commit: 3 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 3 stars in the last 90 days
