machine-learning-for-trading  by stefan-jansen

ML code for algorithmic trading strategies

created 7 years ago
15,356 stars

Top 3.3% on sourcepulse

Project Summary

This repository provides over 150 Jupyter notebooks accompanying the 2nd edition of "Machine Learning for Algorithmic Trading." It offers a comprehensive, practical guide for traders and developers looking to integrate ML into their strategies, covering data sourcing, feature engineering, supervised/unsupervised learning, NLP, deep learning, and reinforcement learning for trading applications.

How It Works

The project demonstrates an end-to-end workflow for ML in trading, from idea generation and data collection to model optimization and strategy backtesting. It emphasizes practical implementation with Python libraries, showing how to extract signals from diverse data sources (market, fundamental, text, image) and how to build predictive models and trading strategies. The notebooks are designed to be run alongside the book and often contain additional details and practical examples.
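
The workflow described above can be illustrated with a minimal sketch. It is not taken from the repository: the prices are synthetic, the function names (make_prices, build_dataset, fit_and_backtest) are hypothetical, and a plain least-squares fit stands in for the models the book covers. It only shows the shape of the pipeline: data, lagged-return features, a chronological train/test split, and a simple sign-based backtest.

```python
import numpy as np


def make_prices(n_days=500, seed=0):
    """Synthetic daily prices (assumption: stands in for real market data)."""
    rng = np.random.default_rng(seed)
    log_returns = rng.normal(0.0005, 0.01, n_days)
    return 100 * np.exp(np.cumsum(log_returns))


def build_dataset(prices, n_lags=5):
    """Features are the n_lags previous log returns; the target is the next one."""
    rets = np.diff(np.log(prices))
    n_rows = len(rets) - n_lags
    # Column i holds the return observed (n_lags - i) days before the target.
    X = np.column_stack([rets[i:n_rows + i] for i in range(n_lags)])
    y = rets[n_lags:]
    return X, y


def fit_and_backtest(X, y, train_frac=0.7):
    """Fit on the first part of the sample, trade on the rest."""
    split = int(len(y) * train_frac)
    X_train = np.column_stack([np.ones(split), X[:split]])
    X_test = np.column_stack([np.ones(len(y) - split), X[split:]])
    coef, *_ = np.linalg.lstsq(X_train, y[:split], rcond=None)
    predictions = X_test @ coef
    # Go long when the predicted return is positive, short when negative.
    positions = np.sign(predictions)
    strategy_returns = positions * y[split:]
    return predictions, strategy_returns


prices = make_prices()
X, y = build_dataset(prices)
predictions, strategy_returns = fit_and_backtest(X, y)
print("out-of-sample days:", len(strategy_returns))
print("cumulative log return:", strategy_returns.sum())
```

The split is chronological rather than random so that the model never sees data from the period it is later evaluated on. The sketch ignores transaction costs and position sizing.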

Quick Start & Requirements

  • Installation: Install libraries chapter by chapter to avoid version conflicts. Conda environments are provided for setup.
  • Prerequisites: Python, pandas, TensorFlow, scikit-learn, TA-Lib, Zipline, backtrader, PyFolio, Alphalens, spaCy, TextBlob, PyMC3, XGBoost, LightGBM, CatBoost, gensim. Specific chapters may require additional libraries.
  • Data: Instructions for downloading and preprocessing data (e.g., Algoseek, SEC filings) are included within the repository.
  • Resources: The book is over 800 pages; notebooks may require significant computational resources for training deep learning models and backtesting.
  • Links: Book Website
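
Because version conflicts are a stated concern, it can help to check what is already installed before setting up a chapter environment. The following is a small sketch using only the standard library; the distribution names in PREREQUISITES are assumptions about how some of the listed prerequisites are published, and installed_versions is a hypothetical helper, not part of the repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed distribution names for some of the prerequisites listed above.
PREREQUISITES = [
    "pandas",
    "scikit-learn",
    "tensorflow",
    "xgboost",
    "lightgbm",
    "catboost",
    "gensim",
    "spacy",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PREREQUISITES).items():
    print(f"{name:<14} {ver or 'not installed'}")
```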

Highlighted Details

  • Replicates recent academic research using CNNs for time series, autoencoders for asset pricing, and GANs for synthetic data.
  • Covers intraday strategies with minute-frequency data and alternative data sources like satellite imagery and SEC filings.
  • Includes a new appendix detailing over 100 alpha factors.
  • Demonstrates end-to-end strategy backtesting with Zipline and backtrader.
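
To make the idea of an alpha factor concrete, here is a minimal sketch of one factor and one common way to evaluate it. It is not code from the repository or its appendix: the prices are synthetic, the tickers are placeholders, and the helper names (momentum_factor, forward_returns, daily_rank_ic) are hypothetical. The factor is simple price momentum, and the evaluation is the daily rank information coefficient, that is, the cross-sectional Spearman correlation between factor values and subsequent returns, computed here as the Pearson correlation of ranks.

```python
import numpy as np
import pandas as pd


def momentum_factor(prices, lookback=20):
    """Trailing return over `lookback` days for each asset."""
    return prices / prices.shift(lookback) - 1


def forward_returns(prices, horizon=5):
    """Return realized over the next `horizon` days for each asset."""
    return prices.shift(-horizon) / prices - 1


def daily_rank_ic(factor, fwd):
    """Per-day correlation between cross-sectional ranks of factor and forward return."""
    factor_ranks = factor.rank(axis=1)
    return_ranks = fwd.rank(axis=1)
    # Days without a full factor or forward-return history yield NaN and are dropped.
    return factor_ranks.corrwith(return_ranks, axis=1).dropna()


# Synthetic price panel (assumption): 250 business days, 10 placeholder tickers.
rng = np.random.default_rng(42)
dates = pd.bdate_range("2020-01-01", periods=250)
tickers = [f"A{i}" for i in range(10)]
log_returns = rng.normal(0.0, 0.01, size=(len(dates), len(tickers)))
prices = pd.DataFrame(100 * np.exp(log_returns.cumsum(axis=0)), index=dates, columns=tickers)

ic = daily_rank_ic(momentum_factor(prices), forward_returns(prices))
print("days evaluated:", len(ic))
print("mean rank IC:", ic.mean())
```

On random data like this the mean IC should hover near zero; a factor with predictive content would show a mean IC that is consistently positive or negative over time.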

Maintenance & Community

  • The project is actively maintained, with updates reflecting newer software versions (e.g., TensorFlow 2.2, pandas 1.0).
  • A community platform is available for questions and discussions.

Licensing & Compatibility

  • The license for the code is not specified. The book itself is copyrighted. Whether the code may be used commercially or linked from closed-source software is not stated and may depend on the licenses of the underlying libraries.

Limitations & Caveats

  • The project is tied to the content of a specific book edition, and some software versions mentioned may be outdated.
  • Users must manage potential version conflicts when installing dependencies.
  • The complexity of the topics and the need for substantial data and computational resources may present a barrier to entry.

Health Check

  • Last commit: 11 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 1

Star History

  • 654 stars in the last 90 days
