GPT-InvestAR by UditGupta10

Tool for stock investment strategy via LLM analysis of annual reports

Created 2 years ago

268 stars

Top 95.9% on SourcePulse

Project Summary

This repository provides tools for enhancing stock investment strategies by analyzing company annual reports using Large Language Models (LLMs). It targets quantitative analysts, researchers, and investors seeking to leverage AI for financial data processing and predictive modeling, aiming to improve portfolio performance against benchmarks like the S&P 500.

How It Works

The project follows a four-stage pipeline: it downloads 10-K filings from the SEC, converts them to PDF to reduce token usage, generates embeddings and stores them in ChromaDB, and queries those embeddings with an LLM (such as GPT-3.5) to extract scores. The scores serve as features for a Linear Regression model, run in a Jupyter Notebook, that predicts stock returns and drives portfolio construction.

Quick Start & Requirements

Install: The README recommends installing Llama Index and OpenBB in separate virtual environments. It gives no installation commands; dependencies include Llama Index, OpenBB, Scikit-Learn, and PDFKit.
Prerequisites: Access to SEC filings, an API key for the chosen LLM (e.g., GPT-3.5), and potentially significant computational resources for embedding generation and modeling.
Resources: Neither setup time nor resource footprint is documented.
Links: arXiv link, SSRN link

Highlighted Details

Automates the extraction of financial insights from 10-K filings.
Leverages LLM-generated embeddings and query scores as predictive features.
Implements a modeling pipeline for return estimation and portfolio construction.
Compares portfolio performance against the S&P 500 index.
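The benchmark comparison in the last point can be illustrated with a short sketch. It assumes an equal-weighted portfolio and simple per-period returns; the figures are invented for illustration and are not results from the project, and the function names are hypothetical.

```python
from statistics import mean

def cumulative_return(period_returns):
    """Compound a sequence of simple per-period returns into one total return."""
    growth = 1.0
    for r in period_returns:
        growth *= 1.0 + r
    return growth - 1.0

def equal_weight_returns(holdings_by_period):
    """Average the returns of all holdings in each period (equal weighting)."""
    return [mean(period) for period in holdings_by_period]

# Illustrative placeholder data: two periods, two holdings per period.
portfolio_holdings = [[0.10, 0.20], [0.00, 0.10]]
benchmark_returns = [0.10, 0.00]  # stand-in for an index such as the S&P 500

portfolio_total = cumulative_return(equal_weight_returns(portfolio_holdings))
benchmark_total = cumulative_return(benchmark_returns)
excess = portfolio_total - benchmark_total

print(f"portfolio {portfolio_total:.4f}, benchmark {benchmark_total:.4f}, excess {excess:.4f}")
```

Compounding rather than summing the period returns keeps the comparison consistent over multi-year horizons.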

Maintenance & Community

The project accompanies a research paper (linked on arXiv and SSRN), which suggests an academic origin. The README lists no community channels (Discord, Slack) and gives no signals of active maintenance.

Licensing & Compatibility

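The repository's code carries no explicit license. The associated paper is hosted on arXiv, but hosting there does not by itself imply a Creative Commons license, since arXiv offers authors several license options; check the paper's arXiv page for the actual terms. Suitability for commercial use or closed-source linking is not specified.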

Limitations & Caveats

The project relies on external LLM APIs, which may incur costs and impose usage limits. The predictive model's effectiveness depends on the quality of the LLM embeddings and the chosen features, and the README reports no quantitative performance results. Setup requires managing several complex dependencies in separate environments.

Explore Similar Projects

defeatbeta-api by defeat-beta

llm-mistral-invoice-cpu by katanaml

financial-datasets by virattt

AlphaFin by AlphaFin-proj

smartpdfs by Nutlope

ashare-llm-analyst by Ogannesson

OpenContracts by Open-Source-Legal

llmsherpa by nlmatics

open-deep-research by btahir

edgartools by dgunning

FinGLM by MetaGLM

stock-scanner by DR-lin-eng