chapi  by phodal

Code analysis tool for cross-language development

Created 5 years ago
305 stars

Top 87.8% on SourcePulse

Project Summary

CHAPI (Common Hierarchical Abstract Parser and Information Converter) is a tool designed to parse source code from various programming languages and convert it into a unified abstract syntax tree (AST) model. This facilitates cross-language code analysis, enabling developers to understand and manage complex, multi-language codebases more effectively.

How It Works

CHAPI uses a separate parser for each supported language, built on ANTLR grammars for syntax analysis. It then transforms the parse result into a common, hierarchical data model (CodeContainer for a source file, CodeDataStruct for a class or struct, CodeFunction for a function or method, and so on). Because every language maps onto the same model, tasks such as dependency analysis, architecture governance, and code quality assessment can be written once and applied across languages.
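
The idea of mapping several languages onto one model can be illustrated with a minimal sketch. This is not Chapi's code: the field names are simplified assumptions, the regex "parsers" are toy stand-ins for ANTLR-based front ends, and the Java stand-in assumes one class per file. Only the type names CodeContainer, CodeDataStruct, and CodeFunction come from the description above.

```python
import re
from dataclasses import dataclass, field
from typing import List


# Simplified, hypothetical versions of the model types named in the text.
@dataclass
class CodeFunction:
    name: str


@dataclass
class CodeDataStruct:
    name: str
    functions: List[CodeFunction] = field(default_factory=list)


@dataclass
class CodeContainer:
    language: str
    package: str
    data_structs: List[CodeDataStruct] = field(default_factory=list)


def parse_java(source: str) -> CodeContainer:
    """Toy regex stand-in for a Java front end (assumes one class per file)."""
    pkg = re.search(r"package\s+([\w.]+)\s*;", source)
    container = CodeContainer("java", pkg.group(1) if pkg else "")
    cls = re.search(r"class\s+(\w+)", source)
    if cls:
        struct = CodeDataStruct(cls.group(1))
        for m in re.finditer(r"\w+\s+(\w+)\s*\([^)]*\)\s*\{", source):
            struct.functions.append(CodeFunction(m.group(1)))
        container.data_structs.append(struct)
    return container


def parse_go(source: str) -> CodeContainer:
    """Toy regex stand-in for a Go front end."""
    pkg = re.search(r"package\s+(\w+)", source)
    container = CodeContainer("go", pkg.group(1) if pkg else "")
    structs = {}
    for m in re.finditer(r"type\s+(\w+)\s+struct", source):
        structs[m.group(1)] = CodeDataStruct(m.group(1))
        container.data_structs.append(structs[m.group(1)])
    # Attach each method to the struct named in its receiver.
    for m in re.finditer(r"func\s+\(\w+\s+\*?(\w+)\)\s+(\w+)\s*\(", source):
        if m.group(1) in structs:
            structs[m.group(1)].functions.append(CodeFunction(m.group(2)))
    return container


def summarize(container: CodeContainer) -> List[str]:
    """Language-agnostic analysis: it sees only the shared model."""
    return [
        f"{container.package}.{struct.name}.{fn.name}"
        for struct in container.data_structs
        for fn in struct.functions
    ]


JAVA_SRC = """
package com.example;
class Blog {
    String title() { return "t"; }
}
"""

GO_SRC = """
package blog
type Blog struct{}
func (b *Blog) Title() string { return "t" }
"""
```

Calling summarize(parse_java(JAVA_SRC)) and summarize(parse_go(GO_SRC)) runs the same analysis over both inputs; only the front ends differ per language.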

Quick Start & Requirements

  • Dependencies: Java 8+, Kotlin, Python 2/3, Rust v1.60.0+.
  • Usage: Add chapi-ast-<language> and chapi-domain to your project's dependencies.
  • Examples: The project README includes a Kotlin snippet for Java analysis and a sample JSON output for a BlogPO class.
  • Development: Requires IntelliJ IDEA, JDK 11+. Build with ./gradlew build.

Highlighted Details

  • Supports parsing for Java, TypeScript/JavaScript, Go, Kotlin, Rust, C, C++, C#, Scala, and Protobuf/Thrift.
  • Provides a unified domain model for code analysis, including structures for packages, classes, functions, fields, and imports.
  • Integrates with ArchGuard for architecture governance and with UnitGen for generating code fine-tuning data.
  • Follows a Test-Driven Development (TDD) approach with a focus on high test coverage.
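
To show the kind of architecture-governance check a unified model makes possible, here is a minimal sketch of a layering rule evaluated over package and import data. The CodeContainer shape, the package names, and the rule format are all hypothetical. This is not ArchGuard's or Chapi's API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


# Hypothetical, reduced view of a parsed file: its package and its imports.
@dataclass
class CodeContainer:
    package: str
    imports: List[str] = field(default_factory=list)


def dependency_graph(containers: List[CodeContainer]) -> Dict[str, Set[str]]:
    """Build package -> packages-it-depends-on, ignoring external imports."""
    known = {c.package for c in containers}
    graph: Dict[str, Set[str]] = {p: set() for p in known}
    for c in containers:
        for imp in c.imports:
            # Resolve an import such as "app.infra.Db" to the longest
            # known package that prefixes it.
            candidates = [p for p in known if imp == p or imp.startswith(p + ".")]
            if not candidates:
                continue  # third-party or standard-library import
            target = max(candidates, key=len)
            if target != c.package:
                graph[c.package].add(target)
    return graph


def violations(
    graph: Dict[str, Set[str]], forbidden: List[Tuple[str, str]]
) -> List[Tuple[str, str]]:
    """Return each forbidden (source, target) edge that exists in the graph."""
    return sorted((s, t) for s, t in forbidden if t in graph.get(s, set()))


CONTAINERS = [
    CodeContainer("app.domain", ["app.infra.Db"]),
    CodeContainer("app.infra", ["java.sql.Connection"]),
    CodeContainer("app.api", ["app.domain.Blog"]),
]

# Example rule: the domain layer must not depend on infrastructure.
RULES = [("app.domain", "app.infra")]
```

Since the check reads only packages and imports, the same rule can apply to any language whose parser fills in those two fields.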

Maintenance & Community

  • The project is primarily developed by Phodal Huang.
  • Contributions are welcomed via Pull Requests.
  • Commit messages follow a conventional format (e.g., feat(java): <message>).
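
A commit header in the feat(java): <message> shape can be checked mechanically. The sketch below is one possible validator. The list of allowed types is an assumption (the text above shows only feat), as are the characters permitted in a scope.

```python
import re
from typing import Dict, Optional

# Assumed commit types; only "feat" is confirmed by the example above.
ALLOWED_TYPES = ("feat", "fix", "docs", "refactor", "test", "chore")

# type(scope): subject -- the scope allows "#" and "+" so that "c#" and "c++" fit.
HEADER = re.compile(
    r"^(?P<type>%s)\((?P<scope>[a-z0-9#+-]+)\): (?P<subject>\S.*)$"
    % "|".join(ALLOWED_TYPES)
)


def parse_commit_message(message: str) -> Optional[Dict[str, str]]:
    """Parse the first line of a commit message; return None if it does not match."""
    header = message.split("\n", 1)[0]
    match = HEADER.match(header)
    return match.groupdict() if match else None
```

A check like this could run in a commit-msg hook or in CI. Returning the parsed parts, not just a boolean, also lets a changelog script group commits by scope.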

Licensing & Compatibility

  • Licensed under the MPL (Mozilla Public License).
  • Permits commercial use and linking with closed-source projects, subject to MPL terms.

Limitations & Caveats

  • C# parsing has known issues with interpolated strings and requires specific handling for namespace imports.
  • C code preprocessing relies on jcpp.
  • Full type resolution for Kotlin classes within the same package requires implementation of warpTargetFullType.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull requests (30d): 1
  • Issues (30d): 0
  • Star history: 2 stars in the last 30 days
