map-of-github  by anvaka

Interactive map for GitHub project discovery

created 2 years ago
2,655 stars

Top 18.1% on sourcepulse

Project Summary

This project visualizes the GitHub ecosystem by mapping over 690,000 repositories based on shared stargazers. It targets developers and researchers interested in understanding GitHub's landscape, offering a unique way to discover project relationships and community clusters.

How It Works

The project leverages a massive dataset of GitHub activity events from Google BigQuery, processing approximately 500 million stars. It calculates Jaccard Similarity between repositories to quantify their shared audience. Leiden clustering is then applied to group similar projects, followed by a force-directed layout algorithm (ngraph.forcelayout) for visualization. The final map is rendered using MapLibre, with data converted to GeoJSON and tiles generated via tippecanoe.
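The similarity step can be illustrated with a minimal sketch. It assumes the star events have already been reduced to `(user, repo)` pairs; the function names and the `min_similarity` threshold are hypothetical and are not taken from the project's code. For two repositories with stargazer sets A and B, the Jaccard Similarity is |A ∩ B| / |A ∪ B|.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def pairwise_similarity(stars, min_similarity=0.0):
    """Return {(repo_a, repo_b): similarity} for pairs scoring above the threshold.

    `stars` is an iterable of (user, repo) pairs (an assumed input shape).
    """
    stargazers = defaultdict(set)
    for user, repo in stars:
        stargazers[repo].add(user)

    result = {}
    # Brute-force comparison of all pairs: fine for a toy example,
    # far too slow for hundreds of thousands of repositories.
    for repo_a, repo_b in combinations(sorted(stargazers), 2):
        score = jaccard(stargazers[repo_a], stargazers[repo_b])
        if score > min_similarity:
            result[(repo_a, repo_b)] = score
    return result


stars = [
    ("alice", "a/x"), ("bob", "a/x"),
    ("alice", "b/y"), ("bob", "b/y"), ("carol", "b/y"),
    ("dave", "c/z"),
]
similarities = pairwise_similarity(stars)
```

Here `a/x` and `b/y` share two of three distinct stargazers, so they score 2/3, while `c/z` shares none and is dropped. The all-pairs loop is quadratic in the number of repositories, so a dataset of the size described above would need a more scalable approach than this sketch.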

Quick Start & Requirements

  • Data Processing: The similarity computation is memory- and time-intensive; 512 GB of RAM is recommended.
  • Visualization: Uses MapLibre and tippecanoe for rendering.
  • Data Source: Relies on public GitHub activity events from Google BigQuery.
  • Further Information: map-of-github
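As a rough illustration of the GeoJSON conversion that precedes tile generation, the sketch below turns laid-out nodes into a GeoJSON `FeatureCollection` of points. The node keys (`name`, `x`, `y`, `cluster`), the `extent` parameter, and the scaling approach are assumptions made for this example, not the project's actual schema.

```python
import json


def layout_to_geojson(nodes, extent=80.0):
    """Convert laid-out nodes to a GeoJSON FeatureCollection of Point features.

    Layout coordinates are arbitrary, so they are scaled uniformly to fit
    within +/- `extent` degrees, keeping them in a valid longitude/latitude range.
    """
    if not nodes:
        return {"type": "FeatureCollection", "features": []}

    max_abs = max(max(abs(n["x"]), abs(n["y"])) for n in nodes) or 1.0
    scale = extent / max_abs

    features = [
        {
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON coordinate order is [longitude, latitude].
                "coordinates": [n["x"] * scale, n["y"] * scale],
            },
            "properties": {"name": n["name"], "cluster": n["cluster"]},
        }
        for n in nodes
    ]
    return {"type": "FeatureCollection", "features": features}


nodes = [
    {"name": "a/x", "x": 100.0, "y": -50.0, "cluster": 0},
    {"name": "b/y", "x": -200.0, "y": 25.0, "cluster": 1},
]
collection = layout_to_geojson(nodes)
geojson_text = json.dumps(collection)
```

The resulting text could be written to a file and passed to tippecanoe to build tiles for MapLibre. The options the project actually uses for that step are not stated here.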

Highlighted Details

  • Visualizes 690,000+ GitHub projects and 1,500+ clusters.
  • Clustering based on Jaccard Similarity of stargazers.
  • Country names for clusters are AI-generated using ChatGPT with specific prompts.
  • Client-side fuzzy search functionality is implemented.

Maintenance & Community

The project is primarily maintained by the author, anvaka. Support can be sought via GitHub issues or Twitter.

Licensing & Compatibility

Released under the MIT license. Attribution is requested if the data is used in other works.

Limitations & Caveats

The initial data processing phase is resource-intensive, requiring substantial RAM and computation time. The visual design of the map is noted as an area for potential improvement.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 220 stars in the last 90 days
