OpenKE by thunlp

Open-source toolkit for knowledge graph embedding research

Created 8 years ago

4,023 stars

Top 12.0% on SourcePulse

Project Summary

OpenKE is an open-source toolkit for knowledge graph embedding (KGE), offering efficient implementations of various KGE models in PyTorch and TensorFlow. It targets researchers and practitioners in knowledge representation learning, providing a flexible platform for training, evaluating, and deploying KGE models on large-scale knowledge graphs.

How It Works

OpenKE implements its models and Python interfaces in PyTorch, which enables GPU acceleration. Performance-critical operations such as data preprocessing and negative sampling are implemented in C++. This hybrid design balances ease of use with performance, and the toolkit handles complex relations and relational paths, notably through its featured TransR and PTransE models.
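
To make the negative sampling step concrete, here is a minimal pure-Python sketch of the general technique: corrupt the head or tail of a known triple, then compare positive and negative triples with a TransE-style distance and a margin loss. It is not OpenKE's C++ sampler or its PyTorch model code. The function names (`corrupt_triple`, `transe_score`, `margin_loss`) and the (head, tail, relation) tuple order are assumptions made for illustration.

```python
import random


def corrupt_triple(triple, num_entities, known_triples, rng):
    """Build a negative triple by replacing the head or the tail at random.

    Hypothetical helper: triples are (head, tail, relation) id tuples.
    Candidates that are already known facts are rejected and resampled,
    so this assumes at least one unknown candidate exists.
    """
    head, tail, rel = triple
    while True:
        new_entity = rng.randrange(num_entities)
        if rng.random() < 0.5:
            candidate = (new_entity, tail, rel)  # corrupt the head
        else:
            candidate = (head, new_entity, rel)  # corrupt the tail
        if candidate not in known_triples:
            return candidate


def transe_score(h, r, t):
    """TransE-style L1 distance ||h + r - t||_1; lower means more plausible."""
    return sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))


def margin_loss(pos_score, neg_score, margin=1.0):
    """Margin ranking loss: push positives at least `margin` below negatives."""
    return max(0.0, margin + pos_score - neg_score)


if __name__ == "__main__":
    rng = random.Random(0)
    known = {(0, 1, 0), (1, 2, 0), (2, 3, 1)}
    print("negative sample:", corrupt_triple((0, 1, 0), 5, known, rng))
```

A real training loop would draw many negatives per positive triple and run the scoring in batches on the GPU; the sketch only shows the logic of a single sample.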

Quick Start & Requirements

Install: Clone the OpenKE-PyTorch branch and run bash make.sh to compile C++ components.
Run: Execute example training scripts like python train_transe_FB15K237.py.
Prerequisites: PyTorch, C++ compiler.
Data Format: Training data must be supplied as train2id.txt, entity2id.txt, and relation2id.txt files in the layout OpenKE expects.
Resources: Official website: http://openke.thunlp.org/
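
The three data files can be generated from named triples along these lines. This is a sketch under assumptions: it follows the layout the OpenKE README is understood to describe (an entry count on the first line, one entry per line after that, and triples in train2id.txt ordered head, tail, relation). The helper names `build_openke_files` and `write_files`, and the exact separators, are hypothetical; compare the output against the benchmark files in the repository before training on it.

```python
from pathlib import Path


def _with_count(lines):
    """Prefix a list of lines with its length, as the assumed layout requires."""
    return "\n".join([str(len(lines))] + lines) + "\n"


def build_openke_files(triples):
    """Map (head_name, relation_name, tail_name) triples to file contents.

    Returns a dict of filename -> text. Ids are assigned in order of first
    appearance. Assumed layout: train2id.txt lines are "head tail relation".
    """
    entity_ids = {}
    relation_ids = {}
    id_triples = []
    for head, relation, tail in triples:
        h = entity_ids.setdefault(head, len(entity_ids))
        t = entity_ids.setdefault(tail, len(entity_ids))
        r = relation_ids.setdefault(relation, len(relation_ids))
        id_triples.append((h, t, r))

    return {
        "entity2id.txt": _with_count(
            [f"{name}\t{idx}" for name, idx in entity_ids.items()]
        ),
        "relation2id.txt": _with_count(
            [f"{name}\t{idx}" for name, idx in relation_ids.items()]
        ),
        "train2id.txt": _with_count(
            [f"{h} {t} {r}" for h, t, r in id_triples]
        ),
    }


def write_files(files, directory):
    """Write the generated contents into `directory`, creating it if needed."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    for filename, text in files.items():
        (out_dir / filename).write_text(text, encoding="utf-8")
```

For example, `write_files(build_openke_files(my_triples), "./benchmarks/MyKG/")` would produce the three files in one directory, where `my_triples` and the path are placeholders.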

Highlighted Details

Supports a wide range of KGE models including TransE, TransH, TransR, TransD, DistMult, ComplEx, RotatE, and others.
Provides C++ inference implementations (Fast-TransX) for enhanced efficiency.
Includes evaluation metrics like Hits@k, MR, and MRR, with support for filtered settings and type constraints.
Offers pre-trained embeddings for large-scale knowledge graphs like Wikidata, Freebase, and XLORE.
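
The evaluation metrics listed above can be illustrated with a short pure-Python sketch. It shows the standard definitions of MR, MRR, and Hits@k, plus the filtered setting, in which other known-true entities are skipped when ranking the correct one. The names `filtered_rank` and `summarize` are hypothetical, scores are assumed to be distances (lower is better), and type constraints are not covered. It does not reproduce OpenKE's own evaluation code.

```python
def filtered_rank(scores, true_entity, other_true_entities=frozenset()):
    """Rank of the true entity among candidates, where lower score is better.

    scores: dict mapping candidate entity id -> score.
    other_true_entities: entities that also form known facts; they are
    skipped (the "filtered" setting). Pass an empty set for the raw rank.
    """
    true_score = scores[true_entity]
    rank = 1
    for entity, score in scores.items():
        if entity == true_entity or entity in other_true_entities:
            continue
        if score < true_score:
            rank += 1
    return rank


def summarize(ranks, ks=(1, 3, 10)):
    """Aggregate per-query ranks into MR, MRR, and Hits@k."""
    n = len(ranks)
    metrics = {
        "MR": sum(ranks) / n,                     # mean rank
        "MRR": sum(1.0 / r for r in ranks) / n,   # mean reciprocal rank
    }
    for k in ks:
        metrics[f"Hits@{k}"] = sum(r <= k for r in ranks) / n
    return metrics


if __name__ == "__main__":
    candidate_scores = {0: 0.9, 1: 0.2, 2: 0.5, 3: 0.1}
    raw = filtered_rank(candidate_scores, true_entity=2)
    filtered = filtered_rank(candidate_scores, true_entity=2,
                             other_true_entities={3})
    print("raw rank:", raw, "filtered rank:", filtered)
    print(summarize([raw, filtered], ks=(1, 3)))
```

Filtering matters because a model should not be penalized for ranking another correct answer ahead of the one being tested.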

Maintenance & Community

The main contributors are researchers from Tsinghua University (THU), including Xu Han, Yankai Lin, and Zhiyuan Liu. The project is part of the larger OpenSKL initiative.

Licensing & Compatibility

The toolkit itself is available under a permissive license, while pre-trained embeddings are provided under the MIT license. This generally allows for commercial use and integration with closed-source applications.

Limitations & Caveats

The README mentions TensorFlow 1.0 support, but TensorFlow 1.x is now deprecated. PyTorch is the primary focus, so users who rely on the TensorFlow versions may need to manage dependencies for older TensorFlow releases themselves.

Health Check

Last Commit: 2 years ago
Responsiveness: 1 week
Star History: 8 stars in the last 30 days