This repository is the official implementation of the paper: A Comprehensive Analysis on LLM-based Node Classification Algorithms. It provides a standardized framework for evaluating LLM-based node classification methods, including 10 datasets, 8 LLM-based algorithms, and 3 learning paradigms.
Please consider citing or giving a 🌟 if our repository is helpful to your work!
```bibtex
@misc{wu2025llmnodebed,
  title={A Comprehensive Analysis on LLM-based Node Classification Algorithms},
  author={Xixi Wu and Yifei Shen and Fangzhou Ge and Caihua Shan and Yizhu Jiao and Xiangguo Sun and Hong Cheng},
  year={2025},
  eprint={2502.00829},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2502.00829},
}
```
📅 [2025-02-04] The code for LLMNodebed, along with the project pages and paper, has now been released! 🧨
To get started, follow these steps to set up your Python environment:
```bash
conda create -n NodeBed python=3.10
conda activate NodeBed
pip install torch torch_geometric transformers peft pytz scikit-learn torch_scatter torch_sparse
```
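To sanity-check the base installation before moving on, a quick import check (this snippet is illustrative, not part of the official setup):

```python
# Verify that the core dependencies import and report their versions.
import torch
import torch_geometric
import transformers

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("torch_geometric:", torch_geometric.__version__)
print("transformers:", transformers.__version__)
```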
Some algorithms may require additional packages. Check the algorithm's README or the error logs to identify any missing dependencies and install them accordingly.
- Closed-source LLMs (e.g., GPT-4o, DeepSeek-Chat): add your API keys to `LLMZeroShot/Direct/api_keys.py` (see the sketch after this list).
- Open-source LLMs (e.g., Mistral-7B, Qwen): download the models from HuggingFace (e.g., Mistral-7B), then update the model paths in `common/model_path.py` to your actual save paths. Example paths:

```python
MODEL_PATHs = {
    "MiniLM": "sentence-transformers/all-MiniLM-L6-v2",
    "Mistral-7B": "mistralai/Mistral-7B-Instruct-v0.2",
    "Llama-8B": "meta-llama/Llama-3.1-8B-Instruct",
    # See full list in common/model_path.py
}
```
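For the closed-source case, a minimal sketch of what `api_keys.py` typically contains — the variable names below are assumptions, so match them to the actual file:

```python
# LLMZeroShot/Direct/api_keys.py -- illustrative sketch only;
# the actual variable names in the repository may differ.
OPENAI_API_KEY = "sk-..."      # used for GPT-4o
DEEPSEEK_API_KEY = "sk-..."    # used for DeepSeek-Chat
```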
Download the datasets from Google Drive and unzip them into the `datasets/` folder.
Before running LLM-based algorithms, please generate LM / LLM-encoded embeddings as follows:
```bash
cd LLMEncoder/GNN
python3 embedding.py --dataset=cora --encoder_name=roberta      # LM embeddings
python3 embedding.py --dataset=cora --encoder_name=Mistral-7B   # LLM embeddings
```
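Conceptually, this step embeds each node's text attributes with a pretrained encoder. A minimal sketch of the idea — the model choice and mean pooling here are illustrative assumptions, not the exact logic of `embedding.py`:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Illustrative only: embedding.py may use different models, pooling, and batching.
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

texts = ["Title and abstract of paper 1", "Title and abstract of paper 2"]
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state      # [num_nodes, seq_len, dim]

# Mean-pool over non-padding tokens to get one embedding per node.
mask = batch["attention_mask"].unsqueeze(-1)
embeddings = (hidden * mask).sum(1) / mask.sum(1)  # [num_nodes, dim]
print(embeddings.shape)
```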
For LLM Direct Inference with open-source LLMs, we deploy them as local services using the FastChat framework:
```bash
# Install dependencies
pip install vllm "fschat[model_worker,webui]"

# Start services
python3 -m fastchat.serve.controller --host 127.0.0.1
CUDA_VISIBLE_DEVICES=0 python3 -m fastchat.serve.vllm_worker --model-path mistralai/Mistral-7B-Instruct-v0.2 --host 127.0.0.1
python3 -m fastchat.serve.openai_api_server --host 127.0.0.1 --port 8008
```
The Mistral-7B model can then be invoked via the OpenAI-compatible endpoint at `http://127.0.0.1:8008/v1/chat/completions`.
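As a quick smoke test that the service is up, a minimal sketch using the standard OpenAI-compatible request format — the `model` name below is an assumption and should match the name FastChat registers for the worker:

```python
import requests

# Hypothetical smoke test against the local FastChat OpenAI-compatible server.
resp = requests.post(
    "http://127.0.0.1:8008/v1/chat/completions",
    json={
        "model": "Mistral-7B-Instruct-v0.2",  # assumed name; check the FastChat worker logs
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        "temperature": 0.0,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```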
Refer to the method-specific READMEs for execution details:

- LLM-as-Encoder: `LLMEncoder/README.md`
- LLM-as-Predictor: `LLMPredictor/README.md`
- LLM-as-Reasoner: `LLMReasoner/README.md`
- Zero-shot Methods: `LLMZeroShot/README.md`
```
LLMNodeBed/
├── LLMEncoder/       # LLM-as-Encoder (GNN, ENGINE)
├── LLMPredictor/     # LLM-as-Predictor (GraphGPT, LLaGA, Instruction Tuning)
├── LLMReasoner/      # LLM-as-Reasoner (TAPE)
├── LLMZeroShot/      # Zero-shot Methods (Direct Inference, ZeroG)
├── common/           # Shared utilities
├── datasets/         # Dataset storage
├── results/          # Experiment outputs
└── requirements.txt
```
| Method | Venue | Official Implementation | Our Implementation |
|---|---|---|---|
| TAPE | ICLR'24 | link | `LLMReasoner/TAPE` |
| ENGINE | IJCAI'24 | link | `LLMEncoder/ENGINE` |
| GraphGPT | SIGIR'24 | link | `LLMPredictor/GraphGPT` |
| LLaGA | ICML'24 | link | `LLMPredictor/LLaGA` |
| ZeroG | KDD'24 | link | `LLMZeroShot/ZeroG` |
| GNN | - | Ours Proposed | `LLMEncoder/GNN` |
| LLM Instruction Tuning | - | Ours Implemented | `LLMPredictor/Instruction Tuning` |
| Direct Inference | - | Ours Implemented | `LLMZeroShot/Direct` |
If you have any questions about usage or reproducibility, or would like to discuss further, please feel free to open an issue or contact the authors via email at [email protected].
We thank the authors of TAPE, ENGINE, GraphGPT, LLaGA, and ZeroG for their open-source implementations. Part of our framework is inspired by GLBench.