ICMR25-HMKD

HMKD: Heterogeneous Model Knowledge Distillation via Dual Alignment for Semantic Segmentation

Official PyTorch implementation of the ICMR 2025 paper “Heterogeneous Model Knowledge Distillation via Dual Alignment for Semantic Segmentation”.

Authors

Mingzhu Xu1, Jing Wang1, Mingcai Wang1, Yiping Li1, Yupeng Hu1, Xuemeng Song2*, Weili Guan3

1 Shandong University
2 City University of Hong Kong
3 Harbin Institute of Technology (Shenzhen)
* Corresponding author


Table of Contents

- Introduction
- Highlights
- Project Structure
- Installation
- Checkpoints / Models
- Dataset / Benchmark
- Usage
- Citation
- License


Introduction

This project is the official implementation of the paper “Heterogeneous Model Knowledge Distillation via Dual Alignment for Semantic Segmentation”.

HMKD is a heterogeneous-model knowledge distillation framework for semantic segmentation. Through a dual alignment mechanism, it addresses the difficulty of transferring knowledge across structurally different teacher and student networks.

Description

We present HMKD, a framework for Semantic Segmentation via Knowledge Distillation.
Our method addresses architectural heterogeneity between teacher and student models by introducing dual alignment mechanisms.
This repository provides the official implementation, distilled checkpoints, and evaluation scripts.
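As a rough illustration of the two alignment objectives (feature-level and logit-level), here is a minimal NumPy sketch of a generic distillation loss. It is not the paper's actual HMKD formulation; the projection matrix, temperature, and loss choices are assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def logit_alignment_loss(student_logits, teacher_logits, T=4.0):
    """Temperature-softened KL divergence between teacher and student logits
    (the classic soft-label distillation objective)."""
    p_t = softmax(teacher_logits / T)
    p_s = softmax(student_logits / T)
    kl = np.sum(p_t * (np.log(p_t + 1e-8) - np.log(p_s + 1e-8)), axis=-1)
    return float((T * T) * kl.mean())

def feature_alignment_loss(student_feat, teacher_feat, proj):
    """MSE after projecting student features into the teacher's feature space;
    `proj` stands in for a learned adapter between heterogeneous backbones."""
    return float(np.mean((student_feat @ proj - teacher_feat) ** 2))
```

In practice both terms would be weighted and added to the standard segmentation cross-entropy; see the paper for the actual dual-alignment losses.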


Highlights


Project Structure

.
├── configs/               # Experiment configuration files
├── data/                  # Dataset paths and preprocessing scripts
├── models/                # Implementations of teacher and student networks
├── train_NEW_AEU_kd.py    # Core distillation training script
├── README.md
└── requirements.txt       # Environment dependencies

Installation

1. System Requirements

2. Clone the repository

git clone https://github.com/iLearn-Lab/HMKD-ICMR.git
cd HMKD-ICMR

3. Install Python packages

pip install timm==0.3.2 mmcv-full==1.2.7 opencv-python==4.5.1.48

See requirements.txt for the full dependency list.

Checkpoints / Models

1. Initialization Weights (for Training)

Please download the following pretrained weights according to your experimental needs:

2. Trained HMKD Weights (for Testing)


Dataset / Benchmark

| Dataset    | Train Size | Val Size | Test Size | Classes |
|------------|------------|----------|-----------|---------|
| Cityscapes | 2975       | 500      | 1525      | 19      |
| CamVid     | 367        | 101      | 233       | 11      |

Please generate the corresponding dataset path-list files (.txt) expected by the code.
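The exact list-file format is not documented here; below is a hypothetical sketch that writes one `image_path label_path` pair per line, assuming label files share the image file names (adjust the pairing logic to your dataset layout and to what the training script actually parses).

```python
import os

def write_path_list(image_dir, label_dir, out_txt):
    """Write 'image_path label_path' pairs, one per line (assumed format)."""
    names = sorted(os.listdir(image_dir))
    with open(out_txt, "w") as f:
        for name in names:
            img = os.path.join(image_dir, name)
            # assumed naming convention: label shares the image file name
            lbl = os.path.join(label_dir, name)
            f.write(f"{img} {lbl}\n")
```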


Usage

Training

After downloading the pretrained weights and datasets, launch the distillation task using distributed training:

# training mode
CUDA_VISIBLE_DEVICES=0,1 nohup python -m torch.distributed.launch --nproc_per_node=2 train_NEW_AEU_kd.py > train_distill.log 2>&1 &

# debugging mode
CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 train_NEW_AEU_kd.py

Testing

Download the distilled weights, modify the path variable in the code, and run:

python evaluate.py --model_id HMKD --dataset cityscapes
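Evaluation on Cityscapes and CamVid is conventionally reported as mean IoU. For reference, a minimal sketch of how mIoU is computed from flattened prediction and ground-truth label arrays (not the repo's actual evaluation code):

```python
import numpy as np

def miou(pred, gt, num_classes):
    """Mean intersection-over-union from flat integer label arrays."""
    # confusion matrix: rows = ground truth, cols = prediction
    cm = np.bincount(gt * num_classes + pred,
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)
    inter = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - inter
    iou = inter / np.maximum(union, 1)
    # average only over classes that appear in gt or pred
    return float(iou[union > 0].mean())
```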

Citation

If you use this code or method in your research, please cite our paper:

@inproceedings{HMKD,
  author={Xu, Mingzhu and Wang, Jing and Wang, Mingcai and Li, Yiping and Hu, Yupeng and Song, Xuemeng and Guan, Weili},
  booktitle={Proceedings of the ACM International Conference on Multimedia Retrieval (ICMR)},
  title={Heterogeneous Model Knowledge Distillation via Dual Alignment for Semantic Segmentation},
  year={2025}
}

License

This project is released under the Apache License 2.0.