Official PyTorch implementation of the paper:
Spatial-Frequency Guided Pixel Transformer for NIR-to-RGB Translation
Infrared Physics & Technology, 2025
🔗 ScienceDirect
🔗 DOI: 10.1016/j.infrared.2025.105891
Near-Infrared (NIR) imaging provides enhanced contrast and sensitivity but lacks the spatial and textural richness of RGB images, which makes NIR-to-RGB translation a challenging task.
We propose SF-GPT, a novel deep learning architecture that leverages both spatial and frequency domains through transformer-based mechanisms.
- SF-GPT: We propose a novel Spatial-Frequency Guided Pixel Transformer for NIR-to-RGB translation, combining spatial and frequency cues to capture both local textures and global context.
- Dual-domain Feature Extraction: We incorporate DCT or DWT to extract low- and high-frequency features, while pixel-wise cues are obtained via PixelUnshuffle for fine-grained reconstruction.
- SFG-MSA Module: We design a Spatial-Frequency Guided Multi-head Self-Attention mechanism that adaptively fuses pixel and frequency features, enhancing translation fidelity and feature discrimination.
- State-of-the-art Performance: Extensive experiments validate the effectiveness of SF-GPT, outperforming existing methods in both visual quality and quantitative metrics.
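To make the dual-domain feature extraction concrete, below is a minimal, framework-agnostic NumPy sketch of the two feature paths on a single-channel patch: a pixel-unshuffle rearrangement (mirroring `torch.nn.PixelUnshuffle`) for pixel-wise cues, and a one-level 2-D Haar DWT for low-/high-frequency sub-bands. This is an illustrative toy example, not the SF-GPT implementation; the helper names and the 4×4 input are ours.

```python
import numpy as np

def pixel_unshuffle(x, r=2):
    """Rearrange an (H, W) map into r*r channels of size (H/r, W/r),
    mirroring torch.nn.PixelUnshuffle for a single-channel input."""
    h, w = x.shape
    x = x.reshape(h // r, r, w // r, r)
    return x.transpose(1, 3, 0, 2).reshape(r * r, h // r, w // r)

def haar_dwt2(x):
    """One level of a 2-D Haar DWT: returns the LL (low-frequency)
    and LH/HL/HH (high-frequency) sub-bands."""
    a = x[0::2, 0::2]; b = x[0::2, 1::2]   # even/odd column samples
    c = x[1::2, 0::2]; d = x[1::2, 1::2]   # of even/odd rows
    ll = (a + b + c + d) / 2.0
    lh = (a + b - c - d) / 2.0
    hl = (a - b + c - d) / 2.0
    hh = (a - b - c + d) / 2.0
    return ll, lh, hl, hh

nir = np.arange(16.0).reshape(4, 4)   # toy 4x4 "NIR" patch
pix = pixel_unshuffle(nir, r=2)       # pixel-wise cues: shape (4, 2, 2)
ll, lh, hl, hh = haar_dwt2(nir)       # frequency cues: each (2, 2)
print(pix.shape, ll.shape)            # → (4, 2, 2) (2, 2)
```

In SF-GPT these two streams are the inputs that the SFG-MSA module fuses via attention; here they simply show how one spatial map yields both pixel-wise and frequency-domain tokens at the same reduced resolution.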
Method | PSNR (↑) | SSIM (↑) | AE (↓) | LPIPS (↓) |
---|---|---|---|---|
ATcycleGAN | 19.59 | 0.59 | 4.33 | 0.295 |
CoColor | 23.54 | 0.69 | 2.68 | 0.233 |
ColorMamba | 24.56 | 0.71 | 2.81 | 0.212 |
DCT-RCAN | 22.15 | 0.77 | 3.40 | 0.214 |
DRSformer | 20.18 | 0.56 | 4.22 | 0.254 |
HAT | 19.42 | 0.69 | 3.98 | 0.298 |
MCFNet | 20.34 | 0.61 | 3.79 | 0.208 |
MFF | 17.39 | 0.61 | 4.69 | 0.318 |
MPFNet | 22.14 | 0.63 | 3.68 | 0.253 |
NIR-GNN | 17.50 | 0.60 | 5.22 | 0.384 |
Restormer | 19.43 | 0.54 | 4.41 | 0.267 |
SPADE | 19.24 | 0.59 | 4.59 | 0.283 |
SST | 14.26 | 0.57 | 5.61 | 0.361 |
TTST | 18.57 | 0.67 | 4.46 | 0.320 |
SF-GPT (DCT) | 26.09 | 0.77 | 2.72 | 0.132 |
SF-GPT (DWT) | 25.82 | 0.79 | 2.57 | 0.114 |
Method | PSNR (↑) | SSIM (↑) | AE (↓) | LPIPS (↓) |
---|---|---|---|---|
DGCAN | 18.133 | 0.601 | — | — |
Compressed DGCAN | 16.973 | 0.565 | — | — |
DRSformer | 18.176 | 0.698 | 5.698 | 0.238 |
Restormer | 17.983 | 0.693 | 5.560 | 0.256 |
HAT | 17.677 | 0.692 | 5.803 | 0.249 |
TTST | 17.722 | 0.696 | 5.747 | 0.244 |
SF-GPT (DCT) | 19.011 | 0.699 | 5.541 | 0.185 |
SF-GPT (DWT) | 19.917 | 0.710 | 5.414 | 0.176 |
If you find this work helpful in your research, please cite:
@article{jiang2025spatial,
title={Spatial-Frequency Guided Pixel Transformer for NIR-to-RGB Translation},
author={Jiang, Hongcheng and Chen, ZhiQiang},
journal={Infrared Physics \& Technology},
year={2025},
pages={105891},
publisher={Elsevier},
  doi={10.1016/j.infrared.2025.105891}
}
If you have any questions, feedback, or collaboration ideas, feel free to reach out:
- 💻 Website: jianghongcheng.github.io
- 📧 Email: [email protected]
- 🏫 Affiliation: University of Missouri–Kansas City (UMKC)