🎓 I'm currently a second-year Ph.D. student at the College of Computer Science, Nankai University, advised by Prof. Qibin Hou and Prof. Ming-Ming Cheng in the Tianjin Key Laboratory of Visual Computing and Intelligent Perception (VCIP).
Before that, I obtained my B.S. in Mathematics from the Shiing-Shen Chern Class at Nankai University.
🔬 My research currently focuses on multimodal vision foundation models, in particular how vision and other modalities can be unified through large-scale representation learning. In the future, I plan to explore broader directions in computer vision and machine learning.
- Multimodal Models
- Visual Representation Learning
- Large-Scale Pretraining
- Computer Vision
- Deep Learning: PyTorch, Transformers🤗
- Languages: Python, MATLAB
Building general and scalable multimodal vision foundation models that can serve as strong backbones for diverse downstream tasks.
- Email: [email protected]
- Google Scholar: Jiao-Long Cao
- GitHub: @caojiaolong
“Stay curious, stay critical, and keep building.”
— Inspired by the spirit of open research
