We present VICT, a test-time visual in-context tuning method that adapts visual in-context learning (VICL) models on the fly using only a single test sample. VICT generalizes to a wide range of unseen domains and tasks at test time.
📖 For more results, please refer to our paper.
- [03/2025] 🔥 VICT is released on arXiv.
VICT is a simple yet effective test-time training approach for adapting visual in-context learning (VICL) models on the fly. The motivation is that each test input offers a hint about the test distribution, so we modify the VICL model at test time to make full use of this hint by setting up a one-sample learning problem.

Specifically, we flip the roles of the task prompt and the test sample, and use a cycle-consistency self-supervised loss to reconstruct the original task-prompt output. Our key insight is that a model should be aware of a new test distribution if it can successfully recover the original task prompts.
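The role-flipping idea can be illustrated with a minimal sketch. The snippet below is a hypothetical toy stand-in, not our actual model: `vicl_predict` is a two-line linear surrogate for a VICL network, and the tunable scalar `alpha` plus the fixed matrix `A` are invented for illustration. It shows the mechanics only: predict the test output, swap the (test input, prediction) pair into the prompt position, reconstruct the original prompt output, and take gradient steps on the reconstruction error from this one test sample.

```python
import numpy as np

# Toy stand-in for a VICL model (hypothetical; the real model is a large
# network). `A` is a fixed transform and `alpha` is the tunable parameter.
A = np.array([[0.9, 0.1], [0.0, 1.1]])

def vicl_predict(alpha, x_p, y_p, x_q):
    """Predict the query output conditioned on one prompt pair (x_p, y_p)."""
    return A @ x_q + alpha * (y_p - A @ x_p)

def cycle_loss(alpha, x_p, y_p, x_t):
    """Cycle-consistency loss built from a single unlabeled test input x_t."""
    # Forward: predict the (unknown) output of the test sample.
    y_t_pred = vicl_predict(alpha, x_p, y_p, x_t)
    # Flip roles: (x_t, y_t_pred) becomes the prompt and x_p the query;
    # the model must reconstruct the original prompt output y_p.
    y_p_rec = vicl_predict(alpha, x_t, y_t_pred, x_p)
    return float(np.sum((y_p_rec - y_p) ** 2))

# One-sample test-time tuning. A numerical gradient keeps the sketch
# dependency-free; a real implementation would backprop through the network.
x_p, y_p = np.array([1.0, 2.0]), np.array([2.0, 1.0])  # task prompt pair
x_t = np.array([0.5, -1.0])                            # single test input

alpha, lr, eps = 0.3, 0.05, 1e-5
losses = []
for _ in range(50):
    losses.append(cycle_loss(alpha, x_p, y_p, x_t))
    grad = (cycle_loss(alpha + eps, x_p, y_p, x_t)
            - cycle_loss(alpha - eps, x_p, y_p, x_t)) / (2 * eps)
    alpha -= lr * grad

print(f"cycle loss before tuning: {losses[0]:.4f}, after: {losses[-1]:.6f}")
```

Running the loop drives the cycle loss toward zero: once the model can recover the original prompt output through the flipped round trip, it has adapted to the test input.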
- Release the arXiv version.
- Release the code.
If you find this work useful for your research, please consider citing our paper:
@inproceedings{xie2025test,
  title     = {Test-Time Visual In-Context Tuning},
  author    = {Xie, Jiahao and Tonioni, Alessio and Rauschmayr, Nathalie and Tombari, Federico and Schiele, Bernt},
  booktitle = {CVPR},
  year      = {2025}
}