Implementation of the paper "What Do You See? Enhancing Zero-Shot Image Classification with Multimodal Large Language Models"
Link to paper: What Do You See? Enhancing Zero-Shot Image Classification with Multimodal Large Language Models
The authors did not specify a license in the publication, so this repository does not yet carry a specific license. I have contacted the original paper authors to ask which license applies.