Hi! I am a third-year PhD student at the Robotics Institute of Carnegie Mellon University, advised by Prof. Deva Ramanan. I did my undergrad in Computer Science and Maths at Cornell University and served as college symbol bearer (top 5 of the college). My current research focuses on computer vision and learning, especially robustness to distribution shifts (continual/lifelong vision) and data-efficient adaptation with multiple modalities.

🔥 News

📝 Publications

In submission.

Revisiting the Role of Language Priors in Vision-Language Models (VisualGPTScore)

Zhiqiu Lin*, Xinyue Chen*, Deepak Pathak, Pengchuan Zhang, Deva Ramanan

Website | Arxiv

  • We use generative VLMs to implement Visual Generative Pre-Training Score (VisualGPTScore), i.e., the probability score of generating a text given an image.
  • Such a generative score achieves top-tier image-text retrieval performance on multiple compositionality benchmarks, surpassing all discriminative approaches by a large margin.
  • We further investigate the role of the language prior P(text) through a probabilistic lens, and introduce a debiasing solution that consistently improves the VisualGPTScore under train-test distribution shifts over text.
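
To make the idea above concrete, here is a minimal sketch of scoring captions by the log-probability of generating the text given the image. It is not the paper's implementation: the per-token probabilities are made up, the function names are hypothetical, and the debiasing form (subtracting `alpha` times log P(text)) is an assumption chosen for illustration. In practice the token probabilities would come from a generative VLM and a language-only pass.

```python
import math

def sequence_log_prob(token_probs):
    """Sum of per-token log-probabilities, i.e. the log-probability
    of the whole sequence under an autoregressive model."""
    return sum(math.log(p) for p in token_probs)

def debiased_score(cond_token_probs, prior_token_probs, alpha):
    """Assumed debiasing form: log P(text | image) - alpha * log P(text).
    alpha = 0 recovers the plain generative score."""
    return sequence_log_prob(cond_token_probs) - alpha * sequence_log_prob(prior_token_probs)

# Made-up per-token probabilities for two candidate captions of one image.
# caption_b is generic text that the language prior alone already likes.
candidates = {
    "caption_a": {"cond": [0.4, 0.2], "prior": [0.2, 0.1]},
    "caption_b": {"cond": [0.5, 0.2], "prior": [0.5, 0.2]},
}

def best_caption(alpha):
    """Return the candidate with the highest (debiased) score."""
    return max(
        candidates,
        key=lambda name: debiased_score(
            candidates[name]["cond"], candidates[name]["prior"], alpha
        ),
    )

print(best_caption(alpha=0.0))
print(best_caption(alpha=1.0))
```

In this toy setup the plain score prefers the generic caption, while the debiased score prefers the caption whose probability rises most once the image is taken into account.
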
CVPR 2023

Multimodality Helps Unimodality: Cross-Modal Few-Shot Learning with Multimodal Models

Zhiqiu Lin*, Samuel Yu*, Zhiyi Kuang, Deepak Pathak, Deva Ramanan

Website | Arxiv

  • We propose a simple cross-modal adaptation method for multimodal models that repurposes information from other modalities (e.g., class names and audio clips) as additional training samples.
  • For CLIP, it achieves SOTA few-shot adaptation performance even with a simple linear probe, and consistently improves upon prior art such as prompting, adapters, and weight ensembling.
  • Audiovisual experiments with AudioCLIP suggest that one can learn a better visual dog classifier by listening to dogs bark.
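
As a rough illustration of the cross-modal idea above, the sketch below trains a linear probe on one-shot image features together with text features of the class names, treating both as training samples in a shared embedding space. This is a toy under stated assumptions rather than the paper's code: the 2-D features are made up (real ones would come from a multimodal encoder such as CLIP), and the helper names and hyperparameters are hypothetical.

```python
import math

def normalize(v):
    """Scale a feature vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logits_for(weights, x):
    """One logit per class: dot product of each class weight row with x."""
    return [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in weights]

def train_linear_probe(features, labels, num_classes, lr=0.5, epochs=200):
    """Softmax regression (no bias term) trained with plain SGD."""
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    for _ in range(epochs):
        for x, y in zip(features, labels):
            probs = softmax(logits_for(weights, x))
            for c in range(num_classes):
                grad = probs[c] - (1.0 if c == y else 0.0)
                for j in range(dim):
                    weights[c][j] -= lr * grad * x[j]
    return weights

def predict(weights, x):
    """Index of the class with the largest logit."""
    scores = logits_for(weights, x)
    return max(range(len(scores)), key=lambda c: scores[c])

# Made-up features in a shared embedding space (assumption for illustration).
image_feats = [normalize([1.0, 0.2]), normalize([0.1, 1.0])]  # one image per class
image_labels = [0, 1]
text_feats = [normalize([0.9, 0.1]), normalize([0.2, 0.9])]   # class-name embeddings
text_labels = [0, 1]

# Cross-modal training set: text features are simply extra labeled samples.
train_x = image_feats + text_feats
train_y = image_labels + text_labels

probe = train_linear_probe(train_x, train_y, num_classes=2)
print(predict(probe, normalize([0.8, 0.3])))
print(predict(probe, normalize([0.3, 0.8])))
```

The only change relative to an image-only linear probe is the concatenation step, which is what makes the approach easy to combine with other adaptation methods.
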
NeurIPS 2022

LECO: Continual Learning with Evolving Class Ontologies

Zhiqiu Lin, Deepak Pathak, Yu-Xiong Wang, Deva Ramanan*, Shu Kong*

Website | Arxiv | NeurIPS’22 Talk

  • A practical lifelong vision benchmark motivated by real-world dataset versioning issues, e.g., Mapillary 1.2 to 2.0.
  • Simple but effective solutions such as joint training, semi-supervised learning, and learning-with-partial-labels to address inconsistent annotation (both coarse-grained and fine-grained).
NeurIPS 2021 (Datasets and Benchmarks)

The CLEAR Benchmark: Continual LEArning on Real-World Imagery

Zhiqiu Lin, Jia Shi, Deepak Pathak*, Deva Ramanan*

CLEAR Wiki | NeurIPS Paper Site | Arxiv | CVPR’22 Talk

  • The first continual learning benchmark for visual recognition with natural distribution shifts spanning a decade!
  • CLEAR comes in 10-class and 100-class versions (download links), similar to the famous CIFAR-10 and CIFAR-100 benchmarks.
  • The 1st CLEAR challenge was hosted on June 19th, 2022, with 79 participants from 21 different countries and regions signed up!
CVPR 2020 (Best Paper Nomination)

Visual Chirality

Zhiqiu Lin, Jin Sun, Abe Davis, Noah Snavely

Website | Arxiv | Video

  • How does reflection change what we learn from images? Despite widespread use in data augmentation, people had not looked closely at this question before our work.

🎖 Honors and Awards

  • 2020.06 Best Paper Nomination at CVPR’20 for Visual Chirality!
  • 2020.05 Graduated Summa Cum Laude in Computer Science and Mathematics from Cornell University, and served as college symbol bearer (top 5 of the college).

📖 Education

  • 2020.09 - (now), PhD student, Carnegie Mellon University.
  • 2016.09 - 2020.06, Undergraduate, Cornell University.

💬 Invited Talks

💻 Services

  • Organizer: CVPR’22 VPLOW Workshop (Challenge Track)
  • Reviewer: ECCV, CVPR (Outstanding reviewer), ICCV, NeurIPS, ICML.
  • Teaching (CMU): Learning-based Image Synthesis and Advanced Computer Vision
  • Teaching (Cornell): Advanced Machine Learning, Cornell Tech Pre-Master Program, Functional Programming, Algorithm Analysis, Data Structures, Computer Vision