Geo Ahn

I am Geo Ahn, a recent M.S. graduate in Computer Science & Engineering from Kyung Hee University, South Korea, where I was advised by Prof. Jinwoo Choi. I am currently looking for Ph.D. positions.

My research interests lie in video representation learning, video understanding, and debiasing. Specifically, I aim to build models that generalize beyond spurious shortcuts toward holistic reasoning over actions in videos. I am also broadly interested in addressing current bottlenecks in video understanding, such as limited temporal and compositional reasoning, and in exploring how video models can serve as a foundation for vision-language-action (VLA) models and world models.

Most recently, I was a research intern at NAVER Cloud (Video Team) in 2025, where I worked on compositional generalization with vision-language models.

Publications

EVIDENT: Routing MLLM Adaptation through Entity-Grounded Visual Evidence for Cross-Domain Video Temporal Grounding

Geo Ahn*, Jiwook Han*, Youngrae Kim*, Joonseok Lee, and Jinwoo Choi

arXiv preprint, 2026

arXiv

@article{ahn2026evident,
  title = {EVIDENT: Routing MLLM Adaptation through Entity-Grounded Visual Evidence for Cross-Domain Video Temporal Grounding},
  author = {Ahn, Geo and Han, Jiwook and Kim, Youngrae and Lee, Joonseok and Choi, Jinwoo},
  journal = {arXiv preprint},
  year = {2026},
}

Why Can’t I Open My Drawer? Mitigating Object-Driven Shortcuts in Zero-Shot Compositional Action Recognition

Geo Ahn, Inwoong Lee, Taeoh Kim, Minho Shim, Dongyoon Wee, and Jinwoo Choi

In ECCV, 2026

arXiv Project Page

@inproceedings{ahn2026rcore,
  title = {Why Can't I Open My Drawer? Mitigating Object-Driven Shortcuts in Zero-Shot Compositional Action Recognition},
  author = {Ahn, Geo and Lee, Inwoong and Kim, Taeoh and Shim, Minho and Wee, Dongyoon and Choi, Jinwoo},
  booktitle = {ECCV},
  year = {2026},
  project_page = {RCORE.html}
}

SlotVTG: Object-Centric Adapter for Generalizable Video Temporal Grounding

Geo Ahn*, Jiwook Han*, Youngrae Kim*, and Jinwoo Choi

In GRAIL-V Workshop at CVPR, 2026

arXiv Project Page

@inproceedings{han2026slotvtg,
  title = {SlotVTG: Object-Centric Adapter for Generalizable Video Temporal Grounding},
  author = {Ahn, Geo and Han, Jiwook and Kim, Youngrae and Choi, Jinwoo},
  booktitle = {GRAIL-V Workshop at CVPR},
  year = {2026},
  project_page = {https://slotvtg.netlify.app/}
}

DEVIAS: Learning Disentangled Video Representations of Action and Scene for Holistic Video Understanding

Geo Ahn*, Kyungho Bae*, Youngrae Kim*, and Jinwoo Choi

In ECCV (Oral, 2.3% acceptance rate), 2024

arXiv Code Project Page

@inproceedings{bae2024devias,
  title = {DEVIAS: Learning Disentangled Video Representations of Action and Scene for Holistic Video Understanding},
  author = {Ahn, Geo and Bae, Kyungho and Kim, Youngrae and Choi, Jinwoo},
  booktitle = {ECCV (Oral, 2.3\% acceptance rate)},
  year = {2024},
  project_page = {https://khu-vll.github.io/DEVIAS/}
}