Otniel-Bogdan Mercea

I am a final-year PhD candidate at the Max Planck Institute for Intelligent Systems and the University of Tübingen, supervised by Prof. Zeynep Akata and Prof. Andreas Geiger within the IMPRS-IS doctoral program. I was also a guest PhD student at Helmholtz Munich and the Technical University of Munich, supervised by Prof. Zeynep Akata.
I earned my MSc in Artificial Intelligence from the University of Edinburgh in 2020, with an MSc thesis supervised by Prof. Amos Storkey, and my BEng in Computers and Information Technology from Politehnica University of Timisoara in 2019, with a BEng thesis supervised by Prof. Calin-Adrian Popa.
I have applied my research in industry through several internships. At Google DeepMind (Autumn 2024), supervised by Stefano Pellegrini, Jasper Uijlings, and Cordelia Schmid, I worked on improving SAM 2 in scenarios with significant occlusion and movement. At Google Research (2023–2024), I collaborated with Anurag Arnab, Alexey Gritsenko, and Cordelia Schmid on efficient adaptation of large-scale models. I also collaborated with Aleksandra Nowak, Utku Evci, and Yann Dauphin from Google DeepMind on related projects. Previously, I was a Machine Learning Researcher at Everseen, developing real-time multi-camera tracking systems.
My research focuses on improving the efficiency of deep learning through data-efficient (low-shot) and model-efficient adaptation methods for large-scale models. I also have extensive experience in multimodal learning (audio-visual, multimodal language models) and video semantic segmentation and tracking.
News
- Dec 27, 2024: I finished my four-month internship at Google DeepMind, where I worked on enhancing SAM 2 for scenarios involving significant occlusion or movement, under the supervision of Stefano Pellegrini, Jasper Uijlings, and Cordelia Schmid.
- Oct 21, 2024: A new preprint on optimal adapter placement for efficient transfer learning is available.
- Apr 04, 2024: Our work on audio-visual generalised zero-shot learning (GZSL) using large multi-modal models was accepted at the CVPR 2024 workshops (L3D-IVU).
- Mar 22, 2024: I finished my eight-month internship at Google Research, where I worked on efficient adaptation under the supervision of Anurag Arnab, Alexey Gritsenko, and Cordelia Schmid.
- Feb 27, 2024: LoSA was accepted as a SPOTLIGHT at CVPR 2024.
- Aug 21, 2023: ReGAdA was accepted as an ORAL at BMVC 2023.
- Aug 11, 2023: AV-Diff was accepted at DAGM GCPR 2023.
Selected publications
- Time-, Memory- and Parameter-Efficient Visual Adaptation. SPOTLIGHT at CVPR 2024, Seattle, USA.
- Text-to-feature diffusion for audio-visual few-shot learning. DAGM GCPR 2023, Heidelberg, Germany.
- Temporal and cross-modal attention for audio-visual zero-shot learning. ECCV 2022, Tel Aviv, Israel.
- Audio-visual generalised zero-shot learning with cross-modal attention and language. CVPR 2022, New Orleans, USA.