Xiao (Brandon) Han

Ph.D. Student

UoSurrey, CVSSP

Biography

I am on the job market for Spring 2024. 😬

I am a third-year Ph.D. student at the University of Surrey under the supervision of Prof. Tao Xiang and Prof. Yi-Zhe Song. I also work closely with Dr. Xiatian Zhu. Before coming to Surrey, I obtained my bachelor’s degree from Zhejiang University in 2020. I have also gained academic experience at the University of Michigan, Westlake University and Fudan University.

I am broadly interested in Deep Learning. My current research lies at the intersection of Computer Vision and Natural Language Processing (i.e., vision‑language). My research goal is to build multi‑modal AI systems that can be used in real‑world applications (e.g., e‑commerce platforms). My expertise includes, but is not limited to:

  • Vision‑language pre‑training and (parameter‑efficient) adaptation;
  • Vision‑language downstream tasks (e.g., uni-/cross‑/multi-modal image retrieval, image captioning, text‑based/guided 2D/3D content generation/editing);
  • Some specific tasks (e.g., person ReID).

For more details, see my academic CV.

Feel free to poke me if you want to discuss, collaborate, or just say hi. 😊

News

  • 21/03/2023: Our FAME-ViL was selected as a highlight paper at CVPR 2023! (Top 2.5% of 9155 submissions)
  • 24/02/2023: 😆 Our paper on a multi-task vision-and-language model for fashion tasks was accepted by CVPR 2023.
  • 03/07/2022: 😆 Our paper on fashion-focused vision-and-language representation learning was accepted by ECCV 2022.
  • 30/06/2022: 😉 Our team won second place in the eBay eProduct Visual Search Challenge - FGVC9 (CVPR 2022).

Publications

Journal & Conference

HeadSculpt: Crafting 3D Head Avatars with Text

A versatile pipeline for generating and editing 3D head avatars with textual prompts.

FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks

A versatile and efficient multi-task model for fashion-focused V+L tasks.

Large-Scale Product Retrieval with Weakly Supervised Representation Learning

The second-place solution for the 2nd eBay eProduct Visual Search Challenge (FGVC9-CVPR2022).

FashionViL: Fashion-Focused Vision-and-Language Representation Learning

A versatile and flexible framework for fashion-focused V+L representation learning.

UIGR: Unified Interactive Garment Retrieval

A unified framework and benchmark for two interactive garment retrieval tasks.


Copyright © Xiao (Brandon) Han · Last updated in June 2023 · Powered by the Academic theme for Hugo.