Few-shot, zero-shot, continual (un-)learning
We try to mitigate the annotation cost for visual understanding and machine learning problems.
We are a team of researchers and engineers who are serious about machine learning and computer vision. Here are our answers to some frequently asked questions about joining our group.
The team's primary focus is helping you succeed in your career path. We tailor our support to your interests and try to help you reach your goal after you graduate from the lab. Most of our team members aim for academic positions (e.g., faculty, world-class industry researchers) and a few aim for engineering positions (e.g., at world-class software companies). We mostly aim to share our results at top venues in computer vision and machine learning (e.g., CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML) and to contribute to open-source software that benefits other researchers, including ourselves.

We want to come up with working ideas for the many bottlenecks in visual and multi-modal understanding problems and related machine learning problems. We read papers, discuss them with fellow students and faculty, implement them, and come up with new ideas to address the issues in the state of the art. We welcome any crazy ideas to try and to discuss within the team, and the faculty is always open to discussing anything.
We try our best to provide you with the best development environment (e.g., standing desks and, depending on the funding situation, noise-cancelling headphones). Each graduate student has a state-of-the-art workstation equipped with one A6000 GPU and a 39" wide curved monitor for fast prototyping. The lab has roughly 80 GPUs (A6000 or 3090), which is roughly 6+ GPUs per person, and we are buying more!
We try our best to offer you the best monetary compensation we can. Most of our team members are paid the legally allowed maximum graduate student salary, with additional pay for extra project workloads if you choose to be involved.
Combining few-shot, continual video understanding with language understanding, we try to architect new models for building a robotic agent that helps with household tasks (e.g., bringing a cup of water from the kitchen).
We try to build models that understand language alongside visual signals. Beyond vision-and-language understanding, we are also interested in other modalities, including sketch, diagram, and neuromorphic (or event) understanding.
We try to architect new models for understanding videos, a long-standing open problem in computer vision. See more details in our publication pages.