Computer Vision and Multimodal Computing Reading Group

We read and discuss papers in the areas of (1) diffusion models, (2) digital avatars, (3) multimodal perception and robotics, (4) computer vision for social interaction, (5) implicit neural representations, and (6) other areas related to our research. If you would like to join this group, please email yapeng dot tian at utdallas dot edu.

Regular Meeting Time & Place

Scheduled Meetings

Date

Agenda

03/23/2023

Shijian Deng: PaLM-E: An Embodied Multimodal Language Model

03/30/2023

Yulang Wu: What are Diffusion Models?

04/06/2023

Siva Sai Nagender Vasireddy: DiffusionDet: Diffusion Model for Object Detection

Harsh Singh: Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation

04/13/2023

Shijian Deng: High-resolution image reconstruction with latent diffusion models from human brain activity

04/20/2023

Siva Sai Nagender Vasireddy: Visual Programming for Compositional Visual Reasoning

Sasha Kaplan: ViperGPT: Visual Inference via Python Execution for Reasoning

04/27/2023

Harsh Singh: SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

05/04/2023

Shijian Deng: Vision Transformers are Parameter-Efficient Audio-Visual Learners

05/18/2023

Siva Sai Nagender Vasireddy: SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning

05/25/2023

Harsh Singh: Image as Set of Points

06/01/2023

Saksham Singh Kushwaha: Dataset Condensation with Distribution Matching

06/08/2023

Shijian Deng: ImageBind: One Embedding Space To Bind Them All

06/15/2023

Siva Sai Nagender Vasireddy: Audio Visual Language Maps for Robot Navigation

06/22/2023

Sisi Aarukapalli: Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models

07/13/2023

Siva Sai Nagender Vasireddy: Semantic Audio-Visual Navigation

07/20/2023

Harsh Singh: Self-Supervised Video Forensics by Audio-Visual Anomaly Detection

07/27/2023

Shijian Deng: Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

08/03/2023

Akshay Vyas: ChromaGAN: Adversarial Picture Colorization with Semantic Class Distribution

TBD

The paper reading has been postponed until CVPR.


Current Participants


Prof. Yapeng Tian
Siva Sai Nagender Vasireddy, PhD Student
Shijian Deng, PhD Student
Saksham Singh Kushwaha, PhD Student
Weiguo Pian, PhD Student
Akshay Vyas, PhD Student
Zongyang Du, PhD Student
Sasha Kaplan, Undergraduate Student
Sisi Aarukapalli, Undergraduate Student

Past Participants


Yulang Wu, Postdoc at UCSF
Harsh Singh, Graduate Student at MBZUAI

This website was inspired by the Topology Data Analysis Reading Group.