Plenary Speakers


Olga Russakovsky
Department of Computer Science
Princeton University

Title of the Talk: Fairness in Visual Recognition: Redesigning the Datasets, Improving the Models and Diversifying the AI Leadership
Date & Time: 20 December 2021, 9:30 AM (IST)
Session Chair: Santanu Chaudhury

Abstract: Computer vision models trained on unparalleled amounts of data have revolutionized many applications. However, more and more historical societal biases are making their way into these seemingly innocuous systems. We focus our attention on two types of biases: (1) bias in the form of inappropriate correlations between protected attributes (age, gender expression, skin color, ...) and the predictions of visual recognition models, as well as (2) bias in the form of unintended discrepancies in error rates of vision systems across different social, demographic or cultural groups. In this talk, I'll dive deeper into both the technical causes of bias in computer vision and viable strategies for mitigating it. I'll highlight a subset of our recent work mitigating bias in visual datasets, in recognition models, in evaluation metrics, as well as in the makeup of the next generation of researchers.

Biography: Dr. Olga Russakovsky is an Assistant Professor in the Computer Science Department at Princeton University. Her research is in computer vision, closely integrated with the fields of machine learning, human-computer interaction and fairness, accountability and transparency. She has been awarded the AnitaB.org Emerging Leader Abie Award in honor of Denice Denton in 2020, the CRA-WP Anita Borg Early Career Award in 2020, the MIT Technology Review's 35 Innovators Under 35 award in 2017, the PAMI Everingham Prize in 2016 and Foreign Policy Magazine's 100 Leading Global Thinkers award in 2015. In addition to her research, she co-founded and continues to serve on the Board of Directors of the AI4ALL foundation, dedicated to increasing diversity and inclusion in Artificial Intelligence (AI). She completed her PhD at Stanford University in 2015 and her postdoctoral fellowship at Carnegie Mellon University in 2017.


Kristen Grauman
Department of Computer Science
University of Texas at Austin

Title of the Talk: First-Person Video for Understanding Interactions
Date & Time: 21 December 2021, 9:30 AM (IST)
Session Chair: Rama Chellappa

Abstract: First-person or “egocentric” perception requires understanding the multimodal video that streams to a wearable camera. An always-on egocentric view offers a special window into the camera wearer’s attention, goals, and interactions with people and objects in the environment, making it an exciting avenue for perception in augmented reality and robot learning. I will present our work on first-person video understanding, and show our progress using passive observations of human activity to inform active robot behaviors. First, we explore learning visual affordances to anticipate how objects and environments are used, which requires not simply knowing what to call things, but also knowing how they work. Turning to audio-visual sensing, we extract a conversation partner’s speech from competing background sounds or other human speakers. Towards translating these models into robot action, we prime reinforcement learning agents to prefer human-like interactions, thereby accelerating their task learning. Finally, I will overview Ego4D, a massive new egocentric video dataset and benchmark built by a multi-institution collaboration that will be publicly available this month.

Biography: Kristen Grauman is a Professor in the Department of Computer Science at the University of Texas at Austin and a Research Director in Facebook AI Research (FAIR). Her research in computer vision and machine learning focuses on visual recognition, video, and embodied perception. Before joining UT-Austin in 2007, she received her Ph.D. at MIT. She is an IEEE Fellow, AAAI Fellow, Sloan Fellow, and recipient of the 2013 Computers and Thought Award. She was inducted into the Academy of Distinguished Teachers at UT Austin in 2017. She and her collaborators have been recognized with several Best Paper awards in computer vision, including a 2011 Marr Prize and a 2017 Helmholtz Prize (test of time award). She has served as Associate Editor-in-Chief for PAMI and Program Chair of CVPR 2015 and NeurIPS 2018.


Jonathan Ragan-Kelley
Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology

Title of the Talk: Organizing Computation for High-Performance Visual Computing
Date & Time: 22 December 2021, 9:30 AM (IST)
Session Chair: Subodh Kumar

Abstract: In the face of declining returns to Moore’s law, future visual computing applications—from photorealistic real-time rendering, to 4D light field cameras, to pervasive sensing with deep learning—still demand orders of magnitude more computation than we currently have. From data centers to mobile devices, performance and energy scaling is limited by locality (the distance over which data has to move, e.g., from nearby caches, far away main memory, or across networks) and parallelism. Because of this, I argue that we should think of the performance and efficiency of an application as determined not just by the algorithm and the hardware on which it runs, but critically also by the organization of its computations and data. For algorithms with the same complexity—even the exact same set of arithmetic operations—the order and granularity of execution and placement of data can easily change performance by an order of magnitude because of locality and parallelism. To extract the full potential of our machines, we must treat the organization of computation as a first-class concern, while working across all levels, from algorithms and data structures, to programming languages, to hardware.
This talk will present facets of this philosophy in systems for image processing, 3D graphics, and machine learning. I will show that, for the data-parallel pipelines common in these data-intensive applications, the possible organizations of computations and data, and the effect they have on performance, are driven by the fundamental dependencies in a given problem. Then I will show how, by exploiting domain knowledge to define structured spaces of possible organizations and dependencies, we can enable radically simpler high-performance programs, smarter compilers, and more efficient hardware. Finally, I will show how we use these structured spaces to unlock the power of machine learning for optimizing systems.
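The abstract's point that the same arithmetic, reordered, can run very differently can be sketched with a toy example (illustrative, not from the talk): both functions below perform exactly the same additions, but on a row-major (C-contiguous) array the first walks memory sequentially while the second strides across rows. In a compiled setting on large arrays, that locality difference alone can change runtime by large factors; here the point is simply that the result is identical either way.

```python
import numpy as np

def sum_row_major(a):
    """Sum a 2D array with the row index in the outer loop.

    For a C-contiguous array, the inner loop visits contiguous
    memory (cache-friendly traversal)."""
    total = 0.0
    rows, cols = a.shape
    for i in range(rows):
        for j in range(cols):
            total += a[i, j]
    return total

def sum_col_major(a):
    """Same additions, column index in the outer loop.

    The inner loop now strides by a full row per step, so on a
    C-contiguous array each access lands in a different cache line."""
    total = 0.0
    rows, cols = a.shape
    for j in range(cols):
        for i in range(rows):
            total += a[i, j]
    return total

a = np.arange(12, dtype=np.float64).reshape(3, 4)
assert sum_row_major(a) == sum_col_major(a)  # identical arithmetic, identical result
```

Languages like Halide make exactly this choice (loop order, tiling, granularity) a first-class, separately specified "schedule" rather than an accident of how the code was written.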

Biography: Jonathan Ragan-Kelley is the Esther and Harold E. Edgerton Assistant Professor of Electrical Engineering & Computer Science at MIT; he was previously an assistant professor of EECS at UC Berkeley. He works on high-efficiency visual computing, including systems, compilers, and architectures for image processing, vision, 3D graphics, and machine learning. He is a recipient of the ACM SIGGRAPH Significant New Researcher award, NSF CAREER award, Intel Outstanding Researcher award, and two CACM Research Highlights. He was also a visiting researcher at Google, a postdoc in Computer Science at Stanford, and earned his PhD in Computer Science from MIT in 2014. He co-created the Halide language and has helped build more than a dozen other DSL and compiler systems, the first of which was a finalist for an Academy technical achievement award.


Bhabatosh Chanda
Electronics and Communication Sciences Unit
Indian Statistical Institute

Title of the Talk: Morphological Network
Date & Time: 20 December 2021, 1:30 PM (IST)
Session Chair: Jayanta Mukhopadhyay

Abstract: Mathematical morphology provides powerful image processing tools that operate directly on shapes. On the other hand, in recent times, convolutional neural networks have shown outstanding performance on many well-known computer vision problems. This talk presents a fusion of these two concepts into what we call a morphological network, or Morph-Net, and shows its performance in various applications.
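The building blocks being fused here can be sketched in a few lines (a minimal illustrative example, not the authors' Morph-Net implementation): grayscale dilation and erosion replace convolution's multiply-accumulate with add-max and subtract-min over a structuring element `w`. In a morphological network, `w` would be a learnable weight updated by backpropagation; the function names below are assumptions for illustration.

```python
import numpy as np

def gray_dilate1d(x, w):
    """Grayscale dilation: out[i] = max_k (x[i+k] + w[k]), valid positions only."""
    n, m = len(x), len(w)
    return np.array([np.max(x[i:i + m] + w) for i in range(n - m + 1)])

def gray_erode1d(x, w):
    """Grayscale erosion: out[i] = min_k (x[i+k] - w[k]), valid positions only."""
    n, m = len(x), len(w)
    return np.array([np.min(x[i:i + m] - w) for i in range(n - m + 1)])

x = np.array([0., 1., 5., 1., 0.])
w = np.zeros(3)  # flat structuring element of width 3
peaks = gray_dilate1d(x, w)   # local max over each window of 3
floors = gray_erode1d(x, w)   # local min over each window of 3
```

With a flat (all-zero) structuring element these reduce to sliding max/min filters; learning a non-flat `w` is what gives a morphological layer its shape-selective behavior.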

Biography: Bhabatosh Chanda is currently a Professor at the Indian Statistical Institute, Kolkata, India. His research interests include image and video processing, pattern recognition, computer vision and mathematical morphology. He has published more than 200 technical articles in refereed journals and conferences, authored two books and edited six books. He received the Young Scientist Medal of the Indian National Science Academy in 1989, the Computer Engineering Division Medal of the Institution of Engineers (India) in 1998, the Vikram Sarabhai Research Award in 2002 and the IETE Ram Lal Wadhwa Gold Medal in 2007. He is a Fellow of the Institution of Electronics and Telecommunication Engineers (FIETE), the National Academy of Sciences, India (FNASc), the Indian National Academy of Engineering (FNAE), and the International Association for Pattern Recognition (FIAPR).

December 20    December 21    December 22
Session 1A     Session 2A     Session 3A
Session 1B     Session 2B     Session 3B
Session P1     Session P2     Vision India
Plenary 1      Plenary 3      Plenary 4
Plenary 2