Plenary Speakers
Olga Russakovsky
Department of Computer Science
Princeton University
Title of the Talk: Fairness in Visual Recognition: Redesigning the Datasets, Improving the Models, and Diversifying the AI Leadership
Date & Time: 20 December 2021, 9:30 AM (IST)
Session Chair: Santanu Chaudhury
Abstract: Computer vision models trained on unparalleled amounts of data have revolutionized many applications. However, more and more historical societal biases are making their way into these seemingly innocuous systems. We focus our attention on two types of bias: (1) inappropriate correlations between protected attributes (age, gender expression, skin color, ...) and the predictions of visual recognition models, and (2) unintended discrepancies in the error rates of vision systems across different social, demographic, or cultural groups. In this talk, I'll dive deeper into both the technical causes of bias in computer vision and viable strategies for mitigating it. I'll highlight a subset of our recent work on mitigating bias in visual datasets, in recognition models, in evaluation metrics, and in the makeup of the next generation of researchers.
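To make the second notion of bias concrete: a disparity in error rates can be quantified by computing a model's error rate separately for each group and comparing them. The following is a minimal sketch, not taken from the speaker's work; the function name, inputs, and group labels are hypothetical.

```python
import numpy as np

def error_rate_gap(y_true, y_pred, groups):
    """Per-group error rates and the largest gap between any two groups.

    y_true, y_pred : arrays of class labels
    groups         : array of group labels (e.g., self-reported demographic group)
    """
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    rates = {g: float(np.mean(y_pred[groups == g] != y_true[groups == g]))
             for g in np.unique(groups)}
    gap = max(rates.values()) - min(rates.values())  # 0.0 means equal error rates
    return rates, gap

# Hypothetical usage:
rates, gap = error_rate_gap(
    y_true=[1, 0, 1, 1, 0, 0],
    y_pred=[1, 0, 0, 1, 1, 0],
    groups=["a", "a", "a", "b", "b", "b"],
)
print(rates, gap)  # both groups have error rate 1/3, so the gap is 0.0
```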
Biography:
Dr. Olga Russakovsky is an Assistant Professor in the Computer Science Department at Princeton University. Her research is in computer vision, closely integrated with the fields of machine learning, human-computer interaction and fairness, accountability and transparency. She has been awarded the AnitaB.org's Emerging Leader Abie Award in honor of Denice Denton in 2020, the CRA-WP Anita Borg Early Career Award in 2020, the MIT Technology Review's 35-under-35 Innovator award in 2017, the PAMI Everingham Prize in 2016 and Foreign Policy Magazine's 100 Leading Global Thinkers award in 2015. In addition to her research, she co-founded and continues to serve on the Board of Directors of the AI4ALL foundation dedicated to increasing diversity and inclusion in Artificial Intelligence (AI). She completed her PhD at Stanford University in 2015 and her postdoctoral fellowship at Carnegie Mellon University in 2017.
Kristen Grauman
Department of Computer Science
University of Texas at Austin
Title of the Talk: First-Person Video for Understanding Interactions
Date & Time: 21 December 2021, 9:30 AM (IST)
Session Chair: Rama Chellappa
Abstract: First-person or “egocentric” perception requires understanding the multimodal video that streams to a wearable camera. An always-on egocentric view offers a special window into the camera wearer’s attention, goals, and interactions with people and objects in the environment, making it an exciting avenue for perception in augmented reality and robot learning. I will present our work on first-person video understanding and show our progress in using passive observations of human activity to inform active robot behaviors. First, we explore learning visual affordances to anticipate how objects and environments are used, which requires not simply knowing what to call things, but also knowing how they work. Turning to audio-visual sensing, we extract a conversation partner’s speech from competing background sounds or other human speakers. Towards translating these models into robot action, we prime reinforcement learning agents to prefer human-like interactions, thereby accelerating their task learning. Finally, I will overview Ego4D, a massive new egocentric video dataset and benchmark built by a multi-institution collaboration, which will be publicly available this month.
Biography:
Kristen Grauman is a Professor in the Department of Computer Science at the University of Texas at Austin and a Research Director at Facebook AI Research (FAIR). Her research in computer vision and machine learning focuses on visual recognition, video, and embodied perception. Before joining UT-Austin in 2007, she received her Ph.D. from MIT. She is an IEEE Fellow, AAAI Fellow, Sloan Fellow, and recipient of the 2013 Computers and Thought Award. She was inducted into the Academy of Distinguished Teachers at UT Austin in 2017. She and her collaborators have been recognized with several Best Paper awards in computer vision, including a 2011 Marr Prize and a 2017 Helmholtz Prize (test-of-time award). She has served as Associate Editor-in-Chief of PAMI and Program Chair of CVPR 2015 and NeurIPS 2018.
http://www.cs.utexas.edu/~grauman/
Jonathan Ragan-Kelley
Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology
Title of the Talk: Organizing Computation for High-Performance Visual Computing
Date & Time: 22 December 2021, 9:30 AM (IST)
Session Chair: Subodh Kumar
Abstract: In the face of declining returns to Moore’s law, future visual computing applications—from photorealistic real-time rendering, to 4D light field cameras, to pervasive sensing with deep learning—still demand orders of magnitude more computation than we currently have. From data centers to mobile devices, performance and energy scaling is limited by locality (the distance over which data has to move, e.g., from nearby caches, far-away main memory, or across networks) and by parallelism. Because of this, I argue that we should think of the performance and efficiency of an application as determined not just by the algorithm and the hardware on which it runs, but critically also by the organization of its computations and data. For algorithms with the same complexity—even the exact same set of arithmetic operations—the order and granularity of execution and the placement of data can easily change performance by an order of magnitude because of locality and parallelism. To extract the full potential of our machines, we must treat the organization of computation as a first-class concern, working across all levels, from algorithms and data structures, to programming languages, to hardware.
This talk will present facets of this philosophy in systems for image processing, 3D graphics, and machine learning. I will show that, for the data-parallel pipelines common in these data-intensive applications, the possible organizations of computations and data, and the effect they have on performance, are driven by the fundamental dependencies in a given problem. Then I will show how, by exploiting domain knowledge to define structured spaces of possible organizations and dependencies, we can enable radically simpler high-performance programs, smarter compilers, and more efficient hardware. Finally, I will show how we use these structured spaces to unlock the power of machine learning for optimizing systems.
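As a small, self-contained illustration of the locality point above (this sketch is not from the talk), the two functions below perform exactly the same arithmetic, summing every element of a large row-major NumPy array, but traverse memory in different orders; on typical hardware the contiguous, row-by-row traversal is noticeably faster purely because of cache locality.

```python
import time
import numpy as np

a = np.random.rand(4000, 4000)  # row-major (C-order): rows are contiguous in memory

def sum_by_rows(x):
    # Contiguous traversal: each x[i, :] is one cache-friendly block.
    total = 0.0
    for i in range(x.shape[0]):
        total += x[i, :].sum()
    return total

def sum_by_cols(x):
    # Same arithmetic, different order: each x[:, j] strides across the whole array.
    total = 0.0
    for j in range(x.shape[1]):
        total += x[:, j].sum()
    return total

for f in (sum_by_rows, sum_by_cols):
    t0 = time.perf_counter()
    result = f(a)
    print(f"{f.__name__}: {time.perf_counter() - t0:.3f} s (sum = {result:.2f})")
```

Systems such as Halide, mentioned in the biography below, make exactly this kind of choice of order, tiling, and data placement an explicit "schedule" that is specified separately from the algorithm itself.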
Biography: Jonathan Ragan-Kelley is the Esther and Harold E. Edgerton Assistant Professor of Electrical Engineering & Computer Science at MIT. He works on high-efficiency visual computing, including systems, compilers, and architectures for image processing, vision, 3D graphics, and machine learning. He is a recipient of the ACM SIGGRAPH Significant New Researcher Award, an NSF CAREER Award, an Intel Outstanding Researcher Award, and two CACM Research Highlights. He was previously an assistant professor of EECS at UC Berkeley, a visiting researcher at Google, and a postdoc in Computer Science at Stanford, and he earned his PhD in Computer Science from MIT in 2014. He co-created the Halide language and has helped build more than a dozen other DSL and compiler systems, the first of which was a finalist for an Academy technical achievement award.
Bhabatosh Chanda
Electronics and Communication Sciences Unit
Indian Statistical Institute
Title of the Talk: Morphological Network
Date & Time: 20 December 2021, 1:30 PM (IST)
Session Chair: Jayanta Mukhopadhyay
Abstract: Mathematical morphology provides powerful image processing tools that operate directly on shapes. On the other hand, in recent times, convolutional neural networks have shown outstanding performance on many well-known computer vision problems. This talk presents the fusion of these two concepts to develop what we call a morphological network, or Morph-Net. We also show its performance in various applications.
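For intuition (this sketch is ours, not the speaker's implementation), a basic morphological layer can be viewed as a convolution in which multiply-and-sum is replaced by add-and-max (dilation) or subtract-and-min (erosion), with the structuring element playing the role of the learnable kernel. A minimal NumPy version for a single-channel 2D input might look like this:

```python
import numpy as np

def dilation2d(image, se):
    """Grayscale dilation: add the structuring element, take the max.

    image : 2D array (H, W)
    se    : 2D structuring element (kh, kw); in a Morph-Net these weights
            would be learned by gradient descent. Correlation-style indexing
            is used (no reflection of se), as is conventional for CNN layers.
    """
    kh, kw = se.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image.astype(float), ((ph, ph), (pw, pw)),
                    constant_values=-np.inf)
    out = np.empty(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.max(padded[i:i + kh, j:j + kw] + se)
    return out

def erosion2d(image, se):
    """Grayscale erosion: the dual operation (subtract, take the min)."""
    kh, kw = se.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image.astype(float), ((ph, ph), (pw, pw)),
                    constant_values=np.inf)
    out = np.empty(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.min(padded[i:i + kh, j:j + kw] - se)
    return out
```

Stacking and combining such dilation and erosion layers with learned structuring elements is the basic idea behind a morphological network of the kind described in the abstract.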
Biography:
Bhabatosh Chanda is currently a Professor at the Indian Statistical Institute, Kolkata, India. His research interests include image and video processing, pattern recognition, computer vision, and mathematical morphology. He has published more than 200 technical articles in refereed journals and conferences, authored two books, and edited six books. He has received the Young Scientist Medal of the Indian National Science Academy in 1989, the Computer Engineering Division Medal of the Institution of Engineers (India) in 1998, the Vikram Sarabhai Research Award in 2002, and the IETE Ram Lal Wadhwa Gold Medal in 2007. He is a Fellow of the Institution of Electronics and Telecommunication Engineers (FIETE), the National Academy of Sciences, India (FNASc), the Indian National Academy of Engineering (FNAE), and the International Association for Pattern Recognition (FIAPR).