Invited Talks


Siamak Yousefi
Department of Ophthalmology and the Department of Genetics, Genomics and Informatics,
University of Tennessee Health Science Center (UTHSC)

Title of the Talk: Mining ophthalmic data using Artificial Intelligence.

Date&Time: 19 December, 9:15 AM IST.
Session Chair: Arnav Bhavsar

Bio: Siamak Yousefi is an Assistant Professor at the Department of Ophthalmology and the Department of Genetics, Genomics, and Informatics of the University of Tennessee Health Science Center (UTHSC) in Memphis. He received his PhD in Electrical Engineering from the University of Texas at Dallas in 2012 and completed postdoctoral training at the University of California Los Angeles (UCLA), working on Brain Computer Interface (BCI), and at the University of California San Diego (UCSD), working on computational ophthalmology. He is the director of the Data Mining and Machine Learning (DM2L) laboratory at UTHSC. He has published more than 100 peer-reviewed journal articles, conference papers, and abstracts, with over 50 on applications of Artificial Intelligence (AI) in vision and ophthalmology. He has been an invited guest speaker, moderator, and co-organizer at numerous ophthalmology venues, including the Association for Research in Vision and Ophthalmology (ARVO), The Glaucoma Foundation, the Asia-Pacific Glaucoma Congress (APGC), the International Society for Eye Research (ISER), and the Iranian Society of Ophthalmology (IRSO). He has been a member of several National Institutes of Health (NIH) grant review panels. He is an Editorial Board Member of the Translational Vision Science and Technology (TVST) journal.
His lab is working on developing deep learning, manifold learning, conventional machine learning, unsupervised machine learning, and statistical approaches to screen, diagnose, and monitor different ocular conditions such as glaucoma, macular degeneration, keratoconus, keratoplasty, and uveitis from imaging and visual field data.


Xiaojun Chang
Monash University

Title of the Talk: Towards Neural Architecture Search: Challenges and Solutions

Date&Time: 20 December, 3:15 PM (IST).
Session Chair: Abhinav Dhall

Abstract: In recent years, a large number of algorithms for Neural Architecture Search (NAS) have emerged. They improve on the original NAS approach in various ways, and the resulting body of research is large and complex. To make NAS research more accessible to beginners, this tutorial offers a new perspective: we start with an overview of the characteristics of the earliest NAS algorithms, summarize the problems in these early algorithms, and then present the solutions proposed in subsequent work. In addition, we will provide a detailed and comprehensive analysis, comparison, and summary of these works. Finally, we will outline possible future research directions.

Bio: Dr. Xiaojun Chang is a Senior Lecturer at the Faculty of Information Technology, Monash University Clayton Campus, Australia. He is also affiliated with the Monash University Centre for Data Science. He is an ARC Discovery Early Career Researcher Award (DECRA) Fellow for 2019-2021 (awarded in 2018). Before joining Monash, Dr. Chang was a Postdoctoral Research Associate in the School of Computer Science, Carnegie Mellon University, working with Prof. Alex Hauptmann. He has spent most of his time exploring multiple signals (visual, acoustic, textual) for automatic content analysis in unconstrained or surveillance videos. Dr. Chang's system has achieved top performance in various international competitions, such as TRECVID MED, TRECVID SIN, and TRECVID AVS. Dr. Chang received his Ph.D. degree from the Centre for Artificial Intelligence & Faculty of Engineering and Information Technology, University of Technology Sydney, under the supervision of Prof. Yi Yang. During his PhD study, he was mentored in turn by Prof. Feiping Nie and Yaoliang Yu. His research in this period focused mainly on developing machine learning algorithms and applying them to multimedia analysis and computer vision.


Tanaya Guha
University of Warwick

Title of the Talk: Trajectory Forecasting: Multi-object to Multi-camera

Date&Time: 21 December, 3:15 PM (IST).
Session Chair: Abhinav Dhall

Abstract: Predicting future events in videos is a core task in computer vision. Trajectory forecasting is the problem of predicting future locations of an object in a video, with wide applications in surveillance, autonomous vehicle navigation, and mobile robotics. Pedestrians are a particularly challenging class of objects to predict, as they exhibit highly dynamic motion and may change speed or direction rapidly. This talk will focus on our recent work on pedestrian trajectory forecasting in videos, which introduces new perspectives to this popular task. In contrast to existing works, which primarily consider a bird's-eye perspective, we formulate the problem from an object-level perspective and call for the prediction of full object bounding boxes rather than trajectories alone. Next, we introduce the task of multi-camera trajectory forecasting, where the future trajectory of an object is predicted across a network of cameras. The talk will discuss relevant literature, databases, and the state of the art in the area.

Bio: Tanaya Guha is an Assistant Professor of Computer Science at the University of Warwick, UK, where she is a member of the Warwick Machine Learning Group. Prior to joining Warwick, she was an Assistant Professor at IIT Kanpur and a Postdoctoral Researcher at the University of Southern California. She holds a PhD in Electrical & Computer Engineering from the University of British Columbia (UBC), Vancouver, Canada. Her research focuses on building machine intelligence capabilities to understand, recognize, and predict human behaviour by combining machine learning and signal processing. She regularly serves on the Program Committees of INTERSPEECH, ICME, ACM MM, and ACII. She is a recipient of the Warwick Global Research Priority award and the ICME'20 Outstanding Area Chair award, and won prestigious scholarships from Mensa Canada, Amazon, and Google at the doctoral level.


Karan Sikka

SRI International

Title of the Talk: Overview of Multimodal Embeddings and its Application in Vision-Language Tasks and Beyond

Date&Time: 19 December, 3:15 PM (IST).
Session Chair: Abhinav Dhall

Abstract: In the last few years, we have witnessed significant progress in learning tasks involving vision and language, such as Visual Question Answering and Phrase Grounding. This growth has been fueled by advances in deep learning in the vision and language domains and by the availability of large-scale multimodal data. In this talk I will focus on building learning models that can jointly understand and reason about multiple modalities. I will begin with an introduction to multimodal embeddings, where the key idea is to align the two modalities by embedding them in a common vector space. This alignment ensures that similar entities in both modalities lie closer together, e.g., the word "cow" and images of the concept "cow". I will then discuss some of our past work on using these embeddings to solve problems such as zero-shot recognition and phrase grounding. I will then cover applications of multimodal embeddings to tasks other than vision-language, such as social media analysis and visual localization. Finally, I will briefly discuss recent work on multimodal transformers, which have shown large performance improvements in vision-language tasks by relying on the ability of transformers to learn strong correlations through multi-head attention, large capacity, large-scale data, and pre-training strategies. I hope this talk will provide a basic introduction to related concepts and encourage researchers to work in this field.

Bio: Available at


Walter Schneider

University of Pittsburgh

Sudhir Pathak

University of Pittsburgh

Title of the Talk: MRI diffusion brain imaging of the human Connectome and clinical assessment of brain connectivity disorders and neurosurgical planning.

Date&Time: 20 December, 9:00 AM (IST).
Session Chair: Arnav Bhavsar

Walter Schneider's Bio: Dr. Walter Schneider is Professor of Psychology, Neurosurgery, Radiology & Bioengineering at the University of Pittsburgh & Medical Center and Senior Scientist at the Learning Research and Development Center. His research includes basic and actionable neuroscience based on diffusion imaging of white matter fiber tracts with High Definition Fiber Tracking (HDFT) and behavioral assessment. HDFT technology is now being used in neurosurgery for both presurgical planning and real-time surgical guidance in the operating room. HDFT is used in diagnostic assessment of Traumatic Brain Injury (TBI) for visualizing and quantifying fiber breaks where other MRI methods could not. He uses HDFT and MRI to localize tasks that can be used in targeted cognitive therapy to regrow damaged tissue. He has over 200 publications and published the 4th and 9th most cited papers in the history of psychology, with over 50,000 citations. He published the first functional neuroimaging paper in Nature, helping to spark the modern era of brain imaging, developed a major model of brain executive and control systems (top downloaded paper in Cognitive Science 2003), and received the 2010 Editor's Choice award for best imaging methods paper from NeuroImage. His group has developed brain tractographic imaging for mapping the brain Connectome and co-developed the E-Prime software used by over 10,000 laboratories in 58 countries. He also developed the Integrated Functional Imaging Systems (IFIS) (now sold by Philips), which has been installed by over 150 brain imaging centers around the world. His technology was the basis of the Pittsburgh-based spinoff company Psychology Software Tools Inc., which employs forty people in high technology jobs in Pittsburgh. He develops advanced technology for MRI-based imaging, patient assessment, data visualization, mobile computing, and physical MRI phantom engineering. His recent work in diffusion fiber tracking identifies brain networks, quantifies tract integrity, and maps brain areas.
His technology is used in clinical neurosurgery and TBI assessments on over a hundred patients per year and has produced improved medical outcomes and helped patients to understand and better deal with their brain pathology and rehabilitation impact. He leads a program to do TBI imaging across 7 university and 8 DoD/VA hospital systems. His work was highlighted by First Lady Obama as the most promising new technology for returning TBI war wounded and has appeared in major media reports including 60 Minutes, Discovery Channel, Scientific American, and U.S. Medicine, as well as traditional news media including AP, CNN, and Fox News. He is committed to collaborative international programs to advance basic and medical brain imaging.

Sudhir Pathak's Bio: Dr. Sudhir Pathak's research involves the development and design of computational and mathematical models for use in diffusion MRI reconstruction, as well as the development of anisotropic metrics derived from diffusion models and their application in fiber tractography for use in biological tissue and textile phantoms. He is part of a project that uses a textile-based hollow fiber phantom to validate diffusion models. The project uses different specialized textiles to mimic axonal features, e.g., axonal diameter, packing density, and crossing angle, to validate micro-structural models. He has proposed a novel diffusion metric that can relate to the number of axons. These validation techniques and computational and mathematical models are further used to identify TBI lesions and understand brain connectivity. He also proposed a new reconstruction algorithm that can be used to parse the geometrical information of both phantom and biological tissues. It combines diffusion spectrum images with popular constrained spherical deconvolution techniques to estimate underlying fiber crossings. This method can further be used for the segmentation of multi-tissue compartments. At the University of Pittsburgh, he is the computational and mathematical lead in the development and design of the High Definition Fiber Tracking (HDFT) pipeline. HDFT is used on a daily basis in TBI and other neurosurgical projects at the University of Pittsburgh. He is also adding the metrics described above into the HDFT pipeline to further improve the identification of lesions in patient populations with Huntington's disease and amyotrophic lateral sclerosis. He is also part of an NIH-funded grant related to aphasia.