Poster Session P1

December 20, 2:30 PM to 3:45 PM

Chair: Ajit Bopardikar

          
37 Abnormality Detection and Classification of Macular Diseases from Optical Coherence Tomography Images: Using Feature Space Comparison
Authors: Ashok L R (College of Engineering, Trivandrum, Kerala)*; Sreeni K G (College of Engineering, Trivandrum)
Abstract: Optical Coherence Tomography (OCT) is a non-invasive imaging technology for diagnosing various macular pathologies. It assists ophthalmologists in detecting abnormalities in the retina, thereby helping to avert many sight-threatening conditions. However, manual detection depends on the experience of the clinical practitioner and is time-consuming. Automatic detection of abnormalities using computer algorithms allows the clinical practitioner to detect abnormalities faster and more accurately. Here, a deep-learning-based approach is proposed that can detect abnormal retinal images across disease classes, not limited to the classes present in the training dataset. The abnormality detector is tested on the UCSD, Kaggle, and OCTID datasets, achieving accuracies of 99.80%, 99.90%, and 94.59%, respectively. The classifier is also tested on the UCSD and Kaggle datasets, achieving accuracies of 97.7% and 99.79%, respectively.
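The abstract does not detail the feature-space comparison itself; below is a minimal Python sketch of one generic way such a detector could work, flagging images whose deep features lie far from the centroid of normal training features. The ResNet backbone, threshold rule, and stand-in data are illustrative assumptions, not the authors' method.

    # Hypothetical sketch: flag abnormal OCT images by comparing deep
    # feature vectors against the centroid of "normal" training features.
    import torch
    import torchvision.models as models

    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    backbone.fc = torch.nn.Identity()        # expose penultimate features
    backbone.eval()

    @torch.no_grad()
    def features(x):                         # x: (N, 3, 224, 224) batch
        return backbone(x)                   # -> (N, 512) feature vectors

    # Fit a "normal" reference from healthy-retina training images.
    normal = features(torch.randn(32, 3, 224, 224))     # stand-in data
    centroid = normal.mean(dim=0)
    threshold = (normal - centroid).norm(dim=1).max()   # simple cut-off

    @torch.no_grad()
    def is_abnormal(x):
        return (features(x) - centroid).norm(dim=1) > threshold

Because the threshold is fit only on normal images, images from disease classes unseen at training time can still be flagged, matching the open-set goal described above.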
Presenting Author: Ashok L R
Paper: https://doi.org/10.1145/3490035.3490265
 
41 Deep Feature Fusion for Automated Retinal Disease Detection Using OCT Images
Authors: Latha V (College of Engineering, Trivandrum)*; Sreeni K G (College of Engineering, Trivandrum)
Abstract: This paper proposes Deep Feature Fusion, a multi-layer feature fusion technique based on a deep convolutional neural network (DCNN), to build a fusion model for detecting retinal diseases from Optical Coherence Tomography (OCT) images. Although OCT has emerged as a potential imaging tool for retinal disease screening, automated disease detection remains a challenge. Most classification research concentrates on the DCNN feature map of the final convolution layer, ignoring the potential of internal layers. Hidden feature information at different levels can be used to improve feature discrimination. A DCNN model based on InceptionV3 is created that combines both local and global features: rather than depending solely on the final convolution layer, feature maps from multiple layers are combined using fusion techniques. Testing results show that the proposed approach performs effectively on publicly available retinal OCT datasets.
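As a rough illustration of the multi-layer fusion idea (not the paper's exact model), the sketch below taps an intermediate and a final InceptionV3 stage with forward hooks, pools each feature map, and concatenates them into a fused descriptor for a classifier head. The layer choices and four-class head are assumptions.

    import torch
    import torch.nn as nn
    from torchvision.models import inception_v3, Inception_V3_Weights

    backbone = inception_v3(weights=Inception_V3_Weights.DEFAULT)
    backbone.eval()

    # Capture intermediate ("local") and final ("global") feature maps.
    taps = {}
    backbone.Mixed_5d.register_forward_hook(lambda m, i, o: taps.update(local=o))
    backbone.Mixed_7c.register_forward_hook(lambda m, i, o: taps.update(glob=o))

    pool = nn.AdaptiveAvgPool2d(1)
    head = nn.Linear(288 + 2048, 4)  # channel dims of the two taps; 4 classes

    with torch.no_grad():
        backbone(torch.randn(1, 3, 299, 299))     # forward pass fills `taps`
        fused = torch.cat([pool(taps["local"]).flatten(1),
                           pool(taps["glob"]).flatten(1)], dim=1)
        logits = head(fused)                      # fused multi-layer features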
Presenting Author: Latha V
Paper: https://doi.org/10.1145/3490035.3490268
 
48 Robust Brain State Decoding using Bidirectional Long Short Term Memory Networks in functional MRI
Authors: Anant Mittal (Indraprastha Institute of Information Technology, Delhi)*; Priya Aggarwal (IIIT-Delhi); Luiz Pessoa (University of Maryland); Anubha Gupta (Indraprastha Institute of Information Technology-Delhi (IIITD))
Abstract: Decoding the brain states underlying cognitive processes by learning discriminative feature representations has recently gained a lot of interest in brain imaging studies. In particular, there has been an impetus to encode the dynamics of brain functioning by analyzing the temporal information available in fMRI data. Long short-term memory (LSTM), a class of machine learning models with a "memory" component that retains previously seen temporal information, has increasingly been observed to perform well in applications with dynamic temporal behavior, including brain state decoding. Because of the dynamics and inherent latency of the fMRI BOLD response, future temporal context is crucial; however, it is neither encoded nor captured by the conventional LSTM model. This paper performs robust brain state decoding by encapsulating information from both past and future instances of fMRI data via a bidirectional LSTM, which allows the dynamics of the BOLD response to be modeled explicitly without any delay adjustment. To this end, the input sequence is fed in normal time order to one LSTM network and in reverse time order to another. The hidden activations of the forward and reverse directions in the bi-LSTM are collated to build the "memory" of the model and are used to robustly predict the brain state at every time instance. Working-memory data from the Human Connectome Project (HCP) is used for validation; the proposed model is observed to perform 18% better than its unidirectional counterpart in brain state prediction accuracy.
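A minimal sketch of the core idea, with assumed dimensions: a bidirectional LSTM whose concatenated forward and backward hidden activations are classified at every fMRI time point.

    import torch
    import torch.nn as nn

    class BiLSTMDecoder(nn.Module):
        def __init__(self, n_rois=100, hidden=64, n_states=8):
            super().__init__()
            self.lstm = nn.LSTM(n_rois, hidden, batch_first=True,
                                bidirectional=True)
            self.cls = nn.Linear(2 * hidden, n_states)  # fwd + bwd states

        def forward(self, x):          # x: (batch, time, n_rois) BOLD series
            h, _ = self.lstm(x)        # h: (batch, time, 2 * hidden)
            return self.cls(h)         # state logits at every time instance

    model = BiLSTMDecoder()
    logits = model(torch.randn(2, 200, 100))   # 2 runs, 200 TRs, 100 ROIs
    states = logits.argmax(dim=-1)             # predicted state per TR

Feeding the whole sequence to a bidirectional layer gives each prediction access to future BOLD samples, which is how the delayed hemodynamic response can be accommodated without explicit delay adjustment.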
Presenting Author: Anant Mittal
Paper: https://doi.org/10.1145/3490035.3490269
 
71 Analysis of Vascular Dysregulation Caused by Infiltrating Glioma Cells Using BOLD fMRI
Authors: Ammu R (IIITB)*; Rajikha Raja (University of Arkansas for Medical Sciences); Neelam Sinha (IIIT Bangalore); Jitender Saini (National Institute of Mental Health and Neurological Sciences)
Abstract: Malignant glioma is a brain malignancy that can infiltrate surrounding tissues, disrupting cerebral blood flow. Identifying regions of vascular dysregulation is important because it may aid in detecting tumor spread; the strength of functional connectivity within the tumor also has prognostic value. The purpose of this study was to identify vascular dysfunction caused by glioma with the help of blood oxygen level-dependent (BOLD) functional MRI (fMRI). Multiple linear regression was performed to identify regions correlated with tumor cells. Functionally intact voxels within the tumor and the presence of Gaussian noise in BOLD fMRI images posed challenges in finding an efficient representation for the regressors. To address these challenges, we construct the regressors from regional homogeneity maps derived from tumor and control regions, and test the method on images contaminated with Gaussian noise. The proposed method yields an improved d-prime value of 2.3, indicating its reliability compared with the state-of-the-art method.
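For intuition, the sketch below runs a generic voxel-wise multiple linear regression against tumor- and control-derived reference signals on synthetic data; the paper's actual ReHo-based regressor construction is not reproduced here.

    import numpy as np

    T, V = 150, 5000                    # time points, voxels
    bold = np.random.randn(T, V)        # stand-in BOLD data (time x voxels)
    tumor_reg = np.random.randn(T)      # reference signal from tumor region
    control_reg = np.random.randn(T)    # reference signal from control region

    X = np.column_stack([np.ones(T), tumor_reg, control_reg])  # design matrix
    beta, *_ = np.linalg.lstsq(X, bold, rcond=None)            # (3, V) fits

    # Voxels coupled more strongly to the tumor regressor than to control.
    dysregulated = np.abs(beta[1]) > np.abs(beta[2])
    print(dysregulated.sum(), "candidate voxels")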
Presenting Author: Ammu R
Paper: https://doi.org/10.1145/3490035.3490276
 
90 Automatic Quantification and Visualization of Street Trees
Authors: Arpit Bahety (IIIT Hyderabad)*; Rohit Saluja (IIIT-Hyderabad); Ravi Kiran Sarvadevabhatla (IIIT Hyderabad); Anbumani Subramanian (IIIT-Hyderabad); C.V. Jawahar (IIIT-Hyderabad)
Abstract: Assessing the number of street trees is essential for evaluating urban greenery and can help municipalities identify tree-starved streets. It can also help identify roads with different levels of deforestation and afforestation over time. Yet there has been little work in the area of street tree quantification. This work first explains a data collection setup carefully designed for counting roadside trees. We then describe a unique annotation procedure aimed at robustly detecting and quantifying trees. We work on a dataset of around 1300 Indian road scenes annotated with over 2500 street trees. We additionally use five held-out videos covering 25 km of roads for counting trees. We finally propose a street tree detection, counting, and visualization framework using current object detectors and a novel yet simple counting algorithm owing to the thoughtful collection setup. We find that high-level visualizations based on the density of trees along routes and Kernel Density Ranking (KDR) provide a quick, accurate, and inexpensive way to recognize tree-starved streets. We obtain a tree detection mAP of 83.74% on the test images, a 2.73% improvement over our baseline. We propose Tree Count Density Classification Accuracy (TCDCA) as an evaluation metric for tree density. We obtain a TCDCA of 96.77% on the test videos, a remarkable improvement of 22.58% over the baseline, and demonstrate that our counting module's performance is close to human level. Source code: https://github.com/iHubData-Mobility/public-tree-counting.
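The paper's counting algorithm relies on its specific collection setup; as a simplified stand-in, the sketch below counts each tracked tree once when its detection crosses a fixed reference line, a natural strategy for a side-facing camera moving along the road. The tracker, track IDs, and line position are assumptions.

    # Count a tree once when its tracked box center crosses a vertical line.
    def count_trees(frame_detections, line_x=0.5):
        """frame_detections: per-frame lists of (track_id, x_center) pairs
        from a detector + tracker, with x normalized to [0, 1]."""
        last_x, counted = {}, set()
        for frame in frame_detections:
            for track_id, x in frame:
                prev = last_x.get(track_id)
                if (prev is not None and track_id not in counted
                        and prev > line_x >= x):   # crossed right-to-left
                    counted.add(track_id)
                last_x[track_id] = x
        return len(counted)

    # Two tracked trees drifting leftward as the capture vehicle advances.
    frames = [[(1, 0.9)], [(1, 0.6), (2, 0.8)], [(1, 0.4), (2, 0.45)]]
    print(count_trees(frames))   # -> 2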
Presenting Author: Arpit Bahety
Paper: https://doi.org/10.1145/3490035.3490280
 
107 Translating Sign Language Videos to Talking Faces
Authors: Seshadri Mazumder (International Institute of Information Technology, Hyderabad)*; Rudrabha Mukhopadhyay (IIIT Hyderabad); Vinay Namboodiri (University of Bath); C.V. Jawahar (IIIT-Hyderabad)
Abstract: Communication with the deaf community relies profoundly on the interpretation of sign languages performed by signers. In light of recent breakthroughs in sign language translation, we propose a pipeline that we term "Translating Sign Language Videos to Talking Faces". In this context, we improve existing sign language translation systems by using POS tags for better language modeling. We further extend the challenge to develop a system that can interpret a video from a signer into an avatar speaking a spoken language. We focus on translation systems that attempt to translate sign languages to text without glosses, an expensive annotation form. We critically analyze two state-of-the-art architectures and, based on their limitations, improve the systems. We propose a two-stage approach that translates sign language into intermediate text, followed by a language model that produces the final predictions. Quantitative evaluations on the challenging RWTH-PHOENIX-Weather 2014T benchmark show that our translation model improves on state-of-the-art models by approximately 3 points in translation accuracy. We then build a working text-to-talking-face generation pipeline by bringing together multiple existing modules. The overall pipeline is capable of generating talking face videos with speech from sign language poses. Additional materials about this project, including the code and a demo video, can be found at https://seshadri-c.github.io/SLV2TF/.
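As an illustration of the two-stage idea only, the sketch below pairs a hypothetical stage-one sign-to-text translator with a language model that picks the most fluent candidate by GPT-2 perplexity; this is an assumed setup, not the paper's trained pipeline.

    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tok = GPT2TokenizerFast.from_pretrained("gpt2")
    lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

    def lm_score(text):                  # mean negative log-likelihood
        ids = tok(text, return_tensors="pt").input_ids
        with torch.no_grad():
            return lm(ids, labels=ids).loss.item()

    def sign_to_text_candidates(video):  # hypothetical stage-1 translator
        return ["weather tomorrow sunny",
                "the weather will be sunny tomorrow"]

    candidates = sign_to_text_candidates(video=None)
    final = min(candidates, key=lm_score)   # stage 2: most fluent candidate
    print(final)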
Presenting Author: Seshadri Mazumder
Lab/Author homepage: https://cvit.iiit.ac.in/
Paper: https://doi.org/10.1145/3490035.3490286
 
114 An Encoder-decoder based Deep Architecture for Visible to Near Infrared Image Transformation
Authors: Shashaank Aswatha Mattur (Philips)*; Sai Phani Kumar Malladi (Indian Institute of Technology Kharagpur); Jayanta Mukhopadhyay (IIT Kharagpur)
Abstract: The near-infrared (NIR) band is an interesting wavelength range that lies just beyond the visible range of the human visual system. The NIR band shares various attributes with the visible spectrum, and NIR images can better represent vegetation and certain scene details that are sometimes opaque at RGB wavelengths. In this paper, we propose a dual encoder-decoder architecture with branches of different depths for transforming RGB images into NIR images. This network handles scale variations in the data and helps reduce artifacts in the transformed image domain. Two different resolutions of the input image undergo feature extraction by a series of five consecutive convolution and ReLU layers. Their corresponding aggregated features are then assembled and upsampled to regress over the ground-truth NIR image. We consider two datasets with different resolutions and properties, EPFL and OMSIV, which provide corresponding RGB-NIR pairs of natural scenes. We train and test our network on both datasets and evaluate its qualitative and quantitative performance; the results of transforming RGB images to NIR images are demonstrated through various examples.
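A minimal sketch of the described dual encoder-decoder, with assumed channel widths: two resolutions of the RGB input each pass through five conv+ReLU stages, the branch features are aggregated, upsampled, and decoded into a single-channel NIR image.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def encoder(in_ch=3, ch=32):
        layers, c = [], in_ch
        for _ in range(5):                   # five conv + ReLU stages
            layers += [nn.Conv2d(c, ch, 3, padding=1), nn.ReLU(inplace=True)]
            c = ch
        return nn.Sequential(*layers)

    class RGB2NIR(nn.Module):
        def __init__(self):
            super().__init__()
            self.enc_full = encoder()        # full-resolution branch
            self.enc_half = encoder()        # half-resolution branch
            self.decode = nn.Sequential(
                nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(32, 1, 3, padding=1))   # regress NIR intensity

        def forward(self, rgb):
            f_full = self.enc_full(rgb)
            half = F.interpolate(rgb, scale_factor=0.5, mode="bilinear",
                                 align_corners=False)
            f_half = self.enc_half(half)
            f_half = F.interpolate(f_half, size=f_full.shape[-2:],
                                   mode="bilinear", align_corners=False)
            return self.decode(torch.cat([f_full, f_half], dim=1))

    nir = RGB2NIR()(torch.randn(1, 3, 128, 128))   # -> (1, 1, 128, 128)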
Presenting Author: Sai Phani Kumar Malladi
Paper: https://doi.org/10.1145/3490035.3490288

    