Poster Session P2

December 21, 2:30 PM to 3:45 PM

Chair: Abhinav Dhall

70 HDR-cGAN: Single LDR to HDR Image Translation using Conditional GAN
December 21, 14:30:00 to 15:45:00
Authors: Prarabdh Raipurkar (Indian Institute of Technology Gandhinagar, India)*; Rohil Pal (Indian Institute of Technology Gandhinagar); Shanmuganathan Raman (Indian Institute of Technology Gandhinagar)
Abstract: The prime goal of digital imaging techniques is to reproduce the realistic appearance of a scene. Low Dynamic Range (LDR) cameras are incapable of representing the wide dynamic range of a real-world scene. The captured images turn out to be either too dark (underexposed) or too bright (overexposed). Specifically, saturation in overexposed regions makes the task of reconstructing a High Dynamic Range (HDR) image from a single LDR image challenging. In this paper, we propose a deep-learning-based approach to recover details in the saturated areas while reconstructing the HDR image. We formulate this problem as an image-to-image (I2I) translation task. To this end, we present a novel conditional GAN (cGAN) based framework trained in an end-to-end fashion over the HDR-REAL and HDR-SYNTH datasets. Our framework uses an overexposed mask obtained from a pre-trained segmentation model to facilitate the hallucination task of adding details in the saturated regions. We demonstrate the effectiveness of the proposed method through an extensive quantitative and qualitative comparison with several state-of-the-art single-image HDR reconstruction techniques.
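
The framework conditions the generator on the LDR input together with an overexposed-region mask. A minimal PyTorch sketch of this kind of conditioning, assuming a simple saturation threshold in place of the paper's pre-trained segmentation model (the threshold, channel widths and layer counts are illustrative assumptions, not the authors' design):

import torch
import torch.nn as nn

def overexposed_mask(ldr, thresh=0.95):
    # Hypothetical stand-in for the pre-trained segmentation model:
    # mark pixels whose maximum channel value is near saturation.
    return (ldr.max(dim=1, keepdim=True).values > thresh).float()

class CondGenerator(nn.Module):
    """Toy cGAN generator: the LDR image and the mask are concatenated
    channel-wise and mapped to an HDR estimate."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + 1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1),
        )

    def forward(self, ldr):
        mask = overexposed_mask(ldr)
        return self.net(torch.cat([ldr, mask], dim=1))

hdr_pred = CondGenerator()(torch.rand(1, 3, 256, 256))  # normalized LDR in, HDR estimate out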
Presenting Author: Prarabdh Raipurkar
Paper: https://doi.org/10.1145/3490035.3490275
Attend Spotlight Presentation
Join Breakout Meeting Room (Password: iitj) for Poster 70
 
110 Manifold Learning to address Catastrophic Forgetting
December 21, 14:30:00 to 15:45:00
Authors: Prathyusha Akundi (International Institute of Information Technology Hyderabad (IIITH))*; Jayanthi Sivaswamy (International Institute of Information Technology Hyderabad)
Abstract: A major challenge that deep learning systems face is the Catastrophic Forgetting (CF) phenomenon observed when fine-tuning is used to adapt a system to a new task or to a sequence of datasets with different distributions. CF refers to the significant degradation in performance on the old task/dataset. In this paper, a novel approach is proposed to address CF in computer-aided diagnosis (CAD) system design in the medical domain, where CAD systems often need to handle a sequence of datasets collected over time from different sites with different imaging parameters/populations. The solution we propose is to move samples from all the datasets closer to a common manifold via a reformer at the front end of a CAD system. The utility of this approach is demonstrated on two common tasks, namely segmentation and classification, using publicly available datasets. Results of extensive experiments show that manifold learning can yield about 74% improvement on average in the reduction of CF over the baseline fine-tuning process and state-of-the-art regularization-based methods. The results also indicate that a reformer, when used in conjunction with state-of-the-art regularization methods, has the potential to yield further improvement in CF reduction.
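
The central mechanism is a reformer placed at the front end of the CAD system that maps incoming samples toward a common manifold, so the downstream network itself is left untouched. A schematic PyTorch sketch, assuming a small convolutional reformer and a frozen stand-in classifier (both are illustrative, not the paper's exact architecture):

import torch
import torch.nn as nn

class Reformer(nn.Module):
    """Illustrative image-to-image front end that maps data from a new
    site/scanner toward the appearance of the original training data."""
    def __init__(self, ch=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, ch, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

reformer = Reformer()
cad_model = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 2))  # frozen stand-in CAD network
for p in cad_model.parameters():
    p.requires_grad = False  # only the reformer is adapted to the new dataset

logits = cad_model(reformer(torch.rand(4, 1, 64, 64)))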
Presenting Author: Prathyusha Akundi
Paper: https://doi.org/10.1145/3490035.3490287
Attend Spotlight Presentation
Join Breakout Meeting Room (Password: iitj) for Poster 110
 
142 Idol Dataset: A Database on Religious Idols and its Recognition with Deep networks
December 21, 14:30:00 to 15:45:00
Authors: Dr. Sathyabama B (Thiagarajar College of Engineering)*; Md Mansoor Roomi; Sabari Nathan (Couger Inc, Tokyo); Senthilarasi Marimuthu (Dr. Mahalingam College of Engineering and Technology); Manimala Gurusamy (Thiagarajar College of Engineering)
Abstract: Idols are rich descriptors capturing both visual and historical information about temples, and therefore enhance the process of documenting and managing the cultural heritage of a place. There are very limited annotated databases for artistic cultural heritage images, especially for idols. To meet this need, we collected, annotated, and prepared a new database of Hindu religious idols. The first version of the Idol dataset contains 14,592 images collected from the Internet by querying three major search engines with 150 keywords related to the names and manifestations of 31 idol categories. Correctly identifying a particular God/Goddess image in the form of paintings, photographs and sculptures is crucial. In this paper, we investigate the use of deep neural networks to solve the problem of recognizing religious idols. To this end, ten ImageNet pre-trained networks are evaluated, and DenseNet121, which outperforms the other networks, is selected. A modified DenseNet architecture with a softmax output is proposed for idol recognition, achieving Top-1 and Top-2 rank accuracies of 74.9% and 84.28% respectively for imbalanced learning, and 74.93% and 84.02% respectively for weighted-loss learning.
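
A minimal PyTorch/torchvision sketch of the backbone setup the abstract describes: an ImageNet-pre-trained DenseNet121 whose classifier head is replaced with a 31-way output (the softmax is implicit in the cross-entropy loss); the class weights shown are placeholders for the weighted-loss variant:

import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 31  # idol categories in the dataset

# ImageNet-pre-trained backbone with a replaced classifier head.
model = models.densenet121(weights=models.DenseNet121_Weights.IMAGENET1K_V1)
model.classifier = nn.Linear(model.classifier.in_features, NUM_CLASSES)

# Weighted loss for the weighted-loss learning variant; the weights here
# are placeholders (e.g., inverse class frequencies would be typical).
class_weights = torch.ones(NUM_CLASSES)
criterion = nn.CrossEntropyLoss(weight=class_weights)

x = torch.rand(2, 3, 224, 224)
loss = criterion(model(x), torch.tensor([0, 5]))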
Presenting Author: Sathyabama B
Paper: https://doi.org/10.1145/3490035.3490295
Attend Spotlight Presentation
Join Breakout Meeting Room (Password: iitj) for Poster 142
 
143 HazeNet: Does the image have Haze?
December 21, 14:30:00 to 15:45:00
Authors: Kalaivani P (GCE Bodi)*; Md Mansoor Roomi; Sabari Nathan (Couger Inc, Tokyo)
Abstract: The visibility of the scene is very important in applications such as intelligent video surveillance and autonomous vehicle navigation. The presence of haze limits visibility, thereby hindering the proper operation of computer vision algorithms. There are many algorithms that remove haze unconditionally, but haze removal on clear images or frames has adverse effects: unnecessary dehazing results in the loss of original information and also delays the subsequent processing. To address these challenges, there is a need to recognize an image as hazy or haze-free before applying dehazing to it. Hence, this paper presents three light-weight deep neural network models, named HazeNet, R-HazeNet and RSE-HazeNet, to automatically classify images as hazy or haze-free. HazeNet is a convolutional neural network with five convolutional blocks and achieves a classification accuracy of 97.89%. To improve the performance of HazeNet further, residual blocks are included after each convolutional layer; this model is called R-HazeNet. The residual blocks help in producing hierarchical features. To strengthen the useful convolutional channel features alone, squeeze-and-excitation blocks are included with the residual units in the RSE-HazeNet model. Both R-HazeNet and RSE-HazeNet produce a classification accuracy of 99.98%, which demonstrates superior performance compared to the state-of-the-art methods. The proposed models are light-weight, achieving a reduction in the number of parameters of at least 93% with respect to the baseline method.
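
The RSE-HazeNet description, residual units strengthened by squeeze-and-excitation (SE) channel re-weighting, follows a standard pattern; a minimal PyTorch sketch of one such block (the channel width and reduction ratio are assumptions):

import torch
import torch.nn as nn

class SEResidualBlock(nn.Module):
    """Residual unit followed by squeeze-and-excitation channel
    re-weighting, in the spirit of the RSE-HazeNet description."""
    def __init__(self, ch, reduction=8):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1),
        )
        self.se = nn.Sequential(  # squeeze-and-excitation gate
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(ch, ch // reduction), nn.ReLU(),
            nn.Linear(ch // reduction, ch), nn.Sigmoid(),
        )

    def forward(self, x):
        y = self.conv(x)
        w = self.se(y).view(x.size(0), -1, 1, 1)  # per-channel weights
        return x + y * w  # residual connection around the gated features

out = SEResidualBlock(32)(torch.rand(1, 32, 64, 64))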
Presenting Author: Kalaivani P
Paper: https://doi.org/10.1145/3490035.3490296
Attend Spotlight Presentation
Join Breakout Meeting Room (Password: iitj) for Poster 143
 
151 SiamRPN++D: Improved SiamRPN++ Using Cascaded Detector Sensing
December 21, 14:30:00 to 15:45:00
Authors: Vinayak Shriniwas (DRDO)*; Gorthi Rama Krishna Sai Subrahmanyam (IIT Tirupati); Arshad Jamal (DRDO)
Abstract: This paper presents a novel and robust long-term tracking algorithm to address continuous target tracking problems. Continuous target tracking demands correct re-initialization on a lost target when it reappears. The main limitation of the currently popular Siamese class of deep trackers is their inability to re-initialize a target when it is lost for a sufficiently long duration or when it re-appears at a location away from the lost location. Most Siamese deep trackers search for the lost target in a limited region close to where it disappeared. Hence, they fail in automated re-initialization, tracking resumption and maintaining the track after long-term occlusion or tracker loss. This puts a serious impediment on current state-of-the-art deep tracker frameworks for many real applications. Here, we propose the integration of a lightweight and efficient cascaded-classifier-based detection mechanism with Siamese trackers for re-initialization of the target. While the proposed approach is generic and applicable to all Siamese deep trackers, we have taken SiamRPN++ as the base tracker to illustrate the effectiveness of our tracking framework. The proposition enables the cascaded-classifier-based detector to adaptively direct the search region of the base tracker. Extensive experimental results on well-known tracking benchmark datasets such as UAV123, VOT2019, VOT2016, VOT2018 and VOT2018-LT show that the proposed integration significantly improves the performance of the base tracker under occlusion and tracker-loss scenarios. Further, the proposed tracker improves the precision by 1.71% and the recall by 13.98% over the base tracker for long-term tracking on the VOT2018-LT dataset.
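
The integration logic, falling back to the cascaded-classifier detector when the Siamese tracker's confidence collapses and using the best detection to redirect its search region, can be sketched as follows; the tracker and detector interfaces below are hypothetical placeholders, not the authors' actual API:

from dataclasses import dataclass

CONF_THRESH = 0.3  # assumed tracker-loss threshold

@dataclass
class Detection:
    box: tuple   # (x, y, w, h)
    score: float

def track_frame(frame, tracker, detector):
    """`tracker.update` is assumed to return (box, confidence) for one
    Siamese tracking step; `detector.detect` is assumed to return
    Detections from the cascaded classifier over the whole frame."""
    box, score = tracker.update(frame)
    if score < CONF_THRESH:
        # Target presumed lost: scan the full frame with the cascaded
        # classifier and redirect the tracker's search region.
        detections = detector.detect(frame)
        if detections:
            best = max(detections, key=lambda d: d.score)
            tracker.reinit(frame, best.box)  # re-initialize on the best hit
            box = best.box
    return box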
Presenting Author: Vinayak Shriniwas
Paper: https://doi.org/10.1145/3490035.3490298
Attend Spotlight Presentation
Join Breakout Meeting Room (Password: iitj) for Poster 151
 
162 GAN Based Indian Sign Language Synthesis
December 21, 14:30:00 to 15:45:00
Authors: Shyam Krishna (IIITB)*; Janmesh Ukey (Reliance Jio, AICoE); Dinesh Babu J (International Institute of Information Technology, Bangalore)
Abstract: Controllable visual reproduction of sign language, termed Sign Language Synthesis (SLS), is a major and challenging task in sign language processing. Traditional methods have used computer animation to perform this task, but these have faced several limitations: animation usually requires expensive equipment to perform motion capture and intensive manual oversight to ensure accuracy. Recently, Generative Adversarial Networks (GANs) have shown very promising results in pose and motion transfer, and this has been explored as an SLS method by Stoll et al. and related work. However, such work has required datasets with manual annotation, both in the form of a corpus of manually selected "good" hand poses and a large corpus of videos of continuous signing annotated with the sequence of signs appearing in them. Most sign languages, however, face a dearth of data, especially annotated data, and this is the case for Indian Sign Language (ISL). In this paper, we present a method for overcoming this issue in the first GAN-based SLS model created specifically for ISL. We use a combination of separate generators for the hands and body to overcome the need for hand-picked "good" hand images from training videos. We further refine the output with another network to remove the artefacts that appear when combining separate GAN outputs. We also experiment with creating continuous sign language output without requiring an annotated corpus, by stitching together individual signs obtained from a publicly available video lexicon of ISL. We show that our model performs competitively on these tasks, both in quantitative measures and in human perception tests.
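
The hand/body split the abstract describes amounts to compositing two generator outputs under a hand mask and passing the result through a refiner. A toy PyTorch sketch (all three networks and the mask construction are stand-ins, not the paper's architecture):

import torch
import torch.nn as nn

def conv_net(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, out_ch, 3, padding=1),
    )

body_gen = conv_net(3, 3)  # pose-conditioned body generator (stand-in)
hand_gen = conv_net(3, 3)  # dedicated hand generator (stand-in)
refiner = conv_net(3, 3)   # removes seams/artefacts from the composite

pose_map = torch.rand(1, 3, 128, 128)    # rendered pose conditioning
hand_mask = torch.zeros(1, 1, 128, 128)  # 1 where the hands are located
hand_mask[..., 40:70, 40:70] = 1.0

composite = hand_gen(pose_map) * hand_mask + body_gen(pose_map) * (1 - hand_mask)
frame = refiner(composite)  # artefact-removal pass over the stitched output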
Presenting Author: Shyam Krishna
Lab/Author homepage: http://mpl.iiitb.ac.in/
Paper: https://doi.org/10.1145/3490035.3490301
Attend Spotlight Presentation
Join Breakout Meeting Room (Password: iitj) for Poster 162
 
189 TexRGAN: A deep adversarial framework for Text Restoration from deformed handwritten documents
December 21, 14:30:00 to 15:45:00
Authors: Arnab Poddar (IIT Kharagpur)*; Akash Chakraborrty (IIT Kharagpur); Jayanta Mukhopadhyay (IIT Kharagpur); Prabir Kumar Biswas (IIT Kharagpur)
Abstract: Free-form handwritten document images commonly contain deformed text such as struck-out and underlined words. These deformations drastically degrade the performance of widely used document processing applications such as optical character recognition (OCR). Here we propose an end-to-end text image restoration system based on a generative adversarial network (GAN). The proposed model, TexRGAN, is perhaps the first attempt to restore deformed handwritten text such as struck-out and underlined words using a GAN, and it handles both strike-outs and underlines with a single deep network model. The proposed GAN uses a spatial as well as a structural loss to generate restored text images from a given deformed text image supplied as the condition. The network is trained in a weakly supervised manner to cope with the unavailability of training data and to avoid the cost and errors of manual annotation. We evaluate the performance of the proposed TexRGAN on various types of deformation and shapes of strike-through strokes, such as slanted strokes, nearly straight strokes, multiple strokes, underlines and crossed strokes. TexRGAN is also evaluated directly in terms of OCR performance. The evaluation metrics show robustness and applicability in real-world scenarios.
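
The abstract's "spatial as well as structural loss" suggests a generator objective combining an adversarial term with pixel-level and structure-level reconstruction terms. A hedged PyTorch sketch, using L1 for the spatial term and a gradient-matching term as a stand-in for the structural one (both choices and the weights are assumptions):

import torch
import torch.nn.functional as F

def spatial_loss(pred, target):
    # Pixel-wise L1 as a stand-in for the paper's spatial term.
    return F.l1_loss(pred, target)

def structural_loss(pred, target):
    # Stand-in structural term: match horizontal/vertical image gradients
    # so the stroke structure of the restored text is preserved.
    dx = lambda t: t[..., :, 1:] - t[..., :, :-1]
    dy = lambda t: t[..., 1:, :] - t[..., :-1, :]
    return F.l1_loss(dx(pred), dx(target)) + F.l1_loss(dy(pred), dy(target))

def generator_loss(pred, target, d_fake, lam_sp=100.0, lam_st=10.0):
    # Conditional-GAN generator objective: adversarial term plus the two
    # reconstruction terms (the weighting factors are assumptions).
    adv = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    return adv + lam_sp * spatial_loss(pred, target) + lam_st * structural_loss(pred, target)

pred, target = torch.rand(1, 1, 64, 256), torch.rand(1, 1, 64, 256)
d_fake = torch.randn(1, 1)  # discriminator logits for the restored image
loss = generator_loss(pred, target, d_fake)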
Presenting Author: Arnab Poddar
Lab/Author homepage: http://www.ecdept.iitkgp.ac.in/
Paper: https://doi.org/10.1145/3490035.3490306
Attend Spotlight Presentation
Join Breakout Meeting Room (Password: iitj) for Poster 189
