Oral Session 1B

December 20, 4:00 PM to 4:45 PM

Chair: Anand Mishra

69 $S^3$DMT-Net: Improving soft sharing based multi-task CNN using task-specific distillation and cross-task interactions
December 20, 16:00:00 to 16:15:00
Authors: Ankit Jha (Indian Institute of Technology Bombay)*; Biplab Banerjee (Indian Institute of Technology Bombay); Subhasis Chaudhuri (Indian Institute of Technology Bombay)
Abstract: We address the problem of multi-task learning (MTL) for performing multiple related dense visual prediction tasks from single image inputs. Soft-sharing-based deep MTL ConvNets (CNNs) have a separate network for each task, with additional constraints on the model parameters. Although such MTL models have shown convincing performance on tasks including semantic segmentation, depth estimation, and surface normal estimation from monocular images, they have two inherent bottlenecks: i) the constraints imposed on such models do not, in general, leverage the inter-task information that could otherwise boost joint training, and ii) the performance of the individual task-specific networks is not explicitly optimized. We hypothesize that MTL performance can be enhanced comprehensively if these issues are taken into account in soft-sharing-based MTL models. To this end, we introduce a novel MTL architecture called $S^3$DMT-Net, where i) the task-specific networks are trained under the notion of self-distillation, which aims at inheriting the features from the deeper layers into the shallower layers, thus enriching the capacity of the network, and ii) the idea of cross-task interactions is utilized, where the feature maps of the task-specific encoders are shared amongst each other. We validate the model performance on two scene types: indoor (NYUv2 and Mini Taskonomy) and urban (CityScapes and ISPRS), where substantial improvements are recorded for all the tasks over the existing literature.
Presenting Author: Ankit Jha
Paper: https://doi.org/10.1145/3490035.3490274
75 Deformable Deep Networks for Instance Segmentation of Overlapping Multi Page Handwritten Documents
December 20, 16:15:00 to 16:30:00
Authors: Sowmya Aitha (IIIT Hyderabad); Sindhu Bollampalli (IIIT Hyderabad); Ravi Kiran Sarvadevabhatla (IIIT Hyderabad)*
Abstract: Digitizing the physical artifact via scanning often forms the first step in preserving historical handwritten manuscripts. To maximally utilize scanner surface area and minimize manual labor, multiple manuscripts are usually scanned together into a single image. Therefore, the first crucial task in manuscript content understanding is to ensure that each of the individual manuscripts within a scanned image can be isolated (segmented) on a per-instance basis. Existing deep network based approaches for manuscript layout understanding implicitly assume one or two manuscripts per image. Since this assumption may be routinely violated, there is a need for a precursor system which extracts individual manuscripts before downstream processing. Another challenge is the highly curved and deformed boundaries of manuscripts, causing them to often overlap with each other. To tackle such challenges, we introduce a new document image dataset called IMMI (Indic Multi Manuscript Images). To improve the efficiency of the dataset and aid deep network training, we also propose an approach which generates synthetic images to augment sourced non-synthetic images. We conduct experiments using modified versions of existing document instance segmentation frameworks. The results demonstrate the efficacy of the new frameworks for the task. Overall, our contributions enable robust extraction of individual historical manuscript pages. This, in turn, could potentially enable better performance on downstream tasks such as region-level instance segmentation within handwritten manuscripts and optical character recognition.
Presenting Author: Sowmya Aitha
Paper: https://doi.org/10.1145/3490035.3490278
129 Modeling Nuisance Classifier Towards Class-incremental Learning of Crowd-sourced Data
December 20, 16:30:00 to 16:45:00
Authors: Ramesh Tabib (KLE Technological University)*; T Santoshkumar (KLE Technological University); Dikshit Hegde (KLE Technological University); Adarsh Jamadandi (KLE Technological University); Uma Mudenagudi (KLE Technological University)
Abstract: In this paper, we propose a novel framework for tackling class-incremental learning of crowd-sourced data. The hallmark of crowd-sourced data is the presence of multiple participants with limited expertise who contribute incrementally, leading to varying amounts of data. Modern machine learning algorithms are plagued by catastrophic forgetting, where the model performance degrades as new classes are added progressively. Unlike methods that use standard datasets with a known number of image categories, incremental learning of crowd-sourced data is challenging, since the number of categories is not known a priori and the data itself is noisy. To solve this problem, we propose a novel training strategy that leads to a Modular Stacked Classifier architecture capable of adapting to incremental data. The elegance of this solution lies in its modularity, which allows for re-training the desired model for specific classes without affecting knowledge acquired hitherto. We achieve this through two steps. First, we formulate a deep clustering method to represent the data in latent space and use it to automatically discover novel categories in given unlabeled data. Second, we design a novel Nuisance Classifier to learn newly added classes while retaining the knowledge acquired from previously learned classes. We introduce a crowd-sourced heritage dataset to demonstrate our proposed framework, and extensive experiments show consistent improvements over existing methods.
Presenting Author: Ramesh Tabib
Paper: https://doi.org/10.1145/3490035.3490292