You can find more information on my English profile page
Keywords:
The digital society,
DigiTech
Publications
-
Gurav, Aniket Anand; Jensen, Joakim; Krishnan, Narayanan C. & Chanda, Sukalpa
(2023).
ResPho(SC)Net: A Zero-Shot Learning Framework for Norwegian Handwritten Word Image Recognition.
In Pertusa, Antonio; Gallego, Antonio Javier; Sánchez, Joan Andreu & Domingues, Inês (Eds.),
Pattern Recognition and Image Analysis: 11th Iberian Conference, IbPRIA 2023, Alicante, Spain, June 27–30, 2023: Proceedings.
Springer Nature.
ISBN 978-3-031-36616-1.
pp. 182–196.
doi: 10.1007/978-3-031-36616-1_15.
-
Bhatt, Ravi; Rai, Anuj; Chanda, Sukalpa & Krishnan, Narayanan C.
(2022).
Pho(SC)-CTC—a hybrid approach towards zero-shot word image recognition.
International Journal on Document Analysis and Recognition.
ISSN 1433-2833.
doi: 10.1007/s10032-022-00407-6.
-
Srivastava, Abhishek; Chanda, Sukalpa & Pal, Umapada
(2022).
Exploiting Multi-Scale Fusion, Spatial Attention and Patch Interaction Techniques for Text-Independent Writer Identification.
In Wallraven, Christian; Liu, Qingshan & Nagahara, Hajime (Eds.),
Pattern Recognition: 6th Asian Conference, ACPR 2021, Jeju Island, South Korea, November 9–12, 2021: Revised Selected Papers: Part II.
Springer Nature.
ISBN 978-3-031-02444-3.
pp. 203–217.
doi: 10.1007/978-3-031-02444-3_15.
Abstract:
Text-independent writer identification is a challenging problem that differentiates between handwriting styles to decide the author of a handwritten text. Earlier writer identification relied on handcrafted features to reveal differences between writers. More recently, with the advent of convolutional neural networks, deep learning-based methods have evolved. In this paper, three different deep learning techniques - a spatial attention mechanism, multi-scale feature fusion, and a patch-based CNN - are proposed to effectively capture the differences between each writer's handwriting. Our methods are based on the hypothesis that handwritten text images have specific spatial regions that are more unique to a writer's style, that multi-scale features propagate characteristic features of individual writers, and that patch-based features give more general and robust representations that help discriminate handwriting from different writers. The proposed methods outperform various state-of-the-art methodologies on word-level and page-level writer identification on three publicly available datasets - CVL, Firemaker, and CERUG-EN - and give comparable performance on the IAM dataset.
-
Chowdhury, Tamal; Chanda, Sukalpa; Bhattacharya, Saumik; Biswas, Soma & Pal, Umapada
(2022).
Contact-Less Heart Rate Detection in Low Light Videos.
In Wallraven, Christian; Liu, Qingshan & Nagahara, Hajime (Eds.),
Pattern Recognition: 6th Asian Conference, ACPR 2021, Jeju Island, South Korea, November 9–12, 2021: Revised Selected Papers: Part I.
Springer Nature.
ISBN 978-3-031-02375-0.
pp. 77–91.
doi: 10.1007/978-3-031-02375-0_6.
Abstract:
Heart rate is considered an important and widely accepted biological indicator of a person's overall physiological state. Remotely measuring the heart rate has several benefits in medical and computational applications: it helps monitor the overall health of a person and analyse the effect of various physical, environmental, and emotional factors on an individual. Various methods have been proposed in recent years to measure the heart rate remotely using RGB videos. Most of these methods are based on skin-color intensity variations that are not visible to the naked eye but can be captured by a digital camera. Signal processing and traditional machine learning techniques have tried to solve this problem mainly through frequency-domain analysis of this time-varying signal. However, these methods rely on face detection and ROI selection in a sufficiently illuminated environment, and fail to produce any output in low-light conditions, which matters greatly for constant monitoring. Here, we propose a 1-dimensional convolutional neural network based framework that processes a magnified version of the time-series color-variation data in the frequency domain to build an autonomous heart rate monitoring system. With the help of artificial illumination, this method performs well even in low-light conditions. We have also collected our own dataset, which currently contains short frontal-face video clips of 50 subjects along with their ground-truth heart rate values in both normal and low lighting conditions. We compare our method with a heuristic signal processing approach to validate its efficacy.
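For orientation, the classical frequency-domain baseline that the paper compares against can be sketched as follows: take the mean skin-color intensity trace over time and pick the dominant FFT peak inside the physiologically plausible band. This is a minimal illustrative sketch, not the authors' CNN-based method; the function name and band limits are assumptions.

```python
import numpy as np

def heart_rate_bpm(signal, fps):
    """Estimate heart rate (beats per minute) from a mean skin-color
    intensity trace by picking the dominant FFT peak in the plausible
    0.7-4.0 Hz band (42-240 bpm), the classical frequency-domain baseline."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                         # remove DC component
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    power = np.abs(np.fft.rfft(x)) ** 2
    band = (freqs >= 0.7) & (freqs <= 4.0)   # physiologically plausible band
    return 60.0 * freqs[band][np.argmax(power[band])]
```

On a clean synthetic trace pulsing at 1.2 Hz sampled at 30 fps, this returns roughly 72 bpm; real traces additionally need the motion/illumination robustness the paper's learned method provides.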
-
Chanda, Sukalpa; Haitink, Daniel; Prasad, Prashant Kumar; Baas, Jochem; Pal, Umapada & Schomaker, Lambert
(2021).
Recognizing Bengali Word Images - A Zero-Shot Learning Perspective.
In Cucchiara, Rita (Ed.),
Proceedings of the 25th International Conference on Pattern Recognition, ICPR2020.
IEEE (Institute of Electrical and Electronics Engineers).
ISBN 978-1-7281-8808-9.
pp. 5603–5610.
doi: 10.1109/ICPR48806.2021.9412607.
Abstract:
Zero-Shot Learning (ZSL) techniques can classify instances of a class that was never seen during training, making them apt for real-life classification problems where it is not possible to train a system with annotated data for all possible class types. This work investigates recognition of word images written in Bengali script in a ZSL framework. The proposed approach performs zero-shot word recognition by coupling deep-learned features procured from various CNN architectures with 13 basic shapes/stroke primitives commonly observed in Bengali script characters. Following the conventions of the ZSL framework, those 13 basic shapes are termed "signature/semantic attributes". The obtained results are promising; evaluation was carried out in a five-fold cross-validation setup dealing with samples from 250 word classes.
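The core ZSL mechanic described above, predicting an attribute vector for an image and then matching it against per-class attribute signatures, can be sketched as follows (the function name and the Euclidean distance are illustrative assumptions, not the paper's exact pipeline):

```python
import numpy as np

def zsl_predict(attr_pred, class_signatures):
    """Zero-shot prediction: match a predicted attribute vector against
    per-class attribute signatures and return the nearest class, which
    may be a class never seen during training."""
    best, best_d = None, float("inf")
    for cls, sig in class_signatures.items():
        d = np.linalg.norm(np.asarray(attr_pred, float) - np.asarray(sig, float))
        if d < best_d:
            best, best_d = cls, d
    return best
```

In the paper's setting the signature for each word class would be built from the 13 stroke-primitive attributes, and the attribute predictor would be a CNN.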
-
Srivastava, Abhishek; Chanda, Sukalpa; Jha, Debesh; Riegler, Michael; Halvorsen, Pål & Johansen, Dag
[See all 7 authors of this article]
(2021).
PAANet: Progressive Alternating Attention for Automatic Medical Image Segmentation.
In
4th International Conference on Bio-Engineering for Smart Technologies (BioSMART): [Proceedings].
IEEE (Institute of Electrical and Electronics Engineers).
ISBN 978-1-6654-0810-3.
doi: 10.1109/BioSMART54244.2021.9677844.
Abstract:
Medical image segmentation can provide detailed information for clinical analysis, which can be useful for scenarios where the detailed location of a finding is important. Knowing the location of a disease can play a vital role in treatment and decision-making. Convolutional neural network (CNN) based encoder-decoder techniques have advanced the performance of automated medical image segmentation systems. Several such CNN-based methodologies utilize techniques such as spatial- and channel-wise attention to enhance performance. Another technique that has drawn attention in recent years is residual dense blocks (RDBs). The successive convolutional layers in densely connected blocks are capable of extracting diverse features with varied receptive fields, thus enhancing performance. However, consecutive stacked convolutional operators may not necessarily generate features that facilitate the identification of the target structures. In this paper, we propose a progressive alternating attention network (PAANet). We develop progressive alternating attention dense (PAAD) blocks, which construct a guiding attention map (GAM) after every convolutional layer in the dense blocks using features from all scales. The GAM allows the following layers in the dense blocks to focus on the spatial locations relevant to the target region. Every alternate PAAD block inverts the GAM to generate a reverse attention map, which guides ensuing layers to extract boundary- and edge-related information, refining the segmentation process. Our experiments on three different biomedical image segmentation datasets show that PAANet achieves favorable performance compared to other state-of-the-art methods.
-
Rai, Anuj; Krishnan, Narayanan C. & Chanda, Sukalpa
(2021).
Pho(SC)Net: An Approach Towards Zero-Shot Word Image Recognition in Historical Documents.
In Lladós, Josep; Lopresti, Daniel & Uchida, Seiichi (Eds.),
Document Analysis and Recognition – ICDAR 2021: 16th International Conference, Lausanne, Switzerland, September 5–10, 2021, Proceedings, Part I.
Springer.
ISBN 978-3-030-86548-1.
pp. 19–33.
doi: 10.1007/978-3-030-86549-8_2.
Abstract:
Annotating words in a historical document image archive for word-image recognition demands time and skilled human resources (such as historians and paleographers). In a real-life scenario, obtaining sample images for all possible words is also not feasible. However, zero-shot learning methods can aptly be used to recognize unseen/out-of-lexicon words in such historical document images. Building on previous state-of-the-art methods for word spotting and recognition, we propose a hybrid representation that considers the character's shape appearance to differentiate between two different words and is shown to be more effective in recognizing unseen words. This representation is termed the Pyramidal Histogram of Shapes (PHOS), derived from PHOC, which embeds information about the occurrence and position of characters in the word. The two representations are then combined, and experiments were conducted to examine the effectiveness of an embedding that has properties of both PHOS and PHOC. Encouraging results were obtained on two publicly available historical document datasets and one synthetic handwritten dataset, which justifies the efficacy of "PHOS" and the combined "Pho(SC)" representation.
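For orientation, the PHOC side of the representation can be sketched as follows: at each pyramid level the word is split into equal regions, and a binary flag records whether each alphabet character occurs in each region. This is a minimal sketch of the general PHOC idea; the levels, alphabet, and the 50%-overlap occupancy rule are common conventions, not necessarily the paper's exact configuration.

```python
def build_phoc(word, alphabet="abcdefghijklmnopqrstuvwxyz", levels=(2, 3)):
    """Minimal Pyramidal Histogram of Characters (PHOC): per pyramid level,
    split the word into equal regions and flag which alphabet characters
    occur in each region."""
    word = word.lower()
    n = len(word)
    if n == 0:
        return []
    vec = []
    for level in levels:
        for region in range(level):
            lo, hi = region / level, (region + 1) / level
            flags = [0] * len(alphabet)
            for i, ch in enumerate(word):
                # normalized occupancy interval of the i-th character
                c_lo, c_hi = i / n, (i + 1) / n
                overlap = min(hi, c_hi) - max(lo, c_lo)
                # count the character if most of its extent is in the region
                if overlap / (c_hi - c_lo) >= 0.5 and ch in alphabet:
                    flags[alphabet.index(ch)] = 1
            vec.extend(flags)
    return vec
```

PHOS follows the same pyramidal scheme but counts shape primitives instead of character identities, which is what lets it generalize to unseen words.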
-
Shivakumara, Palaiahnakote; Jain, Tanmay; Surana, Nitish; Pal, Umapada; Lu, Tong & Blumenstein, Michael
[See all 7 authors of this article]
(2021).
A Connected Component-Based Deep Learning Model for Multi-type Struck-Out Component Classification.
In Smith, Elisa H. Barney & Pal, Umapada (Eds.),
Document Analysis and Recognition – ICDAR 2021 Workshops: Lausanne, Switzerland, September 5–10, 2021, Proceedings, Part II.
Springer.
ISBN 978-3-030-86158-2.
pp. 158–173.
doi: 10.1007/978-3-030-86159-9_11.
Abstract:
Due to the presence of struck-out handwritten words in document images, the performance of different methods degrades for several important applications, such as handwriting recognition, writer/gender/fraudulent-document identification, document age estimation, writer age estimation, analysis of a person's normal/abnormal behavior, and descriptive answer evaluation. This work proposes a new method that combines connected component analysis for text component detection with deep learning for classification of struck-out and non-struck-out words. For text component detection, the proposed method finds the stroke width to detect edges of text in images, and then performs smoothing operations to remove noise. Furthermore, morphological operations are performed on the smoothed images to label connected components as text by fixing bounding boxes. Inspired by the great success of deep learning models, we explore DenseNet for classifying struck-out and non-struck-out handwritten components, taking text components as input. Experimental results on our dataset demonstrate that the proposed method outperforms existing methods in terms of classification rate.
-
Chowdhury, Tamal; Shivakumara, Palaiahnakote; Pal, Umapada; Lu, Tong; Ramachandra, Raghavendra & Chanda, Sukalpa
(2021).
DCINN: Deformable Convolution and Inception Based Neural Network for Tattoo Text Detection Through Skin Region.
In Lladós, Josep; Lopresti, Daniel & Uchida, Seiichi (Eds.),
Document Analysis and Recognition – ICDAR 2021: 16th International Conference, Lausanne, Switzerland, September 5–10, 2021, Proceedings, Part II.
Springer.
ISBN 978-3-030-86330-2.
pp. 335–350.
doi: 10.1007/978-3-030-86331-9_22.
-
Srivastava, Abhishek; Jha, Debesh; Chanda, Sukalpa; Pal, Umapada; Johansen, Håvard D. & Johansen, Dag
[See all 9 authors of this article]
(2021).
MSRF-Net: A Multi-Scale Residual Fusion Network for Biomedical Image Segmentation.
IEEE Journal of Biomedical and Health Informatics.
ISSN 2168-2194.
26(5),
pp. 2252–2263.
doi: 10.1109/JBHI.2021.3138024.
Full text in research archive
Abstract:
Methods based on convolutional neural networks have improved the performance of biomedical image segmentation. However, most of these methods cannot efficiently segment objects of variable sizes or train on small and biased datasets, which are common in biomedical use cases. While methods exist that incorporate multi-scale fusion approaches to address the challenges arising from variable sizes, they usually use complex models that are more suitable for general semantic segmentation problems. In this paper, we propose a novel architecture called the Multi-Scale Residual Fusion Network (MSRF-Net), which is specially designed for medical image segmentation. The proposed MSRF-Net is able to exchange multi-scale features of varying receptive fields using a Dual-Scale Dense Fusion (DSDF) block. Our DSDF block can exchange information rigorously across two different resolution scales, and our MSRF sub-network uses multiple DSDF blocks in sequence to perform multi-scale fusion. This allows the preservation of resolution, improved information flow, and propagation of both high- and low-level features to obtain accurate segmentation maps. MSRF-Net captures object variabilities and provides improved results on different biomedical datasets. Extensive experiments demonstrate that the proposed method outperforms cutting-edge medical image segmentation methods on four publicly available datasets, achieving Dice Coefficients (DSC) of 0.9217, 0.9420, 0.9224, and 0.8824 on Kvasir-SEG, CVC-ClinicDB, the 2018 Data Science Bowl dataset, and the ISIC-2018 skin lesion segmentation challenge dataset, respectively. We further conducted generalizability tests and achieved DSC of 0.7921 and 0.7575 on CVC-ClinicDB and Kvasir-SEG, respectively.
-
Chanda, Sukalpa; Chakrapani, Asish; Brun, Anders; Hast, Anders; Pal, Umapada & Doermann, David
(2019).
Face Recognition - A One-Shot Learning Perspective.
In Yetongnon, Kokou; Dipanda, Albert; Sanniti di Baja, Gabriella; Gallo, Luigi & Chbeir, Richard (Eds.),
2019 15th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS 2019).
IEEE (Institute of Electrical and Electronics Engineers).
ISBN 978-1-7281-5686-6.
pp. 113–119.
doi: 10.1109/SITIS.2019.00029.
-
Pal, Srikanta; Chanda, Sukalpa; Pal, Umapada; Franke, Katrin & Blumenstein, Michael
(2012).
Off-line signature verification using G-SURF.
In Abraham, Ajith; Zomaya, Albert; Ventura, Sebastián; Yager, Ronald; Snasel, Vaclav; Muda, Azah Kamilah & Samuel, Philip (Eds.),
Proceedings of the 12th International Conference on Intelligent Systems Design and Applications (ISDA); 27–29 November 2012; Kochi, India.
IEEE conference proceedings.
ISBN 978-1-4673-5119-5.
pp. 586–591.
doi: 10.1109/isda.2012.6416603.
Abstract:
In the field of biometric authentication, automatic signature identification and verification has been a strong research area because of the social and legal acceptance and extensive use of the written signature as an easy method of authentication. Signature verification is a process in which the questioned signature is examined in detail to determine whether it belongs to the claimed person or not. Signatures provide a secure means of confirmation and authorization in legal documents, so signature identification and verification have become essential components in automating the rapid processing of documents containing embedded signatures. Part-based signature verification can be useful when a questioned signature has lost its original shape due to inferior scanning quality. To address this adverse scenario, we propose a new feature-encoding technique based on the amalgamation of Gabor filter-based features with SURF features (G-SURF). Features generated from a signature are fed to a Support Vector Machine (SVM) classifier. For experimentation, 1500 (50×30) forgeries and 1200 (50×24) genuine signatures from the GPDS signature database were used. A verification accuracy of 97.05% was obtained.
-
Imran, Ali Shariq; Chanda, Sukalpa; Alaya Cheikh, Faouzi; Franke, Katrin & Pal, Umapada
(2012).
Cursive Handwritten Segmentation and Recognition for Instructional Videos.
In
SITIS 2012, The 8th International Conference on Signal Image Technology and Internet Based Systems; 25–29 November 2012, Sorrento-Naples, Italy.
IEEE (Institute of Electrical and Electronics Engineers).
ISBN 978-1-4673-5152-2.
pp. 155–160.
Abstract:
In this paper, we address the issues pertaining to segmentation and recognition of cursive handwritten text from chalkboard lecture videos. Recognizing handwritten text is a challenging problem in instructor-led lecture video, and the task gets even tougher with varying handwriting styles and blackboard types. Unlike handwritten text on whiteboards and electronic boards, chalkboards present serious challenges such as a lack of uniform edge density, weak chalk contrast against the blackboard, leftover chalk-dust noise from erasing, and many others. Moreover, the varying color of boards and the illumination changes within the video make it impossible to use trivial thresholding techniques for content extraction. Many universities throughout the world still rely heavily on the chalkboard as a mode of instruction, so recognizing this lecture content will not only aid indexing and retrieval applications but also help in understanding high-level video semantics, useful for Multimedia Learning Objects (MLO). To address these challenges, we propose a system for segmentation and recognition of cursive handwritten text from chalkboard lecture videos. We first create a foreground model to segment out the blackboard background. We then segment the text characters using a one-dimensional vertical histogram. Finally, we extract gradient-based features and classify the characters using an SVM classifier. We obtained an encouraging accuracy of 86.28% on 5-fold cross-validation.
-
Chanda, Sukalpa; Franke, Katrin & Pal, U
(2012).
Clustering Document Fragments using Background Color and Texture Information.
Proceedings of SPIE, the International Society for Optical Engineering.
ISSN 0277-786X.
8297.
doi: 10.1117/12.910567.
Abstract:
Forensic analysis of questioned documents can sometimes be extensively data-intensive. A forensic expert might need to analyze a heap of document fragments, and in such cases, to ensure reliability, he/she should focus only on relevant evidence hidden in those fragments. Relevant document retrieval requires finding similar document fragments, and one way of obtaining such similar documents is through the fragments' physical characteristics, such as color and texture. In this article we propose an automatic scheme to retrieve similar document fragments based on the visual appearance of the document paper and its texture. Multispectral color characteristics are captured using biologically inspired color-differentiation techniques, by projecting document color characteristics into the Lab color space. Gabor filter-based texture analysis is used to identify document texture. Document fragments from the same source are expected to have similar color and texture. For clustering similar document fragments in our test dataset we use a Self-Organizing Map (SOM) of dimension 5×5, with the document color and texture information as features. We obtained an encouraging accuracy of 97.17% on 1063 test images.
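The clustering stage described above can be sketched with a minimal Self-Organizing Map over per-fragment feature vectors. This is an illustrative sketch only: the feature extraction (Lab color statistics, Gabor texture responses) is omitted, and the training schedule and function names are assumptions rather than the paper's exact setup.

```python
import numpy as np

def train_som(features, grid=(5, 5), epochs=20, lr0=0.5, sigma0=1.5, seed=0):
    """Train a minimal Self-Organizing Map; returns the (h, w, dim) weight grid."""
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.random((h, w, features.shape[1]))
    n_steps, t = epochs * len(features), 0
    ii, jj = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    for _ in range(epochs):
        for x in features[rng.permutation(len(features))]:
            # best-matching unit: node whose weight vector is closest to x
            d = np.linalg.norm(weights - x, axis=2)
            bi, bj = np.unravel_index(np.argmin(d), d.shape)
            # linearly decay learning rate and neighbourhood radius
            frac = t / n_steps
            lr, sigma = lr0 * (1 - frac), sigma0 * (1 - frac) + 1e-3
            # pull the BMU and its grid neighbours toward x
            g = np.exp(-((ii - bi) ** 2 + (jj - bj) ** 2) / (2 * sigma**2))
            weights += lr * g[..., None] * (x - weights)
            t += 1
    return weights

def assign_cluster(weights, x):
    """Map a feature vector to its best-matching SOM node (cluster id)."""
    d = np.linalg.norm(weights - x, axis=2)
    return np.unravel_index(np.argmin(d), d.shape)
```

After training, fragments that land on the same (or neighbouring) nodes are treated as coming from a similar paper source.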
-
Chanda, Sukalpa; Pal, Umapada & Franke, Katrin
(2012).
Font identification — In context of an Indic script.
In Saito, Hideo & Aoki, Yoshimitsu (Eds.),
Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012); November 11–15, 2012; Tsukuba, Japan.
IEEE conference proceedings.
ISBN 978-4-9906441-0-9.
pp. 1655–1658.
Abstract:
Font can be used as a notion of similarity amongst multiple documents written in the same script: documents with a specific font could be retrieved automatically from a huge digital document repository. Optical Font Recognition can therefore be a useful pre-processing step in an automated questioned-document analysis system for sorting documents with similar fonts. We propose a scheme to identify 10 different fonts for an Indic script (Bangla). Curvature-based features are extracted from segmented characters and fed to a Support Vector Machine (SVM) classifier, which determines the font type for each segmented character obtained from a document. Font identification for the document is then performed by majority voting over all characters across the 10 font classes. Using a Multiple Kernel SVM classifier we obtained 98.5% accuracy on 400 test documents (40 documents per font type).
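The document-level decision described above reduces to a majority vote over per-character predictions, which can be sketched as follows (the per-character labels would come from the paper's SVM over curvature features; the function name is an assumption):

```python
from collections import Counter

def document_font(char_predictions):
    """Pick a document's font label by majority vote over per-character
    predictions (one label per segmented character)."""
    char_predictions = list(char_predictions)
    if not char_predictions:
        raise ValueError("no character predictions to vote on")
    return Counter(char_predictions).most_common(1)[0][0]
```

Voting over many characters makes the document-level decision far more robust than any single per-character classification.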
-
Chanda, Sukalpa; Pal, Umapada & Franke, Katrin
(2012).
Similar shaped part-based character recognition using G-SURF.
In Abraham, Ajith; Zomaya, Albert; Wadhai, Vijay; Yager, Ronald; Muda, Azah Kamilah & Koeppen, Mario (Eds.),
Proceedings of the 12th International Conference on Hybrid Intelligent Systems (HIS); 4–7 December 2012; Pune, India.
IEEE conference proceedings.
ISBN 978-1-4673-5116-4.
pp. 179–184.
doi: 10.1109/his.2012.6421330.
Abstract:
Classification/misclassification of similar-shaped characters largely affects OCR accuracy. Occlusion or insertion of a part of a character (due to inferior scanning quality) can also make it look like another character type. In such adverse situations a part-based character recognition system can be more effective. To address this scenario, we propose a new feature-encoding technique based on the amalgamation of Gabor filter-based features with SURF features (G-SURF). Features generated from a character are fed to a Support Vector Machine (SVM) classifier. We obtained encouraging accuracy on similar-shaped characters from three different scripts.
-
Chanda, Sukalpa; Franke, Katrin; Pal, Umapada & Wakabayashi, Tetsushi
(2010).
Text Independent Writer Identification for Bengali Script.
In
Proceedings of the 20th International Conference on Pattern Recognition (ICPR) 2010.
IEEE (Institute of Electrical and Electronics Engineers).
ISBN 978-1-4244-7542-1.
pp. 2005–2008.
Abstract:
Automatic identification of an individual based on his/her handwriting characteristics is an important forensic tool. In a computational forensic scenario, the presence of a large amount of text/information in a questioned document cannot always be ensured, and compromising system reliability in such situations is not desirable. We propose a system to handle such adverse situations in the context of Bengali script. Experiments with a discrete directional feature and a gradient feature are reported, with a Support Vector Machine (SVM) as classifier. We obtained promising writer-identification accuracies of 95.19% for the first top choice and 99.03% when considering the first three top choices.
-
Chanda, Sukalpa; Pal, Umapada; Franke, Katrin & Kimura, Fumitaka
(2010).
Script Identification - A Han and Roman Script Perspective.
In
Proceedings of the 20th International Conference on Pattern Recognition (ICPR) 2010.
IEEE (Institute of Electrical and Electronics Engineers).
ISBN 978-1-4244-7542-1.
pp. 2708–2711.
Abstract:
All Han-based scripts (Chinese, Japanese, and Korean) possess similar visual characteristics, so developing a system to identify Chinese, Japanese, and Korean scripts on a single document page is quite challenging. A Han-based document page might also contain Roman script. A multi-script OCR system dealing with Chinese, Japanese, Korean, and Roman scripts demands identification of scripts before executing the respective OCR modules. We propose a system to address this problem using directional features along with a Gaussian kernel-based Support Vector Machine. We obtained promising script-identification accuracies of 98.39% at character level and 99.85% at block level, when no rejection was considered.
-
Chanda, Sukalpa; Franke, Katrin & Pal, Umapada
(2010).
Document-Zone Classification in Torn Documents.
In Kellenberger, Patrick (Ed.),
ICFHR 2010: 12th International Conference on Frontiers in Handwriting Recognition, ICFHR 2010, Kolkata, India, 16–18 November 2010.
IEEE (Institute of Electrical and Electronics Engineers).
ISBN 978-0-7695-4221-8.
pp. 25–30.
-
Chanda, Sukalpa; Franke, Katrin & Pal, Umapada
(2010).
Structural handwritten and machine print classification for sparse content and arbitrary oriented document fragments.
In Shin, Dongwan; Ossowski, Sascha & Schumacher, Michael (Eds.),
Proceedings of the 2010 ACM Symposium on Applied Computing, Sierre, Switzerland, March 22–26, 2010 (SAC'10).
ACM Publications.
ISBN 978-1-60558-639-7.
pp. 18–22.
Abstract:
Discriminating handwritten and printed text is a challenging task in an arbitrary-orientation scenario. The task gets even tougher when the text content is by nature sparse in the document, e.g. in torn document pieces. We propose a system for discriminating handwritten and printed text in the context of sparse data and arbitrary orientation. A chain-code feature is used with a Support Vector Machine (SVM) classifier for this purpose. Prior to feature extraction and classification, some preprocessing steps (such as region growing and angle estimation using Principal Component Analysis) are performed to resolve the arbitrary-orientation issue. We obtained a promising accuracy of 96.90%, even when the document consists of sparse data with arbitrary orientation.
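The PCA-based angle estimation mentioned above rests on a simple idea: the first principal component of the foreground pixel coordinates points along the text's dominant axis. A minimal sketch under that assumption (the function name and degree convention are illustrative, not the paper's exact procedure):

```python
import numpy as np

def estimate_orientation(points):
    """Estimate the dominant orientation, in degrees within [0, 180), of a
    set of 2-D foreground pixel coordinates via PCA: the eigenvector of the
    covariance matrix with the largest eigenvalue is the principal axis."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered.T))
    major = eigvecs[:, np.argmax(eigvals)]  # first principal component
    return np.degrees(np.arctan2(major[1], major[0])) % 180.0
```

Once the angle is known, the fragment can be rotated upright before chain-code feature extraction.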
-
Chanda, Sukalpa; Pal, Srikanta; Franke, Katrin & Pal, Umapada
(2009).
Two-stage Approach for Word-wise Script Identification.
In
ICDAR '09: 10th International Conference on Document Analysis and Recognition, 2009.
IEEE (Institute of Electrical and Electronics Engineers).
ISBN 978-1-4244-4500-4.
pp. 926–930.
Abstract:
A two-stage approach for word-wise identification of English (Roman), Devanagari, and Bengali (Bangla) scripts is proposed. This approach balances the tradeoff between recognition accuracy and processing speed. The first stage identifies scripts at high speed, but with lower accuracy on noisy data. The more advanced second stage processes only those samples that yield low recognition confidence in the first stage. For both stages, a rough character segmentation is performed and features are computed on the segmented character components. The first stage uses a 64-dimensional chain-code-histogram feature, while the second stage uses 400-dimensional gradient features. Final classification of a word to a particular script is done via majority voting over the recognized character components of the word. Extensive experiments with various confidence scores were conducted and are reported here. The overall recognition accuracy and speed are remarkable: correct classification of 98.51% on 11,123 test words is achieved, even when the recognition confidence threshold is as high as 95% at both stages.
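The confidence-gated cascade described above can be sketched as follows (the classifier interfaces and the 0.95 default threshold are illustrative assumptions):

```python
def two_stage_classify(sample, fast_clf, slow_clf, threshold=0.95):
    """Cascade: accept the fast classifier's answer when it is confident,
    otherwise fall back to the slower, more accurate classifier.

    fast_clf / slow_clf: callables returning a (label, confidence) pair.
    """
    label, conf = fast_clf(sample)
    if conf >= threshold:
        return label           # fast path: confident first-stage decision
    label, conf = slow_clf(sample)
    return label               # slow path: second-stage decision
```

Raising the threshold routes more samples to the expensive second stage, trading speed for accuracy, which is exactly the tradeoff the paper tunes.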
See all works in Cristin
-
Adak, Chandranath; Sharma, Priyanshi & Chanda, Sukalpa
(2022).
DAZeTD: Deep Analysis of Zones in Torn Documents.
-
Srivastava, Abhishek; Chanda, Sukalpa; Jha, Debesh; Pal, Umapada & Ali, Sharib
(2022).
GMSRF-Net: An improved generalizability with global multi-scale residual fusion network for polyp segmentation.
-
Chowdhury, Tamal; Chanda, Sukalpa; Bhattacharya, Saumik; Biswas, Soma & Pal, Umapada
(2021).
Contact-Less Heart Rate Detection in Low Light Videos.
-
Srivastava, Abhishek; Chanda, Sukalpa; Jha, Debesh; Riegler, Michael; Halvorsen, Pål & Johansen, Dag
[Vis alle 7 forfattere av denne artikkelen]
(2021).
PAANet: Progressive Alternating Attention for Automatic Medical Image Segmentation.
-
Chanda, Sukalpa
(2020).
Bengali Place Name Recognition - Comparative Analysis using Different CNN Architectures.
-
Chanda, Sukalpa & Hast, Anders
(2019).
Face Recognition - A One-Shot Learning Perspective.
-
Chanda, Sukalpa
(2019).
One-Shot Learning-Based Handwritten Word Recognition.
-
Chanda, Sukalpa
(2019).
Finding Logo and Seal in Historical Document Images - An Object Detection based Approach.
See all works in Cristin
Published Feb. 10, 2022 09:49
Last modified Feb. 10, 2022 09:49