Title: Out-Door Urdu Text Detection and Recognition Using Deep Learning
Authors: Arafat, Syed Yasser
Keywords: Physical Sciences; Computer Science
Issue Date: 2021
Publisher: University of Engineering & Technology, Taxila
Abstract: Urdu is the national language of Pakistan and six Indian states, and thus covers more than 280 million people. The Urdu script is a superset of the Arabic script. Text detection and recognition are essential parts of a PhotoOCR system. Outdoor scene images containing Urdu text exhibit different writing styles, ligatures of multiple sizes, image degradation, and diacritics, all of which result in low recognition rates. This dissertation addresses three main issues to improve the state of Urdu PhotoOCR: first, construction of an outdoor scene dataset; second, application of various approaches to detect Urdu text in the wild; and third, recognition of the characters and ligatures written in a given scene image. The first issue, the lack of an available outdoor dataset, was addressed by constructing two types of datasets: synthetic images and real outdoor photographs containing Urdu text. Multiple synthetic image datasets containing Urdu text were constructed, the largest with 56K images; an Urdu-oriented-text dataset was also developed. Similarly, a dataset of more than 2K outdoor photographs (UrText) was developed. Annotations for the Urdu text datasets were also produced. For the second issue, Urdu text detection, various detectors with a modified input layer, anchor boxes, and preprocessing are tested and reported for Urdu text localization. At least three types of Urdu detectors are presented and compared. A benchmark result is given for the purely outdoor Urdu dataset UrText, with an AP of 0.48. An AP of 0.47 was achieved for the Urdu-Text dataset, and an AP of 0.9812 is demonstrated for 4K synthetic images containing Urdu text. For the third issue, recognition, a Two Stream Deep Neural Network (TSDNN) is developed to recognize Urdu text as a sequence of characters and ligatures. TSDNN attained partial sequence recognition rates of 94.90% and 95.20% on the 4K and 51K datasets, respectively, and a partial sequence recognition rate of 76.60% on real-world outdoor images. Furthermore, a Regression Residual Neural Network (RRNN) is developed to recognize oriented Urdu text. RRNN demonstrated 79% and 99% accuracy on 4K and 51K images, respectively. All three issues are handled, and the techniques presented in this dissertation are compared with existing approaches for outdoor scene images containing Urdu text. Consequently, the proposed techniques can be reliably applied to PhotoOCR systems to enhance their capabilities. Finally, directions for future improvement are given.
Gov't Doc #: 27006
Appears in Collections: PhD Thesis of All Public / Private Sector Universities / DAIs.

Files in This Item:
File: Syed Yasser Arafat Computer Science 2021 uet taxila.pdf 5.10.22.pdf
Description: Ph.D thesis
Size: 10 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.