Please use this identifier to cite or link to this item: http://prr.hec.gov.pk/jspui/handle/123456789/18169
Title: Prosody Modeling Approaches to Generate Prosody for Sindhi Speech Processing Applications
Authors: Mahar, Shahid Ali
Keywords: Computer Science
Computer & IT
Issue Date: 2021
Publisher: Shah Abdul Latif University, Khairpur.
Abstract: Much research has been done to produce high-quality synthetic speech. Researchers have contributed a number of techniques and methods to advance speech processing systems, keeping naturalness and intelligibility in view. Prosodic information contributes greatly to the quality of synthetic speech, so improving prosody modeling for speech synthesis is both necessary and interesting. Prosodic information can be obtained from pitch variation, duration, volume intensity, and voice quality. The rhythm and melody of a language can also be used to represent prosodic information. This research presents a series of Sindhi prosody experiments conducted to enrich speech processing systems using an advanced machine learning method, the artificial neural network. The main focus of this thesis is experimentation on recorded Sindhi sounds with respect to two prosodic features: sound duration and pitch. The central idea was to generate prosodic information for the Sindhi language with established prosody generation models, namely the Fujisaki parameter extraction model, Superposition of Functional Contours, and neural networks, in the same way as in previous practice. Undergraduate students were selected to record the sounds of the Sindhi language. Prosodic information was generated from the recorded Sindhi sounds using each of the above techniques individually as well as jointly. A reasonable framework, custom-made to cover broad areas and including individual models for all viewpoints as well as the possible extents of prosody, is also presented. The sounds required for the experiments were recorded and segmented into various units. Prosodic limits and boundaries were also predefined for the male and female speakers.
A speech database was developed that stores the segmented recordings as words, syllables, and phonemes in a binary format. To create the F0 contour, the Fujisaki parameter model was selected and implemented with a neural network approach. The results are calculated separately in terms of standard deviation for words of one to six phonemes. The Superposition of Functional Contours method is implemented using Python libraries, and the results are calculated as mean duration and pitch. The outcomes are measured using the obtained and observed parameters. Prosodic information is also obtained using the neural network approach. Because of the large number of experimental results, only a limited set of observations of syllable duration and syllable pitch is examined. Experimentally, it is observed that rich Sindhi prosody can be represented appropriately by the approaches adapted in this research work, as discussed above. The reason lies in realistic problems, where the quality as well as the quantity of the training data does not affect the consolidation of essential premises into the system. The outcomes of the experiments are acceptable, at a satisfactory level, and suitable for speech processing systems. Various acoustic properties are used to derive prosodic features. This study is oriented toward the efficient and effective extraction of prosodic features from Sindhi speech. It covers mainly the pitch and duration features; hence, more effort and time are needed to develop a Sindhi speech prosody generation model. Furthermore, this study may stand as a pioneering one that paves the way for the development of Sindhi speech prosody modeling. Researchers as well as software developers working on Sindhi speech processing applications would benefit from the results and discussion of this research work.
Gov't Doc #: 24292
URI: http://prr.hec.gov.pk/jspui/handle/123456789/18169
Appears in Collections: PhD Thesis of All Public / Private Sector Universities / DAIs.

Files in This Item:
File: Shahid Ali Mahar CS 2021 Salu Khairpur.pdf
Description: phd.Thesis
Size: 6.81 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.