
Institute of Mathematics, University of the Philippines Diliman, Quezon City, Metro Manila, Philippines
DOI: 10.7717/peerj-cs.596
Academic Editor: Thippa Reddy Gadekallu
Subject Areas: Computational Linguistics, Computer Vision, Natural Language and Speech, Optimization Theory and Computation, Scientific Computing and Simulation
Keywords: Baybayin, Optical character recognition, Support vector machine, Baybayin word recognition
Copyright: © 2021 Pino et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.

As part of the effort to reintroduce the script, in 2018 the Committee on Basic Education and Culture of the Philippine Congress approved House Bill 1022, the "National Writing System Act," which declares the Baybayin script as the Philippines' national writing system. Since then, Baybayin OCR has become a field of research interest. Numerous works have proposed different techniques for recognizing Baybayin scripts. However, all of those studies focus on classification and recognition at the character level. In this work, we propose an algorithm that provides the Latin transliteration of a Baybayin word in an image.
The method involves isolating each Baybayin character, classifying each character according to its equivalent syllable in Latin script, and finally concatenating the results to form the transliterated word. The system was tested using a novel dataset of Baybayin word images and achieved a competitive 97.9% recognition accuracy. Based on our review of the literature, this is the first work that recognizes Baybayin scripts at the word level. The proposed system can be used in automated transliterations of Baybayin texts transcribed in old books, tattoos, signage, graphic designs, and documents, among others.
Baybayin is a pre-colonial writing system primarily used by Tagalogs in the northern Philippines. Currently, Baybayin is an obsolete writing script, but it has attracted interest as a design for tattoos and for Filipino-themed apparel (Cabuay, 2009).
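
The three steps of the proposed method (isolate each character, classify it, concatenate the results) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the word image is a binary grid, characters are isolated at blank columns, and `toy_classify` is a hypothetical stand-in for the trained SVM character classifier. This section does not describe how the authors actually isolate characters.

```python
from typing import Callable, List

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def isolate_characters(word: Image) -> List[Image]:
    """Split a word image into character images at blank columns.

    Assumption: characters are separated by at least one empty column.
    """
    if not word:
        return []
    width = len(word[0])
    # A column counts as inked if any row has ink in it.
    inked = [any(row[x] for row in word) for x in range(width)]
    segments: List[Image] = []
    start = None
    # The trailing False is a sentinel that closes a run ending at the edge.
    for x, has_ink in enumerate(inked + [False]):
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            segments.append([row[start:x] for row in word])
            start = None
    return segments


def transliterate(word: Image, classify: Callable[[Image], str]) -> str:
    """Classify each isolated character and concatenate the syllables."""
    return "".join(classify(ch) for ch in isolate_characters(word))


def toy_classify(char: Image) -> str:
    """Hypothetical classifier: labels a character by its width only."""
    return {1: "n", 2: "ba", 3: "ya"}[len(char[0])]


word = [
    [1, 1, 0, 1, 1, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 1, 0, 0, 1],
]
print(transliterate(word, toy_classify))
```

Splitting on blank columns keeps a diacritic written above or below a character in the same segment as that character, but it would break apart any character whose strokes are separated horizontally, so a real system would need a more careful isolation step.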
Further, the bill requires local manufacturers to imprint Baybayin script, with its translation, on product labels, and assigns at least four (4) Executive Departments to promulgate the script (Lim & Manipon, 2019).
Baybayin is a left-to-right writing system of the Tagalog language. Its alphabet comprises 17 main characters, 14 of which are (syllabic) consonants, and the remaining three are vowels (see Fig.). Each consonant character is read with a default vowel sound '/a/'.
For example, an accent written below a consonant character may represent an accompanying vowel sound '/o/' or '/u/'. A diacritic placed above a consonant character may indicate the vowel '/e/' or '/i/'. Diacritics can also be used to silence the vowel sound. Figure 1B shows an instance of the distinguishable phonetic features of a Baybayin consonant character using diacritics.
With recent advancements and innovations, machine learning is one of the most powerful technologies in today's world. Everyone who uses technology has benefited from machine learning.
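
The reading rules for consonant characters and their diacritics can be written as a small lookup. The sketch below is illustrative only: the mark labels `"above"`, `"below"`, and `"silent"` and the function name `candidate_readings` are hypothetical, and because a mark above or below a character is ambiguous between two vowels, the function returns every candidate reading rather than choosing one.

```python
from typing import Optional, Tuple

# Candidate vowels for each diacritic position (labels are hypothetical).
CANDIDATE_VOWELS = {
    None: ("a",),          # no diacritic: default vowel /a/
    "above": ("e", "i"),   # diacritic above the consonant
    "below": ("o", "u"),   # diacritic below the consonant
    "silent": ("",),       # diacritic that silences the vowel
}


def candidate_readings(consonant: str, mark: Optional[str] = None) -> Tuple[str, ...]:
    """Return the possible Latin readings of a consonant character."""
    if mark not in CANDIDATE_VOWELS:
        raise ValueError(f"unknown diacritic label: {mark!r}")
    return tuple(consonant + vowel for vowel in CANDIDATE_VOWELS[mark])


print(candidate_readings("b"))
print(candidate_readings("b", "above"))
print(candidate_readings("b", "below"))
print(candidate_readings("b", "silent"))
```

Returning all candidates leaves the choice between '/e/' and '/i/', or between '/o/' and '/u/', to a later step such as a dictionary lookup, which is outside the scope of this sketch.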
One continuously developing application of machine learning is optical character recognition (OCR). OCR is a technology that automatically recognizes characters through an optical mechanism. It is designed to process and read images that consist entirely of text, in handwritten or typewritten form (Mithe, Indalkar & Divekar, 2013). OCR studies target one or several levels of recognition: page, line, block, word, or character (Ghosh, Dube & Shivaprasad, 2010).
Studies on Baybayin character recognition have started gaining popularity. The first Baybayin OCR study was done by Recario et al.

Recio & Mendoza (2019) applied a three-step edge detection approach to text images with Baybayin transcriptions.
In Pino, Mendoza & Sambayan (2021), a Baybayin character recognition system was proposed using SVM, a classification algorithm with extensive applications in data categorization (Bishop, 2006). SVM has attracted researchers because of its robustness and high recognition accuracy (Thomé, 2012). Applications of SVM can be found in various fields of science and engineering (Thomé, 2012; Sapankevych & Sankar, 2009; Nayak, Naik & Behera, 2015; Yang, 2004; Rivero, Lemence & Kato, 2017; Rivero & Kato, 2018; Do & Le, 2019; Le et al., 2019; Le, 2019; Byun & Lee, 2003).
This work aims to fill this research gap.
Various machine learning algorithms have been used in word-level recognition of different writing systems. Using Gabor filters and four classifier systems, Jaeger, Ma & Doermann (2005) reported a script identification system that discriminates Latin from Arabic, Korean, and Hindi writing systems. Their work yields a 97.39% recognition rate in categorizing Latin from Hindi script. With a 97.06% average recognition rate, Hangarge, Santosh & Pardeshi (2013) distinguished six Brahmic scripts, namely, Kannada, Devanagari, Tamil, Malayalam, Latin, and Telugu, using directional discrete cosine transforms and linear discriminant analysis.
An approach using an unsupervised feature learning algorithm and CNN for word-level recognition of Latin scripts was presented by Wang et al. (2012), in which they obtained an 83.9% accuracy. For Arabic script, Erlandson, Trenkle & Vogt (1996) proposed word-level recognition by extracting morphological details of an Arabic word image and matching its feature vectors. The study concluded with a 65% recognition accuracy. With 91.38% word recognition accuracy, Sankaran & Jawahar (2012) proposed a recognition scheme for printed Devanagari script using bidirectional long short-term memory (Bi-LSTM). The pyramid histogram of oriented gradient feature with an SVM classifier was used to recognize Bangla script at the word level, as reported by Bhunia et al.
Pham & Le-Hong (2017) demonstrated Vietnamese named entity recognition using a combination of Bi-LSTM, CNN, and conditional random field (CRF) models. Their work resulted in an 88.59% F1 score. A pragmatic mathematical approach was proposed by Gao et al. (2005) for Chinese word recognition. Their result obtained an accuracy of 95.7% using a vector space model-inspired classifier.
Another word-based Arabic script recognition system was reported by AlKhateeb et al. (2008), where they utilized a Discrete Cosine Transform (DCT) technique for feature extraction and a multilayer perceptron (MLP) neural network for classification. The study achieved an 82.5% recognition accuracy.
