World Journal of Advanced Research and Reviews
International Journal with High Impact Factor for fast publication of Research and Review articles


Enhanced named entity recognition algorithm for Filipino cultural and heritage texts


Jhan Lou Robantes 1, *, Andreo Serrano 1, Raymund Dioses 2 and Dan Michael Cortez 2

1 Bachelor of Science in Computer Science, College of Information Systems and Technology Management, Pamantasan Ng Lungsod Ng Maynila, Philippines.
2 College of Information Systems and Technology Management, Pamantasan Ng Lungsod Ng Maynila, Philippines.

Research Article
 

World Journal of Advanced Research and Reviews, 2024, 24(03), 2177-2186
Article DOI: 10.30574/wjarr.2024.24.3.3905
DOI url: https://doi.org/10.30574/wjarr.2024.24.3.3905

Received on 11 November 2024; revised on 18 December 2024; accepted on 20 December 2024

Named Entity Recognition (NER) is a core natural language processing task that extracts named entities from unstructured text and classifies them into predefined categories. While existing NER methods have shown success in general domains, they often face significant challenges when applied to specialized contexts such as Filipino cultural and historical texts. These challenges stem from the texts' unique linguistic features and diverse naming conventions. This research introduces an enhanced rule-based NER approach that specifically addresses these challenges. The approach builds on pattern-learning methods, incorporates character and token features, and uses positive and negative example sets. At its core, the system uses the curated Corpus of Historical Filipino and Philippine English (COHFIE), which serves as both training and evaluation data. To enrich the classification process, we used the International Committee for Documentation – Conceptual Reference Model (CIDOC-CRM), a cultural heritage framework, to provide a more nuanced categorization of entities based on their historical and cultural significance. Tested against existing Filipino-based models (calamanCy and RoBERTa), the enhanced model shows improvement in identifying entities related to Filipino culture (CUL) and historical terms (PER, ORG, LOC).
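As a rough illustration of the pattern-learning idea described in the abstract, the Python sketch below learns simple rules from positive and negative example sets, using one token feature (the preceding word) and one character feature (a coarse capitalization shape). It is not the authors' implementation. The function names, the feature choice, the min_support threshold, and the toy examples are all assumptions made for illustration, and none of the examples are drawn from COHFIE.

```python
import re
from collections import Counter

def token_shape(token):
    """Character feature: coarse shape of a token, e.g. 'Maynila' -> 'Xx'."""
    shape = re.sub(r"[A-ZÑ]", "X", token)
    shape = re.sub(r"[a-zñ]", "x", shape)
    shape = re.sub(r"[0-9]", "d", shape)
    return re.sub(r"(.)\1+", r"\1", shape)  # collapse repeated symbols

def learn_rules(positives, negatives, min_support=2):
    """Keep a (previous word, shape) pattern only if it covers at least
    min_support positive examples and no negative example."""
    pos_counts = Counter(
        (prev.lower(), token_shape(tok), label) for prev, tok, label in positives
    )
    neg_patterns = {(prev.lower(), token_shape(tok)) for prev, tok in negatives}
    rules = {}
    for (prev, shape, label), count in pos_counts.items():
        if count >= min_support and (prev, shape) not in neg_patterns:
            rules[(prev, shape)] = label
    return rules

def tag(sentence, rules):
    """Apply the learned rules to a whitespace-tokenized sentence."""
    entities = []
    prev = "<s>"  # hypothetical sentence-start marker
    for raw in sentence.split():
        word = raw.strip(".,;:!?")
        label = rules.get((prev.lower(), token_shape(word)))
        if label:
            entities.append((word, label))
        prev = word
    return entities

# Hypothetical toy examples: (previous word, token, label).
POSITIVES = [
    ("si", "Rizal", "PER"), ("si", "Bonifacio", "PER"),
    ("sa", "Maynila", "LOC"), ("sa", "Cavite", "LOC"),
    ("ang", "Katipunan", "ORG"), ("ang", "Kongreso", "ORG"),
]
# Negative examples: contexts that look similar but are not entities of interest.
NEGATIVES = [("sa", "bahay"), ("ang", "Lunes")]

RULES = learn_rules(POSITIVES, NEGATIVES)
print(tag("Nagpunta si Andres sa Malolos.", RULES))
# [('Andres', 'PER'), ('Malolos', 'LOC')]
```

In this toy setup the pattern ("ang", capitalized word) is discarded because the negative example "ang Lunes" matches it, which shows how a negative set can prune over-general rules. A realistic system would need richer features and conflict handling than this sketch provides.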

Named Entity Recognition; Natural Language Processing; Filipino Corpus; CIDOC-CRM

https://wjarr.com/node/17050


Jhan Lou Robantes, Andreo Serrano, Raymund Dioses and Dan Michael Cortez. Enhanced named entity recognition algorithm for Filipino cultural and heritage texts. World Journal of Advanced Research and Reviews, 2024, 24(03), 2177-2186. Article DOI: https://doi.org/10.30574/wjarr.2024.24.3.3905

Copyright © 2024 Author(s) retain the copyright of this article. This article is published under the terms of the Creative Commons Attribution License 4.0.
