Yousef S. Abuzir
Pages: 723-731
Published: 2 Jun 2014
Abstract: In recent years, several approaches have been proposed for extracting information from web pages. This research focuses on a key technique that combines crawling and ontology to discover knowledge from the web. In this paper, we present an intelligent crawling system that uses patterns and an ontology to extract particular information from web sites. The system was developed as an efficient tool for automatically constructing a researcher's profile from web pages. Moreover, searching and indexing methods, text mining, and computational linguistics are exploited to address this problem. We evaluated the performance of our system on an information extraction task over several real academic web sites. Experimental results show that, with extraction rules based on pattern discovery and ontology, our system achieves an average overall precision of 84.90%.
Keywords: information extraction, knowledge discovery, web mining, ontology, agent, crawl
Cite this article: Yousef S. Abuzir. INTELLIGENT AGENT FOR INFORMATION EXTRACTION BASED ON PATTERN DISCOVERY AND ONTOLOGY. Journal of International Scientific Publications: Materials, Methods & Technologies 8, 723-731 (2014).
© 2025 The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. This permission does not cover any third-party copyrighted material which may appear in the work requested.