Date of Completion

12-9-2013

Embargo Period

12-6-2020

Keywords

natural language processing, web content mining, semantic processing, dynamic ontology development, collaboration system, information retrieval, search, biomedical literature mining, text visualization, document visualization, gene regulatory relationships, cell signalling, picture rendering, web search

Major Advisor

Dr. Dong-Guk Shin

Associate Advisor

Dr. Robert McCartney

Associate Advisor

Dr. Xiaoyan Wang

Field of Study

Computer Science and Engineering

Degree

Doctor of Philosophy

Open Access

Campus Access

Abstract

The Semantic Processing System (SPS) performs phrase search of web content. SPS takes a user query in natural language, converts it to a keyword query, expands the keyword query with synonyms, hypernyms, hyponyms, and meronyms, and submits the expanded query to a search engine. SPS then sifts through the search engine result pages, extracting grammatical and semantic information from each page to compute the page's relevance to the natural language query. SPS's relevance computation uses semantic matching of phrases rather than term-and-document frequency weighting, the method most commonly used by existing web search engines. SPS consults an ontology that is both "crowd-sourced," i.e., built collaboratively and incrementally by a large number of users, and "auto-learned," i.e., contextually inferred from sentences containing desired words. SPS would be suitable for biomedical literature mining, legal document review and discovery, and news/RSS feed monitoring because these areas are laden with prose text. We implemented a prototype SPS, experimented with it, and demonstrated that it outperforms a representative keyword-based search engine. The strength of SPS stems from its exploitation of phrase semantics, which conventional search engines do not use.
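The pipeline described in the abstract (natural language query, keyword query, ontology-based expansion, per-page relevance) can be illustrated with a small sketch. This is not the dissertation's implementation: the toy ontology, the stopword list, and the function names `to_keywords`, `expand`, and `page_relevance` are all hypothetical, and the sentence-level coverage score is only a crude stand-in for the semantic phrase matching that SPS performs.

```python
import re

# Hypothetical toy ontology: term -> related terms (synonyms, hypernyms, ...).
# The ontology described in the abstract is crowd-sourced and auto-learned;
# this hard-coded dictionary is an assumption made for illustration only.
TOY_ONTOLOGY = {
    "inhibits": {"represses", "suppresses", "blocks"},
    "gene": {"locus"},
}

# Assumed minimal stopword list, not taken from the source.
STOPWORDS = {"which", "what", "the", "a", "an", "of", "is", "does", "do"}


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def to_keywords(question):
    """Convert a natural language question to a keyword query."""
    return [t for t in tokenize(question) if t not in STOPWORDS]


def expand(keywords, ontology):
    """Map each keyword to itself plus its related terms from the ontology."""
    return {k: {k} | ontology.get(k, set()) for k in keywords}


def sentence_coverage(expanded, sentence):
    """Fraction of query keywords matched (directly or via expansion)."""
    if not expanded:
        return 0.0
    words = set(tokenize(sentence))
    hits = sum(1 for variants in expanded.values() if variants & words)
    return hits / len(expanded)


def page_relevance(expanded, page_text):
    """Score a page by its best-matching single sentence."""
    sentences = re.split(r"[.!?]+", page_text)
    return max((sentence_coverage(expanded, s) for s in sentences), default=0.0)


if __name__ == "__main__":
    query = expand(to_keywords("Which gene inhibits p53?"), TOY_ONTOLOGY)
    pages = {
        "page_a": "MDM2 is a gene that represses p53. It is well studied.",
        "page_b": "The p53 protein is famous. A gene was inhibited elsewhere.",
    }
    for name, text in pages.items():
        print(name, round(page_relevance(query, text), 2))
```

Scoring each sentence separately rewards pages in which the query terms, or their ontology relatives, occur together in one statement, whereas a frequency-based weighting would credit the same terms even when they are scattered across unrelated sentences.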
