Configurable Meta-search in the Human Resource Domain

Doctoral Thesis
Jürgen Dorn
The Web has drastically changed the online availability of data and the amount of electronically exchanged information. However, the volume and heterogeneity of the information available online via Websites or databases make it difficult for a user to visit every Website that is relevant to the information needed. Primary search tools, i.e. search engines, subject directories and social network search engines, are not enough to meet the requirements of the information seeker. Traditional search engines are based on keyword or phrase search and do not take the semantics of the word or phrase into account, so they may not return the results the user wants. Other traditional search tools suffer from low recall and precision and do not provide comprehensive coverage of the Web. To overcome these problems, meta-search engines aim to offer topic-specific search using multiple heterogeneous search engines. In the human resource domain, traditional methods of job and employee search, i.e. newspapers, magazines, advertising at job fairs, employment recruitment agencies and registration with search firms, are poorly suited to searching the modern employment market. In this dissertation, we propose a new configurable meta-search engine for the human resource domain that provides a platform for both meta-search providers and job seekers. Our aim is to combine the respective benefits of vertical search engines, meta-search engines and semantic search engines within a domain-specific context in which there is a well-understood domain ontology. We are concerned with techniques to support two key aspects of meta-search engines: i) meta-search engine creation by meta-search engine providers and ii) meta-search engine usage by information seekers. One of the important challenges in accessing heterogeneous and distributed data via a meta-search engine is schema and data matching and integration.
We describe an approach to schema and data integration for meta-search engines. During the matching and integration process, we need to handle syntactic, semantic and structural heterogeneity between multiple information sources. In this dissertation, our main objective is to resolve semantic conflicts. Our approach is a hybrid one, in that we use multiple matching criteria and multiple matchers. We employ several element-level, structure-level and ontology-based techniques during the integration process. A domain ontology serves as a global ontology and allows us to resolve semantic heterogeneity. Our matching process handles different mapping cardinalities (1:1, 1:n, n:1, m:n). The derived mappings are used to generate an integrated meta-search query interface, to support query processing in the meta-search engine, and to resolve semantic conflicts arising during result extraction from the source search engines. Experiments conducted in the job search domain show that the cumulative use of element-level, structure-level and ontology-based techniques increases the correctness of matching during the automatic integration of source search interfaces. The system supports meta-search providers in the quick development of meta-search engines and is able to understand and integrate schemas from different job search engines semantically. Meta-search providers can easily integrate new search engines into the meta-search engine. The system helps job seekers search for jobs without visiting multiple search engines, so they do not need to spend time combing through large numbers of job results to find the relevant ones. The system can semantically understand the job results and rank them for the job seeker. An important aspect of our meta-search in the human resource domain is that it has been designed by applying semantic Web technologies to solve the problems of meta-search developers and job seekers.
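The hybrid matching idea described above can be illustrated with a minimal sketch. It is not the system developed in this dissertation: the miniature ontology, the expected field types, the weights, the threshold and all function names are hypothetical and chosen only for illustration. The sketch combines one element-level matcher (string similarity), one structure-level matcher (field type agreement) and one ontology-based matcher (lookup of known labels), and it covers only the simple case of mapping one source field to one global concept.

```python
from difflib import SequenceMatcher

# Hypothetical miniature domain ontology: global concept -> known labels.
ONTOLOGY = {
    "JobTitle": {"job title", "position", "occupation"},
    "Location": {"location", "city", "place of work"},
    "Salary": {"salary", "pay", "compensation"},
}

# Hypothetical structural expectation: the form field type of each concept.
EXPECTED_TYPE = {"JobTitle": "text", "Location": "text", "Salary": "number"}


def normalize(label):
    """Lower-case a label and turn underscores and hyphens into single spaces."""
    return " ".join(label.lower().replace("_", " ").replace("-", " ").split())


def element_score(label, concept_labels):
    """Element-level matcher: best string similarity to any concept label."""
    name = normalize(label)
    return max(SequenceMatcher(None, name, c).ratio() for c in concept_labels)


def structure_score(field_type, concept):
    """Structure-level matcher: 1.0 if the field type fits the concept."""
    return 1.0 if field_type == EXPECTED_TYPE[concept] else 0.0


def ontology_score(label, concept_labels):
    """Ontology-based matcher: 1.0 if the label is a known term of the concept."""
    return 1.0 if normalize(label) in concept_labels else 0.0


def match_field(label, field_type, weights=(0.4, 0.2, 0.4), threshold=0.5):
    """Return the best-scoring global concept, or None below the threshold."""
    w_elem, w_struct, w_onto = weights  # assumed weights, not tuned values
    best_concept, best_score = None, 0.0
    for concept, labels in ONTOLOGY.items():
        score = (
            w_elem * element_score(label, labels)
            + w_struct * structure_score(field_type, concept)
            + w_onto * ontology_score(label, labels)
        )
        if score > best_score:
            best_concept, best_score = concept, score
    return best_concept if best_score >= threshold else None


if __name__ == "__main__":
    # Fields as they might appear on two different source search interfaces.
    for label, field_type in [("Position", "text"), ("job_titel", "text")]:
        print(label, "->", match_field(label, field_type))
```

In this sketch the misspelled label `job_titel` is unknown to the ontology matcher, but the element-level and structure-level scores together are enough to map it to the same concept as `Position`, which is the kind of cumulative effect the experiments refer to. Mappings with cardinalities other than 1:1 would need additional logic that is not shown here.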
We provide solutions for the automatic integration of data, structures and processes in the human resource domain into a meta-search engine by using our modelled domain ontology and multiple matchers. We have used HR-XML and different classification schemes in the construction of the domain ontology and of the integrated interface for the meta-search engine. The modelled domain ontology and HR-XML, which drive the generation of the integrated schema and the integrated interface, are used to understand the meaning of terms and to improve the quality of the search interface and the search results. Flexible and reusable design patterns have been introduced for the creation process, the usage process and the different components of the meta-search engine. The design pattern for the creation process helps the meta-search provider, and the design pattern for the usage process helps the job seeker. The design patterns for the different components of the meta-search engine help new developers to speed up the development process. Meta-search increases Web coverage for the job seeker by combining specialized search, search across multiple search engines and semantic search in one system. We hope that the new meta-search engine can help to reduce a country's unemployment rate.