Characterization of soft-computing-based semantic distances for internet search
- Serrano Guerrero, Jesús
- José Ángel Olivas Varela Supervisor
Defending university: Universidad de Castilla-La Mancha
Defense date: 24 September 2009
- Elie Sánchez Sánchez Chair
- Francisco Pascual Romero Chicharro Secretary
- Alejandro Sobrino Member
- Enrique Herrera Viedma Member
- Manuel Prieto Member
Type: Thesis
Abstract
Metasearch engines are becoming a powerful tool in our lives because of the tremendous explosion in the amount of unstructured data, both in internal corporate document collections and in the immense and growing number of document sources on the World Wide Web. Many well-known problems and techniques related to search engines are being rediscovered as new challenges due to the scale of metasearch engines. For example, query expansion is a technique for boosting the performance of a search engine through different strategies, such as inserting query terms, generating new forms from existing query terms, or dropping query terms in order to be more restrictive. When terms from relevant documents are to be added to the query, two questions have traditionally been important: (i) How many documents should a user judge in order to extract relevant terms, e.g., the top 10 or the top 100? (ii) What measure should be used to judge whether a term that occurs in relevant documents is important enough to be dropped from or added to the query? These questions are now too simple; metasearch engines raise new, more complex ones: (i) How many documents should a user judge in order to extract relevant terms, and which search engine can provide the most relevant documents? (ii) What measure should be used to judge whether a term that occurs in relevant documents is important enough to be dropped from or added to the query, and which operator should be used for adding new terms, considering the different operators offered by each search engine? Given the difficulty of answering these complex questions, this dissertation focuses on different problems that a metasearch engine involves, especially those related to the query expansion phase, which are addressed through soft-computing techniques.
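The two classical questions in the abstract (how many top-ranked documents to inspect, and how to score candidate terms) can be illustrated with a minimal sketch. It is not the method of the dissertation, which relies on soft-computing techniques. It assumes whitespace tokenization and uses a plain count of how many of the top documents contain each term as the importance measure. The function name `expand_query` and its parameters `top_k` and `n_new_terms` are hypothetical.

```python
from collections import Counter


def expand_query(query, ranked_docs, top_k=10, n_new_terms=2):
    """Append to the query the terms found in the most top-ranked documents.

    Assumption: term importance is the number of the top_k documents that
    contain the term (a simple stand-in for a real relevance measure).
    """
    query_terms = query.lower().split()
    doc_freq = Counter()
    for doc in ranked_docs[:top_k]:
        # Count each term once per document and skip terms already in the query.
        doc_freq.update(set(doc.lower().split()) - set(query_terms))
    # Highest count first; ties are broken alphabetically for determinism.
    candidates = sorted(doc_freq, key=lambda term: (-doc_freq[term], term))
    return query_terms + candidates[:n_new_terms]


# Hypothetical ranked result list, best match first.
docs = [
    "fuzzy logic search engine",
    "fuzzy semantic distance search",
    "cooking recipes",
]
expanded = expand_query("search", docs, top_k=2, n_new_terms=1)
print(expanded)
```

Raising `top_k` from 2 to 3 in this toy example lets a term from the unrelated third document enter the expanded query, which shows why the number of judged documents matters.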