Information Retrieval (IR) is the science behind search technology. The most visible instances are the large web search engines, such as Google and Bing, but information retrieval appears wherever we have to deal with unstructured data (e.g. free text). The focus of this lecture will be on text IR and music IR.
The objective of this course is to teach students the basics of Information Retrieval (in a brief introduction only) as well as its advanced concepts. More specifically, students should:
- Gain a fundamental understanding of how (web) search engines (like Google, Bing, Lucene, Elasticsearch, …) work
- Learn how to efficiently search a large number of documents and rank them according to their relevance with respect to a given query
- Learn how to evaluate search results and incorporate additional context information (like PageRank) to improve search results
- Learn about Deep Neural Networks and how they can be used to improve search effectiveness (e.g. learning to rank); to that end, there will also be a short introduction to Machine Learning and the basics of Neural Networks
- Learn how Neural Networks can be used to create advanced text representations, i.e. Word Embeddings
Differences from the Grundlagen des IR course (188.977)
- The basic concepts of IR (inverted index, text pre-processing, etc.) are taught in detail in the Grundlagen course. In the advanced course, these concepts will only be briefly refreshed.
- A substantial part of the advanced course covers Machine Learning, Deep Learning and Word Embeddings, topics that are not covered in the Grundlagen course.
Lectures (20 h)
- 2x Crash course IR, 2x Machine learning & data annotation, 4x NLP & Neural ranking
Exercises (40 h)
- Exercise 1 (Data annotation): 10 h
- Exercise 2 (Neural re-ranking in PyTorch): 30 h
Exam (15 h)
- Preparation: 14 h
- Exam: 1 h
Total (75 h)
Exercise and Exam