CS54701 - Information Retrieval

Spring 2016

Credit Hours: 3

Learning Objective:
Students will:
1. Learn the theories and techniques behind Web search engines, e-commerce recommendation systems, etc.
2. Get hands-on project experience by developing real-world applications, such as intelligent tools for improving search accuracy from user feedback, email spam detection, recommendation systems, or scientific literature organization and mining.
3. Learn tools and techniques for doing cutting-edge research in information retrieval or text mining.
4. Open the door to job opportunities at search technology and e-commerce companies such as Google, Microsoft, Yahoo!, and Amazon.

The explosive growth of available digital information (e.g., Web pages, emails, news, scientific literature) demands intelligent information agents that can sift through all of it and find the most valuable and relevant information. Web search engines, such as Google, Yahoo!, and MSN, are examples of such tools. This course studies the basic principles and practical algorithms used for information retrieval and text mining. The content includes statistical characteristics of text, several important retrieval models, text categorization, recommendation systems, clustering, information extraction, etc. The course emphasizes both these applications and solid modeling techniques (e.g., probabilistic modeling) that can be extended to other applications.

Topics Covered:
Overview, Basic Concepts, Retrieval Models, Relevance Feedback, Probability and Statistics Review, Language Model, Text Categorization, Collaborative Filtering, Federated Search, Text Clustering, Link Analysis.

A bachelor's degree in computer science or an equivalent field. Students not in the Computer Science master's program should seek department permission to register.

Web Content:
Approximately 3 assignments.


One or two exams (including final).

Introduction to Information Retrieval. Manning, C.; Raghavan, P.; Schütze, H. Cambridge University Press (2008). Free online version: http://www-nlp.stanford.edu/IR-book/.

