Prerequisites for Module
None except for course entry requirements.
Corequisite Modules
None.
Precluded Modules
None.
Aims of Module
To provide students with a comprehensive understanding of the main principles and practices underlying the retrieval, extraction and mining of text data and the skills to create systems for a variety of information types in differing search environments.
Learning Outcomes for Module
On completion of this module, students are expected to be able to:
1. Critically appraise extraction and search models in information retrieval and Natural Language Processing in relation to big data case studies.
2. Critically evaluate current research and advanced scholarship in IR and NLP, their role and alternative directions for big data projects.
3. Combine methods from NLP, topic modelling and text mining tool-kits to develop new extraction processes for real-world tasks.
4. Plan a comparative study to evaluate and interpret results from designing and developing information retrieval and extraction systems for big data.
Indicative Module Content
Comparative analysis of information retrieval and visualisation methods. Text extraction, tokenisation, stemming, bag-of-words, n-gram, statistical language models, vector representations and topic models. Word sense disambiguation, phrase and named entity recognition, POS tagging, shallow parsing, syntax and dependency parsing. Document similarity, clustering and classification, information extraction, sentiment analysis using lexicon-based techniques. Case studies on text classification, topic modelling applied to news articles, intelligent search and browse, sentiment analysis and social media mining.
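As an illustration of four of the listed topics (tokenisation, n-grams, bag-of-words and document similarity), the following is a minimal sketch using only the Python standard library. The function names and the three sample sentences are hypothetical and chosen for illustration; they are not taken from the module materials, and laboratory work would more likely rely on an established toolkit.

```python
import math
import re
from collections import Counter


def tokenise(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, n):
    """Return the contiguous n-token tuples found in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(tokens):
    """Map each token to its count, discarding word order."""
    return Counter(tokens)


def cosine_similarity(bag_a, bag_b):
    """Cosine of the angle between two term-count vectors."""
    shared = bag_a.keys() & bag_b.keys()
    dot = sum(bag_a[t] * bag_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in bag_a.values()))
    norm_b = math.sqrt(sum(c * c for c in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical sample documents (illustrative only).
doc_a = "Markets rally as the bank cuts rates"
doc_b = "The bank cuts rates and markets rally"
doc_c = "Local team wins the cup final"

bag_a = bag_of_words(tokenise(doc_a))
bag_b = bag_of_words(tokenise(doc_b))
bag_c = bag_of_words(tokenise(doc_c))

sim_ab = cosine_similarity(bag_a, bag_b)
sim_ac = cosine_similarity(bag_a, bag_c)

print(ngrams(tokenise(doc_a), 2))
print(round(sim_ab, 3), round(sim_ac, 3))
```

Because bag-of-words discards word order, the first two sentences score as highly similar even though their phrasing differs, while the bigram list retains local order. Raw term counts are used here for brevity; weighting schemes such as TF-IDF are a common alternative.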
Indicative Student Workload
Contact Hours | Part Time
Laboratories | 24
Lectures/Tutorials | 24
Directed Study |
Coursework preparation | 25
Directed Reading | 27
Private Study |
Private Study | 50
Mode of Delivery
This is a lecture-based course supplemented with laboratory sessions, in which state-of-the-art extraction and retrieval toolkits are applied to varied case studies. Tutorials are used to initiate discussion of research papers from the field, complementing the lectures.
Assessment Plan
Component | Learning Outcomes Assessed
Component 1 | 1, 2, 3, 4
Component 1: Coursework in three parts: a written report on the state of the art in a chosen area of information retrieval or text mining research (40% of the total module assessment); an accompanying class presentation (10%); and a comparative analysis evaluating methods and systems from NLP, topic modelling and text mining toolkits (50%).
Indicative Bibliography
1. MANNING, C., RAGHAVAN, P. and SCHÜTZE, H., 2008. Introduction to Information Retrieval. Cambridge University Press.
2. RUSSELL, M.A., 2013. Mining the Social Web: Data Mining Facebook, Twitter, LinkedIn, Google+, GitHub, and More. 2nd Edition. O’Reilly Media.
3. MANNING, C. and SCHÜTZE, H., 1999. Foundations of Statistical Natural Language Processing. MIT Press.
4. BIRD, S., KLEIN, E. and LOPER, E., 2009. Natural Language Processing with Python. O’Reilly Media.
5. GABER, M.M., COCEA, M., WIRATUNGA, N. and GOKER, A., 2015. Advances in Social Media Analysis. Springer.
6. CROFT, W.B., METZLER, D. and STROHMAN, T., 2015. Search Engines: Information Retrieval in Practice. Pearson Education Inc. http://ciir.cs.umass.edu/irbook/