MODULE DESCRIPTOR
Module Title: Text Analytics
Reference: CMM706    Version: 4
Created: February 2024    SCQF Level: SCQF 11
Approved: May 2016    SCQF Points: 15
Amended: April 2024    ECTS Points: 7.5

Aims of Module
To provide students with a comprehensive understanding of the main principles and practices underlying the retrieval, extraction and mining of text data and the skills to create systems for a variety of information types in differing search environments.

Learning Outcomes for Module
On completion of this module, students are expected to be able to:
1 Appraise extraction and search models in information retrieval (IR) and Natural Language Processing (NLP) in relation to big data case studies.
2 Evaluate current research and advanced scholarship in IR and NLP, their role and alternative directions for big data projects.
3 Produce new extraction processes for real-world tasks using a combination of methods from NLP, topic modelling and text mining toolkits.
4 Design a comparative study to evaluate and interpret results from designing and developing information retrieval and extraction systems for big data.

Indicative Module Content
Comparative analysis of information retrieval and visualisation methods. Text extraction, tokenisation, stemming, bag-of-words, n-gram, statistical language models, vector representations and topic models. Word sense disambiguation, phrase and named entity recognition, POS tagging, shallow parsing, syntax and dependency parsing. Document similarity, clustering and classification, information extraction, sentiment analysis using lexicon-based techniques. Case studies on text classification, topic modelling applied to news articles, intelligent search and browse, sentiment analysis and social media mining.

Module Delivery
This is a lecture-based course, supplemented with laboratory sessions in which state-of-the-art extraction and retrieval toolkits will be applied to varied case studies. Tutorials will be used to initiate discussions on research papers from the field to supplement the lectures. For on-campus learners, teaching and learning will be facilitated hands-on in lecture halls and labs. For online learners, teaching and learning will be facilitated in real time via virtual classrooms using voice and video, collaborative tools, and remote assistance tools.

Indicative Student Workload                                       Full Time   Part Time
Contact Hours                                                     N/A         48
Non-Contact Hours                                                 N/A         102
Placement/Work-Based Learning Experience [Notional] Hours         N/A         N/A
TOTAL                                                             N/A         150
Actual Placement hours for professional, statutory or regulatory body    

ASSESSMENT PLAN
If a major/minor model is used and box is ticked, % weightings below are indicative only.
Component 1
Type: Coursework Weighting: 100% Outcomes Assessed: 1, 2, 3, 4
Description: Coursework consisting of a written report on the state-of-the-art in a chosen area of information retrieval or text mining research, class participation, and a comparative analysis to evaluate methods and systems from NLP, topic modelling and text mining toolkits.

MODULE PERFORMANCE DESCRIPTOR
Explanatory Text
The student must achieve at least a grade D in C1 to pass the module.
Module Grade Minimum Requirements to achieve Module Grade:
A The student needs to achieve an A in C1.
B The student needs to achieve a B in C1.
C The student needs to achieve a C in C1.
D The student needs to achieve a D in C1.
E The student needs to achieve an E in C1.
F The student needs to achieve an F in C1.
NS Non-submission of work by published deadline or non-attendance for examination

Module Requirements
Prerequisites for Module None except for course entry requirements.
Corequisites for module None.
Precluded Modules None.

INDICATIVE BIBLIOGRAPHY
1 MANNING, C., RAGHAVAN, P., and SCHUTZE, H., 2008. Introduction to Information Retrieval. Cambridge University Press.
2 RUSSELL, M.A., 2013. Mining the Social Web: Data Mining Facebook, Twitter, LinkedIn, Google+, GitHub, and More. 2nd Edition. O’Reilly Media.
3 MANNING, C., and SCHUTZE, H., 1999. Foundations of Statistical Natural Language Processing. MIT Press.
4 BIRD, S., KLEIN, E., and LOPER, E., 2009. Natural Language Processing with Python. O’Reilly Media.
5 GABER, M.M., COCEA, M., WIRATUNGA, N. and GOKER, A., 2015. Advances in Social Media Analysis. Springer.
6 CROFT, W.B., METZLER, D. and STROHMAN, T., 2015. Search Engines: Information Retrieval in Practice. Pearson Education Inc. http://ciir.cs.umass.edu/irbook/


Robert Gordon University, Garthdee House, Aberdeen, AB10 7QB, Scotland, UK: a Scottish charity, registration No. SC013781