MODULE DESCRIPTOR
Module Title
Natural Language Processing
Reference: CM4608
Version: 1
Created: February 2024
Approved: April 2024
Amended:
SCQF Level: SCQF 10
SCQF Points: 15
ECTS Points: 7.5

Aims of Module
To provide competency with natural language processing and information retrieval concepts and their applications to solve real-world problems.

Learning Outcomes for Module
On completion of this module, students are expected to be able to:
1 Illustrate natural language processing techniques.
2 Examine NLP algorithms to reason with textual content.
3 Devise textual content for algorithms to satisfy information retrieval needs using a range of similarity metrics.
4 Execute a range of technologies learnt to solve a real-world problem.

Indicative Module Content
Natural Language Processing Techniques: Tokenization, Stemming, Lemmatization, Part-of-Speech (POS) Tagging, Named Entity Recognition (NER), Chunking, Parsing.
Text Representation Techniques: Lexical and paragraph features, Bag-of-words, TF-IDF, Vector Space models, Word and Sentence Embeddings, Contextual Embeddings.
Text Classification: Baseline models (Decision Trees, Naïve Bayes, Logistic Regression); Black-box models (SVM, Random Forest, Gradient Boosted Trees); Deep Learning models (RNN, LSTM, CNN, Seq2Seq, Transformers).
Document Clustering: K-means, Hierarchical, Density-based and Distribution-based methods.
Model Evaluation: Accuracy, Precision, Recall, F1, AUC, SSE Elbow, Silhouette Score, Jaccard Coefficient, Rand Index.
Model Interpretation: Overfitting and regularization, Class Imbalance, Data Augmentation, Hyperparameter tuning, Explainability.

Module Delivery
The module will be delivered through a mixture of lectures, tutorials and laboratory sessions.

Indicative Student Workload                                 Full Time   Part Time
Contact Hours                                               48          N/A
Non-Contact Hours                                           102         N/A
Placement/Work-Based Learning Experience [Notional] Hours   N/A         N/A
TOTAL                                                       150         N/A
Actual Placement hours for professional, statutory or regulatory body    

ASSESSMENT PLAN
If a major/minor model is used and box is ticked, % weightings below are indicative only.
Component 1
Type: Coursework; Weighting: 100%; Outcomes Assessed: 1, 2, 3, 4
Description: Individual coursework covering all learning outcomes.

MODULE PERFORMANCE DESCRIPTOR
Explanatory Text
The calculation of the overall grade for this module is based on 100% weighting of C1. An overall minimum grade of D is required to pass the module.
Module Grade Minimum Requirements to achieve Module Grade:
A The student needs to achieve an A in C1.
B The student needs to achieve a B in C1.
C The student needs to achieve a C in C1.
D The student needs to achieve a D in C1.
E The student needs to achieve an E in C1.
F The student needs to achieve an F in C1.
NS Non-submission of work by published deadline or non-attendance for examination

Module Requirements
Prerequisites for Module: CM1601, CM1602, CM1606 or equivalents.
Corequisites for Module: None.
Precluded Modules: None.

INDICATIVE BIBLIOGRAPHY
1 Croft, W. B., Metzler, D. and Strohman, T. 2015. Search Engines: Information Retrieval in Practice. Pearson Education Inc.
2 Gaber, M. M., Cocea, M., Wiratunga, N. and Goker, A. 2015. Advances in Social Media Analysis. Springer.
3 Russell, M. A. 2013. Mining the Social Web: Data Mining Facebook, Twitter, LinkedIn, Google+, GitHub, and More. 2nd ed. O'Reilly Media.
4 Provost, F. and Fawcett, T. 2013. Data Science for Business. Sebastopol: O'Reilly Media.
5 Azzopardi, L., Stein, B., Fuhr, N., Mayr, P., Hauff, C. and Hiemstra, D. 2019. Advances in Information Retrieval.
6 Mitra, B. and Craswell, N. 2018. An Introduction to Neural Information Retrieval.


Robert Gordon University, Garthdee House, Aberdeen, AB10 7QB, Scotland, UK: a Scottish charity, registration No. SC013781