Module Database Search

MODULE DESCRIPTOR
Module Title
Big Data Programming
Reference	CMM705	Version	4
Created	February 2024	SCQF Level	SCQF 11
Approved	May 2016	SCQF Points	15
Amended	April 2024	ECTS Points	7.5

Aims of Module
To provide a general overview of map-reduce design patterns for large data set processing tasks and to develop specialised knowledge in big data Stream Processing and Scalable Realtime Architecture.

Learning Outcomes for Module
On completion of this module, students are expected to be able to:
1	Appraise the advantages and disadvantages of applying specific big data design patterns given a real-world big data programming task.
2	Prepare a distributed architecture for big data deployment.
3	Produce a scalable program using a big data computation framework to solve a given problem.
4	Select relevant technologies for a given big data problem from the state-of-the-art big data offerings.

Indicative Module Content
1. Java programming primer to prepare for Big Data design patterns.
2. HDFS and Hadoop architecture for big data.
3. Case studies on how map-reduce programming design patterns (e.g. summarisation, filtering, data organisation, join) can be used to address various real-world problems in processing and analysing large data sets.
4. Investigate the concepts offered and supported in Spark, and how these contrast with the Hadoop offering.
5. Use technologies like Spark Streaming and Storm for big-data stream processing.
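
As an illustration of the summarisation pattern named in the indicative content, the following is a minimal sketch in plain Java. It is not taken from the module materials: the class name SummarisationSketch and its methods are hypothetical, and the map, shuffle and reduce phases are simulated in memory rather than run on Hadoop.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory sketch of the summarisation pattern.
// No Hadoop classes are used; the phases are simulated with collections.
class SummarisationSketch {

    // Map phase: emit one (word, 1) pair for each word in a line of text.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Reduce phase: collapse all values emitted for one key into a single count.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: run map on every line, group values by key (the "shuffle"),
    // then run reduce once per key. TreeMap keeps the keys sorted.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("big data big ideas", "data streams");
        System.out.println(run(lines)); // prints {big=2, data=2, ideas=1, streams=1}
    }
}
```

In a real Hadoop job the map and reduce steps would be separate classes and the framework would perform the grouping across machines; the sketch only shows how the three phases relate.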

Module Delivery
This is a lecture-based module, supplemented with practical sessions, in which several Big Data technologies are used to teach students how to design and implement map-reduce programs guided by design patterns and case studies. The Hadoop eco-system will be studied and other offerings such as Apache Spark will be explored. For on-campus learners, teaching and learning will be facilitated hands-on in lecture halls and labs. For online learners, teaching and learning will be facilitated in real time via virtual classrooms using voice and video, collaborative tools, and remote assistance tools.

Indicative Student Workload	Full Time	Part Time
Contact Hours	N/A	102
Non-Contact Hours	N/A	48
Placement/Work-Based Learning Experience [Notional] Hours	N/A	N/A
TOTAL	N/A	150
Actual Placement hours for professional, statutory or regulatory body

ASSESSMENT PLAN
If a major/minor model is used and box is ticked, % weightings below are indicative only.
Component 1
Type:	Coursework	Weighting:	100%	Outcomes Assessed:	1, 2, 3, 4
Description:	Coursework assignment consisting of a MapReduce project and a Hadoop eco-system project.

MODULE PERFORMANCE DESCRIPTOR
Explanatory Text
The student must achieve at least a grade D in C1 to pass the module.
Module Grade	Minimum Requirements to achieve Module Grade:
A	The student needs to achieve an A in C1.
B	The student needs to achieve a B in C1.
C	The student needs to achieve a C in C1.
D	The student needs to achieve a D in C1.
E	The student needs to achieve an E in C1.
F	The student needs to achieve an F in C1.
NS	Non-submission of work by published deadline or non-attendance for examination

Module Requirements
Prerequisites for Module	None except for course entry requirements.
Corequisites for module	None.
Precluded Modules	None.

INDICATIVE BIBLIOGRAPHY
1	MINER, D. and SHOOK, A., 2012. MapReduce Design Patterns. O’Reilly.
2	KARAU, H. and KONWINSKI, A., 2015. Learning Spark. O’Reilly.
3	WHITE, T., 2011. Hadoop: The Definitive Guide (2nd edition). O’Reilly.
4	PERERA, S. and GUNARATHNE, T., 2013. Hadoop MapReduce Cookbook. Packt Publishing.
5	MATLOFF, N., 2015. Parallel Computing for Data Science: With Examples in R, C++ and CUDA. CRC Press.

Robert Gordon University, Aberdeen
