DTSA 5504 Data Mining Pipeline
- Specialization: Data Mining Foundations and Practice
- Instructor: Dr. Qin (Christine) Lv, Associate Professor of Computer Science
- Prior knowledge needed: Basic familiarity with Python, data structures, and algorithms
Learning Outcomes
- By the end of this course, you will be able to identify the key components of the data mining pipeline and describe how they're related.
- You will be able to identify particular challenges presented by each component of the data mining pipeline.
- You will be able to apply techniques to address challenges in each component of the data mining pipeline.
Course Content
Duration: 5h 18m
This module provides an introduction to data mining and the data mining pipeline, including the four views of data mining and the key components of the pipeline.
Duration: 5h 11m
This module covers data understanding by identifying key data properties and applying techniques to characterize different datasets.
Duration: 5h 17m
This module explains why data preprocessing is needed and what techniques can be used to preprocess data.
Duration: 4h 54m
This module covers the key characteristics of data warehousing and the techniques to support data warehousing.
Duration: 4h
You will complete a proctored final exam, made up of multiple-choice questions, worth 20% of your grade. You must attempt the final exam in order to earn a grade in the course. If you've upgraded to the for-credit version of this course, please make sure you review the additional for-credit materials in the Introductory module and anywhere else they may be found.
Note: This page is periodically updated. Course information on the Coursera platform supersedes the information on this page. Click the View on Coursera button above for the most up-to-date information.