Data Infrastructure in Production (full-time)
Timing: Full-time: Monday; Part-time: Sunday
In this course, you will learn which components make up a data architecture in production and which database technologies are required for different use cases. We will cover several database technologies and see how they can be used to solve a variety of problems. We will then look at cloud computing and data analytics in the cloud using AWS (Amazon Web Services).
In the second half of the course, you will also learn best practices for tackling real-world business analytics problems using these technologies, and how to deploy business rules implemented in R into production as part of cloud-based stream-processing engines, dashboards, or scoring APIs.
By the end of the course you will:
- Understand the building blocks of a production data infrastructure.
- Have an overview of current data-related cloud computing services.
- Have hands-on knowledge of how to build a simple, end-to-end data pipeline on Amazon Web Services (AWS).
- Have hands-on knowledge of creating dashboards in R on top of AWS.
- 20% quizzes at the beginning of class
- 80% final project – you will need to create a data pipeline on AWS
- Students may miss at most 1 of the 4 lecture days. Missing more than that will result in an administrative fail grade. (If you have a major impediment, please contact the instructor.)
- To pass, students need to earn at least 50% of the overall grade. Anything below that will result in a Fail grade.
Data Analysis 1a; Big Data Computing