Nanodegree key: nd027
Version: 2.0.0
Locale: en-us
Learn to design data models, build data warehouses and data lakes, automate data pipelines, and work with massive datasets.
Content
Part 01 : Welcome to the Nanodegree Program
-
Module 01: Welcome
-
Lesson 01: Welcome to the Data Engineering Nanodegree Program
Learn to design data models, build data warehouses and data lakes, automate data pipelines, and work with massive datasets.
-
Lesson 02: Knowledge, Community, and Careers
You are starting a challenging but rewarding journey! Take 5 minutes to read how to get help with projects and content.
-
Lesson 03: Get Help with Your Account
What to do if you have questions about your account or general questions about the program.
-
Lesson 04: Introduction to Data Engineering
What does it mean to be a Data Engineer?
-
-
Module 02: Careers Services Orientation
-
Lesson 01: Nanodegree Career Services
The Careers team at Udacity is here to help you move forward in your career - whether it's finding a new job, exploring a new career path, or applying new skills to your current job.
-
Part 02 : Data Modeling
Learn to create relational and NoSQL data models to fit the diverse needs of data consumers. Use ETL to build databases in PostgreSQL and Apache Cassandra.
-
Module 01: Data Modeling Lessons and Projects
-
Lesson 01: Introduction to Data Modeling
In this lesson, students will learn the basic difference between relational and non-relational databases, and how each type of database fits the diverse needs of data consumers.
- Concept 01: Introduction to the Course
- Concept 02: What is Data Modeling?
- Concept 03: Why is Data Modeling Important?
- Concept 04: Who does this type of work?
- Concept 05: Intro to Relational Databases
- Concept 06: Relational Databases
- Concept 07: When to use a relational database?
- Concept 08: ACID Transactions
- Concept 09: When Not to Use a Relational Database
- Concept 10: What is PostgreSQL?
- Concept 11: Demo 1: Creating a Postgres Table
- Concept 12: Exercise 1: Creating a Table with Postgres
- Concept 13: Solution for Exercise 1: Create a Table with Postgres
- Concept 14: NoSQL Databases
- Concept 15: What is Apache Cassandra?
- Concept 16: When to Use a NoSQL Database
- Concept 17: When Not to Use a NoSQL Database
- Concept 18: Demo 2: Creating a Table with Cassandra
- Concept 19: Exercise 2: Create a Table with Cassandra
- Concept 20: Solution for Exercise 2: Create a Table with Cassandra
- Concept 21: Conclusion
-
Lesson 02: Relational Data Models
In this lesson, students will learn the purpose of data modeling and the strengths and weaknesses of relational databases, and will create schemas and tables in Postgres.
- Concept 01: Learning Objective
- Concept 02: Databases
- Concept 03: Importance of Relational Databases
- Concept 04: OLAP vs OLTP
- Concept 05: Quiz 1
- Concept 06: Structuring the Database: Normalization
- Concept 07: Objectives of Normal Form
- Concept 08: Normal Forms
- Concept 09: Demo 1: Creating Normalized Tables
- Concept 10: Exercise 1: Creating Normalized Tables
- Concept 11: Solution: Exercise 1: Creating Normalized Tables
- Concept 12: Denormalization
- Concept 13: Demo 2: Creating Denormalized Tables
- Concept 14: Denormalization Vs. Normalization
- Concept 15: Exercise 2: Creating Denormalized Tables
- Concept 16: Solution: Exercise 2: Creating Denormalized Tables
- Concept 17: Fact and Dimension Tables
- Concept 18: Star Schemas
- Concept 19: Benefits of Star Schemas
- Concept 20: Snowflake Schemas
- Concept 21: Demo 3: Creating Fact and Dimension Tables
- Concept 22: Exercise 3: Creating Fact and Dimension Tables
- Concept 23: Solution: Exercise 3: Creating Fact and Dimension Tables
- Concept 24: Data Definition and Constraints
- Concept 25: Upsert
- Concept 26: Conclusion
-
Lesson 03: Project: Data Modeling with Postgres
Students will model user activity data to create a database and ETL pipeline in Postgres for a music streaming app. They will define Fact and Dimension tables and insert data into new tables.
-
Lesson 04: NoSQL Data Models
Students will understand when to use non-relational databases based on business data needs, the strengths and weaknesses of these databases, and how to create tables in Apache Cassandra.
- Concept 01: Learning Objectives
- Concept 02: Non-Relational Databases
- Concept 03: Distributed Databases
- Concept 04: CAP Theorem
- Concept 05: Quiz 1
- Concept 06: Denormalization in Apache Cassandra
- Concept 07: CQL
- Concept 08: Demo 1
- Concept 09: Exercise 1
- Concept 10: Exercise 1 Solution
- Concept 11: Primary Key
- Concept 12: Primary Key
- Concept 13: Demo 2
- Concept 14: Exercise 2
- Concept 15: Exercise 2: Solution
- Concept 16: Clustering Columns
- Concept 17: Demo 3
- Concept 18: Exercise 3
- Concept 19: Exercise 3: Solution
- Concept 20: WHERE Clause
- Concept 21: Demo 4
- Concept 22: Exercise 4
- Concept 23: Lesson Wrap Up
- Concept 24: Course Wrap Up
-
Lesson 05: Project: Data Modeling with Apache Cassandra
Students will model event data to create a non-relational database and ETL pipeline for a music streaming app. They will define queries and tables for a database built using Apache Cassandra.
-
Part 03 : Cloud Data Warehouses
-
Module 01: Cloud Data Warehouses Lessons
-
Lesson 01: Introduction to Data Warehouses
In this lesson, you'll be introduced to data warehouses, the cloud, and AWS.
- Concept 01: Course Introduction
- Concept 02: Lesson Introduction
- Concept 03: Data Warehouse: Business Perspective
- Concept 04: Operational vs. Analytical Processes
- Concept 05: Data Warehouse: Technical Perspective
- Concept 06: Dimensional Modeling
- Concept 07: ETL Demo: Step 1 & 2
- Concept 08: Exercise 1: Step 1 & 2
- Concept 09: ETL Demo: Step 3
- Concept 10: Exercise 1: Step 3
- Concept 11: ETL Demo: Step 4
- Concept 12: Exercise 1: Step 4
- Concept 13: ETL Demo: Step 5
- Concept 14: Exercise 1: Step 5
- Concept 15: ETL Demo: Step 6
- Concept 16: Exercise 1: Step 6
- Concept 17: DWH Architecture: Kimball's Bus Architecture
- Concept 18: DWH Architecture: Independent Data Marts
- Concept 19: DWH Architecture: CIF
- Concept 20: DWH Architecture: Hybrid Bus & CIF
- Concept 21: OLAP Cubes
- Concept 22: OLAP Cubes: Roll-Up and Drill Down
- Concept 23: OLAP Cubes: Slice and Dice
- Concept 24: OLAP Cubes: Query Optimization
- Concept 25: OLAP Cubes Demo: Slicing & Dicing
- Concept 26: Exercise 2: Slicing & Dicing
- Concept 27: OLAP Cubes Demo: Roll-Up
- Concept 28: Exercise 2: Roll-Up & Drill Down
- Concept 29: OLAP Cubes Demo: Grouping Sets
- Concept 30: Exercise 2: Grouping Sets
- Concept 31: OLAP Cubes Demo: CUBE
- Concept 32: Exercise 2: CUBE
- Concept 33: Data Warehouse Technologies
- Concept 34: Demo: Column format in ROLAP
- Concept 35: Exercise 3: Column format in ROLAP
-
Lesson 02: Introduction to Cloud Computing and AWS
In this lesson, you'll get an introduction to cloud computing and be guided through setting up an AWS account and credits.
- Concept 01: Lesson Introduction
- Concept 02: Cloud Computing
- Concept 03: Amazon Web Services
- Concept 04: AWS Setup Instructions for Regular account
- Concept 05: Create an IAM Role
- Concept 06: Create Security Group
- Concept 07: Launch a Redshift Cluster
- Concept 08: Create an IAM User
- Concept 09: Delete a Redshift Cluster
- Concept 10: Create an S3 Bucket
- Concept 11: Upload to S3 Bucket
- Concept 12: Create PostgreSQL RDS
- Concept 13: Avoid Paying Unexpected Costs for AWS
-
Lesson 03: Implementing Data Warehouses on AWS
In this lesson, you'll learn to implement a data warehouse on AWS.
- Concept 01: Lesson Introduction
- Concept 02: Data Warehouse: A Closer Look
- Concept 03: Choices for Implementing a Data Warehouse
- Concept 04: DWH Dimensional Model Storage on AWS
- Concept 05: Amazon Redshift Technology
- Concept 06: Amazon Redshift Architecture
- Concept 07: Redshift Architecture Example
- Concept 08: SQL to SQL ETL
- Concept 09: SQL to SQL ETL - AWS Case
- Concept 10: Redshift & ETL in Context
- Concept 11: Ingesting at Scale
- Concept 12: Redshift ETL Examples
- Concept 13: Redshift ETL Continued
- Concept 14: Redshift Cluster Quick Launcher
- Concept 15: Exercise 1: Launch Redshift Cluster
- Concept 16: Problems with the Quick Launcher
- Concept 17: Infrastructure as Code on AWS
- Concept 18: Enabling Programmatic Access for IaC
- Concept 19: Demo: Infrastructure as Code
- Concept 20: Exercise 2: Infrastructure as Code
- Concept 21: Exercise Solution 2: Infrastructure as Code
- Concept 22: Demo: Parallel ETL
- Concept 23: Exercise 3: Parallel ETL
- Concept 24: Exercise Solution 3: Parallel ETL
- Concept 25: Optimizing Table Design
- Concept 26: Distribution Style: Even
- Concept 27: Distribution Style: All
- Concept 28: Distribution Style: Auto
- Concept 29: Distribution Style: Key
- Concept 30: Sorting Key
- Concept 31: Sorting Key Example
- Concept 32: Demo: Table Design
- Concept 33: Exercise 4: Table Design
- Concept 34: Exercise Solution 4: Table Design
- Concept 35: Conclusion
-
-
Module 02: Project: Data Warehouse
-
Lesson 01: Project: Data Warehouse
Students will build an ETL pipeline that extracts data from S3, stages it in Redshift, and transforms it into a set of dimensional tables for their analytics team.
-
Part 04 : Data Lakes with Spark
-
Module 01: Data Lakes Lessons
-
Lesson 01: The Power of Spark
In this lesson, you will learn about the problems that Apache Spark is designed to solve. You'll also learn about the greater Big Data ecosystem and how Spark fits into it.
- Concept 01: Introduction
- Concept 02: What is Big Data?
- Concept 03: Numbers Everyone Should Know
- Concept 04: Hardware: CPU
- Concept 05: Hardware: Memory
- Concept 06: Hardware: Storage
- Concept 07: Hardware: Network
- Concept 08: Hardware: Key Ratios
- Concept 09: Small Data Numbers
- Concept 10: Big Data Numbers
- Concept 11: Medium Data Numbers
- Concept 12: History of Distributed Computing
- Concept 13: The Hadoop Ecosystem
- Concept 14: MapReduce
- Concept 15: Hadoop MapReduce [Demo]
- Concept 16: The Spark Cluster
- Concept 17: Spark Use Cases
- Concept 18: Summary
-
Lesson 02: Data Wrangling with Spark
In this lesson, we'll dive into how to use Spark for cleaning and aggregating data.
- Concept 01: Introduction
- Concept 02: Functional Programming
- Concept 03: Why Use Functional Programming
- Concept 04: Procedural Example
- Concept 05: Procedural [Example Code]
- Concept 06: Pure Functions in the Bread Factory
- Concept 07: The Spark DAGs: Recipe for Data
- Concept 08: Maps and Lambda Functions
- Concept 09: Maps and Lambda Functions [Example Code]
- Concept 10: Data Formats
- Concept 11: Distributed Data Stores
- Concept 12: SparkSession
- Concept 13: Reading and Writing Data into Spark Data Frames
- Concept 14: Read and Write Data into Spark Data Frames [example code]
- Concept 15: Imperative vs Declarative programming
- Concept 16: Data Wrangling with DataFrames
- Concept 17: Data Wrangling with DataFrames Extra Tips
- Concept 18: Data Wrangling with Spark [Example Code]
- Concept 19: Quiz - Data Wrangling with DataFrames
- Concept 20: Quiz - Data Wrangling with DataFrames Jupyter Notebook
- Concept 21: Quiz [Solution Code]
- Concept 22: Spark SQL
- Concept 23: Example Spark SQL
- Concept 24: Example Spark SQL [Example Code]
- Concept 25: Quiz - Data Wrangling with SparkSQL
- Concept 26: Quiz [Spark SQL Solution Code]
- Concept 27: RDDs
- Concept 28: Summary
-
Lesson 03: Setting up Spark Clusters with AWS
In this lesson, you will learn to run Spark on a distributed cluster in AWS using the AWS UI and the AWS CLI.
- Concept 01: Introduction
- Concept 02: From Local to Standalone Mode
- Concept 03: Setup Instructions AWS
- Concept 04: Alternate ways to connect to AWS using CLI
- Concept 05: Create EMR Using AWS CLI
- Concept 06: Using Notebooks on Your Cluster
- Concept 07: Spark Scripts
- Concept 08: Submitting Spark Scripts
- Concept 09: Storing and Retrieving Data on the Cloud
- Concept 10: Reading and Writing to Amazon S3
- Concept 11: Understanding the Difference Between HDFS and AWS S3
- Concept 12: Reading and Writing Data to HDFS
- Concept 13: Recap Local Mode to Cluster Mode
-
Lesson 04: Debugging and Optimization
In this lesson, you will learn best practices for debugging and optimizing your Spark applications.
- Concept 01: Debugging is Hard
- Concept 02: Intro: Syntax Errors
- Concept 03: Code Errors
- Concept 04: Data Errors
- Concept 05: Debugging your Code
- Concept 06: How to Use Accumulators
- Concept 07: Spark Broadcast
- Concept 08: Spark WebUI
- Concept 09: Connecting to the Spark Web UI
- Concept 10: Different types of Spark Functions
- Concept 11: Getting Familiar with the Spark UI
- Concept 12: Review of the Log Data
- Concept 13: Intro: Code Optimization
- Concept 14: Understanding Data Skewness
- Concept 15: Optimizing for Data Skewness
- Concept 16: Other Issues and How to Address Them
- Concept 17: Lesson Summary
-
Lesson 05: Introduction to Data Lakes
In this lesson, you'll learn about the need for a data lake, how it differs from a data warehouse, and the various options for implementing one on AWS.
- Concept 01: Introduction
- Concept 02: Lesson Overview
- Concept 03: Why Data Lakes: Evolution of the Data Warehouse
- Concept 04: Why Data Lakes: Unstructured & Big Data
- Concept 05: Why Data Lakes: New Roles & Advanced Analytics
- Concept 06: Big Data Effects: Low Costs, ETL Offloading
- Concept 07: Big Data Effects: Schema-on-Read
- Concept 08: Big Data Effects: (Un-/Semi-)Structured support
- Concept 09: Demo: Schema On Read Pt 1
- Concept 10: Demo: Schema On Read Pt 2
- Concept 11: Demo: Schema On Read Pt 3
- Concept 12: Demo: Schema On Read Pt 4
- Concept 13: Exercise 1: Schema On Read
- Concept 14: Demo: Advanced Analytics NLP Pt 1
- Concept 15: Demo: Advanced Analytics NLP Pt 2
- Concept 16: Demo: Advanced Analytics NLP Pt 3
- Concept 17: Exercise 2: Advanced Analytics NLP
- Concept 18: Data Lake Implementation Introduction
- Concept 19: Data Lake Concepts
- Concept 20: Data Lake vs Data Warehouse
- Concept 21: AWS Setup
- Concept 22: Data Lake Options on AWS
- Concept 23: AWS Options: EMR (HDFS + Spark)
- Concept 24: AWS Options: EMR: S3 + Spark
- Concept 25: AWS Options: Athena
- Concept 26: Demo: Data Lake on S3 Pt 1
- Concept 27: Demo: Data Lake on S3 Pt 2
- Concept 28: Exercise 3: Data Lake on S3
- Concept 29: Demo: Data Lake on EMR Pt 1
- Concept 30: Demo: Data Lake on EMR Pt 2
- Concept 31: Demo: Data Lake on Athena Pt 1
- Concept 32: Demo: Data Lake on Athena Pt 2
- Concept 33: Data Lake Issues
- Concept 34: [AWS] Launch EMR Cluster and Notebook
- Concept 35: [AWS] Avoid Paying Unexpected Costs
-
-
Module 02: Project: Data Lake
-
Lesson 01: Project: Data Lake
Students will build a data lake and an ETL pipeline in Spark that loads data from S3, processes the data into analytics tables, and loads them back into S3.
-
-
Module 03: Career Services: GitHub
-
Lesson 01: Optimize Your GitHub Profile
Other professionals are collaborating on GitHub and growing their network. Submit your profile to ensure it is on par with leaders in your field.
- Concept 01: Prove Your Skills With GitHub
- Concept 02: Introduction
- Concept 03: GitHub profile important items
- Concept 04: Good GitHub repository
- Concept 05: Interview with Art - Part 1
- Concept 06: Identify fixes for example “bad” profile
- Concept 07: Quick Fixes #1
- Concept 08: Quick Fixes #2
- Concept 09: Writing READMEs with Walter
- Concept 10: Interview with Art - Part 2
- Concept 11: Commit messages best practices
- Concept 12: Reflect on your commit messages
- Concept 13: Participating in open source projects
- Concept 14: Interview with Art - Part 3
- Concept 15: Participating in open source projects 2
- Concept 16: Starring interesting repositories
- Concept 17: Next Steps
-
Part 05 : Data Pipelines with Airflow
-
Module 01: Data Pipelines Lessons
-
Lesson 01: Data Pipelines
Students will get an introduction to data pipelines and to Apache Airflow as a data pipeline solution. They will learn how Airflow works, how to configure and schedule data pipelines with Airflow, and how to debug a pipeline job.
- Concept 01: Welcome
- Concept 02: AWS Account and Credits
- Concept 03: What is a Data Pipeline?
- Concept 04: Data Validation
- Concept 05: DAGs and Data Pipelines
- Concept 06: Bikeshare DAG
- Concept 07: Introduction to Apache Airflow
- Concept 08: Demo 1: Airflow DAGs
- Concept 09: Workspace Instructions
- Concept 10: Exercise 1: Airflow DAGs
- Concept 11: Solution 1: Airflow DAGs
- Concept 12: How Airflow Works
- Concept 13: Airflow Runtime Architecture
- Concept 14: Building a Data Pipeline
- Concept 15: Demo 2: Run the Schedules
- Concept 16: Exercise 2: Run the Schedules
- Concept 17: Solution 2: Run the Schedules
- Concept 18: Operators and Tasks
- Concept 19: Demo 3: Task Dependencies
- Concept 20: Exercise 3: Task Dependencies
- Concept 21: Solution: Task Dependencies
- Concept 22: Airflow Hooks
- Concept 23: Demo 4: Connections and Hooks
- Concept 24: Exercise 4: Connections and Hooks
- Concept 25: Solution 4: Connections and Hooks
- Concept 26: Demo 5: Context and Templating
- Concept 27: Exercise 5: Context and Templating
- Concept 28: Solution 5: Context and Templating
- Concept 29: Quiz: Review of Pipeline Components
- Concept 30: Demo: Exercise 6: Building the S3 to Redshift DAG
- Concept 31: Exercise 6: Build the S3 to Redshift DAG
- Concept 32: Solution 6: Build the S3 to Redshift DAG
- Concept 33: Conclusion
-
Lesson 02: Data Quality
Students will learn how to track data lineage, set up data pipeline schedules, partition data to optimize pipelines, investigate data quality issues, and write tests to ensure data quality.
- Concept 01: What Are We Going to Learn?
- Concept 02: What is Data Lineage?
- Concept 03: Visualizing Data Lineage
- Concept 04: Demo 1: Data Lineage in Airflow
- Concept 05: Exercise 1: Data Lineage in Airflow
- Concept 06: Solution 1: Data Lineage in Airflow
- Concept 07: Data Pipeline Schedules
- Concept 08: Scheduling in Airflow
- Concept 09: Updating DAGs
- Concept 10: Demo 2: Schedules and Backfills in Airflow
- Concept 11: Exercise 2: Schedules and Backfills in Airflow
- Concept 12: Solution 2: Schedules and Backfills in Airflow
- Concept 13: Data Partitioning
- Concept 14: Goals of Data Partitioning
- Concept 15: Demo 3: Data Partitioning
- Concept 16: Exercise 3: Data Partitioning
- Concept 17: Solution 3: Data Partitioning
- Concept 18: Data Quality
- Concept 19: Demo 4: Data Quality
- Concept 20: Exercise 4: Data Quality
- Concept 21: Solution 4: Data Quality
- Concept 22: Conclusion
-
Lesson 03: Production Data Pipelines
In this last lesson, students will learn how to build pipelines with maintainability and reusability in mind. They will also learn about pipeline monitoring.
- Concept 01: Lesson Introduction
- Concept 02: Extending Airflow with Plugins
- Concept 03: Extending Airflow Hooks & Contrib
- Concept 04: Demo 1: Operator Plugins
- Concept 05: Exercise 1: Operator Plugins
- Concept 06: Solution 1: Operator Plugins
- Concept 07: Best Practices for Data Pipeline Steps - Task Boundaries
- Concept 08: Demo 2: Task Boundaries
- Concept 09: Exercise 2: Refactor a DAG
- Concept 10: Solution 2: Refactor a DAG
- Concept 11: SubDAGs: Introduction and When to Use Them
- Concept 12: SubDAGs: Drawbacks of SubDAGs
- Concept 13: Quiz: SubDAGs
- Concept 14: Demo 3: SubDAGs
- Concept 15: Exercise 3: SubDAGs
- Concept 16: Solution 3: SubDAGs
- Concept 17: Monitoring
- Concept 18: Monitoring
- Concept 19: Exercise 4: Building a Full DAG
- Concept 20: Solution 4: Building a Full Pipeline
- Concept 21: Conclusion
- Concept 22: Additional Resources: Data Pipeline Orchestrators
-
-
Module 02: Project: Data Pipelines
-
Lesson 01: Project: Data Pipelines
Students continue to work on the music streaming company’s data infrastructure by creating and automating a set of data pipelines with Airflow, and by monitoring and debugging production pipelines.
-
-
Module 03: Career Services: LinkedIn
-
Lesson 01: Take 30 Min to Improve your LinkedIn
Find your next job or connect with industry peers on LinkedIn. Ensure your profile attracts relevant leads that will grow your professional network.
- Concept 01: Get Opportunities with LinkedIn
- Concept 02: Use Your Story to Stand Out
- Concept 03: Why Use an Elevator Pitch
- Concept 04: Create Your Elevator Pitch
- Concept 05: Use Your Elevator Pitch on LinkedIn
- Concept 06: Create Your Profile With SEO In Mind
- Concept 07: Profile Essentials
- Concept 08: Work Experiences & Accomplishments
- Concept 09: Build and Strengthen Your Network
- Concept 10: Reaching Out on LinkedIn
- Concept 11: Boost Your Visibility
- Concept 12: Up Next
-
Part 06 : Capstone Project
-
Module 01: DEND Capstone
-
Lesson 01: Capstone Project
In this Capstone project, students will define the scope of the project and the data they will be working with to demonstrate what they have learned in this Data Engineering Nanodegree.
-
Part 07 (Elective): Intro to Python
Learn Python programming fundamentals such as data types and structures, variables, loops, and functions.
-
Module 01: Lessons
-
Lesson 01: Why Python Programming
Welcome to Introduction to Python! Here's an overview of the course.
-
Lesson 02: Data Types and Operators
Familiarize yourself with the building blocks of Python! Learn about data types and operators, built-in functions, type conversion, whitespace, and style guidelines.
- Concept 01: Introduction
- Concept 02: Arithmetic Operators
- Concept 03: Quiz: Arithmetic Operators
- Concept 04: Solution: Arithmetic Operators
- Concept 05: Variables and Assignment Operators
- Concept 06: Quiz: Variables and Assignment Operators
- Concept 07: Solution: Variables and Assignment Operators
- Concept 08: Integers and Floats
- Concept 09: Quiz: Integers and Floats
- Concept 10: Booleans, Comparison Operators, and Logical Operators
- Concept 11: Quiz: Booleans, Comparison Operators, and Logical Operators
- Concept 12: Solution: Booleans, Comparison and Logical Operators
- Concept 13: Strings
- Concept 14: Quiz: Strings
- Concept 15: Solution: Strings
- Concept 16: Type and Type Conversion
- Concept 17: Quiz: Type and Type Conversion
- Concept 18: Solution: Type and Type Conversion
- Concept 19: String Methods
- Concept 20: String Methods
- Concept 21: Another String Method - Split
- Concept 22: Quiz: String Methods Practice
- Concept 23: Solution: String Methods Practice
- Concept 24: "There's a Bug in my Code"
- Concept 25: Conclusion
- Concept 26: Summary
-
Lesson 03: Data Structures
Use data structures to order and group different data types together! Learn about the types of data structures in Python, along with more useful built-in functions and operators.
- Concept 01: Introduction
- Concept 02: Lists and Membership Operators
- Concept 03: Quiz: Lists and Membership Operators
- Concept 04: Solution: Lists and Membership Operators
- Concept 05: Why Do We Need Lists?
- Concept 06: List Methods
- Concept 07: Quiz: List Methods
- Concept 08: Check for Understanding: Lists
- Concept 09: Tuples
- Concept 10: Quiz: Tuples
- Concept 11: Sets
- Concept 12: Quiz: Sets
- Concept 13: Dictionaries and Identity Operators
- Concept 14: Quiz: Dictionaries and Identity Operators
- Concept 15: Solution: Dictionaries and Identity Operators
- Concept 16: Quiz: More With Dictionaries
- Concept 17: When to Use Dictionaries?
- Concept 18: Check for Understanding: Data Structures
- Concept 19: Compound Data Structures
- Concept 20: Quiz: Compound Data Structures
- Concept 21: Solution: Compound Data Structures
- Concept 22: Practice Questions
- Concept 23: Solution: Practice Questions
- Concept 24: Conclusion
-
Lesson 04: Control Flow
Build logic into your code with control flow tools! Learn about conditional statements, repeating code with loops and useful built-in functions, and list comprehensions.
- Concept 01: Introduction
- Concept 02: Conditional Statements
- Concept 03: Practice: Conditional Statements
- Concept 04: Solution: Conditional Statements
- Concept 05: Quiz: Conditional Statements
- Concept 06: Solution: Conditional Statements
- Concept 07: Boolean Expressions for Conditions
- Concept 08: Quiz: Boolean Expressions for Conditions
- Concept 09: Solution: Boolean Expressions for Conditions
- Concept 10: For Loops
- Concept 11: Practice: For Loops
- Concept 12: Solution: For Loops Practice
- Concept 13: Quiz: For Loops
- Concept 14: Solution: For Loops Quiz
- Concept 15: Quiz: Match Inputs To Outputs
- Concept 16: Building Dictionaries
- Concept 17: Iterating Through Dictionaries with For Loops
- Concept 18: Quiz: Iterating Through Dictionaries
- Concept 19: Solution: Iterating Through Dictionaries
- Concept 20: While Loops
- Concept 21: Practice: While Loops
- Concept 22: Solution: While Loops Practice
- Concept 23: Quiz: While Loops
- Concept 24: Solution: While Loops Quiz
- Concept 25: For Loops vs. While Loops
- Concept 26: Check for Understanding: For and While Loops
- Concept 27: Solution: Check for Understanding: For and While Loops
- Concept 28: Break, Continue
- Concept 29: Quiz: Break, Continue
- Concept 30: Solution: Break, Continue
- Concept 31: Practice: Loops
- Concept 32: Solution: Loops
- Concept 33: Zip and Enumerate
- Concept 34: Quiz: Zip and Enumerate
- Concept 35: Solution: Zip and Enumerate
- Concept 36: List Comprehensions
- Concept 37: Quiz: List Comprehensions
- Concept 38: Solution: List Comprehensions
- Concept 39: Practice Questions
- Concept 40: Solutions to Practice Questions
- Concept 41: Conclusion
-
Lesson 05: Functions
Learn how to use functions to improve and reuse your code! Learn about functions, variable scope, documentation, lambda expressions, iterators, and generators.
- Concept 01: Introduction
- Concept 02: Defining Functions
- Concept 03: Quiz: Defining Functions
- Concept 04: Solution: Defining Functions
- Concept 05: Check For Understanding: Functions
- Concept 06: Variable Scope
- Concept 07: Variable Scope
- Concept 08: Solution: Variable Scope
- Concept 09: Check For Understanding: Variable Scope
- Concept 10: Documentation
- Concept 11: Quiz: Documentation
- Concept 12: Solution: Documentation
- Concept 13: Lambda Expressions
- Concept 14: Quiz: Lambda Expressions
- Concept 15: Solution: Lambda Expressions
- Concept 16: Conclusion
-
Lesson 06: Scripting
Set up your own programming environment to write and run Python scripts locally! Learn good scripting practices, interact with different inputs, and discover awesome tools.
- Concept 01: Introduction
- Concept 02: Python Installation
- Concept 03: Install Python Using Anaconda
- Concept 04: [For Windows] Configuring Git Bash to Run Python
- Concept 05: Running a Python Script
- Concept 06: Programming Environment Setup
- Concept 07: Editing a Python Script
- Concept 08: Scripting with Raw Input
- Concept 09: Quiz: Scripting with Raw Input
- Concept 10: Solution: Scripting with Raw Input
- Concept 11: Errors and Exceptions
- Concept 12: Errors and Exceptions
- Concept 13: Handling Errors
- Concept 14: Practice: Handling Input Errors
- Concept 15: Solution: Handling Input Errors
- Concept 16: Accessing Error Messages
- Concept 17: Reading and Writing Files
- Concept 18: Quiz: Reading and Writing Files
- Concept 19: Solution: Reading and Writing Files
- Concept 20: Quiz: Practice Debugging
- Concept 21: Solutions for Quiz: Practice Debugging
- Concept 22: Importing Local Scripts
- Concept 23: The Standard Library
- Concept 24: Quiz: The Standard Library
- Concept 25: Solution: The Standard Library
- Concept 26: Techniques for Importing Modules
- Concept 27: Quiz: Techniques for Importing Modules
- Concept 28: Third-Party Libraries
- Concept 29: Experimenting with an Interpreter
- Concept 30: Online Resources
- Concept 31: Practice Question
- Concept 32: Solution for Practice Question
- Concept 33: Conclusion
-
Lesson 07: NumPy
Learn the basics of NumPy and how to use it to create and manipulate arrays.
- Concept 01: Instructors
- Concept 02: Introduction to NumPy
- Concept 03: Why Use NumPy?
- Concept 04: Creating and Saving NumPy ndarrays
- Concept 05: Quiz: Creating and Saving NumPy ndarrays
- Concept 06: Solution: Creating and Saving NumPy ndarrays
- Concept 07: Using Built-in Functions to Create ndarrays
- Concept 08: Create an ndarray
- Concept 09: Accessing, Deleting, and Inserting Elements Into ndarrays
- Concept 10: Slicing ndarrays
- Concept 11: Boolean Indexing, Set Operations, and Sorting
- Concept 12: Manipulating ndarrays
- Concept 13: Arithmetic operations and Broadcasting
- Concept 14: Creating ndarrays with Broadcasting
-
Lesson 08: Pandas
Learn the basics of Pandas Series and DataFrames and how to use them to load and process data.
- Concept 01: Instructors
- Concept 02: Introduction to pandas
- Concept 03: Why Use pandas?
- Concept 04: Creating pandas Series
- Concept 05: Accessing and Deleting Elements in pandas Series
- Concept 06: Arithmetic Operations on pandas Series
- Concept 07: Manipulate a Series
- Concept 08: Creating pandas DataFrames
- Concept 09: Accessing Elements in pandas DataFrames
- Concept 10: Dealing with NaN
- Concept 11: Manipulate a DataFrame
- Concept 12: Loading Data into a pandas DataFrame
-
Lesson 09: Advanced Topics
In this lesson, we cover the advanced topics of iterators and generators. You are not required to complete this lesson; it is provided to give you a taste of these topics.
-
Part 08 (Elective): SQL for Data Analysis
Learn SQL fundamentals, such as building basic queries, along with advanced features like window functions, subqueries, and common table expressions.
-
Module 01: Lessons
-
Lesson 01: Basic SQL
In this section, you will gain knowledge about SQL basics for working with a single table. You will learn the key commands to filter a table in many different ways.
- Concept 01: Video: SQL Introduction
- Concept 02: Video: The Parch & Posey Database
- Concept 03: Video + Text: The Parch & Posey Database
- Concept 04: Quiz: ERD Fundamentals
- Concept 05: Video: Why SQL
- Concept 06: Video: How Databases Store Data
- Concept 07: Types of Databases
- Concept 08: Video: Types of Statements
- Concept 09: Statements
- Concept 10: Video: SELECT & FROM
- Concept 11: Your First Queries in SQL Workspace
- Concept 12: Solution: Your First Queries
- Concept 13: Formatting Best Practices
- Concept 14: Video: LIMIT
- Concept 15: Quiz: LIMIT
- Concept 16: Solution: LIMIT
- Concept 17: Video: ORDER BY
- Concept 18: Quiz: ORDER BY
- Concept 19: Solutions: ORDER BY
- Concept 20: Video: ORDER BY Part II
- Concept 21: Quiz: ORDER BY Part II
- Concept 22: Solutions: ORDER BY Part II
- Concept 23: Video: WHERE
- Concept 24: Quiz: WHERE
- Concept 25: Solutions: WHERE
- Concept 26: Video: WHERE with Non-Numeric Data
- Concept 27: Quiz: WHERE with Non-Numeric
- Concept 28: Solutions: WHERE with Non-Numeric
- Concept 29: Video: Arithmetic Operators
- Concept 30: Quiz: Arithmetic Operators
- Concept 31: Solutions: Arithmetic Operators
- Concept 32: Text: Introduction to Logical Operators
- Concept 33: Video: LIKE
- Concept 34: Quiz: LIKE
- Concept 35: Solutions: LIKE
- Concept 36: Video: IN
- Concept 37: Quiz: IN
- Concept 38: Solutions: IN
- Concept 39: Video: NOT
- Concept 40: Quiz: NOT
- Concept 41: Solutions: NOT
- Concept 42: Video: AND and BETWEEN
- Concept 43: Quiz: AND and BETWEEN
- Concept 44: Solutions: AND and BETWEEN
- Concept 45: Video: OR
- Concept 46: Quiz: OR
- Concept 47: Solutions: OR
- Concept 48: Text: Recap & Looking Ahead
-
Lesson 02: SQL Joins
In this lesson, you will learn how to combine data from multiple tables together.
- Concept 01: Video: Motivation
- Concept 02: Video: Why Would We Want to Split Data Into Separate Tables?
- Concept 03: Video: Introduction to JOINs
- Concept 04: Text + Quiz: Your First JOIN
- Concept 05: Solution: Your First JOIN
- Concept 06: Text: ERD Reminder
- Concept 07: Text: Primary and Foreign Keys
- Concept 08: Quiz: Primary - Foreign Key Relationship
- Concept 09: Text + Quiz: JOIN Revisited
- Concept 10: Video: Alias
- Concept 11: Quiz: JOIN Questions Part I
- Concept 12: Solutions: JOIN Questions Part I
- Concept 13: Video: Motivation for Other JOINs
- Concept 14: Video: LEFT and RIGHT JOINs
- Concept 15: Text: Other JOIN Notes
- Concept 16: LEFT and RIGHT JOIN
- Concept 17: Solutions: LEFT and RIGHT JOIN
- Concept 18: Video: JOINs and Filtering
- Concept 19: Quiz: Last Check
- Concept 20: Solutions: Last Check
- Concept 21: Text: Recap & Looking Ahead
-
Lesson 03: SQL Aggregations
In this lesson, you will learn how to aggregate data using SQL functions like SUM, AVG, and COUNT. Additionally, CASE, HAVING, and DATE functions give you a powerful problem-solving toolkit.
- Concept 01: Video: Introduction to Aggregation
- Concept 02: Video: Introduction to NULLs
- Concept 03: Video: NULLs and Aggregation
- Concept 04: Video + Text: First Aggregation - COUNT
- Concept 05: Video: COUNT & NULLs
- Concept 06: Video: SUM
- Concept 07: Quiz: SUM
- Concept 08: Solution: SUM
- Concept 09: Video: MIN & MAX
- Concept 10: Video: AVG
- Concept 11: Quiz: MIN, MAX, & AVG
- Concept 12: Solutions: MIN, MAX, & AVG
- Concept 13: Video: GROUP BY
- Concept 14: Quiz: GROUP BY
- Concept 15: Solutions: GROUP BY
- Concept 16: Video: GROUP BY Part II
- Concept 17: Quiz: GROUP BY Part II
- Concept 18: Solutions: GROUP BY Part II
- Concept 19: Video: DISTINCT
- Concept 20: Quiz: DISTINCT
- Concept 21: Solutions: DISTINCT
- Concept 22: Video: HAVING
- Concept 23: HAVING
- Concept 24: Solutions: HAVING
- Concept 25: Video: DATE Functions
- Concept 26: Video: DATE Functions II
- Concept 27: Quiz: DATE Functions
- Concept 28: Solutions: DATE Functions
- Concept 29: Video: CASE Statements
- Concept 30: Video: CASE & Aggregations
- Concept 31: Quiz: CASE
- Concept 32: Solutions: CASE
- Concept 33: Text: Recap
-
Lesson 04: SQL Subqueries & Temporary Tables
In this lesson, you will learn to answer more complex business questions using nested queries, also known as subqueries.
- Concept 01: Video: Introduction
- Concept 02: Video: Introduction to Subqueries
- Concept 03: Video + Quiz: Write Your First Subquery
- Concept 04: Solutions: Write Your First Subquery
- Concept 05: Text: Subquery Formatting
- Concept 06: Video: More On Subqueries
- Concept 07: Quiz: More On Subqueries
- Concept 08: Solutions: More On Subqueries
- Concept 09: Quiz: Subquery Mania
- Concept 10: Solution: Subquery Mania
- Concept 11: Video: WITH
- Concept 12: Text + Quiz: WITH vs. Subquery
- Concept 13: Quiz: WITH
- Concept 14: Solutions: WITH
- Concept 15: Video: Subquery Conclusion
-
Lesson 05: SQL Data Cleaning
Cleaning data is an important part of the data analysis process. In this lesson, you will learn how to clean data using SQL.
- Concept 01: Video: Introduction to SQL Data Cleaning
- Concept 02: Video: LEFT & RIGHT
- Concept 03: Quiz: LEFT & RIGHT
- Concept 04: Solutions: LEFT & RIGHT
- Concept 05: Video: POSITION, STRPOS, & SUBSTR
- Concept 06: Quiz: POSITION, STRPOS, & SUBSTR - SAME DATA AS QUIZ 1
- Concept 07: Solutions: POSITION, STRPOS, & SUBSTR
- Concept 08: Video: CONCAT
- Concept 09: Quiz: CONCAT
- Concept 10: Solutions: CONCAT
- Concept 11: Video: CAST
- Concept 12: Quiz: CAST
- Concept 13: Solutions: CAST
- Concept 14: Video: COALESCE
- Concept 15: Quiz: COALESCE
- Concept 16: Solutions: COALESCE
- Concept 17: Video + Text: Recap
-
Lesson 06: SQL Window Functions
Compare one row to another without doing any joins using one of the most powerful concepts in SQL data analysis: window functions.
- Concept 01: Video: Introduction to Window Functions
- Concept 02: Video: Window Functions 1
- Concept 03: Quiz: Window Functions 1
- Concept 04: Solutions: Window Functions 1
- Concept 05: Quiz: Window Functions 2
- Concept 06: Solutions: Window Functions 2
- Concept 07: Video: ROW_NUMBER & RANK
- Concept 08: Quiz: ROW_NUMBER & RANK
- Concept 09: Solutions: ROW_NUMBER & RANK
- Concept 10: Video: Aggregates in Window Functions
- Concept 11: Quiz: Aggregates in Window Functions
- Concept 12: Solutions: Aggregates in Window Functions
- Concept 13: Video: Aliases for Multiple Window Functions
- Concept 14: Quiz: Aliases for Multiple Window Functions
- Concept 15: Solutions: Aliases for Multiple Window Functions
- Concept 16: Video: Comparing a Row to Previous Row
- Concept 17: Quiz: Comparing a Row to Previous Row
- Concept 18: Solutions: Comparing a Row to Previous Row
- Concept 19: Video: Introduction to Percentiles
- Concept 20: Video: Percentiles
- Concept 21: Quiz: Percentiles
- Concept 22: Solutions: Percentiles
- Concept 23: Video: Recap
-
Lesson 07: SQL Advanced JOINs & Performance Tuning
Learn advanced joins and how to make queries that run quickly across giant datasets. Most of the examples in the lesson involve edge cases, some of which come up in interviews.
- Concept 01: Video: Introduction to Advanced SQL
- Concept 02: Text + Images: FULL OUTER JOIN
- Concept 03: Quiz: FULL OUTER JOIN
- Concept 04: Solutions: FULL OUTER JOIN
- Concept 05: Video: JOINs with Comparison Operators
- Concept 06: Quiz: JOINs with Comparison Operators
- Concept 07: Solutions: JOINs with Comparison Operators
- Concept 08: Video: Self JOINs
- Concept 09: Quiz: Self JOINs
- Concept 10: Solutions: Self JOINs
- Concept 11: Video: UNION
- Concept 12: Quiz: UNION
- Concept 13: Solutions: UNION
- Concept 14: Video: Performance Tuning Motivation
- Concept 15: Video + Quiz: Performance Tuning 1
- Concept 16: Video: Performance Tuning 2
- Concept 17: Video: Performance Tuning 3
- Concept 18: Video: JOINing Subqueries
- Concept 19: More Practice!
- Concept 20: Video: SQL Completion Congratulations
-
Part 09 (Elective): Command Line Essentials
The Unix shell is a powerful tool for developers of all sorts. In this lesson, you'll get a quick introduction to the very basics of using it on your own computer.
-
Module 01: Command Line Essentials
-
Lesson 01: Shell Workshop
The Unix shell is a powerful tool for developers of all sorts. In this lesson, you'll get a quick introduction to the very basics of using it on your own computer.
- Concept 01: Introduction to Unix Shell
- Concept 02: Intro to the Shell
- Concept 03: Windows: Installing Git Bash
- Concept 04: Opening a terminal
- Concept 05: Your first command (echo)
- Concept 06: Navigating directories (ls, cd, ..)
- Concept 07: Current working directory (pwd)
- Concept 08: Parameters and options (ls -l)
- Concept 09: Organizing your files (mkdir, mv)
- Concept 10: Downloading (curl)
- Concept 11: Viewing files (cat, less)
- Concept 12: Removing things (rm, rmdir)
- Concept 13: Searching and pipes (grep, wc)
- Concept 14: Shell and environment variables
- Concept 15: Startup files (.bash_profile)
- Concept 16: Controlling the shell prompt ($PS1)
- Concept 17: Aliases
- Concept 18: Keep learning!
-
Part 10 (Elective): Git and GitHub
Learn how to use version control to save and share your projects with others.
-
Module 01: Git and GitHub
-
Lesson 01: What is Version Control?
Version control is an incredibly important part of a professional programmer's life. In this lesson, you'll learn about the benefits of version control and install the version control tool Git!
-
Lesson 02: Create a Git Repo
Now that you've learned the benefits of version control and installed Git, it's time to learn how to create a repository.
-
Lesson 03: Review A Repo's History
Knowing how to review an existing Git repository's history of commits is extremely important. You'll learn how to do just that in this lesson.
-
Lesson 04: Add Commits to A Repo
A repository is nothing without commits. In this lesson, you'll learn how to make commits, write descriptive commit messages, and verify the changes you're about to save to the repository.
-
Lesson 05: Tagging, Branching, and Merging
Being able to work on your project in isolation from other changes will multiply your productivity. You'll learn how to do this isolated development with Git's branches.
-
Lesson 06: Undoing Changes
Help! Disaster has struck! You don't have to worry, though, because your project is tracked in version control! You'll learn how to undo and modify changes that have been saved to the repository.
-
Lesson 07: Working With Remotes
You'll learn how to create remote repositories on GitHub and how to get and send changes to the remote repository.
-
Lesson 08: Working On Another Developer's Repository
In this lesson, you'll learn how to fork another developer's project. Collaborating with other developers can be a tricky process, so you'll learn how to contribute to a public project.
-
Lesson 09: Staying In Sync With A Remote Repository
You'll learn how to send suggested changes to another developer by using pull requests. You'll also learn how to use the powerful git rebase command to squash commits together.
-