  • Main Menu
GreaterHeight Technologies LLC ~ GreaterHeight Academy
  • All Courses
    • BI and Visualization
      • Mastering Data and Business Analytics
        • Basic Excel for Data Analysis
        • Intermediate and Advanced Excel for Data Analysis
        • Excel for Business Analysis & Analyst
        • PivotTable, PowerPivot, PowerQuery & DAX for Data Analysis
        • Data Analytics and Visualization with Tableau
        • Data Analytics with Power-BI
        • Data Analytics and Visualisation with SQL
      • Mastering Python for Data Analytics and Visualization
        • Python Foundation for Data Analytics
        • Data Analysis Using Python With Numpy and Pandas
        • Data Visualization Using Python with Matplotlib and Seaborn
        • Data Science with SQL Server and Azure SQL Database
        • Data Analytics and Visualisation with PowerBI
      • Complete Microsoft Excel Master Program
        • Basic Excel for Data Analysis
        • Excel Interactive Dashboard for Data Analysis
        • Intermediate and Advanced Excel for Data Analysis
        • PivotTable, PowerPivot, PowerQuery & DAX for Data Analysis
        • Excel for Data Analysis and Visualization
        • Excel for Business Analysis & Analyst
      • Data Analytics With SQL Master Program
      • Master Data Analytics With PowerBI
      • Financial Reporting with PowerBI
      • Data Analysts with Power-BI
      • Data Analytics and Visualization with Excel
    • Mastering Python
      • Python Developer Masters Program
        • Python Programming Certification Course
        • Data Science With Python Certification
        • Artificial Intelligence Certification Course
        • PySpark Certification Training Course
        • Python Statistics for Data Science
      • The Complete Python Developer
      • Data Analysis and Visualization with Python
      • Complete Data Scientist with Python
      • Data Engineer with SQL and Python
      • Machine Learning Engineer with Python
    • Azure Cloud Computing
      • DevOps Engineer and Solutions Architect Master Program
      • Greaterheight Azure GH-602 Cloud Solution Architect Master
      • Greaterheight Azure GH-601 Cloud DevOps Master
      • Microsoft Azure az-900 Fundamentals
      • Microsoft Azure az-104 Administrator
      • Microsoft Azure az-204 Developer
      • Microsoft Azure az-305 Solutions Architect
      • Microsoft Azure az-400 DevOps Engineer
      • Microsoft Azure AI-900 Fundamentals
      • Microsoft Azure DP-100 Data Science
    • SQL and SQL-Server Database
      • Mastering SQL Server Development
      • Data Analytics With SQL Master Program
      • Data Engineer Course Online Masters Program
      • Data Science with SQL Server and Azure SQL Database
    • DevOps Development Program
      • DevOps Engineer & Solution Architect Expert Program
    • Data Science
      • Data Science With Python Certification
      • Python Statistics for Data Science
      • Data Science with SQL Server and Azure SQL Database
      • Complete Data Scientist with Python
  • Who We Serve
    • Individuals
    • Business
    • Universities
  • Partners
    • Employer Networks
    • Community Partnership
    • Opportunity Funds
    • Future Finance
    • Scholarships
  • Resources
    • Webinars
    • Blog
    • Tutorials
    • White Papers
    • Podcast
    • Events
  • Get Advice



Complete Data Engineer With SQL & Python


The first course teaches you the fundamental concepts of data engineering, including the Extract-Transform-Load (ETL) and Extract-Load-Transform (ELT) workflows. The second course takes you on the advanced journey to becoming a Data Engineer, and is ideal for those with foundational SQL knowledge from our Fundamental Data Engineer course.


Get Advice

Complete Data Engineer With SQL & Python Courses


Who this course is for:

  • Computer Science or IT students, or other graduates with a passion to get into IT.
  • Data Warehouse Developers who want to transition to Data Engineering roles.
  • ETL Developers who want to transition to Data Engineering roles.
  • Database or PL/SQL Developers who want to transition to Data Engineering roles.
  • BI Developers who want to transition to Data Engineering roles.
  • QA Engineers who want to learn about Data Engineering.
  • Application Developers who want to gain Data Engineering skills.


What you will Learn:

  • Set up an environment to learn SQL and Python essentials for Data Engineering.
  • Database essentials for Data Engineering using Postgres, such as creating tables and indexes, running SQL queries, and using important predefined functions.
  • Data Engineering programming essentials using Python, such as basic programming constructs, collections, Pandas, and database programming.
  • Data Engineering using Spark DataFrame APIs (PySpark) on Databricks. Learn all the important Spark DataFrame APIs, such as select, filter, groupBy, and orderBy.
  • Data Engineering using Spark SQL (PySpark and Spark SQL). Learn how to write high-quality Spark SQL queries using SELECT, WHERE, GROUP BY, ORDER BY, etc.
  • The relevance of the Spark Metastore, and the integration of DataFrames and Spark SQL.
  • The ability to build Data Engineering pipelines using Spark, leveraging Python as the programming language.
  • Use of different file formats such as Parquet, JSON, and CSV in building Data Engineering pipelines.
  • Setting up Hadoop and Spark clusters on GCP using Dataproc.
  • Understanding the complete Spark application development life cycle to build Spark applications using PySpark, and reviewing applications using the Spark UI.
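The SELECT / WHERE / GROUP BY / ORDER BY pattern named above works the same way in Spark SQL as in standard SQL. A minimal sketch, shown here with Python's built-in sqlite3 rather than a Spark cluster so it runs anywhere; the orders table and its rows are hypothetical illustration data:

```python
import sqlite3

# Hypothetical sales data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 100.0), ("west", 250.0), ("east", 50.0), ("west", 75.0)],
)

# Total sales per region, largest first -- the same clause order
# (SELECT, WHERE, GROUP BY, ORDER BY) applies in a Spark SQL query.
rows = conn.execute(
    """
    SELECT region, SUM(amount) AS total
    FROM orders
    WHERE amount > 0
    GROUP BY region
    ORDER BY total DESC
    """
).fetchall()
print(rows)  # [('west', 325.0), ('east', 150.0)]
```

In PySpark the equivalent would be expressed either as the same SQL string passed to `spark.sql(...)` or as chained DataFrame calls (`groupBy`, `agg`, `orderBy`).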

Course Benefits & Key Features

Complete Data Engineer with SQL & Python’s benefits and key features:

  • Modules: 30+ modules
  • Lessons: 80+ lessons
  • Practical: 40+ hands-on labs
  • Live Projects: 5+ projects
  • Resume: CV preparation
  • Job: Job references
  • Recording: Session recordings
  • Interviews: Mock interviews
  • Support: On-the-job support
  • Membership: Membership access
  • Networks: Networking
  • Certification: Certificate of completion


INSTRUCTOR-LED LIVE ONLINE CLASSES

Our learn-by-building-projects method enables you to build practical coding experience that sticks. 95% of our learners say they are more confident and remember more when they learn by building the real-world projects their jobs require.


  • Get step-by-step guidance to practice your skills without getting stuck
  • Validate your technical problem-solving skills in a real environment
  • Troubleshoot complex scenarios to practice what you learned
  • Develop production experience that translates to the real world

Python Developer Program Job Outlook

Ranked #1 Programming Language

TIOBE and PYPL rank Python as the most popular programming language in the world.

Python Salary Trend

The average salary for a Python Developer is $114,489 per year in the United States.

44.8% Compound Annual Growth Rate (CAGR)

The global Python market size is expected to reach USD 100.6 million by 2030.

Why Data Engineer with SQL & Python?

The Backbone of Data Science

Data engineers are on the front lines of data strategy so that others don’t need to be. They are the first people to tackle the influx of structured and unstructured data that enters a company’s systems.


Technically Challenging

One of the Python functions data analysts and scientists use the most is read_csv — from the pandas library. It reads tabular data stored in a text file into Python, so that it can be explored and manipulated.
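A minimal `read_csv` sketch of the idea above. The column names and values are invented for illustration, and the data is read from an in-memory string rather than a file on disk (`pd.read_csv` accepts any file-like object as well as a path):

```python
import io
import pandas as pd

# Hypothetical tabular text that would normally live in a .csv file.
csv_text = "city,population\nLagos,15000000\nAccra,2500000\n"

# read_csv turns the text into a DataFrame ready for exploration.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)                   # (2, 2)
print(int(df["population"].sum()))  # 17500000
```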


It's Rewarding

Every day, we create 2.5 quintillion bytes of data. Business Insider reports that there will be more than 64 billion IoT devices by 2025, up from about 10 billion in 2018 and 9 billion in 2017.


It Pays Well

According to IBM’s The Quant Crunch: “Jobs specifying machine learning skills pay an average of $114,000. Advertised data scientist jobs pay an average of $105,000 and advertised data engineering jobs pay an average of $117,000.”

Become a Leader

Being a central part of an organization’s decision-making processes, analytics experts often pick up strong leadership skills as well.


It’s Valuable Even If You Don’t Want to Be a Data Engineer

Even if you don’t want to pursue a career as a data engineer, if you want to work in data science, it can be very useful to have some knowledge of data engineering.




GreaterHeight Certificates holders are prepared to work at companies like these.

Some Alumni Testimonies

Investing in the course "Become a Data Analyst" with GreaterHeight Academy is great value for the money, and I highly recommend it. The trainer is very knowledgeable and very engaging, provided us with quality training sessions on all courses, and was easily accessible for queries. We also had access to the course materials, and the timely availability of the recorded videos made learning easy and aided the process.

QUEEN OBIWULU

Team Lead, Customer Success

The training was fantastic; the instructor is an awesome lecturer, relentless and tireless in his delivery. He obviously enjoys teaching; it comes naturally to him. We got more than we expected. He extended my knowledge of Excel beyond what I knew, and the courses were brilliantly delivered. They reach out, follow up, and ask questions; in fact, the support has been great. They are highly recommended, and I would definitely subscribe to other training programs from them.

BISOLA OGUNRO

Fraud Analytics Risk Oversight Manager

It's one thing to look for just a Data Analysis training, and it's another to get the knowledge transferred through certified professional trainers. No matter your initial level of proficiency in any of the Data Analysis tools, GreaterHeight Academy will meet you there and take you up to a highly proficient and confident level in a short time, at a reasonable pace. I learnt a lot of Data Analysis tools and skills at GreaterHeight from patient and resourceful teachers.

TUNDE MEREDITH

Operation Director - Abbfem Technology

The Data Analysis training program was one of the best I have attended. The way GreaterHeight took off with Excel and concluded the four courses with Excel was mind-blowing. I concluded that I'm on the right path with the right mentor to take me from a novice to a professional. GreaterHeight is the best as far as imparting Data Analysis knowledge is concerned. I would shout it from the rooftop to recommend GreaterHeight to any trainee who really wants to learn.

JOHN OSI PETER

Greaterheight

I wanted to take a moment to express my deepest gratitude for the opportunity to study data analytics at GreaterHeight Academy. I am truly impressed by the level of dedication and support that the sponsor and CEO have put into this program. GreaterHeight Academy is without a doubt the best tech institution out there, providing top-notch education and resources for its students. One of the advantages of studying at GreaterHeight Academy is the access to the best tools and technologies in the field. 

AYODELE PAYNE

Sales/Data Analyst

It is an unforgettable experience that will surely stand the test of time, learning to become a Data Analyst with GreaterHeight Academy. The lecture delivery was impactful, and the trainer is vastly knowledgeable in using the applicable tools for the sessions, always ready to go the extra mile with you. The support you get during and after the lectures is top-notch, with materials and resources available to build your confidence on and off the job.

ADEBAYO OLADEJO

Customer Service Advisor (Special Operations)

Fundamental Data Engineer with SQL Course


In this course, you'll learn the fundamental concepts of data engineering, including the Extract-Transform-Load (ETL) and Extract-Load-Transform (ELT) workflows. You'll discover how to interact with relational databases such as PostgreSQL to store, modify, and query data. Moving through the track, you'll pick up techniques for querying structured data using SQL, including joining multiple tables, calculating aggregated statistics, filtering, grouping, and writing subqueries. Switching gears, you'll go on to discover database design principles such as star and snowflake schemas, and normalization. You'll use this knowledge to perform typical data engineering tasks such as creating, altering, and deleting tables, and enforcing data consistency by casting data to different data types. The track also shows how you can download PostgreSQL to your operating system, along with setting up and modifying users. Conclude by learning about data warehouse technologies and familiarizing yourself with Snowflake, a popular cloud technology for data engineering!


Understanding Data Engineering
Understand the Basics of Data Engineering

In this course, you’ll learn about a data engineer’s core responsibilities, how they differ from data scientists, and how they facilitate the flow of data through an organization. Through hands-on exercises, you’ll follow Spotflix, a fictional music streaming company, to understand how its data engineers collect, clean, and catalog their data.

Apply in Personal Cases

By the end of the course, you’ll understand what your company's data engineers do, be ready to have a conversation with a data engineer, and have a solid foundation to start your own data engineer journey.

3 Modules | 4+ Hours | 3+ Skills

Course Modules 


In this module, you’ll learn what data engineering is and why demand for data engineers is increasing. You’ll then discover where data engineering sits in relation to the data science lifecycle, how data engineers differ from data scientists, and get an introduction to your first complete data pipeline.


  1. Data engineering and big data
  2. Go with the flow
  3. Not responsible
  4. Big time
  5. Data engineers vs. data scientists
  6. Tell me the truth
  7. Who is it
  8. The data pipeline
  9. It's not true
  10. Pipeline

It’s time to talk about data storage—one of the main responsibilities for a data engineer. In this module, you’ll learn how data engineers manage different data structures, work in SQL—the programming language of choice for querying and storing data, and implement appropriate data storage solutions with data lakes and data warehouses.


  1. Data structures
  2. Structures
  3. What's the difference
  4. SQL databases
  5. We can work it out
  6. Columns
  7. Different breeds
  8. Data warehouses and data lakes
  9. Tell the truth
  10. Our warehouse (in the middle of our street)

Data engineers make life easy for data scientists by preparing raw data for analysis using different processing techniques at different steps. These steps need to be combined to create pipelines, which is when automation comes into play. Finally, data engineers use parallel and cloud computing to keep pipelines flowing smoothly.


  1. Processing data
  2. Connect the dots
  3. Scheduling data
  4. Schedules
  5. One or the other
  6. Parallel computing
  7. Whenever, whenever
  8. Parallel universe
  9. Cloud computing
  10. Obscured by clouds
  11. Somewhere I belong
  12. We are the champions
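The processing, scheduling, and pipeline ideas in this module can be sketched as a toy extract-transform-load chain in plain Python. All the function names and data here are hypothetical; a real pipeline would swap these stubs for database reads, cleaning logic, and warehouse writes, with a scheduler running the chain automatically:

```python
# A toy data pipeline: each step is a function, and the pipeline is the
# composition of the three. The data mimics listening events from a
# fictional streaming app.

def extract():
    # Stand-in for pulling raw events from an application database.
    return [{"user": "a", "ms_played": 1000}, {"user": "b", "ms_played": 2500}]

def transform(rows):
    # Clean and reshape: convert milliseconds to seconds.
    return [{"user": r["user"], "seconds": r["ms_played"] / 1000} for r in rows]

def load(rows, warehouse):
    # Stand-in for writing to a data warehouse table.
    warehouse.extend(rows)

warehouse = []
load(transform(extract()), warehouse)
print(warehouse)  # [{'user': 'a', 'seconds': 1.0}, {'user': 'b', 'seconds': 2.5}]
```

Parallel computing then means running `transform` on many chunks of rows at once, and cloud computing means renting the machines that do it.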


Introduction to SQL
Learn how Relational Databases are Organized
SQL is an essential language for building and maintaining relational databases, which opens the door to a range of careers in the data industry and beyond. You’ll start this course by covering data organization, tables, and best practices for database construction.

Write Your First SQL Queries
The second half of this course looks at creating SQL queries for selecting data that you need from your database. You’ll have the chance to practice your querying skills before moving on to customizing and saving your results.

Understand the Difference Between PostgreSQL and SQL Server
PostgreSQL and SQL Server are two of the most popular SQL flavors. You’ll finish off this course by looking at the differences, benefits, and applications of each. By the end of the course, you’ll have some hands-on experience in learning SQL and the grounding to start applying it to projects or continue your learning in a more specialized direction.

5 Modules | 3+ Hours | 4 Skills

Course Modules 


Before writing any SQL queries, it’s important to understand the underlying data. In this module, we’ll discover the role of SQL in creating and querying relational databases. Using a database for a local library, we will explore database and table organization, data types and storage, and best practices for database construction.


  1. Introduction to Database Management System
  2. What are the advantages of databases?
  3. Data organization
  4. Introduction to SQL
  5. Tables in SQL
  6. Views in SQL
  7. Table vs Views
  8. Picking a unique ID
  9. Setting the table in style
  10. Finding data types
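The table-versus-view distinction from this module can be shown in a few lines. A minimal sketch using Python's built-in sqlite3; the books table, its columns, and its rows are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A table stores the data itself; the primary key gives each row a unique ID.
conn.execute(
    "CREATE TABLE books (book_id INTEGER PRIMARY KEY, title TEXT, year INTEGER)"
)
conn.execute("INSERT INTO books VALUES (1, 'Dune', 1965), (2, 'Emma', 1815)")

# A view stores only the query: it is re-evaluated against books each
# time it is selected from, so it always reflects the current data.
conn.execute(
    "CREATE VIEW modern_books AS SELECT title FROM books WHERE year > 1900"
)
titles = [t for (t,) in conn.execute("SELECT title FROM modern_books")]
print(titles)  # ['Dune']
```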

  1. Introduction
  2. Entity Relationship Model
  3. Relationships in SQL
  4. Recap


  1. Introduction
  2. Downloading SQL Developer Edition
  3. Installing SQL Developer Edition
  4. Connecting to SQL Server
  5. Downloading Sample SQL Database in SQL Management Studio (SSMS)
  6. Configuring SQL Server, and SSMS
  7. Recap


  1. Database Manipulation in SQL
  2. SQL Storage Engines
  3. Creating and Managing Tables in SQL
  4. Creating and Managing Tables in SQL: CREATE, DESCRIBE, and SHOW Tables
  5. Creating and Managing Tables in SQL: ALTER, TRUNCATE, and DROP Tables
  6. Inserting and Querying Data in Tables
  7. Filtering Data From Tables in SQL
  8. Filtering Data From Tables in SQL: WHERE and DISTINCT Clauses
  9. Filtering Data From Tables in SQL: AND and OR Operators
  10. Filtering Data From Tables in SQL: IN and NOT IN Operators
  11. Filtering Data From Tables in SQL: BETWEEN and LIKE Operators
  12. Filtering Data From Tables in SQL: TOP, IS NULL, and IS NOT NULL Operators
  13. Sorting Table Data
  14. Recap

Learn your first SQL keywords for selecting relevant data from database tables! After practicing querying skills in a database of books, you’ll customize query results using aliasing and save them as views so they can be shared. Finally, you’ll explore the differences between SQL flavors and databases such as SQL Server.


  1. Introducing queries
  2. SQL strengths
  3. Developing SQL style
  4. Querying the books table
  5. Writing queries
  6. Comments in SQL
  7. Making queries DISTINCT
  8. Aliasing
  9. Viewing your query
  10. SQL flavors
  11. Comparing flavors
  12. Limiting results
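The first-query keywords in this module (SELECT, DISTINCT, aliasing with AS) can be tried in a few lines. A minimal sketch using Python's built-in sqlite3; the books table and its rows are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, genre TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [("Dune", "sci-fi"), ("Emma", "romance"), ("Foundation", "sci-fi")],
)

# DISTINCT removes the duplicate 'sci-fi'; AS renames the output column.
genres = conn.execute(
    "SELECT DISTINCT genre AS category FROM books ORDER BY category"
).fetchall()
print(genres)  # [('romance',), ('sci-fi',)]
```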


Intermediate SQL
SQL is widely recognized as the most popular language for turning raw data stored in a database into actionable insights. This course uses a films database to teach how to navigate and extract insights from the data using SQL.

Discover Filtering with SQL
You'll discover techniques for filtering and comparing data, enabling you to extract specific information to gain insights and answer questions about the data.

Get Acquainted with Aggregation
Next, you'll get a taste of aggregate functions, essential for summarizing data effectively and gaining valuable insights from large datasets. You'll also combine this with sorting and grouping data, adding another layer of meaning to your insights and analysis.

Write Clean Queries
Finally, you'll be shown some tips and best practices for presenting your data and queries neatly. Throughout the course, you'll have hands-on practice queries to solidify your understanding of the concepts. By the end of the course, you'll have everything you need to know to analyze data using your own SQL code today!

7 Modules | 8+ Hours | 5+ Skills

Course Modules 


In this first module, you’ll learn how to query a films database and select the data needed to answer questions about the movies and actors. You'll also understand how SQL code is executed and formatted.


  1. SELECT Statement
  2. SELECT DISTINCT
  3. Query execution
  4. Order of execution
  5. SQL style
  6. SQL best practices
  7. Formatting
  8. Non-standard fields

  1. Arithmetic Operators: +, -, *, /, %
  2. Comparison Operators: =, >, <, >=, <=, <>, !=
  3. Logical Operators: AND, OR, NOT
  4. Special Operators: LIKE, IN, NOT, NOT EQUAL, IS NULL, BETWEEN, ALL and ANY, EXISTS
  5. Set Operators: UNION, UNION ALL, INTERSECT, EXCEPT

Learn about how you can filter numerical and textual data with SQL. Filtering is an important use for this language. You’ll learn how to use new keywords and operators to help you narrow down your query to get results that meet your desired criteria and gain a better understanding of NULL values and how to handle them.


  1. Filtering numbers
  2. Filtering results
  3. Using WHERE with numbers
  4. Using WHERE with text
  5. Multiple criteria
  6. Using AND
  7. Using OR
  8. Using BETWEEN
  9. Filtering text
  10. LIKE and NOT LIKE
  11. WHERE IN
  12. Combining filtering and selecting
  13. Understanding NULL values
  14. Practice with NULLs
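The filtering keywords above can be combined in one small session. A minimal sketch using Python's built-in sqlite3; the films table and its rows are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (title TEXT, year INTEGER, gross REAL)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?, ?)",
    [("Alien", 1979, 104.9), ("Arrival", 2016, 203.4), ("Avatar", 2009, None)],
)

# BETWEEN: numeric range filtering.
seventies = conn.execute(
    "SELECT title FROM films WHERE year BETWEEN 1970 AND 1979"
).fetchall()

# LIKE for text patterns, IN for a list of allowed values, AND to combine.
a_titles = conn.execute(
    "SELECT title FROM films WHERE title LIKE 'A%' AND year IN (2009, 2016) "
    "ORDER BY title"
).fetchall()

# NULL is not equal to anything, so it needs IS NULL rather than = NULL.
no_gross = conn.execute("SELECT title FROM films WHERE gross IS NULL").fetchall()

print(seventies, a_titles, no_gross)
```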

Here, we will teach you how to sort and group data. These skills will take your analyses to a new level by helping you uncover critical business insights and identify trends and performance. You'll get hands-on experience to determine which films performed the best and how movie durations and budgets changed over time.


  1. Sorting results
  2. Sorting text
  3. The SQL ORDER BY
  4. ORDER BY - ascending
  5. ORDER BY - descending
  6. Sorting single fields
  7. Sorting multiple fields

  1. Data Definition Language (DDL): CREATE, DROP, ALTER, TRUNCATE
  2. Data Query Language (DQL): SELECT, WHERE
  3. Data Manipulation Language (DML): INSERT, UPDATE, DELETE

  1. NOT NULL Constraints
  2. UNIQUE Constraints
  3. Primary Key Constraints
  4. Foreign Key Constraints
  5. Composite Key
  6. Unique Constraints
  7. Alternate Key
  8. CHECK Constraints
  9. DEFAULT Constraints
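Several of the constraints listed above can sit on one table definition. A minimal sketch using Python's built-in sqlite3; the accounts table is hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# NOT NULL, UNIQUE, CHECK, DEFAULT, and a PRIMARY KEY on one table.
conn.execute(
    """
    CREATE TABLE accounts (
        account_id INTEGER PRIMARY KEY,
        email      TEXT NOT NULL UNIQUE,
        balance    REAL NOT NULL DEFAULT 0 CHECK (balance >= 0)
    )
    """
)
conn.execute("INSERT INTO accounts (email) VALUES ('a@example.com')")

# Violating UNIQUE raises an error instead of silently corrupting the data.
try:
    conn.execute("INSERT INTO accounts (email) VALUES ('a@example.com')")
    violated = False
except sqlite3.IntegrityError:
    violated = True

balance = conn.execute("SELECT balance FROM accounts").fetchone()[0]
print(violated, balance)  # True 0.0 (DEFAULT filled in the balance)
```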

SQL allows you to zoom in and out to better understand an entire dataset, its subsets, and its individual records. You'll learn to summarize data using aggregate functions and perform basic arithmetic calculations inside queries to gain insights into what makes a successful film.


  1. COUNT, SUM, AVG, MIN, MAX
  2. Summarizing data
  3. Aggregate functions and data types
  4. Practice with aggregate functions
  5. Summarizing subsets
  6. Grouping data
  7. GROUP BY single fields
  8. GROUP BY multiple fields
  9. Answering business questions
  10. Filtering grouped data
  11. Filter with HAVING
  12. HAVING and sorting
  13. Combining aggregate functions with WHERE
  14. Using ROUND()
  15. ROUND() with a negative parameter
  16. Aliasing and arithmetic
  17. Using arithmetic
  18. Aliasing with functions
  19. Rounding results
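The aggregation ideas in this module (COUNT and AVG with GROUP BY, HAVING for filtering groups, ROUND for presentation) fit in one query. A minimal sketch using Python's built-in sqlite3; the films table and its rows are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (genre TEXT, gross REAL)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?)",
    [("sci-fi", 100.0), ("sci-fi", 250.0), ("drama", 80.0)],
)

# Average gross per genre, keeping only genres with more than one film.
# HAVING filters groups after aggregation; WHERE would filter rows before it.
summary = conn.execute(
    """
    SELECT genre, COUNT(*) AS n, ROUND(AVG(gross), 1) AS avg_gross
    FROM films
    GROUP BY genre
    HAVING COUNT(*) > 1
    """
).fetchall()
print(summary)  # [('sci-fi', 2, 175.0)]
```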


Joining Data In SQL
Joining data is an essential skill in data analysis, enabling you to draw information from separate tables together into a single, meaningful set of results. In this comprehensive course on joining data, you'll delve into the intricacies of table joins and relational set theory, learning how to optimize your queries for efficient data retrieval.

Understand Data Joining Fundamentals
You will learn how to work with multiple tables in SQL by navigating and extracting data from various tables within a SQL database using various join types, including inner joins, outer joins, and cross joins. With practice, you'll gain the knowledge of how to select the appropriate join method.

Explore Advanced Data Manipulation Techniques
Next up, you'll explore set theory principles such as unions, intersects, and except clauses, as well as discover the power of nested queries in SQL. Every step is accompanied by exercises and opportunities to apply the theory and grow your confidence in SQL.

5 Modules | 5+ Hours | 4+ Skills

Course Modules 


  1. Introduction to Alias
  2. Introduction to JOINS
  3. Right Cross and Self Join
  4. Operators in SQL
  5. Operators in SQL Updated
  6. Intersect and Emulation
  7. Minus and Emulation
  8. Subquery in SQL
  9. Subqueries with Statements and Operators
  10. Subqueries with Commands
  11. Derived Tables in SQL
  12. EXISTS Operator
  13. NOT EXISTS Operator
  14. EXISTS vs IN Operators
  15. Recap

In this closing Module, you’ll begin by investigating semi-joins and anti-joins. Next, you'll learn how to use nested queries. Last but not least, you’ll wrap up the course with some challenges!

  1. Subquerying with semi joins and anti joins
  2. Multiple WHERE clauses
  3. Semi join
  4. Diagnosing problems using anti join
  5. Subqueries inside WHERE and SELECT
  6. Subquery inside WHERE
  7. WHERE do people live?
  8. Subquery inside SELECT
  9. Subqueries inside FROM
  10. Subquery inside FROM
  11. Subquery challenge
  12. Final challenge
  13. The finish line
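Semi joins and anti joins, as covered above, are usually written as subqueries with IN and NOT IN. A minimal sketch using Python's built-in sqlite3; the countries and presidents tables are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE countries (name TEXT)")
conn.execute("CREATE TABLE presidents (country TEXT, president TEXT)")
conn.executemany(
    "INSERT INTO countries VALUES (?)", [("Chile",), ("Ghana",), ("Japan",)]
)
conn.executemany(
    "INSERT INTO presidents VALUES (?, ?)", [("Chile", "X"), ("Ghana", "Y")]
)

# Semi join: keep countries that DO appear in the presidents table.
semi = conn.execute(
    "SELECT name FROM countries "
    "WHERE name IN (SELECT country FROM presidents) ORDER BY name"
).fetchall()

# Anti join: keep countries that do NOT appear in the presidents table.
anti = conn.execute(
    "SELECT name FROM countries "
    "WHERE name NOT IN (SELECT country FROM presidents)"
).fetchall()

print(semi, anti)  # [('Chile',), ('Ghana',)] [('Japan',)]
```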

In this module, you’ll be introduced to the concept of joining tables and will explore all the ways you can enrich your queries using joins—beginning with inner joins.

  1. The ins and outs of INNER JOIN
  2. Your first join
  3. Joining with aliased tables
  4. USING in action
  5. Defining relationships
  6. Relationships in our database
  7. Inspecting a relationship
  8. Multiple joins
  9. Joining multiple tables
  10. Checking multi-table joins

After familiarizing yourself with inner joins, you will come to grips with different kinds of outer joins. Next, you will learn about cross joins. Finally, you will learn about situations in which you might join a table with itself.

  1. LEFT and RIGHT JOINs
  2. Remembering what is LEFT
  3. This is a LEFT JOIN, right?
  4. Building on your LEFT JOIN
  5. Is this RIGHT?
  6. FULL JOINs
  7. Comparing joins
  8. Chaining FULL JOINs
  9. Crossing into CROSS JOIN
  10. Histories and languages
  11. Choosing your join
  12. Self joins
  13. Comparing a country to itself
  14. All joins on deck

In this module, you will learn about using set theory operations in SQL, with an introduction to UNION, UNION ALL, INTERSECT, and EXCEPT clauses. You’ll explore the predominant ways in which set theory operations differ from join operations.

  1. Set theory for SQL Joins
  2. UNION vs. UNION ALL
  3. Comparing global economies
  4. Comparing two set operations
  5. At the INTERSECT
  6. INTERSECT
  7. Review UNION and INTERSECT
  8. EXCEPT
  9. You've got it, EXCEPT...
  10. Calling all set operators
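The set operations above can be compared side by side. A minimal sketch using Python's built-in sqlite3; the two single-column tables are hypothetical stand-ins for the economies of two years:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE econ2019 (country TEXT)")
conn.execute("CREATE TABLE econ2020 (country TEXT)")
conn.executemany("INSERT INTO econ2019 VALUES (?)", [("A",), ("B",)])
conn.executemany("INSERT INTO econ2020 VALUES (?)", [("B",), ("C",)])

# UNION: rows in either table, duplicates removed (UNION ALL keeps them).
union = conn.execute(
    "SELECT country FROM econ2019 UNION SELECT country FROM econ2020 "
    "ORDER BY country"
).fetchall()

# INTERSECT: rows present in both tables.
inter = conn.execute(
    "SELECT country FROM econ2019 INTERSECT SELECT country FROM econ2020"
).fetchall()

# EXCEPT: rows in the first table but not the second.
exc = conn.execute(
    "SELECT country FROM econ2019 EXCEPT SELECT country FROM econ2020"
).fetchall()

print(union, inter, exc)  # [('A',), ('B',), ('C',)] [('B',)] [('A',)]
```

Unlike joins, set operations stack whole rows vertically rather than combining columns from matching rows.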


Introduction to Relational Databases in SQL
Explore the Role of SQL in Relational Database Management
There are a lot of reasons why SQL is the go-to query language for relational database management. The main one is that SQL is a powerful language that can handle large amounts of data in complex ways and solve tricky analytical questions. In this course, you will gain an introduction to relational databases in SQL.

Learn how to create tables and specify their relationships, as well as how to enforce data integrity. Additionally, discover other unique features of database systems, such as constraints.

Create Your First Database
You begin the section by creating your first database with simple SQL commands. Next, you’ll learn how to update your database as the structure changes by migrating data and deleting tables.

In the final module, you will glue tables in foreign keys together and establish relationships that greatly benefit your data quality. Finally, you will run ad hoc analyses on your new database.

Understand the Basics of Relational Databases
By the end of the course, you will gain a basic yet essential understanding of SQL relational databases. They are widely used in various data science fields (from healthcare to finance) and have consequently become one of the crucial languages for data scientists. If you're interested in deepening your knowledge further, you may be interested in our SQL for Database Administrators, SQL Server Developer, and SQL Server for Database Administrators Tracks!

4 Modules | 5+ Hours | 4+ Skills

Course Modules 


In this module, you'll create your very first database with a set of simple SQL commands. Next, you'll migrate data from existing flat tables into that database. You'll also learn how meta-information about a database can be queried.


  1. Introduction to relational databases
  2. Attributes of relational databases
  3. Query information_schema with SELECT
  4. Tables: At the core of every database
  5. CREATE your first few TABLEs
  6. ADD a COLUMN with ALTER TABLE
  7. Update your database as the structure changes
  8. RENAME and DROP COLUMNs in affiliations
  9. Migrate data with INSERT INTO SELECT DISTINCT
  10. Delete tables with DROP TABLE

After building a simple database, it's now time to make use of the features. You'll specify data types in columns, enforce column uniqueness, and disallow NULL values in this module.


  1. Better data quality with constraints
  2. Types of database constraints
  3. Conforming with data types
  4. Type CASTs
  5. Working with data types
  6. Change types with ALTER COLUMN
  7. Convert types USING a function
  8. The not-null and unique constraints
  9. Disallow NULL values with SET NOT NULL
  10. What happens if you try to enter NULLs?
  11. Make your columns UNIQUE with ADD CONSTRAINT
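The effect of not-null and unique constraints can be sketched with sqlite3. One caveat for this stand-in: SQLite only accepts these constraints at CREATE time, whereas PostgreSQL (used in the course) can add them later with ALTER COLUMN ... SET NOT NULL and ADD CONSTRAINT ... UNIQUE. The table and data are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Declare the constraints up front (a PostgreSQL ALTER TABLE could add them later).
cur.execute("CREATE TABLE students (ssn INTEGER UNIQUE, name TEXT NOT NULL)")

cur.execute("INSERT INTO students VALUES (1, 'Maria')")

errors = []
try:
    cur.execute("INSERT INTO students VALUES (1, 'Omar')")   # duplicate ssn
except sqlite3.IntegrityError as e:
    errors.append(str(e))
try:
    cur.execute("INSERT INTO students VALUES (2, NULL)")     # NULL name
except sqlite3.IntegrityError as e:
    errors.append(str(e))

print(errors)  # both bad inserts were rejected by the database itself
```

This is the point of the module: the database, not your application code, guarantees data quality.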

Now let’s get into the best practices of database engineering. It's time to add primary and foreign keys to the tables. These are two of the most important concepts in databases, and are the building blocks you’ll use to establish relationships between tables.


  1. Keys and superkeys
  2. Get to know SELECT COUNT DISTINCT
  3. Identify keys with SELECT COUNT DISTINCT
  4. Primary keys
  5. Identify the primary key
  6. ADD key CONSTRAINTs to the tables
  7. Surrogate keys
  8. Add a SERIAL surrogate key
  9. CONCATenate columns to a surrogate key
  10. Test your knowledge before advancing
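The key-hunting workflow above can be sketched in a few lines of sqlite3. Note that SERIAL is PostgreSQL syntax; SQLite's equivalent auto-incrementing surrogate is INTEGER PRIMARY KEY. Table and data are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE cars (make TEXT, model TEXT, mpg REAL)")
cur.executemany("INSERT INTO cars VALUES (?, ?, ?)",
                [("Ford", "Focus", 34.0), ("Ford", "Fiesta", 36.0), ("Mazda", "3", 35.0)])

# A candidate key must have as many distinct values as there are rows;
# concatenating columns tests a composite key, as in the CONCAT exercise.
rows, = cur.execute("SELECT COUNT(*) FROM cars").fetchone()
distinct_combos, = cur.execute("SELECT COUNT(DISTINCT make || model) FROM cars").fetchone()
print(rows == distinct_combos)  # (make, model) uniquely identifies each row

# Add an auto-incrementing surrogate key instead.
cur.execute("""CREATE TABLE cars2 (
                 id INTEGER PRIMARY KEY,
                 make TEXT, model TEXT, mpg REAL)""")
cur.execute("INSERT INTO cars2 (make, model, mpg) SELECT * FROM cars")
print(cur.execute("SELECT COUNT(*) FROM cars2").fetchone()[0])  # 3 rows, each with an id
```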

In the final module, you'll leverage foreign keys to connect tables and establish relationships that will greatly benefit your data quality. And you'll run ad hoc analyses on your new database.


  1. Model 1:N relationships with foreign keys
  2. REFERENCE a table with a FOREIGN KEY
  3. Explore foreign key constraints
  4. JOIN tables linked by a foreign key
  5. Model more complex relationships
  6. Add foreign keys to the "affiliations" table
  7. Populate the "professor_id" column
  8. Drop "firstname" and "lastname"
  9. Referential integrity
  10. Referential integrity violations
  11. Change the referential integrity behavior of a key
  12. Roundup
  13. Count affiliations per university
  14. Join all the tables together
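A foreign key in action, again sketched with sqlite3 (which, unlike PostgreSQL, only enforces foreign keys after an explicit PRAGMA). Table names mirror the module's professor/university theme but the data is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this; PostgreSQL enforces FKs by default
cur = conn.cursor()

cur.execute("CREATE TABLE universities (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("""CREATE TABLE professors (
                 id INTEGER PRIMARY KEY,
                 name TEXT,
                 university_id INTEGER REFERENCES universities (id))""")

cur.execute("INSERT INTO universities VALUES (1, 'ETH')")
cur.execute("INSERT INTO professors VALUES (1, 'Karl', 1)")    # OK: university 1 exists

violated = False
try:
    cur.execute("INSERT INTO professors VALUES (2, 'Greta', 99)")  # no university 99
except sqlite3.IntegrityError:
    violated = True
print(violated)  # referential integrity was defended

# Tables linked by the key can be JOINed back together for ad hoc analysis.
row = cur.execute("""SELECT p.name, u.name
                     FROM professors AS p
                     JOIN universities AS u ON p.university_id = u.id""").fetchone()
print(row)  # ('Karl', 'ETH')
```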


Database Design
A good database design is crucial for a high-performance application. Just as you wouldn't start building a house without a blueprint, you need to think about how your data will be stored beforehand. Taking the time to design a database saves time and frustration later on, and a well-designed database ensures ease of access and retrieval of information. Choosing a design means weighing many considerations. In this course, you'll learn how to process, store, and organize data in an efficient way. You'll see how to structure data through normalization and present your data with views. Finally, you'll learn how to manage your database, working with a variety of datasets ranging from book sales and car rentals to music reviews.


4 Modules | 5+ Hours | 4 Skills

Course Modules 


Start your journey into database design by learning about the two approaches to data processing, OLTP and OLAP. In this first module, you'll also get familiar with the different forms data can be stored in and learn the basics of data modeling.


  1. OLTP and OLAP
  2. OLAP vs. OLTP
  3. Which is better?
  4. Storing data
  5. Name that data type!
  6. Ordering ETL Tasks
  7. Recommend a storage solution
  8. Database design
  9. Classifying data models
  10. Deciding fact and dimension tables
  11. Querying the dimensional model

In this module, you will take your data modeling skills to the next level. You'll learn to implement star and snowflake schemas, recognize the importance of normalization and see how to normalize databases to different extents.


  1. Star and snowflake schema
  2. Running from star to snowflake
  3. Adding foreign keys
  4. Extending the book dimension
  5. Normalized and denormalized databases
  6. Querying the star schema
  7. Querying the snowflake schema
  8. Updating countries
  9. Extending the snowflake schema
  10. Normal forms
  11. Converting to 1NF
  12. Converting to 2NF
  13. Converting to 3NF
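A star schema query can be sketched with sqlite3: one fact table of measurements joined to a denormalized dimension table. In a snowflake schema, the genre column below would be split into its own table and reached with one more join. Tables and data are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Star schema: a fact table referencing a denormalized dimension table.
cur.execute("CREATE TABLE dim_book (book_id INTEGER PRIMARY KEY, title TEXT, genre TEXT)")
cur.execute("CREATE TABLE fact_sales (book_id INTEGER, quantity INTEGER)")
cur.executemany("INSERT INTO dim_book VALUES (?, ?, ?)",
                [(1, "Dune", "sci-fi"), (2, "Emma", "classic")])
cur.executemany("INSERT INTO fact_sales VALUES (?, ?)", [(1, 3), (1, 2), (2, 4)])

# Querying the star schema: a single join from fact to dimension.
result = cur.execute("""SELECT d.genre, SUM(f.quantity)
                        FROM fact_sales AS f
                        JOIN dim_book AS d USING (book_id)
                        GROUP BY d.genre
                        ORDER BY d.genre""").fetchall()
print(result)  # [('classic', 4), ('sci-fi', 5)]
```

The trade-off the module explores: the snowflake's extra normalization removes redundancy (each genre stored once) at the cost of longer join chains.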

Get ready to work with views! In this module, you will learn how to create and query views. On top of that, you'll master more advanced capabilities to manage them and end by identifying the difference between materialized and non-materialized views.


  1. Database views
  2. Tables vs. views
  3. Viewing views
  4. Creating and querying a view
  5. Managing views
  6. Creating a view from other views
  7. Granting and revoking access
  8. Updatable views
  9. Redefining a view
  10. Materialized views
  11. Materialized versus non-materialized
  12. Creating and refreshing a materialized view
  13. Managing materialized views
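A (non-materialized) view stores the query, not the data, so it re-runs on every access. The sqlite3 sketch below shows this; SQLite has no materialized views, which in PostgreSQL would instead store the result set until refreshed. Data is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE reviews (album TEXT, score REAL)")
cur.executemany("INSERT INTO reviews VALUES (?, ?)",
                [("Blue", 9.0), ("Kid A", 9.5), ("Pablo Honey", 5.0)])

# The view is a saved query; it can be queried like a table.
cur.execute("CREATE VIEW high_scores AS SELECT * FROM reviews WHERE score >= 9")
print(cur.execute("SELECT COUNT(*) FROM high_scores").fetchone()[0])  # 2

# New rows matching the condition appear in the view automatically.
cur.execute("INSERT INTO reviews VALUES ('Loveless', 9.1)")
print(cur.execute("SELECT COUNT(*) FROM high_scores").fetchone()[0])  # 3
```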

This final module covers several database management topics. You will learn how to grant database access based on user roles, how to partition tables into smaller pieces, what to keep in mind when integrating data, and which DBMS best fits your business needs.


  1. Database roles and access control
  2. Create a role
  3. GRANT privileges and ALTER attributes
  4. Add a user role to a group role
  5. Table partitioning
  6. Reasons to partition
  7. Partitioning and normalization
  8. Creating vertical partitions
  9. Creating horizontal partitions
  10. Data integration
  11. Data integration dos and don'ts
  12. Analyzing a data integration plan
  13. Picking a Database Management System (DBMS)
  14. SQL versus NoSQL
  15. Choosing the right DBMS
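Horizontal partitioning splits a table's rows across smaller tables with identical columns. PostgreSQL automates this with PARTITION BY; the sqlite3 sketch below does it by hand to show the idea, with invented sales data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# One partition per year, all with the same columns.
for year in (2023, 2024):
    cur.execute(f"CREATE TABLE sales_{year} (sold_on TEXT, amount REAL)")

# Route each row to its partition by the partitioning rule (here, the year).
data = [("2023-05-01", 10.0), ("2024-01-15", 20.0), ("2024-06-30", 5.0)]
for sold_on, amount in data:
    cur.execute(f"INSERT INTO sales_{sold_on[:4]} VALUES (?, ?)", (sold_on, amount))

# A UNION ALL view reassembles the partitions for whole-table queries.
cur.execute("""CREATE VIEW sales AS
               SELECT * FROM sales_2023 UNION ALL SELECT * FROM sales_2024""")
print(cur.execute("SELECT COUNT(*) FROM sales_2024").fetchone()[0])  # 2: one year's slice
print(cur.execute("SELECT COUNT(*) FROM sales").fetchone()[0])       # 3: the full table
```

Queries that filter on the partition key can then touch only the relevant slice, which is the main reason to partition.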


Data Warehousing Concepts
This introductory and conceptual course will help you understand the fundamentals of data warehousing. You’ll gain a strong understanding of data warehousing basics through industry examples and real-world datasets. Some forecasts expect the global data warehousing market to exceed $50 billion by 2028. The industry has continued to evolve over the years and has been a critical component of the data revolution for many organizations. There has never been a better time to learn about data warehousing.

4 Modules | 6+ Hours | 4+ Skills

Course Modules 


Prepare for your data warehouse learning journey by grounding yourself in some foundational concepts. To begin this course, you’ll learn what a data warehouse is and how it compares and contrasts to similar-sounding technologies, data marts and data lakes. You’ll also learn how different personas help support the various stages of a data warehouse project.


  1. What is a data warehouse?
  2. Knowing the what and why
  3. Possible use cases for a data warehouse for Zynga
  4. What's the difference between data warehouses and data lakes?
  5. Data warehouses vs. data lakes
  6. Data warehouses vs. data marts
  7. Deciding between a data lake, warehouse, and mart
  8. Data warehouses support organizational analysis
  9. Data warehouse life cycle
  10. Support where needed
  11. Who does what?

Now, you’ll gain a better understanding of data warehouse architecture by learning the typical layers of a data warehouse and how the presentation layer supports analysts. Additionally, you’ll learn about Bill Inmon and his top-down approach and how it compares to Ralph Kimball and his bottom-up approach. Finally, you’ll understand the difference between OLAP and OLTP systems.


  1. What are the different layers of a data warehouse?
  2. Ordering data warehouse layers
  3. Understanding ETL
  4. Pick the correct layer
  5. The presentation layer
  6. Stepping into a consultant's shoes
  7. Supporting analysts and data scientist users
  8. Data warehouse architectures
  9. Top-down vs bottom-up
  10. Characteristics of top-down and bottom-up
  11. Choosing a top-down approach
  12. OLAP and OLTP systems
  13. The OLAP data cube
  14. OLAP vs. OLTP scenarios
  15. Understanding OLTP

Here, you’ll learn how to organize the data in your data warehouse with an excellent data model. First, you’ll cover the basics of data modeling by learning what a fact and a dimension table are and how you use them in the star and snowflake schemas. Then, you’ll review how to create a data model using Kimball's four-step process and how to deal with slowly changing dimensions.


  1. Data warehouse data modeling
  2. Understanding facts and dimensional tables
  3. One starry and snowy night
  4. Fact or dimension?
  5. Kimball's four-step process
  6. Ordering Kimball's steps
  7. Deciding on the grain
  8. Selecting reasonable facts
  9. Slowly changing dimensions
  10. Pop-quiz on slow changes
  11. Difference between type I, II, and III
  12. Row vs. column data store
  13. Categorizing row and column store scenarios
  14. Why is column store faster?
  15. Which queries are faster?
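The row-versus-column-store distinction comes down to access patterns. This toy sketch (invented data, plain Python lists standing in for storage layouts) shows why an analytical aggregate fits a column store: summing one column reads a single contiguous sequence instead of picking a field out of every row:

```python
# Row store: each record keeps all its fields together.
row_store = [(i, f"name{i}", i * 1.5) for i in range(1000)]   # (id, name, amount)

# Column store: each column is stored as its own sequence.
column_store = {
    "id":     [r[0] for r in row_store],
    "name":   [r[1] for r in row_store],
    "amount": [r[2] for r in row_store],
}

# Row store: visit every row, extract one field each time.
total_rows = sum(r[2] for r in row_store)
# Column store: the needed column is already one sequential list.
total_cols = sum(column_store["amount"])

print(total_rows == total_cols)  # True: same answer, very different access pattern
```

On disk the difference is dramatic: the column store reads only the `amount` bytes, while the row store must scan past every `id` and `name` too. OLTP workloads, which touch whole records, favor the row layout instead.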

You’ll wrap up the course by learning the pros and cons of ETL and ELT processes and on-premise versus an in-cloud implementation. You’ll conclude by walking through an example, making key decisions on warehouse design and implementation.


  1. ETL and ELT
  2. ETL compared to ELT
  3. Differences between ETL and ELT
  4. Selecting ELT
  5. Data cleaning
  6. Cleaning operations
  7. Finding truth in data transformations
  8. Understanding data governance
  9. On premise and cloud data warehouses
  10. Knowing the differences between on-premise and cloud
  11. Matching implementation to justification
  12. Data warehouse design example
  13. Connecting it all
  14. Selecting bottom-up
  15. Do you know it all?
  16. Wrap-up!


Introduction to Snowflake
Dive into Snowflake's universe! This course will take you from its foundational architecture to mastering advanced SQL techniques. In our data-driven era, data warehousing is crucial. Snowflake, a cloud-native platform, is redefining scalability and performance. You will dive deep into its significance and learn what differentiates it from competitors like Google BigQuery, Amazon Redshift, Databricks, and Postgres.


Snowflake Basics

You'll start by uncovering Snowflake's distinct architecture. Grasp fundamental database concepts, including DDL (Data Definition Language) and DML (Data Manipulation Language). Dive deeper into the importance of data types, their conversions, and the specifics of Snowflake's functionality.


Advanced Techniques

Once you have the basics, it's time to elevate your skills. You'll delve into joins, subqueries, and query optimization. Play with semi-structured data, focusing on `JSON`.


Seal Your Snowflake Expertise

By the end of this course, you'll have a strong Snowflake understanding, ready to handle data and conduct deep SQL analyses. Whether you're an analyst, data engineer, or a curious tech enthusiast, this course offers a comprehensive view of Snowflake's capabilities, preparing you for the ever-evolving data-driven landscape!

3 Modules | 4+ Hours | 3+ Skills

Course Modules 


In this module, you will learn about Snowflake, a cloud-based data warehouse that offers a unique architecture. We will discuss its key features, use cases, architecture, and how it compares to its competitors. You will also get started with Snowflake SQL, exploring its basic syntax and similarities with PostgreSQL.


  1. What is Snowflake?
  2. Traditional vs. cloud data warehouse
  3. Row versus column oriented database
  4. Snowflake use cases
  5. Introduction to Snowflake SQL
  6. Snowflake Architecture
  7. Decoupling Compute & Storage
  8. Snowflake Architecture Layers
  9. Virtual Warehouse
  10. Snowflake Competitors and why use Snowflake
  11. Data warehousing platforms
  12. Features: Snowflake & its competitors
  13. Snowflake SQL: Using SELECT and WHERE in Snowflake

In this module, you'll embark on a journey through Snowflake SQL. You'll start by discovering various methods to connect and interface with Snowflake. As you delve deeper, you'll grasp the significance of Snowflake Staging. Navigate the vast landscapes of Snowflake's databases using essential commands, and broaden your understanding of its data types, learning to convert them and drawing comparisons with Postgres. Conclude your exploration by mastering Snowflake's functions and honing data sorting and grouping techniques.


  1. Connecting to Snowflake and DDL commands
  2. Snowflake connections and DDL commands
  3. Snowflake Staging
  4. Snowflake database structures and DML
  5. Loading data
  6. DESCRIBE & SHOW
  7. Snowflake data type and data type conversion
  8. Data types
  9. Datatype conversion
  10. Functions, sorting, and grouping
  11. String functions
  12. Functions & Grouping
  13. DATE & TIME

In module 3, you'll advance your skills in Snowflake SQL. You'll begin by exploring diverse join methods and building complex queries with subqueries and CTEs. We'll emphasize query optimization, showing you ways to enhance the speed and efficiency of your SQL tasks. At the end, we'll delve into handling semi-structured data like JSON.


  1. Joining in Snowflake
  2. NATURAL JOIN
  3. The world of JOINS
  4. Subquerying and Common Table Expressions
  5. Subqueries
  6. Understanding CTE
  7. CTEs
  8. Snowflake Query Optimization
  9. Essentials of query optimization
  10. Early filtering
  11. Query history
  12. Handling semi-structured data
  13. PARSE_JSON & OBJECT_CONSTRUCT
  14. Querying JSON data
  15. JSONified
  16. Wrap-up!


Understanding Data Visualization
Visualizing data using charts, graphs, and maps is one of the most impactful ways to communicate complex data. In this course, you’ll learn how to choose the best visualization for your dataset, and how to interpret common plot types like histograms, scatter plots, line plots and bar plots. You'll also learn about best practices for using colors and shapes in your plots, and how to avoid common pitfalls. Through hands-on exercises, you'll visually explore over 20 datasets including global life expectancies, Los Angeles home prices, ESPN's 100 most famous athletes, and the greatest hip-hop songs of all time.

4 Modules | 5+ Hours | 4+ Skills

Course Modules 


In this module you’ll learn the value of visualizations, using real-world data on British monarchs, Australian salaries, Panamanian animals, and US cigarette consumption, to graphically represent the spread of a variable using histograms and box plots.


  1. A plot tells a thousand words
  2. Motivating visualization
  3. Continuous vs. categorical variables
  4. Histograms
  5. Interpreting histograms
  6. Adjusting bin width
  7. Box plots
  8. Interpreting box plots
  9. Ordering box plots
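Two ideas from this module can be computed by hand: how bin width drives a histogram's shape, and the five-number summary a box plot draws. A small sketch on an invented sample (no plotting library needed):

```python
import statistics

data = [1, 2, 2, 3, 4, 4, 4, 5, 7, 9]

def histogram(values, bin_width):
    """Count values per bin, keyed by each bin's left edge."""
    counts = {}
    for v in values:
        left = (v // bin_width) * bin_width
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))

print(histogram(data, 2))   # narrow bins: more, finer bars
print(histogram(data, 5))   # wide bins: fewer, coarser bars

# A box plot summarizes the same data as five numbers.
q1, median, q3 = statistics.quantiles(data, n=4)
print(min(data), q1, median, q3, max(data))
```

Adjusting `bin_width` here mirrors the course exercise: too wide hides structure, too narrow shows noise.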

You’ll learn how to interpret data plots and understand core data visualization concepts such as correlation, linear relationships, and log scales. Through interactive exercises, you’ll also learn how to explore the relationship between two continuous variables using scatter plots and line plots. You'll explore data on life expectancies, technology adoption, COVID-19 coronavirus cases, and Swiss juvenile offenders. Next, you’ll be introduced to two other popular visualizations—bar plots and dot plots—often used to examine the relationship between categorical variables and continuous variables. Here, you'll explore famous athletes, health survey data, and the price of a Big Mac around the world.


  1. Scatter plots
  2. Interpreting scatter plots
  3. Trends with scatter plots
  4. Line plots
  5. Interpreting line plots
  6. Logarithmic scales for line plots
  7. Line plots without dates on the x-axis
  8. Bar plots
  9. Interpreting bar plots
  10. Interpreting stacked bar plots
  11. Dot plots
  12. Interpreting dot plots
  13. Sorting dot plots

It’s time to make your insights even more impactful. Discover how you can add color and shape to make your data visualizations clearer and easier to understand, especially when you find yourself working with more than two variables at the same time. You'll explore Los Angeles home prices, technology stock prices, math anxiety, the greatest hip-hop songs, scotch whisky preferences, and fatty acids in olive oil.


  1. Higher dimensions
  2. Another dimension for scatter plots
  3. Another dimension for line plots
  4. Using color
  5. Eye-catching colors
  6. Qualitative, sequential, diverging
  7. Highlighting data
  8. Plotting many variables at once
  9. Interpreting pair plots
  10. Interpreting correlation heatmaps
  11. Interpreting parallel coordinates plots

In this final module, you’ll learn how to identify and avoid the most common plot problems. For example, how can you avoid creating misleading or hard-to-interpret plots, and will your audience understand what it is you’re trying to tell them? All will be revealed! You'll explore wind directions, asthma incidence, and seats in the German Federal Council.


  1. Polar coordinates
  2. Pie plots
  3. Rose plots
  4. Axes of evil
  5. Bar plot axes
  6. Dual axes
  7. Sensory overload
  8. Chartjunk
  9. Multiple plots
  10. Wrap-up!


How to Install PostgreSQL on Windows
In this tutorial, you will learn how to install PostgreSQL on two different operating systems: Windows and Mac.

PostgreSQL is an open-source, lightweight relational database management system (RDBMS). It is widely popular among developers and has been well accepted by the industry. This tutorial shows you how to install a specific version of PostgreSQL on either Windows or Mac.



Data Engineer with Python Course


Advance your journey to becoming a Data Engineer with our Python-focused track, which is ideal for those with foundational SQL knowledge from our Associate Data Engineer track. This track dives deeper into the world of data engineering, emphasizing Python's role in automating and optimizing data processes. Starting with an understanding of cloud computing, you'll progress through Python programming from basics to advanced topics, including data manipulation, cleaning, and analysis. Engage in hands-on projects to apply what you've learned in real-world scenarios. You'll explore efficient coding practices, software engineering principles, and version control with Git, preparing you for professional data engineering challenges. An introduction to data pipelines and Airflow will equip you with the skills to design, schedule, and monitor complex data workflows!


Understanding Cloud Computing
Learn About Cloud Computing
Every day, we interact with the cloud—whether it’s using Google Drive, apps like Salesforce, or accessing our favorite websites. Cloud computing has become the norm for many companies, but what exactly is the cloud, and why is everyone rushing to adopt it?

Designed for complete novices, this cloud computing course breaks down what the cloud is and explains terminology such as scalability, latency, and high-availability.

Understand the Cloud Computing Basics
You’ll start by looking at the very basics of cloud computing, learning why it’s growing in popularity, and what makes it such a powerful option. You’ll explore the different service models businesses can choose from and how they're implemented in different situations.

As this is a no-code course, you can learn about cloud computing at a more conceptual level, exploring ideas of data protection, the various cloud providers, and how organizations can use cloud deployment.

Discover the Advantages of Cloud Computing
This course will demonstrate the many advantages of cloud computing, including ease of remote collaboration, freedom from hardware limitations, and reliable disaster recovery.

As you progress, you'll also discover the range of tools provided by major cloud providers and look at cloud computing examples from Amazon Web Services (AWS), Microsoft Azure, and Google Cloud. By the end of this course, you'll be able to confidently explain how cloud tools can increase productivity and save money, as well as ask the right questions about how to optimize your use of cloud tools.

3 Modules | 4+ Hours | 3+ Skills

Course Modules 


In this module, you’ll learn why cloud computing is growing in popularity, how it compares to an on-premise solution, and what makes it so powerful. Next, you’ll learn about the three different service models—IaaS, PaaS, and SaaS—and how they each satisfy a unique set of business requirements.


  1. What is cloud computing?
  2. Understanding the cloud
  3. Cloud vs. on-premise
  4. Cloud computing services
  5. The power of the cloud
  6. Primary cloud services
  7. Key characteristics
  8. Cloud service models
  9. Outsourcing IT services
  10. IaaS, PaaS, or SaaS?
  11. Level of abstraction

Now that you understand the power of cloud computing, it’s time to discover how it’s implemented using one of three deployment methods—private, public, and hybrid. You'll then find out how data protection regulations can affect cloud infrastructure. Lastly, you’ll meet the important roles within an organization that can make your cloud deployment a reality.


  1. Cloud deployment models
  2. Private or public?
  3. Pick the best model
  4. Regulations on the cloud
  5. Time limits on storing data
  6. Personal data
  7. Cloud computing roles
  8. Microsoft cloud skills report
  9. Cloud roles
  10. In other tracks

In the final module, you’ll be introduced to the major cloud infrastructure players, including AWS, Microsoft Azure, and Google Cloud. You’ll become more familiar with their market positioning, the products they offer, their main customers, and how those customers use cloud computing.


  1. An overview of providers
  2. The big three
  3. The risk of vendor lock-in
  4. Amazon Web Services
  5. AWS or not AWS!
  6. NerdWallet
  7. Microsoft Azure
  8. Which service to pick?
  9. The Ottawa hospital
  10. Google Cloud
  11. Lush migration
  12. True or false?
  13. Cloud providers and their services
  14. Wrap-up!


Introduction to Python for Developers
What is Python and why use it?
Learn all about Python, a versatile and powerful language, perfect for software development. No prior experience required!

Learn the fundamentals
Perform calculations, store and manipulate information in variables using various data structures, and write comments describing your code to others.

Build your workflow
Use comparison operators in combination with for and while loops to execute code based on conditions being met, enabling a fully customizable workflow.

3 Modules | 4+ Hours | 3+ Skills

Course Modules 


Discover the wonders of Python - why it is popular and how to use it. No prior knowledge required!


  1. What is Python?
  2. The benefits of Python
  3. Use-cases for Python
  4. How to run Python code
  5. Working with Python files
  6. Python as a calculator
  7. Advanced calculations
  8. Variables and data types
  9. Naming conventions
  10. Checking data types
  11. Working with variables
  12. Checking and updating conditions

Learn how and when to use Python's built-in data structures, including lists, dictionaries, sets, and tuples!


  1. Working with strings
  2. Multi-line strings
  3. Modifying string variables
  4. Lists
  5. Building a party playlist
  6. Subsetting lists
  7. Dictionaries
  8. Building a playlist dictionary
  9. Working with dictionaries
  10. Sets and tuples
  11. Last quarter's revenue
  12. DJ Sets
  13. Choosing a data structure

Conditional statements and operators, for and while loops all combine to enable customized workflows for your needs!


  1. Conditional statements and operators
  2. Conditional statements
  3. Checking inflation
  4. On the rental market
  5. For loops
  6. Looping through a list
  7. Updating a variable with for loops
  8. Conditional looping with a dictionary
  9. While loops
  10. Breaking a while loop
  11. Converting to a while loop
  12. Conditional while loops
  13. Building a workflow
  14. Appending to a list
  15. Book genre popularity
  16. Working with keywords
  17. Recap!


Intermediate Python for Developers
Elevate your Python skills to the next level
This course will delve deeper into Python's rich ecosystem, focusing on essential aspects such as built-in functions, modules, and packages. You'll learn how to harness the power of Python's built-in functions effectively, enabling you to streamline your code. The course will introduce you to the power of Python's modules, empowering you to develop quicker by reusing existing code rather than writing your own from scratch every time! You'll see how people have extended modules to create their own open-source software, known as packages, discovering how to download, import, and work with packages in your programs.

Master custom functions
You'll learn best practices for defining functions, including comprehensive knowledge of how to write user-friendly docstrings to ensure clarity and maintainability. You'll dive into advanced concepts such as default arguments, enabling you to create versatile functions with predefined values. The course will equip you with the knowledge and skills to handle arbitrary positional and keyword arguments effectively, enhancing the flexibility and usability of your functions. By understanding how to work with these arguments, you'll be able to create more robust and adaptable solutions to various programming challenges.

Debug your code and use error handling techniques
You'll learn to interpret error messages, including tracebacks from incorrectly using functions from packages. You'll use keywords and techniques to adapt your custom functions, effectively handling errors and providing bespoke feedback messages to developers who misuse your code!

3 Modules | 3+ Hours | 3+ Skills

Course Modules 


Discover Python's rich ecosystem of built-in functions and modules, plus how to download and work with packages.


  1. Built-in functions
  2. Get some assistance
  3. Counting the elements
  4. Performing calculations
  5. Modules
  6. What is a module?
  7. Working with the string module
  8. Importing from a module
  9. Packages
  10. Package or module?
  11. Working with pandas
  12. Performing calculations with pandas
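The module's built-in-function and string-module exercises look roughly like this minimal sketch (the text and variable names are invented; the `string` module and `len` are part of the standard library):

```python
import string
from string import punctuation   # importing a single name from a module

text = "Hello, world!"

# string.punctuation is a ready-made constant of punctuation characters.
cleaned = "".join(ch for ch in text if ch not in punctuation)

# Built-in functions work on the result directly.
print(len(cleaned), cleaned)  # 11 Hello world
```

The same import pattern (`from module import name`) is what later lets you pull one function out of a large package instead of importing all of it.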

Learn the fundamentals of functions, from Python's built-in functions to creating your own from scratch!


  1. Defining a custom function
  2. Custom function syntax
  3. Cleaning text data
  4. Building a password checker
  5. Default and keyword arguments
  6. Positional versus keyword arguments
  7. Adding a keyword argument
  8. Data structure converter function
  9. Docstrings
  10. Single-line docstrings
  11. Multi-line docstrings
  12. Arbitrary arguments
  13. Adding arbitrary arguments
  14. Arbitrary keyword arguments
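The module's pieces fit together in one small sketch: a docstring, a default keyword argument, arbitrary positional arguments (`*args`), and arbitrary keyword arguments (`**kwargs`). The function name and data are made up for illustration:

```python
def average(*scores, precision=2, **metadata):
    """Return the mean of any number of scores, rounded to `precision`.

    Extra keyword arguments are collected into `metadata` and returned
    alongside the result, untouched.
    """
    mean = round(sum(scores) / len(scores), precision)
    return mean, metadata

# Any number of positional scores; precision overrides its default;
# subject is swept into **metadata.
result, meta = average(80, 90, 99, precision=1, subject="math")
print(result)  # 89.7
print(meta)    # {'subject': 'math'}
```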

Build lambda functions on the fly, and discover how to error-proof your code!


  1. Lambda functions
  2. Adding tax
  3. Calling lambda in-line
  4. Lambda functions with iterables
  5. Introduction to errors
  6. Debugging code
  7. Module and package tracebacks
  8. Fixing an issue
  9. Error handling
  10. Avoiding errors
  11. Returning errors
  12. Recap!
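The lambda and error-handling ideas above can be sketched together; the function names and tax rate are invented for illustration:

```python
# A lambda with a default argument, assigned to a name...
add_tax = lambda amount, rate=0.2: round(amount * (1 + rate), 2)
print(add_tax(10.0))  # 12.0

# ...or called in-line over an iterable with map().
print(list(map(lambda x: x ** 2, [1, 2, 3])))  # [1, 4, 9]

def safe_divide(a, b):
    """Return a / b, translating the raw error into bespoke feedback."""
    try:
        return a / b
    except ZeroDivisionError:
        raise ValueError("b must be non-zero") from None

try:
    safe_divide(1, 0)
except ValueError as e:
    print(e)  # b must be non-zero
```

Re-raising a friendlier exception is the "bespoke feedback" pattern: developers misusing your function see your message, not an internal traceback.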


Introduction to Importing Data in Python
As a data scientist, you will need to clean data, wrangle and munge it, visualize it, build predictive models, and interpret these models. Before you can do so, however, you will need to know how to get data into Python. In this course, you'll learn the many ways to import data into Python: from flat files such as .txt and .csv; from files native to other software such as Excel spreadsheets, Stata, SAS, and MATLAB files; and from relational databases such as SQLite and PostgreSQL.

3 Modules | 4+ Hours | 3+ Skills

Course Modules 


In this module, you'll learn how to import data into Python from all types of flat files, which are a simple and prevalent form of data storage. You've previously learned how to use NumPy and pandas; here you'll use these packages to import flat files and customize your imports.


  1. Welcome to the course!
  2. Importing entire text files
  3. Importing text files line by line
  4. The importance of flat files in data science
  5. Pop quiz: what exactly are flat files?
  6. Why we like flat files and the Zen of Python
  7. Importing flat files using NumPy
  8. Using NumPy to import flat files
  9. Customizing your NumPy import
  10. Importing different datatypes
  11. Importing flat files using pandas
  12. Using pandas to import flat files as DataFrames (1)
  13. Using pandas to import flat files as DataFrames (2)
  14. Customizing your pandas import
  15. Final thoughts on data import
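The course does this with NumPy and pandas; the stdlib-only sketch below shows the same core moves — reading a flat file, customizing the delimiter, and handling the header row — on an invented in-memory "file":

```python
import csv
import io

flat = "name;height\nAda;1.65\nAlan;1.78\n"   # stand-in for a flat file on disk

# Customize the import: this file is semicolon-delimited, not comma-delimited.
reader = csv.reader(io.StringIO(flat), delimiter=";")

header = next(reader)                          # consume (but keep) the header row
rows = [(name, float(height)) for name, height in reader]  # cast the numeric column

print(header)  # ['name', 'height']
print(rows)    # [('Ada', 1.65), ('Alan', 1.78)]
```

In the course itself, the equivalent knobs are `np.loadtxt(..., delimiter=..., skiprows=...)` and `pd.read_csv(..., sep=...)`.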

You've learned how to import flat files, but there are many other file types you will potentially have to work with as a data scientist. In this module, you'll learn how to import data into Python from a wide array of important file types. These include pickled files, Excel spreadsheets, SAS and Stata files, HDF5 files, a file type for storing large quantities of numerical data, and MATLAB files.


  1. Introduction to other file types
  2. Not so flat any more
  3. Loading a pickled file
  4. Listing sheets in Excel files
  5. Importing sheets from Excel files
  6. Customizing your spreadsheet import
  7. Importing SAS/Stata files using pandas
  8. How to import SAS7BDAT
  9. Importing SAS files
  10. Using read_stata to import Stata files
  11. Importing Stata files
  12. Importing HDF5 files
  13. Using File to import HDF5 files
  14. Using h5py to import HDF5 files
  15. Extracting data from your HDF5 file
  16. Importing MATLAB files
  17. Loading .mat files
  18. The structure of .mat in Python
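Of all the formats in this module, pickled files are the one you can demo with the standard library alone (Excel, SAS, Stata, HDF5, and MATLAB files all need third-party readers). A pickled file simply holds a serialized Python object; the data here is invented:

```python
import pickle

data = {"June": 50.9, "Aug": 85.0}

blob = pickle.dumps(data)        # the bytes a .pkl file would contain
restored = pickle.loads(blob)    # loading reconstructs the object intact

print(restored == data)  # True
print(type(restored))    # <class 'dict'>
```

The same round trip works through `pickle.dump`/`pickle.load` with a file opened in binary mode (`'wb'`/`'rb'`), which is how the course's exercise loads a pickled file from disk.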

In this module, you'll learn how to extract meaningful data from relational databases, an essential skill for any data scientist. You will learn about relational models, how to create SQL queries, how to filter and order your SQL records, and how to perform advanced queries by joining database tables.


  1. Introduction to relational databases
  2. Pop quiz: The relational model
  3. Creating a database engine in Python
  4. Creating a database engine
  5. What are the tables in the database?
  6. Querying relational databases in Python
  7. The Hello World of SQL Queries!
  8. Customizing the Hello World of SQL Queries
  9. Filtering your database records using SQL's WHERE
  10. Ordering your SQL records with ORDER BY
  11. Querying relational databases directly with pandas
  12. Pandas and The Hello World of SQL Queries!
  13. Pandas for more complex querying
  14. Advanced querying: exploiting table relationships
  15. The power of SQL lies in relationships between tables: INNER JOIN
  16. Filtering your INNER JOIN
  17. Final Thoughts
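The course builds a SQLAlchemy engine for this; the sketch below uses the standard library's sqlite3 instead so it runs anywhere, with an invented two-table schema to show the "Hello World" query and an INNER JOIN:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artist (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE album (title TEXT, artist_id INTEGER)")
conn.execute("INSERT INTO artist VALUES (1, 'AC/DC')")
conn.execute("INSERT INTO album VALUES ('Back in Black', 1)")

# The power of SQL lies in relationships between tables: INNER JOIN.
rows = conn.execute("""SELECT album.title, artist.name
                       FROM album
                       INNER JOIN artist ON album.artist_id = artist.id""").fetchall()
print(rows)  # [('Back in Black', 'AC/DC')]
```

With pandas, the equivalent one-liner is `pd.read_sql_query(query, engine)`, which returns the result directly as a DataFrame.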


Intermediate Importing Data in Python

As a data scientist, you will need to clean data, wrangle and munge it, visualize it, build predictive models and interpret these models. Before you can do so, however, you will need to know how to get data into Python. In the prequel to this course, you learned many ways to import data into Python: from flat files such as .txt and .csv; from files native to other software such as Excel spreadsheets, Stata, SAS, and MATLAB files; and from relational databases such as SQLite and PostgreSQL. In this course, you'll extend this knowledge base by learning to import data from the web and by pulling data from Application Programming Interfaces— APIs—such as the Twitter streaming API, which allows us to stream real-time tweets.

3 Modules | 4+ Hours | 3+ Skills

Course Modules 


The web is a rich source of data from which you can extract various types of insights and findings. In this module, you will learn how to get data from the web, whether it is stored in files or in HTML. You'll also learn the basics of scraping and parsing web data.


  1. Importing flat files from the web
  2. Importing flat files from the web: your turn!
  3. Opening and reading flat files from the web
  4. Importing non-flat files from the web
  5. HTTP requests to import files from the web
  6. Performing HTTP requests in Python using urllib
  7. Printing HTTP request results in Python using urllib
  8. Performing HTTP requests in Python using requests
  9. Scraping the web in Python
  10. Parsing HTML with BeautifulSoup
  11. Turning a webpage into data using BeautifulSoup: getting the text
  12. Turning a webpage into data using BeautifulSoup: getting the hyperlinks
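The module does its parsing with BeautifulSoup; the stdlib `html.parser` sketch below shows the same step — turning a webpage into data by extracting its hyperlinks — on an invented inline page, so no install or network access is needed:

```python
from html.parser import HTMLParser

page = '<html><body><h1>Title</h1><a href="https://example.com">link</a></body></html>'

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag encountered while parsing."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href")

parser = LinkCollector()
parser.feed(page)
print(parser.links)  # ['https://example.com']
```

The BeautifulSoup equivalent is `[a.get('href') for a in soup.find_all('a')]`, which is what the course's hyperlink exercise builds.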

In this module, you will gain a deeper understanding of how to import data from the web. You will learn the basics of extracting data from APIs, gain insight on the importance of APIs, and practice extracting data by diving into the OMDB and Library of Congress APIs.


  1. Introduction to APIs and JSONs
  2. Pop quiz: What exactly is a JSON?
  3. Loading and exploring a JSON
  4. Pop quiz: Exploring your JSON
  5. APIs and interacting with the world wide web
  6. Pop quiz: What's an API?
  7. API requests
  8. JSON–from the web to Python
  9. Checking out the Wikipedia API
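As a taste of working with JSONs, here is a minimal sketch of loading and exploring one. The payload below is made up for illustration; in the module you would receive it as the text of an API response (for example, from the OMDb API):

```python
import json

# A JSON payload like the ones returned by movie APIs (contents are illustrative)
response_text = (
    '{"Title": "The Social Network", "Year": "2010", '
    '"Ratings": [{"Source": "IMDB", "Value": "7.8/10"}]}'
)

# json.loads turns a JSON string into a Python dict
movie = json.loads(response_text)

# Explore the JSON key by key
for key, value in movie.items():
    print(key + ":", value)

# Nested structures become nested dicts and lists
print(movie["Ratings"][0]["Value"])
```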

In this module, you will consolidate your knowledge of interacting with APIs in a deep dive into the Twitter streaming API. You'll learn how to stream real-time Twitter data, and how to analyze and visualize it.


  1. The Twitter API and Authentication
  2. Streaming tweets
  3. Load and explore your Twitter data
  4. Twitter data to DataFrame
  5. A little bit of Twitter text analysis
  6. Plotting your Twitter data
  7. Final Thoughts


Data Cleaning in Python
Discover How to Clean Data in Python
It's commonly said that data scientists spend 80% of their time cleaning and manipulating data and only 20% of their time analyzing it. Data cleaning is an essential step for every data scientist, as analyzing dirty data can lead to inaccurate conclusions.

In this course, you will learn how to identify, diagnose, and treat various data cleaning problems in Python, ranging from simple to advanced. You will deal with improper data types, check that your data is in the correct range, handle missing data, perform record linkage, and more!

Learn How to Clean Different Data Types
The first module of the course explores common data problems and how you can fix them. You will first understand basic data types and how to deal with them individually. After, you'll apply range constraints and remove duplicated data points.

The last module explores record linkage, a powerful tool to merge multiple datasets. You'll learn how to link records by calculating the similarity between strings. Finally, you'll use your new skills to join two restaurant review datasets into one clean master dataset.

Gain Confidence in Cleaning Data
By the end of the course, you will gain the confidence to clean data from various types and use record linkage to merge multiple datasets. Cleaning data is an essential skill for data scientists. If you want to learn more about cleaning data in Python and its applications, check out the following tracks: Data Scientist with Python and Importing & Cleaning Data with Python.

4 Modules | 5+ Hours | 4 Skills

Course Modules 


In this module, you'll learn how to overcome some of the most common dirty data problems. You'll convert data types, apply range constraints to remove future data points, and remove duplicated data points to avoid double-counting.


  1. Data type constraints
  2. Common data types
  3. Numeric data or ... ?
  4. Summing strings and concatenating numbers
  5. Data range constraints
  6. Tire size constraints
  7. Back to the future
  8. Uniqueness constraints
  9. How big is your subset?
  10. Finding duplicates
  11. Treating duplicates
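The three fixes above can be sketched in a few lines of pandas. The ride-sharing data below is hypothetical, and the tire-size cutoff of 29 is an illustrative range constraint:

```python
import pandas as pd

# Hypothetical ride data with a string-typed column, an out-of-range
# value, and a duplicated record
rides = pd.DataFrame({
    "ride_id": [1, 2, 2, 3],
    "duration": ["12", "7", "7", "31"],   # stored as strings
    "tire_size": [26, 26, 26, 49],        # 49 is outside the valid range
})

rides["duration"] = rides["duration"].astype(int)    # data type constraint
rides = rides[rides["tire_size"] <= 29]              # range constraint
rides = rides.drop_duplicates(subset="ride_id")      # uniqueness constraint
print(rides)
```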

Categorical and text data can often be some of the messiest parts of a dataset due to their unstructured nature. In this module, you’ll learn how to fix whitespace and capitalization inconsistencies in category labels, collapse multiple categories into one, and reformat strings for consistency.


  1. Membership constraints
  2. Members only
  3. Finding consistency
  4. Categorical variables
  5. Categories of errors
  6. Inconsistent categories
  7. Remapping categories
  8. Cleaning text data
  9. Removing titles and taking names
  10. Keeping it descriptive
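A minimal sketch of the category fixes this module covers, using made-up survey responses: strip whitespace, normalize capitalization, then collapse alternate spellings into one category with a remapping dictionary:

```python
import pandas as pd

# Hypothetical survey data with inconsistent category labels
df = pd.DataFrame({
    "marriage_status": [" married", "MARRIED", "single ", "div", "divorced"]
})

# Fix whitespace and capitalization inconsistencies
df["marriage_status"] = df["marriage_status"].str.strip().str.lower()

# Collapse multiple spellings into a single category
df["marriage_status"] = df["marriage_status"].replace({"div": "divorced"})

print(df["marriage_status"].unique())
```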

In this module, you’ll dive into more advanced data cleaning problems, such as ensuring that weights are all written in kilograms instead of pounds. You’ll also gain invaluable skills that will help you verify that values have been added correctly and that missing values don’t negatively impact your analyses.


  1. Uniformity
  2. Ambiguous dates
  3. Uniform currencies
  4. Uniform dates
  5. Cross field validation
  6. Cross field or no cross field?
  7. How's our data integrity?
  8. Completeness
  9. Is this missing at random?
  10. Missing investors
  11. Follow the money
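Cross field validation and completeness checks can be sketched as below. The flight data is hypothetical: the sum of the class columns should equal the total column, and rows with missing values get flagged before any treatment is chosen:

```python
import pandas as pd

# Hypothetical flights: economy + business + first should equal total
flights = pd.DataFrame({
    "economy": [100, 90, 80],
    "business": [30, 20, None],   # one missing value
    "first": [10, 10, 5],
    "total": [140, 125, 95],
})

# Cross field validation: do the parts add up to the whole?
class_sum = flights[["economy", "business", "first"]].sum(axis=1)
flights["consistent"] = class_sum == flights["total"]

# Completeness: flag rows with missing values for later treatment
flights["has_missing"] = flights[["economy", "business", "first"]].isna().any(axis=1)

print(flights[["consistent", "has_missing"]])
```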

Record linkage is a powerful technique for merging multiple datasets when values have typos or different spellings. In this module, you'll learn how to link records by calculating the similarity between strings, and you'll then use your new skills to join two restaurant review datasets into one clean master dataset.


  1. Comparing strings
  2. Minimum edit distance
  3. The cutoff point
  4. Remapping categories II
  5. Generating pairs
  6. To link or not to link?
  7. Pairs of restaurants
  8. Similar restaurants
  9. Linking DataFrames
  10. Getting the right index
  11. Linking them together!
  12. Wrap-up!
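The core idea of record linkage is comparing strings and linking pairs above a cutoff. A minimal sketch using the standard library's difflib (the course works with dedicated string-matching packages; the restaurant names and the 0.8 cutoff here are illustrative):

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a similarity ratio in [0, 1]; 1.0 means identical strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Pairs of restaurant listings that an exact join would treat as different
pairs = [
    ("Burger Kng", "Burger King"),    # typo
    ("Cafe Flore", "Sushi Palace"),   # genuinely different
]

for a, b in pairs:
    score = similarity(a, b)
    # Link the records only above a chosen cutoff (0.8 is arbitrary here)
    verdict = "link" if score >= 0.8 else "skip"
    print(a, "<->", b, round(score, 2), verdict)
```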


Writing Efficient Python Code
As a Data Scientist, the majority of your time should be spent gleaning actionable insights from data -- not waiting for your code to finish running. Writing efficient Python code can help reduce runtime and save computational resources, ultimately freeing you up to do the things you love as a Data Scientist. In this course, you'll learn how to use Python's built-in data structures, functions, and modules to write cleaner, faster, and more efficient code. We'll explore how to time and profile code in order to find bottlenecks. Then, you'll practice eliminating these bottlenecks, and other bad design patterns, using Python's Standard Library, NumPy, and pandas. After completing this course, you'll have the necessary tools to start writing efficient Python code!

4 Modules | 5+ Hours | 4 Skills

Course Modules 


In this module, you'll learn what it means to write efficient Python code. You'll explore Python's Standard Library, learn about NumPy arrays, and practice using some of Python's built-in tools. This module builds a foundation for the concepts covered ahead.


  1. Welcome!
  2. Pop quiz: what is efficient
  3. A taste of things to come
  4. Zen of Python
  5. Building with built-ins
  6. Built-in practice: range()
  7. Built-in practice: enumerate()
  8. Built-in practice: map()
  9. The power of NumPy arrays
  10. Practice with NumPy arrays
  11. Bringing it all together: Festivus!
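A quick sketch of the built-ins and NumPy tools this module introduces. The Pokémon names and stats are made-up sample data:

```python
import numpy as np

names = ["bulbasaur", "charmander", "squirtle"]

# enumerate() and map() replace manual index bookkeeping and loops
indexed = list(enumerate(names, start=1))
upper = list(map(str.upper, names))

# NumPy arrays support vectorized arithmetic: no explicit loop needed
hps = np.array([45, 39, 44])
boosted = hps * 2   # applied element-wise to the whole array

print(indexed)
print(upper)
print(boosted)
```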

In this module, you will learn how to gather and compare runtimes between different coding approaches. You'll practice using the line_profiler and memory_profiler packages to profile your code base and spot bottlenecks. Then, you'll put your learnings to practice by replacing these bottlenecks with efficient Python code.


  1. Examining runtime
  2. Using %timeit: your turn!
  3. Using %timeit: specifying number of runs and loops
  4. Using %timeit: formal name or literal syntax
  5. Using cell magic mode (%%timeit)
  6. Code profiling for runtime
  7. Pop quiz: steps for using %lprun
  8. Using %lprun: spot bottlenecks
  9. Using %lprun: fix the bottleneck
  10. Code profiling for memory usage
  11. Pop quiz: steps for using %mprun
  12. Using %mprun: Hero BMI
  13. Using %mprun: Hero BMI 2.0
  14. Bringing it all together: Star Wars profiling
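%timeit and %lprun are IPython magics, but the same runtime comparison can be sketched with the standard library's timeit module. The statements compared here (a loop versus a list comprehension) are illustrative:

```python
import timeit

# Building a list with an explicit loop...
loop_stmt = """
nums = []
for i in range(1000):
    nums.append(i * 2)
"""

# ...versus a list comprehension
comp_stmt = "nums = [i * 2 for i in range(1000)]"

# Each statement is executed 1000 times and the total runtime is returned
loop_time = timeit.timeit(loop_stmt, number=1000)
comp_time = timeit.timeit(comp_stmt, number=1000)

print(f"loop: {loop_time:.4f}s  comprehension: {comp_time:.4f}s")
```

On most runs the comprehension comes out faster, which is the kind of bottleneck-spotting this module practices.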

This module covers more complex efficiency tips and tricks. You'll learn a few useful built-in modules for writing efficient code and practice using set theory. You'll then learn about looping patterns in Python and how to make them more efficient.


  1. Efficiently combining, counting, and iterating
  2. Combining Pokémon names and types
  3. Counting Pokémon from a sample
  4. Combinations of Pokémon
  5. Set theory
  6. Comparing Pokédexes
  7. Searching for Pokémon
  8. Gathering unique Pokémon
  9. Eliminating loops
  10. Gathering Pokémon without a loop
  11. Pokémon totals and averages without a loop
  12. Writing better loops
  13. One-time calculation loop
  14. Holistic conversion loop
  15. Bringing it all together: Pokémon z-scores
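The set-theory portion of this module can be sketched as follows; the Pokédex contents are made-up sample data. Sets give O(1) membership tests and one-line comparisons that would otherwise need nested loops:

```python
# Hypothetical Pokédexes owned by two players
ash_pokedex = {"pikachu", "bulbasaur", "koffing"}
misty_pokedex = {"pikachu", "psyduck", "squirtle"}

both = ash_pokedex & misty_pokedex        # intersection: shared Pokémon
ash_only = ash_pokedex - misty_pokedex    # difference: unique to Ash
either = ash_pokedex | misty_pokedex      # union: all unique Pokémon

# Membership testing on a set is O(1), vs scanning a list in O(n)
print("pikachu" in both)
print(sorted(ash_only))
print(len(either))
```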

This module offers a brief introduction on how to efficiently work with pandas DataFrames. You'll learn the various options you have for iterating over a DataFrame. Then, you'll learn how to efficiently apply functions to data stored in a DataFrame.


  1. Intro to pandas DataFrame iteration
  2. Iterating with .iterrows()
  3. Run differentials with .iterrows()
  4. Another iterator method: .itertuples()
  5. Iterating with .itertuples()
  6. Run differentials with .itertuples()
  7. pandas alternative to looping
  8. Analyzing baseball stats with .apply()
  9. Settle a debate with .apply()
  10. Optimal pandas iterating
  11. Replacing .iloc with underlying arrays
  12. Bringing it all together: Predict win percentage
  13. Wrap up!
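The iteration options this module compares can be sketched side by side. The baseball run data is hypothetical; the point is that .iterrows(), .itertuples(), and vectorized arithmetic on the underlying arrays all produce the same run differentials, at very different speeds:

```python
import pandas as pd

# Hypothetical team stats: runs scored (rs) and runs allowed (ra)
df = pd.DataFrame({"rs": [891, 705], "ra": [750, 805]})

# Slowest: row-by-row with .iterrows(), which yields (index, Series) pairs
diffs_iterrows = [row["rs"] - row["ra"] for _, row in df.iterrows()]

# Faster: .itertuples() yields lightweight namedtuples
diffs_itertuples = [row.rs - row.ra for row in df.itertuples()]

# Fastest here: vectorized arithmetic on the underlying NumPy arrays
df["run_diff"] = df["rs"].to_numpy() - df["ra"].to_numpy()

print(df)
```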


Streamlined Data Ingestion with pandas
Before you can analyze data, you first have to acquire it. This course teaches you how to build pipelines to import data kept in common storage formats. You’ll use pandas, a major Python library for analytics, to get data from a variety of sources, from spreadsheets of survey responses, to a database of public service requests, to an API for a popular review site. Along the way, you’ll learn how to fine-tune imports to get only what you need and to address issues like incorrect data types. Finally, you’ll assemble a custom dataset from a mix of sources.

4 Modules | 5+ Hours | 4 Skills

Course Modules 


Practice using pandas to get just the data you want from flat files, learn how to wrangle data types and handle errors, and look into some U.S. tax data along the way.


  1. Introduction to flat files
  2. Get data from CSVs
  3. Get data from other flat files
  4. Modifying flat file imports
  5. Import a subset of columns
  6. Import a file in chunks
  7. Handling errors and missing data
  8. Specify data types
  9. Set custom NA values
  10. Skip bad data
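Several of the read_csv fine-tuning options above can be sketched in one call. An inline string stands in for a flat file on disk, and the column names are made up:

```python
import io
import pandas as pd

# Inline CSV standing in for a flat file (note the leading zero in Bob's zipcode)
csv_data = """name,zipcode,income
Alice,12345,50000
Bob,00501,NA
Carol,99999,62000
"""

df = pd.read_csv(
    io.StringIO(csv_data),
    usecols=["name", "zipcode", "income"],  # import a subset of columns
    dtype={"zipcode": str},                 # keep zip codes as strings (leading zeros)
    na_values=["NA"],                       # treat a custom marker as missing
)

print(df.dtypes)
print(df)
```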

Automate data imports from that staple of office life, Excel files. Import part or all of a workbook and ensure boolean and datetime data are properly loaded, all while learning about how other people are learning to code.


  1. Introduction to spreadsheets
  2. Get data from a spreadsheet
  3. Load a portion of a spreadsheet
  4. Getting data from multiple worksheets
  5. Select a single sheet
  6. Select multiple sheets
  7. Work with multiple spreadsheets
  8. Modifying imports: true/false data
  9. Set Boolean columns
  10. Set custom true/false values
  11. Modifying imports: parsing dates
  12. Parse simple dates
  13. Get datetimes from multiple columns
  14. Parse non-standard date formats

Combine pandas with the powers of SQL to find out just how many problems New Yorkers have with their housing. This module features introductory SQL topics like WHERE clauses, aggregate functions, and basic joins.


  1. Introduction to databases
  2. Connect to a database
  3. Load entire tables
  4. Refining imports with SQL queries
  5. Selecting columns with SQL
  6. Selecting rows
  7. Filtering on multiple conditions
  8. More complex SQL queries
  9. Getting distinct values
  10. Counting in groups
  11. Working with aggregate functions
  12. Loading multiple tables with joins
  13. Joining tables
  14. Joining and filtering
  15. Joining, filtering, and aggregating
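Combining pandas with SQL looks like the sketch below. An in-memory SQLite database with made-up complaint records stands in for the course's housing data, and pandas runs the query and returns a DataFrame:

```python
import sqlite3
import pandas as pd

# Build a small in-memory database (contents are illustrative)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE complaints (borough TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO complaints VALUES (?, ?)",
    [("Brooklyn", "HEAT"), ("Brooklyn", "NOISE"), ("Queens", "HEAT")],
)

# A query with an aggregate function and grouping, run through pandas
query = """
SELECT borough, COUNT(*) AS n
FROM complaints
GROUP BY borough
"""
counts = pd.read_sql(query, conn)
print(counts)
```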

Learn how to work with JSON data and web APIs by exploring a public dataset and getting cafe recommendations from Yelp. End by learning some techniques to combine datasets once they have been loaded into data frames.


  1. Introduction to JSON
  2. Load JSON data
  3. Work with JSON orientations
  4. Introduction to APIs
  5. Get data from an API
  6. Set API parameters
  7. Set request headers
  8. Working with nested JSONs
  9. Flatten nested JSONs
  10. Handle deeply nested data
  11. Combining multiple datasets
  12. Concatenate dataframes
  13. Merge dataframes
  14. Wrap-up!
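Flattening nested JSON and combining the result with another dataset can be sketched as below. The cafe records imitate the shape of an API response; the names and zip codes are made up:

```python
import pandas as pd

# Nested records like those returned by a review-site API (made-up data)
cafes = [
    {"name": "Cafe A", "rating": 4.5, "location": {"city": "NYC", "zip": "10003"}},
    {"name": "Cafe B", "rating": 4.0, "location": {"city": "NYC", "zip": "11201"}},
]

# Flatten the nested "location" dict into its own columns
flat = pd.json_normalize(cafes, sep="_")

# Combine with a second dataset via merge
zips = pd.DataFrame({
    "location_zip": ["10003", "11201"],
    "borough": ["Manhattan", "Brooklyn"],
})
combined = flat.merge(zips, on="location_zip")

print(combined)
```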


Learn Git
This course introduces learners to version control using Git. You will discover the importance of version control when working on data science projects and explore how you can use Git to track files, compare differences, modify and save files, undo changes, and enable collaborative development through the use of branches. You will gain an introduction to the structure of a repository, learn how to create new repositories and clone existing ones, and see how Git stores data. By working through typical data science tasks, you will gain the skills to handle conflicting files!


4 Modules | 5+ Hours | 4 Skills

Course Modules

In the first module, you’ll learn what version control is and why it is essential for data projects. Then, you’ll discover what Git is and how to use it for a version control workflow.


  1. Introduction to version control with Git
  2. Using the shell
  3. Checking the version of Git
  4. Saving files
  5. Where does Git store information?
  6. The Git workflow
  7. Adding a file
  8. Adding multiple files
  9. Comparing files
  10. What has changed?
  11. What is going to be committed?
  12. What's in the staging area?
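The add-then-commit workflow above can be sketched in a few shell commands, assuming Git is installed. The directory, file name, and identity settings are illustrative:

```shell
# A minimal Git workflow in a throwaway directory
set -e
repo_dir=$(mktemp -d)
cd "$repo_dir"

git init -q                              # create the repository; .git stores history
git config user.email "you@example.com"  # identity for commits (illustrative)
git config user.name "Your Name"

echo "draft analysis" > report.md
git status --short                       # shows report.md as untracked
git add report.md                        # move the file into the staging area
git diff --staged --stat                 # what is going to be committed?
git commit -q -m "Add first draft of report"
git log --oneline                        # the history now holds one commit
```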

Next, you’ll examine how Git stores data, learn essential commands to compare files and repositories at different times, and understand the process for restoring earlier versions of files in your data projects.


  1. Storing data with Git
  2. Interpreting the commit structure
  3. Viewing a repository's history
  4. Viewing a specific commit
  5. Viewing changes
  6. Comparing to the second most recent commit
  7. Comparing commits
  8. Who changed what?
  9. Undoing changes before committing
  10. How to unstage a file
  11. Undoing changes to unstaged files
  12. Undoing all changes
  13. Restoring and reverting
  14. Restoring an old version of a repo
  15. Deleting untracked files
  16. Restoring an old version of a file

In this module, you'll learn tips and tricks for configuring Git to make you more efficient! You'll also discover branches, identify how to create and switch to different branches, compare versions of files between branches, merge branches together, and deal with conflicting files across branches.


  1. Configuring Git
  2. Modifying your email address in Git
  3. Creating an alias
  4. Ignoring files
  5. Branches
  6. Branching and merging
  7. Creating new branches
  8. Checking the number of branches
  9. Comparing branches
  10. Working with branches
  11. Switching branches
  12. Merging two branches
  13. Handling conflict
  14. Recognizing conflict syntax
  15. Resolving a conflict

This final module is all about collaboration! You'll gain an introduction to remote repositories and learn how to work with them to synchronize content between the cloud and your local computer. You'll also see how to create new repositories and clone existing ones, along with discovering a workflow to minimize the risk of conflicts between local and remote repositories.


  1. Creating repos
  2. Setting up a new repo
  3. Converting an existing project
  4. Working with remotes
  5. Cloning a repo
  6. Defining and identifying remotes
  7. Gathering from a remote
  8. Fetching from a remote
  9. Pulling from a remote
  10. Pushing to a remote
  11. Pushing to a remote repo
  12. Handling push conflicts
  13. Wrap up!
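The remote workflow in this module can be sketched locally: a bare repository stands in for a cloud remote such as GitHub, so clone and push behave the same way without a network. All paths and names are illustrative:

```shell
set -e
work=$(mktemp -d)

# A local bare repository stands in for a remote on the cloud
git init -q --bare "$work/remote.git"

# Clone it, commit locally, then synchronize with push
git clone -q "$work/remote.git" "$work/local"
cd "$work/local"
git config user.email "you@example.com"
git config user.name "Your Name"

echo "results" > results.txt
git add results.txt
git commit -q -m "Add results"
git push -q origin HEAD       # upload the local commit to the remote

# The remote now advertises one branch containing the commit
git ls-remote --heads origin
```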


Software Engineering Principles in Python
Data scientists can experience huge benefits by learning concepts from the field of software engineering, allowing them to more easily reuse their code and share it with collaborators. In this course, you'll learn all about the important ideas of modularity, documentation, and automated testing, and you'll see how they can help you solve data science problems more quickly and in a way that will make future you happy. You'll even get to use your acquired software engineering chops to write your very own Python package for performing text analytics.

4 Modules | 5+ Hours | 4 Skills

Course Modules 


Why should you as a Data Scientist care about Software Engineering concepts? Here we'll cover specific Software Engineering concepts and how these important ideas can revolutionize your Data Science workflow!


  1. Python, data science, & software engineering
  2. The big ideas
  3. Python modularity in the wild
  4. Introduction to packages & documentation
  5. Installing packages with pip
  6. Leveraging documentation
  7. Conventions and PEP 8
  8. Using pycodestyle
  9. Conforming to PEP 8
  10. PEP 8 in documentation

Become a fully fledged Python package developer by writing your first package! You'll learn how to structure and write Python code that can be installed, used, and distributed just like famous packages such as NumPy and pandas.


  1. Writing your first package
  2. Minimal package requirements
  3. Naming packages
  4. Recognizing packages
  5. Adding functionality to packages
  6. Adding functionality to your package
  7. Using your package's new functionality
  8. Making your package portable
  9. Writing requirements.txt
  10. Installing package requirements
  11. Creating setup.py
  12. Listing requirements in setup.py
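A portable package declares its dependencies so others can install it. As a sketch, a minimal setup.py might look like the following; the package name, version, and requirements are all made up for illustration:

```python
# setup.py -- a minimal sketch (all metadata here is illustrative)
from setuptools import setup, find_packages

setup(
    name="text_analyzer",
    version="0.1.0",
    description="A toy text-analytics package",
    # find_packages discovers the package directories automatically
    packages=find_packages(include=["text_analyzer", "text_analyzer.*"]),
    # mirrors requirements.txt so `pip install .` pulls dependencies too
    install_requires=["matplotlib>=3.0", "numpy"],
)
```

With this in place, `pip install .` from the project root installs the package along with its listed requirements.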

Object Oriented Programming is a staple of Python development. By leveraging classes and inheritance your Python package will become a much more powerful tool for your users.


  1. Adding classes to a package
  2. Writing a class for your package
  3. Using your package's class
  4. Adding functionality to classes
  5. Writing a non-public method
  6. Using your class's functionality
  7. Classes and the DRY principle
  8. Using inheritance to create a class
  9. Adding functionality to a child class
  10. Using your child class
  11. Multilevel inheritance
  12. Exploring with dir and help
  13. Creating a grandchild class
  14. Using inherited methods
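The class-and-inheritance pattern this module builds toward can be sketched as below. The class names and the text-analysis behavior are illustrative, not the course's exact package:

```python
class Document:
    """Base class: stores text and counts words (shared by all children, DRY)."""

    def __init__(self, text):
        self.text = text
        self.word_counts = self._count_words()

    def _count_words(self):
        # Non-public helper, marked by the leading underscore convention
        counts = {}
        for word in self.text.split():
            counts[word] = counts.get(word, 0) + 1
        return counts


class SocialMedia(Document):
    """Child class: inherits the parsing, adds hashtag extraction."""

    def __init__(self, text):
        super().__init__(text)
        self.hashtags = [w for w in self.text.split() if w.startswith("#")]


post = SocialMedia("learning #python with #python and pandas")
print(post.word_counts)
print(post.hashtags)
```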

You've now written a fully functional Python package for text analysis! To make maintaining your project as easy as possible we'll leverage best practices around concepts such as documentation and unit testing.


  1. Documentation
  2. Identifying good comments
  3. Identifying proper docstrings
  4. Writing docstrings
  5. Readability counts
  6. Using good function names
  7. Using good variable names
  8. Refactoring for readability
  9. Unit testing
  10. Using doctest
  11. Using pytest
  12. Documentation & testing in practice
  13. Documenting classes for Sphinx
  14. Identifying tools
  15. Final Thoughts
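Documentation and testing meet in doctest: a properly written docstring can include an example that doubles as a test. A small sketch (the function is illustrative):

```python
def count_words(text):
    """Return the number of whitespace-separated words in text.

    Args:
        text (str): the string to count words in.

    Returns:
        int: the word count.

    >>> count_words("to be or not to be")
    6
    """
    return len(text.split())


if __name__ == "__main__":
    import doctest
    # doctest runs the docstring example and reports any mismatch
    print(doctest.testmod())
```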


ETL and ELT in Python
Empowering Analytics with Data Pipelines
Data pipelines are at the foundation of every strong data platform. Building these pipelines is an essential skill for data engineers, who provide incredible value to a business ready to step into a data-driven future. This introductory course will help you hone the skills to build effective, performant, and reliable data pipelines.

Building and Maintaining ETL Solutions

Throughout this course, you'll dive into the complete process of building a data pipeline. You'll grow skills leveraging Python libraries such as pandas and json to extract data from structured and unstructured sources before it's transformed and persisted for downstream use. Along the way, you'll develop confidence with tools and techniques such as architecture diagrams, unit tests, and monitoring that will help set your data pipelines apart from the rest. As you progress, you'll put your newfound skills to the test with hands-on exercises.


Supercharge Data Workflows
After completing this course, you’ll be ready to design, develop and use data pipelines to supercharge your data workflow in your job, new career, or personal project.

4 Modules | 5+ Hours | 4 Skills

Course Modules 


Get ready to discover how data is collected, processed, and moved using data pipelines. You will explore the qualities of the best data pipelines, and prepare to design and build your own.


  1. Introduction to ETL and ELT Pipelines
  2. Running an ETL Pipeline
  3. ELT in Action
  4. ETL and ELT Pipelines
  5. Building ETL and ELT Pipelines
  6. Building an ETL Pipeline
  7. The "T" in ELT
  8. Extracting, Transforming, and Loading Student Scores Data

Dive into leveraging pandas to extract, transform, and load data as you build your first data pipelines. Learn how to make your ETL logic reusable, and apply logging and exception handling to your pipelines.


  1. Extracting data from structured sources
  2. Extracting data from parquet files
  3. Pulling data from SQL databases
  4. Building functions to extract data
  5. Transforming data with pandas
  6. Filtering pandas DataFrames
  7. Transforming sales data with pandas
  8. Validating data transformations
  9. Persisting data with pandas
  10. Loading sales data to a CSV file
  11. Customizing a CSV file
  12. Persisting data to files
  13. Monitoring a data pipeline
  14. Logging within a data pipeline
  15. Handling exceptions when loading data
  16. Monitoring and alerting within a data pipeline
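The extract-transform-load steps with logging and exception handling can be sketched as below. The sales data, column names, and output path are all illustrative:

```python
import io
import logging
import os
import tempfile

import pandas as pd

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pipeline")


def extract(raw_csv):
    """Extract: read raw CSV text into a DataFrame."""
    return pd.read_csv(io.StringIO(raw_csv))


def transform(df):
    """Transform: keep completed sales and add a total column."""
    df = df[df["status"] == "complete"].copy()
    df["total"] = df["price"] * df["quantity"]
    return df


def load(df, path):
    """Load: persist to CSV, logging failures rather than failing silently."""
    try:
        df.to_csv(path, index=False)
        logger.info("Loaded %d rows to %s", len(df), path)
    except OSError:
        logger.exception("Failed to load data")
        raise


# Inline CSV stands in for an extracted source file
raw = "status,price,quantity\ncomplete,10.0,2\npending,5.0,1\ncomplete,3.5,4\n"
out_path = os.path.join(tempfile.gettempdir(), "sales_clean.csv")

sales = transform(extract(raw))
load(sales, out_path)
print(sales)
```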

Supercharge your workflow with advanced data pipelining techniques, such as working with non-tabular data and persisting DataFrames to SQL databases. Discover tooling to tackle advanced transformations with pandas, and uncover best-practices for working with complex data.


  1. Extracting non-tabular data
  2. Ingesting JSON data with pandas
  3. Reading JSON data into memory
  4. Transforming non-tabular data
  5. Iterating over dictionaries
  6. Parsing data from dictionaries
  7. Transforming JSON data
  8. Transforming and cleaning DataFrames
  9. Advanced data transformation with pandas
  10. Filling missing values with pandas
  11. Grouping data with pandas
  12. Applying advanced transformations to DataFrames
  13. Loading data to a SQL database with pandas
  14. Loading data to a Postgres database
  15. Validating data loaded to a Postgres Database
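The non-tabular-to-SQL path can be sketched end to end: parse nested JSON, flatten it by iterating over the dictionaries, then persist with to_sql and validate the load. SQLite stands in for Postgres here, and the school records are made up:

```python
import json
import sqlite3

import pandas as pd

# Non-tabular input: nested JSON text (illustrative records)
raw = (
    '[{"school": "PS 1", "scores": {"math": 657, "reading": 601}},'
    ' {"school": "PS 2", "scores": {"math": 613, "reading": 589}}]'
)
records = json.loads(raw)

# Iterate over the dicts, flattening each nested "scores" entry
rows = [{"school": r["school"], **r["scores"]} for r in records]
df = pd.DataFrame(rows)

# Load to a SQL database (an in-memory SQLite DB stands in for Postgres)
conn = sqlite3.connect(":memory:")
df.to_sql("scores", conn, index=False)

# Validate what was loaded by reading it back
loaded = pd.read_sql("SELECT * FROM scores", conn)
print(loaded)
```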

In this final module, you’ll create frameworks to validate and test data pipelines before shipping them into production. After you’ve tested your pipeline, you’ll explore techniques to run your data pipeline end-to-end, all while allowing for visibility into pipeline performance.


  1. Manually testing a data pipeline
  2. Testing data pipelines
  3. Validating a data pipeline at "checkpoints"
  4. Testing a data pipeline end-to-end
  5. Unit-testing a data pipeline
  6. Validating a data pipeline with assert
  7. Writing unit tests with pytest
  8. Creating fixtures with pytest
  9. Unit testing a data pipeline with fixtures
  10. Running a data pipeline in production
  11. Orchestration and ETL tools
  12. Data pipeline architecture patterns
  13. Running a data pipeline end-to-end
  14. Wrap-Up!
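The checkpoint-validation style this module teaches can be sketched with plain asserts; under pytest the input DataFrame would come from an @pytest.fixture instead. The transform step and data are illustrative:

```python
import pandas as pd


def transform(df):
    """The pipeline step under test: drop rows with missing scores."""
    return df.dropna(subset=["score"])


def test_transform_removes_missing():
    # Fixture-style input (pytest would provide this via @pytest.fixture)
    raw = pd.DataFrame({"name": ["a", "b", "c"], "score": [1.0, None, 3.0]})
    clean = transform(raw)

    # "Checkpoint" validations on the transformed output
    assert clean["score"].notna().all()
    assert len(clean) == 2
    assert list(clean.columns) == ["name", "score"]


test_transform_removes_missing()
print("all checks passed")
```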


Introduction to Apache Airflow in Python
Now Updated to Apache Airflow 2.7 - Delivering data on a schedule can be a manual process. You write scripts, add complex cron tasks, and try various ways to meet an ever-changing set of requirements—and it's even trickier to manage everything when working with teammates. Apache Airflow can remove this headache by adding scheduling, error handling, and reporting to your workflows. In this course, you'll master the basics of Apache Airflow and learn how to implement complex data engineering pipelines in production. You'll also learn how to use Directed Acyclic Graphs (DAGs), automate data engineering workflows, and implement data engineering tasks in an easy and repeatable fashion—helping you to maintain your sanity.

4 Modules | 5+ Hours | 4 Skills

Course Modules 


In this module, you’ll gain a complete introduction to the components of Apache Airflow and learn how and why you should use them.


  1. Introduction to Apache Airflow
  2. Testing a task in Airflow
  3. Examining Airflow commands
  4. Airflow DAGs
  5. Defining a simple DAG
  6. Working with DAGs and the Airflow shell
  7. Troubleshooting DAG creation
  8. Airflow web interface
  9. Starting the Airflow webserver
  10. Navigating the Airflow UI
  11. Examining DAGs with the Airflow UI
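To preview the pieces introduced here, a minimal DAG definition sketch is shown below. It assumes an Airflow 2.x installation; the dag_id, schedule, and task are illustrative, not taken from the course:

```python
# A minimal Airflow DAG file sketch (requires Apache Airflow 2.x installed;
# all names and the schedule below are illustrative)
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="example_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",   # run once per day
    catchup=False,       # skip backfilling past runs
) as dag:
    # A single task that runs a shell command on each scheduled run
    cleanup = BashOperator(task_id="cleanup", bash_command="echo cleaning")
```

Placed in the DAGs folder, this file is picked up by the scheduler and appears in the Airflow web interface.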

What’s up DAG? Now it’s time to learn the basics of implementing Airflow DAGs. Through hands-on activities, you’ll learn how to set up and deploy operators, tasks, and scheduling.


  1. Airflow operators
  2. Defining a BashOperator task
  3. Multiple BashOperators
  4. Airflow tasks
  5. Define order of BashOperators
  6. Determining the order of tasks
  7. Troubleshooting DAG dependencies
  8. Additional operators
  9. Using the PythonOperator
  10. More PythonOperators
  11. EmailOperator and dependencies
  12. Airflow scheduling
  13. Schedule a DAG via Python
  14. Deciphering Airflow schedules
  15. Troubleshooting DAG runs

In this module, you’ll learn how to save yourself time using Airflow components such as sensors and executors while monitoring and troubleshooting Airflow workflows.


  1. Airflow sensors
  2. Sensors vs operators
  3. Sensory deprivation
  4. Airflow executors
  5. Determining the executor
  6. Executor implications
  7. Debugging and troubleshooting in Airflow
  8. DAGs in the bag
  9. Missing DAG
  10. SLAs and reporting in Airflow
  11. Defining an SLA
  12. Defining a task SLA
  13. Generate and email a report
  14. Adding status emails

Put it all together. In this final module, you’ll apply everything you've learned to build a production-quality workflow in Airflow.


  1. Working with templates
  2. Creating a templated BashOperator
  3. Templates with multiple arguments
  4. More templates
  5. Using lists with templates
  6. Understanding parameter options
  7. Sending templated emails
  8. Branching
  9. Define a BranchPythonOperator
  10. Branch troubleshooting
  11. Creating a production pipeline
  12. Creating a production pipeline #1
  13. Creating a production pipeline #2
  14. Adding the final changes to your pipeline
  15. Wrap-up!

COMPLETE DATA ENGINEER WITH SQL & PYTHON COST


United States

$899.99

United Kingdom

£799.99

Career and Certifications


GreaterHeight Academy's Certificate Holders are also prepared to work at companies like:



Our Advisor is just a CALL away

+1 5169831065                                    +447474275645
Available 24x7 for your queries


Talk to our advisors

Our advisors will get in touch with you in the next 24 hours.


Get Advice


FAQs

Complete Data Analysis & Visualization with Python Course

  • Python, created by Guido van Rossum in 1991, is a high-level, readable programming language known for its simplicity. It's versatile, with applications in web development, data analysis, AI, and more. Python's extensive standard library and rich ecosystem enhance its capabilities. It's cross-platform compatible and supported by a large community. Python's popularity has grown, making it widely used in diverse industries.

  • A Python developer is a software developer or programmer who specializes in using the Python programming language for creating applications, software, or solutions. They have expertise in writing Python code, understanding the language's syntax, libraries, and frameworks. Python developers are skilled in utilizing Python's features to develop web applications, data analysis tools, machine learning models, automation scripts, and other software solutions.
  • They work in various industries, collaborating with teams or independently to design, implement, test, and maintain Python-based projects. Python developers often possess knowledge of related technologies and tools to enhance their development process.

  • Python Developer Masters Program is a structured learning path recommended by leading industry experts that ensures you transform into a proficient Python developer. Being a full-fledged Python developer requires you to master multiple technologies, and this program aims at providing you with in-depth knowledge of the entire range of Python programming practices. Individual courses at GreaterHeight Academy focus on specialization in one or two specific skills; however, if you intend to become a master in Python programming, then this is your go-to path to follow.

  • Yes. But you can also raise a ticket with the dedicated support team at any time. If your query does not get resolved through email, we can also arrange one-on-one sessions with our support team. However, our support is provided for a period of twelve weeks from the start date of your course.

There are several reasons why becoming a Python developer can be a rewarding career choice. Here are a few:

  • Versatility and Popularity: Python is a versatile programming language that can be used for various purposes, such as web development, data analysis, machine learning, artificial intelligence, scientific computing, and more. It has gained immense popularity in recent years due to its simplicity, readability, and extensive library ecosystem. Python is widely used in both small-scale and large-scale projects, making it a valuable skill in the job market.
  • Ease of Learning: Python has a clean and intuitive syntax that emphasizes readability, which makes it relatively easy to learn compared to other programming languages. Its simplicity allows beginners to grasp the fundamentals quickly and start building useful applications in a relatively short amount of time. This accessibility makes Python an attractive choice for both novice and experienced programmers.
  • Rich Ecosystem and Libraries: Python offers a vast collection of libraries and frameworks that can accelerate development and simplify complex tasks. For example, Django and Flask are popular web development frameworks that provide robust tools for building scalable and secure web applications. NumPy, Pandas, and Matplotlib are widely used libraries for data analysis and visualization. TensorFlow and PyTorch are prominent libraries for machine learning and deep learning. These libraries, among many others, contribute to Python's efficiency and effectiveness as a development language.
  • Job Opportunities: The demand for Python developers has been steadily growing in recent years. Many industries, including technology, finance, healthcare, and academia, rely on Python for various applications. By becoming a Python developer, you open up a wide range of career opportunities, whether you choose to work for a large corporation, a startup, or even as a freelancer. Additionally, Python's versatility allows you to explore different domains and switch roles if desired.
  • Community and Support: Python has a vibrant and supportive community of developers worldwide. This community actively contributes to the language's development, creates open-source libraries, and provides assistance through forums, online communities, and resources.

  • There are no prerequisites for enrollment to this Masters Program. Whether you are an experienced professional working in the IT industry or an aspirant planning to enter the world of Python programming, this masters program is designed and developed to accommodate various professional backgrounds.

  • Python Developer Masters Program has been curated after thorough research and recommendations from industry experts. It will help you differentiate yourself with multi-platform fluency and have real-world experience with the most important tools and platforms. GreaterHeight Academy will be by your side throughout the learning journey - We’re Ridiculously Committed.

  • The recommended duration to complete this Python Developer Masters Program is about 20 weeks, however, it is up to the individual to complete this program at their own pace.

The roles and responsibilities of a Python developer may vary depending on the specific job requirements and industry. However, here are some common tasks and responsibilities associated with the role:

  1. Developing Applications: Python developers are responsible for designing, coding, testing, and debugging applications using Python programming language. This includes writing clean, efficient, and maintainable code to create robust software solutions.
  2. Web Development: Python is widely used for web development. As a Python developer, you may be involved in building web applications, using frameworks like Django or Flask. This includes developing backend logic, integrating databases, handling data processing, and ensuring the smooth functioning of the web application.
  3. Data Analysis and Visualization: Python offers powerful libraries like NumPy, Pandas, and Matplotlib, which are extensively used for data analysis and visualization. Python developers may be responsible for manipulating and analyzing large datasets, extracting insights, and presenting them visually.
  4. Machine Learning and AI: Python is a popular choice for machine learning and artificial intelligence projects. Python developers may work on implementing machine learning algorithms, training models, and integrating them into applications. This involves using libraries like TensorFlow, PyTorch, or scikit-learn.
  5. Collaborating and Teamwork: Python developers often work as part of a development team. They collaborate with other team members, including designers, frontend developers, project managers, and stakeholders. Effective communication and teamwork skills are crucial to ensure smooth project execution.
  6. Documentation: Python developers are expected to document their code, providing clear explanations and instructions for others who may work on or maintain the codebase in the future. Documentation helps in understanding the code and facilitating collaboration.
  7. Continuous Learning: Technology is constantly evolving, and as a Python developer, you need to stay updated with the latest advancements, libraries, frameworks, and best practices. Continuous learning and self-improvement are essential to excel in this role.
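The data-analysis responsibilities above (point 3) can be illustrated with a short sketch using pandas; the dataset, column names, and figures below are invented purely for demonstration:

```python
# A minimal sketch of the data-analysis workflow described above,
# using pandas to aggregate a small, made-up sales dataset.
import pandas as pd

# Hypothetical dataset for illustration only
sales = pd.DataFrame({
    "region": ["North", "South", "North", "South"],
    "revenue": [120.0, 80.0, 150.0, 95.0],
})

# Aggregate revenue by region to extract a simple insight
totals = sales.groupby("region")["revenue"].sum()
print(totals.to_dict())  # {'North': 270.0, 'South': 175.0}
```

In practice, the data would come from a database, CSV file, or API rather than an inline dictionary, and the aggregation would feed a Matplotlib chart or a report.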

The Python Developer training course is for those who want to fast-track their Python programming career. This Python Developer Masters Program will benefit people working in the following roles:

  1. Freshers
  2. Engineers
  3. IT professionals
  4. Data Scientists
  5. Machine Learning Engineers
  6. AI Engineers
  7. Business Analysts
  8. Data Analysts

  • Top companies such as Microsoft, Google, Meta, Citibank, Wells Fargo, and many more are actively hiring certified Python professionals for various positions.

  • On completing this Python Developer Masters Program, you’ll be eligible for roles such as Python Developer, Web Developer, Data Analyst, Data Scientist, Software Engineer, and many more.

  • There is undoubtedly great demand for data analytics, with as many as 96% of organizations reportedly seeking to hire Data Analysts. Significant employers of data analyst graduates include Manthan, SAP, Oracle, Accenture Analytics, Alteryx, Qlik, Mu Sigma Analytics, Fractal Analytics, and Tiger Analytics. Professional Data Analyst training equips you to become a key asset to any organization, turning big data into actionable insights.

A successful data analyst possesses a combination of technical skills and leadership skills.

  • Technical skills include knowledge of query and programming languages such as SQL, R, or Python; spreadsheet tools such as Microsoft Excel or Google Sheets for statistical analysis; and data visualization software such as Tableau or Qlik. Mathematical and statistical skills are also valuable to help gather, measure, organize, and analyze data while using these common tools.
  • Leadership skills prepare a data analyst to complete decision-making and problem-solving tasks. These abilities allow analysts to think strategically about the information that will help stakeholders make data-driven business decisions and to communicate the value of this information effectively. For example, project managers rely on data analysts to track the most important metrics for their projects, to diagnose problems that may be occurring, and to predict how different courses of action could address a problem.
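As one illustration of the SQL skills and project-tracking tasks described above, the following sketch uses Python's built-in sqlite3 module with an in-memory database; the table, columns, and figures are invented for demonstration:

```python
# A small illustration of the kind of SQL query a data analyst might run,
# using Python's built-in sqlite3 module with an in-memory database.
# The table schema and numbers are made up for demonstration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (name TEXT, budget REAL, spent REAL)")
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?)",
    [("Alpha", 100.0, 90.0), ("Beta", 50.0, 65.0), ("Gamma", 80.0, 40.0)],
)

# Flag projects that are over budget -- the kind of metric a
# project manager might ask an analyst to track.
over_budget = conn.execute(
    "SELECT name FROM projects WHERE spent > budget"
).fetchall()
print(over_budget)  # [('Beta',)]
```

A real engagement would query a production database such as SQL Server or PostgreSQL, but the query pattern is the same.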

Career openings are available in practically every industry, from telecommunications to retail, banking, healthcare, and even fitness. However, reaping the benefits of a data analyst career is difficult without extensive training and effort, so earning our Data Analyst certification will help you keep up to date with recent trends in the industry.

  • Yes, we do. We will discuss all possible technical interview questions and answers during the training program so that you can prepare yourself for interviews.

  • No. Any abuse of copyright is taken seriously. Thanks for your understanding on this one.

  • Yes, we will provide you with a certificate of completion for the program once you have successfully submitted all the assessments and they have been verified by our subject matter experts.

  • GreaterHeight offers the most updated, relevant, and high-value real-world projects as part of the training program, so you can apply what you have learned in a real-world industry setting. All training comes with multiple projects that thoroughly test your skills, learning, and practical knowledge, making you completely industry ready.
  • You will work on highly exciting projects in domains such as high technology, ecommerce, marketing, sales, networking, banking, and insurance. After completing the projects successfully, your skills will be equivalent to six months of rigorous industry experience.

All our mentors are highly qualified and experienced professionals, with 15 to 20 years of development experience across various technologies, and they are trained by GreaterHeight Academy to deliver interactive training to participants.

Yes, we do. As technology evolves, we update our content and deliver training on the latest version of that technology.

  • All online training classes are recorded. You will receive the recorded sessions so that you can watch them whenever you want. You can also join another batch to make up any missed classes.

OUR POPULAR COURSES

Data Analytics and Visualization With Python

Develop advanced expertise in cleaning, transforming, and modelling data to obtain insights for corporate decision making as a Senior Data Analyst - using Python.

View Details
Data Science Training Masters Program

Learn Python, Statistics, Data Preparation, Data Analysis, Querying Data, Machine Learning, Clustering, Text Processing, Collaborative Filtering, Image Processing, etc.

View Details
Microsoft Azure DP-100 Data Science

You will Optimize & Manage Models, Perform Administration by using T-SQL, Run Experiments & Train Models, Deploy & Consume Models, and Automate Tasks.

View Details
Machine Learning using Python

Learn Data Science and Machine Learning from scratch, get hired, and have fun along the way with the most modern, up-to-date Data Science course available.

View Details
Microsoft Azure PL-300 Data Analysis

You will learn how to Design a Data Model in Power BI, Optimize Model Performance, Manage Datasets in Power BI, and Create Paginated Reports.

View Details
Microsoft Azure DP-203 Data Engineer

You will learn Batch & Real-Time Analytics, Azure Synapse Analytics, Azure Databricks, Implementing Security, and ETL & ELT Pipelines.

View Details

The GreaterHeight Advantage

Accredited Courseware

Most of our training courses are accredited by the respective governing bodies.

Assured Classes

All our training courses are assured to run, and scheduled dates are confirmed by our subject matter experts.

Expert Instructor Led Programs

We have well-equipped and highly experienced instructors to train professionals.

OUR CLIENTS

We Have Worked With Some Amazing Companies Around The World

Some of the awesome clients we've had the pleasure to work with!

