Azure Data Engineering Pipeline Overview

The document presents an overview of Azure Data Engineering by Tanishka, highlighting her background, skills, and current roles in AI and data engineering. It details a project focused on building an Azure Data Engineering pipeline for analyzing Tokyo Olympics data, including architecture, data ingestion, processing, and visualization. Challenges faced during the project and their solutions are also discussed, along with resources for further learning.

Azure Data Engineering

Presented by Tanishka

Meet Tanishka
Hi, I'm Tanishka, a DevOps & Cloud Enthusiast with a strong
interest in Data Engineering & AI-driven Automation.

Current Roles & Learning:

ML Intern at Unified Mentor: working on AI/ML & data-driven solutions.
Pursuing MSc in AI & ML from IIITB & LJMU (UK)
Microsoft Azure AI-900 Certified.
Skills & Expertise
Cloud & DevOps Tools

Microsoft Azure: Data Factory, Synapse Analytics, Blob Storage
Terraform for Infrastructure as Code (IaC) and Kubernetes for container orchestration
CI/CD Pipelines: Jenkins, GitHub Actions, ArgoCD

Data Engineering & AI Tools

Azure Databricks (PySpark) for big data processing
SQL & Power BI for analytics and visualization
Machine Learning & Python for predictive insights
Data Engineering Project:
Objective: Build an end-to-end Azure Data Engineering pipeline to process
& analyze Tokyo Olympics data.

Key Features:

Ingests real-time & historical Olympics data from multiple sources.
Cleans, transforms, and stores structured data for analysis.
Visualizes athlete performance, medal standings & country trends.

Architecture Overview:

Ingestion: Azure Data Factory (ADF) + APIs & Blob Storage
Processing: Azure Databricks (PySpark)
Storage & Analytics: Azure Synapse + Power BI
Orchestration: Azure Logic Apps & Automation
Architecture Breakdown
Data Ingestion
Gathering data from various sources

Processing
Transforming and cleaning the data

Storage
Storing data in Azure Synapse Analytics

Visualization
Presenting insights through dashboards
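
The four stages above can be illustrated with a minimal sketch in plain Python. This is not the project's code: in the pipeline described here the processing step runs in Azure Databricks with PySpark, and the column names, sample rows, and function names below are hypothetical placeholders chosen only to show how data moves from ingestion through cleaning to an aggregate that a dashboard could display.

```python
import csv
import io
from collections import Counter

# Hypothetical sample input. The rows and column names are placeholders,
# not values taken from the project's dataset.
RAW_CSV = """athlete,country,medal
A1, Japan ,Gold
A2,japan,silver
A3,Norway,
A4,Norway,Gold
"""

def ingest(text):
    """Ingestion: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def clean(rows):
    """Processing: trim whitespace, normalise casing, drop rows with no medal."""
    cleaned = []
    for row in rows:
        medal = (row["medal"] or "").strip().title()
        if not medal:
            continue  # skip athletes without a medal entry
        cleaned.append({
            "athlete": row["athlete"].strip(),
            "country": row["country"].strip().title(),
            "medal": medal,
        })
    return cleaned

def medal_counts(rows):
    """Analytics stand-in: count medals per country for a dashboard."""
    return Counter(row["country"] for row in rows)

counts = medal_counts(clean(ingest(RAW_CSV)))
```

Keeping each stage as a separate function mirrors the separation between ingestion, processing, storage, and visualization in the architecture, so any one stage can be swapped for its Azure counterpart without changing the others.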
Incremental Data Loading for Tokyo Olympics Project

1. Initial Load
Load all historical Tokyo Olympics data.

2. Change Data Capture
Identify new or updated Tokyo Olympics data.

3. Incremental Load
Load only the changed Tokyo Olympics data.

After the initial load, each run moves only the new or updated records instead of reprocessing the full dataset. This optimizes performance and reduces load times in the Azure Data Engineering pipeline.
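
One common way to implement these three steps is a high-watermark pattern: remember the latest modification timestamp already loaded and, on each run, pick up only rows modified after it. The sketch below assumes this pattern; the slides do not state which change-detection mechanism the project used. The `updated_at` field, the `incremental_load` function, and the sample rows are hypothetical.

```python
from datetime import datetime

def incremental_load(source_rows, target, last_watermark):
    """Upsert rows changed since last_watermark; return (new watermark, rows loaded)."""
    # Change data capture: keep only rows modified after the watermark.
    changed = [r for r in source_rows if r["updated_at"] > last_watermark]
    for row in changed:
        target[row["id"]] = row  # insert new keys, overwrite existing ones
    new_watermark = max((r["updated_at"] for r in changed), default=last_watermark)
    return new_watermark, len(changed)

# Placeholder rows, not real Olympics data.
source = [
    {"id": 1, "medals": 3, "updated_at": datetime(2021, 7, 25)},
    {"id": 2, "medals": 1, "updated_at": datetime(2021, 7, 26)},
]
target = {}

# Initial load: a watermark earlier than every row loads everything.
watermark, n_initial = incremental_load(source, target, datetime.min)

# The source changes: row 2 is updated and row 3 is new.
source[1] = {"id": 2, "medals": 2, "updated_at": datetime(2021, 7, 27)}
source.append({"id": 3, "medals": 1, "updated_at": datetime(2021, 7, 28)})

# Incremental load: only rows modified after the watermark are touched.
watermark, n_incremental = incremental_load(source, target, watermark)
```

Row 1 is not reloaded on the second run because its timestamp is not later than the stored watermark, which is where the saving in load time comes from.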
Challenges & Solutions
Challenge 1
Handling large data volumes.

Solution: Optimized data partitioning.
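
The slides do not say which partitioning scheme was used. As an illustration of the general idea, the sketch below splits rows into one directory per value of a chosen key, using the `key=value` directory naming that Spark's `partitionBy` also produces. The choice of `country` as the key, the file name `part-0.csv`, and the sample rows are assumptions made for the example.

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path

def write_partitioned(rows, base_dir, key):
    """Write rows into one sub-directory per distinct value of `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    for value, group in groups.items():
        part_dir = Path(base_dir) / f"{key}={value}"
        part_dir.mkdir(parents=True, exist_ok=True)
        with open(part_dir / "part-0.csv", "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(group[0].keys()))
            writer.writeheader()
            writer.writerows(group)
    return sorted(groups)

def read_partition(base_dir, key, value):
    """Read a single partition without opening the others."""
    path = Path(base_dir) / f"{key}={value}" / "part-0.csv"
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

# Placeholder rows, not real Olympics data.
rows = [
    {"athlete": "A1", "country": "Japan"},
    {"athlete": "A2", "country": "Norway"},
    {"athlete": "A3", "country": "Japan"},
]
base = tempfile.mkdtemp()
partitions = write_partitioned(rows, base, "country")
japan_rows = read_partition(base, "country", "Japan")
```

A query that filters on the partition key only has to read the matching directory, which is why partitioning helps with large data volumes when the key matches common filter conditions.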

Challenge 2
Optimizing ETL performance.

Solution: Used Azure best practices.


Scope of Azure Data Engineering
Data Integration
Integrating diverse data sources into a unified platform.

Big Data Processing
Handling large volumes of data using scalable solutions.

Data Warehousing
Building and managing data warehouses for analytical reporting.

Machine Learning
Enabling machine learning models with reliable data pipelines.
Next Steps & Resources

Microsoft Learn
Expand your knowledge

GitHub
Explore community projects

Azure Documentation and Stack Overflow provide additional support.
