Zuber Ahmed
LinkedIn | Kaggle | +91 8369088339 | xuberswork@[Link]
Summary
Big Data Engineer with 3+ years of experience designing and optimizing scalable ETL pipelines and delivering
Modern Lakehouse Architectures that improve performance, reduce project costs, and support reliable,
production-ready data platforms. Also brings experience in machine learning and predictive analytics, adding
analytical depth to end-to-end data engineering. Collaborates with cross-functional teams to deliver
high-quality solutions, saving around 30 to 40% of project cost.
Tools/Skills
Languages/Frameworks: Python, Spark/PySpark, Pandas, NumPy, Seaborn, Scikit-learn, XGBoost, LightGBM.
Database: SQL, PostgreSQL, Neo4j.
Cloud/DWH: Databricks, Azure Data Factory, Azure Synapse, Azure Blob Storage, GCP BigQuery, AWS S3.
Skills/Misc: Data Modeling, Data Pipelines (ETL / ELT), Data Quality, Airflow, Machine Learning, Forecasting, NLP,
Docker, Jenkins, FastAPI, Data Cleaning, Gradient Boosting, Git, BitBucket, Jira, Confluence, Presentation.
Work Experience
DecisionNXT
Data Engineer Consultant (Remote), December 2023 - Present
• Pioneered the development of a custom SaaS analytics platform for a pharmaceutical client, integrating diverse
data sources and replacing multiple BI tools across 3+ projects.
• Cut license costs by 100% and removed vendor lock-in, yielding significant operational savings.
• Migrated legacy Lambda Architecture to Medallion Architecture with Unity Catalog, centralizing metadata
governance and access controls for Delta Lake tables across 10+ business units.
• Designed scalable data pipelines with Databricks Spark optimizations (partitioning, caching, bucketing,
CDC/SCD, DLT), cutting execution times by 30-60% in large-scale data processing.
• Expanded scalable data ingestion and transformation workflows using Azure SDK and Jenkins, reducing
manual intervention and deployment time by 30%.
• Tuned SQL tables, queries, and the data model for performance, reducing UI reload times by 2-3x.
• Initiated development of a POC and presented the results to senior leadership, obtaining
approval for a solution that improved deployment speed by 25%.
Piramal Groups
Data Science Engineer (Mumbai), June 2022 - November 2023
• Initiated an end-to-end NLP sentiment analysis pipeline using BERT, PyTorch, and PySpark, automating
feedback classification in 120–180 seconds and improving overall processing efficiency by 50–70%.
• Built EXIM data workflows by integrating Azure Blob Storage and Azure Databricks, enabling efficient
ingestion, cleaning, and transformation of raw data with nearly 80% higher processing efficiency and feeding
Qlik Sense dashboards that give insight into competitors' import and export trends.
• Coordinated cross-functional teams to integrate 10 data sources and add data quality checks, improving
reporting accuracy by 30%.
• Engineered a Qlik Sense dashboard for the Manipur plant to analyze API formulation cycle times, boosting
overall efficiency by 30%.
• This low-latency solution helped achieve cost savings of approximately $130 per month per block.
Education
St. Francis Institute of Technology, Mumbai, India
Bachelor of Engineering in EXTC, April 2017 - July 2021
Subjects: Signal Processing, Computer Vision, Linear Algebra, Calculus, Statistics, Python, SQL