Bipina Poudel

Toronto, Canada · Open to opportunities

Senior Data Engineer · CAPM®

Senior Data Engineer with 7+ years of experience building scalable cloud-native data platforms processing multi-terabyte enterprise datasets across AI analytics, telecom, and healthcare domains. Expertise in AWS, Azure Databricks, Snowflake, PySpark, Apache Spark, Kafka, Airflow, Terraform, and Spark Streaming. Proven track record of optimizing ETL/ELT pipelines, enabling real-time analytics, and implementing scalable Lakehouse architectures supporting enterprise reporting and AI/ML workloads.

7+ Years Experience
3 Industries
2 Cloud Platforms
3 Certifications

Experience

Where I've worked and what I've built

Senior Data Engineer

Skill Squirrel · Toronto, Ontario

Feb 2024 – Present

Designed and developed scalable ETL/ELT pipelines using Azure Databricks, PySpark, Azure Data Factory, and AWS Glue, processing 5TB+ of workforce analytics data daily to support enterprise AI and workforce intelligence.

  • Built enterprise Lakehouse architecture using Azure Data Lake Gen2, AWS S3, Delta Lake, and Snowflake following Medallion Architecture standards
  • Developed Kafka and Spark Structured Streaming pipelines processing 1M+ candidate activity events daily for real-time workforce intelligence analytics
  • Implemented Apache Airflow DAGs and Delta Live Tables (DLT) workflows, improving pipeline reliability and reducing manual intervention by 40%
  • Optimized Spark workloads using partitioning, caching, AQE, and broadcast joins, reducing ETL processing time by 45%
  • Developed reusable PySpark and dbt transformation frameworks supporting CDC-based incremental data ingestion and SCD Type 2 implementations
  • Integrated Great Expectations and data observability frameworks, raising enterprise data quality accuracy to 99.5%
  • Built CI/CD pipelines using Terraform, Azure DevOps, Jenkins, and GitHub Actions, automating deployment and infrastructure provisioning
  • Collaborated with AI/ML teams to build feature-engineered datasets and MLflow-integrated pipelines supporting predictive hiring models

Environment: AWS (S3, Glue, Lambda), Azure Databricks, PySpark, Spark SQL, Delta Lake, Snowflake, Kafka, Airflow, Azure Data Factory, dbt, Terraform, Python, SQL, Docker, Jenkins, GitHub Actions, CI/CD

Data Engineer

T-Mobile · Bellevue, WA

Sep 2021 – Jan 2024

Developed scalable PySpark and Spark SQL pipelines processing 10TB+ of telecom customer, billing, subscriber, and network analytics data daily, supporting Customer 360 analytics for 50M+ subscribers.

  • Built large-scale streaming ingestion frameworks using AWS Glue, Kafka, Spark Structured Streaming, and Azure Event Hub, supporting near real-time telecom analytics
  • Designed cloud-native Lakehouse architecture using AWS S3, Delta Lake, Snowflake, and Azure Databricks for Customer 360 analytics
  • Developed Kafka-based real-time event processing pipelines, reducing network alert processing latency from 2 hours to under 15 minutes
  • Implemented Apache Airflow and Azure Data Factory workflows orchestrating 500+ enterprise ETL jobs with 99.9% SLA compliance
  • Utilized AWS EMR distributed clusters for processing large-scale telecom network and operational datasets across multi-region environments
  • Optimized Snowflake and Spark performance using clustering, partitioning, caching, and query tuning, improving analytics query performance by 60%
  • Automated infrastructure provisioning and CI/CD deployment pipelines using Terraform, Jenkins, Docker, and GitHub Actions, reducing deployment effort by 70%

Environment: AWS (Glue, EMR, S3, Lambda), Azure Databricks, PySpark, Spark SQL, Snowflake, Kafka, Spark Streaming, Airflow, Azure Data Factory, dbt, Terraform, Python, SQL, Docker, Jenkins, CI/CD

Data Engineer

Cedar Gate Technologies · Greenwich, CT

Jul 2019 – Aug 2021

Developed ETL/ELT pipelines using AWS Glue, Databricks, PySpark, and Azure Data Factory, processing 100M+ healthcare records in a HIPAA-compliant environment.

  • Built enterprise Lakehouse architecture using AWS S3, ADLS Gen2, Delta Lake, and Snowflake with Medallion Architecture patterns
  • Automated healthcare ETL workflows using Apache Airflow, improving SLA compliance to 99.9%
  • Designed dimensional models and CDC pipelines for enterprise claims, provider, and member reporting
  • Developed Kafka streaming pipelines for real-time healthcare event processing
  • Implemented IAM security policies and HIPAA-compliant governance controls
  • Optimized Spark and Snowflake workloads, reducing ETL failures by 35%
  • Automated deployment and monitoring using Terraform, Jenkins, and CloudWatch

Environment: AWS (Glue, S3, Redshift, IAM), Azure Databricks, PySpark, Spark SQL, Delta Lake, Snowflake, Kafka, Airflow, Terraform, Python, SQL Server, CI/CD


Skills & Technologies

Tools I work with day to day

Programming Languages

Python, SQL, Java, Scala, Shell Scripting

Big Data Technologies

Apache Spark, PySpark, Spark SQL, Kafka, Hadoop, Hive, Delta Lake, Delta Live Tables (DLT)

Cloud Platforms

AWS (S3, Glue, Lambda, Redshift, IAM, EC2, CloudWatch, CDK, Step Functions, MWAA, CodePipeline); Azure (Databricks, ADF, ADLS Gen2, Synapse Analytics, Event Hub, Key Vault)

Data Warehousing & Databases

Snowflake, Amazon Redshift, Azure Synapse, SQL Server, PostgreSQL, MySQL, MongoDB

ETL & Orchestration

dbt, Apache Airflow, AWS MWAA, AWS Glue, Azure Data Factory, Talend, Fivetran, AWS Step Functions

Streaming Technologies

Apache Kafka, Spark Structured Streaming, Azure Event Hub

DevOps & Infrastructure

Terraform, AWS CDK, Docker, Kubernetes, Jenkins, Azure DevOps, GitHub Actions, Git, CI/CD

Data Governance & Observability

Unity Catalog, Great Expectations, Data Lineage, Data Observability, Prometheus, Grafana, CloudWatch, GDPR, CCPA, HIPAA, PII Masking

Visualization & Reporting

Power BI, Tableau, Looker, Real-Time Dashboards

Projects

Things I've built and problems I've solved

More on github.com/bipinapoudel


Certifications

Professional credentials and qualifications

AWS Certified Data Engineer – Associate

Amazon Web Services

Microsoft Certified: Azure Databricks Data Engineer Associate

Microsoft

Certified Associate in Project Management (CAPM)®

Project Management Institute


Education

Academic background

P.G. in Project Management – IT

Seneca College

Toronto, Canada

P.G. in Cyber Security and Threat Management

Seneca College

Toronto, Canada

B.S. in Computing (Honours) – Information Technology

Leeds Beckett University

Kathmandu, Nepal

Get in Touch

Have a project or opportunity in mind? Let's talk.

I'm open to data engineering roles, freelance projects, and collaborations. Whether it's a pipeline problem or a full platform build, I'd love to hear from you.