shahidmalik4/README.md

Hi, I'm Shahid Malik 👋

I build pipelines, infrastructure, and backend data systems that make analytics reliable at scale.


👨‍💻 About Me

For 4+ years I've been the only data person at my company, not just analyzing data but doing the hands-on work: migrating the entire database from MS SQL Server to PostgreSQL, managing CRM and payments data, and writing SQL and Python scripts for analytics, ad-hoc requests, automation, and job scheduling.

Building systems is genuinely what I enjoy. Pipelines, infrastructure, backend data systems that run reliably without anyone babysitting them. That's the work I want to keep doing, and Data Engineering is the role that matches it.

What I've shipped at work:

  • Led end-to-end migration of production database from MS SQL Server to PostgreSQL, including schema conversion, data validation, and cutover planning
  • Cut manual reporting effort by 30-40% through Python automation
  • Contributed to PKR 89M+ revenue growth and 11% margin improvement
  • Built data pipelines and dashboards used by leadership for revenue and margin decisions

🛠 Tech Stack

| Layer | Tools |
|---|---|
| Ingestion | Python, dlt, Airbyte |
| Transformation | dbt, SQL |
| Orchestration | Airflow |
| Warehousing | Snowflake, PostgreSQL, Redshift |
| Big Data | Databricks, Apache Spark (PySpark) |
| Infrastructure | Docker, GitHub Actions CI/CD, AWS (S3, Glue, Athena) |
| Language | Python, FastAPI |
| Analytics | Power BI, Metabase, Pandas |

Pinned

  1. dbt-airflow-data-pipeline (Public)

    A full analytics workflow simulating a real-world business environment! The project starts with raw transactional data (TPC-H dataset) and transforms it into clean, aggregated KPIs, ready for analys…

    Python 5

  2. pyspark-snowflake-dbt-pipeline (Public)

    This project is a data engineering pipeline leveraging PySpark, Snowflake, Airflow, dbt, and Streamlit to extract, transform, and load millions of records daily. It streamlines data processing, ena…

    Python 2

  3. analytics-pipeline-fastapi-dbt (Public)

    A full-stack data analytics pipeline using dbt, FastAPI, Streamlit, and Postgres. Transforms raw data into modeled tables and exposes KPIs via API endpoints, with an interactive dashboard for visual…

    Python 1

  4. aws-glue-stepfunctions-etl (Public)

    This project automates an ETL pipeline using AWS Glue, S3, Athena, and Step Functions to transform raw Airbnb data. It cleanses, enriches, and organizes the data into separate raw and transformed d…

    Python 1

  5. dbt-snowflake-data-pipeline (Public)

    The dbt Snowflake Data Pipeline project uses dbt to transform data in Snowflake, creating efficient, scalable data models for analysis. It leverages incremental models to handle large datasets and…

    1

  6. data-platform-forge (Public)

    A production-style local data platform built with modern data engineering tools. This project simulates a real-world ELT pipeline — from raw data ingestion through transformation and orchestration …

    Python