SoundScope ingests data with Kafka, processes streams with Spark, stages intermediate data in GCS, transforms and models it with DBT, and analyzes it in BigQuery, with Data Studio for dashboards. Airflow orchestrates the pipeline, which is deployed on GCP using Docker and Terraform.
- Apache Kafka: Serves as the message queue for real-time data ingestion.
- Apache Spark: Performs stream processing to transform the raw data.
- Data Lake: Holds the processed stream output for downstream analysis.
- Google Cloud Storage (GCS): Temporary storage for intermediate data.
- DBT (Data Build Tool): Transforms and models the data once it is loaded from GCS into BigQuery.
- BigQuery: Performs data analysis and querying.
- Data Studio: Visualizes the analyzed data through interactive dashboards.
- Airflow: Orchestrates the entire pipeline, scheduling tasks and managing dependencies between stages.
- Docker and Terraform: Docker containerizes the pipeline components; Terraform provisions the GCP infrastructure as code.