| Names | Roles | Emails | GitHub Handles |
|---|---|---|---|
| Supriya Nanjundaswamy | Mentor | snanjundaswamy1@StateStreet.com | |
| Yashaswi Upmon | Mentor | YUpmon@StateStreet.com | |
| Rishi Dubey | Mentor | RDubey8@StateStreet.com | |
| James Colley | Mentor | JColley@StateStreet.com | |
| Amruth Niranjan | Student | amruth@bu.edu | amruth-sn |
| Hrishav Varma | Student | hri@bu.edu | VHri |
| Krish Shah | Student | kshah26@bu.edu | krish-shahh |
| Yuzhe Xu | Student | yx8756a@bu.edu | yuzhexu |
| Rithvik Nakirikanti | Student | rithvikn@bu.edu | rithvik213 |
This project focuses on automating the deployment and management of Snowflake resources using CI/CD pipelines, specifically leveraging Liquibase for database migrations and Harness for orchestration. The goal is to streamline the creation, modification, and deletion of key Snowflake resources, including databases, schemas, users, roles, and data warehouses, in a secure and scalable manner.
Instructions for setting up and using the Automation of Snowflake Resource Deployment project can be found on the documentation website.
The project aims to deliver an automated, secure, and efficient pipeline for managing Snowflake resources, scalable to diverse workloads. By integrating Liquibase and Harness, this solution will enable teams to deploy database changes and resource configurations with minimal manual intervention. The automation will improve deployment consistency, reduce errors, and ensure secure and auditable management of Snowflake environments.
Key goals:
- Automate Snowflake resource deployments via CI/CD pipelines.
- Establish version control for database migrations using Liquibase.
- Ensure secure, role-based access control for Snowflake resources.
- Develop a scalable and easy-to-manage pipeline using Harness.
- Build CLI tools to generalize the solution for broader adoption.
The project will primarily serve the following user personas:
- DevOps Engineers: Manage the CI/CD pipelines and ensure smooth deployment of Snowflake resources.
  - Interaction: Configure and manage CI/CD pipelines in Harness, monitor performance, and troubleshoot issues.
- Database Administrators: Oversee Snowflake resource configurations and use the automation to deploy database changes.
  - Interaction: Review and approve Liquibase migration scripts, ensure resource integrity, and manage performance.
- Data Engineers: Manage the data warehouse environment and ensure that changes to databases and schemas are rolled out without manual intervention.
  - Interaction: Monitor data warehouse performance, validate data integrity post-deployment, and provide feedback for improvements.
- Security Engineers: Monitor and manage role-based access control and ensure the secure deployment of resources.
  - Interaction: Set up access controls, conduct audits, and monitor for security breaches.
- Automation of the following Snowflake resources using Liquibase and Harness:
  - Databases
  - Schemas
  - Users and Roles
  - Data Warehouses
- CI/CD pipeline configuration using Harness for managing different deployment environments (development, staging, production).
- Secure handling of credentials with role-based access control policies.
- Integration of Snowflake Cortex and ML capabilities for classification and forecasting.
- Real-time data streaming or ingestion processes.
- Data integration solutions beyond database and resource management.
The solution revolves around leveraging Liquibase to manage Snowflake database migrations and using Harness to orchestrate automated CI/CD workflows. The key components include:
- Liquibase: Manages version-controlled SQL migration scripts for Snowflake resources.
- Harness: Orchestrates the CI/CD pipeline, automatically triggering deployments upon changes to the codebase.
- Snowflake: The target data warehouse environment where databases, schemas, and user roles will be managed.
- Version Control (Git): Manages code changes and triggers CI/CD pipelines.
- Liquibase ensures a consistent and trackable process for database schema management.
- Harness allows for secure and automated deployment, reducing manual errors and improving deployment efficiency.
- Security: Proper access control and credential management are critical when managing a cloud-based data warehouse such as Snowflake.
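
To make the Liquibase component more concrete, here is a minimal sketch of how version-controlled migration scripts with rollbacks might be generated. It renders changesets in Liquibase's formatted-SQL changelog syntax (`--liquibase formatted sql`, `--changeset author:id`, `--rollback ...`). The `ChangeSet` class, the `render_changelog` helper, the author name `team`, and the resource names are all hypothetical and are not taken from the project's repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeSet:
    """One Liquibase changeset: forward SQL plus the SQL that undoes it."""
    author: str
    change_id: str
    sql: str
    rollback: str


def render_changelog(changesets):
    """Render changesets in Liquibase's formatted-SQL changelog syntax."""
    lines = ["--liquibase formatted sql"]
    for cs in changesets:
        lines.append("")  # blank line between changesets for readability
        lines.append(f"--changeset {cs.author}:{cs.change_id}")
        lines.append(cs.sql)
        lines.append(f"--rollback {cs.rollback}")
    return "\n".join(lines) + "\n"


# Hypothetical author and resource names, used only for illustration.
EXAMPLE_CHANGESETS = [
    ChangeSet(
        "team",
        "create-analytics-db",
        "CREATE DATABASE IF NOT EXISTS ANALYTICS_DEV;",
        "DROP DATABASE IF EXISTS ANALYTICS_DEV;",
    ),
    ChangeSet(
        "team",
        "create-raw-schema",
        "CREATE SCHEMA IF NOT EXISTS ANALYTICS_DEV.RAW;",
        "DROP SCHEMA IF EXISTS ANALYTICS_DEV.RAW;",
    ),
]

if __name__ == "__main__":
    print(render_changelog(EXAMPLE_CHANGESETS), end="")
```

The rendered text can be saved as a `.sql` changelog file and committed to Git, so each change to a database or schema is tracked together with the statement that reverses it.
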
- Successfully automate the creation, modification, and deletion of Snowflake resources (databases, schemas, users/roles, data warehouses) using Liquibase and Harness.
- CI/CD pipelines are fully functional and automatically trigger on code changes.
- Secure handling of credentials with proper role-based access control implemented in Snowflake.
- Full testing is completed, and rollback mechanisms are in place to handle any deployment failures.
- Implement logging and monitoring of deployment pipelines.
- Advanced role-based access management for complex team structures.
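
As an illustration of the role-based access control criteria above, the following sketch builds the Snowflake `GRANT` statements that give a role access to a set of schemas. It is an assumption about how such grants could be scripted, not the project's actual implementation. The `grant_statements` function and the role, database, and schema names are hypothetical.

```python
import re

# Unquoted Snowflake identifiers: a letter or underscore, followed by
# letters, digits, underscores, or dollar signs.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def _check(name):
    """Reject anything that is not a plain unquoted identifier."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def grant_statements(role, database, schemas, read_only=True):
    """Build GRANT statements giving `role` access to the given schemas."""
    role = _check(role)
    database = _check(database)
    statements = [
        f"CREATE ROLE IF NOT EXISTS {role};",
        f"GRANT USAGE ON DATABASE {database} TO ROLE {role};",
    ]
    privileges = "SELECT" if read_only else "SELECT, INSERT, UPDATE, DELETE"
    for schema in schemas:
        qualified = f"{database}.{_check(schema)}"
        statements.append(f"GRANT USAGE ON SCHEMA {qualified} TO ROLE {role};")
        statements.append(
            f"GRANT {privileges} ON ALL TABLES IN SCHEMA {qualified} TO ROLE {role};"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for statement in grant_statements("ANALYST", "ANALYTICS_DEV", ["RAW"]):
        print(statement)
```

Validating identifiers before interpolating them into SQL keeps a malformed role or schema name from producing an unintended statement. The generated statements could then be placed in a Liquibase changeset so that access changes go through the same review and deployment path as other migrations.
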
- Basic CI/CD pipelines for managing Snowflake databases and schemas.
- Expansion to handle user and role management within Snowflake.
- Full integration of all Snowflake resources in the deployment pipeline.
- Error handling, rollback mechanisms, and enhanced security features.
- Reviewed the project details and requirements.
- Defined the system architecture and tooling, and created the System Architecture Diagram.
- Initial setup of the pipeline architecture.
- Designed initial Liquibase migration scripts for database and schema management and implemented logging.
- Sprint 1 Video | Sprint 1 Slides
- Completed the first fully functional CI/CD pipeline for Snowflake, integrating Liquibase with Harness.
- Integrated Git version control for automated deployments.
- Configured rollback mechanisms and developed unit tests for migration rollbacks.
- Sprint 2 Video | Sprint 2 Slides
- Implemented Role-Based Access Control (RBAC) for secure credential management.
- Created a template warehouse and integrated its scaling with CI/CD pipelines.
- Created comprehensive documentation for the users of the project.
- Sprint 3 Video | Sprint 3 Slides
- Automated Snowflake warehouse scaling with scripts.
- Conducted unit and pipeline testing with large public datasets.
- Researched and integrated AI/ML features, including Cortex and Snowflake ML Studio.
- Delivered a Snowflake Cortex demo and team training on Snowflake Quickstart tutorials.
- Sprint 4 Video | Sprint 4 Slides
- Created a CLI tool that lets other users adapt this pipeline to their own use cases.
- Increased the project's visibility by publishing a Medium article and promoting the project on forums.
- Enhanced documentation with starting and troubleshooting steps.
- Sprint 5 Video | Sprint 5 Slides | Medium Article | Snowpilot GitHub Repository
- Automated Snowflake resource deployment using Liquibase and Harness.
- Built CI/CD pipelines with Git integration, rollback mechanisms, and automated changelogs.
- Enforced secure credential management with Role-Based Access Control (RBAC).
- Conducted unit tests and validated pipeline performance using large public datasets.
- Integrated dynamic warehouse scaling scripts into CI/CD pipelines.
- Delivered comprehensive user guides and troubleshooting documentation.
- Developed a reusable CLI for external user adoption and published a Medium article.
- Final Demo Video | Final Demo Slides | Documentation
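
The warehouse scaling scripts mentioned above are not reproduced in this README. The following is a rough sketch of what one scaling step could look like, assuming the decision is driven by the number of queued queries. The thresholds, the `next_size` and `resize_statement` helpers, and the warehouse name `TEMPLATE_WH` are hypothetical. Only the `ALTER WAREHOUSE ... SET WAREHOUSE_SIZE` statement is standard Snowflake SQL.

```python
# A subset of the sizes Snowflake accepts for WAREHOUSE_SIZE, smallest first.
SIZES = ["XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE"]


def next_size(current, queued_queries, scale_up_at=10, scale_down_at=0):
    """Move one size up or down depending on the queue length.

    The thresholds are illustrative defaults, not values from the project.
    """
    index = SIZES.index(current)  # raises ValueError for an unknown size
    if queued_queries >= scale_up_at:
        index = min(index + 1, len(SIZES) - 1)  # cap at the largest size
    elif queued_queries <= scale_down_at:
        index = max(index - 1, 0)  # never go below the smallest size
    return SIZES[index]


def resize_statement(warehouse, size):
    """Build the SQL statement that applies the chosen size."""
    if size not in SIZES:
        raise ValueError(f"unknown warehouse size: {size!r}")
    return f"ALTER WAREHOUSE {warehouse} SET WAREHOUSE_SIZE = '{size}';"


if __name__ == "__main__":
    target = next_size("SMALL", queued_queries=12)
    print(resize_statement("TEMPLATE_WH", target))
```

Stepping one size at a time keeps each change small and easy to reverse. A pipeline stage could run a script like this and apply the resulting statement only when the target size differs from the current one.
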

