58 changes: 58 additions & 0 deletions docs/adr/0061-subscription-per-customer-model-on-azure.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
# Use one Subscription per Customer on Azure

- Status: Accepted
- Deciders: PA and P.O
- Date: 2023-03-19

## Context and Problem Statement

Welkin can now run on Azure, so we explored different strategies for resource isolation within Azure to best serve multiple customers.
The options considered were isolation at the resource group level and a one-subscription-per-customer approach.

Should we adopt the Subscription-per-Customer model for managing Azure resources under one tenant?

## Decision Drivers

- We want to maintain Platform security and stability.
- We want to find a solution that is scalable and minimises the Platform Administrator's burden.
- We want to make the Platform Administrator's life easier.
- We want to have simplified billing and invoices.

## Considered Options

1. Isolation at the Subscription Level

- `Good`, because subscriptions provide clear separation for billing and invoicing: Azure invoices are prepared per Subscription ID, which makes it easier to track and manage costs per customer.
- `Good`, because Azure typically operates with one tenant per organisation, closely tied to the Active Directory (AD) or Azure AD. Utilising multiple subscriptions under one tenant provides a coherent organisational structure, enhancing management and governance.
- `Good`, because Role-Based Access Control (RBAC) settings are easier to manage at the subscription level, offering straightforward governance and security controls for each customer.
- `Good`, because Azure imposes certain limits at the subscription level; having separate subscriptions helps in managing these limits more effectively.
- `Good`, because virtual networks are scoped to a subscription, simplifying network management and isolation between customers.
- `Bad`, because setting up network communication between subscriptions requires more work than within a single subscription.

1. Isolation at the Resource Group Level

- `Good`, because resource groups allow for organising resources more flexibly within a single subscription, facilitating easier management of resources.
- `Good`, because managing a single subscription reduces the complexity and overhead associated with subscription management, permissions, and billing setups.
- `Good`, because setting up network communication between resource groups is less complex than across subscriptions.
- `Bad`, because while Azure Cost Management can track costs by resource group, billing separation is not as straightforward as with subscriptions, potentially complicating cost allocation and invoicing for different customers.
- `Bad`, because fine-grained access control is more challenging to implement and manage effectively at the resource group level compared to subscription-level controls.
- `Bad`, because all resource groups share the same subscription limits, so there is a risk of hitting these limits, which could impact scalability and performance.
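
To make the difference between the two options concrete, here is a minimal sketch of the Azure RBAC scope that a per-customer role assignment would target under each model. The scope string formats (`/subscriptions/<id>` and `/subscriptions/<id>/resourceGroups/<name>`) are Azure's own. The customer names, the placeholder subscription IDs, the `rg-<customer>` naming scheme and the helper functions are assumptions made purely for illustration and are not part of this ADR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    name: str
    subscription_id: str  # placeholder value, not a real subscription ID


def subscription_scope(subscription_id: str) -> str:
    """RBAC scope that covers everything inside one subscription."""
    return f"/subscriptions/{subscription_id}"


def resource_group_scope(subscription_id: str, resource_group: str) -> str:
    """RBAC scope that covers a single resource group inside a subscription."""
    return f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"


# Hypothetical customers, each with its own (placeholder) subscription.
customers = [
    Customer("customer-a", "00000000-0000-0000-0000-00000000000a"),
    Customer("customer-b", "00000000-0000-0000-0000-00000000000b"),
]

# Option 1: one subscription per customer, so each customer has a distinct top-level scope.
per_subscription = {c.name: subscription_scope(c.subscription_id) for c in customers}

# Option 2: one shared (placeholder) subscription, one resource group per customer.
# The "rg-<customer>" naming scheme is an assumption for this sketch.
shared_subscription_id = "00000000-0000-0000-0000-0000000000ff"
per_resource_group = {
    c.name: resource_group_scope(shared_subscription_id, f"rg-{c.name}")
    for c in customers
}

for name in per_subscription:
    print(name, per_subscription[name], per_resource_group[name])
```

Under option 2 every customer scope is nested under the same subscription scope, which is also where subscription-level limits and the invoice apply.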

## Decision Outcome

Chosen option:

Isolation at the Subscription level, i.e. the Subscription-per-Customer model, because, as an organisation, we wanted to ensure complete isolation for billing, invoices, resources, and compliance, while maintaining the flexibility to communicate and share resources across subscriptions as necessary.

Collaborator

What about environments of the same customer? How will those be isolated? Separate subscription or resource groups?

Contributor Author

I'd say separate subscriptions. Then we can treat subscriptions like projects in OpenStack, which gives us some similarity between the two.

### Positive Consequences

- We maintain Platform security and resource isolation.
- We don't increase the operational complexity.
- We have stricter access controls and limit the scope of potential security breaches.
- We have simplified billing and invoices.
- We avoid potential resource contention issues.

### Negative Consequences

- Managing multiple subscriptions can increase the administrative workload.
- Network communication across subscriptions requires more advanced networking setups, such as virtual network peering.
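
One practical detail behind the peering consequence: Azure virtual network peering requires the peered networks to have non-overlapping address spaces, so per-customer address ranges need to be planned up front if cross-subscription communication may ever be needed. The following is a minimal sketch of such a check using Python's standard `ipaddress` module. The customer names and CIDR ranges are hypothetical and are not taken from this ADR.

```python
import ipaddress
from itertools import combinations


def overlapping_pairs(vnets: dict[str, str]) -> list[tuple[str, str]]:
    """Return every pair of names whose CIDR ranges overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in vnets.items()}
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


# Hypothetical address plan: one virtual network per customer subscription.
planned_vnets = {
    "customer-a": "10.10.0.0/16",
    "customer-b": "10.20.0.0/16",
    "customer-c": "10.10.128.0/17",  # deliberately inside customer-a's range
}

conflicts = overlapping_pairs(planned_vnets)
print(conflicts)  # [('customer-a', 'customer-c')]
```

A check like this could run before a new subscription is provisioned, so that conflicts are found before any peering is attempted.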
8 changes: 4 additions & 4 deletions docs/adr/index.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,5 @@
---
tags:
#- ISO 27001 A.14.1.1 Information Security Requirements Analysis & Specification
#- ISO 27001:2013 A.14.2.4 Restrictions on Changes to Software Packages
- NIST SP 800-171 3.13.2
- NIST SP 800-171 3.13.3
- ISO 27001 Annex A 8.27 Secure System Architecture and Engineering Principles
@@ -88,7 +86,7 @@ This log lists the architectural decisions for Welkin.
- [ADR-0010](0010-run-managed-services-in-workload-cluster.md) - Run managed services in Workload Cluster
- [ADR-0011](0011-let-upstream-projects-handle-crds.md) - [Superseded by [ADR-0046](0046-handle-crds.md)] Let upstream projects handle CRDs
- [ADR-0012](0012-do-not-persist-dex.md) - [Superseded by [ADR-0017](0017-persist-dex.md)] Do not persist Dex
- [ADR-0013](0013-configure-alerts-in-omt.md) - Configure Alerts in On-call Management Tool (e.g., Opsgenie)
- [ADR-0013](0013-configure-alerts-in-omt.md) - [Superseded by [ADR-0060](0060-group-alerts-in-alertmanager.md)] Configure Alerts in On-call Management Tool (e.g., Opsgenie)
- [ADR-0014](0014-use-bats-for-testing-bash-wrappers.md) - Use bats for testing bash wrappers
- [ADR-0015](0015-we-believe-in-community-driven-open-source.md) - We believe in community-driven open source
- [ADR-0016](0016-gid-0-is-okey-but-not-by-default.md) - [Superseded by [ADR-0040](0040-allow-group-id-0.md)] gid=0 is okay, but not by default
@@ -107,7 +105,7 @@ This log lists the architectural decisions for Welkin.
- [ADR-0029](0029-expose-jaeger-ui.md) - Expose Jaeger UI in WC
- [ADR-0030](0030-run-argocd-on-elastisys-nodes.md) - Run ArgoCD on the Elastisys Nodes
- [ADR-0031](0031-run-csi-cinder-controllerplugin-on-elastisys-nodes.md) - Run csi-cinder-controllerplugin on the Elastisys Nodes
- [ADR-0032](0032-boot-disk-size.md) - Boot disk size on Nodes
- [ADR-0032](0032-boot-disk-size.md) - [Superseded by [ADR-0058](0058-boot-disk-sizes.md)] Boot disk size on Nodes
- [ADR-0033](0033-run-cluster-api-controllers-on-service-cluster.md) - Run Cluster API controllers on Management Cluster
- [ADR-0034](0034-how-to-run-multiple-ams-packages-of-the-same-type.md) - How to run multiple AMS packages of the same type in the same environment
- [ADR-0035](0035-run-tekton-on-service-cluster.md) - Run Tekton on Management Cluster
@@ -135,6 +133,8 @@ This log lists the architectural decisions for Welkin.
- [ADR-0057](0057-why-we-do-not-use-cloud-managed-kubernetes-services.md) - Do Not Use Managed Kubernetes Services
- [ADR-0058](0058-boot-disk-sizes.md) - Boot disk size on Nodes
- [ADR-0059](0059-welkin-to-consist-public-open-source-code-and-proprietary-documentation.md) - Welkin to Consist of Public Open Source Code and Proprietary Documentation
- [ADR-0060](0060-group-alerts-in-alertmanager.md) - Group alerts in Alertmanager
- [ADR-0061](0061-subscription-per-customer-model-on-azure.md) - Use one Subscription per Customer on Azure

<!-- adrlogstop -->
