feat: New Oracle odaa sections #463

Closed
wants to merge 11 commits into from
5 changes: 5 additions & 0 deletions azure-resources/Oracledatabase/_index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
---
title: "Oracledatabase"
geekdocCollapseSection: true
geekdocHidden: false
---
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
---
title: "Cloudexadatainfrastructures"
geekdocCollapseSection: true
geekdocHidden: false
---

{{< azure-resources-recommendationlist name="azure-resources-recommendationlist" >}}
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
// under-development
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
- description: Ensure ODAA infrastructure is in Available state under normal operations
aprlGuid: c99d730b-8754-447f-bd5d-3e8850a12235
recommendationTypeId: null
recommendationControl: Other Best Practices
recommendationImpact: High
recommendationResourceType: oracle.database/cloudExadataInfrastructures
recommendationMetadataState: Active
longDescription: |
Cloud Exadata infrastructures can be in one of several states during their lifecycle (Available, Failed, Maintenance In Progress, Provisioning, Terminated, Terminating, and Updating).
Ensure that the infrastructure is in Available state under normal operations.
potentialBenefits: Ensure service is available
pgVerified: false
publishedToLearn: false
automationAvailable: arg
tags: null
learnMoreLink:
- name: Cloud Exadata infrastructure lifecycle
url: "https://learn.microsoft.com/en-us/rest/api/oracle/cloud-exadata-infrastructures/get?view=rest-oracle-2023-09-01&tabs=HTTP#cloudexadatainfrastructurelifecyclestate"
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
---
title: "Cloudvmclusters"
geekdocCollapseSection: true
geekdocHidden: false
---

{{< azure-resources-recommendationlist name="azure-resources-recommendationlist" >}}
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
// under-development
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
- description: Ensure ODAA clusters are in Available state under normal operations
aprlGuid: 4b33324a-70cd-4bac-bdae-da4c382c436b
recommendationTypeId: null
recommendationControl: Other Best Practices
recommendationImpact: High
recommendationResourceType: oracle.database/cloudvmclusters
recommendationMetadataState: Active
longDescription: |
Cloud VM clusters can be in one of several states during their lifecycle (Available, Failed, Maintenance In Progress, Provisioning, Terminated, Terminating, and Updating).
Ensure that the ODAA clusters are in Available state under normal operations.
potentialBenefits: Ensure service is available
pgVerified: false
publishedToLearn: false
automationAvailable: arg
tags: null
learnMoreLink:
- name: Cloud VM Cluster lifecycle
url: "https://learn.microsoft.com/en-us/rest/api/oracle/cloud-vm-clusters/get?view=rest-oracle-2023-09-01&tabs=HTTP#cloudvmclusterlifecyclestate"
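
The two lifecycle-state recommendations above are marked `automationAvailable: arg`, but their query files are still placeholders. As a rough illustration of the check they describe, the Python sketch below filters a list of resource records and returns those that are not in the Available state. The record shape (`type`, `name`, `properties.lifecycleState`) is an assumption modelled on the REST API reference linked above, and `find_unavailable` is a hypothetical helper, not part of APRL.

```python
from typing import Any

# Resource types covered by the two recommendations (compared case-insensitively).
ODAA_TYPES = {
    "oracle.database/cloudexadatainfrastructures",
    "oracle.database/cloudvmclusters",
}


def find_unavailable(resources: list[dict[str, Any]]) -> list[tuple[str, str]]:
    """Return (name, state) for ODAA resources whose lifecycle state is not Available."""
    flagged = []
    for res in resources:
        if res.get("type", "").lower() not in ODAA_TYPES:
            continue  # not an ODAA infrastructure or VM cluster
        # Assumed property path; a missing value is reported as "Unknown".
        state = res.get("properties", {}).get("lifecycleState", "Unknown")
        if state.lower() != "available":
            flagged.append((res.get("name", "<unnamed>"), state))
    return flagged


if __name__ == "__main__":
    sample = [
        {"type": "Oracle.Database/cloudExadataInfrastructures", "name": "exa-01",
         "properties": {"lifecycleState": "Available"}},
        {"type": "Oracle.Database/cloudVmClusters", "name": "vmc-01",
         "properties": {"lifecycleState": "Updating"}},
    ]
    print(find_unavailable(sample))
```

A transient state such as Updating is expected during maintenance, so a real check would probably need to tolerate short-lived non-Available states rather than flag every one.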
18 changes: 18 additions & 0 deletions azure-specialized-workloads/oracle/_index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
---
title: "Oracle Database@Azure"
geekdocCollapseSection: true
geekdocHidden: false
---

## Relevant Azure Resource Recommendations

| Recommendation | Provider Namespace | Resource Type |
| :----------------------------------------------------------------------------------------------- | :----------------: | :-------------: |
| [Ensure ODAA infrastructure is in Available state under normal operations](../../../Azure-Proactive-Resiliency-Library-v2/azure-resources/Oracledatabase/cloudexadatainfrastructures/#) | Oracledatabase | cloudexadatainfrastructures |
| [Ensure ODAA clusters are in Available state under normal operations](../../../Azure-Proactive-Resiliency-Library-v2/azure-resources/Oracledatabase/cloudvmclusters/#) | Oracledatabase | cloudvmclusters |

<br>

## General Workload Guidance

{{< azure-specialized-workloads-recommendationlist name="azure-specialized-workloads-recommendationlist" >}}
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
// cannot-be-validated-with-arg
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
// cannot-be-validated-with-arg
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
// cannot-be-validated-with-arg
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
// cannot-be-validated-with-arg
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
// cannot-be-validated-with-arg
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
// cannot-be-validated-with-arg
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
// cannot-be-validated-with-arg
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
// cannot-be-validated-with-arg
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
// cannot-be-validated-with-arg
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
// cannot-be-validated-with-arg
201 changes: 201 additions & 0 deletions azure-specialized-workloads/oracle/recommendations.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,201 @@
- description: Implement a regional replication strategy for Oracle to meet your workload requirements
Contributor

This appears to be more architectural guidance than a configuration for the Oracle resource that can be configured to improve the resiliency of the Oracle resource deployment. As such, wouldn't this better be addressed through WAF guidance?

aprlGuid: dfeb9c7a-7dae-4751-9625-b23a7160a3e1
recommendationTypeId: null
recommendationControl: Business Continuity
recommendationImpact: High
recommendationResourceType: Specialized.Workload/oracle
recommendationMetadataState: Active
longDescription:
Regional replication is a key strategy to ensure business continuity and disaster recovery for your Oracle workloads. Implement a regional replication strategy to meet your workload requirements.
Contributor

This appears to be more architectural guidance than a configuration for the Oracle resource that can be configured to improve the resiliency of the Oracle resource deployment. As such, wouldn't this better be addressed through WAF guidance?

Contributor Author

The recommendation is agreed by relevant teams and as such we would like to keep it

Contributor

Agree with @ehaslett this reads as a generic best practice rather than an actionable reliability recommendation. We also already have WAF recommendations RE:05 that cover the same topic.

The Oracle Maximum Availability Architecture for Oracle Database@Azure Gold Architecture shows that ODAA can be configured for either zonal or regional replication. For regional replication, the customer would set up an infrastructure/cluster in both regions and then use either Data Guard (active/passive) or GoldenGate (active/active) to replicate across regions or zones.

@ehaslett or @ejhenry please describe why this is viewed as a best practice and not an actionable reliability recommendation.

Contributor

The Oracle Maximum Availability Architecture for Oracle Database@Azure Gold Architecture shows that ODAA can be configured for either zonal or regional replication. For regional replication, the customer would set up an infrastructure/cluster in both regions and then use either Data Guard (active/passive) or GoldenGate (active/active) to replicate across regions or zones.

@ehaslett or @ejhenry please describe why this is viewed as a best practice and not an actionable reliability recommendation.

I would ask these questions about the recommendation to determine if it is actionable in terms of APRL guidance and remediation:

  • Can an ARG query be formed to uncover Azure resource configurations that do not meet this guidance?
  • Can an Azure resource configuration that does not meet this guidance be remediated through an Azure resource configuration change?

Contributor

@terrymandin I have to defer to @oZakari and @rodrigosantosms on this

Collaborator

@oZakari, Oct 30, 2024

Hi @terrymandin, it is okay to add recommendations that cannot be validated with ARG. If they are valid recommendations that should be checked during a WARA engagement for Oracle, then I think they should be added.

However, we don't have a field dedicated to remediation, so please just include a link to remediation guidance in the learnMoreLink field.

@ehaslett, based on @oZakari's feedback indicating that we can include recommendations that cannot be validated with ARG, could you please review the items below and let us know if we can check them in once a remediation link is added?

Contributor

@ehaslett, Nov 16, 2024

@terrymandin I would again have to defer to @ejhenry, @rodrigosantosms and @oZakari about whether or not it is appropriate to evaluate an Oracle configuration (i.e. Oracle portal) vs an Azure configuration, and then also provide remediation guidance for something outside of Azure (i.e. Oracle portal).

@oZakari, could you review this PR? Some of the recommendations were originally rejected because ARG queries could not be written for them. Your feedback indicates that recommendations do not necessarily need ARG queries. In the comment above, @ehaslett is deferring to you, Eric, or Zach as to whether or not they can be included. Would you be willing to re-review them?

If possible, can you also re-open the PR? It was closed by a bot.

potentialBenefits: Ensure business continuity in case of regional failure.
pgVerified: false
publishedToLearn: false
automationAvailable: no
tags: null
learnMoreLink:
- name: Business continuity and disaster recovery considerations for Oracle Database@Azure
url: "https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/oracle-iaas/oracle-disaster-recovery-oracle-database-azure"
- name: Learn about Oracle Maximum Availability Architecture for Oracle Database@Azure
url: "https://docs.oracle.com/en/solutions/oracle-maa-db-at-azure/#GUID-7723E2B1-9588-40BC-88BE-44637B1AF0D9"
- name: Oracle Database@Azure Evaluations by Oracle MAA
url: "https://docs.oracle.com/en/database/oracle/oracle-database/21/haovw/db-azure1.html#GUID-91572193-DF8E-4D7A-AF65-7A803B89E840"


- description: Implement a strategy for resiliency that is tailored to your Oracle workload requirements
Contributor

This appears to be more architectural guidance than a configuration for the Oracle resource that can be configured to improve the resiliency of the Oracle resource deployment. As such, wouldn't this better be addressed through WAF guidance?

Contributor Author

The recommendation is agreed by relevant teams and as such we would like to keep it

Contributor

Agree with @ehaslett this reads as a generic best practice rather than an actionable reliability recommendation. We also already have WAF recommendations RE:05 that cover the same topic.

@basimolimajeed, due to the APRL requirements @ehaslett mentions above, this should be moved to WAF.

aprlGuid: e2750bd4-d12c-409a-bb56-49238d9a8013
recommendationTypeId: null
recommendationControl: Business Continuity
recommendationImpact: High
recommendationResourceType: Specialized.Workload/oracle
recommendationMetadataState: Active
longDescription:
Define a BCDR strategy based on the workload's RPO/RTO requirements.
Oracle has a set of best practices and reference architectures tailored to meet various workload RPO/RTO requirements. This framework is known as the Oracle Maximum Availability Architecture (MAA).
By default, the Oracle Database@Azure solution follows a Silver-level reference architecture. The primary database residing in ExaDB-D provides high availability, data protection, elasticity, and scalability benefits; however, it is vulnerable to an AZ failure.
A Gold service-level reference architecture with Oracle Database@Azure is recommended. It can be deployed within the same Azure region or across one or more Azure regions.
potentialBenefits: The cloud MAA architecture achieves data protection and DR.
pgVerified: false
publishedToLearn: false
automationAvailable: no
tags: null
learnMoreLink:
- name: Business continuity and disaster recovery considerations for Oracle Database@Azure
url: "https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/oracle-iaas/oracle-disaster-recovery-oracle-database-azure"
- name: Learn about Oracle Maximum Availability Architecture for Oracle Database@Azure
url: "https://docs.oracle.com/en/solutions/oracle-maa-db-at-azure/#GUID-7723E2B1-9588-40BC-88BE-44637B1AF0D9"
- name: Oracle Database@Azure Evaluations by Oracle MAA
url: "https://docs.oracle.com/en/database/oracle/oracle-database/21/haovw/db-azure1.html#GUID-91572193-DF8E-4D7A-AF65-7A803B89E840"

- description: Implement a backup and restore strategy for Oracle databases to meet your workload requirements
Contributor

This appears to be more architectural guidance than a configuration for the Oracle resource that can be configured to improve the resiliency of the Oracle resource deployment. As such, wouldn't this better be addressed through WAF guidance?

Contributor Author

The recommendation has been agreed by the relevant teams, and as such we would like to keep it. Please reach out on Teams if you need further discussion with the relevant team experts. Thanks.

Contributor

Agree with @ehaslett this reads as a generic best practice rather than an actionable reliability recommendation. We also already have WAF recommendation RE:09 that covers the same topic.

@basimolimajeed, due to the APRL requirements @ehaslett mentions above, this should be moved to WAF.

aprlGuid: 0583239a-dfb5-44d4-94db-804bfc8e3bd1
recommendationTypeId: null
recommendationControl: Business Continuity
recommendationImpact: High
recommendationResourceType: Specialized.Workload/oracle
recommendationMetadataState: Active
longDescription: |
When you configure automatic backup to Autonomous Recovery Service or Object Storage Service in OCI, backup copies provide additional protection. Oracle Recovery Manager (RMAN) validates cloud database backups for any physical corruptions.
potentialBenefits: Provide workload data protection.
pgVerified: false
publishedToLearn: false
automationAvailable: no
tags: null
learnMoreLink:
- name: Business continuity and disaster recovery considerations for Oracle Database@Azure
url: "https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/oracle-iaas/oracle-disaster-recovery-oracle-database-azure"
- name: Learn about Oracle Maximum Availability Architecture for Oracle Database@Azure
url: "https://docs.oracle.com/en/solutions/oracle-maa-db-at-azure/#GUID-7723E2B1-9588-40BC-88BE-44637B1AF0D9"

- description: Scale up the VM cluster based on the workload requirement
Contributor

I wonder if this might be better stated in terms of monitoring the VM cluster and scaling based on alerting on thresholds set via monitoring.

Contributor Author

Will discuss with the experts

aprlGuid: 2322a597-a6af-4c3e-a1b1-d1b1ddead508
recommendationTypeId: null
recommendationControl: Scalability
recommendationImpact: Medium
recommendationResourceType: Specialized.Workload/oracle
recommendationMetadataState: Active
longDescription: |
Oracle RAC is ideal for high-volume applications or consolidated environments where scalability and the ability to dynamically add or re-prioritize capacity across more than a single server are required. An individual database may have instances running on one or more nodes of a cluster. Similarly, a database service may be available on one or more database instances. Additional nodes, database instances, and database services can be provisioned online.
Auto-scaling is not available on ODAA.
Make sure the VM cluster capacity is deployed according to the workload requirements.
If scaling up is needed, increase the number of OCPUs on the VM cluster, provided capacity is available on the Exadata infrastructure.
potentialBenefits: Meet workload scalability requirements.
pgVerified: false
publishedToLearn: false
automationAvailable: no
tags: null
learnMoreLink:
- name: Learn about Oracle Maximum Availability Architecture for Oracle Database@Azure
url: "https://docs.oracle.com/en/solutions/oracle-maa-db-at-azure/#GUID-7723E2B1-9588-40BC-88BE-44637B1AF0D9"

- description: Plan and implement IP addressing strategy to meet current and future requirements
Contributor

This appears to be more architectural guidance than a configuration for the Oracle resource that can be configured to improve the resiliency of the Oracle resource deployment. As such, wouldn't this better be addressed through WAF guidance?

Contributor Author

The recommendation is agreed by relevant teams and as such we would like to keep it

Contributor

Agree with @ehaslett this reads as a generic best practice rather than an actionable reliability recommendation.

@basimolimajeed, due to the APRL requirements @ehaslett mentions above, this should be moved to WAF.

Contributor Author

being discussed

aprlGuid: 1bfdf86c-f501-4ad9-99a7-b29b736f34dc
recommendationTypeId: null
recommendationControl: Other Best Practices
recommendationImpact: High
recommendationResourceType: Specialized.Workload/oracle
recommendationMetadataState: Active
longDescription: |
It is important to design an IP addressing scheme taking into consideration the current and future requirements of your Oracle workloads. This will help you to avoid IP address conflicts and ensure that your Oracle workloads are always available.
Primary, standby, client and backup subnets should be designed without overlapping IP CIDR ranges.
potentialBenefits: Avoid conflicts in IP addressing.
pgVerified: false
publishedToLearn: false
automationAvailable: no
tags: null
learnMoreLink:
- name: Plan for IP Address Space
url: "https://docs.oracle.com/en-us/iaas/Content/database-at-azure/oaa_ip.htm"
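
The non-overlap requirement in this recommendation can be checked before deployment. The sketch below uses Python's standard `ipaddress` module to report any pair of planned subnets whose CIDR ranges overlap. The subnet names and address ranges are made-up sample values, and `overlapping_subnets` is a hypothetical helper.

```python
import ipaddress
from itertools import combinations


def overlapping_subnets(subnets: dict[str, str]) -> list[tuple[str, str]]:
    """Return pairs of subnet names whose CIDR ranges overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


if __name__ == "__main__":
    plan = {
        "primary": "10.0.0.0/24",
        "standby": "10.1.0.0/24",
        "client": "10.0.0.128/25",  # deliberately overlaps "primary"
        "backup": "10.2.0.0/24",
    }
    print(overlapping_subnets(plan))
```

The helper assumes all ranges use the same IP version; `overlaps` raises `TypeError` when an IPv4 network is compared with an IPv6 one.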

- description: Only one ODAA delegated subnet must exist within the VNet
oZakari marked this conversation as resolved.
aprlGuid: 76c9136c-642d-4ea3-a4f5-655f28d2ee07
recommendationTypeId: null
recommendationControl: Other Best Practices
recommendationImpact: High
recommendationResourceType: Specialized.Workload/oracle
recommendationMetadataState: Active
longDescription: |
An Azure VNet must have only one ODAA delegated subnet. However, other delegated subnets can exist for other services, for example Azure NetApp Files.
potentialBenefits: Avoid multiple ODAA delegated subnets within a VNet.
pgVerified: false
publishedToLearn: false
automationAvailable: no
tags: null
learnMoreLink:
- name: Network Planning for Oracle Database@Azure
url: "https://learn.microsoft.com/en-us/azure/oracle/oracle-db/oracle-database-network-plan#constraints"
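
Unlike most entries in this file, the single-delegated-subnet constraint concerns Azure network configuration, so it could plausibly be checked from an inventory of subnets. The sketch below counts ODAA-delegated subnets per VNet and returns the VNets that have more than one. The input shape (`vnet`, `name`, `delegations`) is hypothetical, and the delegation service name `Oracle.Database/networkAttachments` is an assumption that should be confirmed against the network planning page linked above.

```python
from collections import Counter

# Assumed delegation service name for ODAA; verify before relying on it.
ODAA_DELEGATION = "Oracle.Database/networkAttachments"


def vnets_with_multiple_odaa_subnets(subnets: list[dict]) -> dict[str, int]:
    """Return {vnet: count} for VNets with more than one ODAA-delegated subnet."""
    counts = Counter(
        s["vnet"]
        for s in subnets
        if ODAA_DELEGATION.lower() in (d.lower() for d in s.get("delegations", []))
    )
    return {vnet: n for vnet, n in counts.items() if n > 1}


if __name__ == "__main__":
    inventory = [
        {"vnet": "vnet-a", "name": "odaa-1", "delegations": [ODAA_DELEGATION]},
        {"vnet": "vnet-a", "name": "odaa-2", "delegations": [ODAA_DELEGATION]},
        # Delegations to other services do not count against the limit.
        {"vnet": "vnet-a", "name": "anf", "delegations": ["Microsoft.NetApp/volumes"]},
        {"vnet": "vnet-b", "name": "odaa-1", "delegations": [ODAA_DELEGATION]},
    ]
    print(vnets_with_multiple_odaa_subnets(inventory))
```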

- description: Deploy Data Guard observer in a redundant manner
aprlGuid: 768a5b06-41d4-4f10-b544-fbd2f6999af4
recommendationTypeId: null
recommendationControl: Business Continuity
recommendationImpact: Medium
recommendationResourceType: Specialized.Workload/oracle
recommendationMetadataState: Active
longDescription: |
Deploy Data Guard observer nodes in different AZs so that at least one observer node stays up if the production deployment becomes unavailable.
potentialBenefits: Data Guard observer automates database failover.
pgVerified: false
publishedToLearn: false
automationAvailable: no
tags: null
learnMoreLink:
- name: Oracle Data Guard Fast-Start Failover
url: "https://www.oracle.com/technical-resources/articles/smiley-fsfo.html"
- name: Configure and Deploy Oracle Data Guard
url: "https://docs.oracle.com/en/database/oracle/oracle-database/19/haovw/configure-and-deploy-oracle-data-guard.html#GUID-FA4AAC8F-EDA8-489F-9168-E83AE23B86F7"
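
One way to read the observer recommendation is as two placement rules: observers should occupy at least two availability zones, and at least one observer should run outside the zone that hosts the primary database. The sketch below encodes that reading. It is an interpretation rather than Oracle's definition, and the function name and zone labels are hypothetical.

```python
def observer_placement_issues(observer_zones: list[str], primary_zone: str) -> list[str]:
    """Return a list of problems with the planned observer placement (empty if none)."""
    issues = []
    # Rule 1: observers must not all sit in a single availability zone.
    if len(set(observer_zones)) < 2:
        issues.append("observers are not spread across at least two availability zones")
    # Rule 2: an outage of the primary's zone must leave an observer running.
    if all(zone == primary_zone for zone in observer_zones):
        issues.append("no observer runs outside the primary database's zone")
    return issues


if __name__ == "__main__":
    print(observer_placement_issues(["1", "2"], primary_zone="1"))
    print(observer_placement_issues(["1"], primary_zone="1"))
```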

- description: When using your own encryption keys with OKV running on VMs in Azure, set up the VMs in a redundant manner.
aprlGuid: bbe4014f-c49d-475d-9c48-76cb3c190483
recommendationTypeId: null
recommendationControl: High Availability
recommendationImpact: High
recommendationResourceType: Specialized.Workload/oracle
recommendationMetadataState: Active
longDescription: |
Provide redundancy for the VMs that run OKV. This applies only if the customer uses their own keys and does not use OCI Vault. A minimum four-node OKV cluster deployment is advised.
potentialBenefits: Protect access to keys in case of VM or AZ failure.
pgVerified: false
publishedToLearn: false
automationAvailable: no
tags: null
learnMoreLink:
- name: Deploy Oracle Key Vault for Oracle Database@Azure
url: "https://docs.oracle.com/en/solutions/deploy-key-vault-database-at-azure/index.html#GUID-3C967419-6461-470C-AC86-07F419CDF967"

- description: Ensure that the application tier spans at least two availability zones.
aprlGuid: 2ce48b43-bc8f-4b9e-850e-6cf827592daa
recommendationTypeId: null
recommendationControl: Business Continuity
recommendationImpact: High
recommendationResourceType: Specialized.Workload/oracle
recommendationMetadataState: Active
longDescription:
Deploy the application tier across at least two availability zones, using the same AZs where ODAA is deployed.
potentialBenefits: Azure Availability Zones provide high availability.
pgVerified: false
publishedToLearn: false
automationAvailable: no
tags: null
learnMoreLink:
- name: Business continuity and disaster recovery considerations for Oracle Database@Azure
url: "https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/oracle-iaas/oracle-disaster-recovery-oracle-database-azure"
- name: Learn about Oracle Maximum Availability Architecture for Oracle Database@Azure
url: "https://docs.oracle.com/en/solutions/oracle-maa-db-at-azure/#GUID-7723E2B1-9588-40BC-88BE-44637B1AF0D9"
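
The zone-alignment rule in this recommendation reduces to a small set comparison. The sketch below flags an application tier that spans fewer than two zones or that has no presence in a zone where ODAA runs. It assumes the zone identifiers of both tiers have already been mapped to the same physical zones, and `check_app_tier_zones` is a hypothetical helper.

```python
def check_app_tier_zones(app_zones: set[str], odaa_zones: set[str]) -> list[str]:
    """Return findings for an application tier that is not aligned with the ODAA zones."""
    findings = []
    if len(app_zones) < 2:
        findings.append("application tier spans fewer than two availability zones")
    # Zones that host ODAA but have no application instances.
    missing = odaa_zones - app_zones
    if missing:
        findings.append(f"no application instances in ODAA zone(s): {sorted(missing)}")
    return findings


if __name__ == "__main__":
    print(check_app_tier_zones({"1", "2"}, {"1", "2"}))
    print(check_app_tier_zones({"1"}, {"1", "2"}))
```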

- description: Ensure Infrastructure is updated in a rolling manner.
aprlGuid: 02bfe908-d958-451a-a603-bef8277ae56a
recommendationTypeId: null
recommendationControl: Other Best Practices
recommendationImpact: High
recommendationResourceType: Specialized.Workload/oracle
recommendationMetadataState: Active
longDescription: |
You can select non-rolling maintenance to update database and storage servers in parallel. However, non-rolling maintenance incurs a full system downtime.
Automatic infrastructure maintenance occurs each quarter. Oracle will notify you of the exact date and time of your maintenance a few weeks in advance. You can change the automatically chosen date and time at any time before the maintenance starts.
potentialBenefits: Keep Oracle workloads up-to-date.
pgVerified: false
publishedToLearn: false
automationAvailable: no
tags: null
learnMoreLink:
- name: Learn More
url: "https://docs.oracle.com/en-us/iaas/exadatacloud/exacs/exa-conf-oracle-man-infra.html#ECSCM-GUID-3274A857-5AE3-468E-950B-CB1D61DE48A9"