Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CNF-14144: Add resource server enhancements doc #262

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

alegacy
Copy link
Collaborator

@alegacy alegacy commented Oct 16, 2024

This adds a document describing the proposal to change the resource server architecture to address performance and scalability concerns.

This adds a document describing the proposal to change the resource
server architecture to address performance and scalability concerns.

Signed-off-by: Allain Legacy <[email protected]>
@openshift-ci-robot
Copy link
Collaborator

openshift-ci-robot commented Oct 16, 2024

@alegacy: This pull request references CNF-14144 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.18.0" version, but no target version was set.

In response to this:

This adds a document describing the proposal to change the resource server architecture to address performance and scalability concerns.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested review from browsell and edcdavid October 16, 2024 19:32
Copy link

openshift-ci bot commented Oct 16, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from alegacy. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@alegacy
Copy link
Collaborator Author

alegacy commented Oct 16, 2024

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 16, 2024
@bartwensley
Copy link
Collaborator

/cc @pixelsoccupied

@openshift-ci openshift-ci bot requested a review from pixelsoccupied October 17, 2024 13:12
attributes will be normalized into the formal model definition. The following subsection lists the extensions currently
stored within the extensions attribute and passed directly onto the public facing API.

#### Resource Pool
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will also need the H/W server resource pools

For nodes from hardware managers:

```json
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would also need to look at exposing nics and accelerators ( FEC, GPUS etc)

"cores": "TBD",
"bios": "TBD",
"memory": "TBD"
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably want the server variant here as well

This feature depends on the Postgres service being deployed, This work is to be completed as part of the alarm server
feature development {TBD Reference}.

## Consolidation of microservices
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Addresses PR comments

Signed-off-by: Allain Legacy <[email protected]>
Comment on lines +242 to +243
Ideally, since re-syncing is a resource intensive process, it would be best to not do a periodic sync for any data
source which supports an asynchronous event notifications mechanism. However, given that these other software
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation details...but if all these resources are k8s CRs, we can simply setup a watch on them with client-go? We basically setup a listen with SharedInformerFactory and just react on based on events (add/delete/etc). This way we dont really need to keep track of generation IDs and such (k8s will take care of that for us)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, they won't be k8s CRs in most cases because we'll be importing data from ACM, and h/w managers... so direct polling will be required.

Comment on lines +351 to +354
- The resource server will push any changes to the resource type table to the alarm server
- The resource server will pull the latest alarm dictionary data on startup
- The alarm server will push any changes to the alarm dictionary table to the resource server
- The alarm server will pull the latest resource type data on startup.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool basically we will fill in the details for each other. Alarms will ask something like /ResourceType GET...and then we generate the AlarmDictionary rows and respond back with ResourceType and its corresponding AlarmDictionary filled out. We can probably reuse the exact the O-RAN payload for req and response to keep it simple.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we can probably use the 'standard' API endpoints... maybe.

);

-- Table: resource_type
CREATE TABLE IF NOT EXISTS resource_type (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will you keep a copy of the AlarmDictionary? But maybe a call to alarm server will do the trick. Basically what happens when client requests to list all the resourceType after the server is ready?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, as discussed we'll keep a cached copy of the alarm dictionary/definitions and refresh it periodically and based on notifications from the alarm server. See diagram.

Copy link

openshift-ci bot commented Nov 29, 2024

@alegacy: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/markdownlint 329860d link true /test markdownlint

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants