NETOBSERV-1566: ipfix: make RTT optional #630

Merged
merged 1 commit into from
Mar 15, 2024


jotak
Member

@jotak jotak commented Mar 13, 2024

Description

  • refactor IPFIX fields mapping / definition
  • allow optional fields
  • add interfaces and directions (plural) fields to IPFIX template
  • add tests for partial records & non-enriched records
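
As an illustration of the "allow optional fields" item, here is a minimal sketch, assuming a simplified mapping table. The `fieldMapping` type, its `Optional` flag, the `buildRecord` helper and the element/key names are hypothetical and are not taken from the actual `write_ipfix.go` code.

```go
package main

import "fmt"

// fieldMapping is a hypothetical, simplified stand-in for one entry of the
// exporter's field table: which flow key feeds which IPFIX element, and
// whether a record may omit it.
type fieldMapping struct {
	Element  string // IPFIX information element name (illustrative)
	Key      string // key looked up in the flow record (illustrative)
	Optional bool   // when true, a missing key is not an error
}

// buildRecord copies the mapped values out of flow. It fails only when a
// mandatory key is absent; optional keys that are absent are skipped.
func buildRecord(mappings []fieldMapping, flow map[string]any) (map[string]any, error) {
	out := make(map[string]any, len(mappings))
	for _, m := range mappings {
		v, ok := flow[m.Key]
		if !ok {
			if m.Optional {
				continue
			}
			return nil, fmt.Errorf("missing mandatory field %q", m.Key)
		}
		out[m.Element] = v
	}
	return out, nil
}

func main() {
	mappings := []fieldMapping{
		{Element: "sourceIPv4Address", Key: "SrcAddr"},
		{Element: "timeFlowRttNs", Key: "TimeFlowRttNs", Optional: true},
	}
	// A partial record without RTT is still exported.
	rec, err := buildRecord(mappings, map[string]any{"SrcAddr": "10.0.0.1"})
	fmt.Println(rec, err)
}
```

With a table like this, a record lacking RTT is exported without that element instead of failing the whole export.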

Dependencies

n/a

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Will this change affect NetObserv / Network Observability operator? If not, you can ignore the rest of this checklist.
  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
    • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
    • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
    • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
    • Standard QE validation, with pre-merge tests unless stated otherwise.
    • Regression tests only (e.g. refactoring with no user-facing change).
    • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

@openshift-ci-robot
Collaborator

openshift-ci-robot commented Mar 13, 2024

@jotak: This pull request references NETOBSERV-1566 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.16.0" version, but no target version was set.


@jotak jotak force-pushed the ipfix-rtt-optional branch from 5c1739a to 1a3119a Compare March 13, 2024 13:28

Comment on lines +146 to +157
"flowDirection": {
	Key: "IfDirections",
	Setter: func(elt entities.InfoElementWithValue, rec any) {
		if dirs, ok := rec.([]int); ok && len(dirs) > 0 {
			elt.SetUnsigned8Value(uint8(dirs[0]))
		}
	},
	Matcher: func(elt entities.InfoElementWithValue, expected any) bool {
		ifdirs := expected.([]int)
		return int(elt.GetUnsigned8Value()) == ifdirs[0]
	},
},
Collaborator

@jpinsonneau jpinsonneau Mar 13, 2024

Do we want to handle the reinterpreted direction here?

We could check for the FlowDirection field first and fall back on IfDirections if it is not found, for example.
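
A minimal sketch of that fallback, assuming the flow is a plain `map[string]any` in which `FlowDirection` holds an `int` and `IfDirections` holds an `[]int`. The `resolveDirection` helper and these value types are assumptions for illustration, not code from this PR.

```go
package main

import "fmt"

// resolveDirection is a hypothetical helper: it prefers an explicit
// "FlowDirection" value and falls back on the first "IfDirections" entry.
// The boolean result reports whether any direction could be found.
func resolveDirection(flow map[string]any) (uint8, bool) {
	if v, ok := flow["FlowDirection"]; ok {
		if d, ok := v.(int); ok {
			return uint8(d), true
		}
	}
	if v, ok := flow["IfDirections"]; ok {
		if dirs, ok := v.([]int); ok && len(dirs) > 0 {
			return uint8(dirs[0]), true
		}
	}
	return 0, false
}

func main() {
	// Both fields present: FlowDirection takes precedence.
	fmt.Println(resolveDirection(map[string]any{"FlowDirection": 1, "IfDirections": []int{0}}))
	// Only IfDirections present: the first entry is used.
	fmt.Println(resolveDirection(map[string]any{"IfDirections": []int{0, 1}}))
	// Neither present: the caller can skip the element.
	fmt.Println(resolveDirection(map[string]any{}))
}
```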

Member Author

But our agent always provides IfDirections, right? Or are there cases where it would fail to?

As you said here #630 (comment), ideally the mapping should be configurable, but that's a much bigger refactoring. Until we do that, I'd prefer to stick with what the agent does to keep it simple. Does that make sense?

Collaborator

The agent always provides IfDirections when using gRPC, but not when using IPFIX.
FLP can ingest IPFIX from eBPF or from an external source; in those cases we don't have the IfDirections field.

Member Author

But having the agent send IPFIX flows to FLP is not something we cover today. Just to make sure, I gave it a try, and it seems far from working (cf. #632). This whole exporter implementation is currently very tied to its usage via the operator, for instance with a lot of assumptions about the types in the generic map.

default:
	assert.Fail(t, "missing check on element", element.GetName())
	name := element.GetName()
	mapping, ok := write.MapIPFIXKeys[name]
Collaborator

In the end, should this be configurable?

I feel the hardcoded mapping makes it really specific to our usage.

Member Author

True, it should read some of the mapping from the k8s rules config, but I wasn't up for doing this refactoring just for this little bug fix :)

Collaborator

Sure, let's just add a TODO here and maybe create a JIRA task if it's relevant for us.

Member Author

@jotak jotak Mar 14, 2024

I don't believe we will ever need that in the operator, it's unlikely that we use different mechanisms to get agent's flows imho; so I opened an upstream issue: #632


codecov bot commented Mar 13, 2024

Codecov Report

Attention: Patch coverage is 0%, with 113 lines in your changes missing coverage. Please review.

Project coverage is 67.44%. Comparing base (fee143f) to head (c57a686).

Files                               Patch %   Lines
pkg/pipeline/write/write_ipfix.go   0.00%     113 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #630      +/-   ##
==========================================
+ Coverage   67.05%   67.44%   +0.38%     
==========================================
  Files         110      110              
  Lines        7677     7633      -44     
==========================================
  Hits         5148     5148              
+ Misses       2217     2173      -44     
  Partials      312      312              
Flag        Coverage Δ
unittests   67.44% <0.00%> (+0.38%) ⬆️


Contributor

@OlivierCazade OlivierCazade left a comment

LGTM

@Amoghrd
Contributor

Amoghrd commented Mar 13, 2024

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Mar 13, 2024

New image:
quay.io/netobserv/flowlogs-pipeline:a2b8863

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=a2b8863 make set-flp-image

@Amoghrd
Contributor

Amoghrd commented Mar 13, 2024

@jotak The export is still failing, with the IPFIX collector pod showing these logs:
E0313 17:46:49.328657 1 tcp.go:87] error in decoding message: template 256 with obsDomainID 1 does not exist
E0313 17:46:49.328693 1 tcp.go:87] error in decoding message: template 256 with obsDomainID 1 does not exist
E0313 17:46:49.328717 1 tcp.go:87] error in decoding message: template 256 with obsDomainID 1 does not exist
E0313 17:46:49.328860 1 tcp.go:87] error in decoding message: template 256 with obsDomainID 1 does not exist

@jotak
Member Author

jotak commented Mar 14, 2024

@Amoghrd I tried using another FLP instance deployed as a collector and I don't see any error. I'm wondering if this could be due to your collector.
Do you know what exactly you're using here: https://github.com/Amoghrd/go-ipfix/blob/custom-image/build/yamls/ipfix-collector.yaml#L50-L54 ? Is it pre-setting some IPFIX template? If you force a pre-registered template, that could be the cause of the issue whenever new fields are added to our template. cc @jpinsonneau do you know? (since it seems you created this custom image)

FWIW we can also use a custom FLP deployment to ingest IPFIX, with a workflow like:
agent -> the usual netobserv FLP + IPFIX exporter -> custom FLP
See my PR #633 for such a deployment.
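
For context on the "template 256 with obsDomainID 1 does not exist" error, here is a minimal sketch of how a collector might look up templates, assuming templates are keyed by observation domain ID and template ID. The `registry` type and its methods are hypothetical and do not reflect the actual collector implementation. They only show why a data record whose template is unknown to the collector cannot be decoded.

```go
package main

import "fmt"

// templateKey identifies a template: IDs are scoped to an observation domain.
type templateKey struct {
	obsDomainID uint32
	templateID  uint16
}

// registry is a hypothetical, minimal template store for a collector.
type registry struct {
	templates map[templateKey][]string
}

func newRegistry() *registry {
	return &registry{templates: make(map[templateKey][]string)}
}

// addTemplate records the ordered field names announced by a template set.
func (r *registry) addTemplate(obsDomainID uint32, templateID uint16, fields []string) {
	r.templates[templateKey{obsDomainID, templateID}] = fields
}

// decode pairs the values of a data record with the fields of its template.
// It fails when the template was never registered or has a different length.
func (r *registry) decode(obsDomainID uint32, templateID uint16, values []any) (map[string]any, error) {
	fields, ok := r.templates[templateKey{obsDomainID, templateID}]
	if !ok {
		return nil, fmt.Errorf("template %d with obsDomainID %d does not exist", templateID, obsDomainID)
	}
	if len(fields) != len(values) {
		return nil, fmt.Errorf("template %d expects %d fields, got %d", templateID, len(fields), len(values))
	}
	rec := make(map[string]any, len(fields))
	for i, name := range fields {
		rec[name] = values[i]
	}
	return rec, nil
}

func main() {
	r := newRegistry()
	// A data record whose template is unknown cannot be decoded.
	_, err := r.decode(1, 256, []any{"10.0.0.1"})
	fmt.Println(err)

	// Once the template is registered, the same record decodes.
	r.addTemplate(1, 256, []string{"sourceIPv4Address"})
	rec, err := r.decode(1, 256, []any{"10.0.0.1"})
	fmt.Println(rec, err)
}
```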

@Amoghrd
Contributor

Amoghrd commented Mar 14, 2024

Yeah, this template was built with @jpinsonneau's help. As far as I remember, he mentioned that a custom image was built to cater to the needs of the IBM folks. But yeah, for all IPFIX export testing we have been using this YAML. @jpinsonneau Could you confirm whether we should move away from this YAML and use @jotak's suggestion of a custom FLP deployment?

@Amoghrd
Contributor

Amoghrd commented Mar 14, 2024

Tried the FLP collector and it worked fine.
/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved QE has approved this pull request label Mar 14, 2024

- refactor IPFIX fields mapping / definition
- allow optional fields
- add interfaces and directions (plural) fields to IPFIX template
- add tests for partial records & non-enriched records
@jotak jotak force-pushed the ipfix-rtt-optional branch from cef7921 to c57a686 Compare March 15, 2024 10:00
@openshift-ci openshift-ci bot removed the lgtm label Mar 15, 2024

openshift-ci bot commented Mar 15, 2024

New changes are detected. LGTM label has been removed.

@jotak
Member Author

jotak commented Mar 15, 2024

(rebased)

@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Mar 15, 2024
@jotak
Member Author

jotak commented Mar 15, 2024

/approve


openshift-ci bot commented Mar 15, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jotak


@jotak jotak merged commit af02097 into netobserv:main Mar 15, 2024
9 of 10 checks passed
@jotak jotak deleted the ipfix-rtt-optional branch March 15, 2024 10:45
Labels
approved jira/valid-reference qe-approved QE has approved this pull request

5 participants