Data filter in kernel #4324

Merged (7 commits) on Dec 13, 2024

@rscampos (Collaborator) commented Sep 24, 2024

1. Explain what the PR does

03b6b4d docs(filters): add restrictions when kernel data filter is used
97dd25f test(filters): kernel data filter
4030594 feat(filters): equalities for kernel data filter
0d98252 feat(filters): eBPF management for kernel data filter
0c31dbe feat(ebpf): enable kernel data filter in eBPF program
86ac860 feat(ebpf): kernel data filter logic
a499bc1 feat(ebpf): Enable BPF_F_NO_PREALLOC for LPM TRIE

03b6b4d docs(filters): add restrictions when kernel data filter is used

- Add the restrictions applicable when the kernel-space data filter is
available for an event field.

97dd25f test(filters): kernel data filter

- Add MatchTypes{} and KernelDataFilter{} in cmp.AllowUnexported;
- Kernel data filters restrict pathnames to 255 characters and
disallow 'contains' filters; unit tests have been added to validate
these restrictions;
- Integration tests for specific events added, covering three filter
types with "equal" and "not equal" conditions.

4030594 feat(filters): equalities for kernel data filter

- method equalities created for data filter;
- method computeDataFilterEqualities created for kernel data filter;
- handle corner case when one policy uses a substring (path) of another
  policy;
- disable data filter (only pathname) for selected events;
- Kernel data filters restrict pathnames to 255 characters and disallow
'contains' filters - added functions to enforce that.
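
The two restrictions above could be enforced with a check along these lines. This is a minimal Python sketch, not Tracee's Go code: the function name, the error messages, and the assumption that the 255-character limit applies to the pattern with its `*` wildcard removed are all illustrative.

```python
MAX_KERNEL_FILTER_LEN = 255  # limit stated in the commit message


def classify_kernel_data_filter(pattern: str) -> tuple[str, str]:
    """Return (match_type, literal), or raise ValueError if the pattern
    cannot be handled by the kernel data filter (hypothetical helper)."""
    starts = pattern.startswith("*")
    ends = pattern.endswith("*")
    if starts and ends and len(pattern) > 1:
        # '*text*' is a 'contains' filter, which is disallowed in kernel
        raise ValueError("'contains' filters are not supported in kernel")
    if starts:
        match_type, literal = "suffix", pattern[1:]
    elif ends:
        match_type, literal = "prefix", pattern[:-1]
    else:
        match_type, literal = "exact", pattern
    if len(literal) > MAX_KERNEL_FILTER_LEN:
        raise ValueError("pattern longer than 255 characters")
    return match_type, literal
```

For example, `classify_kernel_data_filter("/etc/passwd*")` yields a prefix filter on `/etc/passwd`, while `"*pass*"` is rejected.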

0d98252 feat(filters): eBPF management for kernel data filter

- eBPF map definition for exact, prefix, suffix match;
- create updateDataFilterLPMBPF and updateDataFilterBPF to populate eBPF
  maps;
- config map fields for exact, prefix and suffix;
- Create the function createNewDataFilterMapsVersion in order to create
the inner maps based on version and event id.

0c31dbe feat(ebpf): enable kernel data filter in eBPF program

- how to enable data filter in the eBPF program using the function
evaluate_data_filters.

86ac860 feat(ebpf): kernel data filter logic

- function load_str_from_buf created to retrieve a string value based on its index;
- function reverse_string created to reverse a string in order to enable suffix matching;
- functions evaluate_data_filters/match_data_filters created to apply exact, prefix and suffix matches;
- eBPF maps for exact, prefix and suffix, plus an eBPF map to hold the temporary LPM trie key;
- extend event_config with a per-event data filter config, used for exact, prefix and suffix matches;
- save the offset at the specified index in the function save_str_to_buf.

2. Explain how to test it

The method for defining data filters in Tracee remains the same. However, for the security_file_open and magic_write events, if the pathname is used as a filter, the event is now filtered at the eBPF data plane, preventing it from being sent to user space for filtering.

Notes for the reviewer: The following sections contain commands I used to test with policies. The results for each test group are also included. Both the policies and results are located in the zip file provided in each section. The results of some tests may vary depending on the Linux version and libraries, especially when the "not equal" operator is used.

Only exact match

Tracee

sudo ./dist/tracee -p examples/policies/sfo-exact-1.yaml -p examples/policies/sfo-exact-2.yaml -p examples/policies/sfo-exact-3.yaml -o json

Maps

% sudo bpftool map dump name config_map
...
"enabled_policies": 7,
"data_filter_prefix_enabled": 0,
"data_filter_suffix_enabled": 0,
"data_filter_exact_enabled": 7,
"data_filter_prefix_match_if_key_missing": 0,
"data_filter_suffix_match_if_key_missing": 0,
"data_filter_exact_match_if_key_missing": 4,
...
## dump data_filter_exact
% sudo bpftool map dump id 24016
[{
        "key": {
            "path": "/etc/networks"
        },
        "value": {
            "equal_in_scopes": 2,
            "equality_set_in_scopes": 6
        }
    },{
        "key": {
            "path": "/etc/netconfig"
        },
        "value": {
            "equal_in_scopes": 1,
            "equality_set_in_scopes": 1
        }
    }
]
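
Every numeric value in these dumps is a bitmap with one bit per policy. Assuming policies are numbered from 1 in the order they were passed with -p, with bit 0 standing for the first policy (an assumption, but it matches how the corner-case section reads the values), a small Python helper makes the dumps easier to read:

```python
def policies_in(bitmap: int) -> list[int]:
    """List the 1-based policy numbers whose bit is set in `bitmap`."""
    return [i + 1 for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


print(policies_in(7))  # enabled_policies
print(policies_in(6))  # equality_set_in_scopes for /etc/networks
print(policies_in(2))  # equal_in_scopes for /etc/networks
```

Read this way, /etc/networks is referenced by policies 2 and 3, but only policy 2 uses it with an "equal" condition. Policy 3 presumably uses "not equal", which would be consistent with data_filter_exact_match_if_key_missing being 4.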

Cmds

The results of each of the following lines are in the JSON file (results_exact.json):

% more /etc/netconfig # json line 1 (sfo-exact-match-1)
% more /etc/networks # json line 2 (sfo-exact-match-2)
% cat /etc/networks # json line 3,4,5 (sfo-exact-match-3)

exactly_policies_results.zip

Only prefix match

Tracee

sudo ./dist/tracee -p examples/policies/sfo-prefix-1.yaml -p examples/policies/sfo-prefix-2.yaml -p examples/policies/sfo-prefix-3.yaml -o json

Maps

% sudo bpftool map dump name config_map
...
"enabled_policies": 7,
"data_filter_prefix_enabled": 7,
"data_filter_suffix_enabled": 0,
"data_filter_exact_enabled": 0,
"data_filter_prefix_match_if_key_missing": 4,
"data_filter_suffix_match_if_key_missing": 0,
"data_filter_exact_match_if_key_missing": 0,
...
## dump data_filter_prefix
% sudo bpftool map dump id 24583
[{
        "key": {
            "prefix_len": 96,
            "path": "/etc/network"
        },
        "value": {
            "equal_in_scopes": 1,
            "equality_set_in_scopes": 5
        }
    },{
        "key": {
            "prefix_len": 72,
            "path": "/etc/pass"
        },
        "value": {
            "equal_in_scopes": 2,
            "equality_set_in_scopes": 2
        }
    }
]
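
The prefix_len values are expressed in bits, as is usual for LPM trie keys: "/etc/network" has 12 characters, hence 96. A Python sketch of how such a key could be derived from a policy value follows. The function name and dictionary layout are illustrative (they mirror the bpftool output, not Tracee's structs), and the `*` wildcard syntax is assumed from the CLI examples in the review comments.

```python
def lpm_prefix_key(pattern: str) -> dict:
    """Build an LPM-style key for a prefix pattern such as '/etc/network*'."""
    # Drop the single trailing wildcard, if present
    literal = pattern[:-1] if pattern.endswith("*") else pattern
    # LPM trie prefix lengths count bits, so multiply the byte length by 8
    return {"prefix_len": 8 * len(literal.encode()), "path": literal}


print(lpm_prefix_key("/etc/network*"))
print(lpm_prefix_key("/etc/pass*"))
```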

Cmds

The results of each of the following lines are in the JSON file (results_prefix.json):

% more /etc/networks # json line 1 (sfo-prefix-match-1)
% sudo cp /etc/networks /etc/networks.bkp; more /etc/networks.bkp # json line 2 (sfo-prefix-match-1)
% more /etc/passwd # json line 3 (sfo-prefix-match-2)
% cat /etc/networks # json line 4,5,6 (sfo-prefix-match-3)

prefix_policies_results.zip

Only suffix match

Tracee

sudo ./dist/tracee -p examples/policies/sfo-suffix-1.yaml -p examples/policies/sfo-suffix-2.yaml -p examples/policies/sfo-suffix-3.yaml -o json

Maps

% sudo bpftool map dump name config_map
...
"enabled_policies": 7,
"data_filter_prefix_enabled": 0,
"data_filter_suffix_enabled": 7,
"data_filter_exact_enabled": 0,
"data_filter_prefix_match_if_key_missing": 0,
"data_filter_suffix_match_if_key_missing": 4,
"data_filter_exact_match_if_key_missing": 0,
...
## dump data_filter_suffix
% sudo bpftool map dump id 24583
[{
        "key": {
            "prefix_len": 48,
            "path": "dwssap"
        },
        "value": {
            "equal_in_scopes": 2,
            "equality_set_in_scopes": 2
        }
    },{
        "key": {
            "prefix_len": 72,
            "path": "gifnocten"
        },
        "value": {
            "equal_in_scopes": 1,
            "equality_set_in_scopes": 5
        }
    }
]
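
The suffix map reuses the LPM trie mechanism by storing each suffix reversed, so "dwssap" is passwd and "gifnocten" is netconfig; the path being checked has to be reversed as well before the lookup. A Python sketch of the idea (helper names are illustrative, not Tracee's):

```python
def lpm_suffix_key(pattern: str) -> dict:
    """Build an LPM-style key for a suffix pattern such as '*passwd'."""
    literal = pattern[1:] if pattern.startswith("*") else pattern
    reversed_literal = literal[::-1]
    return {"prefix_len": 8 * len(reversed_literal.encode()), "path": reversed_literal}


def has_suffix(path: str, key: dict) -> bool:
    """A suffix match is a prefix match on the reversed path."""
    return path[::-1].startswith(key["path"])


key = lpm_suffix_key("*passwd")
print(key, has_suffix("/etc/passwd", key), has_suffix("/etc/passwd-", key))
```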

Cmds

The results of each of the following lines are in the JSON file (results_suffix.json):

% more /etc/netconfig # json line 1 (sfo-suffix-match-1)
% cp /etc/netconfig /tmp/netconfig; more /tmp/netconfig # json line 2 (sfo-suffix-match-1)
% more /etc/passwd # json line 3 (sfo-suffix-match-2)
% cat /etc/netconfig # json line 4,5,6 (sfo-suffix-match-3)

suffix_policies_results.zip

Mixed (exact/prefix/suffix) match

In this section, you can see all string matches working together. The command cat /etc/netconfig triggers three policies simultaneously, while the command cat /etc/host.conf triggers two policies.

Tracee

sudo ./dist/tracee -p examples/policies/sfo-exact-5.yaml -p examples/policies/sfo-prefix-4.yaml -p examples/policies/sfo-suffix-4.yaml -p examples/policies/sfo-suffix-5.yaml -p examples/policies/sfo-prefix-5.yaml -o json

Maps

% sudo bpftool map dump name config_map
...
"enabled_policies": 31,
"data_filter_prefix_enabled": 18,
"data_filter_suffix_enabled": 12,
"data_filter_exact_enabled": 1,
"data_filter_prefix_match_if_key_missing": 0,
"data_filter_suffix_match_if_key_missing": 0,
"data_filter_exact_match_if_key_missing": 0,
...
## dump exact
[{
        "key": {
            "path": "/etc/netconfig"
        },
        "value": {
            "equal_in_scopes": 1,
            "equality_set_in_scopes": 1
        }
    }
]

## dump prefix
[{
        "key": {
            "prefix_len": 72,
            "path": "/etc/host"
        },
        "value": {
            "equal_in_scopes": 16,
            "equality_set_in_scopes": 16
        }
    },{
        "key": {
            "prefix_len": 64,
            "path": "/etc/net"
        },
        "value": {
            "equal_in_scopes": 2,
            "equality_set_in_scopes": 2
        }
    }
]

## dump suffix
[{
        "key": {
            "prefix_len": 40,
            "path": "fnoc."
        },
        "value": {
            "equal_in_scopes": 8,
            "equality_set_in_scopes": 8
        }
    },{
        "key": {
            "prefix_len": 72,
            "path": "gifnocten"
        },
        "value": {
            "equal_in_scopes": 4,
            "equality_set_in_scopes": 4
        }
    }
]

Cmds

The results of each of the following lines are in the JSON file (results_mixed.json):

% cat /etc/network/fan # json line 1 (sfo-prefix-match-4)
% cp /etc/netconfig /tmp/netconfig; cat /tmp/netconfig # json line 2 (sfo-suffix-match-4)
% cat /etc/netconfig # json line 3 (sfo-exact-match-5,sfo-prefix-match-4,sfo-suffix-match-4)
% cat /etc/host.conf # json line 4 (sfo-suffix-match-5,sfo-prefix-match-5)
% cat /etc/sysctl.conf # json line 5 (sfo-suffix-match-5)

mixed_policies_results.zip

Mixed (exact/prefix/suffix) match (same policy)

In this section, you can see all string matches working together in the same policy. The command more /etc/netconfig triggers three policies simultaneously.

Tracee

sudo ./dist/tracee -p examples/policies/sfo-mixed-pol-1.yaml -p examples/policies/sfo-mixed-pol-2.yaml -p examples/policies/sfo-mixed-pol-3.yaml -o json

Maps

% sudo bpftool map dump name config_map
...
"enabled_policies": 7,
"data_filter_prefix_enabled": 3,
"data_filter_suffix_enabled": 6,
"data_filter_exact_enabled": 7,
"data_filter_prefix_match_if_key_missing": 3,
"data_filter_suffix_match_if_key_missing": 4,
"data_filter_exact_match_if_key_missing": 6,
...
## dump exact
[{
        "key": {
            "str": "/etc/ld.so.cache"
        },
        "value": {
            "equals_in_policies": 1,
            "key_used_in_policies": 5
        }
    },{
        "key": {
            "str": "/etc/netconfig"
        },
        "value": {
            "equals_in_policies": 0,
            "key_used_in_policies": 2
        }
    }
]

## dump prefix
[{
        "key": {
            "prefix_len": 64,
            "str": "/etc/net"
        },
        "value": {
            "equals_in_policies": 1,
            "key_used_in_policies": 1
        }
    },{
        "key": {
            "prefix_len": 72,
            "str": "/usr/lib/"
        },
        "value": {
            "equals_in_policies": 0,
            "key_used_in_policies": 3
        }
    }
]

## dump suffix
[{
        "key": {
            "prefix_len": 120,
            "str": "3.6.os.ofnitbil"
        },
        "value": {
            "equals_in_policies": 0,
            "key_used_in_policies": 4
        }
    },{
        "key": {
            "prefix_len": 72,
            "str": "6.os.cbil"
        },
        "value": {
            "equals_in_policies": 0,
            "key_used_in_policies": 4
        }
    },{
        "key": {
            "prefix_len": 88,
            "str": "ehcac.os.dl"
        },
        "value": {
            "equals_in_policies": 2,
            "key_used_in_policies": 2
        }
    }
]

Cmds

The results of each of the following lines are in the JSON file (results_mixed_same_policy.json):

% more /etc/netconfig

Results: results_mixed_same_policy.json

json line 1 (sfo-mixed-pol-1, sfo-mixed-pol-2)
json line 2 (sfo-mixed-pol-3)
json line 3 (sfo-mixed-pol-3)
json line 4 (sfo-mixed-pol-1, sfo-mixed-pol-3)
json line 5 (sfo-mixed-pol-3)

results_mixed_same_policy_results.zip

Ensuring Multiple Policy Matches when LPM Trie is used - Prefix

Corner case description: When using the LPM Trie, it always returns the longest matching string. A corner case arises when one policy (e.g., policy1) uses a substring of another policy (e.g., policy2). For instance, if policy1 covers /etc/net* and policy2 covers /etc/netconf*, a lookup for /etc/netconfig currently only returns policy2 because it is the longest match. However, it should return both policy1 and policy2.

A potential solution (implemented): If one suffix or prefix overlaps with another, we can simply combine their bitmaps in user space. No additional logic is required in kernel space to handle this corner case.
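
The user-space merge can be sketched in Python as follows. This is an illustration of the idea, not the code in computeDataFilterEqualities: the function name is hypothetical, and the "raw" per-policy values are inferred from the map dump in this section (policies 1 to 3 use "equal", policy 4 uses "not equal" on two prefixes).

```python
def merge_overlapping(entries: dict[str, tuple[int, int]]) -> dict[str, tuple[int, int]]:
    """entries maps prefix -> (equal_bits, set_bits) before merging.

    Each key inherits the bits of every other key that is a prefix of it,
    so the single longest match already carries the shorter matches.
    """
    merged = {}
    for key, (equal, used) in entries.items():
        for other, (other_equal, other_used) in entries.items():
            if other != key and key.startswith(other):
                equal |= other_equal
                used |= other_used
        merged[key] = (equal, used)
    return merged


raw = {
    "/etc/n": (0b0001, 0b0001),        # policy 1, equal
    "/etc/net": (0b0010, 0b0010),      # policy 2, equal
    "/etc/network": (0b0100, 0b0100),  # policy 3, equal
    "/etc/netconf": (0b0000, 0b1000),  # policy 4, not equal
    "/usr/lib/": (0b0000, 0b1000),     # policy 4, not equal
}
for path, bits in merge_overlapping(raw).items():
    print(path, bits)
```

With these inputs the merged pairs agree with the equal_in_scopes and equality_set_in_scopes values in the dump, for instance (3, 11) for /etc/netconf. For suffixes the same merge would apply to the reversed strings.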

Tracee

sudo ./dist/tracee -p examples/policies/cc-sfo-prefix-1.yaml -p examples/policies/cc-sfo-prefix-2.yaml -p examples/policies/cc-sfo-prefix-3.yaml -p examples/policies/cc-sfo-prefix-4.yaml -o json

Maps

% sudo bpftool map dump name config_map
...
"enabled_policies": 15,
"data_filter_prefix_enabled": 15,
"data_filter_suffix_enabled": 0,
"data_filter_exact_enabled": 0,
"data_filter_prefix_match_if_key_missing": 8,
"data_filter_suffix_match_if_key_missing": 0,
"data_filter_exact_match_if_key_missing": 0,
...
## dump prefix
[{
        "key": {
            "prefix_len": 96,
            "path": "/etc/netconf"
        },
        "value": {
            "equal_in_scopes": 3,
            "equality_set_in_scopes": 11
        }
    },{
        "key": {
            "prefix_len": 96,
            "path": "/etc/network"
        },
        "value": {
            "equal_in_scopes": 7,
            "equality_set_in_scopes": 7
        }
    },{
        "key": {
            "prefix_len": 64,
            "path": "/etc/net"
        },
        "value": {
            "equal_in_scopes": 3,
            "equality_set_in_scopes": 3
        }
    },{
        "key": {
            "prefix_len": 48,
            "path": "/etc/n"
        },
        "value": {
            "equal_in_scopes": 1,
            "equality_set_in_scopes": 1
        }
    },{
        "key": {
            "prefix_len": 72,
            "path": "/usr/lib/"
        },
        "value": {
            "equal_in_scopes": 0,
            "equality_set_in_scopes": 8
        }
    }
]

Note: Policy 4 includes lines to exclude library entries from the output. These lines are solely for cleaning up the output to simplify the testing in this section.

Policy 2 (prefix /etc/net) overlaps with Policy 1 (prefix /etc/n), because /etc/n is itself a prefix of /etc/net. This is why the key with the path "/etc/net" has equality_set_in_scopes=3 (binary 0011), indicating that both Policy 1 and Policy 2 are part of its equality set. Likewise, equal_in_scopes=3 shows that both policies match this key with an "equal" condition.

Policy 3 (prefix /etc/network) extends both Policy 1 (prefix /etc/n) and Policy 2 (prefix /etc/net). Consequently, the key with the path "/etc/network" has equality_set_in_scopes=7 (binary 0111), which means that all three policies are part of its equality set. Similarly, equal_in_scopes=7 indicates that Policy 1, Policy 2, and Policy 3 all match with an "equal" condition.

Policy 4 (prefix /etc/netconf) also extends both Policy 1 (prefix /etc/n) and Policy 2 (prefix /etc/net). Therefore, the key with the path "/etc/netconf" has equality_set_in_scopes=11 (binary 1011), which means that Policy 1, Policy 2, and Policy 4 are all part of its equality set. However, because Policy 4 was defined with the condition data.pathname!=/etc/netconf, equal_in_scopes=3: only Policy 1 and Policy 2 match with "equal", while Policy 4 is excluded.

In summary, Policy 2, Policy 3, and Policy 4 derive bits from other policies, reflecting their interdependencies and overlaps in scope.

Cmds

The results of each of the following lines are in the JSON file (results_corner_case_prefix.json):

% more /etc/netconfig # json line 1 (cc-sfo-prefix-match-1; cc-sfo-prefix-match-2)
% more /etc/networks # json line 2 (cc-sfo-prefix-match-2; cc-sfo-prefix-match-3; cc-sfo-prefix-match-4; cc-sfo-prefix-match-1)
% sudo cp /etc/netconfig /etc/na; more /etc/na # json line 3 (cc-sfo-prefix-match-4; cc-sfo-prefix-match-1)
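
The policy sets in these three results can be reproduced from the dumped values with a short Python emulation. Two things here are assumptions rather than facts from this description: the longest-prefix lookup is emulated with plain string comparison, and the combination rule `equal | (match_if_key_missing & ~equality_set)` is a guess at how the kernel side merges the values. The rule does, however, reproduce all three lines.

```python
# path -> (equal_in_scopes, equality_set_in_scopes), copied from the dump
PREFIX_MAP = {
    "/etc/netconf": (3, 11),
    "/etc/network": (7, 7),
    "/etc/net": (3, 3),
    "/etc/n": (1, 1),
    "/usr/lib/": (0, 8),
}
MATCH_IF_KEY_MISSING = 8  # data_filter_prefix_match_if_key_missing


def matched_policies(path: str) -> int:
    """Emulate a longest-prefix lookup, then apply the assumed rule."""
    candidates = [key for key in PREFIX_MAP if path.startswith(key)]
    equal, used = PREFIX_MAP[max(candidates, key=len)] if candidates else (0, 0)
    # Policies that did not reference the matched key fall back to their
    # "match if key missing" default (set for "not equal" policies).
    return equal | (MATCH_IF_KEY_MISSING & ~used)


for path in ("/etc/netconfig", "/etc/networks", "/etc/na"):
    print(path, format(matched_policies(path), "04b"))
```

Bit 0 is Policy 1 and bit 3 is Policy 4, so the three bitmaps correspond to policies {1, 2}, {1, 2, 3, 4} and {1, 4}.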

cc_prefix_policies_results.zip

Ensuring Multiple Policy Matches when LPM Trie is used - Suffix

Tracee

sudo ./dist/tracee -p examples/policies/cc-sfo-suffix-1.yaml -p examples/policies/cc-sfo-suffix-2.yaml -p examples/policies/cc-sfo-suffix-3.yaml -o json

Maps

% sudo bpftool map dump name config_map
...
"enabled_policies": 7,
"data_filter_prefix_enabled": 4,
"data_filter_suffix_enabled": 7,
"data_filter_exact_enabled": 4,
"data_filter_prefix_match_if_key_missing": 4,
"data_filter_suffix_match_if_key_missing": 4,
"data_filter_exact_match_if_key_missing": 4,
...
## dump suffix
[{
        "key": {
            "prefix_len": 104,
            "path": "gifnocten/cte"
        },
        "value": {
            "equal_in_scopes": 3,
            "equality_set_in_scopes": 7
        }
    },{
        "key": {
            "prefix_len": 72,
            "path": "gifnocten"
        },
        "value": {
            "equal_in_scopes": 3,
            "equality_set_in_scopes": 3
        }
    },{
        "key": {
            "prefix_len": 48,
            "path": "gifnoc"
        },
        "value": {
            "equal_in_scopes": 2,
            "equality_set_in_scopes": 2
        }
    }
]

Note: Policy 3 includes lines to exclude library entries from the output. These lines are solely for cleaning up the output to simplify the testing in this section.

Policy 1 (suffix netconfig) overlaps with Policy 2 (suffix config), because config is itself a suffix of netconfig. This is why the key with the path "gifnocten" (netconfig reversed) has equality_set_in_scopes=3, which indicates that both Policy 1 and Policy 2 are part of its equality set. Likewise, equal_in_scopes=3 shows that both policies match with an "equal" condition.

Policy 2 (suffix config) only contains the bitmap of Policy 2 (equality_set_in_scopes=2 and equal_in_scopes=2).

Policy 3 (suffix etc/netconfig) contains both Policy 1 (suffix netconfig) and Policy 2 (suffix config). Therefore, the key with the path "gifnocten/cte" (representing the reverse of etc/netconfig) has equality_set_in_scopes=7. This value indicates that all three policies are present within the scope. However, equal_in_scopes=3 shows that only Policy 1 and Policy 2 are equal in scopes, whereas Policy 3 is disabled in this scope because it was defined using data.pathname!=etc/netconfig.

Cmds

The results of each of the following lines are in the JSON file (results_corner_case_suffix.json):

% more /etc/netconfig # json line 1 (cc-sfo-suffix-match-1; cc-sfo-suffix-match-2)
% more /etc/ssh/ssh_config # json line 2 (cc-sfo-suffix-match-2; cc-sfo-suffix-match-3)
% cp /etc/netconfig /tmp/netaconfig; more /tmp/netaconfig # json line 3 (cc-sfo-suffix-match-2; cc-sfo-suffix-match-3)
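
For the suffix case, the lookup works on the reversed path. The Python sketch below emulates it using the dumped values. As an assumption, it combines them with the rule `equal | (match_if_key_missing & ~equality_set)`, which is not spelled out in this description but yields the policy sets listed above.

```python
# reversed suffix -> (equal_in_scopes, equality_set_in_scopes), from the dump
SUFFIX_MAP = {
    "gifnocten/cte": (3, 7),
    "gifnocten": (3, 3),
    "gifnoc": (2, 2),
}
MATCH_IF_KEY_MISSING = 4  # data_filter_suffix_match_if_key_missing


def matched_policies(path: str) -> int:
    """Reverse the path, take the longest matching key, apply the assumed rule."""
    reversed_path = path[::-1]
    candidates = [key for key in SUFFIX_MAP if reversed_path.startswith(key)]
    equal, used = SUFFIX_MAP[max(candidates, key=len)] if candidates else (0, 0)
    return equal | (MATCH_IF_KEY_MISSING & ~used)


for path in ("/etc/netconfig", "/etc/ssh/ssh_config", "/tmp/netaconfig"):
    print(path, format(matched_policies(path), "03b"))
```

/tmp/netaconfig only reaches the "gifnoc" key because its reversed form starts with "gifnoca", so Policy 1 drops out and Policy 3 matches through its "not equal" default.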

cc_suffix_policies_results.zip

Performance results

To measure performance, bpftool was used to collect the bpf_program_runtime_ns and bpf_program_amount metrics and export them to Grafana for analyzing eBPF program latency. With the data in Grafana, latency can be measured within a specific time window using the following expression:

rate(bpf_program_runtime_ns[1m]) / rate(bpf_program_amount[1m])
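
The expression divides the per-second growth of the total runtime by the per-second growth of the run counter, so the window length cancels out and what remains is the average time per program run. A Python sketch of the same arithmetic on two counter samples (the sample values are made up for illustration and are not measurements from this PR):

```python
def avg_runtime_ns(first: tuple[int, int], last: tuple[int, int]) -> float:
    """Average ns per program run between two (runtime_ns, run_count) samples."""
    delta_runtime = last[0] - first[0]
    delta_runs = last[1] - first[1]
    # No runs in the window means there is no latency to report
    return delta_runtime / delta_runs if delta_runs else 0.0


# Hypothetical counters sampled at the start and end of a 1-minute window
print(avg_runtime_ns((1_000_000, 500), (4_000_000, 2_000)))
```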

Improvements in eBPF Latency for Some Events (Compared to the Old Version of Tracee):
security_file_open: ~80% improvement (both with and without stress).
security_mmap_file: 74% improvement without stress and 48% with stress.
magic_write: 54% improvement without stress and 0% with stress.

Improvements in Lost Events:
security_file_open: ~97.65% improvement.
security_mmap_file: ~95.04% improvement.
magic_write: ~99.40% improvement.

Filtering in the eBPF plane means fewer events are placed in the perf buffer. This reduction is the primary benefit of in-kernel filtering: it significantly lowers eBPF latency and the number of lost events, improving overall performance.

3. Other comments

TODO

First Phase:

  • Evaluate and integrate all three types of string-based filters simultaneously. Currently, they operate independently and need to be combined.
  • Complete the implementation and testing of filters for exact, prefix, and suffix matches, including both equal and not equal for each one of these three filters.
  • In the function save_str_to_buf(), add the argument offset based on its index to facilitate direct access for the load_str_from_buf() function.
  • Document the testing steps for exact match, prefix, and suffix filters. Include policies and expected results for the review process.
  • When using the LPM Trie, it always returns the longest matching string. A corner case arises when one policy (e.g., policy1) uses a substring of another policy (e.g., policy2). For instance, if policy1 covers /etc/net* and policy2 covers /etc/netconf*, a lookup for /etc/netconfig currently only returns policy2 because it is the longest match. However, it should return both policy1 and policy2. A potential solution (work in progress) is to combine equality when such a corner case is detected.
  • Document the testing steps for corner case (prefix and suffix) filters. Include policies and expected results for the review process.
  • The filters for exact match, prefix, and suffix work with a total of 128 bytes (including the NULL terminator). Therefore, a maximum of 127 characters can be defined in the policy or provided via the CLI.
  • Improve the method for defining (in user-space) which events should have an in-kernel filter enabled. The current logic was added as a proof of concept and requires rework.
  • Measure performance between filter in user-space (older version) and filter in kernel-space (new version).
  • Create inner maps based on version and event id - in this way event id will be removed from the key.
  • Extend the filters to use 255 characters for exact, prefix and suffix. This is possible after removing event id from key, freeing 4 bytes.
  • Combine the bitmaps (using an OR operation) when multiple filters are used in the same policy (match_data_filters function);
  • Add validations/warnings for event ids that have a kernel data filter (e.g., disable the contains filter or enforce the 255-character filter limit);
  • Add unit tests and integration tests;
  • Update docs.

Second Phase:

  • The data kernel filter is currently enabled for security_file_open, magic_write, and security_mmap_file. Extend support to other events.
  • Currently the index for retrieving the pathname in evaluate_data_filters is explicitly defined. While this works, it would be better to dynamically retrieve the index based on the event ID for greater flexibility.
  • Currently only string filters are allowed. Design and implement a generic approach for other types of filters.

@geyslan (Member) left a comment

This is just a first pass (skimming) review.

Tested and working with:

sudo ./dist/tracee -e security_file_open.data.pathname=/etc/passwd
sudo ./dist/tracee -e security_file_open.data.pathname='/etc/passwd*'
sudo ./dist/tracee -e security_file_open.data.pathname='*passwd-'

Amazing, @rscampos! 👏🏼

Resolved review threads:
- pkg/utils/utils.go
- pkg/policy/ebpf.go (outdated)
- pkg/filters/data.go
- pkg/filters/data.go (outdated)
- pkg/filters/data.go (outdated)
- pkg/ebpf/c/types.h (outdated)
- pkg/ebpf/c/types.h (outdated)
- pkg/ebpf/c/tracee.bpf.c
@rscampos rscampos force-pushed the data_filter_in_kernel branch 2 times, most recently from ca7b783 to 5187326 Compare October 7, 2024 21:42
@rscampos (Collaborator, Author) commented Oct 7, 2024

@geyslan I've pushed some changes to how we retrieve the string from args in args_buffer_t. To make it work, I added a field to args_buffer_t and modified the save_str_to_buf function.

@rscampos rscampos force-pushed the data_filter_in_kernel branch 2 times, most recently from 295957e to 4b53fe8 Compare October 10, 2024 19:33
@itaysk (Collaborator) commented Oct 12, 2024

> When using the LPM Trie, it always returns the longest matching string. A corner case arises when one policy (e.g., policy1) uses a substring of another policy (e.g., policy2). For instance, if policy1 covers /etc/net* and policy2 covers /etc/netconf*, a lookup for /etc/netconfig currently only returns policy2 because it is the longest match. However, it should return both policy1 and policy2. A potential solution (work in progress) is to combine equality when such a corner case is detected.

This is important, it could have been a security vulnerability.
If you can't filter it correctly in kernel, perhaps an acceptable solution is to filter it again in userspace. We would still gain the performance improvement from reducing event submission attempts in the bpf program, and add a second pass in userspace to ensure correctness.

@rscampos (Collaborator, Author) commented

> This is important, it could have been a security vulnerability.
> If you can't filter it correctly in kernel, perhaps an acceptable solution is to filter it again in userspace. We would still gain the performance improvement from reducing event submission attempts in the bpf program, and add a second pass in userspace to ensure correctness.

Thank you for commenting on this, @itaysk. If filtering in the kernel isn't possible, I'll definitely try this solution.

@rscampos rscampos force-pushed the data_filter_in_kernel branch from 4b53fe8 to 20842f5 Compare October 18, 2024 12:20
@rscampos rscampos force-pushed the data_filter_in_kernel branch 3 times, most recently from f82b46d to a600fb5 Compare October 18, 2024 14:22
@yanivagman (Collaborator) commented Oct 18, 2024

> > This is important, it could have been a security vulnerability.
> > If you can't filter it correctly in kernel, perhaps an acceptable solution is to filter it again in userspace. We would still gain the performance improvement from reducing event submission attempts in the bpf program, and add a second pass in userspace to ensure correctness.
>
> Thank you for commenting on this, @itaysk. If filtering in the kernel isn't possible, I'll definitely try this solution.

There is no need to filter in userspace for such a corner case. It is possible to define the result of the longest prefix match so that it also contains the results of the policies with the substrings, since the return value of the map is the set of matched policies (or matched rules in the future).

@rscampos rscampos force-pushed the data_filter_in_kernel branch 2 times, most recently from e491453 to c87ab81 Compare October 23, 2024 20:25
@yanivagman (Collaborator) left a comment

Looks good Raphael!
I did a first review on this draft, looking forward to seeing it merged

Resolved review threads:
- pkg/ebpf/c/common/buffer.h (outdated)
- pkg/ebpf/c/common/consts.h
- pkg/ebpf/c/common/buffer.h (outdated)
- pkg/ebpf/c/types.h (outdated)
- pkg/ebpf/c/common/buffer.h (outdated)
- pkg/ebpf/c/common/filtering.h (outdated)
- pkg/ebpf/c/common/filtering.h (outdated)
- pkg/policy/ebpf.go (outdated)
- pkg/policy/ebpf.go (outdated)
@rscampos rscampos force-pushed the data_filter_in_kernel branch 4 times, most recently from 0cf3a5d to 9097b76 Compare November 13, 2024 15:02
@rscampos rscampos marked this pull request as ready for review November 13, 2024 19:51
@rscampos rscampos force-pushed the data_filter_in_kernel branch 2 times, most recently from c6469ce to 86b4d0f Compare November 14, 2024 10:30
Resolved review threads:
- pkg/ebpf/c/common/buffer.h (outdated)
- pkg/ebpf/c/common/buffer.h (outdated)
- pkg/ebpf/c/common/consts.h (outdated)
- pkg/ebpf/c/common/filtering.h (outdated)
- pkg/ebpf/c/common/filtering.h (outdated)
@rscampos rscampos force-pushed the data_filter_in_kernel branch from 86b4d0f to 88809d1 Compare November 14, 2024 18:25
@rscampos rscampos force-pushed the data_filter_in_kernel branch 2 times, most recently from d5d64b3 to bd095ca Compare December 9, 2024 21:54
@yanivagman (Collaborator) left a comment

LGTM
Great work @rscampos !

@rscampos rscampos force-pushed the data_filter_in_kernel branch 9 times, most recently from 7ee86fe to 83551f0 Compare December 13, 2024 10:14
@rscampos rscampos force-pushed the data_filter_in_kernel branch from de8e0a5 to b498708 Compare December 13, 2024 13:48
@rscampos rscampos force-pushed the data_filter_in_kernel branch from b498708 to 7192fb7 Compare December 13, 2024 17:20
@rscampos rscampos linked an issue Dec 13, 2024 that may be closed by this pull request
@aquasecurity aquasecurity deleted a comment from github-actions bot Dec 13, 2024
@rscampos rscampos force-pushed the data_filter_in_kernel branch from 7192fb7 to 03b6b4d Compare December 13, 2024 19:54
@rscampos (Collaborator, Author) commented

/fast-forward

@github-actions github-actions bot merged commit 03b6b4d into aquasecurity:main Dec 13, 2024
31 checks passed
@rscampos (Collaborator, Author) commented

Folks @yanivagman @geyslan,

Thank you for all the feedback! Learned a lot of good things during this work!

@geyslan (Member) commented Dec 13, 2024

Congrats for this amazing new feature! 🚀🥳

Successfully merging this pull request may close these issues.

basic in-kernel data filtering for select events