Skip to content

Extended Request Data Collection#5709

Merged
CarlesDD merged 42 commits intomasterfrom
ccapell/asm-extended-data-collection
May 26, 2025
Merged

Extended Request Data Collection#5709
CarlesDD merged 42 commits intomasterfrom
ccapell/asm-extended-data-collection

Conversation

@CarlesDD
Copy link
Contributor

@CarlesDD CarlesDD commented May 13, 2025

What does this PR do?

Provides the feature to collect additional data on Appsec events.

Request and response headers collection

Adds the APPSEC_COLLECT_ALL_HEADERS flag, which enables collection of all request and response headers. This feature is disabled by default.

Adds the APPSEC_HEADER_COLLECTION_REDACTION_ENABLED flag, which enables header redaction. This feature is true by default. (The redaction is out of the scope, right now we only want to collect the headers without redaction).

Introduces the APPSEC_MAX_COLLECTED_HEADERS setting to limit the maximum number of headers collected.

Updates the Reporter.finishRequest logic to collect all headers when APPSEC_COLLECT_ALL_HEADERS is enabled. Allowed headers are prioritized and must be collected if present.

If the number of headers exceeds APPSEC_MAX_COLLECTED_HEADERS, the following tags are added to the span indicating the number of discarded headers:

  • dd.appsec.request.header_collection.discarded
  • dd.appsec.response.header_collection.discarded

Request body collection

Adds the APPSEC_RASP_COLLECT_REQUEST_BODY flag, which enables collection of request body on RASP events. This feature is disabled by default.

When APPSEC_RASP_COLLECT_REQUEST_BODY is enabled and there is a RASP event put the same parsed request body that is sent to the WAF via meta_struct with http.request.body key. The body data is truncated under certain conditions. When it happens, the boolean tag _dd.appsec.rasp.request_body_size.exceeded is added

Updates the Reporter.reportAttack logic to collect request body when APPSEC_RASP_COLLECT_REQUEST_BODY is enabled.

Motivation

Meeting the need of more information for the in-depth investigation of high-risk events.

Additional Notes

APPSEC-57290
RFC-1031

ST

@github-actions
Copy link
Contributor

github-actions bot commented May 13, 2025

Overall package size

Self size: 9.45 MB
Deduped: 103.48 MB
No deduping: 104 MB

Dependency sizes | name | version | self size | total size | |------|---------|-----------|------------| | @datadog/libdatadog | 0.5.1 | 29.73 MB | 29.73 MB | | @datadog/native-appsec | 8.5.2 | 19.33 MB | 19.34 MB | | @datadog/pprof | 5.8.0 | 12.55 MB | 12.92 MB | | @datadog/native-iast-taint-tracking | 4.0.0 | 11.72 MB | 11.73 MB | | @opentelemetry/core | 1.30.1 | 908.66 kB | 7.16 MB | | protobufjs | 7.4.0 | 2.77 MB | 5.42 MB | | @datadog/wasm-js-rewriter | 4.0.1 | 2.85 MB | 3.58 MB | | @datadog/native-metrics | 3.1.1 | 1.02 MB | 1.43 MB | | @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB | | import-in-the-middle | 1.13.1 | 117.64 kB | 839.26 kB | | source-map | 0.7.4 | 226 kB | 226 kB | | opentracing | 0.14.7 | 194.81 kB | 194.81 kB | | lru-cache | 7.18.3 | 133.92 kB | 133.92 kB | | pprof-format | 2.1.0 | 111.69 kB | 111.69 kB | | @datadog/sketches-js | 2.1.1 | 109.9 kB | 109.9 kB | | lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB | | ignore | 5.3.2 | 53.63 kB | 53.63 kB | | istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB | | rfdc | 1.4.1 | 27.15 kB | 27.15 kB | | @isaacs/ttlcache | 1.4.1 | 25.2 kB | 25.2 kB | | dc-polyfill | 0.1.8 | 25.08 kB | 25.08 kB | | tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB | | shell-quote | 1.8.2 | 23.54 kB | 23.54 kB | | limiter | 1.1.5 | 23.17 kB | 23.17 kB | | retry | 0.13.1 | 18.85 kB | 18.85 kB | | semifies | 1.0.0 | 15.84 kB | 15.84 kB | | jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB | | crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB | | ttl-set | 1.0.0 | 4.61 kB | 9.69 kB | | mutexify | 1.4.0 | 5.71 kB | 8.74 kB | | path-to-regexp | 0.1.12 | 6.6 kB | 6.6 kB | | koalas | 1.0.2 | 6.47 kB | 6.47 kB | | module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

@codecov
Copy link

codecov bot commented May 13, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.24%. Comparing base (1c17b95) to head (367d3a2).
Report is 12 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5709      +/-   ##
==========================================
+ Coverage   78.92%   79.24%   +0.31%     
==========================================
  Files         515      520       +5     
  Lines       23496    23826     +330     
==========================================
+ Hits        18544    18880     +336     
+ Misses       4952     4946       -6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@datadog-datadog-prod-us1
Copy link

datadog-datadog-prod-us1 bot commented May 13, 2025

Datadog Summary

✅ Code Quality    ✅ Code Security    ✅ Dependencies


Was this helpful? Give us feedback!

@pr-commenter
Copy link

pr-commenter bot commented May 13, 2025

Benchmarks

Benchmark execution time: 2025-05-23 13:40:39

Comparing candidate commit 367d3a2 in PR branch ccapell/asm-extended-data-collection with baseline commit 1c17b95 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 1269 metrics, 54 unstable metrics.

@datadog-datadog-prod-us1
Copy link

datadog-datadog-prod-us1 bot commented May 14, 2025

Datadog Report

Branch report: ccapell/asm-extended-data-collection
Commit report: 2d6100d
Test service: dd-trace-js-integration-tests

✅ 0 Failed, 1153 Passed, 0 Skipped, 20m 28.6s Total Time

@CarlesDD CarlesDD marked this pull request as ready for review May 14, 2025 07:08
@CarlesDD CarlesDD requested review from a team as code owners May 14, 2025 07:08
Copy link
Collaborator

@uurien uurien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I miss integration or plugin style tests testing the whole feature.

@CarlesDD CarlesDD force-pushed the ccapell/asm-extended-data-collection branch from 74601af to cc38d6d Compare May 20, 2025 06:45
@CarlesDD
Copy link
Contributor Author

@uurien New integration test have been added in order to test the feat.

CarlesDD and others added 4 commits May 22, 2025 07:22
Co-authored-by: simon-id <simon.id@datadoghq.com>
Co-authored-by: simon-id <simon.id@datadoghq.com>
Co-authored-by: simon-id <simon.id@datadoghq.com>
Co-authored-by: simon-id <simon.id@datadoghq.com>
@CarlesDD CarlesDD marked this pull request as draft May 22, 2025 07:21
@CarlesDD CarlesDD marked this pull request as ready for review May 22, 2025 14:23
simon-id
simon-id previously approved these changes May 22, 2025

async function assertHeadersReported (requestHeaders, responseHeaders) {
await agent.assertMessageReceived(({ headers, payload }) => {
// Request headers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: I would like to check _dd.appsec.rasp.request_body_size.exceeded tag exist in this integration test but it's not a blocker

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a test case to check the tag and the reported truncated req body

IlyasShabi
IlyasShabi previously approved these changes May 22, 2025
@CarlesDD CarlesDD dismissed stale reviews from IlyasShabi and simon-id via f1885a0 May 22, 2025 16:29
IlyasShabi
IlyasShabi previously approved these changes May 22, 2025
Copy link
Contributor

@IlyasShabi IlyasShabi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

'connection'
]

const expectedResponseHeaders = [
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a comment to explain why do we expect 22 of x-datadog-res- and not 25? I expend few minutes understanding that it was because the DD_APPSEC_MAX_COLLECTED_HEADERS=25

DD_APPSEC_RASP_ENABLED: true,
DD_APPSEC_RULES: path.join(cwd, 'appsec/rasp/rasp_rules.json')
DD_APPSEC_RULES: path.join(cwd, 'appsec/rasp/rasp_rules.json'),
DD_APPSEC_RASP_COLLECT_REQUEST_BODY: true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for headers you are testing also when the configuration is not enabled, could you do the same here?

return { value: target.slice(0, COLLECTED_REQUEST_BODY_MAX_STRING_LENGTH), truncated: true }
}
return { value: target, truncated: false }
case 'object': {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this code block? {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To limit the scope of the variables declared in the case. See this

return { value: truncatedArray, truncated: wasTruncated }
}

if (typeof target.toJSON === 'function') {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be before the Array.isArray?

@CarlesDD CarlesDD merged commit 3ed46b7 into master May 26, 2025
499 checks passed
@CarlesDD CarlesDD deleted the ccapell/asm-extended-data-collection branch May 26, 2025 13:09
ghost pushed a commit that referenced this pull request May 27, 2025
* Headers collection config

* Revamp headers collection + extended collection

* Fix condition for extended collection

* Body collection config

* Check for RASP events

* Testing extend header collection

* RASP body collection tests

* Minor reformats

* Lint

* Clearer condition

* Truncate body on body collection

* Complete config test

* Fix lint

* Reporter init w/ appsec config

* Remove TODO

* Clarify collection comments

* Update test.ts with new configuration

* Fix linting

* Integration test for RASP request body collection

* Set tag when reported request body is truncated

* Integration test for data collection

* Fix linting

* Check for header names in integration test

* Set the correct value for request body exceeded size tag

* Add support for toJSON in request body truncation

* Invert condition for early return

* Avoid slicing on request body truncation and implement it in a for loop

* Rename reporter config

* Simplify condition logic

* Create block scope in switch case to limit the scope of declared vars

* Initialize Array with a fixed size

Co-authored-by: simon-id <simon.id@datadoghq.com>

* Switch to for loop

Co-authored-by: simon-id <simon.id@datadoghq.com>

* Set env var as strings

Co-authored-by: simon-id <simon.id@datadoghq.com>

* Set env var as string

Co-authored-by: simon-id <simon.id@datadoghq.com>

* Refactor headers group declaration + using set instead of arrays

* Cache res.getHeaders()

* Fix config and its test

* Fix config test for defaults values

* Check reported request body truncation on integration test

* Add a comment to clarify the test case

* Add a test case to check no request body is collected when feat is disabled

* Manage custom toJSON function in arrays

---------

Co-authored-by: simon-id <simon.id@datadoghq.com>
@ghost ghost mentioned this pull request May 27, 2025
Comment on lines +362 to +370
try {
await axios.post('/ssrf', requestBody)
} catch (e) {
if (!e.response) {
throw e
}

await assertBodyReported(requestBody)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just saw this by chance and noticed that we use try / catch here and in other tests. Sadly, we don't have a guarantee that these tests actually work as anticipated due to not checking if the post call would pass (which is likely not what is expected here).

I don't know about a eslint rule to enforce assert.throws / assert.rejects but that would solve that issue. It would detect if the test would not throw an error.

watson pushed a commit that referenced this pull request Jun 4, 2025
* Headers collection config

* Revamp headers collection + extended collection

* Fix condition for extended collection

* Body collection config

* Check for RASP events

* Testing extend header collection

* RASP body collection tests

* Minor reformats

* Lint

* Clearer condition

* Truncate body on body collection

* Complete config test

* Fix lint

* Reporter init w/ appsec config

* Remove TODO

* Clarify collection comments

* Update test.ts with new configuration

* Fix linting

* Integration test for RASP request body collection

* Set tag when reported request body is truncated

* Integration test for data collection

* Fix linting

* Check for header names in integration test

* Set the correct value for request body exceeded size tag

* Add support for toJSON in request body truncation

* Invert condition for early return

* Avoid slicing on request body truncation and implement it in a for loop

* Rename reporter config

* Simplify condition logic

* Create block scope in switch case to limit the scope of declared vars

* Initialize Array with a fixed size

Co-authored-by: simon-id <simon.id@datadoghq.com>

* Switch to for loop

Co-authored-by: simon-id <simon.id@datadoghq.com>

* Set env var as strings

Co-authored-by: simon-id <simon.id@datadoghq.com>

* Set env var as string

Co-authored-by: simon-id <simon.id@datadoghq.com>

* Refactor headers group declaration + using set instead of arrays

* Cache res.getHeaders()

* Fix config and its test

* Fix config test for defaults values

* Check reported request body truncation on integration test

* Add a comment to clarify the test case

* Add a test case to check no request body is collected when feat is disabled

* Manage custom toJSON function in arrays

---------

Co-authored-by: simon-id <simon.id@datadoghq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants