3 changes: 2 additions & 1 deletion cid-redirects.json
@@ -2820,7 +2820,8 @@
"/cid/10333": "/docs/send-data/opentelemetry-collector/remote-management/processing-rules",
"/cid/10334": "/docs/send-data/opentelemetry-collector/remote-management/processing-rules/include-and-exclude-rules",
"/cid/10335": "/docs/send-data/opentelemetry-collector/remote-management/processing-rules/include-and-exclude-rules",
"/cid/10336": "/docs/send-data/opentelemetry-collector/remote-management/processing-rules/mask-rules",
"/cid/10336": "/docs/send-data/opentelemetry-collector/remote-management/processing-rules/hash-and-mask-rules",
"/docs/send-data/opentelemetry-collector/remote-management/processing-rules/mask-rules": "/docs/send-data/opentelemetry-collector/remote-management/processing-rules/hash-and-mask-rules",
"/cid/9010": "/docs/send-data/opentelemetry-collector",
"/cid/9011": "/docs/send-data/opentelemetry-collector/install-collector/linux",
"/cid/9012": "/docs/send-data/opentelemetry-collector/install-collector/macos",
@@ -0,0 +1,203 @@
---
id: hash-and-mask-rules
title: OpenTelemetry Remote Management Hash and Mask Rules
sidebar_label: Hash and Mask Rules
description: Use hash and mask processing rules to replace an expression with the respective hash and mask strings.
---

## OpenTelemetry Remote Management Hash Rules

A hash rule is a processing rule that allows you to replace an expression with a hash code generated for that value. Hashed data is completely hidden (obfuscated) before being sent to Sumo Logic. This can be very useful in situations where certain types of data must not leave your premises, such as credit card numbers and Social Security numbers. Each unique value will have a unique hash code.

The hash algorithm used is **SHA-256**.

Ingestion volume is calculated after the hash filter is applied. If the hash reduces the log size, the smaller size will be measured against ingestion limits.

:::note
Hash rules are currently available for all standard source templates except Windows and Syslog.
:::

### How it works

When you add a hash rule action to your processing rules, you need to provide two inputs:

1. **Expression**: A regular expression that must contain exactly **one capture group** `( )`. The string value matched by this capture group will be hashed using SHA-256. If multiple parts of the string need to be hashed, add additional hashing rules for them.

2. **Replacement Format**: The formatted replacement string that will replace the matching string in the log. Use `%s` to refer to the hashed value from the SHA-256 function. The `%s` reference is mandatory and can only be used once.

### Examples

#### Hash a password

For example, to hash the password `Welcome123` from this log:

```
user=sumo password=Welcome123
```

You could use the following configuration:

**Expression:**
```
password=([A-Za-z0-9]+)
```

**Replacement Format:**
```
password=%s
```

**Result:**
- **Matching string**: `password=Welcome123`
- **Capture group**: `Welcome123` (this value is hashed)
- **Output log**: `user=sumo password=<hashed_value>`

Where `<hashed_value>` is the SHA-256 hash of `Welcome123`.
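The behavior described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the collector's implementation: the function name `apply_hash_rule` is hypothetical, and rendering the digest as lowercase hexadecimal is an assumption.

```python
import hashlib
import re


def apply_hash_rule(log: str, expression: str, replacement_format: str) -> str:
    """Replace each match of `expression` with `replacement_format`,
    substituting %s with the SHA-256 digest of the single capture group."""
    pattern = re.compile(expression)
    if pattern.groups != 1:
        raise ValueError("expression must contain exactly one capture group")
    if replacement_format.count("%s") != 1:
        raise ValueError("replacement format must contain %s exactly once")

    def hash_match(match: re.Match) -> str:
        # Assumption: the digest is rendered as lowercase hexadecimal.
        digest = hashlib.sha256(match.group(1).encode("utf-8")).hexdigest()
        return replacement_format.replace("%s", digest)

    return pattern.sub(hash_match, log)


hashed_log = apply_hash_rule(
    "user=sumo password=Welcome123",
    r"password=([A-Za-z0-9]+)",
    "password=%s",
)
print(hashed_log)
```

Only the text inside the capture group is hashed. The rest of the matching string (`password=`) is rebuilt from the replacement format, and text outside the match (`user=sumo`) is left untouched.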

#### Hash member IDs

To hash member IDs from this log:

```
2012-05-16 09:43:39,607 -0700 DEBUG [hostId=prod-cass-raw-8] [module=RAW] [logger=scala.raw.InboundRawProtocolHandler] [memberid=dan@demo.com] [remote_ip=98.248.40.103] [web_session=19zefhqy...] [session=80F1BD83AEBDF4FB] [customer=0000000000000005] [call=InboundRawProtocol.getMessages]
```

You could use the following configuration:

**Expression:**
```
memberid=([^\]]+)
```

**Replacement Format:**
```
memberid=%s
```

**Resulting hashed log:**

```
2012-05-16 09:43:39,607 -0700 DEBUG [hostId=prod-cass-raw-8] [module=RAW] [logger=scala.raw.InboundRawProtocolHandler] [memberid=906e9cc124c8e1085b10e1cec4cc6526f3637558be361d3b4bb54bb537e49a49] [remote_ip=98.248.40.103] [web_session=19zefhqy...] [session=80F1BD83AEBDF4FB] [customer=0000000000000005] [call=InboundRawProtocol.getMessages]
```

:::important
Any hashing expression should be tested and verified on a sample source file before being applied to your production logs.
:::

### Rules and limitations

* The regular expression must contain exactly **one capture group** enclosed in `( )`. Values inside this capture group will be hashed. If multiple parts of the string need to be hashed, add additional hashing rules for them.

* You can use an anchor to detect specific values in your logs. Only the value within the capture group will be hashed.

* The hash algorithm is **SHA-256** (MD5 is not supported for OpenTelemetry collectors).

* Make sure you do not specify a regular expression that matches a full log line. Doing so will hash the entire log line.

* The replacement format must include `%s` exactly once to reference the hashed value.

* Do not unnecessarily match on more of the log than needed. Use precise regular expressions to ensure that only the intended sensitive information is hashed, not the surrounding context.

* Each unique value will produce a unique hash code. The same input value will always produce the same hash output, allowing you to correlate occurrences while keeping the actual value hidden.
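The correlation property in the last rule follows from SHA-256 being deterministic. A minimal Python sketch of that property, where the helper name `sha256_hex` and the second member ID are hypothetical:

```python
import hashlib


def sha256_hex(value: str) -> str:
    """Return the SHA-256 digest of `value` as a hex string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


first_seen = sha256_hex("dan@demo.com")
seen_again = sha256_hex("dan@demo.com")
other_member = sha256_hex("ann@demo.com")  # hypothetical second member ID

# The same input always yields the same digest, so occurrences can be correlated.
print(first_seen == seen_again)
# Different inputs yield different digests.
print(first_seen == other_member)
```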

## OpenTelemetry Remote Management Mask Rules

:::note
This document does not cover masking logs for Windows source templates. For details on masking logs for Windows, refer to [OpenTelemetry Remote Management Windows Source Template Mask Rules](/docs/send-data/opentelemetry-collector/remote-management/processing-rules/mask-rules-windows/).
:::

A mask rule is a processing rule that hides irrelevant or sensitive information from logs before they are ingested. When you create a mask rule, the selected expression will be replaced with a mask string before the data is sent to Sumo Logic. You can either specify a custom mask string or use the default `"#####"`.

Ingestion volume is calculated after applying the mask filter. If the mask reduces the size of the log, the smaller size will be measured against ingestion limits. Masking is an effective method to reduce overall ingestion volume.

### Examples

#### Mask an email address

For example, to mask the email address `dan@demo.com` from this log:

`2018-05-16 09:43:39,607 -0700 DEBUG [hostId=prod-cass-raw-8] [module=RAW] [logger=scala.raw.InboundRawProtocolHandler] [auth=User:dan@demo.com] [remote_ip=98.248.40.103] [web_session=19zefhqy...] [session=80F1BD83AEBDF4FB] [customer=0000000000000005] [call=InboundRawProtocol.getMessages]`

You could use the following filter expression:

```
auth=User:.*\.com
```

Using the masking string `auth=User:AAA` would produce the following result:

`2018-05-16 09:43:39,607 -0700 DEBUG [hostId=prod-cass-raw-8] [module=RAW] [logger=scala.raw.InboundRawProtocolHandler] [auth=User:AAA] [remote_ip=98.248.40.103] [web_session=19zefhqy...] [session=80F1BD83AEBDF4FB] [customer=0000000000000005] [call=InboundRawProtocol.getMessages]`
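The substitution above can be approximated with Python's `re.sub`. This is a sketch only: the collector applies the rule through OTTL rather than Python, the function name `apply_mask_rule` is hypothetical, and the log line is shortened for readability.

```python
import re

DEFAULT_MASK = "#####"


def apply_mask_rule(log: str, expression: str, mask_string: str = DEFAULT_MASK) -> str:
    """Replace every match of `expression` in `log` with `mask_string`."""
    # A callable replacement keeps the mask string literal (no escape processing).
    return re.sub(expression, lambda _match: mask_string, log)


log = "[auth=User:dan@demo.com] [remote_ip=98.248.40.103]"  # shortened sample
masked = apply_mask_rule(log, r"auth=User:.*\.com", "auth=User:AAA")
default_masked = apply_mask_rule(log, r"auth=User:.*\.com")

print(masked)
print(default_masked)
```

Because `.*` is greedy, the expression extends to the last `.com` on the line. If other `.com` strings can appear later in your logs, a narrower pattern is likely safer.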


:::important
Any masking expression should be tested and verified with a sample source file before applying it to your production logs.
:::

#### Mask credit card numbers

You can mask credit card numbers in log messages using a regular expression within a mask rule. Once the numbers are masked with a known string, you can search for that string in your logs to detect whether credit card numbers are leaking into your log files.

To mask credit card numbers, use the following regular expression in a masking filter. It matches American Express, Visa (16-digit only), Mastercard, and Discover credit card numbers:

```
((?:(?:4\d{3})|(?:5[1-5]\d{2})|6(?:011|5[0-9]{2}))(?:-?|\040?)(?:\d{4}(?:-?|\040?)){3}|(?:3[47]\d{2})(?:-?|\040?)\d{6}(?:-?|\040?)\d{5})
```

This regular expression covers numbers written with dashes, with spaces, or as a solid string of digits.

Samples include:

* **American Express**. 3711-078176-01234  \|  371107817601234  \|  3711 078176 01234
* **Visa**. 4123-5123-6123-7123  \|  4123512361237123  \|  4123 5123 6123 7123
* **Mastercard**. 5123-4123-6123-7123  \|  5123412361237123  \|  5123 4123 6123 7123
* **Discover**. 6011-0009-9013-9424  \|  6500000000000002  \|  6011 0009 9013 9424
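As a quick check before deploying, the expression can be run against the samples above. The sketch below uses Python's `re` module, which is an assumption (the collector's regex engine may differ in details). The American Express prefix is written as `3[47]` so that the character class matches only `34` and `37`. The surrounding log text and the helper `mask_cards` are hypothetical.

```python
import re

CARD_PATTERN = re.compile(
    r"((?:(?:4\d{3})|(?:5[1-5]\d{2})|6(?:011|5[0-9]{2}))(?:-?|\040?)"
    r"(?:\d{4}(?:-?|\040?)){3}"
    r"|(?:3[47]\d{2})(?:-?|\040?)\d{6}(?:-?|\040?)\d{5})"
)

SAMPLES = [
    "3711-078176-01234", "371107817601234", "3711 078176 01234",              # American Express
    "4123-5123-6123-7123", "4123512361237123", "4123 5123 6123 7123",         # Visa
    "5123-4123-6123-7123", "5123412361237123", "5123 4123 6123 7123",         # Mastercard
    "6011-0009-9013-9424", "6500000000000002", "6011 0009 9013 9424",         # Discover
]


def mask_cards(log: str, mask_string: str = "#####") -> str:
    """Replace every card number matched by CARD_PATTERN with the mask string."""
    return CARD_PATTERN.sub(lambda _match: mask_string, log)


for sample in SAMPLES:
    # Hypothetical log line wrapped around each sample number.
    print(mask_cards(f"payment card={sample} status=ok"))
```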


### Rules and limitations

* The regular expression you provide must select the expressions you want masked. The mask string replaces the entire string that the regular expression matches.

For example, for this log message:

`{
"reqHdr":{
"auth":"Basic ksoe9wudkej2lfj*jshd6sl.cmei=",
"cookie":"$Version=0; JSESSIONID=6C1BR5DAB897346B70FD2CA7SD4639.localhost_bc; $Path=/"
}
}`

You would use the following as a mask expression to mask the auth parameter's token:

```
"auth"\s*:\s*"Basic\s*[^"]+"
```

Applying the masking string `"auth":"#####"`, the log output will be:

`{
"reqHdr": {
"auth":"#####",
"cookie":"$Version=0; JSESSIONID=6C1BR5DAB897346B70FD2CA7SD4639.localhost_bc; $Path=/"
}
}`

* Do not match more of the log than needed. Keep the expression as narrow as in the previous example so that only the sensitive information is masked. An overly broad expression such as the following matches the entire log, so the whole entry would be replaced by the mask string:

```
(?s).*auth"\s*:\s*"Basic\s*([^"]+)".*(?s)
```

* Avoid regular expressions that match an entire log line, as this will result in the entire line being masked.

* To mask values spanning multiple lines, use the single-line modifier `(?s)`. For example:

```
auth=User\:(.*(?s).*session=.*?)\]
```
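The difference between a precise and an overly broad mask expression, as described in the rules above, can be sketched in Python. This is an illustration only: the helper `mask` is hypothetical, and the trailing `(?s)` of the broad expression is omitted because Python accepts global flags only at the start of a pattern.

```python
import re

LOG = '''{
"reqHdr":{
"auth":"Basic ksoe9wudkej2lfj*jshd6sl.cmei=",
"cookie":"$Version=0; JSESSIONID=6C1BR5DAB897346B70FD2CA7SD4639.localhost_bc; $Path=/"
}
}'''

PRECISE = r'"auth"\s*:\s*"Basic\s*[^"]+"'
BROAD = r'(?s).*auth"\s*:\s*"Basic\s*([^"]+)".*'


def mask(log: str, expression: str, mask_string: str) -> str:
    """Replace every match of `expression` with the literal `mask_string`."""
    return re.sub(expression, lambda _match: mask_string, log)


# Only the auth entry is replaced; the cookie line is preserved.
precise_result = mask(LOG, PRECISE, '"auth":"#####"')
# The expression matches the whole log, so everything is replaced.
broad_result = mask(LOG, BROAD, "#####")

print(precise_result)
print(broad_result)
```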

:::note
- Masking utilizes the [replace_pattern](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/ottl/ottlfuncs/README.md#replace_pattern) OTTL function. In this function:
   - Escape `$` as `$$` to bypass environment variable substitution logic.
   - Use `$$$` to include a literal `$`.
- When masking strings containing special characters like double quotes (`"`) and backslashes (`\`), these characters will be escaped by a backslash when masking the logs.
:::
@@ -19,6 +19,12 @@ To configure processing rules, navigate to the remote management section in the
In this section, we'll introduce the following concepts:

<div className="box-wrapper" >
<div className="box smallbox card">
<div className="container">
<a href={useBaseUrl('docs/send-data/opentelemetry-collector/remote-management/processing-rules/overview')}><img src={useBaseUrl('img/icons/operations/rules.png')} alt="Rules icon" width="40"/><h4>OTRM Overview</h4></a>
<p>Get an overview of how to use processing rules to specify what kind of data is sent to Sumo Logic using OpenTelemetry remote management.</p>
</div>
</div>
<div className="box smallbox card">
<div className="container">
<a href={useBaseUrl('docs/send-data/opentelemetry-collector/remote-management/processing-rules/include-and-exclude-rules')}><img src={useBaseUrl('img/icons/operations/rules.png')} alt="Rules icon" width="40"/><h4>OTRM Include and Exclude Rules</h4></a>
@@ -27,8 +33,8 @@ In this section, we'll introduce the following concepts:
</div>
<div className="box smallbox card">
<div className="container">
<a href={useBaseUrl('docs/send-data/opentelemetry-collector/remote-management/processing-rules/mask-rules')}><img src={useBaseUrl('img/icons/operations/rules.png')} alt="Rules icon" width="40"/><h4>OTRM Mask Rules</h4></a>
<p>Create an OTRM mask rule to replace an expression with a mask string.</p>
<a href={useBaseUrl('docs/send-data/opentelemetry-collector/remote-management/processing-rules/hash-and-mask-rules')}><img src={useBaseUrl('img/icons/operations/rules.png')} alt="Rules icon" width="40"/><h4>OTRM Hash and Mask Rules</h4></a>
<p>Create an OTRM hash and mask rule to replace an expression with the respective hash and mask string.</p>
</div>
</div>
<div className="box smallbox card">
@@ -6,7 +6,7 @@ description: Create an OpenTelemetry remote management Windows source template m
---

:::note
This document supports masking logs specifically for our [Windows source template](/docs/send-data/opentelemetry-collector/remote-management/source-templates/windows). For other source templates, refer to [Mask Rules](mask-rules.md).
This document supports masking logs specifically for our [Windows source template](/docs/send-data/opentelemetry-collector/remote-management/source-templates/windows). For other source templates, refer to [Hash and Mask Rules](hash-and-mask-rules.md).
:::

A mask rule is a type of processing rule that hides irrelevant or sensitive information from logs before they are ingested. When you create a mask rule:

This file was deleted.
