-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
internal/tsdb/filter: add support for white-list filtering #43
Comments
You can achieve this today. Construct a ruleset like so:
This way, the engine will pass only those points that match any of the two whitelist rules. I'm a bit confused though because this ruleset is essentially the same as the one you gave as an example, which makes me think I'm not understanding the issue. How is this ruleset insufficient for whitelisting purposes? |
Unfortunately you can't have a whitelist rule because a rule configured with "Block": false is not accepted by the validation logic (it is considered a no-op, see validate()). We are also missing the logic to immediately accept a point by a matching whitelist rule (see Eval()). The way I implemented support for whitelist rules is by representing Block as *bool instead of bool type and by changing the logic of the Eval() function (once a rule matches and Block is set return (not Block). As soon as I find some spare time I will adjust my implementation to the current version of the code base and I'll push it for review. |
@masiulaniec could you please review my commit. I will create a pull request once I receive your feedback. |
I will take a look in the evening. On Mon, Sep 5, 2016 at 7:21 AM, Sebastian Cotarla [email protected]
|
Alternative proposal(1) Leave definition of Block intact (2) Introduce a new field: Pass Pass complements Block. It also terminates the evaluation but with the opposite result. A rule where A Pass rule must take one of the following forms:
(3) Require that the final rule of any ruleset is an unconditional Pass or Block That is one of:
DiscussionIf we can avoid complicating the definition of In English, the words pass and block are antonyms. This makes for more rapid comprehension than Point (3) is about readability: be explicit about whether the filter is a whitelist or a blacklist. This saves a trip to documentation. In fact, the manpage doesn't even document that the default action is pass. With (3) in effect, there is less to document. Example: a whitelist ruleset
Example: a blacklist ruleset
Notice how nicely symmetric these two cases are. |
Thanks! I agree that introducing a new field Pass improves readability although it gives the user the option to combine the Block and Pass in one invalid rule (which of course we will reject at validation). What if we slowly deprecate Block and introduce a new optional field Action or Do with two possible values "block" or "pass"? This is even more readable and extensible with the benefit of not requiring extra validations because there is only one action field. I appreciate the utility of point (3) but if we introduce it we create a breaking change which will be problematic to roll out in a production environment - the forwarders implementing this requirement will reject filters returned by the old version of the controller and if we roll out the new controller first the old version forwarders will fail to parse the new rule (be it either "Pass": true or "Do": "pass"). I would rather improve the documentation and not impose any restrictions on how the filter ends. Example: a whitelist ruleset
Example: a blacklist ruleset
|
I mildly prefer the new-action-new-field model. I suspect it might actually benefit extensibility. If we decide to parameterize Pass, for example to make it conditional, I would prefer to see:
Rather than:
Do you mean from cpu efficiency point of view? Validations are rare: they happen only at ruleset load time. So it's not worth optimizing. Or do you mean from development effort point of view? I think the effort required to add a validation rule is low. Besides, it's a bad trade to save minutes of dev time in exchange for a bad schema decision, whose consequences can be felt for years.
I don't feel too strongly about this. You make a good point about transition troubles. It can be made practical by dropping the unconditionality requirement. Then, make use of the rule that sets the cluster tag. Arrange for it to be the final rule, and add |
I also prefer new-action-new-field model however in this particular case the two actions are mutual exclusive while having the same processing model (stop evaluating rules further) and to me it feels more natural to have only one field for both. A good language should not allow the user to abuse it's syntax. This reasoning combined with the intention to make minimal changes in the configuration schema lead me to overload the meaning of Block attribute with the assumed downside of lack of clarity ("Block": false - does not explicitly mean "pass"). Since we are fine with changing the syntax probably "Return": "pass|block" would be even more appropriate than Do vs. Block vs. Pass To be honest I don't expect the need to parameterize Return and actually I think we should never allow it because it will add too much complexity (I could even propose a syntax for this but I won't because I really don't think we should go in that direction). These being said I will let you to make the call Block|Pass: true vs. Return: block|pass
I was referring to schema design which limits the number of invalid combinations which in turn limits the number of lines of code needed for validation.
So basically you say we should just make adjustments in the list of rules returned by controller. I had a look on how it is implemented and it turns out the controller produces the rules for setting the cluster and host tags and delegates the responsibility of generating custom filtering rules to an external program. This means the external program will decide if it wants to do white-list or black-list (will decide which is the last rule in the list). To make it work in the white-list mode we just need to make sure that custom filtering rules are appended to cluster & host rules. Current implementation prepends the custom rules and this is fine for black-list mode but in white-list mode the cluster & host rules will not have a chance to execute. |
@masiulaniec I'm back from a short holiday and will look for some spare time to code an agreed option. Hopefully you will find some spare time to comment on the above. |
I think either is fine.
One could disable the prepend logic in controller if a custom filter program is defined. |
Ok then, I will try the Return: block|pass option, Block: true will still be supported but I will add validations to not allow both in the same rule. Do you see any strong reasons why we should not append the rules returned by the custom program filter. I would prefer this approach. |
This issue has made it clear that prepend was a wrong choice. I think that
Two things to pay attention to:
|
Although it has the advantage of keeping all filtering rules in one place and provides full flexibility for users (e.g. they will be able to choose to not add host and cluster tags or to use different tag names) the above change has also one drawback: users that used custom filtering will have to adjust their filtering program to return the rules which set cluster and host tags. For me cluster and host addition by default is a fine feature of TSP and I believe users should not be burden with it and also don't see a real need for the flexibility of interfering with these tags. Appending the custom filters fixes the prepend issue, does not put any burden on users and supports both black list and white list filtering modes. It introduces one difference at run time: in blacklist mode it will add cluster and host tags to some points that might be discarded by custom filter rules; prepending is more efficient from this point of view because it only adds the tags to points that were not blocked by custom filters. |
s/prepend/append/ sounds okay to me, too. |
:) great. I will rename the issue to "add support for white-list filtering" |
At the moment filtering works only in "black list" mode, once you specify a filtering rule everything else is accepted. There are cases when it is useful to operate in a "white list" mode - in other words allow one to configure all the rules which describe what points are allowed to pass and block everything else.
E.g. Sample configuration which will accept only metric points which have attribute "path" starting with "/app". Any other metric point will be blocked/dropped.
"Filter": [
{ "Match": ["", "path", "^/app"], "Block": false },
{ "Block": true }
]
I propose to enhance the meaning of the Block attribute: true means "block", false means "pass" and when not specified it means the rule will not stop the evaluation of the following rules (e.g. the case of Set rules).
The text was updated successfully, but these errors were encountered: