Skip to content

[RFC] Drop |raw in favor of a more secure approachΒ #4754

@Toflar

Description

@Toflar

I would like to propose deprecating (and in future drop) the use of |raw which is quite a radical change of the current concept, I'm very much aware of that 😊 But bear with me for a second here - I named it RFC for a reason πŸ˜‰

Here's why I think the concept of |raw in templates is fundamentally flawed.
If you take a step back and think of what {{ variable|raw }} does, the first thought that comes to your mind because of how the filter is named is probably:

Hello Twig, output variable exactly as it is, I know it is safe here.

And this thought is exactly what I would like to challenge.
Because in projects where you write the controllers and you control the templates. you know exactly what input you have, where you pass the variables, in what context etc. - you are in full control.
Hence, this makes you think (!) that {{ variable|raw }} is safe and you can ouput it |raw.
But can you?

Here's the thing: I think, this assumption is wrong.
Because very often, the ones that write the templates and the ones that use them (aka write the controllers), are not the same people.

Imagine you have template designer Alice that designs a table.html.twig:

<table>
    {% for row in rows %}
        <tr>
            {% for cell in row %}
                <td>{{ cell }}</td>
            {% endfor %}
        </tr>
    {% endfor %}
</table>

That's great, developer Bob can now use this template when writing his controller:

$rows = [
    ['Cell 1', 'Cell 2', 'Cell 3'],
    ['Cell 4', 'Cell 5', 'Cell 6'],
];

echo $twig->render('table.html.twig', [
    'rows' => $rows,
]);

Nice! Now two weeks later, developer Eve also needs to output content formatted as a table, in a completely different context. She needs to output user input in one of the cells, though:

$rows = [
    ['Cell 1', 'Cell 2', 'Cell 3'],
    ['Cell 4', 'Cell 5', $request->query->get('user-input')],
];

echo $twig->render('table.html.twig', [
    'rows' => $rows,
]);

Well, no problem. She checks the template, sees that it contains <td>{{ cell }}</td> and knows that Twig's output encoding is going to take care of that properly. All good!

Now another two weeks later, developer Bob comes back to his controller and his task is to wrap Cell 1 into a <strong>Cell 1</strong>.
So he asks template designer Alice to adjust the template to allow for that.
Now what does template designer Alice do?
If we are lucky, she knows (or can see from the source code, if the project lives in the same git repository) that developer Eve passes untrusted input and so she'll go for

<table>
    {% for row in rows %}
        <tr>
            {% for cell in row %}
                <td>{{ cell|sanitize_html }}</td>
            {% endfor %}
        </tr>
    {% endfor %}
</table>

Or she implements some special logic where one can say which cells should be wrapped in <strong></strong> (but then again, what if tomorrow someone needs another formatting option?).

Phew! That was close! But what if she did not know this template was used elsewhere? What if there's a general repository for some company CI/CD and re-usable Twig templates used in all sorts of projects?
Would she go for

<table>
    {% for row in rows %}
        <tr>
            {% for cell in row %}
                <td>{{ cell|raw }}</td>
            {% endfor %}
        </tr>
    {% endfor %}
</table>

and thus introduce a security vulnerability without knowing it? πŸ’₯

And here's why I think that

Hello Twig, output variable exactly as it is, I know it is safe here.

is always the wrong thing to say. You never want to allow |raw values which essentially translates to "whatever format".
What you actually want to say in this case is

Hello Twig, if variable is safe for the current escaping strategy (so html because derived from table.html.twig), you can output it raw. If not, please apply output encoding for current escaping strategy.

The template designer never has this information. So in my opinion, it's inherently wrong to ask them to add |raw.
As the template designer, what you want is: {{ variable|rawIfEscapingStrategiesMatch }} 😊
But you always want that, so might as well drop the filter alltogether πŸ˜‰

Imho, we should invert the logic and move this responsibility to the one using the template (aka the controller).

Imagine something along the lines of this (I know a lot of it is missing, it's just for illustration purposes):

class TemplateVariable
{
    public function setContentForEscapingStrategy(string $content, string $strategy): self
    public function getContentForEscapingStrategy(string $strategy): mixed
}

Now there's no more |raw so our template designer Alice doesn't have to worry. We're back to

<table>
    {% for row in rows %}
        <tr>
            {% for cell in row %}
                <td>{{ cell }}</td>
            {% endfor %}
        </tr>
    {% endfor %}
</table>

Developer Bob can still do this in the simplest case:

$rows = [
    ['Cell 1', 'Cell 2', 'Cell 3'],
    ['Cell 4', 'Cell 5', 'Cell 6'],
];

echo $twig->render('table.html.twig', [
    'rows' => $rows,
]);

For his <strong> case, he now needs to do this:

$cell1Content = new TemplateVariable();
$cell1Content->setContentForEscapingStrategy('<strong>Cell 1</strong>', 'html'); // Bob knows this is safe, it's HIS responsibility!

$rows = [
    [$cell1Content, 'Cell 2', 'Cell 3'],
    ['Cell 4', 'Cell 5', 'Cell 6'],
];

echo $twig->render('table.html.twig', [
    'rows' => $rows,
]);

Twig would now check if {{ cell }} represents an instance of TemplateVariable and fetch the content for the current escaping strategy which is html. If it's configured like in this case, it can be output as is, because the strategies match. If it's not, then the escaping strategy needs to be applied just as usual.

And what would Eve do to use that same template? Well:

$cell6Content = new TemplateVariable();
$cell6Content->setContentForEscapingStrategy($this->sanitizer->sanitize($request->query->get('user-input')), 'html'); // Eve knows this is safe now

$rows = [
    ['Cell 1', 'Cell 2', 'Cell 3'],
    ['Cell 4', 'Cell 5', $cell6Content]
];

echo $twig->render('table.html.twig', [
    'rows' => $rows
]);

You see, the responsibility to tell Twig which variables are safe for what strategy, must not be the template designer's. Using |raw effectively disables all output encoding which is never what you want as a template designer. You don't want to allow "everything". You want to allow "everything that is safe".
So all you should care about as a template designer is to write your templates with the correct escaping strategies in mind:

{# Ensure this template is called `button_with_js.html.twig` because we want `html` to be the default escaping strategy for this template #}
<div>
    <p>{{ value }}</p>
    <button title="{{ value|e('html_attr') }}">Click</button>
    <script>
        const msg = "{{ value|e('js') }}";
        console.log(msg);
    </script>
</div>

And here we go. This template can now be used by any developer in that project safely.
In its simplest form, you can just use it like you do it today. Pass the value as a string:

echo $twig->render('button_with_js.html.twig', [
    'value' => 'I am not dangerous',
]);

And if you want value to contain Bob "The Builder" <script>doSomethingSafe()</script>, because you know that it's safe for e.g. html, you have to take care of things yourself:

$value = <<<'TXT'
Bob "The Builder" <script>doSomethingSafe()</script>
TXT;
$variable = new TemplateVariable();
$variable->setContentForEscapingStrategy($value, 'html'); // I know this is safe for "html", I want the script to be executed - for all the other strategies, escape normally.

echo $twig->render('button_with_js.html.twig', [
    'value' => $variable,
]);

Not sure if this was presented logically and the examples make sense πŸ˜„
But I think it's good enough to make my point.

  • Currently |raw outputs the contents always raw no matter the escaping strategy.
  • I think this is never what you want. You want to always output things safely no matter the strategy.
  • Template designers should not think about the contents of variables. They should think about the context they are outputting those and apply the correct escaping filters there (so nothing new for them actually).
  • Developers that make use of those templates need a way to mark a variable as being safe for a certain strategy
  • This shifts responsibilities to where they belong

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions