Proposal: dom.execute() #678

174 changes: 174 additions & 0 deletions proposals/dom_execute.md
@@ -0,0 +1,174 @@
# Proposal: browser.dom.execute()
Contributor

What is the motivation for using the dom namespace instead of the scripting namespace, considering its similarities with scripting.executeScript and the seeming lack of connection with the DOM? Considering execute(), createPort() and also openOrClosedShadowRoot(), would it not make sense to use browser.document instead?

Collaborator Author

I think the separation between browser.dom and browser.scripting is valuable -- even though they have similar surfaces and serve conceptually similar purposes, their use cases, restrictions, and implementations are quite different. One is used from the extension's primary contexts (e.g., background script) to configure scripts that will be running in completely different contexts (e.g., tabs), and the other is used to interact with the same document the extension is currently running in. IMO, this distinction more easily separates these APIs for developers, and also allows a cleaner isolation of APIs for browsers: scripting.executeScript() is not allowed in isolated worlds at all, since it is a privileged API. By contrast, this API is allowed there, and it is (kinda) just adding a script to the DOM (though it's closer to eval than adding a child element), and so is available in these (relatively) untrusted contexts.

Considering execute(), createPort() and also openOrClosedShadowRoot(), would it not make sense to use browser.document instead?

I don't have strong preferences for browser.dom vs browser.document except that we already have browser.dom : ) It doesn't seem worth the churn to migrate to browser.document at this stage to me, given the impact on the ecosystem and the delay it would entail to introduce new APIs (since I do think we'd then want to migrate fully away from dom before moving to document to avoid just having two APIs to maintain).


**Summary**

This API allows an isolated world (such as a content script world or user
script world) to synchronously execute code in the main world of the document
it is injected in.

**Document Metadata**

**Author:** rdcronin

**Sponsoring Browser:** Chrome

**Contributors:** Rob--W, ...

**Created:** 2024-06-24

**Related Issues:** <TODO>

## Motivation

### Objective

This allows isolated worlds to synchronously inject in the main world. This is
both beneficial by itself and also a prerequisite for our current plan to
implement inter-world communication (the latter of which will be a separate
proposal).

#### Use Cases

##### Executing script in the main world

Today, extensions can execute script in the main world by:
* Leveraging scripting.executeScript() or registering content scripts or user
scripts, or
* Appending a new script tag to the document

The former is heavily asynchronous or requires knowledge of whether to inject
beforehand (to register a script). The latter is very visible to the site and
has more potential side effects. While there's no way to fully "hide" a script
from a site, it is desirable to avoid adding DOM elements (which could
interfere with sites with certain properties).

### Known Consumers

User script managers have expressed an interest in this API, but this is also
useful to any extension that wishes to execute script in the main world while
leveraging information from the isolated world.

## Specification

### Schema

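```
declare namespace dom {
  interface ScriptInjection {
    // The function to execute in the main world. It receives the entries of
    // `args` as its parameters.
    func: (...args: any[]) => any;
    // The arguments to pass to `func`.
    args: any[];
  }

  // Synchronously executes `injection.func` in the main world and returns its
  // serialized return value.
  export function execute(
    injection: ScriptInjection
  ): any;
}
```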

### Behavior

#### Function execution

The function is executed _synchronously_ in the main world. The return value
is the serialized return value from the function execution.

#### Cross-world contamination

To prevent any "piercing" of the different worlds, all pieces are serialized in
their original world, then a copy is deserialized in the target world.
Otherwise, there would be a risk of sharing variables between worlds (even when
non-obvious, such as by accessing the prototype of an item).

##### Argument serialization

Arguments are serialized and deserialized using the Structured Cloning
algorithm. This allows for more flexibility than simply JSON conversion. In
the future, some arguments may have custom serialization.
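
As a rough illustration of the "more flexibility than JSON" point, the sketch below compares a JSON round trip with the standard `structuredClone()` global. It assumes a runtime that exposes that global (modern browsers, Node.js 17+); the `Payload` type and all variable names are made up for the example and are not part of this proposal.

```ts
// Assumption: the standard structuredClone() global is available.
// It is read off globalThis so the sketch does not depend on specific type libs.
const cloneStructured = (globalThis as any).structuredClone as <T>(value: T) => T;

// Hypothetical argument shape, for illustration only.
interface Payload {
  when: Date;
  nested: { count: number };
  self?: Payload;
}

const original: Payload = { when: new Date(0), nested: { count: 1 } };

// JSON round trip: the Date degrades to an ISO string.
const viaJson = JSON.parse(JSON.stringify(original));

// Structured clone: the Date stays a Date, and the result is a deep copy.
const viaClone = cloneStructured(original);
viaClone.nested.count = 99; // Only the copy changes.

// Cyclic references: JSON.stringify throws, structured clone keeps the cycle.
original.self = original;
let jsonFailed = false;
try {
  JSON.stringify(original);
} catch {
  jsonFailed = true;
}
const cyclicClone = cloneStructured(original);
```

Because the receiving side only ever sees a copy, mutating `viaClone` leaves `original` untouched, which is the same property the cross-world rules above rely on.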

If dom.execute doesn't support DOM nodes, it will cause problems for extension developers because dealing with DOM nodes is what content scripts usually do and chrome.dom namespace strongly implies that DOM is supported.

To be precise, it should be EventTarget, the ancestor of Node, which additionally allows transferring things like iframeElem.contentWindow, window.parent or event.source from a postMessage listener.

Technically this is already possible by sending a MouseEvent with relatedTarget set to the EventTarget instance, but this trick is largely unknown. World isolation is maintained automatically: the event keeps only the spec'd part of the object, i.e. without this JS world's expando properties.


##### Function serialization

The injected function is serialized to and parsed from a string. Bound
parameters are not supported. Arguments must be curried in through the `args`
key.
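
To make the consequences of string-based function serialization concrete, here is a minimal sketch. `rebuildFromSource` and `thrownErrorName` are hypothetical helpers written for this example (they are not part of the proposed API), and the round trip through the `Function` constructor is only an approximation of what a browser might do internally.

```ts
type AnyFn = (...args: any[]) => unknown;

// Hypothetical helper: round-trips a function through its source text.
// The rebuilt function sees only globals and its own parameters.
function rebuildFromSource(fn: AnyFn): AnyFn {
  return new Function(`return (${fn.toString()});`)() as AnyFn;
}

// Returns the name of the error thrown by `run`, or "none" if it succeeds.
function thrownErrorName(run: () => unknown): string {
  try {
    run();
    return "none";
  } catch (e) {
    return e instanceof Error ? e.name : "unknown";
  }
}

const add = (a: number, b: number) => a + b;

// 1. A self-contained function survives the round trip.
const sum = rebuildFromSource(add)(2, 3);

// 2. A captured variable is lost: only the source text travels, not the closure.
function makeScaler(capturedFactor: number): AnyFn {
  return (x: number) => x * capturedFactor;
}
const closureError = thrownErrorName(() => rebuildFromSource(makeScaler(10))(4));

// 3. A bound function has no usable source text (in V8-based runtimes it
//    stringifies to a "[native code]" placeholder), so it cannot be rebuilt.
const boundError = thrownErrorName(() => rebuildFromSource(add.bind(null, 1)));

// 4. The supported pattern: pass the value explicitly, as `args` would.
const scaled = rebuildFromSource((x: number, factor: number) => x * factor)(4, 10);
```

Cases 2 and 3 are why values from the isolated world need to travel through `args` rather than through closures or `bind()`.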

##### Return value serialization

Like arguments, the return value of the execution is serialized and
deserialized using the Structured Cloning algorithm.
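
Putting the three serialization steps together, the following is a single-realm sketch of the data flow described in this section. `executeSketch` and `ScriptInjectionSketch` are hypothetical stand-ins, not the proposed `dom.execute()` itself: a real implementation would run the function in a separate JavaScript world, which plain script cannot emulate. It also assumes the standard `structuredClone()` global is available.

```ts
// Mirrors the shape of ScriptInjection from the schema (names are illustrative).
interface ScriptInjectionSketch {
  func: (...args: any[]) => unknown;
  args: any[];
}

// Assumption: the standard structuredClone() global (modern browsers, Node.js 17+).
const cloneValue = (globalThis as any).structuredClone as <T>(value: T) => T;

// Hypothetical stand-in for dom.execute(); everything runs in one realm.
function executeSketch(injection: ScriptInjectionSketch): unknown {
  // 1. Serialize the function to a string and parse it again.
  const source = injection.func.toString();
  const targetFunc = new Function(`return (${source});`)() as (...args: any[]) => unknown;

  // 2. Copy the arguments so the callee never sees the caller's objects.
  const targetArgs = cloneValue(injection.args);

  // 3. Run synchronously, then copy the return value back to the caller.
  const returned = targetFunc(...targetArgs);
  return cloneValue(returned);
}

const config = { selector: "#app", retries: 3 };

const outcome = executeSketch({
  func: (cfg: { selector: string; retries: number }) => {
    cfg.retries += 1; // Mutates the callee's copy only.
    return { seen: cfg.selector, retries: cfg.retries };
  },
  args: [config],
}) as { seen: string; retries: number };
```

The call returns synchronously, and `config` in the calling code is unchanged afterwards because the callee only ever received a copy.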

##### Injected script privileges

The injected script has the same privileges as other scripts in the main world.
This goes for API access, Content Security Policy, origin access, and more.

##### CSP Application

The injected script is not considered inline and does not have a corresponding
source; as such, CSP restrictions on script-src (and similar) do not apply.
However, since the script executes in the main world, other CSP restrictions
(including connect-src and which sources may be added to the document, among
many others) *may* apply to the injected script.

### New Permissions

There are no new permissions for this capability. This does not allow any new
data access, since it is only accessible once the extension has already
injected a script into the document. Extensions can also already interact with
the main world of the document through either appending script tags or directly
injecting with registered content or user scripts, or the
scripting.executeScript() method.

### Manifest File Changes

There are no necessary manifest changes.

## Security and Privacy

### Exposed Sensitive Data

This does not result in any new data being exposed.

### Abuse Mitigations

This doesn't enable any new data access. To prevent any risk of cross-world
interaction, all data that passes between worlds is serialized in the original
world, then deserialized (creating a new copy) in the target world. The
executed script has no additional capabilities or APIs; it has the same
privilege as the document's own main world code.

### Additional Security Considerations

N/A.

## Alternatives

### Existing Workarounds

Extension developers can inject in the main world through registered content or
user scripts, along with the scripting.executeScript() API. They can also
inject code in the main world by adding a script tag to the DOM. However, the
API-powered options are asynchronous, and a script tag is inherently visible to
other scripts running in the main world.

### Open Web API

The open web is unaware of the concept of multiple JavaScript worlds, so this
wouldn't belong as an open web API.

## Implementation Notes

N/A.

## Future Work

### Inter-world communication

This API is a necessary first step before introducing inter-world communication
channels. We plan to do that alongside the implementation for this API.

### Specifying a target world
Contributor

This means that not specifying the target world in the future would default to the main world. If that is a downside, or if explicitly specifying a target world right from the start is preferred, specifying the target world right now would be good.

Collaborator Author

That's correct. I think defaulting to the main world makes sense. However, I don't feel strongly here, so if folks would prefer to have a target world specified in all cases, I'm okay with that. @Rob--W or others, any preference?


For now, this API is intended to let scripts running in content or user script
worlds inject in the main world. In the future, we will expand this to allow a
script to inject in other worlds, such as a content script injecting in a user
script world.