graph-building syntax simpler for web developers #16
+1. How about changing `long addOperand(OperandOptions options);` so that it returns the new operand?
I agree it can simplify the user JS code; however, there are two aspects we may need to take care of:
1. Some native APIs require the output tensors' shapes when adding/creating an operation, for example `ANeuralNetworksModel_addOperation` of NNAPI, `CreateOperator` of DirectML and `MPSCNNKernel.encodeToCommandBuffer` of MPS. For those APIs, if the JS code provides the output tensor descriptions, the browser can use them directly. Otherwise, the browser has to implement output-shape inference code.
2. The outputs list would be useful for optional output tensors of an operation, for example `BatchNormalization` of ONNX.
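To illustrate the first aspect, here is a minimal sketch (hypothetical helper, not part of any proposed API) of the kind of output-shape inference a browser would have to implement for a 2-D convolution in NCHW layout if user code does not describe the output tensors:

```javascript
// Hypothetical sketch: output-shape inference for a 2-D convolution
// in NCHW layout, of the kind a browser would need when JS code does
// not supply output tensor descriptions.
function convOutputShape(inputDims, filterDims, paddings, strides, dilations) {
  const [batch, , inH, inW] = inputDims;       // [N, C, H, W]
  const [outChannels, , kH, kW] = filterDims;  // [O, I, kH, kW]
  const [padTop, padBottom, padLeft, padRight] = paddings;
  const outDim = (inSize, k, padBegin, padEnd, stride, dilation) => {
    const effectiveK = dilation * (k - 1) + 1; // dilated kernel extent
    return Math.floor((inSize + padBegin + padEnd - effectiveK) / stride) + 1;
  };
  return [
    batch,
    outChannels,
    outDim(inH, kH, padTop, padBottom, strides[0], dilations[0]),
    outDim(inW, kW, padLeft, padRight, strides[1], dilations[1]),
  ];
}
```

For example, a 3×3 filter with padding 1 and stride 1 preserves the spatial size, while stride 2 halves it.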
Besides the
+1. How about removing `long addConstant(OperandOptions options, ArrayBufferView data);`?
Let me expand on the point mentioned above about making the graph-building syntax similar to an eager API. From the other discussions, there seems to be interest in exploring an Eager API (and JIT) in the future. Similarity in the syntax between graph-building and eager mode will make it relatively easy for a user to convert code from one mode to the other. (Conceptually, the two modes can be seen as different implementations of (almost) the same interface.) Thus, where an Eager API interface has a method like:
the builder API has a method like:
As a result, a line of code like:
executes either in the graph-builder mode or eager mode depending on the type of M.
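The mode-polymorphism described above can be sketched as follows. This is a hypothetical toy (plain arrays stand in for Tensors, strings for Operands): the receiver `M` decides whether to execute the op immediately or record a graph node, and the same user line works with either.

```javascript
// Minimal sketch (hypothetical names) of one `add` call serving both
// modes: the receiver decides whether to execute or record.
class EagerContext {
  add(a, b) {                         // a, b: plain arrays standing in for Tensors
    return a.map((v, i) => v + b[i]); // execute immediately
  }
}

class GraphBuilder {
  constructor() { this.nodes = []; }
  add(a, b) {                         // a, b: ids standing in for Operands
    const out = `op${this.nodes.length}`;
    this.nodes.push({ kind: 'add', inputs: [a, b], output: out });
    return out;                       // record a node, return a symbolic operand
  }
}

// The same user line works in either mode, depending on the type of M:
function addPair(M, a, b) { return M.add(a, b); }
```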
Re: @huningxin's reply about changing addOperand:
Yes to adding a return value. I think it would be helpful to replace the use of `long` to represent operands with an abstract type (interface), say `Operand`, and make `addOperand` return a value of this type. An operand has multiple attributes (type, shape, etc.) and will likely be implemented as an object of some type/class. Using a `long` in the interface would restrict implementation choices (e.g., forcing an array or something similar to map the index to an object) and could also lead to misuse if users assume implementation details (such as operands being numbered consecutively from 0).
@huningxin :
(I group points (a) and (c) together as they seem closely related to me.) IMO, the WebNN API should not force developers to specify the output shapes when they can be inferred. For an "unknown dimension", developers could create an operand descriptor that omits the dimensions entirely.
I suppose the idea of the current design (a mix of tensor operands and attributes for an operation) is to minimize the interfaces exposed. However, I agree it is friendlier to developers to differentiate them. With that, we can pass those attributes either as individual parameters:

Operand Conv(sequence<Operand> inputs,
             sequence<long> paddings,
             sequence<long> strides,
             sequence<long> dilations,
             optional FusedActivation activation = 'none');

or as a grouped dictionary:

dictionary ConvAttributes {
  sequence<long> paddings;
  sequence<long> strides;
  sequence<long> dilations;
  FusedActivation? activation = 'none';
};

Operand Conv(sequence<Operand> inputs, ConvAttributes attributes);
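A sketch of how the dictionary form might be consumed on the implementation side (the names `conv` and `defaultConvAttributes` are hypothetical, and the WebIDL-style defaults above are assumed):

```javascript
// Hypothetical sketch: a stub `conv` that applies the dictionary's
// defaults before use, so callers only pass what they need.
const defaultConvAttributes = {
  paddings: [0, 0, 0, 0],
  strides: [1, 1],
  dilations: [1, 1],
  activation: 'none',
};

function conv(inputs, attributes = {}) {
  const attrs = { ...defaultConvAttributes, ...attributes };
  // A real builder would create a graph node here; we just echo the
  // resolved attributes so the defaulting behaviour is visible.
  return { kind: 'conv', inputs, attrs };
}
```

One advantage of the dictionary approach is visible here: adding a new optional attribute later does not change the method's signature.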
Sounds good to me.
Regarding:
That works when we know all of the dimensions or none of them, but it doesn't allow us to specify some dimensions without the others, e.g. if we know that an input has dimensions N × 100 × 100 where N is unknown. This seems to be common from what I have seen. How about using negative integers in the dimension list for an unknown dimension?
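The negative-integer convention can be sketched with a small (hypothetical) compatibility check: a negative entry in the declared dimensions means "unknown", so a concrete runtime shape matches if every non-negative entry agrees.

```javascript
// Sketch of the negative-integer convention for unknown dimensions:
// a negative declared entry matches any size; known entries must agree.
function shapeMatches(declaredDims, concreteDims) {
  return declaredDims.length === concreteDims.length &&
         declaredDims.every((d, i) => d < 0 || d === concreteDims[i]);
}
```

So a descriptor like `{type: 'float32', dimensions: [-1, 100, 100]}` would accept any batch size N while still pinning the other two dimensions.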
The two suggested options look good. I think the dictionary approach may be a bit more convenient.
Regarding the 6 June teleconference, I'd like to summarize the API improvement areas and proposals.

OperandOptions
Improvement areas:
Proposals:

addOperand
Improvement areas:
Proposals:

addOperation
Improvement areas:
Proposals:
The example would look like:

// Use tensors in 4 dimensions.
const TENSOR_DIMS = [2, 2, 2, 2];
const TENSOR_SIZE = 16;
// Create a Model object.
const model = await nn.createModel();
// Add inputs and constants
const float32TensorDesc = {type: 'float32', dimensions: TENSOR_DIMS};
const tensor0 = model.addConstant(float32TensorDesc, new Float32Array(arrayBuffer, 0, TENSOR_SIZE));
const tensor1 = model.addInput(float32TensorDesc);
const tensor2 = model.addConstant(float32TensorDesc, new Float32Array(arrayBuffer,
TENSOR_SIZE * Float32Array.BYTES_PER_ELEMENT,
TENSOR_SIZE));
const tensor3 = model.addInput(float32TensorDesc);
// Add operations
const intermediateOutput0 = model.addAddition(tensor0, tensor1);
const intermediateOutput1 = model.addAddition(tensor2, tensor3);
const output = model.addMultiply(intermediateOutput0, intermediateOutput1);
// Identify graph output
model.identifyOutputs([output]);
// Finish building the model.
await model.finish();

@gramalingam and others, please take a look. If there are no big concerns, I am going to make a PR.
Hi @huningxin: thanks for the proposal. It looks good to me. (I have only a minor naming suggestion: drop the prefix "add" and just use names like model.Constant, model.Input, model.Add, model.Multiply, but that is mostly a matter of personal taste. It may help make the eager API and graph-building API similar, in case we ever add an eager-execution API as well.)
Thanks @gramalingam. Regarding your comments:
We may need to follow lowerCamelCase for method names, so if we remove the "add" prefix, these methods would look like constant, input, add and mul.
This is a good point. For this purpose, we may go a step further and decouple the operations from the Model interface. For eager execution:

// Create Tensor a.
const a = nn.tensor(descriptor, value);
// Create Tensor b.
const b = nn.tensor(descriptor, value);
// Invoke `Tensor add(Tensor a, Tensor b)` that computes the result and saves it to tensor c.
const c = nn.add(a, b);

To build a graph, the corresponding code creates Operands:

// Create Operand a.
const a = nn.input(descriptor);
// Create Operand b.
const b = nn.constant(descriptor, buffer);
// Invoke `Operand add(Operand a, Operand b)` that builds the graph node and returns Operand c.
const c = nn.add(a, b);

After building and wiring graph nodes, user code can create a model by identifying the output operands. With this proposal, the example would look like:

// Use tensors in 4 dimensions.
const TENSOR_DIMS = [2, 2, 2, 2];
const TENSOR_SIZE = 16;

// Create input and constant operands.
const float32TensorDesc = {type: 'float32', dimensions: TENSOR_DIMS};
const tensor0 = nn.constant(float32TensorDesc, new Float32Array(arrayBuffer, 0, TENSOR_SIZE));
const tensor1 = nn.input(float32TensorDesc);
const tensor2 = nn.constant(float32TensorDesc, new Float32Array(arrayBuffer,
                                                                TENSOR_SIZE * Float32Array.BYTES_PER_ELEMENT,
                                                                TENSOR_SIZE));
const tensor3 = nn.input(float32TensorDesc);

// Create operations.
const intermediateOutput0 = nn.add(tensor0, tensor1);
const intermediateOutput1 = nn.add(tensor2, tensor3);
const output = nn.mul(intermediateOutput0, intermediateOutput1);

// Create a Model object by identifying the output operands.
const model = nn.createModel([output]);

In this example, this code snippet can be reused for eager execution:

const intermediateOutput0 = nn.add(tensor0, tensor1);
const intermediateOutput1 = nn.add(tensor2, tensor3);
const output = nn.mul(intermediateOutput0, intermediateOutput1);

@nsthorat, according to #12, do you think this proposal is a future-proof way to support eager execution later?
Hi @huningxin, yes, something like this would be good, I think. One issue that complicates this (or at least is related to this discussion) is the notion of subgraphs or functions or scopes. In the long run, we may want to support control-flow constructs (if-then-else, loops). In this setting, we need to identify the scope into which we are adding an operation. One suggestion is that the interface that exposes these methods ("add", "multiply") represents an object, which could be a scope into which we are adding the ops, or an eager-execution-mode scope. This object can make the choice on whether to build a graph or eagerly execute the ops.

I think we may not be able to unify the eager and graph-builder interfaces completely seamlessly, and there may be some differences (e.g., eager expects only Tensors, but a builder can actually work with both Tensors and Operands). But there may still be value in making them similar, even if not identical.

Anyway: in the last meeting you made a good point that separating issues may be helpful (instead of trying to solve all issues together in one shot). So we could worry about the above issues separately if you feel the same.
As discussed in WebML CG Teleconference - 27 June 2019:
I will follow up.
I think it is useful to make the (graph-building) syntax simpler for the user. In fact, I think we can make it look almost exactly like an API for executing the ops directly (and the "eager" API), at least for the non-control-flow ops, as explained below.
(1) In the proposal, operands are represented by integers, implicitly associated with the order in which "addOperand" is called. It would be cleaner to have "addOperand" return an object/value that represents the operand, so that later calls refer to the returned operand rather than an integer index.
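Point (1) can be sketched with a toy builder (hypothetical `Model` class, not the proposed spec text): addOperand returns an opaque operand object, so callers never touch indices.

```javascript
// Hypothetical sketch of point (1): addOperand returns the operand
// itself instead of an integer index into an internal array.
class Model {
  constructor() { this.operands = []; }
  addOperand(options) {
    const operand = { id: this.operands.length, ...options }; // id is internal
    this.operands.push(operand);
    return operand; // callers hold the object, not the index
  }
}

// index-based (old):  model.addOperand(desc); ...later refer to it as `0`
// object-based (new): const t0 = model.addOperand(desc); ...refer to it as t0
```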
(2) Instead of separately creating operands to represent the outputs of an op and then adding the op (connecting the inputs and outputs up), it is better to have addOperation create the operands representing the outputs and return them, collapsing the two steps into a single call.
I omitted the type float32TensorType since it can be inferred from the operands. But where necessary, we can add the type as an extra parameter. This approach can still support operations that return multiple outputs.
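A sketch of point (2), with hypothetical names: addOperation mints its own output operands and returns them, and operations with multiple outputs are still expressible.

```javascript
// Hypothetical sketch of point (2): addOperation creates and returns
// its output operands instead of requiring the caller to pre-create them.
class Builder {
  constructor() { this.ops = []; this.nextId = 0; }
  addOperation(kind, inputs, outputCount = 1) {
    const outputs = Array.from({ length: outputCount },
                               () => ({ id: this.nextId++ }));
    this.ops.push({ kind, inputs, outputs });
    return outputCount === 1 ? outputs[0] : outputs; // multi-output still works
  }
}
```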
(3) Furthermore, instead of representing operations by a constant like nn.ADD, it seems better to encapsulate each operation as a method (e.g., an add method), which simplifies the call site.
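The two styles in point (3) can be contrasted with a toy `nn` object (hypothetical, for illustration only): the method style is a thin wrapper over the opcode-constant style.

```javascript
// Hypothetical contrast for point (3): opcode constant vs. method.
const nn = {
  ADD: 'ADD',
  addOperation(opcode, inputs) {          // constant style: nn.addOperation(nn.ADD, ...)
    return { opcode, inputs };
  },
  add(...inputs) {                        // method style: nn.add(a, b)
    return this.addOperation(this.ADD, inputs);
  },
};
```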
(4) Similarly, for constants, instead of separately creating an operand and then setting its value, we can use a single method that both creates the constant operand and binds its value.
(As discussed earlier, the type can be omitted if it can be inferred.)
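Point (4) sketched with a hypothetical class: one addConstant call replaces the two-step create-operand-then-set-value pattern.

```javascript
// Hypothetical sketch of point (4): one call that both creates the
// operand and binds its value.
class ConstModel {
  constructor() { this.operands = []; }
  addConstant(options, data) {
    const operand = { ...options, data }; // value bound at creation time
    this.operands.push(operand);
    return operand;
  }
}
```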
Originally posted by @gramalingam in #15 (comment)