| 1 | +# Steps to debug an ML Program operator implementation |
| 2 | + |
| 3 | +Basic debugging of everything except model execution (e.g. partitioning, checking whether an operator is supported, |
| 4 | +adding CoreML operator inputs/outputs) can be done on any platform, as the code is set up to build and to create the |
| 5 | +protobuf-based CoreML model on all platforms. |
| 6 | + |
| 7 | +To debug model execution issues you will need a macOS machine. |
| 8 | + |
| 9 | +## Debugging invalid output |
| 10 | + |
| 11 | +If there is a crash during execution or unexpected output, the best approach is to see what using coremltools directly |
| 12 | +produces. |
| 13 | + |
| 14 | +NOTE: That doesn't guarantee coremltools is correct, as there could be a bug in its implementation. It does, however, |
| 15 | +provide a data point on whether we are generating the same CoreML model as the coremltools Python package. |
| 16 | + |
| 17 | +### Comparing to coremltools output |
| 18 | + |
| 19 | +Create a small test script that replicates the inputs/outputs of the operator you are debugging. |
| 20 | +This script should use the coremltools library to run the operator and print the output. |
| 21 | +This can be used to compare the CoreML EP's output with the coremltools output. |
| 22 | + |
| 23 | +See the coremltools guide on creating a MIL program: https://apple.github.io/coremltools/docs-guides/source/model-intermediate-language.html#create-a-mil-program |
| 24 | + |
| 25 | +Usage is reasonably intuitive. The example below defines a model with 2 inputs and a matmul operator. |
| 26 | +The model is printed and then run with randomly generated inputs, and the resulting output is printed. |
| 27 | + |
| 28 | +```python |
| 29 | +import numpy as np |
| 30 | +import coremltools as ct |
| 31 | +from coremltools.converters.mil import Builder as mb |
| 32 | + |
| 33 | +target = ct.target.iOS15 |
| 34 | + |
| 35 | +x_shape = (1, 4) |
| 36 | +y_shape = (10, 4, 3) |
| 37 | + |
| 38 | +@mb.program(input_specs=[mb.TensorSpec(shape=x_shape), mb.TensorSpec(shape=y_shape)], |
| 39 | + opset_version=target) |
| 40 | +def prog(x, y): |
| 41 | + # For reference, a constant can be added using `mb.const` and specifying the data in the `val` parameter. |
| 42 | + # c_shape = (3, ) |
| 43 | + # c_data = np.random.random_sample(c_shape) |
| 44 | + # c = mb.const(val=c_data) |
| 45 | + |
| 46 | + # call the operator you are debugging with the inputs/constants. |
| 47 | + # See the spec for the operator names, input/outputs and supported data types. |
| 48 | + # https://apple.github.io/coremltools/source/coremltools.converters.mil.mil.ops.defs.html |
| 49 | + z = mb.matmul(x=x, y=y) |
| 50 | + |
| 51 | + # can have additional function calls here if there are multiple operators involved. |
| 52 | + # Contrived example that uses a constant and the output from a previous operator: |
| 53 | + # z = mb.add(x=z, y=c) |
| 54 | + |
| 55 | + return z |
| 56 | + |
| 57 | +# Prints the MIL program in a reasonably concise manner. |
| 58 | +print(prog) |
| 59 | + |
| 60 | +# Convert to ML Program model |
| 61 | +m = ct.convert(prog, minimum_deployment_target=target) |
| 62 | + |
| 63 | +# If you want to dump the full protobuf of the model uncomment this. |
| 64 | +# You can compare the values to what is being set by the ORT CoreML EP code if you suspect any issues there. |
| 65 | +# spec = m.get_spec() |
| 66 | +# print(spec) |
| 67 | + |
| 68 | +# run the model to generate output for comparison with the CoreML EP output |
| 69 | +x = np.random.rand(*x_shape) |
| 70 | +y = np.random.rand(*y_shape) |
| 71 | + |
| 72 | +print(m.predict({'x': x, 'y': y})) |
| 73 | +``` |
| 74 | + |
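Once both outputs are available as NumPy arrays, the comparison itself can be scripted. The following is a minimal sketch, not part of the ORT code base: `compare_outputs` and its tolerances are hypothetical, and the two arrays are stand-in data in place of the real coremltools and CoreML EP results.

```python
import numpy as np


def compare_outputs(expected, actual, rtol=1e-3, atol=1e-4):
    """Return (matches, max_abs_diff) for two outputs.

    `expected` would be the coremltools result and `actual` the CoreML EP
    result. The tolerances are illustrative assumptions; tune them per operator.
    """
    expected = np.asarray(expected, dtype=np.float32)
    actual = np.asarray(actual, dtype=np.float32)

    # A shape mismatch usually points at a bug in how the operator was built.
    if expected.shape != actual.shape:
        return False, float("inf")

    max_abs_diff = float(np.max(np.abs(expected - actual))) if expected.size else 0.0
    matches = bool(np.allclose(actual, expected, rtol=rtol, atol=atol))
    return matches, max_abs_diff


# Stand-in data: replace with the values printed by the coremltools script
# and the values returned by the CoreML EP for the same inputs.
coremltools_out = np.array([[1.0, 2.0, 3.0]], dtype=np.float32)
ep_out = coremltools_out + 1e-6

matches, max_diff = compare_outputs(coremltools_out, ep_out)
print(f"match={matches} max_abs_diff={max_diff:.3g}")
```

For the comparison to be meaningful, both runs need the same input values, so save the randomly generated inputs (or seed the generator) rather than regenerating them for each run.
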
| 75 | +## Dumping the ORT generated mlmodel |
| 76 | + |
| 77 | +You can also dump the mlmodel generated by the ORT CoreML EP. This can be handy with larger models. |
| 78 | + |
| 79 | +In a debug build, set the `ORT_COREML_EP_MODEL_DIR` environment variable to the directory where you want the ML Package |
| 80 | +containing the mlmodel to be saved. The model will remain after the CoreML EP exits, unlike the default behavior |
| 81 | +where we write it to a temporary directory that is automatically removed on application exit. |
| 82 | + |
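When running from Python, the variable can be set from the script itself. This is a minimal sketch: `set_model_dump_dir` and the `coreml_ep_dump` directory name are hypothetical. It assumes the variable has to be set before the session using the CoreML EP is created, and using an absolute path is a precaution, not a documented requirement.

```python
import os
import tempfile

# Environment variable name taken from the text above.
ENV_VAR = "ORT_COREML_EP_MODEL_DIR"


def set_model_dump_dir(path):
    """Create `path` if needed, point the env var at it, return the absolute path."""
    abs_path = os.path.abspath(path)
    os.makedirs(abs_path, exist_ok=True)
    os.environ[ENV_VAR] = abs_path
    return abs_path


# Hypothetical output location; any writable directory would do.
dump_dir = set_model_dump_dir(os.path.join(tempfile.gettempdir(), "coreml_ep_dump"))
print(f"{ENV_VAR}={dump_dir}")

# Create the inference session that uses the CoreML EP only after this point,
# so that the EP sees the variable when it generates the model.
```
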
| 83 | +Script to dump: [dump_mlprogram_model.py](dump_mlprogram_model.py) |
| 84 | + |
| 85 | +See [here](https://github.com/microsoft/onnxruntime/blob/3c0b407709fd3c71755ed046edd688b30a786d94/onnxruntime/core/providers/coreml/model/host_utils.h#L70-L75) for environment variable setup and [usage](https://github.com/search?q=repo%3Amicrosoft%2Fonnxruntime%20kOverrideModelOutputDirectoryEnvVar%20&type=code). |