
Commit e37e936

doc updates and fix some typos
1 parent fdff16e commit e37e936

File tree

12 files changed

+124
-97
lines changed

README.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -89,14 +89,15 @@ python setup.py install
8989

9090
# Deep Dive
9191

92-
* [Quantization](docs/Quantization.md) is the processes that enable inference and training by performing computations at low precision data type, such as fixed point integers. LPOT supports [Post-Training Quantization (static and dynamic)](docs/PTQ.md) and [Quantization-Aware Training](docs/QAT.md)
92+
* [Quantization](docs/Quantization.md) is the process of enabling inference and training by performing computations in low-precision data types, such as fixed-point integers. LPOT supports [Post-Training Quantization](docs/PTQ.md) and [Quantization-Aware Training](docs/QAT.md).
9393
* [Pruning](docs/pruning.md) provides a common method for introducing sparsity in weights and activations.
9494
* [Benchmarking](docs/benchmark.md) introduces how to utilize the benchmark interface of LPOT.
9595
* [Mixed precision](docs/mixed_precision.md) introduces how to enable mixed precision, including BF16, int8, and FP32, on Intel platforms during tuning.
9696
* [Transform](docs/transform.md) introduces how to utilize LPOT built-in data processing and how to develop a custom data processing method.
9797
* [Dataset](docs/dataset.md) introduces how to utilize LPOT built-in datasets and how to develop a custom dataset.
9898
* [Metric](docs/metric.md) introduces how to utilize LPOT built-in metrics and how to develop a custom metric.
9999
* [TensorBoard](docs/tensorboard.md) provides tensor histograms and the execution graph for debugging the tuning process.
100+
* [PyTorch Deploy](docs/pytorch_model_saving.md) introduces how LPOT saves and loads quantized PyTorch models.
100101

101102

102103
# Advanced Topics

docs/adaptor.md

Lines changed: 36 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,13 +1,44 @@
11
Adaptor
22
=================
3-
1. query fw capbility
4-
2. parse tune config ( lpot config -> fwk capbility)
5-
3. (optianal) pre optimize
6-
4. do the quantization
73

4+
## Introduction
85

6+
Intel® Low Precision Optimization Tool builds its low-precision inference solution on popular Deep Learning frameworks
7+
such as TensorFlow, PyTorch, MXNet and ONNX Runtime. The adaptor layer is the bridge between the LPOT tuning strategy and
8+
the frameworks' vanilla quantization APIs.
99

10-
Extension
10+
## Adaptor Design
11+
12+
Intel® Low Precision Optimization Tool supports new adaptor extensions: implement a subclass of the `Adaptor` class in the lpot.adaptor package
13+
and register this adaptor with the `adaptor_registry` decorator.
14+
15+
For example, a user can implement an `Abc` adaptor as below:
16+
```
17+
@adaptor_registry
18+
class AbcAdaptor(Adaptor):
19+
    def __init__(self, framework_specific_info):
20+
        ...
21+
22+
    def quantize(self, tune_cfg, model, dataloader, q_func=None):
23+
        ...
24+
25+
    def evaluate(self, model, dataloader, postprocess=None,
26+
                 metric=None, measurer=None, iteration=-1, tensorboard=False):
27+
        ...
28+
29+
    def query_fw_capability(self, model):
30+
        ...
31+
32+
    def query_fused_patterns(self, model):
33+
        ...
34+
```
35+
36+
The `quantize` function performs calibration and quantization in post-training quantization.
37+
The `evaluate` function runs evaluation on the validation dataset.
38+
The `query_fw_capability` function queries the framework's quantization capability and intersects it with the user's yaml configuration settings.
39+
The `query_fused_patterns` function queries the framework's graph fusion capability and decides the fusion tuning space.
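
The registration step can be pictured with a small stand-alone sketch. Nothing here is imported from lpot: the `ADAPTORS` dictionary, the name-to-key rule, and the stub `Adaptor` base class are assumptions made only to show how a class decorator can register a subclass so that it can be looked up by name later. The real `adaptor_registry` may work differently.

```python
# Hypothetical registry, NOT the lpot implementation.
ADAPTORS = {}


def adaptor_registry(cls):
    """Register cls under a lowercase key derived from its class name.

    Assumption for this sketch: adaptor class names end with 'Adaptor'.
    """
    suffix = "Adaptor"
    if not cls.__name__.endswith(suffix):
        raise ValueError("adaptor class names are assumed to end with 'Adaptor'")
    key = cls.__name__[: -len(suffix)].lower()
    ADAPTORS[key] = cls
    return cls  # return the class unchanged so it stays usable directly


class Adaptor:
    """Stub base class standing in for the real adaptor base class."""

    def __init__(self, framework_specific_info):
        self.framework_specific_info = framework_specific_info


@adaptor_registry
class AbcAdaptor(Adaptor):
    def query_fw_capability(self, model):
        # Placeholder capability description, made up for illustration.
        return {"op_types": ["conv2d"]}


# Look the adaptor up by its registered key and instantiate it.
adaptor = ADAPTORS["abc"]({"device": "cpu"})
```

Returning the class from the decorator keeps `AbcAdaptor` importable as usual, while the dictionary lets a caller select an adaptor from a framework name read out of a configuration file.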
40+
41+
Customize a New Framework Backend
1142
=================
1243
Let us take ONNX Runtime as an example. ONNX Runtime is a backend proposed by Microsoft, and it uses MLAS kernels by default.
1344
ONNX Runtime already has [quantization tools](https://github.com/microsoft/onnxruntime/tree/master/onnxruntime/python/tools/quantization), so the question becomes how to integrate the ONNX Runtime quantization tools into LPOT.

docs/benchmark.md

Lines changed: 11 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,11 +1,12 @@
11
Benchmarking
22
===============
33

4-
Benchmakring measuring the model performance with the objective settings, user can get the performance of the models between float32 model and quantized low precision model in same scenarios that they configured in yaml. Benchmarking is always used after a quantization process.
4+
The benchmarking feature of LPOT measures model performance under the objective settings. Users can compare the performance of the float32 model and the quantized low-precision model in the same scenarios that they configured in yaml. Benchmarking is always used after a quantization process.
55

66
# How to use it
77
## Configure the evaluation field in the yaml file
8-
'''
8+
9+
```
910
evaluation: # optional. required if user doesn't provide eval_func in lpot.Quantization.
1011
  accuracy: # optional. required if user doesn't provide eval_func in lpot.Quantization.
1112
    metric:
@@ -41,20 +42,24 @@ evaluation: # optional. required if use
4142
ToTensor:
4243
Normalize:
4344
mean: [0.485, 0.456, 0.406]
44-
'''
45+
```
4546

4647
This example config has two sub-fields, 'accuracy' and 'performance', so the benchmark module reports both the accuracy and the performance of the model. Users can remove the performance field to get only the accuracy, or remove the accuracy field to get only the performance.
48+
4749
## Use a user-specific dataloader to run the benchmark
50+
4851
In this case, configure your dataloader and LPOT will construct an evaluation function to run the benchmarking.
49-
'''python
52+
53+
```python
5054
dataset = Dataset() # dataset class that implements the __getitem__ method or the __iter__ method
5155
from lpot import Benchmark
5256
evaluator = Benchmark('config.yaml') # path to the yaml file, passed as a string
5357
evaluator.dataloader(dataset, batch_size=batch_size)
5458
results = evaluator(model=input_model)
5559

56-
'''
60+
```
5761

5862
### Examples
59-
View [Benchamrk example of Tensorflow image recognition models](../examples/tensorflow/image_recognition/run_benchmarking.sh).
63+
64+
[Benchmark example](../examples/tensorflow/image_recognition/run_benchmark.sh).
6065

docs/introduction.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -26,7 +26,7 @@ The `conf_fname` parameter used in the class initialization is the path to user
2626
>
2727
> Intel® Low Precision Optimization Tool provides template yaml files for the [Post-Training Quantization](../lpot/template/ptq.yaml), [Quantization-Aware Training](../lpot/template/qat.yaml), and [Pruning](../lpot/template/pruning.yaml) scenarios. Refer to these template files to understand the meaning of each field.
2828
29-
> Note that most fields in the yaml templates are optional. View the [HelloWorld Yaml](../examples/helloworld/tf2.x/conf.yaml) example for reference.
29+
> Note that most fields in the yaml templates are optional. View the [HelloWorld Yaml](../examples/helloworld/tf_example2/conf.yaml) example for reference.
3030
3131
For TensorFlow backend, LPOT supports passing the path of keras model, frozen pb, checkpoint, saved model as the input of `model` parameter of `Quantization()`.
3232

@@ -139,4 +139,4 @@ If `dataloader` and `metric` components get fully configured by yaml, the quanti
139139
quantizer = Quantization('/path/to/user.yaml')
140140
q_model = quantizer('/path/to/model')
141141
```
142-
Examples of this usage are at [TensorFlow Classification Models](../examples/tensorflow/image_recognition/README.md).
142+
Examples of this usage are at [TensorFlow Classification Models](../examples/tensorflow/image_recognition/README.md).

docs/strategy.md

Lines changed: 0 additions & 2 deletions
This file was deleted.

docs/tensorflow_model_support.md

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -7,14 +7,14 @@ Intel® Low Precision Optimization Tool supports diffrent model formats of Tenso
77
| TensorFlow model format | Supported? | Example | Comments |
88
| ------ | ------ |------|------|
99
| frozen pb | Yes | [examples/tensorflow/image_recognition](examples/tensorflow/image_recognition), [examples/tensorflow/oob_models](examples/tensorflow/oob_models) | |
10-
| Graph object | Yes | [examples/helloworld/tf1.x](examples/helloworld/tf1.x), [examples/tensorflow/style_transfer](examples/tensorflow/style_transfer), [examples/tensorflow/recommendation/wide_deep_large_ds](examples/tensorflow/recommendation/wide_deep_large_ds) | |
10+
| Graph object | Yes | [examples/tensorflow/style_transfer](examples/tensorflow/style_transfer), [examples/tensorflow/recommendation/wide_deep_large_ds](examples/tensorflow/recommendation/wide_deep_large_ds) | |
1111
| GraphDef object | Yes | | |
12-
| tf1.x checkpoint | Yes | [examples/tensorflow/object_detection](examples/tensorflow/object_detection) | |
13-
| keras.Model object | Yes | [examples/helloworld/tf2.x](examples/helloworld/tf2.x) | |
14-
| keras saved model | Yes | [examples/helloworld/tf2.x](examples/helloworld/tf2.x) | |
12+
| tf1.x checkpoint | Yes | [examples/helloworld/tf_example4](examples/helloworld/tf_example4), [examples/tensorflow/object_detection](examples/tensorflow/object_detection) | |
13+
| keras.Model object | Yes | | |
14+
| keras saved model | Yes | [examples/helloworld/tf_example2](examples/helloworld/tf_example2) | |
1515
| tf2.x saved model | TBD | | |
1616
| tf2.x h5 format model | TBD | | |
17-
| slim checkpoint | TBD | | |
17+
| slim checkpoint | Yes | [examples/helloworld/tf_example3](examples/helloworld/tf_example3) | |
1818
| tf1.x saved model | No| | No plan to support it |
1919
| tf2.x checkpoint | No | | As tf2.x checkpoint only has weight and does not contain any description of the computation, please use different tf2.x model for quantization |
2020

@@ -27,7 +27,7 @@ from lpot import Quantization
2727
quantizer = Quantization('./conf.yaml')
2828
dataset = mnist_dataset(mnist.test.images, mnist.test.labels)
2929
data_loader = quantizer.dataloader(dataset=dataset, batch_size=1)
30-
model = frozen_pb/Graph/GraphDef/checkpoint_path/keras.Model/keras_savedmodel_path
30+
# model parameter could be one of frozen_pb, Graph, GraphDef, checkpoint_path, keras.Model and keras_savedmodel_path
3131
q_model = quantizer(frozen_pb, q_dataloader=data_loader, eval_func=eval_func)
3232

3333
```

docs/tuning_strategies.md

Lines changed: 38 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -33,7 +33,7 @@ tuning phase stops when the `accuracy` criteria is met.
3333

3434
## Configurations
3535

36-
Detailed configuration templates can be found in [`here`](ilit/template).
36+
Detailed configuration templates can be found [`here`](lpot/template).
3737

3838
### Model-specific configurations
3939

@@ -46,7 +46,7 @@ quantization: # optional. tuning constrai
4646
  approach: post_training_static_quant # optional. default value is post_training_static_quant.
4747
  calibration:
4848
    sampling_size: 1000, 2000 # optional. default value is the size of whole dataset. used to set how many portions of calibration dataset is used. exclusive with iterations field.
49-
    dataloader: # optional. if not specified, user need construct a q_dataloader in code for ilit.Quantization.
49+
    dataloader: # optional. if not specified, users need to construct a q_dataloader in code for lpot.Quantization.
5050
      dataset:
5151
        TFRecordDataset:
5252
          root: /path/to/tf_record
@@ -103,12 +103,6 @@ tuning:
103103
  random_seed: 9527 # optional. random seed for deterministic tuning.
104104
  tensorboard: True # optional. dump tensor distribution in evaluation phase for debug purpose. default value is False.
105105
```
106-
## Customize a new strategy
107-
108-
Users can use the basic `TuneStrategy` class to enable a new strategy with a
109-
new `self.next_tune_cfg()` function implementation. If the new strategy
110-
needs additional information, users can override the `self.traverse()` in
111-
the new strategy, such as `TPE` strategy.
112106
113107
### Basic
114108
@@ -296,3 +290,39 @@ tuning configs to generate a better-performance quantized model.
296290
`Random` usage is similar to `Basic`:
297291

298292
```yaml
293+
tuning:
294+
  strategy:
295+
    name: random
296+
  accuracy_criterion:
297+
    relative: 0.01
298+
  exit_policy:
299+
    timeout: 0
300+
  random_seed: 9527
301+
302+
```
303+
304+
Customize a New Tuning Strategy
305+
======================
306+
307+
Intel® Low Precision Optimization Tool supports new strategy extensions: implement a subclass of the `TuneStrategy` class in the lpot.strategy package
308+
and register this strategy with the `strategy_registry` decorator.
309+
310+
For example, a user can implement an `Abc` strategy as below:
311+
312+
```
313+
@strategy_registry
314+
class AbcTuneStrategy(TuneStrategy):
315+
    def __init__(self, model, conf, q_dataloader, q_func=None,
316+
                 eval_dataloader=None, eval_func=None, dicts=None):
317+
        ...
318+
319+
    def next_tune_cfg(self):
320+
        ...
321+
322+
```
323+
324+
The `next_tune_cfg` function yields the next tune configuration according to some algorithm or strategy. The `TuneStrategy` base class traverses
325+
the tuning space until a quantization configuration meets the pre-defined accuracy criterion.
326+
327+
If the traverse behavior of the `TuneStrategy` base class does not meet the requirements of a new strategy, the new strategy can re-implement the `traverse` function with its own logic.
328+
An example of this is the [TPE Strategy](../lpot/strategy/tpe.py).
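
The interplay between `next_tune_cfg` and the base-class traversal can be sketched as follows. This is a stand-alone illustration, not the lpot code: the simplified constructor, the `evaluate` callable, the relative-loss formula, the `granularity` candidates, and the accuracy numbers in `FAKE_ACCURACY` are all assumptions made up for the example.

```python
class TuneStrategy:
    """Simplified stand-in for a tuning strategy base class (hypothetical)."""

    def __init__(self, baseline_accuracy, relative_loss, evaluate):
        self.baseline_accuracy = baseline_accuracy  # accuracy of the fp32 model
        self.relative_loss = relative_loss          # e.g. 0.01 for a 1% budget
        self.evaluate = evaluate                    # callable: tune_cfg -> accuracy

    def next_tune_cfg(self):
        raise NotImplementedError

    def traverse(self):
        """Try configs in the order yielded until one meets the criterion."""
        for tune_cfg in self.next_tune_cfg():
            accuracy = self.evaluate(tune_cfg)
            loss = (self.baseline_accuracy - accuracy) / self.baseline_accuracy
            if loss <= self.relative_loss:
                return tune_cfg, accuracy
        return None, None  # tuning space exhausted without success


class AbcTuneStrategy(TuneStrategy):
    def next_tune_cfg(self):
        # A generator: each yield hands one candidate config to traverse().
        for granularity in ("per_tensor", "per_channel"):
            yield {"granularity": granularity}


# Made-up accuracies that stand in for a real evaluation run.
FAKE_ACCURACY = {"per_tensor": 0.70, "per_channel": 0.80}

strategy = AbcTuneStrategy(
    baseline_accuracy=0.80,
    relative_loss=0.01,
    evaluate=lambda cfg: FAKE_ACCURACY[cfg["granularity"]],
)
best_cfg, best_accuracy = strategy.traverse()
```

Because `next_tune_cfg` is a generator, a subclass only decides the order of candidates; the stopping rule stays in `traverse`, which is the method a strategy would override when it needs a different search loop.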

docs/tutorial.md

Lines changed: 19 additions & 54 deletions
Original file line numberDiff line numberDiff line change
@@ -38,110 +38,75 @@ To define a customized dataloader or evaluator for quantization, user can implem
3838

3939
Next, let's introduce how to do quantization in different scenarios.
4040

41-
# Coding free quantization
42-
The examples/helloworld/tf_coding_free demonstrates how to utilize LPOT builtin dataloader and evalautors for quantizaiton, and how to use LPOT Benchmark class for performance and accuracy measurement, and user only need to add 3 lines of launcher code for tuning. See [README](examples/helloworld/tf_coding_free/README.md)
41+
# Built-in dataloader and metric
42+
The [tf_example1](examples/helloworld/tf_example1) demonstrates how to utilize the LPOT built-in dataloader and evaluators for quantization. Users only need to add 3 lines of launcher code for tuning; see the [README](examples/helloworld/tf_example1/README.md) for more details.
4343

4444

4545
# Customized dataloader
46-
With a Keras saved model as example, [examples/helloworld/tf2.x_custom_dataloader](examples/helloworld/tf2.x_custom_dataloader] demonstrates how to define a customized dataloader.
46+
With a Keras saved model as an example, [examples/helloworld/tf_example2](examples/helloworld/tf_example2) demonstrates how to define a customized dataloader and metric for quantization.
47+
48+
First, define a dataset class on mnist. It implements the `__getitem__()` interface and returns the (image, label) pair at the given index.
4749

48-
First define a dataset class on mnist.
4950
```
5051
class Dataset(object):
5152
    def __init__(self):
52-
        # TODO:initialize dataset related info here
5353
        (train_images, train_labels), (test_images,
5454
            test_labels) = keras.datasets.fashion_mnist.load_data()
5555
        self.test_images = test_images.astype(np.float32) / 255.0
5656
        self.labels = test_labels
5757
        pass
5858
5959
    def __getitem__(self, index):
60-
        # TODO:get item magic method
61-
        # return a tuple containing 1 image and 1 label
62-
        # for example, return img, label
6360
        return self.test_images[index], self.labels[index]
6461
6562
    def __len__(self):
66-
        # TODO:get total length of dataset, such as how many images in the dataset
67-
        # if the total length is not able to know, pls implement __iter__() magic method
68-
        # rather than above two methods.
6963
        return len(self.test_images)
7064
7165
```
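
To see why `__getitem__` and `__len__` are enough for a loader to batch such a dataset, here is a dependency-free sketch. `ToyDataset` and `iterate_batches` are hypothetical names introduced for illustration; they are not how `quantizer.dataloader` is implemented, and the three samples are invented.

```python
class ToyDataset:
    """Index-style dataset: only __getitem__ and __len__ are required."""

    def __init__(self, images, labels):
        if len(images) != len(labels):
            raise ValueError("images and labels must have the same length")
        self.images = images
        self.labels = labels

    def __getitem__(self, index):
        return self.images[index], self.labels[index]

    def __len__(self):
        return len(self.images)


def iterate_batches(dataset, batch_size):
    """Yield (images, labels) batches using only len() and indexing."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        samples = [dataset[i] for i in range(start, stop)]
        images = [image for image, _ in samples]
        labels = [label for _, label in samples]
        yield images, labels


# Three invented one-pixel "images" with their labels.
dataset = ToyDataset([[0.0], [0.5], [1.0]], [0, 1, 1])
batches = list(iterate_batches(dataset, batch_size=2))
```

The last batch is simply shorter when the dataset size is not a multiple of `batch_size`; a real loader may instead offer an option to drop or pad it.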
72-
Then define a dataloader based on the mnist dataset, run quantization on the customized dataloader. q_model is the quantized model.
73-
```
74-
import lpot
75-
quantizer = lpot.Quantization('./conf.yaml')
76-
dataset = Dataset()
77-
# Define a customer data loader
78-
dataloader = quantizer.dataloader(dataset, batch_size=1)
79-
q_model = quantizer('../models/simple_model', q_dataloader = dataloader, eval_dataloader = dataloader)
80-
```
81-
# Customized evaluator
82-
Example examples/helloworld/tf2.x_custom_metric shows how to define and use a customized evaluator for quantization, this evaluator calculate accuracy and it will be registered in Qunatization object with metric() funciton. See [README](examples/helloworld/tf2.x_custom_metric/README.md)
83-
```
84-
quantizer.metric('hello_metric', MyMetric)
85-
```
66+
Then define a customized metric to calculate accuracy. The `update()` function records the prediction results, and the `result()` function returns the overall accuracy.
8667

8768
```
69+
import lpot
70+
from lpot.metric import Metric
8871
class MyMetric(Metric):
8972
    def __init__(self, *args):
90-
        # TODO:initialize metric related info here
9173
        self.pred_list = []
9274
        self.label_list = []
9375
        self.samples = 0
9476
        pass
9577
9678
    def update(self, predict, label):
97-
        # TODO:metric evaluation per evaluation
9879
        self.pred_list.extend(np.argmax(predict, axis=1))
9980
        self.label_list.extend(label)
10081
        self.samples += len(label)
10182
        pass
10283
10384
    def reset(self):
104-
        # TODO:reset variable if needed
10585
        self.pred_list = []
10686
        self.label_list = []
10787
        self.samples = 0
10888
        pass
10989
11090
    def result(self):
111-
        # TODO:calculate the whole batch final evaluation result
112-
        # return a float value which is higher-is-better.
113-
        # for example, return coco_map_value
11491
        correct_num = np.sum(
11592
            np.array(self.pred_list) == np.array(self.label_list))
11693
        return correct_num / self.samples
117-
11894
```
11995
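
The update/result protocol of such a metric can be exercised without any framework. The sketch below is a pure-Python stand-in, not the `MyMetric` class above and not an lpot class: `AccuracyMetric` is a hypothetical name, it takes the arg-max by hand instead of using NumPy, and the scores and labels are invented.

```python
class AccuracyMetric:
    """Accumulates top-1 accuracy across batches (illustrative only)."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.correct = 0
        self.samples = 0

    def update(self, predict, label):
        # predict: one list of class scores per sample; label: class indices.
        for scores, target in zip(predict, label):
            predicted = max(range(len(scores)), key=scores.__getitem__)
            self.correct += int(predicted == target)
        self.samples += len(label)

    def result(self):
        # Higher is better: fraction of samples predicted correctly.
        return self.correct / self.samples


metric = AccuracyMetric()
metric.update([[0.1, 0.9], [0.8, 0.2]], [1, 1])  # first batch: 1 of 2 correct
metric.update([[0.3, 0.7]], [1])                 # second batch: 1 of 1 correct
accuracy = metric.result()                       # 2 correct out of 3 samples
```

Keeping running counts rather than per-batch averages means the final value is not skewed when batches have different sizes.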

120-
# The interface is similiar for different TensorFlow models
121-
1. see example on tf1.x frozen pb at [tf1.x_pb](examples/helloworld/tf1.x_pb/).
122-
2. see example on tf1.x checkpoint at [tf1.x_ckpt](examples/helloworld/tf1.x_ckpt/).
123-
3. To quantize a slim .ckpt model, we need to get the graph. See a full example on slim at [examples/helloworld/tf1.x_slim](examples/helloworld/tf1.x_slim).
96+
Then define a dataloader based on the mnist dataset, and register the custom metric to run quantization. `q_model` is the generated quantized model.
12497
```
125-
import lpot
126-
quantizer = lpot.Quantization('./conf.yaml')
127-
128-
# Get graph from slim checkpoint
129-
from tf_slim.nets import inception
130-
model_func = inception.inception_v1
131-
arg_scope = inception.inception_v1_arg_scope()
132-
kwargs = {'num_classes': 1001}
133-
inputs_shape = [None, 224, 224, 3]
134-
images = tf.compat.v1.placeholder(name='input', \
135-
dtype=tf.float32, shape=inputs_shape)
13698
137-
from lpot.adaptor.tf_utils.util import get_slim_graph
138-
graph = get_slim_graph('./inception_v1.ckpt', model_func, \
139-
arg_scope, images, **kwargs)
140-
141-
# Do quantization
142-
quantized_model = quantizer(graph)
99+
import lpot
100+
quantizer = lpot.Quantization('./conf.yaml')
101+
dataset = Dataset()
102+
quantizer.metric('hello_metric', MyMetric)
103+
dataloader = quantizer.dataloader(dataset, batch_size=1)
104+
q_model = quantizer('../models/simple_model', q_dataloader = dataloader, eval_dataloader = dataloader)
143105
144106
```
145107

146-
108+
# The interface is similar for different TensorFlow models
109+
1. TensorFlow checkpoint: see [tf_example4](examples/helloworld/tf_example4)
110+
2. Enable benchmark for performance and accuracy measurement: see [tf_example5](examples/helloworld/tf_example5)
111+
3. TensorFlow slim model: see [tf_example3](examples/helloworld/tf_example3). To quantize a slim .ckpt model, we first need to get the graph; see the [README](examples/helloworld/tf_example3/README.md).
147112

examples/helloworld/tf_example2/conf.yaml

Lines changed: 2 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -4,14 +4,11 @@ model:
44
inputs: input # optional. inputs and outputs fields are only required for tensorflow backend.
55
outputs: output
66

7-
#evaluation:
8-
# accuracy:
9-
107
tuning:
118
  accuracy_criterion:
12-
    relative: 0.01 # the tuning target of accuracy loss percentage: 1%
9+
    relative: 0.01 # the tuning target of accuracy loss percentage: 1%
1310
  exit_policy:
14-
    timeout: 100 # tuning timeout (seconds)
11+
    timeout: 100 # tuning timeout (seconds)
1512
  random_seed: 100 # random seed
1613

1714

examples/helloworld/tf_example4/conf.yaml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,7 @@ model: # mandatory. lpot uses this
55

66
quantization: # optional. tuning constraints on model-wise for advance user to reduce tuning space.
77
  calibration:
8-
    sampling_size: 20 # optional. default value is the size of whole dataset. used to set how many portions of calibration dataset is used. exclusive with iterations field.
8+
    sampling_size: 20 # optional. default value is the size of whole dataset. used to set how many portions of calibration dataset is used. exclusive with iterations field.
99

1010
evaluation: # optional. required if user doesn't provide eval_func in lpot.Quantization.
1111
  accuracy: # optional. required if user doesn't provide eval_func in lpot.Quantization.
