doc updates and fix some typos

ftian1 · ftian1 · commit e37e936bbdeb · 2020-12-31T21:35:08.000+08:00
diff --git a/README.md b/README.md
@@ -89,14 +89,15 @@ python setup.py install
 
 # Deep Dive
 
-* [Quantization](docs/Quantization.md) is the processes that enable inference and training by performing computations at low precision data type, such as fixed point integers. LPOT supports [Post-Training Quantization (static and dynamic)](docs/PTQ.md) and [Quantization-Aware Training](docs/QAT.md)
+* [Quantization](docs/Quantization.md) is the process that enables inference and training by performing computations at low-precision data types, such as fixed-point integers. LPOT supports [Post-Training Quantization](docs/PTQ.md) and [Quantization-Aware Training](docs/QAT.md).
 * [Pruning](docs/pruning.md) provides a common method for introducing sparsity in weights and activations.
 * [Benchmarking](docs/benchmark.md) introduces how to utilize the benchmark interface of LPOT.
 * [Mixed precision](docs/mixed_precision.md) introduces how to enable mixed precision, including BFP16 and int8 and FP32, on Intel platforms during tuning.
 * [Transform](docs/transform.md) introduces how to utilize LPOT buildin data processing and how to develop a custom data processing method. 
 * [Dataset](docs/dataset.md) introudces how to utilize LPOT buildin dataset and how to develop a custom dataset.
 * [Metric](docs/metric.md) introduces how to utilize LPOT buildin metric and how to develop a custom metric.
 * [TensorBoard](docs/tensorboard.md) provides tensor histogram and execution graph for tuning debugging purpose.
+* [PyTorch Deploy](docs/pytorch_model_saving.md) introduces how LPOT saves and loads quantized PyTorch models.
 
 
 # Advanced Topics
diff --git a/docs/adaptor.md b/docs/adaptor.md
@@ -1,13 +1,44 @@
 Adaptor
 =================
-1. query fw capbility
-2. parse tune config ( lpot config -> fwk capbility)
-3. (optianal) pre optimize 
-4. do the quantization
 
+## Introduction
 
+Intel® Low Precision Optimization Tool builds its low-precision inference solution on popular deep learning frameworks
+such as TensorFlow, PyTorch, MXNet and ONNX Runtime. The adaptor layer is the bridge between the LPOT tuning strategy and
+the vanilla quantization APIs of each framework.
 
-Extension
+## Adaptor Design
+
+Intel® Low Precision Optimization Tool supports new adaptor extensions: implement a subclass of the `Adaptor` class in the lpot.adaptor package
+ and register this adaptor with the `adaptor_registry` decorator.
+
+For example, a user can implement an `Abc` adaptor like below:
+```
+@adaptor_registry
+class AbcAdaptor(Adaptor):
+    def __init__(self, framework_specific_info):
+        ...
+
+    def quantize(self, tune_cfg, model, dataloader, q_func=None):
+        ...
+
+    def evaluate(self, model, dataloader, postprocess=None,
+                 metric=None, measurer=None, iteration=-1, tensorboard=False):
+        ...
+
+    def query_fw_capability(self, model):
+        ...
+
+    def query_fused_patterns(self, model):
+        ...
+```
+
+The `quantize` function is used to do calibration and quantization in post-training quantization.
+The `evaluate` function is used to run evaluation on the validation dataset.
+The `query_fw_capability` function is used to query the framework quantization capability and intersect it with the user yaml configuration setting.
+The `query_fused_patterns` function is used to query the framework graph fusion capability and decide the fusion tuning space.
+
+Customize a New Framework Backend
 =================
 Let us take onnxruntime as en example. Onnxruntime is a backend proposed by microsoft, and it's based on MLAS kernel defaultly. 
 Onnxruntime already has  [quantization tools](https://github.com/microsoft/onnxruntime/tree/master/onnxruntime/python/tools/quantization), so the question becomes how to intergrate onnxruntime quantization tools into LPOT. 
diff --git a/docs/benchmark.md b/docs/benchmark.md
@@ -1,11 +1,12 @@
 Benchmarking
 ===============
 
-Benchmakring measuring the model performance with the objective settings, user can get the performance of the models between float32 model and quantized low precision model in same scenarios that they configured in yaml. Benchmarking is always used after a quantization process.
+The benchmarking feature of LPOT measures model performance with the objective settings. Users can compare the performance of the float32 model and the quantized low-precision model in the same scenarios that they configured in yaml. Benchmarking is always used after a quantization process.
 
 # how to use it
 ## config evaluation filed in yaml file
-'''
+
+```
 evaluation:                                          # optional. required if user doesn't provide eval_func in lpot.Quantization.
   accuracy:                                          # optional. required if user doesn't provide eval_func in lpot.Quantization.
     metric:
@@ -41,20 +42,24 @@ evaluation:                                          # optional. required if use
         ToTensor:
         Normalize:
           mean: [0.485, 0.456, 0.406]
-'''
+```
 
 in this example config you can see there is 2 sub-fields named 'accuracy' and 'performance', benchmark module will get the accuracy and performance of the model. User can also remove the performance field to only get accuracy of the model or the opposite. It's flexible to configure the benchmark you want.
+
 ## use user specific dataloader to run benchmark
+
 In this case, you should config your dataloader and lpot will construct an evaluation function to run the benchmarking.
-'''python
+
+```python
 dataset = Dataset() #  dataset class that implement __getitem__ method or __iter__ method
 from lpot import Benchmark
 evaluator = Benchmark(config.yaml)
 evaluator.dataloader(dataset, batch_size=batch_size)
 results = evaluator(model=input_model)
 
-'''
+```
 
 ###Examples
-View [Benchamrk example of Tensorflow image recognition models](../examples/tensorflow/image_recognition/run_benchmarking.sh).
+
+[Benchmark example](../examples/tensorflow/image_recognition/run_benchmark.sh).
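
The dataset contract mentioned in the dataloader section above (a class that implements `__getitem__` and `__len__`) can be illustrated with a small, framework-free sketch. `ToyDataset` and `iterate_batches` are hypothetical names used only for illustration; they are not part of LPOT, and LPOT's own dataloader may batch samples differently.

```python
class ToyDataset:
    """Hypothetical in-memory dataset following the __getitem__/__len__ contract."""

    def __init__(self, images, labels):
        assert len(images) == len(labels), "images and labels must align"
        self.images = images
        self.labels = labels

    def __getitem__(self, index):
        # Return one (image, label) pair for the given index.
        return self.images[index], self.labels[index]

    def __len__(self):
        # Total number of samples in the dataset.
        return len(self.images)


def iterate_batches(dataset, batch_size):
    """Assumed batching loop: group consecutive samples into (images, labels) batches."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        samples = [dataset[i] for i in range(start, stop)]
        images, labels = zip(*samples)
        yield list(images), list(labels)
```

Any object honoring this contract can be iterated in fixed-size batches; the last batch is simply shorter when the dataset size is not a multiple of `batch_size`.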
 
diff --git a/docs/introduction.md b/docs/introduction.md
@@ -26,7 +26,7 @@ The `conf_fname` parameter used in the class initialization is the path to user
 >
 > Intel® Low Precision Optimization Tool provides template yaml files for the [Post-Training Quantization](../lpot/template/ptq.yaml), [Quantization-Aware Traing](../lpot/template/qat.yaml), and [Pruning](../lpot/template/pruning.yaml) scenarios. Refer to these template files to understand the meaning of each field.
 
-> Note that most fields in the yaml templates are optional. View the [HelloWorld Yaml](../examples/helloworld/tf2.x/conf.yaml) example for reference.
+> Note that most fields in the yaml templates are optional. View the [HelloWorld Yaml](../examples/helloworld/tf_example2/conf.yaml) example for reference.
 
 For TensorFlow backend, LPOT supports passing the path of keras model, frozen pb, checkpoint, saved model as the input of `model` parameter of `Quantization()`.
 
@@ -139,4 +139,4 @@ If `dataloader` and `metric` components get fully configured by yaml, the quanti
 quantizer = Quantization('/path/to/user.yaml')
 q_model = quantizer('/path/to/model')
 ```
-Examples of this usage are at [TensorFlow Classification Models](../examples/tensorflow/image_recognition/README.md).
+Examples of this usage are at [TensorFlow Classification Models](../examples/tensorflow/image_recognition/README.md).
diff --git a/docs/strategy.md b/docs/strategy.md
diff --git a/docs/tensorflow_model_support.md b/docs/tensorflow_model_support.md
@@ -7,14 +7,14 @@ Intel® Low Precision Optimization Tool supports diffrent model formats of Tenso
 | TensorFlow model format | Supported? | Example | Comments |
 | ------ | ------ |------|------|
 | frozen pb | Yes | [examples/tensorflow/image_recognition](examples/tensorflow/image_recognition), [examples/tensorflow/oob_models](examples/tensorflow/oob_models) | |
-| Graph object | Yes | [examples/helloworld/tf1.x](examples/helloworld/tf1.x), [examples/tensorflow/style_transfer](examples/tensorflow/style_transfer), [examples/tensorflow/recommendation/wide_deep_large_ds](examples/tensorflow/recommendation/wide_deep_large_ds) | |
+| Graph object | Yes | [examples/tensorflow/style_transfer](examples/tensorflow/style_transfer), [examples/tensorflow/recommendation/wide_deep_large_ds](examples/tensorflow/recommendation/wide_deep_large_ds) | |
 | GraphDef object | Yes | | |
-| tf1.x checkpoint | Yes | [examples/tensorflow/object_detection](examples/tensorflow/object_detection) | |
-| keras.Model object | Yes | [examples/helloworld/tf2.x](examples/helloworld/tf2.x) | |
-| keras saved model | Yes | [examples/helloworld/tf2.x](examples/helloworld/tf2.x) | |
+| tf1.x checkpoint | Yes | [examples/helloworld/tf_example4](examples/helloworld/tf_example4), [examples/tensorflow/object_detection](examples/tensorflow/object_detection) | |
+| keras.Model object | Yes | | |
+| keras saved model | Yes | [examples/helloworld/tf_example2](examples/helloworld/tf_example2) | |
 | tf2.x saved model | TBD | | |
 | tf2.x h5 format model  | TBD ||
-| slim checkpoint | TBD | |
+| slim checkpoint | Yes | [examples/helloworld/tf_example3](examples/helloworld/tf_example3) | |
 | tf1.x saved model | No| | No plan to support it |
 | tf2.x checkpoint | No | | As tf2.x checkpoint only has weight and does not contain any description of the computation, please use different tf2.x model for quantization |
 
@@ -27,7 +27,7 @@ from lpot import Quantization
 quantizer = Quantization('./conf.yaml')
 dataset = mnist_dataset(mnist.test.images, mnist.test.labels)
 data_loader = quantizer.dataloader(dataset=dataset, batch_size=1)
-model = frozen_pb/Graph/GraphDef/checkpoint_path/keras.Model/keras_savedmodel_path
+# model parameter could be one of frozen_pb, Graph, GraphDef, checkpoint_path, keras.Model and keras_savedmodel_path
 q_model = quantizer(frozen_pb, q_dataloader=data_loader, eval_func=eval_func)
 
 ```
diff --git a/docs/tuning_strategies.md b/docs/tuning_strategies.md
@@ -33,7 +33,7 @@ tuning phase stops when the `accuracy` criteria is met.
 
 ## Configurations
 
-Detailed configuration templates can be found in [`here`](ilit/template).
+Detailed configuration templates can be found [`here`](lpot/template).
 
 ### Model-specific configurations
 
@@ -46,7 +46,7 @@ quantization:                                        # optional. tuning constrai
   approach: post_training_static_quant               # optional. default value is post_training_static_quant.
   calibration:
     sampling_size: 1000, 2000                        # optional. default value is the size of whole dataset. used to set how many portions of calibration dataset is used. exclusive with iterations field.
-    dataloader:                                      # optional. if not specified, user need construct a q_dataloader in code for ilit.Quantization.
+    dataloader:                                      # optional. if not specified, user needs to construct a q_dataloader in code for lpot.Quantization.
       dataset:
         TFRecordDataset:
           root: /path/to/tf_record
@@ -103,12 +103,6 @@ tuning:
   random_seed: 9527                                  # optional. random seed for deterministic tuning.
   tensorboard: True                                  # optional. dump tensor distribution in evaluation phase for debug purpose. default value is False.
 ```
-## Customize a new strategy
-
-Users can use the basic `TuneStrategy` class to enable a new strategy with a
-new `self.next_tune_cfg()` function implementation. If the new strategy
-needs additional information, users can override the `self.traverse()` in
-the new strategy, such as `TPE` strategy.
 
 ### Basic
 
@@ -296,3 +290,39 @@ tuning configs to generate a better-performance quantized model.
 `Random` usage is similar to `Basic`:
 
 ```yaml
+tuning:
+  strategy:
+    name: random 
+  accuracy_criterion:
+    relative:  0.01
+  exit_policy:
+    timeout: 0
+  random_seed: 9527
+
+```
+
+Customize a New Tuning Strategy
+======================
+
+Intel® Low Precision Optimization Tool supports new strategy extensions: implement a subclass of the `TuneStrategy` class in the lpot.strategy package
+ and register this strategy with the `strategy_registry` decorator.
+
+For example, a user can implement an `Abc` strategy like below:
+
+```
+@strategy_registry
+class AbcTuneStrategy(TuneStrategy):
+    def __init__(self, model, conf, q_dataloader, q_func=None,
+                 eval_dataloader=None, eval_func=None, dicts=None):
+        ...
+
+    def next_tune_cfg(self):
+        ...
+
+```
+
+The `next_tune_cfg` function is used to yield the next tune configuration according to some algorithm or strategy. The `TuneStrategy` base class will traverse
+ the whole tuning space until a quantization configuration meets the pre-defined accuracy criterion.
+
+If the traverse behavior of the `TuneStrategy` base class does not meet the new strategy's requirements, the new strategy can re-implement the `traverse` function with its own logic.
+An example of this is the [TPE Strategy](../lpot/strategy/tpe.py).
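
The register-then-traverse pattern described above can be sketched without LPOT installed. Everything below is a hypothetical stand-in written for illustration: the `STRATEGIES` dict, the simplified constructor arguments, and the name-derivation rule are assumptions, not LPOT's actual `strategy_registry` or `TuneStrategy` implementation.

```python
# Hypothetical registry; LPOT's real strategy_registry may work differently.
STRATEGIES = {}


def strategy_registry(cls):
    """Register a strategy class under a lowercase name derived from its class name."""
    suffix = "TuneStrategy"
    assert cls.__name__.endswith(suffix), "class name must end with 'TuneStrategy'"
    STRATEGIES[cls.__name__[: -len(suffix)].lower()] = cls
    return cls


class TuneStrategy:
    """Simplified base class: walks configs until one meets the accuracy target."""

    def __init__(self, tuning_space, evaluate, accuracy_target):
        self.tuning_space = tuning_space
        self.evaluate = evaluate
        self.accuracy_target = accuracy_target

    def next_tune_cfg(self):
        raise NotImplementedError

    def traverse(self):
        # Try each yielded config and stop at the first one that is good enough.
        for cfg in self.next_tune_cfg():
            if self.evaluate(cfg) >= self.accuracy_target:
                return cfg
        return None


@strategy_registry
class AbcTuneStrategy(TuneStrategy):
    def next_tune_cfg(self):
        # The simplest possible policy: yield configs in their given order.
        for cfg in self.tuning_space:
            yield cfg


def demo():
    space = [{"bits": 4}, {"bits": 8}, {"bits": 16}]
    fake_accuracy = {4: 0.60, 8: 0.75, 16: 0.76}  # made-up numbers for the sketch
    strategy = STRATEGIES["abc"](space, lambda cfg: fake_accuracy[cfg["bits"]], 0.70)
    return strategy.traverse()
```

In this sketch a subclass only decides the order in which configurations are proposed, while the base class owns the stopping rule. A strategy that needs a different stopping rule would override `traverse` instead.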
diff --git a/docs/tutorial.md b/docs/tutorial.md
@@ -38,110 +38,75 @@ To define a customized dataloader or evaluator for quantization, user can implem
 
 Next, let's introduce how to do quantization in different scenarios. 
 
-# Coding free quantization
-The examples/helloworld/tf_coding_free demonstrates how to utilize LPOT builtin dataloader and evalautors for quantizaiton, and how to use LPOT Benchmark class for performance and accuracy measurement, and user only need to add 3 lines of launcher code for tuning. See [README](examples/helloworld/tf_coding_free/README.md)
+# Built-in dataloader and metric
+The [tf_example1](examples/helloworld/tf_example1) demonstrates how to utilize the LPOT built-in dataloader and evaluators for quantization. The user only needs to add 3 lines of launcher code for tuning; see [README](examples/helloworld/tf_example1/README.md) for more details.
 
 
 # Customized dataloader
-With a Keras saved model as example, [examples/helloworld/tf2.x_custom_dataloader](examples/helloworld/tf2.x_custom_dataloader] demonstrates how to define a customized dataloader. 
+With a Keras saved model as an example, [examples/helloworld/tf_example2](examples/helloworld/tf_example2) demonstrates how to define a customized dataloader and metric for quantization.
+
+First, define a dataset class on mnist. It implements the `__getitem__()` interface and returns the (image, label) pair at the given index.
 
-First define a dataset class on mnist.
 ```
 class Dataset(object):
   def __init__(self):
-      # TODO:initialize dataset related info here
       (train_images, train_labels), (test_images,
                  test_labels) = keras.datasets.fashion_mnist.load_data()
       self.test_images = test_images.astype(np.float32) / 255.0
       self.labels = test_labels
       pass
 
   def __getitem__(self, index):
-      # TODO:get item magic method
-      # return a tuple containing 1 image and 1 label
-      # for example, return img, label
       return self.test_images[index], self.labels[index]
 
   def __len__(self):
-      # TODO:get total length of dataset, such as how many images in the dataset
-      # if the total length is not able to know, pls implement __iter__() magic method
-      # rather than above two methods.
       return len(self.test_images)
 
 ```
-Then define a dataloader based on the mnist dataset, run quantization on the customized dataloader. q_model is the quantized model. 
-```
-import lpot
-quantizer = lpot.Quantization('./conf.yaml')
-dataset = Dataset()
-# Define a customer data loader
-dataloader = quantizer.dataloader(dataset, batch_size=1)
-q_model = quantizer('../models/simple_model', q_dataloader = dataloader, eval_dataloader = dataloader)
-```
-# Customized evaluator
-Example examples/helloworld/tf2.x_custom_metric shows how to define and use a customized evaluator for quantization, this evaluator calculate accuracy and it will be registered in Qunatization object with metric() funciton. See [README](examples/helloworld/tf2.x_custom_metric/README.md)
-```
-quantizer.metric('hello_metric', MyMetric)
-```
+Then define a customized metric to calculate accuracy. The `update()` function records the prediction results and the `result()` function provides the overall accuracy.
 
 ```
+import numpy as np
+import lpot
+from lpot.metric import Metric
 class MyMetric(Metric):
   def __init__(self, *args):
-      # TODO:initialize metric related info here
       self.pred_list = []
       self.label_list = []
       self.samples = 0
       pass
 
   def update(self, predict, label):
-      # TODO:metric evaluation per evaluation
       self.pred_list.extend(np.argmax(predict, axis=1))
       self.label_list.extend(label)
       self.samples += len(label)
       pass
 
   def reset(self):
-      # TODO:reset variable if needed
       self.pred_list = []
       self.label_list = []
       self.samples = 0
       pass
 
   def result(self):
-      # TODO:calculate the whole batch final evaluation result
-      # return a float value which is higher-is-better.
-      # for example, return coco_map_value
       correct_num = np.sum(
             np.array(self.pred_list) == np.array(self.label_list))
       return correct_num / self.samples
-
 ```
 
-# The interface is similiar for different TensorFlow models
-1. see example on tf1.x frozen pb at [tf1.x_pb](examples/helloworld/tf1.x_pb/).
-2. see example on tf1.x checkpoint at [tf1.x_ckpt](examples/helloworld/tf1.x_ckpt/).
-3. To quantize a slim .ckpt model, we need to get the graph. See a full example on slim at [examples/helloworld/tf1.x_slim](examples/helloworld/tf1.x_slim). 
+Then define a dataloader based on the mnist dataset, and register the customized metric to run quantization. `q_model` is the generated quantized model.
 ```
-   import lpot
-    quantizer = lpot.Quantization('./conf.yaml')
-
-    # Get graph from slim checkpoint
-    from tf_slim.nets import inception
-    model_func = inception.inception_v1
-    arg_scope = inception.inception_v1_arg_scope()
-    kwargs = {'num_classes': 1001}
-    inputs_shape = [None, 224, 224, 3]
-    images = tf.compat.v1.placeholder(name='input', \
-    dtype=tf.float32, shape=inputs_shape)
 
-    from lpot.adaptor.tf_utils.util import get_slim_graph
-    graph = get_slim_graph('./inception_v1.ckpt', model_func, \
-            arg_scope, images, **kwargs)
-
-    # Do quantization
-    quantized_model = quantizer(graph)
+import lpot
+quantizer = lpot.Quantization('./conf.yaml')
+dataset = Dataset()
+quantizer.metric('hello_metric', MyMetric)
+dataloader = quantizer.dataloader(dataset, batch_size=1)
+q_model = quantizer('../models/simple_model', q_dataloader = dataloader, eval_dataloader = dataloader)
 
 ```
 
-
+# The interface is similar for different TensorFlow models
+1.  TensorFlow checkpoint: see [tf_example4](examples/helloworld/tf_example4)
+2.  Enable benchmark for performance and accuracy measurement: see [tf_example5](examples/helloworld/tf_example5)
+3.  TensorFlow slim model: see [tf_example3](examples/helloworld/tf_example3). To quantize a slim .ckpt model, the graph must be obtained first; see [README](examples/helloworld/tf_example3/README.md).
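
The `update()` / `reset()` / `result()` life cycle of the customized metric described earlier can be exercised in isolation. The sketch below is a pure-Python stand-in written under the assumption that predictions arrive as per-class score lists; `AccuracyMetric` is a hypothetical name that neither subclasses `lpot.metric.Metric` nor uses NumPy, so it only illustrates the accumulation logic.

```python
class AccuracyMetric:
    """Hypothetical accuracy metric mirroring the update/reset/result interface."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Clear the running counters before a new evaluation pass.
        self.correct = 0
        self.samples = 0

    def update(self, predict, label):
        # predict: one list of class scores per sample; label: one class index per sample.
        for scores, target in zip(predict, label):
            predicted_class = max(range(len(scores)), key=scores.__getitem__)
            self.correct += int(predicted_class == target)
            self.samples += 1

    def result(self):
        # Higher is better; an empty metric reports 0.0 rather than dividing by zero.
        return self.correct / self.samples if self.samples else 0.0


def demo_accuracy():
    metric = AccuracyMetric()
    metric.update([[0.1, 0.9], [0.8, 0.2]], [1, 1])  # first batch: 1 of 2 correct
    metric.update([[0.3, 0.7], [0.6, 0.4]], [1, 0])  # second batch: 2 of 2 correct
    return metric.result()
```

Keeping running counters instead of full prediction lists is a design choice of this sketch; it gives the same accuracy while using constant memory across batches.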
 
diff --git a/examples/helloworld/tf_example2/conf.yaml b/examples/helloworld/tf_example2/conf.yaml
@@ -4,14 +4,11 @@ model:
   inputs: input                                  # optional. inputs and outputs fields are only required for tensorflow backend.
   outputs: output
 
-#evaluation:
-#  accuracy:
-
 tuning:
   accuracy_criterion:
-    relative: 0.01                              # the tuning target of accuracy loss percentage: 1%
+    relative: 0.01                               # the tuning target of accuracy loss percentage: 1%
   exit_policy:
-    timeout: 100                                   # tuning timeout (seconds)
+    timeout: 100                                 # tuning timeout (seconds)
   random_seed: 100                               # random seed
 
 
diff --git a/examples/helloworld/tf_example4/conf.yaml b/examples/helloworld/tf_example4/conf.yaml
@@ -5,7 +5,7 @@ model:                                               # mandatory. lpot uses this
 
 quantization:                                        # optional. tuning constraints on model-wise for advance user to reduce tuning space.
   calibration:
-    sampling_size: 20                            # optional. default value is the size of whole dataset. used to set how many portions of calibration dataset is used. exclusive with iterations field.
+    sampling_size: 20                                # optional. default value is the size of whole dataset. used to set how many portions of calibration dataset is used. exclusive with iterations field.
  
 evaluation:                                          # optional. required if user doesn't provide eval_func in lpot.Quantization.
   accuracy:                                          # optional. required if user doesn't provide eval_func in lpot.Quantization.
diff --git a/examples/tensorflow/image_recognition/nasnet_mobile.yaml b/examples/tensorflow/image_recognition/nasnet_mobile.yaml
diff --git a/examples/tensorflow/object_detection/ssd_resnet34.yaml b/examples/tensorflow/object_detection/ssd_resnet34.yaml