TensorFlow Lite Conversion

TF Lite Conversion Comparison

This page gives guidance on using TF Lite to convert and deploy models.

We use a LeNet-like CNN model on the MNIST dataset. The workflow is general; however, the performance of the TF Lite model (compression, accuracy) will differ depending on your models and datasets.

Specifically, I am going to explain the workflow buried in the TensorFlow Lite webpage.

[Figure: TF Lite conversion decision map]

In [0]:
# !pip install -U tensorflow==2.0.0
In [0]:
!rm -rf *.tflite
!mkdir -p tmp
!rm -rf tmp/*.tflite
In [0]:
%tensorflow_version 1.15
from google.colab import files
import tensorflow as tf
from tensorflow import keras
from tensorflow import lite
import numpy as np
import matplotlib.pylab as plt
from packaging import version
from os import path
import pandas as pd
import os
from IPython.core.display import HTML
import time
%matplotlib inline

os.environ["TF_CPP_MIN_LOG_LEVEL"]="3"
ver1_flag = version.parse(tf.__version__) < version.parse("2.0")
tf.__version__
`%tensorflow_version` only switches the major version: `1.x` or `2.x`.
You set: `1.15`. This will be interpreted as: `1.x`.


TensorFlow 1.x selected.
Out[0]:
'1.15.0'

Load data

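We also create two generator factories, create_data and create_represent_data, for later use with TF Lite: the first feeds test samples to the interpreter, and the second serves as the converter's representative dataset.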

In [0]:
# load mnist data for testing

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(60000, 28, 28, 1).astype('float32') / 255
x_test = x_test.reshape(10000, 28,28, 1).astype('float32') / 255
y_train = y_train.astype('float32')
y_test = y_test.astype('float32')

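def create_data(data):
  # returns a generator function that yields one sample at a time,
  # wrapped in a list so it carries a batch dimension of 1
  def data_gen():
    for i in data:
      yield [i]
  return data_gen

def create_represent_data(data):
  # same, but each item is a list of model inputs (one input here),
  # which is the form converter.representative_dataset expects
  def data_gen():
    for i in data:
      yield [[i]]
  return data_gen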

x_train.shape
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11493376/11490434 [==============================] - 0s 0us/step
Out[0]:
(60000, 28, 28, 1)
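
To make the difference between the two generator factories concrete, the sketch below rebuilds them on a small random array and looks at the shape of what each one yields. The array `fake_images` is an assumption for illustration, standing in for `x_train`. `create_data` produces a batch of one image, while `create_represent_data` wraps that batch in one more list, with one entry per model input.

```python
import numpy as np

def create_data(data):
    def data_gen():
        for i in data:
            yield [i]          # one sample with a batch dimension of 1
    return data_gen

def create_represent_data(data):
    def data_gen():
        for i in data:
            yield [[i]]        # list of model inputs, each a batch of 1
    return data_gen

# Hypothetical stand-in for x_train: 3 random "images" of MNIST shape.
fake_images = np.random.rand(3, 28, 28, 1).astype("float32")

single = next(create_data(fake_images)())
rep = next(create_represent_data(fake_images)())

single_shape = np.asarray(single).shape   # batch of one image
rep_input_shape = np.asarray(rep[0]).shape  # first (and only) model input
print(single_shape, len(rep), rep_input_shape)
```

Both generators deliver the same (1, 28, 28, 1) tensor per step; the representative-dataset form only adds the outer list over model inputs.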

Build Keras Model

We build a simple CNN model for testing.

In [0]:
keras.backend.clear_session()
m = keras.Sequential([
                       keras.layers.Conv2D(16, 3, activation='relu', input_shape=(28,28,1)),
                       keras.layers.BatchNormalization(),
                       keras.layers.MaxPool2D(),
                       keras.layers.Conv2D(16, 3, activation='relu'),
                       keras.layers.BatchNormalization(),
                       keras.layers.MaxPool2D(),
                       keras.layers.Flatten(),
                       keras.layers.Dense(128, activation='relu'),
                       keras.layers.Dense(10, activation='softmax', )
])

m.compile(optimizer=keras.optimizers.Adam(),
          loss=keras.losses.SparseCategoricalCrossentropy(),
          metrics=[keras.metrics.SparseCategoricalAccuracy()])
In [0]:
if path.isfile("model.h5"):  # skip retraining: load the saved model if present
  m = keras.models.load_model("model.h5")
  m.compile(optimizer=keras.optimizers.Adam(),
          loss=keras.losses.SparseCategoricalCrossentropy(),
          metrics=[keras.metrics.SparseCategoricalAccuracy()])
else:
  m.fit(x_train, y_train, batch_size=128, epochs=10)
  m.save("model.h5")
Train on 60000 samples
Epoch 1/10
60000/60000 [==============================] - 39s 647us/sample - loss: 0.1670 - sparse_categorical_accuracy: 0.9497
Epoch 2/10
60000/60000 [==============================] - 37s 622us/sample - loss: 0.0486 - sparse_categorical_accuracy: 0.9854
Epoch 3/10
60000/60000 [==============================] - 37s 622us/sample - loss: 0.0328 - sparse_categorical_accuracy: 0.9900
Epoch 4/10
60000/60000 [==============================] - 40s 663us/sample - loss: 0.0237 - sparse_categorical_accuracy: 0.9926
Epoch 5/10
60000/60000 [==============================] - 39s 652us/sample - loss: 0.0188 - sparse_categorical_accuracy: 0.9940
Epoch 6/10
60000/60000 [==============================] - 39s 652us/sample - loss: 0.0123 - sparse_categorical_accuracy: 0.9961
Epoch 7/10
60000/60000 [==============================] - 39s 655us/sample - loss: 0.0125 - sparse_categorical_accuracy: 0.9959
Epoch 8/10
60000/60000 [==============================] - 39s 655us/sample - loss: 0.0087 - sparse_categorical_accuracy: 0.9974
Epoch 9/10
60000/60000 [==============================] - 39s 657us/sample - loss: 0.0081 - sparse_categorical_accuracy: 0.9974
Epoch 10/10
60000/60000 [==============================] - 39s 656us/sample - loss: 0.0068 - sparse_categorical_accuracy: 0.9976
In [0]:
m.evaluate(x_test, y_test)[1] ## accuracy
10000/10000 [==============================] - 3s 289us/sample - loss: 0.0532 - sparse_categorical_accuracy: 0.9867
Out[0]:
0.9867
In [0]:
m = keras.models.load_model("model.h5")
plain_res = m.predict(x_test)
plain_res.shape
Out[0]:
(10000, 10)
In [0]:
sum(np.argmax(plain_res, axis=1) == y_test)/len(y_test)  # test accuracy
Out[0]:
0.9867
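
The accuracy above comes from taking the argmax of each row of class scores and comparing it with the label. As a minimal sketch on toy arrays (the values of `reference`, `converted` and `labels` are made up for illustration), the same idea extends to comparing two sets of predictions, for example a Keras model against a converted one:

```python
import numpy as np

# Toy stand-ins (assumptions): 4 samples, 3 classes.
reference = np.array([[0.7, 0.2, 0.1],
                      [0.1, 0.8, 0.1],
                      [0.3, 0.3, 0.4],
                      [0.6, 0.3, 0.1]])
converted = np.array([[0.6, 0.3, 0.1],
                      [0.2, 0.7, 0.1],
                      [0.5, 0.3, 0.2],
                      [0.6, 0.3, 0.1]])
labels = np.array([0, 1, 2, 0])

ref_ids = np.argmax(reference, axis=1)    # predicted class per sample
conv_ids = np.argmax(converted, axis=1)

accuracy = float(np.mean(ref_ids == labels))   # fraction of correct predictions
mismatch = int(np.sum(ref_ids != conv_ids))    # samples where the two disagree

# Score of the reference's predicted class in both arrays:
# pair each row index with its predicted class index.
rows = np.arange(len(ref_ids))
prob_diff = reference[rows, ref_ids] - converted[rows, ref_ids]
print(accuracy, mismatch, prob_diff)
```

Note that indexing with `reference[ref_ids]` alone would pick whole rows by class id, which is why the row index array is needed.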

TF Lite conversion options

In [0]:
def get_conv(model_file):  # create tflite converter for keras model
  """
  Create TFLiteConverter from keras model
  """
  if ver1_flag:
    conv = lite.TFLiteConverter.from_keras_model_file(model_file)
  else: 
    m = keras.models.load_model(model_file)
    conv = lite.TFLiteConverter.from_keras_model(m)
  return conv
In [0]:
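def get_diff(result1, result2):
  """
  Compare two prediction arrays of shape (n_samples, n_classes).

  Returns the number of samples whose predicted class differs, and the
  per-sample difference in the score of result1's predicted class.
  """
  assert result1.shape == result2.shape
  id1 = np.argmax(result1, axis=1)
  id2 = np.argmax(result2, axis=1)
  mismatch = np.sum(id1 != id2)
  rows = np.arange(len(id1))
  # pair each row with its predicted class; result1[id1] would select whole rows by class id
  diff = result1[rows, id1] - result2[rows, id1]
  return mismatch, diff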

def get_res(filename, data_gen):  # get interpreter output
  """
  get output from tflite model
  
  filename - tflite model
  data_gen - generator for data input x
  """
  intp = lite.Interpreter(filename)
  intp.allocate_tensors()
  for i in data_gen():
    intp.set_tensor(intp.get_input_details()[0]['index'], i)
    intp.invoke()
    yield intp.get_tensor(intp.get_output_details()[0]['index'])

def get_acc(filename):
  "get accuracy from tflite model"
  data_gen = create_data(x_test)
  pred = np.squeeze([i for i in get_res(filename, data_gen)])
  return np.sum(np.argmax(pred, axis=1) == y_test) / len(y_test)

def get_res2(filename, data_gen):
  "yield (predicted class, inference time in seconds) for each input"
  intp = lite.Interpreter(filename)
  intp.allocate_tensors()

  for i in data_gen():
    t = time.monotonic()
    intp.set_tensor(intp.get_input_details()[0]['index'], i)
    intp.invoke()
    t = time.monotonic() - t
    yield np.argmax(intp.get_tensor(intp.get_output_details()[0]['index'])), t


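def get_acc_and_time(filename):
  "get accuracy plus mean and std of per-sample inference time from tflite model"
  data_gen = create_data(x_test)
  pred = np.squeeze([i for i in get_res2(filename, data_gen)])  # column 0: class id, column 1: seconds
  return np.sum(pred[:,0]==y_test)/len(y_test), np.mean(pred[:,1]), np.std(pred[:,1])  # acc, mean, std of inference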
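
The timing helpers above wrap each interpreter call between two clock readings and then summarize the per-sample durations. The sketch below shows the same pattern without TF Lite, so it can be run anywhere; `dummy_model` and `timed_calls` are hypothetical names introduced here, not part of the notebook or of TensorFlow.

```python
import time
import numpy as np

def timed_calls(fn, inputs):
    """Yield (result, elapsed_seconds) for each input."""
    for x in inputs:
        start = time.monotonic()
        out = fn(x)
        elapsed = time.monotonic() - start
        yield out, elapsed

def dummy_model(x):
    # Hypothetical stand-in for interpreter.invoke(): returns the argmax.
    return int(np.argmax(x))

inputs = np.eye(5)  # row k has its maximum at index k
results = list(timed_calls(dummy_model, inputs))

# Keep class ids and durations in separate arrays so their dtypes stay distinct.
preds = np.array([r[0] for r in results])
times = np.array([r[1] for r in results])
print(preds, times.mean(), times.std())
```

For very short calls, `time.perf_counter` is generally the better clock, since `time.monotonic` can have coarse resolution on some platforms.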

Collect all options for tflite conversion

In [0]:
# for converter.target_spec.supported_types 
type_choice = {}
if ver1_flag:
    for i in lite.constants.__all__:
        type_choice[i.lower()] = [lite.constants.__dict__[i]]
else:
    from tensorflow.lite.python import lite_constants as constants
    type_choice = {
        "float": [constants.FLOAT],      # tf.float32
        "int8": [constants.INT8],        # tf.int8
        "int32": [constants.INT32],      # tf.int32
        "int64": [constants.INT64],      # tf.int64
        "string": [constants.STRING],    # tf.string
        "uint8": [constants.QUANTIZED_UINT8],  # tf.uint8
    }
type_choice['none'] = None   
# for converter.target_spec.supported_ops
ops_choice = {
    "int8": [lite.OpsSet.TFLITE_BUILTINS_INT8],
    "tflite": [lite.OpsSet.TFLITE_BUILTINS],  # default
    "tf": [lite.OpsSet.SELECT_TF_OPS, lite.OpsSet.TFLITE_BUILTINS]
}

opt_choice = {
    "default": [lite.Optimize.DEFAULT], 
    "latency": [lite.Optimize.OPTIMIZE_FOR_LATENCY], 
    "size": [lite.Optimize.OPTIMIZE_FOR_SIZE],
    "none": []
}

# for converter.representative_dataset
data_gen2 = create_represent_data(x_train[:5000])
data_choice = {"with_data": data_gen2, "wo_data": None}
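
The representative dataset matters because integer quantization needs to know the value range of the tensors it maps to int8. As a rough illustration, the sketch below implements plain affine quantization in NumPy, where the range observed in sample data determines the scale and zero point. This is a simplified sketch under my own assumptions, not the converter's actual calibration code, and the helper names are hypothetical.

```python
import numpy as np

def affine_params(samples, qmin=-128, qmax=127):
    """Derive scale and zero point from the observed value range."""
    lo = min(float(samples.min()), 0.0)   # range must include 0.0
    hi = max(float(samples.max()), 0.0)
    scale = (hi - lo) / (qmax - qmin)     # assumes hi > lo
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    q = np.round(x / scale) + zero_point
    return np.clip(q, qmin, qmax).astype(np.int8)

def dequantize(q, scale, zero_point):
    # real value = scale * (quantized value - zero point)
    return (q.astype(np.float32) - zero_point) * scale

# Hypothetical calibration samples in [0, 1], like normalized pixel values.
samples = np.linspace(0.0, 1.0, 11, dtype=np.float32)
scale, zero_point = affine_params(samples)
q = quantize(samples, scale, zero_point)
restored = dequantize(q, scale, zero_point)
max_err = float(np.abs(restored - samples).max())
print(scale, zero_point, q.min(), q.max(), max_err)
```

With samples spanning [0, 1], the full int8 range is used and the round-trip error stays within about half a quantization step. Without any sample data there is no range to derive these parameters from.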
In [0]:
type_choice
# tflite and graphviz_dot are used to control output graph type.
Out[0]:
{'float': [tf.float32],
 'float16': [tf.float16],
 'graphviz_dot': [3],
 'int32': [tf.int32],
 'int64': [tf.int64],
 'int8': [tf.int8],
 'none': None,
 'quantized_uint8': [tf.uint8],
 'string': [tf.string],
 'tflite': [2]}
In [ ]:
%%capture convert_log 
# output has been cleared
res = []
for xdata in data_choice:
  for xopt in opt_choice:
    for xops in ops_choice:
      for xtype in type_choice:
        filename = "tmp/%s-opt(%s)-ops(%s)-type(%s).tflite"%(xdata, xopt, xops, xtype)
        print("********  %s ********" % filename)
        keras.backend.clear_session()
        try:
          conv = get_conv("model.h5")
          conv.optimizations = opt_choice[xopt]
          conv.representative_dataset = data_choice[xdata]
          conv.target_spec.supported_ops = ops_choice[xops]
          conv.target_spec.supported_types = type_choice[xtype]
          fb = conv.convert()
          msg = "success"
          with open(filename, 'wb') as f:
            f.write(fb)
          size = path.getsize(filename)
          print("finished")
          acc = get_acc_and_time(filename)
        except Exception as e:
          msg = str(e)
          print("failed - %s" % msg)
          size = None
          acc = None, None, None
        finally:
          res.append([xdata, xopt, xops, xtype, size, msg, *acc])
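
The four nested loops above enumerate every combination of the option dictionaries. As a sketch of the same enumeration, `itertools.product` can generate the grid in one flat loop. The key lists below are copied from the dictionaries in this notebook as they appear under TensorFlow 1.15, and only the file names are built here (no conversion is run).

```python
from itertools import product

# Keys of data_choice, opt_choice, ops_choice and type_choice (TF 1.15 run).
data_keys = ["with_data", "wo_data"]
opt_keys = ["default", "latency", "size", "none"]
ops_keys = ["int8", "tflite", "tf"]
type_keys = ["float", "float16", "graphviz_dot", "int32", "int64",
             "int8", "quantized_uint8", "string", "tflite", "none"]

# product() varies the last list fastest, matching the nested-loop order.
combos = list(product(data_keys, opt_keys, ops_keys, type_keys))
names = ["tmp/%s-opt(%s)-ops(%s)-type(%s).tflite" % c for c in combos]

print(len(combos))
print(names[0])
```

The grid has 2 x 4 x 3 x 10 = 240 entries, which is why the raw results table runs from row 0 to row 239.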
In [0]:
result = pd.DataFrame(res, columns=["data", "optimization", "ops", "type", "size", "status", "accuracy","mean_inference","std_inference"])
result.to_pickle("result.pkl")
In [0]:
files.download("result.pkl")
In [0]:
%%javascript
require.config({
    paths: {
        DT: '//cdn.datatables.net/1.10.19/js/jquery.dataTables.min',
    }
});

Raw results

In [0]:
HTML(result.to_html())
Out[0]:
data optimization ops type size status accuracy mean_inference std_inference
0 with_data default int8 float NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
1 with_data default int8 float16 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
2 with_data default int8 graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
3 with_data default int8 int32 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
4 with_data default int8 int64 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
5 with_data default int8 int8 60976.0 success 0.9869 0.001737 0.001835
6 with_data default int8 quantized_uint8 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
7 with_data default int8 string NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
8 with_data default int8 tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
9 with_data default int8 none 60976.0 success 0.9869 0.001721 0.000083
10 with_data default tflite float 60976.0 success 0.9869 0.001719 0.000059
11 with_data default tflite float16 114928.0 success 0.9867 0.000301 0.000026
12 with_data default tflite graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
13 with_data default tflite int32 60976.0 success 0.9869 0.001721 0.000060
14 with_data default tflite int64 60976.0 success 0.9869 0.001719 0.000059
15 with_data default tflite int8 60976.0 success 0.9869 0.001718 0.000060
16 with_data default tflite quantized_uint8 60976.0 success 0.9869 0.001722 0.000087
17 with_data default tflite string 60976.0 success 0.9869 0.001718 0.000059
18 with_data default tflite tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
19 with_data default tflite none 60976.0 success 0.9869 0.001744 0.002122
20 with_data default tf float 60976.0 success 0.9869 0.001716 0.000054
21 with_data default tf float16 114928.0 success 0.9867 0.000299 0.000021
22 with_data default tf graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
23 with_data default tf int32 60976.0 success 0.9869 0.001726 0.000068
24 with_data default tf int64 60976.0 success 0.9869 0.001716 0.000058
25 with_data default tf int8 60976.0 success 0.9869 0.001716 0.000070
26 with_data default tf quantized_uint8 60976.0 success 0.9869 0.001727 0.000104
27 with_data default tf string 60976.0 success 0.9869 0.001713 0.000051
28 with_data default tf tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
29 with_data default tf none 60976.0 success 0.9869 0.001715 0.000049
30 with_data latency int8 float NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
31 with_data latency int8 float16 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
32 with_data latency int8 graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
33 with_data latency int8 int32 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
34 with_data latency int8 int64 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
35 with_data latency int8 int8 60976.0 success 0.9869 0.001721 0.000119
36 with_data latency int8 quantized_uint8 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
37 with_data latency int8 string NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
38 with_data latency int8 tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
39 with_data latency int8 none 60976.0 success 0.9869 0.001716 0.000070
40 with_data latency tflite float 60976.0 success 0.9869 0.001716 0.000072
41 with_data latency tflite float16 114928.0 success 0.9867 0.000301 0.000024
42 with_data latency tflite graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
43 with_data latency tflite int32 60976.0 success 0.9869 0.001719 0.000080
44 with_data latency tflite int64 60976.0 success 0.9869 0.001718 0.000060
45 with_data latency tflite int8 60976.0 success 0.9869 0.001716 0.000063
46 with_data latency tflite quantized_uint8 60976.0 success 0.9869 0.001717 0.000060
47 with_data latency tflite string 60976.0 success 0.9869 0.001727 0.000079
48 with_data latency tflite tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
49 with_data latency tflite none 60976.0 success 0.9869 0.001715 0.000067
50 with_data latency tf float 60976.0 success 0.9869 0.001714 0.000049
51 with_data latency tf float16 114928.0 success 0.9867 0.000303 0.000030
52 with_data latency tf graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
53 with_data latency tf int32 60976.0 success 0.9869 0.001725 0.000072
54 with_data latency tf int64 60976.0 success 0.9869 0.001716 0.000058
55 with_data latency tf int8 60976.0 success 0.9869 0.001739 0.002140
56 with_data latency tf quantized_uint8 60976.0 success 0.9869 0.001713 0.000066
57 with_data latency tf string 60976.0 success 0.9869 0.001709 0.000040
58 with_data latency tf tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
59 with_data latency tf none 60976.0 success 0.9869 0.001714 0.000067
60 with_data size int8 float NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
61 with_data size int8 float16 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
62 with_data size int8 graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
63 with_data size int8 int32 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
64 with_data size int8 int64 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
65 with_data size int8 int8 60976.0 success 0.9869 0.001711 0.000049
66 with_data size int8 quantized_uint8 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
67 with_data size int8 string NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
68 with_data size int8 tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
69 with_data size int8 none 60976.0 success 0.9869 0.001717 0.000065
70 with_data size tflite float 60976.0 success 0.9869 0.001717 0.000055
71 with_data size tflite float16 114928.0 success 0.9867 0.000300 0.000022
72 with_data size tflite graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
73 with_data size tflite int32 60976.0 success 0.9869 0.001721 0.000069
74 with_data size tflite int64 60976.0 success 0.9869 0.001717 0.000057
75 with_data size tflite int8 60976.0 success 0.9869 0.001717 0.000059
76 with_data size tflite quantized_uint8 60976.0 success 0.9869 0.001717 0.000070
77 with_data size tflite string 60976.0 success 0.9869 0.001743 0.002344
78 with_data size tflite tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
79 with_data size tflite none 60976.0 success 0.9869 0.001716 0.000053
80 with_data size tf float 60976.0 success 0.9869 0.001720 0.000098
81 with_data size tf float16 114928.0 success 0.9867 0.000298 0.000024
82 with_data size tf graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
83 with_data size tf int32 60976.0 success 0.9869 0.001716 0.000052
84 with_data size tf int64 60976.0 success 0.9869 0.001740 0.002122
85 with_data size tf int8 60976.0 success 0.9869 0.001719 0.000070
86 with_data size tf quantized_uint8 60976.0 success 0.9869 0.001713 0.000043
87 with_data size tf string 60976.0 success 0.9869 0.001716 0.000061
88 with_data size tf tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
89 with_data size tf none 60976.0 success 0.9869 0.001716 0.000057
90 with_data none int8 float NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
91 with_data none int8 float16 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
92 with_data none int8 graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
93 with_data none int8 int32 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
94 with_data none int8 int64 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
95 with_data none int8 int8 60976.0 success 0.9869 0.001721 0.000063
96 with_data none int8 quantized_uint8 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
97 with_data none int8 string NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
98 with_data none int8 tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
99 with_data none int8 none 60976.0 success 0.9869 0.001717 0.000071
100 with_data none tflite float 223952.0 success 0.9867 0.000300 0.000032
101 with_data none tflite float16 223952.0 success 0.9867 0.000300 0.000021
102 with_data none tflite graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
103 with_data none tflite int32 223952.0 success 0.9867 0.000299 0.000022
104 with_data none tflite int64 223952.0 success 0.9867 0.000303 0.000026
105 with_data none tflite int8 60976.0 success 0.9869 0.001721 0.000061
106 with_data none tflite quantized_uint8 223952.0 success 0.9867 0.000303 0.000023
107 with_data none tflite string 223952.0 success 0.9867 0.000303 0.000028
108 with_data none tflite tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
109 with_data none tflite none 223952.0 success 0.9867 0.000303 0.000023
110 with_data none tf float 223952.0 success 0.9867 0.000302 0.000024
111 with_data none tf float16 223952.0 success 0.9867 0.000301 0.000025
112 with_data none tf graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
113 with_data none tf int32 223952.0 success 0.9867 0.000303 0.000034
114 with_data none tf int64 223952.0 success 0.9867 0.000300 0.000022
115 with_data none tf int8 60976.0 success 0.9869 0.001720 0.000062
116 with_data none tf quantized_uint8 223952.0 success 0.9867 0.000300 0.000022
117 with_data none tf string 223952.0 success 0.9867 0.000301 0.000026
118 with_data none tf tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
119 with_data none tf none 223952.0 success 0.9867 0.000299 0.000022
120 wo_data default int8 float NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
121 wo_data default int8 float16 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
122 wo_data default int8 graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
123 wo_data default int8 int32 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
124 wo_data default int8 int64 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
125 wo_data default int8 int8 NaN representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types. NaN NaN NaN
126 wo_data default int8 quantized_uint8 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
127 wo_data default int8 string NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
128 wo_data default int8 tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
129 wo_data default int8 none NaN representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types. NaN NaN NaN
130 wo_data default tflite float 59648.0 success 0.9864 0.000203 0.000023
131 wo_data default tflite float16 114928.0 success 0.9867 0.000301 0.000026
132 wo_data default tflite graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
133 wo_data default tflite int32 59648.0 success 0.9864 0.000193 0.000018
134 wo_data default tflite int64 59648.0 success 0.9864 0.000190 0.000020
135 wo_data default tflite int8 NaN representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types. NaN NaN NaN
136 wo_data default tflite quantized_uint8 59648.0 success 0.9864 0.000191 0.000017
137 wo_data default tflite string 59648.0 success 0.9864 0.000196 0.000019
138 wo_data default tflite tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
139 wo_data default tflite none 59648.0 success 0.9864 0.000191 0.000016
140 wo_data default tf float 59648.0 success 0.9864 0.000192 0.000019
141 wo_data default tf float16 114928.0 success 0.9867 0.000301 0.000025
142 wo_data default tf graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
143 wo_data default tf int32 59648.0 success 0.9864 0.000194 0.000017
144 wo_data default tf int64 59648.0 success 0.9864 0.000188 0.000016
145 wo_data default tf int8 NaN representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types. NaN NaN NaN
146 wo_data default tf quantized_uint8 59648.0 success 0.9864 0.000198 0.000025
147 wo_data default tf string 59648.0 success 0.9864 0.000197 0.000018
148 wo_data default tf tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
149 wo_data default tf none 59648.0 success 0.9864 0.000189 0.000014
150 wo_data latency int8 float NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
151 wo_data latency int8 float16 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
152 wo_data latency int8 graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
153 wo_data latency int8 int32 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
154 wo_data latency int8 int64 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
155 wo_data latency int8 int8 NaN representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types. NaN NaN NaN
156 wo_data latency int8 quantized_uint8 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
157 wo_data latency int8 string NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
158 wo_data latency int8 tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
159 wo_data latency int8 none NaN representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types. NaN NaN NaN
160 wo_data latency tflite float 59648.0 success 0.9864 0.000195 0.000017
161 wo_data latency tflite float16 114928.0 success 0.9867 0.000302 0.000030
162 wo_data latency tflite graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
163 wo_data latency tflite int32 59648.0 success 0.9864 0.000197 0.000017
164 wo_data latency tflite int64 59648.0 success 0.9864 0.000213 0.002201
165 wo_data latency tflite int8 NaN representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types. NaN NaN NaN
166 wo_data latency tflite quantized_uint8 59648.0 success 0.9864 0.000194 0.000018
167 wo_data latency tflite string 59648.0 success 0.9864 0.000190 0.000018
168 wo_data latency tflite tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
169 wo_data latency tflite none 59648.0 success 0.9864 0.000189 0.000016
170 wo_data latency tf float 59648.0 success 0.9864 0.000188 0.000014
171 wo_data latency tf float16 114928.0 success 0.9867 0.000321 0.002138
172 wo_data latency tf graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
173 wo_data latency tf int32 59648.0 success 0.9864 0.000196 0.000018
174 wo_data latency tf int64 59648.0 success 0.9864 0.000191 0.000017
175 wo_data latency tf int8 NaN representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types. NaN NaN NaN
176 wo_data latency tf quantized_uint8 59648.0 success 0.9864 0.000195 0.000024
177 wo_data latency tf string 59648.0 success 0.9864 0.000202 0.000022
178 wo_data latency tf tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
179 wo_data latency tf none 59648.0 success 0.9864 0.000190 0.000016
180 wo_data size int8 float NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
181 wo_data size int8 float16 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
182 wo_data size int8 graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
183 wo_data size int8 int32 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
184 wo_data size int8 int64 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
185 wo_data size int8 int8 NaN representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types. NaN NaN NaN
186 wo_data size int8 quantized_uint8 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
187 wo_data size int8 string NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
188 wo_data size int8 tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
189 wo_data size int8 none NaN representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types. NaN NaN NaN
190 wo_data size tflite float 59648.0 success 0.9864 0.000192 0.000017
191 wo_data size tflite float16 114928.0 success 0.9867 0.000299 0.000025
192 wo_data size tflite graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
193 wo_data size tflite int32 59648.0 success 0.9864 0.000203 0.000021
194 wo_data size tflite int64 59648.0 success 0.9864 0.000194 0.000022
195 wo_data size tflite int8 NaN representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types. NaN NaN NaN
196 wo_data size tflite quantized_uint8 59648.0 success 0.9864 0.000197 0.000020
197 wo_data size tflite string 59648.0 success 0.9864 0.000194 0.000018
198 wo_data size tflite tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
199 wo_data size tflite none 59648.0 success 0.9864 0.000200 0.000018
200 wo_data size tf float 59648.0 success 0.9864 0.000212 0.002256
201 wo_data size tf float16 114928.0 success 0.9867 0.000310 0.000043
202 wo_data size tf graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
203 wo_data size tf int32 59648.0 success 0.9864 0.000192 0.000016
204 wo_data size tf int64 59648.0 success 0.9864 0.000194 0.000020
205 wo_data size tf int8 NaN representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types. NaN NaN NaN
206 wo_data size tf quantized_uint8 59648.0 success 0.9864 0.000191 0.000015
207 wo_data size tf string 59648.0 success 0.9864 0.000213 0.002210
208 wo_data size tf tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
209 wo_data size tf none 59648.0 success 0.9864 0.000195 0.000019
210 wo_data none int8 float NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
211 wo_data none int8 float16 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
212 wo_data none int8 graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
213 wo_data none int8 int32 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
214 wo_data none int8 int64 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
215 wo_data none int8 int8 NaN representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types. NaN NaN NaN
216 wo_data none int8 quantized_uint8 NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
217 wo_data none int8 string NaN TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8. NaN NaN NaN
218 wo_data none int8 tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
219 wo_data none int8 none NaN representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types. NaN NaN NaN
220 wo_data none tflite float 223952.0 success 0.9867 0.000301 0.000023
221 wo_data none tflite float16 223952.0 success 0.9867 0.000322 0.002276
222 wo_data none tflite graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
223 wo_data none tflite int32 223952.0 success 0.9867 0.000301 0.000025
224 wo_data none tflite int64 223952.0 success 0.9867 0.000302 0.000031
225 wo_data none tflite int8 NaN representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types. NaN NaN NaN
226 wo_data none tflite quantized_uint8 223952.0 success 0.9867 0.000299 0.000021
227 wo_data none tflite string 223952.0 success 0.9867 0.000300 0.000022
228 wo_data none tflite tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
229 wo_data none tflite none 223952.0 success 0.9867 0.000300 0.000021
230 wo_data none tf float 223952.0 success 0.9867 0.000300 0.000022
231 wo_data none tf float16 223952.0 success 0.9867 0.000301 0.000048
232 wo_data none tf graphviz_dot NaN 'int' object has no attribute 'size' NaN NaN NaN
233 wo_data none tf int32 223952.0 success 0.9867 0.000301 0.000028
234 wo_data none tf int64 223952.0 success 0.9867 0.000299 0.000024
235 wo_data none tf int8 NaN representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types. NaN NaN NaN
236 wo_data none tf quantized_uint8 223952.0 success 0.9867 0.000300 0.000023
237 wo_data none tf string 223952.0 success 0.9867 0.000300 0.000025
238 wo_data none tf tflite NaN 'int' object has no attribute 'size' NaN NaN NaN
239 wo_data none tf none 223952.0 success 0.9867 0.000300 0.000026

Finished tf lite models

In [0]:
HTML(result.dropna().to_html())
Out[0]:
data optimization ops type size status accuracy mean_inference std_inference
5 with_data default int8 int8 60976.0 success 0.9869 0.001737 0.001835
9 with_data default int8 none 60976.0 success 0.9869 0.001721 0.000083
10 with_data default tflite float 60976.0 success 0.9869 0.001719 0.000059
11 with_data default tflite float16 114928.0 success 0.9867 0.000301 0.000026
13 with_data default tflite int32 60976.0 success 0.9869 0.001721 0.000060
14 with_data default tflite int64 60976.0 success 0.9869 0.001719 0.000059
15 with_data default tflite int8 60976.0 success 0.9869 0.001718 0.000060
16 with_data default tflite quantized_uint8 60976.0 success 0.9869 0.001722 0.000087
17 with_data default tflite string 60976.0 success 0.9869 0.001718 0.000059
19 with_data default tflite none 60976.0 success 0.9869 0.001744 0.002122
20 with_data default tf float 60976.0 success 0.9869 0.001716 0.000054
21 with_data default tf float16 114928.0 success 0.9867 0.000299 0.000021
23 with_data default tf int32 60976.0 success 0.9869 0.001726 0.000068
24 with_data default tf int64 60976.0 success 0.9869 0.001716 0.000058
25 with_data default tf int8 60976.0 success 0.9869 0.001716 0.000070
26 with_data default tf quantized_uint8 60976.0 success 0.9869 0.001727 0.000104
27 with_data default tf string 60976.0 success 0.9869 0.001713 0.000051
29 with_data default tf none 60976.0 success 0.9869 0.001715 0.000049
35 with_data latency int8 int8 60976.0 success 0.9869 0.001721 0.000119
39 with_data latency int8 none 60976.0 success 0.9869 0.001716 0.000070
40 with_data latency tflite float 60976.0 success 0.9869 0.001716 0.000072
41 with_data latency tflite float16 114928.0 success 0.9867 0.000301 0.000024
43 with_data latency tflite int32 60976.0 success 0.9869 0.001719 0.000080
44 with_data latency tflite int64 60976.0 success 0.9869 0.001718 0.000060
45 with_data latency tflite int8 60976.0 success 0.9869 0.001716 0.000063
46 with_data latency tflite quantized_uint8 60976.0 success 0.9869 0.001717 0.000060
47 with_data latency tflite string 60976.0 success 0.9869 0.001727 0.000079
49 with_data latency tflite none 60976.0 success 0.9869 0.001715 0.000067
50 with_data latency tf float 60976.0 success 0.9869 0.001714 0.000049
51 with_data latency tf float16 114928.0 success 0.9867 0.000303 0.000030
53 with_data latency tf int32 60976.0 success 0.9869 0.001725 0.000072
54 with_data latency tf int64 60976.0 success 0.9869 0.001716 0.000058
55 with_data latency tf int8 60976.0 success 0.9869 0.001739 0.002140
56 with_data latency tf quantized_uint8 60976.0 success 0.9869 0.001713 0.000066
57 with_data latency tf string 60976.0 success 0.9869 0.001709 0.000040
59 with_data latency tf none 60976.0 success 0.9869 0.001714 0.000067
65 with_data size int8 int8 60976.0 success 0.9869 0.001711 0.000049
69 with_data size int8 none 60976.0 success 0.9869 0.001717 0.000065
70 with_data size tflite float 60976.0 success 0.9869 0.001717 0.000055
71 with_data size tflite float16 114928.0 success 0.9867 0.000300 0.000022
73 with_data size tflite int32 60976.0 success 0.9869 0.001721 0.000069
74 with_data size tflite int64 60976.0 success 0.9869 0.001717 0.000057
75 with_data size tflite int8 60976.0 success 0.9869 0.001717 0.000059
76 with_data size tflite quantized_uint8 60976.0 success 0.9869 0.001717 0.000070
77 with_data size tflite string 60976.0 success 0.9869 0.001743 0.002344
79 with_data size tflite none 60976.0 success 0.9869 0.001716 0.000053
80 with_data size tf float 60976.0 success 0.9869 0.001720 0.000098
81 with_data size tf float16 114928.0 success 0.9867 0.000298 0.000024
83 with_data size tf int32 60976.0 success 0.9869 0.001716 0.000052
84 with_data size tf int64 60976.0 success 0.9869 0.001740 0.002122
85 with_data size tf int8 60976.0 success 0.9869 0.001719 0.000070
86 with_data size tf quantized_uint8 60976.0 success 0.9869 0.001713 0.000043
87 with_data size tf string 60976.0 success 0.9869 0.001716 0.000061
89 with_data size tf none 60976.0 success 0.9869 0.001716 0.000057
95 with_data none int8 int8 60976.0 success 0.9869 0.001721 0.000063
99 with_data none int8 none 60976.0 success 0.9869 0.001717 0.000071
100 with_data none tflite float 223952.0 success 0.9867 0.000300 0.000032
101 with_data none tflite float16 223952.0 success 0.9867 0.000300 0.000021
103 with_data none tflite int32 223952.0 success 0.9867 0.000299 0.000022
104 with_data none tflite int64 223952.0 success 0.9867 0.000303 0.000026
105 with_data none tflite int8 60976.0 success 0.9869 0.001721 0.000061
106 with_data none tflite quantized_uint8 223952.0 success 0.9867 0.000303 0.000023
107 with_data none tflite string 223952.0 success 0.9867 0.000303 0.000028
109 with_data none tflite none 223952.0 success 0.9867 0.000303 0.000023
110 with_data none tf float 223952.0 success 0.9867 0.000302 0.000024
111 with_data none tf float16 223952.0 success 0.9867 0.000301 0.000025
113 with_data none tf int32 223952.0 success 0.9867 0.000303 0.000034
114 with_data none tf int64 223952.0 success 0.9867 0.000300 0.000022
115 with_data none tf int8 60976.0 success 0.9869 0.001720 0.000062
116 with_data none tf quantized_uint8 223952.0 success 0.9867 0.000300 0.000022
117 with_data none tf string 223952.0 success 0.9867 0.000301 0.000026
119 with_data none tf none 223952.0 success 0.9867 0.000299 0.000022
130 wo_data default tflite float 59648.0 success 0.9864 0.000203 0.000023
131 wo_data default tflite float16 114928.0 success 0.9867 0.000301 0.000026
133 wo_data default tflite int32 59648.0 success 0.9864 0.000193 0.000018
134 wo_data default tflite int64 59648.0 success 0.9864 0.000190 0.000020
136 wo_data default tflite quantized_uint8 59648.0 success 0.9864 0.000191 0.000017
137 wo_data default tflite string 59648.0 success 0.9864 0.000196 0.000019
139 wo_data default tflite none 59648.0 success 0.9864 0.000191 0.000016
140 wo_data default tf float 59648.0 success 0.9864 0.000192 0.000019
141 wo_data default tf float16 114928.0 success 0.9867 0.000301 0.000025
143 wo_data default tf int32 59648.0 success 0.9864 0.000194 0.000017
144 wo_data default tf int64 59648.0 success 0.9864 0.000188 0.000016
146 wo_data default tf quantized_uint8 59648.0 success 0.9864 0.000198 0.000025
147 wo_data default tf string 59648.0 success 0.9864 0.000197 0.000018
149 wo_data default tf none 59648.0 success 0.9864 0.000189 0.000014
160 wo_data latency tflite float 59648.0 success 0.9864 0.000195 0.000017
161 wo_data latency tflite float16 114928.0 success 0.9867 0.000302 0.000030
163 wo_data latency tflite int32 59648.0 success 0.9864 0.000197 0.000017
164 wo_data latency tflite int64 59648.0 success 0.9864 0.000213 0.002201
166 wo_data latency tflite quantized_uint8 59648.0 success 0.9864 0.000194 0.000018
167 wo_data latency tflite string 59648.0 success 0.9864 0.000190 0.000018
169 wo_data latency tflite none 59648.0 success 0.9864 0.000189 0.000016
170 wo_data latency tf float 59648.0 success 0.9864 0.000188 0.000014
171 wo_data latency tf float16 114928.0 success 0.9867 0.000321 0.002138
173 wo_data latency tf int32 59648.0 success 0.9864 0.000196 0.000018
174 wo_data latency tf int64 59648.0 success 0.9864 0.000191 0.000017
176 wo_data latency tf quantized_uint8 59648.0 success 0.9864 0.000195 0.000024
177 wo_data latency tf string 59648.0 success 0.9864 0.000202 0.000022
179 wo_data latency tf none 59648.0 success 0.9864 0.000190 0.000016
190 wo_data size tflite float 59648.0 success 0.9864 0.000192 0.000017
191 wo_data size tflite float16 114928.0 success 0.9867 0.000299 0.000025
193 wo_data size tflite int32 59648.0 success 0.9864 0.000203 0.000021
194 wo_data size tflite int64 59648.0 success 0.9864 0.000194 0.000022
196 wo_data size tflite quantized_uint8 59648.0 success 0.9864 0.000197 0.000020
197 wo_data size tflite string 59648.0 success 0.9864 0.000194 0.000018
199 wo_data size tflite none 59648.0 success 0.9864 0.000200 0.000018
200 wo_data size tf float 59648.0 success 0.9864 0.000212 0.002256
201 wo_data size tf float16 114928.0 success 0.9867 0.000310 0.000043
203 wo_data size tf int32 59648.0 success 0.9864 0.000192 0.000016
204 wo_data size tf int64 59648.0 success 0.9864 0.000194 0.000020
206 wo_data size tf quantized_uint8 59648.0 success 0.9864 0.000191 0.000015
207 wo_data size tf string 59648.0 success 0.9864 0.000213 0.002210
209 wo_data size tf none 59648.0 success 0.9864 0.000195 0.000019
220 wo_data none tflite float 223952.0 success 0.9867 0.000301 0.000023
221 wo_data none tflite float16 223952.0 success 0.9867 0.000322 0.002276
223 wo_data none tflite int32 223952.0 success 0.9867 0.000301 0.000025
224 wo_data none tflite int64 223952.0 success 0.9867 0.000302 0.000031
226 wo_data none tflite quantized_uint8 223952.0 success 0.9867 0.000299 0.000021
227 wo_data none tflite string 223952.0 success 0.9867 0.000300 0.000022
229 wo_data none tflite none 223952.0 success 0.9867 0.000300 0.000021
230 wo_data none tf float 223952.0 success 0.9867 0.000300 0.000022
231 wo_data none tf float16 223952.0 success 0.9867 0.000301 0.000048
233 wo_data none tf int32 223952.0 success 0.9867 0.000301 0.000028
234 wo_data none tf int64 223952.0 success 0.9867 0.000299 0.000024
236 wo_data none tf quantized_uint8 223952.0 success 0.9867 0.000300 0.000023
237 wo_data none tf string 223952.0 success 0.9867 0.000300 0.000025
239 wo_data none tf none 223952.0 success 0.9867 0.000300 0.000026
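
The successful rows fall into only four distinct file sizes. As a quick sanity check, the ratios against the unoptimized float32 model can be worked out with plain arithmetic. The sizes are copied from the table above; the labels are my own shorthand, not names used by TF Lite.

```python
# File sizes in bytes, copied from the table above.
# The labels are informal shorthand, not TF Lite terminology.
sizes = {
    "float32 (no optimization)": 223952,
    "float16 weights": 114928,
    "int8 with representative data": 60976,
    "int8 weights only (no data)": 59648,
}

baseline = sizes["float32 (no optimization)"]
ratios = {label: baseline / size for label, size in sizes.items()}

for label, ratio in ratios.items():
    print("%-32s %7d bytes  %.2fx smaller" % (label, sizes[label], ratio))
```

In the same table the 60976-byte models show a mean_inference of about 0.0017, against about 0.0003 for the float32 and float16 models and about 0.0002 for the 59648-byte ones, so the smallest files were not the fastest in this run.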

TF Lite Interpreter result details

In [0]:
data_gen = create_data(x_test)

Plain TF Lite Convert

In [0]:
plain_tflite = np.squeeze([i for i in get_res("tmp/wo_data-opt(none)-ops(tf)-type(float).tflite", data_gen)])
mismatch, diff = get_diff(plain_res, plain_tflite)
plt.hist(diff)
plt.title("total mismatch=%d"%mismatch)
plt.show()
In [0]:
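def print_inpt(filename):
  # Print index, dtype, name, shape and (scale, zero_point) of every tensor in the model.
  intp = lite.Interpreter(filename)
  for i in intp.get_tensor_details():
    dtype_name = np.dtype(i['dtype']).name  # e.g. "float32" or "int8"
    print("\t".join(["%d"%i['index'], dtype_name, i['name'], "%s"%i['shape'], "(%f,%f)"%i['quantization']]))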
In [0]:
print_inpt("tmp/wo_data-opt(none)-ops(tf)-type(float).tflite")
0	float32	batch_normalization/FusedBatchNormV3	[ 1 26 26 16]	(0.000000,0.000000)
1	float32	batch_normalization/FusedBatchNormV3_add_param	[16]	(0.000000,0.000000)
2	float32	batch_normalization/FusedBatchNormV3_mul_0	[ 1 26 26 16]	(0.000000,0.000000)
3	float32	batch_normalization/FusedBatchNormV3_mul_0_param	[16]	(0.000000,0.000000)
4	float32	batch_normalization_1/FusedBatchNormV3	[ 1 11 11 16]	(0.000000,0.000000)
5	float32	batch_normalization_1/FusedBatchNormV3_add_param	[16]	(0.000000,0.000000)
6	float32	batch_normalization_1/FusedBatchNormV3_mul_0	[ 1 11 11 16]	(0.000000,0.000000)
7	float32	batch_normalization_1/FusedBatchNormV3_mul_0_param	[16]	(0.000000,0.000000)
8	float32	conv2d/Conv2D_bias	[16]	(0.000000,0.000000)
9	float32	conv2d/Relu	[ 1 26 26 16]	(0.000000,0.000000)
10	float32	conv2d/kernel	[ 1  3  3 16]	(0.000000,0.000000)
11	float32	conv2d_1/Conv2D_bias	[16]	(0.000000,0.000000)
12	float32	conv2d_1/Relu	[ 1 11 11 16]	(0.000000,0.000000)
13	float32	conv2d_1/kernel	[16  3  3 16]	(0.000000,0.000000)
14	float32	conv2d_input	[ 1 28 28  1]	(0.000000,0.000000)
15	float32	dense/MatMul_bias	[128]	(0.000000,0.000000)
16	float32	dense/Relu	[  1 128]	(0.000000,0.000000)
17	float32	dense/kernel/transpose	[128 400]	(0.000000,0.000000)
18	float32	dense_1/BiasAdd	[ 1 10]	(0.000000,0.000000)
19	float32	dense_1/MatMul_bias	[10]	(0.000000,0.000000)
20	float32	dense_1/Softmax	[ 1 10]	(0.000000,0.000000)
21	float32	dense_1/kernel/transpose	[ 10 128]	(0.000000,0.000000)
22	float32	max_pooling2d/MaxPool	[ 1 13 13 16]	(0.000000,0.000000)
23	float32	max_pooling2d_1/MaxPool	[ 1  5  5 16]	(0.000000,0.000000)
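
The listing also accounts for most of the 223952-byte file. Below is a rough estimate that counts the values in the tensors I take to be stored constants (the ones named `*_param`, `*_bias`, `kernel` and `kernel/transpose`). That selection is my reading of the names, not something the interpreter reports.

```python
from math import prod

# Shapes copied from the listing above.
# Treating these tensors as the stored constants is an assumption based on their names.
constant_shapes = {
    "batch_normalization/FusedBatchNormV3_add_param": (16,),
    "batch_normalization/FusedBatchNormV3_mul_0_param": (16,),
    "batch_normalization_1/FusedBatchNormV3_add_param": (16,),
    "batch_normalization_1/FusedBatchNormV3_mul_0_param": (16,),
    "conv2d/Conv2D_bias": (16,),
    "conv2d/kernel": (1, 3, 3, 16),
    "conv2d_1/Conv2D_bias": (16,),
    "conv2d_1/kernel": (16, 3, 3, 16),
    "dense/MatMul_bias": (128,),
    "dense/kernel/transpose": (128, 400),
    "dense_1/MatMul_bias": (10,),
    "dense_1/kernel/transpose": (10, 128),
}

n_values = sum(prod(shape) for shape in constant_shapes.values())
float32_bytes = 4 * n_values  # 4 bytes per float32 value
print(n_values, float32_bytes)
```

This gives 55162 values, or 220648 bytes as float32, which leaves roughly 3 KB of the file for the graph structure and metadata.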

TF default optimization

In [0]:
plain_opt = np.squeeze([i for i in get_res("tmp/wo_data-opt(default)-ops(tflite)-type(float).tflite", data_gen)])
mismatch, diff = get_diff(plain_res, plain_opt)
plt.hist(diff)
plt.title("total mismatch=%d"%mismatch)
plt.show()
In [0]:
print_inpt("tmp/wo_data-opt(default)-ops(tflite)-type(float).tflite")
0	float32	batch_normalization/FusedBatchNormV3	[ 1 26 26 16]	(0.000000,0.000000)
1	float32	batch_normalization/FusedBatchNormV3_add_param	[16]	(0.000000,0.000000)
2	float32	batch_normalization/FusedBatchNormV3_mul_0	[ 1 26 26 16]	(0.000000,0.000000)
3	float32	batch_normalization/FusedBatchNormV3_mul_0_param	[16]	(0.000000,0.000000)
4	float32	batch_normalization_1/FusedBatchNormV3	[ 1 11 11 16]	(0.000000,0.000000)
5	float32	batch_normalization_1/FusedBatchNormV3_add_param	[16]	(0.000000,0.000000)
6	float32	batch_normalization_1/FusedBatchNormV3_mul_0	[ 1 11 11 16]	(0.000000,0.000000)
7	float32	batch_normalization_1/FusedBatchNormV3_mul_0_param	[16]	(0.000000,0.000000)
8	float32	conv2d/Conv2D_bias	[16]	(0.000000,0.000000)
9	float32	conv2d/Relu	[ 1 26 26 16]	(0.000000,0.000000)
10	float32	conv2d/kernel	[ 1  3  3 16]	(0.000000,0.000000)
11	float32	conv2d_1/Conv2D_bias	[16]	(0.000000,0.000000)
12	float32	conv2d_1/Relu	[ 1 11 11 16]	(0.000000,0.000000)
13	int8	conv2d_1/kernel	[16  3  3 16]	(0.002425,0.000000)
14	float32	conv2d_input	[ 1 28 28  1]	(0.000000,0.000000)
15	float32	dense/MatMul_bias	[128]	(0.000000,0.000000)
16	float32	dense/Relu	[  1 128]	(0.000000,0.000000)
17	int8	dense/kernel/transpose	[128 400]	(0.002713,0.000000)
18	float32	dense_1/BiasAdd	[ 1 10]	(0.000000,0.000000)
19	float32	dense_1/MatMul_bias	[10]	(0.000000,0.000000)
20	float32	dense_1/Softmax	[ 1 10]	(0.000000,0.000000)
21	int8	dense_1/kernel/transpose	[ 10 128]	(0.003590,0.000000)
22	float32	max_pooling2d/MaxPool	[ 1 13 13 16]	(0.000000,0.000000)
23	float32	max_pooling2d_1/MaxPool	[ 1  5  5 16]	(0.000000,0.000000)
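
In this listing the three large kernels became int8 with a non-zero scale and a zero point of 0, while the small conv2d/kernel and all biases stayed float32. A zero point of 0 is what a symmetric quantization scheme produces. The sketch below is a minimal illustration of symmetric int8 quantization, not the converter's actual code; the helper names and the sample weights are made up.

```python
def quantize_symmetric(weights):
    """Map floats to integers in [-127, 127] using one scale and zero point 0."""
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid dividing by zero
    scale = max_abs / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats: real = scale * q, since the zero point is 0."""
    return [scale * v for v in q]

weights = [0.30, -0.12, 0.05, -0.34, 0.0]  # made-up example values
q, scale = quantize_symmetric(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

The round-trip error is bounded by half the scale. On the size side, storing the three int8 kernels (2304 + 51200 + 1280 = 54784 values) in one byte instead of four saves 164352 bytes, which is within a few dozen bytes of the observed drop from 223952 to 59648 bytes.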

TF with representative data

In [0]:
data_opt = np.squeeze([i for i in get_res("tmp/with_data-opt(default)-ops(tflite)-type(float).tflite", data_gen)])
mismatch, diff = get_diff(plain_res, data_opt)
plt.hist(diff)
plt.title("total mismatch=%d"%mismatch)
plt.show()
In [0]:
print_inpt("tmp/with_data-opt(default)-ops(tflite)-type(float).tflite")
0	int8	batch_normalization/FusedBatchNormV3	[ 1 26 26 16]	(0.043397,-107.000000)
1	int8	batch_normalization/FusedBatchNormV3_add_param	[16]	(0.007060,0.000000)
2	int8	batch_normalization/FusedBatchNormV3_mul_0	[ 1 26 26 16]	(0.041571,-128.000000)
3	int8	batch_normalization/FusedBatchNormV3_mul_0_param	[16]	(0.239892,0.000000)
4	int8	batch_normalization_1/FusedBatchNormV3	[ 1 11 11 16]	(0.078118,-119.000000)
5	int8	batch_normalization_1/FusedBatchNormV3_add_param	[16]	(0.005670,0.000000)
6	int8	batch_normalization_1/FusedBatchNormV3_mul_0	[ 1 11 11 16]	(0.077301,-128.000000)
7	int8	batch_normalization_1/FusedBatchNormV3_mul_0_param	[16]	(0.021564,0.000000)
8	int32	conv2d/Conv2D_bias	[16]	(0.000000,0.000000)
9	int8	conv2d/Relu	[ 1 26 26 16]	(0.002887,-128.000000)
10	int8	conv2d/kernel	[ 1  3  3 16]	(0.000000,0.000000)
11	int32	conv2d_1/Conv2D_bias	[16]	(0.000000,0.000000)
12	int8	conv2d_1/Relu	[ 1 11 11 16]	(0.039788,-128.000000)
13	int8	conv2d_1/kernel	[16  3  3 16]	(0.000000,0.000000)
14	int8	conv2d_input_int8	[ 1 28 28  1]	(0.003922,-128.000000)
15	int32	dense/MatMul_bias	[128]	(0.000212,0.000000)
16	int8	dense/Relu	[  1 128]	(0.092764,-128.000000)
17	int8	dense/kernel/transpose	[128 400]	(0.002713,0.000000)
18	int8	dense_1/BiasAdd	[ 1 10]	(0.281922,-28.000000)
19	int32	dense_1/MatMul_bias	[10]	(0.000333,0.000000)
20	int8	dense_1/Softmax_int8	[ 1 10]	(0.003906,-128.000000)
21	int8	dense_1/kernel/transpose	[ 10 128]	(0.003590,0.000000)
22	int8	max_pooling2d/MaxPool	[ 1 13 13 16]	(0.043397,-107.000000)
23	int8	max_pooling2d_1/MaxPool	[ 1  5  5 16]	(0.078118,-119.000000)
24	float32	conv2d_input	[ 1 28 28  1]	(0.000000,0.000000)
25	float32	dense_1/Softmax	[ 1 10]	(0.000000,0.000000)
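
Each pair in the last column is (scale, zero_point). Under the usual affine scheme, real_value = scale * (quantized_value - zero_point), so the pair fixes the range of real values an int8 tensor can represent. The sketch below works this out for three tensors from the listing; `representable_range` is a hypothetical helper name.

```python
def representable_range(scale, zero_point, qmin=-128, qmax=127):
    """Real-valued range of an int8 tensor: real = scale * (q - zero_point)."""
    return scale * (qmin - zero_point), scale * (qmax - zero_point)

# (scale, zero_point) pairs copied from the listing above.
tensors = {
    "conv2d_input_int8": (0.003922, -128),
    "dense_1/Softmax_int8": (0.003906, -128),
    "dense_1/BiasAdd": (0.281922, -28),
}

ranges = {name: representable_range(s, zp) for name, (s, zp) in tensors.items()}
for name, (low, high) in ranges.items():
    print("%-22s [%.4f, %.4f]" % (name, low, high))
```

The input range comes out as roughly [0, 1], which matches the MNIST pixels that were scaled by 1/255, and the softmax output likewise covers roughly [0, 1]. The pre-softmax dense_1/BiasAdd tensor needs a much wider range and therefore a much larger scale.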

Other quick comparison

In [0]:
print_inpt("tmp/with_data-opt(default)-ops(int8)-type(int8).tflite")
0	int8	batch_normalization/FusedBatchNormV3	[ 1 26 26 16]	(0.043397,-107.000000)
1	int8	batch_normalization/FusedBatchNormV3_add_param	[16]	(0.007060,0.000000)
2	int8	batch_normalization/FusedBatchNormV3_mul_0	[ 1 26 26 16]	(0.041571,-128.000000)
3	int8	batch_normalization/FusedBatchNormV3_mul_0_param	[16]	(0.239892,0.000000)
4	int8	batch_normalization_1/FusedBatchNormV3	[ 1 11 11 16]	(0.078118,-119.000000)
5	int8	batch_normalization_1/FusedBatchNormV3_add_param	[16]	(0.005670,0.000000)
6	int8	batch_normalization_1/FusedBatchNormV3_mul_0	[ 1 11 11 16]	(0.077301,-128.000000)
7	int8	batch_normalization_1/FusedBatchNormV3_mul_0_param	[16]	(0.021564,0.000000)
8	int32	conv2d/Conv2D_bias	[16]	(0.000000,0.000000)
9	int8	conv2d/Relu	[ 1 26 26 16]	(0.002887,-128.000000)
10	int8	conv2d/kernel	[ 1  3  3 16]	(0.000000,0.000000)
11	int32	conv2d_1/Conv2D_bias	[16]	(0.000000,0.000000)
12	int8	conv2d_1/Relu	[ 1 11 11 16]	(0.039788,-128.000000)
13	int8	conv2d_1/kernel	[16  3  3 16]	(0.000000,0.000000)
14	int8	conv2d_input_int8	[ 1 28 28  1]	(0.003922,-128.000000)
15	int32	dense/MatMul_bias	[128]	(0.000212,0.000000)
16	int8	dense/Relu	[  1 128]	(0.092764,-128.000000)
17	int8	dense/kernel/transpose	[128 400]	(0.002713,0.000000)
18	int8	dense_1/BiasAdd	[ 1 10]	(0.281922,-28.000000)
19	int32	dense_1/MatMul_bias	[10]	(0.000333,0.000000)
20	int8	dense_1/Softmax_int8	[ 1 10]	(0.003906,-128.000000)
21	int8	dense_1/kernel/transpose	[ 10 128]	(0.003590,0.000000)
22	int8	max_pooling2d/MaxPool	[ 1 13 13 16]	(0.043397,-107.000000)
23	int8	max_pooling2d_1/MaxPool	[ 1  5  5 16]	(0.078118,-119.000000)
24	float32	conv2d_input	[ 1 28 28  1]	(0.000000,0.000000)
25	float32	dense_1/Softmax	[ 1 10]	(0.000000,0.000000)
In [0]:
print_inpt("tmp/with_data-opt(size)-ops(tflite)-type(int8).tflite")
0	int8	batch_normalization/FusedBatchNormV3	[ 1 26 26 16]	(0.043397,-107.000000)
1	int8	batch_normalization/FusedBatchNormV3_add_param	[16]	(0.007060,0.000000)
2	int8	batch_normalization/FusedBatchNormV3_mul_0	[ 1 26 26 16]	(0.041571,-128.000000)
3	int8	batch_normalization/FusedBatchNormV3_mul_0_param	[16]	(0.239892,0.000000)
4	int8	batch_normalization_1/FusedBatchNormV3	[ 1 11 11 16]	(0.078118,-119.000000)
5	int8	batch_normalization_1/FusedBatchNormV3_add_param	[16]	(0.005670,0.000000)
6	int8	batch_normalization_1/FusedBatchNormV3_mul_0	[ 1 11 11 16]	(0.077301,-128.000000)
7	int8	batch_normalization_1/FusedBatchNormV3_mul_0_param	[16]	(0.021564,0.000000)
8	int32	conv2d/Conv2D_bias	[16]	(0.000000,0.000000)
9	int8	conv2d/Relu	[ 1 26 26 16]	(0.002887,-128.000000)
10	int8	conv2d/kernel	[ 1  3  3 16]	(0.000000,0.000000)
11	int32	conv2d_1/Conv2D_bias	[16]	(0.000000,0.000000)
12	int8	conv2d_1/Relu	[ 1 11 11 16]	(0.039788,-128.000000)
13	int8	conv2d_1/kernel	[16  3  3 16]	(0.000000,0.000000)
14	int8	conv2d_input_int8	[ 1 28 28  1]	(0.003922,-128.000000)
15	int32	dense/MatMul_bias	[128]	(0.000212,0.000000)
16	int8	dense/Relu	[  1 128]	(0.092764,-128.000000)
17	int8	dense/kernel/transpose	[128 400]	(0.002713,0.000000)
18	int8	dense_1/BiasAdd	[ 1 10]	(0.281922,-28.000000)
19	int32	dense_1/MatMul_bias	[10]	(0.000333,0.000000)
20	int8	dense_1/Softmax_int8	[ 1 10]	(0.003906,-128.000000)
21	int8	dense_1/kernel/transpose	[ 10 128]	(0.003590,0.000000)
22	int8	max_pooling2d/MaxPool	[ 1 13 13 16]	(0.043397,-107.000000)
23	int8	max_pooling2d_1/MaxPool	[ 1  5  5 16]	(0.078118,-119.000000)
24	float32	conv2d_input	[ 1 28 28  1]	(0.000000,0.000000)
25	float32	dense_1/Softmax	[ 1 10]	(0.000000,0.000000)

Remarks

Take-aways

Now, let's take another look at the workflow provided by TF.

Lite convert decision map

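  • Optimization types are not fully implemented; see [here](https://github.com/tensorflow/tensorflow/blob/570206441717511720fdae9ac58dac16cc1d348a/tensorflow/lite/python/lite.py#L96).
  • Without data and without optimization, ops and types do not matter, except for the crashing cases (e.g. int8). The converter creates a float32 tflite model for the runtime. This corresponds to answering N to "Optimize model?" in the decision map.

    • Exception case 1: using int8 in types/ops without data, which fails because a representative dataset is required.
    • Exception case 2: with data and ops set to int8 (types int8 or none), the result is the int8 model even without optimization.
  • With optimization and a type of float16, the model shrinks to about half the float32 size (114928 vs. 223952 bytes).

  • With optimization but without data (and without float16), some weights are quantized to int8.
  • With data and optimization, weights and activations are stored as int8. However, int8 is not strictly enforced (for example, the input and output tensors stay float32 in the listings above).
  • When ops is int8, the type needs to be int8 (or left as none).
  • int8 and quantized_uint8 behave quite differently: with data and no optimization, type int8 yields the 60976-byte int8 model, while quantized_uint8 yields the 223952-byte float32 model.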
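
These take-aways can be condensed into a small lookup. The function below is a hypothetical helper of mine that only restates what the results table shows for this model under TF 1.15; it is not part of TF Lite and may not hold for other models or versions. `optimize` is True for any of default, latency or size, since the three behave identically in the table.

```python
def expected_outcome(has_data, optimize, ops, dtype):
    """Summarise the results table for one combination of settings.

    ops is one of "int8", "tflite", "tf"; dtype is the 'type' column value.
    """
    if dtype in ("graphviz_dot", "tflite"):
        return "error"             # 'int' object has no attribute 'size'
    if ops == "int8" and dtype not in ("int8", "none"):
        return "error"             # smallest supported type must be INT8
    wants_int8 = ops == "int8" or dtype == "int8"
    if wants_int8 and not has_data:
        return "error"             # representative_dataset is required
    if wants_int8:
        return "full int8"         # 60976 bytes, even without optimization
    if not optimize:
        return "float32"           # 223952 bytes
    if dtype == "float16":
        return "float16 weights"   # 114928 bytes
    if has_data:
        return "full int8"         # 60976 bytes
    return "int8 weights only"     # 59648 bytes

print(expected_outcome(False, False, "tf", "float"))
print(expected_outcome(True, True, "tflite", "float16"))
print(expected_outcome(False, True, "tflite", "int8"))
```

Writing the rules this way makes the two exception cases explicit: int8 without data always fails, and int8 with data forces the quantized model whether or not an optimization is set.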

Remaining mysteries

  • What's the difference between select and builtin?
  • What's the string or none op type?

An unexpected problem

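Converting a small Dense-only model with representative data produces a file that fails in allocate_tensors(), as reproduced below. If you check the [source code](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/kernels/kernel_util.cc#L100), the failing check is in the GetQuantizedConvolutionMultipler function, so there is some interesting conversion going on for the fully connected layer. To save trouble and stay focused on our original goal, I do not dig into it further here.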

In [0]:
m = keras.Sequential(
    [keras.layers.Dense(100, input_shape=(5,)),
     keras.layers.Dense(100),
     keras.layers.Dense(3)]
)
m.save("model2.h5")
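x = np.random.randn(100,5).astype('float32')

# Yield one random sample at a time with shape (1, 5).
def data_gen():
  for i in range(100):
    yield x[None, i]

# Wrap each sample in a list (one entry per model input) for representative_dataset.
def data_gen2():
  y = data_gen()
  for i in y:
    yield [list(i)]

# TF 1.x loads the saved .h5 file; TF 2.x converts the in-memory model.
if ver1_flag:
  conv = lite.TFLiteConverter.from_keras_model_file("model2.h5")
else:
  conv = lite.TFLiteConverter.from_keras_model(m)
# Default optimization together with representative data.
conv.optimizations = [lite.Optimize.DEFAULT]
conv.representative_dataset = data_gen2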
with open("problem.tflite", "wb") as f:
  f.write(conv.convert())
WARNING:tensorflow:No training configuration found in save file: the model was *not* compiled. Compile it manually.
WARNING:tensorflow:No training configuration found in save file: the model was *not* compiled. Compile it manually.
INFO:tensorflow:Froze 6 variables.
INFO:tensorflow:Froze 6 variables.
INFO:tensorflow:Converted 6 variables to const ops.
INFO:tensorflow:Converted 6 variables to const ops.
In [0]:
intp = lite.Interpreter("problem.tflite")
intp.allocate_tensors()
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-33-5338b2d27dd6> in <module>()
      1 intp = lite.Interpreter("problem.tflite")
----> 2 intp.allocate_tensors()

/usr/local/lib/python3.6/dist-packages/tensorflow_core/lite/python/interpreter.py in allocate_tensors(self)
    242   def allocate_tensors(self):
    243     self._ensure_safe()
--> 244     return self._interpreter.AllocateTensors()
    245 
    246   def _safe_to_run(self):

/usr/local/lib/python3.6/dist-packages/tensorflow_core/lite/python/interpreter_wrapper/tensorflow_wrap_interpreter_wrapper.py in AllocateTensors(self)
    104 
    105     def AllocateTensors(self):
--> 106         return _tensorflow_wrap_interpreter_wrapper.InterpreterWrapper_AllocateTensors(self)
    107 
    108     def Invoke(self):

RuntimeError: tensorflow/lite/kernels/kernel_util.cc:106 std::abs(input_product_scale - bias_scale) <= 1e-6 * std::min(input_product_scale, bias_scale) was not true.Node number 2 (FULLY_CONNECTED) failed to prepare.
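
The failing condition compares the bias scale with something called input_product_scale. My understanding of kernel_util.cc is that this is the input scale multiplied by the filter scale; treat that, and the Python function below, as a sketch of the check rather than the actual implementation. The MNIST model from earlier satisfies the relation up to the six printed digits, assuming dense reads from max_pooling2d_1/MaxPool and dense_1 reads from dense/Relu.

```python
def bias_scale_matches(input_scale, filter_scale, bias_scale, rel_tol=1e-6):
    """Python rendering of the condition quoted in the error message."""
    input_product_scale = input_scale * filter_scale
    diff = abs(input_product_scale - bias_scale)
    return diff <= rel_tol * min(input_product_scale, bias_scale)

# Scales copied from the with_data listing: (input, filter, bias).
layers = {
    "dense": (0.078118, 0.002713, 0.000212),
    "dense_1": (0.092764, 0.003590, 0.000333),
}
products = {name: i * f for name, (i, f, b) in layers.items()}
for name, (i, f, b) in layers.items():
    print(name, "%.6f" % products[name], "%.6f" % b)

# A bias scale equal to the product passes; one that is off by 1% does not.
exact = 0.078118 * 0.002713
print(bias_scale_matches(0.078118, 0.002713, exact))
print(bias_scale_matches(0.078118, 0.002713, exact * 1.01))
```

The printed scales are rounded, so they can only confirm the relation to six decimals, not to the 1e-6 relative tolerance of the real check. For problem.tflite, the error message says that the bias scale of node 2 did not match this product.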
