TF Lite Conversion Comparison¶

This page gives guidance on using TF Lite to convert and deploy models.

We use a LeNet-like CNN model on the MNIST dataset. The workflow is general; however, the performance of the TF Lite model (compression, accuracy) will differ depending on your models and datasets.

Specifically, I am going to explain the workflow buried in the TensorFlow Lite webpage.

[Figure: Lite convert decision map]

In [0]:

# !pip install -U tensorflow==2.0.0

In [0]:

!rm -rf *.tflite
!mkdir -p tmp
!rm -rf tmp/*.tflite

In [0]:

%tensorflow_version 1.15
from google.colab import files
import tensorflow as tf
from tensorflow import keras
from tensorflow import lite
import numpy as np
import matplotlib.pylab as plt
from packaging import version
from os import path
import pandas as pd
import os
from IPython.core.display import HTML
import time
%matplotlib inline

os.environ["TF_CPP_MIN_LOG_LEVEL"]="3"
ver1_flag = version.parse(tf.__version__) < version.parse("2.0")
tf.__version__

`%tensorflow_version` only switches the major version: `1.x` or `2.x`.
You set: `1.15`. This will be interpreted as: `1.x`.


TensorFlow 1.x selected.

Out[0]:

'1.15.0'

Load data¶

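We also create two helper functions, create_data and create_represent_data. Each returns a generator function that yields one sample at a time with a leading batch dimension, for use with TF Lite later.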

In [0]:

# load mnist data for testing

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(60000, 28, 28, 1).astype('float32') / 255
x_test = x_test.reshape(10000, 28,28, 1).astype('float32') / 255
y_train = y_train.astype('float32')
y_test = y_test.astype('float32')

def create_data(data):
  def data_gen():
    for i in data:
      yield [i]
  return data_gen

def create_represent_data(data):
  def data_gen():
    for i in data:
      yield [list([i])]
  return data_gen

x_train.shape

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11493376/11490434 [==============================] - 0s 0us/step

Out[0]:

(60000, 28, 28, 1)
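
The helpers above return a function rather than a generator object, so the data can be iterated more than once. Below is a minimal sketch of that pattern on a toy list; `samples` is a made-up stand-in for `x_test`, and plain Python lists replace the NumPy arrays used in the notebook.

```python
def create_data(data):
    """Return a zero-argument callable that yields one-sample batches."""
    def data_gen():
        for sample in data:
            # Wrapping the sample in a list adds a leading batch axis of size 1.
            yield [sample]
    return data_gen


# Toy stand-in for x_test (assumption: three samples with two features each).
samples = [[0.0, 0.5], [1.0, 0.25], [0.75, 0.0]]
gen_fn = create_data(samples)

first_pass = list(gen_fn())
second_pass = list(gen_fn())  # calling gen_fn again restarts the iteration

print(len(first_pass), first_pass[0])
```

Each call to `gen_fn()` builds a fresh generator, so the same data source can feed both a conversion step and a later evaluation loop.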

Build Keras Model¶

We build a simple CNN model for testing.

In [0]:

keras.backend.clear_session()
m = keras.Sequential([
                       keras.layers.Conv2D(16, 3, activation='relu', input_shape=(28,28,1)),
                       keras.layers.BatchNormalization(),
                       keras.layers.MaxPool2D(),
                       keras.layers.Conv2D(16, 3, activation='relu'),
                       keras.layers.BatchNormalization(),
                       keras.layers.MaxPool2D(),
                       keras.layers.Flatten(),
                       keras.layers.Dense(128, activation='relu'),
                       keras.layers.Dense(10, activation='softmax', )
])

m.compile(optimizer=keras.optimizers.Adam(),
          loss=keras.losses.SparseCategoricalCrossentropy(),
          metrics=[keras.metrics.SparseCategoricalAccuracy()])

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.

In [0]:

if path.isfile("model.h5"):  # try to avoid train again, load model if present
  m = keras.models.load_model("model.h5")
  m.compile(optimizer=keras.optimizers.Adam(),
          loss=keras.losses.SparseCategoricalCrossentropy(),
          metrics=[keras.metrics.SparseCategoricalAccuracy()])
else:
  m.fit(x_train, y_train, batch_size=128, epochs=10)
  m.save("model.h5")

Train on 60000 samples
Epoch 1/10
60000/60000 [==============================] - 39s 647us/sample - loss: 0.1670 - sparse_categorical_accuracy: 0.9497
Epoch 2/10
60000/60000 [==============================] - 37s 622us/sample - loss: 0.0486 - sparse_categorical_accuracy: 0.9854
Epoch 3/10
60000/60000 [==============================] - 37s 622us/sample - loss: 0.0328 - sparse_categorical_accuracy: 0.9900
Epoch 4/10
60000/60000 [==============================] - 40s 663us/sample - loss: 0.0237 - sparse_categorical_accuracy: 0.9926
Epoch 5/10
60000/60000 [==============================] - 39s 652us/sample - loss: 0.0188 - sparse_categorical_accuracy: 0.9940
Epoch 6/10
60000/60000 [==============================] - 39s 652us/sample - loss: 0.0123 - sparse_categorical_accuracy: 0.9961
Epoch 7/10
60000/60000 [==============================] - 39s 655us/sample - loss: 0.0125 - sparse_categorical_accuracy: 0.9959
Epoch 8/10
60000/60000 [==============================] - 39s 655us/sample - loss: 0.0087 - sparse_categorical_accuracy: 0.9974
Epoch 9/10
60000/60000 [==============================] - 39s 657us/sample - loss: 0.0081 - sparse_categorical_accuracy: 0.9974
Epoch 10/10
60000/60000 [==============================] - 39s 656us/sample - loss: 0.0068 - sparse_categorical_accuracy: 0.9976

In [0]:

m.evaluate(x_test, y_test)[1] ## accuracy

10000/10000 [==============================] - 3s 289us/sample - loss: 0.0532 - sparse_categorical_accuracy: 0.9867

Out[0]:

0.9867

In [0]:

m = keras.models.load_model("model.h5")
plain_res = m.predict(x_test)
plain_res.shape

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/init_ops.py:97: calling GlorotUniform.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/init_ops.py:97: calling Zeros.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/init_ops.py:97: calling Ones.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor

Out[0]:

(10000, 10)

In [0]:

sum(np.argmax(plain_res, axis=1) == y_test)/len(y_test)  # test accuracy

Out[0]:

0.9867
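
The manual accuracy check above takes the arg-max of each prediction row and compares it with the label. A small pure-Python sketch of the same computation follows; the `probs` and `labels` values are made up for illustration. The labels are floats because `y_test` was cast to `float32`, and the comparison still works since `1 == 1.0` in Python.

```python
def argmax(row):
    """Index of the largest value in a sequence."""
    return max(range(len(row)), key=row.__getitem__)


# Made-up softmax outputs for four samples and three classes.
probs = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
labels = [1.0, 0.0, 2.0, 1.0]  # float labels, mirroring y_test.astype('float32')

predicted = [argmax(row) for row in probs]
accuracy = sum(p == y for p, y in zip(predicted, labels)) / len(labels)
print(predicted, accuracy)
```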

TF Lite conversion options¶

In [0]:

def get_conv(model_file):  # create tflite converter for keras model
  """
  Create TFLiteConverter from keras model
  """
  if ver1_flag:
    conv = lite.TFLiteConverter.from_keras_model_file(model_file)
  else: 
    m = keras.models.load_model(model_file)
    conv = lite.TFLiteConverter.from_keras_model(m)
  return conv

In [0]:

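def get_diff(result1, result2):
  """
  compare two result arrays of shape (n_samples, n_classes)

  returns the number of samples whose predicted class differs, and the
  per-sample difference in score at the class predicted by result1
  """
  assert result1.shape == result2.shape
  id1 = np.argmax(result1, axis=1)
  id2 = np.argmax(result2, axis=1)
  mismatch = np.sum(id1 != id2)
  rows = np.arange(len(id1))
  diff = result1[rows, id1] - result2[rows, id1]  # index each row by its own predicted class
  return mismatch, diff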

def get_res(filename, data_gen):  # get interpreter output
  """
  get output from tflite model
  
  filename - tflite model
  data_gen - generator for data input x
  """
  intp = lite.Interpreter(filename)
  intp.allocate_tensors()
  for i in data_gen():
    intp.set_tensor(intp.get_input_details()[0]['index'], i)
    intp.invoke()
    yield intp.get_tensor(intp.get_output_details()[0]['index'])

def get_acc(filename):
  "get accuracy from tflite model"
  data_gen = create_data(x_test)
  pred = np.squeeze([i for i in get_res(filename, data_gen)])
  return np.sum(np.argmax(pred, axis=1) == y_test) / len(y_test)

def get_res2(filename, data_gen):
  "yield (predicted class, inference time in seconds) for each input"
  intp = lite.Interpreter(filename)
  intp.allocate_tensors()

  for i in data_gen():
    t = time.monotonic()
    intp.set_tensor(intp.get_input_details()[0]['index'], i)
    intp.invoke()
    t = time.monotonic() - t
    yield np.argmax(intp.get_tensor(intp.get_output_details()[0]['index'])), t


def get_acc_and_time(filename):
  data_gen = create_data(x_test)
  pred = np.squeeze([i for i in get_res2(filename, data_gen)])
  return np.sum(pred[:,0]==y_test)/len(y_test), np.mean(pred[:,1]), np.std(pred[:,1])  # acc, mean, std of inference
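
get_res2 and get_acc_and_time time every inference call and then report the mean and standard deviation. The sketch below shows the same measurement pattern with only the standard library. `fake_invoke` is a hypothetical stand-in for the interpreter's set_tensor and invoke calls, and `time.perf_counter` is swapped in for `time.monotonic` as a higher-resolution clock; that is a choice made here, not something the notebook does.

```python
import statistics
import time


def time_calls(fn, inputs):
    """Return (mean, population std) of per-call wall time in seconds."""
    durations = []
    for x in inputs:
        start = time.perf_counter()
        fn(x)
        durations.append(time.perf_counter() - start)
    # pstdev is the population standard deviation, matching np.std's default.
    return statistics.mean(durations), statistics.pstdev(durations)


def fake_invoke(x):
    """Hypothetical workload standing in for one interpreter invocation."""
    return sum(v * v for v in x)


inputs = [[float(i)] * 100 for i in range(200)]
mean_s, std_s = time_calls(fake_invoke, inputs)
print("mean %.6f s, std %.6f s" % (mean_s, std_s))
```

A single slow call can inflate the standard deviation, so it may be worth looking at the median as well when the spread is large.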

Collect all options for tflite conversion¶

In [0]:

# for converter.target_spec.supported_types 
type_choice = {}
if ver1_flag:
    for i in lite.constants.__all__:
        type_choice[i.lower()] = [lite.constants.__dict__[i]]
else:
    from tensorflow.lite.python import lite_constants as constants
    type_choice = {
        "float": [constants.FLOAT],      # tf.float32
        "int8": [constants.INT8],        # tf.int8
        "int32": [constants.INT32],      # tf.int32
        "int64": [constants.INT64],      # tf.int64
        "string": [constants.STRING],    # tf.string
        "uint8": [constants.QUANTIZED_UINT8],  # tf.uint8
    }
type_choice['none'] = None   
# for converter.target_spec.supported_ops
ops_choice = {
    "int8": [lite.OpsSet.TFLITE_BUILTINS_INT8],
    "tflite": [lite.OpsSet.TFLITE_BUILTINS],  # default
    "tf": [lite.OpsSet.SELECT_TF_OPS, lite.OpsSet.TFLITE_BUILTINS]
}

opt_choice = {
    "default": [lite.Optimize.DEFAULT], 
    "latency": [lite.Optimize.OPTIMIZE_FOR_LATENCY], 
    "size": [lite.Optimize.OPTIMIZE_FOR_SIZE],
    "none": []
}

# for converter.representative_dataset
data_gen2 = create_represent_data(x_train[:5000])
data_choice = {"with_data": data_gen2, "wo_data": None}

In [0]:

type_choice
# 'tflite' and 'graphviz_dot' are output-format constants, not data types,
# so passing them as supported types is expected to fail during conversion.

Out[0]:

{'float': [tf.float32],
 'float16': [tf.float16],
 'graphviz_dot': [3],
 'int32': [tf.int32],
 'int64': [tf.int64],
 'int8': [tf.int8],
 'none': None,
 'quantized_uint8': [tf.uint8],
 'string': [tf.string],
 'tflite': [2]}

In [ ]:

%%capture convert_log 
# output has been cleared
res = []
for xdata in data_choice:
  for xopt in opt_choice:
    for xops in ops_choice:
      for xtype in type_choice:
        filename = "tmp/%s-opt(%s)-ops(%s)-type(%s).tflite"%(xdata, xopt, xops, xtype)
        print("********  %s ********" % filename)
        keras.backend.clear_session()
        try:
          conv = get_conv("model.h5")
          conv.optimizations = opt_choice[xopt]
          conv.representative_dataset = data_choice[xdata]
          conv.target_spec.supported_ops = ops_choice[xops]
          conv.target_spec.supported_types = type_choice[xtype]
          fb = conv.convert()
          msg = ("success")
          with open(filename, 'wb') as f:
            f.write(fb)
          size = path.getsize(filename)
          print("finished")
          acc = get_acc_and_time(filename)
        except Exception as e:
          msg = e.__str__()
          print("failed - %s"%msg)    
          size = None
          acc = None, None, None
        finally:
          res.append([xdata, xopt, xops, xtype, size, msg, *acc])
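
The nested loops above walk the full Cartesian product of the four option dictionaries. As a rough sketch, the same grid can be enumerated with `itertools.product`; the key lists below are copied by hand from the TF 1.15 dictionaries shown earlier, so they are an assumption about that one environment.

```python
from itertools import product

# Keys copied by hand from data_choice, opt_choice, ops_choice and type_choice.
data_keys = ["with_data", "wo_data"]
opt_keys = ["default", "latency", "size", "none"]
ops_keys = ["int8", "tflite", "tf"]
type_keys = ["float", "float16", "graphviz_dot", "int32", "int64",
             "int8", "quantized_uint8", "string", "tflite", "none"]

# product varies the last list fastest, like the innermost for-loop.
grid = list(product(data_keys, opt_keys, ops_keys, type_keys))
names = ["tmp/%s-opt(%s)-ops(%s)-type(%s).tflite" % combo for combo in grid]

print(len(grid))
print(names[0])
```

Flattening the loops this way makes the number of conversions explicit (2 x 4 x 3 x 10 = 240) before any expensive work starts.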

In [0]:

result = pd.DataFrame(res, columns=["data", "optimization", "ops", "type", "size", "status", "accuracy","mean_inference","std_inference"])
result.to_pickle("result.pkl")

In [0]:

files.download("result.pkl")

In [0]:

%%javascript
require.config({
    paths: {
        DT: '//cdn.datatables.net/1.10.19/js/jquery.dataTables.min',
    }
});

Raw results¶

In [0]:

HTML(result.to_html())

Out[0]:

	data	optimization	ops	type	size	status	accuracy	mean_inference	std_inference
0	with_data	default	int8	float	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
1	with_data	default	int8	float16	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
2	with_data	default	int8	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
3	with_data	default	int8	int32	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
4	with_data	default	int8	int64	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
5	with_data	default	int8	int8	60976.0	success	0.9869	0.001737	0.001835
6	with_data	default	int8	quantized_uint8	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
7	with_data	default	int8	string	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
8	with_data	default	int8	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
9	with_data	default	int8	none	60976.0	success	0.9869	0.001721	0.000083
10	with_data	default	tflite	float	60976.0	success	0.9869	0.001719	0.000059
11	with_data	default	tflite	float16	114928.0	success	0.9867	0.000301	0.000026
12	with_data	default	tflite	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
13	with_data	default	tflite	int32	60976.0	success	0.9869	0.001721	0.000060
14	with_data	default	tflite	int64	60976.0	success	0.9869	0.001719	0.000059
15	with_data	default	tflite	int8	60976.0	success	0.9869	0.001718	0.000060
16	with_data	default	tflite	quantized_uint8	60976.0	success	0.9869	0.001722	0.000087
17	with_data	default	tflite	string	60976.0	success	0.9869	0.001718	0.000059
18	with_data	default	tflite	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
19	with_data	default	tflite	none	60976.0	success	0.9869	0.001744	0.002122
20	with_data	default	tf	float	60976.0	success	0.9869	0.001716	0.000054
21	with_data	default	tf	float16	114928.0	success	0.9867	0.000299	0.000021
22	with_data	default	tf	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
23	with_data	default	tf	int32	60976.0	success	0.9869	0.001726	0.000068
24	with_data	default	tf	int64	60976.0	success	0.9869	0.001716	0.000058
25	with_data	default	tf	int8	60976.0	success	0.9869	0.001716	0.000070
26	with_data	default	tf	quantized_uint8	60976.0	success	0.9869	0.001727	0.000104
27	with_data	default	tf	string	60976.0	success	0.9869	0.001713	0.000051
28	with_data	default	tf	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
29	with_data	default	tf	none	60976.0	success	0.9869	0.001715	0.000049
30	with_data	latency	int8	float	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
31	with_data	latency	int8	float16	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
32	with_data	latency	int8	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
33	with_data	latency	int8	int32	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
34	with_data	latency	int8	int64	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
35	with_data	latency	int8	int8	60976.0	success	0.9869	0.001721	0.000119
36	with_data	latency	int8	quantized_uint8	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
37	with_data	latency	int8	string	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
38	with_data	latency	int8	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
39	with_data	latency	int8	none	60976.0	success	0.9869	0.001716	0.000070
40	with_data	latency	tflite	float	60976.0	success	0.9869	0.001716	0.000072
41	with_data	latency	tflite	float16	114928.0	success	0.9867	0.000301	0.000024
42	with_data	latency	tflite	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
43	with_data	latency	tflite	int32	60976.0	success	0.9869	0.001719	0.000080
44	with_data	latency	tflite	int64	60976.0	success	0.9869	0.001718	0.000060
45	with_data	latency	tflite	int8	60976.0	success	0.9869	0.001716	0.000063
46	with_data	latency	tflite	quantized_uint8	60976.0	success	0.9869	0.001717	0.000060
47	with_data	latency	tflite	string	60976.0	success	0.9869	0.001727	0.000079
48	with_data	latency	tflite	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
49	with_data	latency	tflite	none	60976.0	success	0.9869	0.001715	0.000067
50	with_data	latency	tf	float	60976.0	success	0.9869	0.001714	0.000049
51	with_data	latency	tf	float16	114928.0	success	0.9867	0.000303	0.000030
52	with_data	latency	tf	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
53	with_data	latency	tf	int32	60976.0	success	0.9869	0.001725	0.000072
54	with_data	latency	tf	int64	60976.0	success	0.9869	0.001716	0.000058
55	with_data	latency	tf	int8	60976.0	success	0.9869	0.001739	0.002140
56	with_data	latency	tf	quantized_uint8	60976.0	success	0.9869	0.001713	0.000066
57	with_data	latency	tf	string	60976.0	success	0.9869	0.001709	0.000040
58	with_data	latency	tf	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
59	with_data	latency	tf	none	60976.0	success	0.9869	0.001714	0.000067
60	with_data	size	int8	float	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
61	with_data	size	int8	float16	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
62	with_data	size	int8	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
63	with_data	size	int8	int32	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
64	with_data	size	int8	int64	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
65	with_data	size	int8	int8	60976.0	success	0.9869	0.001711	0.000049
66	with_data	size	int8	quantized_uint8	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
67	with_data	size	int8	string	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
68	with_data	size	int8	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
69	with_data	size	int8	none	60976.0	success	0.9869	0.001717	0.000065
70	with_data	size	tflite	float	60976.0	success	0.9869	0.001717	0.000055
71	with_data	size	tflite	float16	114928.0	success	0.9867	0.000300	0.000022
72	with_data	size	tflite	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
73	with_data	size	tflite	int32	60976.0	success	0.9869	0.001721	0.000069
74	with_data	size	tflite	int64	60976.0	success	0.9869	0.001717	0.000057
75	with_data	size	tflite	int8	60976.0	success	0.9869	0.001717	0.000059
76	with_data	size	tflite	quantized_uint8	60976.0	success	0.9869	0.001717	0.000070
77	with_data	size	tflite	string	60976.0	success	0.9869	0.001743	0.002344
78	with_data	size	tflite	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
79	with_data	size	tflite	none	60976.0	success	0.9869	0.001716	0.000053
80	with_data	size	tf	float	60976.0	success	0.9869	0.001720	0.000098
81	with_data	size	tf	float16	114928.0	success	0.9867	0.000298	0.000024
82	with_data	size	tf	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
83	with_data	size	tf	int32	60976.0	success	0.9869	0.001716	0.000052
84	with_data	size	tf	int64	60976.0	success	0.9869	0.001740	0.002122
85	with_data	size	tf	int8	60976.0	success	0.9869	0.001719	0.000070
86	with_data	size	tf	quantized_uint8	60976.0	success	0.9869	0.001713	0.000043
87	with_data	size	tf	string	60976.0	success	0.9869	0.001716	0.000061
88	with_data	size	tf	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
89	with_data	size	tf	none	60976.0	success	0.9869	0.001716	0.000057
90	with_data	none	int8	float	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
91	with_data	none	int8	float16	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
92	with_data	none	int8	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
93	with_data	none	int8	int32	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
94	with_data	none	int8	int64	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
95	with_data	none	int8	int8	60976.0	success	0.9869	0.001721	0.000063
96	with_data	none	int8	quantized_uint8	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
97	with_data	none	int8	string	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
98	with_data	none	int8	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
99	with_data	none	int8	none	60976.0	success	0.9869	0.001717	0.000071
100	with_data	none	tflite	float	223952.0	success	0.9867	0.000300	0.000032
101	with_data	none	tflite	float16	223952.0	success	0.9867	0.000300	0.000021
102	with_data	none	tflite	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
103	with_data	none	tflite	int32	223952.0	success	0.9867	0.000299	0.000022
104	with_data	none	tflite	int64	223952.0	success	0.9867	0.000303	0.000026
105	with_data	none	tflite	int8	60976.0	success	0.9869	0.001721	0.000061
106	with_data	none	tflite	quantized_uint8	223952.0	success	0.9867	0.000303	0.000023
107	with_data	none	tflite	string	223952.0	success	0.9867	0.000303	0.000028
108	with_data	none	tflite	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
109	with_data	none	tflite	none	223952.0	success	0.9867	0.000303	0.000023
110	with_data	none	tf	float	223952.0	success	0.9867	0.000302	0.000024
111	with_data	none	tf	float16	223952.0	success	0.9867	0.000301	0.000025
112	with_data	none	tf	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
113	with_data	none	tf	int32	223952.0	success	0.9867	0.000303	0.000034
114	with_data	none	tf	int64	223952.0	success	0.9867	0.000300	0.000022
115	with_data	none	tf	int8	60976.0	success	0.9869	0.001720	0.000062
116	with_data	none	tf	quantized_uint8	223952.0	success	0.9867	0.000300	0.000022
117	with_data	none	tf	string	223952.0	success	0.9867	0.000301	0.000026
118	with_data	none	tf	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
119	with_data	none	tf	none	223952.0	success	0.9867	0.000299	0.000022
120	wo_data	default	int8	float	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
121	wo_data	default	int8	float16	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
122	wo_data	default	int8	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
123	wo_data	default	int8	int32	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
124	wo_data	default	int8	int64	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
125	wo_data	default	int8	int8	NaN	representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types.	NaN	NaN	NaN
126	wo_data	default	int8	quantized_uint8	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
127	wo_data	default	int8	string	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
128	wo_data	default	int8	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
129	wo_data	default	int8	none	NaN	representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types.	NaN	NaN	NaN
130	wo_data	default	tflite	float	59648.0	success	0.9864	0.000203	0.000023
131	wo_data	default	tflite	float16	114928.0	success	0.9867	0.000301	0.000026
132	wo_data	default	tflite	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
133	wo_data	default	tflite	int32	59648.0	success	0.9864	0.000193	0.000018
134	wo_data	default	tflite	int64	59648.0	success	0.9864	0.000190	0.000020
135	wo_data	default	tflite	int8	NaN	representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types.	NaN	NaN	NaN
136	wo_data	default	tflite	quantized_uint8	59648.0	success	0.9864	0.000191	0.000017
137	wo_data	default	tflite	string	59648.0	success	0.9864	0.000196	0.000019
138	wo_data	default	tflite	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
139	wo_data	default	tflite	none	59648.0	success	0.9864	0.000191	0.000016
140	wo_data	default	tf	float	59648.0	success	0.9864	0.000192	0.000019
141	wo_data	default	tf	float16	114928.0	success	0.9867	0.000301	0.000025
142	wo_data	default	tf	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
143	wo_data	default	tf	int32	59648.0	success	0.9864	0.000194	0.000017
144	wo_data	default	tf	int64	59648.0	success	0.9864	0.000188	0.000016
145	wo_data	default	tf	int8	NaN	representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types.	NaN	NaN	NaN
146	wo_data	default	tf	quantized_uint8	59648.0	success	0.9864	0.000198	0.000025
147	wo_data	default	tf	string	59648.0	success	0.9864	0.000197	0.000018
148	wo_data	default	tf	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
149	wo_data	default	tf	none	59648.0	success	0.9864	0.000189	0.000014
150	wo_data	latency	int8	float	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
151	wo_data	latency	int8	float16	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
152	wo_data	latency	int8	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
153	wo_data	latency	int8	int32	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
154	wo_data	latency	int8	int64	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
155	wo_data	latency	int8	int8	NaN	representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types.	NaN	NaN	NaN
156	wo_data	latency	int8	quantized_uint8	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
157	wo_data	latency	int8	string	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
158	wo_data	latency	int8	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
159	wo_data	latency	int8	none	NaN	representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types.	NaN	NaN	NaN
160	wo_data	latency	tflite	float	59648.0	success	0.9864	0.000195	0.000017
161	wo_data	latency	tflite	float16	114928.0	success	0.9867	0.000302	0.000030
162	wo_data	latency	tflite	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
163	wo_data	latency	tflite	int32	59648.0	success	0.9864	0.000197	0.000017
164	wo_data	latency	tflite	int64	59648.0	success	0.9864	0.000213	0.002201
165	wo_data	latency	tflite	int8	NaN	representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types.	NaN	NaN	NaN
166	wo_data	latency	tflite	quantized_uint8	59648.0	success	0.9864	0.000194	0.000018
167	wo_data	latency	tflite	string	59648.0	success	0.9864	0.000190	0.000018
168	wo_data	latency	tflite	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
169	wo_data	latency	tflite	none	59648.0	success	0.9864	0.000189	0.000016
170	wo_data	latency	tf	float	59648.0	success	0.9864	0.000188	0.000014
171	wo_data	latency	tf	float16	114928.0	success	0.9867	0.000321	0.002138
172	wo_data	latency	tf	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
173	wo_data	latency	tf	int32	59648.0	success	0.9864	0.000196	0.000018
174	wo_data	latency	tf	int64	59648.0	success	0.9864	0.000191	0.000017
175	wo_data	latency	tf	int8	NaN	representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types.	NaN	NaN	NaN
176	wo_data	latency	tf	quantized_uint8	59648.0	success	0.9864	0.000195	0.000024
177	wo_data	latency	tf	string	59648.0	success	0.9864	0.000202	0.000022
178	wo_data	latency	tf	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
179	wo_data	latency	tf	none	59648.0	success	0.9864	0.000190	0.000016
180	wo_data	size	int8	float	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
181	wo_data	size	int8	float16	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
182	wo_data	size	int8	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
183	wo_data	size	int8	int32	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
184	wo_data	size	int8	int64	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
185	wo_data	size	int8	int8	NaN	representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types.	NaN	NaN	NaN
186	wo_data	size	int8	quantized_uint8	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
187	wo_data	size	int8	string	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
188	wo_data	size	int8	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
189	wo_data	size	int8	none	NaN	representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types.	NaN	NaN	NaN
190	wo_data	size	tflite	float	59648.0	success	0.9864	0.000192	0.000017
191	wo_data	size	tflite	float16	114928.0	success	0.9867	0.000299	0.000025
192	wo_data	size	tflite	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
193	wo_data	size	tflite	int32	59648.0	success	0.9864	0.000203	0.000021
194	wo_data	size	tflite	int64	59648.0	success	0.9864	0.000194	0.000022
195	wo_data	size	tflite	int8	NaN	representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types.	NaN	NaN	NaN
196	wo_data	size	tflite	quantized_uint8	59648.0	success	0.9864	0.000197	0.000020
197	wo_data	size	tflite	string	59648.0	success	0.9864	0.000194	0.000018
198	wo_data	size	tflite	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
199	wo_data	size	tflite	none	59648.0	success	0.9864	0.000200	0.000018
200	wo_data	size	tf	float	59648.0	success	0.9864	0.000212	0.002256
201	wo_data	size	tf	float16	114928.0	success	0.9867	0.000310	0.000043
202	wo_data	size	tf	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
203	wo_data	size	tf	int32	59648.0	success	0.9864	0.000192	0.000016
204	wo_data	size	tf	int64	59648.0	success	0.9864	0.000194	0.000020
205	wo_data	size	tf	int8	NaN	representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types.	NaN	NaN	NaN
206	wo_data	size	tf	quantized_uint8	59648.0	success	0.9864	0.000191	0.000015
207	wo_data	size	tf	string	59648.0	success	0.9864	0.000213	0.002210
208	wo_data	size	tf	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
209	wo_data	size	tf	none	59648.0	success	0.9864	0.000195	0.000019
210	wo_data	none	int8	float	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
211	wo_data	none	int8	float16	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
212	wo_data	none	int8	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
213	wo_data	none	int8	int32	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
214	wo_data	none	int8	int64	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
215	wo_data	none	int8	int8	NaN	representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types.	NaN	NaN	NaN
216	wo_data	none	int8	quantized_uint8	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
217	wo_data	none	int8	string	NaN	TFLITE_BUILTINS_INT8 requires smallest supported type to be INT8.	NaN	NaN	NaN
218	wo_data	none	int8	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
219	wo_data	none	int8	none	NaN	representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types.	NaN	NaN	NaN
220	wo_data	none	tflite	float	223952.0	success	0.9867	0.000301	0.000023
221	wo_data	none	tflite	float16	223952.0	success	0.9867	0.000322	0.002276
222	wo_data	none	tflite	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
223	wo_data	none	tflite	int32	223952.0	success	0.9867	0.000301	0.000025
224	wo_data	none	tflite	int64	223952.0	success	0.9867	0.000302	0.000031
225	wo_data	none	tflite	int8	NaN	representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types.	NaN	NaN	NaN
226	wo_data	none	tflite	quantized_uint8	223952.0	success	0.9867	0.000299	0.000021
227	wo_data	none	tflite	string	223952.0	success	0.9867	0.000300	0.000022
228	wo_data	none	tflite	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
229	wo_data	none	tflite	none	223952.0	success	0.9867	0.000300	0.000021
230	wo_data	none	tf	float	223952.0	success	0.9867	0.000300	0.000022
231	wo_data	none	tf	float16	223952.0	success	0.9867	0.000301	0.000048
232	wo_data	none	tf	graphviz_dot	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
233	wo_data	none	tf	int32	223952.0	success	0.9867	0.000301	0.000028
234	wo_data	none	tf	int64	223952.0	success	0.9867	0.000299	0.000024
235	wo_data	none	tf	int8	NaN	representative_dataset is required when specifying TFLITE_BUILTINS_INT8 or INT8 supported types.	NaN	NaN	NaN
236	wo_data	none	tf	quantized_uint8	223952.0	success	0.9867	0.000300	0.000023
237	wo_data	none	tf	string	223952.0	success	0.9867	0.000300	0.000025
238	wo_data	none	tf	tflite	NaN	'int' object has no attribute 'size'	NaN	NaN	NaN
239	wo_data	none	tf	none	223952.0	success	0.9867	0.000300	0.000026

Finished TF Lite models¶

In [0]:

HTML(result.dropna().to_html())

Out[0]:

	data	optimization	ops	type	size	status	accuracy	mean_inference	std_inference
5	with_data	default	int8	int8	60976.0	success	0.9869	0.001737	0.001835
9	with_data	default	int8	none	60976.0	success	0.9869	0.001721	0.000083
10	with_data	default	tflite	float	60976.0	success	0.9869	0.001719	0.000059
11	with_data	default	tflite	float16	114928.0	success	0.9867	0.000301	0.000026
13	with_data	default	tflite	int32	60976.0	success	0.9869	0.001721	0.000060
14	with_data	default	tflite	int64	60976.0	success	0.9869	0.001719	0.000059
15	with_data	default	tflite	int8	60976.0	success	0.9869	0.001718	0.000060
16	with_data	default	tflite	quantized_uint8	60976.0	success	0.9869	0.001722	0.000087
17	with_data	default	tflite	string	60976.0	success	0.9869	0.001718	0.000059
19	with_data	default	tflite	none	60976.0	success	0.9869	0.001744	0.002122
20	with_data	default	tf	float	60976.0	success	0.9869	0.001716	0.000054
21	with_data	default	tf	float16	114928.0	success	0.9867	0.000299	0.000021
23	with_data	default	tf	int32	60976.0	success	0.9869	0.001726	0.000068
24	with_data	default	tf	int64	60976.0	success	0.9869	0.001716	0.000058
25	with_data	default	tf	int8	60976.0	success	0.9869	0.001716	0.000070
26	with_data	default	tf	quantized_uint8	60976.0	success	0.9869	0.001727	0.000104
27	with_data	default	tf	string	60976.0	success	0.9869	0.001713	0.000051
29	with_data	default	tf	none	60976.0	success	0.9869	0.001715	0.000049
35	with_data	latency	int8	int8	60976.0	success	0.9869	0.001721	0.000119
39	with_data	latency	int8	none	60976.0	success	0.9869	0.001716	0.000070
40	with_data	latency	tflite	float	60976.0	success	0.9869	0.001716	0.000072
41	with_data	latency	tflite	float16	114928.0	success	0.9867	0.000301	0.000024
43	with_data	latency	tflite	int32	60976.0	success	0.9869	0.001719	0.000080
44	with_data	latency	tflite	int64	60976.0	success	0.9869	0.001718	0.000060
45	with_data	latency	tflite	int8	60976.0	success	0.9869	0.001716	0.000063
46	with_data	latency	tflite	quantized_uint8	60976.0	success	0.9869	0.001717	0.000060
47	with_data	latency	tflite	string	60976.0	success	0.9869	0.001727	0.000079
49	with_data	latency	tflite	none	60976.0	success	0.9869	0.001715	0.000067
50	with_data	latency	tf	float	60976.0	success	0.9869	0.001714	0.000049
51	with_data	latency	tf	float16	114928.0	success	0.9867	0.000303	0.000030
53	with_data	latency	tf	int32	60976.0	success	0.9869	0.001725	0.000072
54	with_data	latency	tf	int64	60976.0	success	0.9869	0.001716	0.000058
55	with_data	latency	tf	int8	60976.0	success	0.9869	0.001739	0.002140
56	with_data	latency	tf	quantized_uint8	60976.0	success	0.9869	0.001713	0.000066
57	with_data	latency	tf	string	60976.0	success	0.9869	0.001709	0.000040
59	with_data	latency	tf	none	60976.0	success	0.9869	0.001714	0.000067
65	with_data	size	int8	int8	60976.0	success	0.9869	0.001711	0.000049
69	with_data	size	int8	none	60976.0	success	0.9869	0.001717	0.000065
70	with_data	size	tflite	float	60976.0	success	0.9869	0.001717	0.000055
71	with_data	size	tflite	float16	114928.0	success	0.9867	0.000300	0.000022
73	with_data	size	tflite	int32	60976.0	success	0.9869	0.001721	0.000069
74	with_data	size	tflite	int64	60976.0	success	0.9869	0.001717	0.000057
75	with_data	size	tflite	int8	60976.0	success	0.9869	0.001717	0.000059
76	with_data	size	tflite	quantized_uint8	60976.0	success	0.9869	0.001717	0.000070
77	with_data	size	tflite	string	60976.0	success	0.9869	0.001743	0.002344
79	with_data	size	tflite	none	60976.0	success	0.9869	0.001716	0.000053
80	with_data	size	tf	float	60976.0	success	0.9869	0.001720	0.000098
81	with_data	size	tf	float16	114928.0	success	0.9867	0.000298	0.000024
83	with_data	size	tf	int32	60976.0	success	0.9869	0.001716	0.000052
84	with_data	size	tf	int64	60976.0	success	0.9869	0.001740	0.002122
85	with_data	size	tf	int8	60976.0	success	0.9869	0.001719	0.000070
86	with_data	size	tf	quantized_uint8	60976.0	success	0.9869	0.001713	0.000043
87	with_data	size	tf	string	60976.0	success	0.9869	0.001716	0.000061
89	with_data	size	tf	none	60976.0	success	0.9869	0.001716	0.000057
95	with_data	none	int8	int8	60976.0	success	0.9869	0.001721	0.000063
99	with_data	none	int8	none	60976.0	success	0.9869	0.001717	0.000071
100	with_data	none	tflite	float	223952.0	success	0.9867	0.000300	0.000032
101	with_data	none	tflite	float16	223952.0	success	0.9867	0.000300	0.000021
103	with_data	none	tflite	int32	223952.0	success	0.9867	0.000299	0.000022
104	with_data	none	tflite	int64	223952.0	success	0.9867	0.000303	0.000026
105	with_data	none	tflite	int8	60976.0	success	0.9869	0.001721	0.000061
106	with_data	none	tflite	quantized_uint8	223952.0	success	0.9867	0.000303	0.000023
107	with_data	none	tflite	string	223952.0	success	0.9867	0.000303	0.000028
109	with_data	none	tflite	none	223952.0	success	0.9867	0.000303	0.000023
110	with_data	none	tf	float	223952.0	success	0.9867	0.000302	0.000024
111	with_data	none	tf	float16	223952.0	success	0.9867	0.000301	0.000025
113	with_data	none	tf	int32	223952.0	success	0.9867	0.000303	0.000034
114	with_data	none	tf	int64	223952.0	success	0.9867	0.000300	0.000022
115	with_data	none	tf	int8	60976.0	success	0.9869	0.001720	0.000062
116	with_data	none	tf	quantized_uint8	223952.0	success	0.9867	0.000300	0.000022
117	with_data	none	tf	string	223952.0	success	0.9867	0.000301	0.000026
119	with_data	none	tf	none	223952.0	success	0.9867	0.000299	0.000022
130	wo_data	default	tflite	float	59648.0	success	0.9864	0.000203	0.000023
131	wo_data	default	tflite	float16	114928.0	success	0.9867	0.000301	0.000026
133	wo_data	default	tflite	int32	59648.0	success	0.9864	0.000193	0.000018
134	wo_data	default	tflite	int64	59648.0	success	0.9864	0.000190	0.000020
136	wo_data	default	tflite	quantized_uint8	59648.0	success	0.9864	0.000191	0.000017
137	wo_data	default	tflite	string	59648.0	success	0.9864	0.000196	0.000019
139	wo_data	default	tflite	none	59648.0	success	0.9864	0.000191	0.000016
140	wo_data	default	tf	float	59648.0	success	0.9864	0.000192	0.000019
141	wo_data	default	tf	float16	114928.0	success	0.9867	0.000301	0.000025
143	wo_data	default	tf	int32	59648.0	success	0.9864	0.000194	0.000017
144	wo_data	default	tf	int64	59648.0	success	0.9864	0.000188	0.000016
146	wo_data	default	tf	quantized_uint8	59648.0	success	0.9864	0.000198	0.000025
147	wo_data	default	tf	string	59648.0	success	0.9864	0.000197	0.000018
149	wo_data	default	tf	none	59648.0	success	0.9864	0.000189	0.000014
160	wo_data	latency	tflite	float	59648.0	success	0.9864	0.000195	0.000017
161	wo_data	latency	tflite	float16	114928.0	success	0.9867	0.000302	0.000030
163	wo_data	latency	tflite	int32	59648.0	success	0.9864	0.000197	0.000017
164	wo_data	latency	tflite	int64	59648.0	success	0.9864	0.000213	0.002201
166	wo_data	latency	tflite	quantized_uint8	59648.0	success	0.9864	0.000194	0.000018
167	wo_data	latency	tflite	string	59648.0	success	0.9864	0.000190	0.000018
169	wo_data	latency	tflite	none	59648.0	success	0.9864	0.000189	0.000016
170	wo_data	latency	tf	float	59648.0	success	0.9864	0.000188	0.000014
171	wo_data	latency	tf	float16	114928.0	success	0.9867	0.000321	0.002138
173	wo_data	latency	tf	int32	59648.0	success	0.9864	0.000196	0.000018
174	wo_data	latency	tf	int64	59648.0	success	0.9864	0.000191	0.000017
176	wo_data	latency	tf	quantized_uint8	59648.0	success	0.9864	0.000195	0.000024
177	wo_data	latency	tf	string	59648.0	success	0.9864	0.000202	0.000022
179	wo_data	latency	tf	none	59648.0	success	0.9864	0.000190	0.000016
190	wo_data	size	tflite	float	59648.0	success	0.9864	0.000192	0.000017
191	wo_data	size	tflite	float16	114928.0	success	0.9867	0.000299	0.000025
193	wo_data	size	tflite	int32	59648.0	success	0.9864	0.000203	0.000021
194	wo_data	size	tflite	int64	59648.0	success	0.9864	0.000194	0.000022
196	wo_data	size	tflite	quantized_uint8	59648.0	success	0.9864	0.000197	0.000020
197	wo_data	size	tflite	string	59648.0	success	0.9864	0.000194	0.000018
199	wo_data	size	tflite	none	59648.0	success	0.9864	0.000200	0.000018
200	wo_data	size	tf	float	59648.0	success	0.9864	0.000212	0.002256
201	wo_data	size	tf	float16	114928.0	success	0.9867	0.000310	0.000043
203	wo_data	size	tf	int32	59648.0	success	0.9864	0.000192	0.000016
204	wo_data	size	tf	int64	59648.0	success	0.9864	0.000194	0.000020
206	wo_data	size	tf	quantized_uint8	59648.0	success	0.9864	0.000191	0.000015
207	wo_data	size	tf	string	59648.0	success	0.9864	0.000213	0.002210
209	wo_data	size	tf	none	59648.0	success	0.9864	0.000195	0.000019
220	wo_data	none	tflite	float	223952.0	success	0.9867	0.000301	0.000023
221	wo_data	none	tflite	float16	223952.0	success	0.9867	0.000322	0.002276
223	wo_data	none	tflite	int32	223952.0	success	0.9867	0.000301	0.000025
224	wo_data	none	tflite	int64	223952.0	success	0.9867	0.000302	0.000031
226	wo_data	none	tflite	quantized_uint8	223952.0	success	0.9867	0.000299	0.000021
227	wo_data	none	tflite	string	223952.0	success	0.9867	0.000300	0.000022
229	wo_data	none	tflite	none	223952.0	success	0.9867	0.000300	0.000021
230	wo_data	none	tf	float	223952.0	success	0.9867	0.000300	0.000022
231	wo_data	none	tf	float16	223952.0	success	0.9867	0.000301	0.000048
233	wo_data	none	tf	int32	223952.0	success	0.9867	0.000301	0.000028
234	wo_data	none	tf	int64	223952.0	success	0.9867	0.000299	0.000024
236	wo_data	none	tf	quantized_uint8	223952.0	success	0.9867	0.000300	0.000023
237	wo_data	none	tf	string	223952.0	success	0.9867	0.000300	0.000025
239	wo_data	none	tf	none	223952.0	success	0.9867	0.000300	0.000026
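
The sizes and timings above are easier to compare as ratios against the unoptimized float32 model. The short sketch below does that arithmetic. The numbers are copied from the table (rows 229, 139, 131 and 29), and the labels are my own shorthand rather than anything produced by the converter. Timings depend on the host, so treat the slowdown as specific to this run.

```python
# Sizes in bytes and mean inference times in seconds, copied from the table above.
baseline = {"size": 223952.0, "mean_inference": 0.000300}  # row 229: no optimization
variants = {
    "weight-only int8 (row 139)": {"size": 59648.0, "mean_inference": 0.000191},
    "float16 (row 131)": {"size": 114928.0, "mean_inference": 0.000301},
    "full int8 with data (row 29)": {"size": 60976.0, "mean_inference": 0.001715},
}

def ratios(variant, base=baseline):
    """Return (compression factor, slowdown factor) relative to the baseline."""
    compression = base["size"] / variant["size"]
    slowdown = variant["mean_inference"] / base["mean_inference"]
    return compression, slowdown

summary = {name: ratios(v) for name, v in variants.items()}
for name, (compression, slowdown) in summary.items():
    print("%-30s %.2fx smaller, %.2fx inference time" % (name, compression, slowdown))
```

In this run the model converted with representative data is roughly as small as the weight-only quantized one, but its inference time is several times that of the float model, while weight-only quantization is both smaller and somewhat faster.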

TF Lite Interpreter result details¶

In [0]:

data_gen = create_data(x_test)

Plain TF Lite Convert¶

In [0]:

plain_tflite = np.squeeze([i for i in get_res("tmp/wo_data-opt(none)-ops(tf)-type(float).tflite", data_gen)])
mismatch, diff = get_diff(plain_res, plain_tflite)
plt.hist(diff)
plt.title("total mismatch=%d"%mismatch)
plt.show()

In [0]:

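def print_inpt(filename):
  # Print index, dtype, name, shape and (scale, zero_point) for every tensor in the model.
  intp = lite.Interpreter(filename)
  for i in intp.get_tensor_details():
    # i['dtype'] is a numpy type such as <class 'numpy.float32'>; keep only "float32".
    print("\t".join(["%d"%i['index'], ("%s"%i['dtype']).split("'")[1].split('.')[1], i['name'], "%s"%i['shape'], "(%f,%f)"%i['quantization']]))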

In [0]:

print_inpt("tmp/wo_data-opt(none)-ops(tf)-type(float).tflite")

0	float32	batch_normalization/FusedBatchNormV3	[ 1 26 26 16]	(0.000000,0.000000)
1	float32	batch_normalization/FusedBatchNormV3_add_param	[16]	(0.000000,0.000000)
2	float32	batch_normalization/FusedBatchNormV3_mul_0	[ 1 26 26 16]	(0.000000,0.000000)
3	float32	batch_normalization/FusedBatchNormV3_mul_0_param	[16]	(0.000000,0.000000)
4	float32	batch_normalization_1/FusedBatchNormV3	[ 1 11 11 16]	(0.000000,0.000000)
5	float32	batch_normalization_1/FusedBatchNormV3_add_param	[16]	(0.000000,0.000000)
6	float32	batch_normalization_1/FusedBatchNormV3_mul_0	[ 1 11 11 16]	(0.000000,0.000000)
7	float32	batch_normalization_1/FusedBatchNormV3_mul_0_param	[16]	(0.000000,0.000000)
8	float32	conv2d/Conv2D_bias	[16]	(0.000000,0.000000)
9	float32	conv2d/Relu	[ 1 26 26 16]	(0.000000,0.000000)
10	float32	conv2d/kernel	[ 1  3  3 16]	(0.000000,0.000000)
11	float32	conv2d_1/Conv2D_bias	[16]	(0.000000,0.000000)
12	float32	conv2d_1/Relu	[ 1 11 11 16]	(0.000000,0.000000)
13	float32	conv2d_1/kernel	[16  3  3 16]	(0.000000,0.000000)
14	float32	conv2d_input	[ 1 28 28  1]	(0.000000,0.000000)
15	float32	dense/MatMul_bias	[128]	(0.000000,0.000000)
16	float32	dense/Relu	[  1 128]	(0.000000,0.000000)
17	float32	dense/kernel/transpose	[128 400]	(0.000000,0.000000)
18	float32	dense_1/BiasAdd	[ 1 10]	(0.000000,0.000000)
19	float32	dense_1/MatMul_bias	[10]	(0.000000,0.000000)
20	float32	dense_1/Softmax	[ 1 10]	(0.000000,0.000000)
21	float32	dense_1/kernel/transpose	[ 10 128]	(0.000000,0.000000)
22	float32	max_pooling2d/MaxPool	[ 1 13 13 16]	(0.000000,0.000000)
23	float32	max_pooling2d_1/MaxPool	[ 1  5  5 16]	(0.000000,0.000000)

TF default optimization¶

In [0]:

plain_opt = np.squeeze([i for i in get_res("tmp/wo_data-opt(default)-ops(tflite)-type(float).tflite", data_gen)])
mismatch, diff = get_diff(plain_res, plain_opt)
plt.hist(diff)
plt.title("total mismatch=%d"%mismatch)
plt.show()

In [0]:

print_inpt("tmp/wo_data-opt(default)-ops(tflite)-type(float).tflite")

0	float32	batch_normalization/FusedBatchNormV3	[ 1 26 26 16]	(0.000000,0.000000)
1	float32	batch_normalization/FusedBatchNormV3_add_param	[16]	(0.000000,0.000000)
2	float32	batch_normalization/FusedBatchNormV3_mul_0	[ 1 26 26 16]	(0.000000,0.000000)
3	float32	batch_normalization/FusedBatchNormV3_mul_0_param	[16]	(0.000000,0.000000)
4	float32	batch_normalization_1/FusedBatchNormV3	[ 1 11 11 16]	(0.000000,0.000000)
5	float32	batch_normalization_1/FusedBatchNormV3_add_param	[16]	(0.000000,0.000000)
6	float32	batch_normalization_1/FusedBatchNormV3_mul_0	[ 1 11 11 16]	(0.000000,0.000000)
7	float32	batch_normalization_1/FusedBatchNormV3_mul_0_param	[16]	(0.000000,0.000000)
8	float32	conv2d/Conv2D_bias	[16]	(0.000000,0.000000)
9	float32	conv2d/Relu	[ 1 26 26 16]	(0.000000,0.000000)
10	float32	conv2d/kernel	[ 1  3  3 16]	(0.000000,0.000000)
11	float32	conv2d_1/Conv2D_bias	[16]	(0.000000,0.000000)
12	float32	conv2d_1/Relu	[ 1 11 11 16]	(0.000000,0.000000)
13	int8	conv2d_1/kernel	[16  3  3 16]	(0.002425,0.000000)
14	float32	conv2d_input	[ 1 28 28  1]	(0.000000,0.000000)
15	float32	dense/MatMul_bias	[128]	(0.000000,0.000000)
16	float32	dense/Relu	[  1 128]	(0.000000,0.000000)
17	int8	dense/kernel/transpose	[128 400]	(0.002713,0.000000)
18	float32	dense_1/BiasAdd	[ 1 10]	(0.000000,0.000000)
19	float32	dense_1/MatMul_bias	[10]	(0.000000,0.000000)
20	float32	dense_1/Softmax	[ 1 10]	(0.000000,0.000000)
21	int8	dense_1/kernel/transpose	[ 10 128]	(0.003590,0.000000)
22	float32	max_pooling2d/MaxPool	[ 1 13 13 16]	(0.000000,0.000000)
23	float32	max_pooling2d_1/MaxPool	[ 1  5  5 16]	(0.000000,0.000000)
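
The size reduction can be sanity-checked from the tensor list above: only the three kernels marked int8 change from 4 bytes to 1 byte per element. The sketch below is a back-of-the-envelope estimate under that assumption. It ignores the file overhead of the added quantization parameters, so it is not expected to match the file size exactly.

```python
from functools import reduce
from operator import mul

# Shapes of the tensors reported as int8 in the listing above.
quantized_kernels = {
    "conv2d_1/kernel": (16, 3, 3, 16),
    "dense/kernel/transpose": (128, 400),
    "dense_1/kernel/transpose": (10, 128),
}

FLOAT32_BYTES = 4
INT8_BYTES = 1
float_model_size = 223952  # bytes, unoptimized model from the results table

def num_elements(shape):
    """Number of elements in a tensor of the given shape."""
    return reduce(mul, shape, 1)

quantized_elements = sum(num_elements(s) for s in quantized_kernels.values())
saved_bytes = quantized_elements * (FLOAT32_BYTES - INT8_BYTES)
estimated_size = float_model_size - saved_bytes

print("quantized elements:", quantized_elements)
print("estimated size:", estimated_size, "bytes (observed: 59648)")
```

The estimate comes out 48 bytes below the observed 59648 bytes, which suggests that the three quantized kernels account for essentially all of the saving. Note that conv2d/kernel, with only 144 elements, stays float32 in the listing.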

TF with representative data¶

In [0]:

data_opt = np.squeeze([i for i in get_res("tmp/with_data-opt(default)-ops(tflite)-type(float).tflite", data_gen)])
mismatch, diff = get_diff(plain_res, data_opt)
plt.hist(diff)
plt.title("total mismatch=%d"%mismatch)
plt.show()

In [0]:

print_inpt("tmp/with_data-opt(default)-ops(tflite)-type(float).tflite")

0	int8	batch_normalization/FusedBatchNormV3	[ 1 26 26 16]	(0.043397,-107.000000)
1	int8	batch_normalization/FusedBatchNormV3_add_param	[16]	(0.007060,0.000000)
2	int8	batch_normalization/FusedBatchNormV3_mul_0	[ 1 26 26 16]	(0.041571,-128.000000)
3	int8	batch_normalization/FusedBatchNormV3_mul_0_param	[16]	(0.239892,0.000000)
4	int8	batch_normalization_1/FusedBatchNormV3	[ 1 11 11 16]	(0.078118,-119.000000)
5	int8	batch_normalization_1/FusedBatchNormV3_add_param	[16]	(0.005670,0.000000)
6	int8	batch_normalization_1/FusedBatchNormV3_mul_0	[ 1 11 11 16]	(0.077301,-128.000000)
7	int8	batch_normalization_1/FusedBatchNormV3_mul_0_param	[16]	(0.021564,0.000000)
8	int32	conv2d/Conv2D_bias	[16]	(0.000000,0.000000)
9	int8	conv2d/Relu	[ 1 26 26 16]	(0.002887,-128.000000)
10	int8	conv2d/kernel	[ 1  3  3 16]	(0.000000,0.000000)
11	int32	conv2d_1/Conv2D_bias	[16]	(0.000000,0.000000)
12	int8	conv2d_1/Relu	[ 1 11 11 16]	(0.039788,-128.000000)
13	int8	conv2d_1/kernel	[16  3  3 16]	(0.000000,0.000000)
14	int8	conv2d_input_int8	[ 1 28 28  1]	(0.003922,-128.000000)
15	int32	dense/MatMul_bias	[128]	(0.000212,0.000000)
16	int8	dense/Relu	[  1 128]	(0.092764,-128.000000)
17	int8	dense/kernel/transpose	[128 400]	(0.002713,0.000000)
18	int8	dense_1/BiasAdd	[ 1 10]	(0.281922,-28.000000)
19	int32	dense_1/MatMul_bias	[10]	(0.000333,0.000000)
20	int8	dense_1/Softmax_int8	[ 1 10]	(0.003906,-128.000000)
21	int8	dense_1/kernel/transpose	[ 10 128]	(0.003590,0.000000)
22	int8	max_pooling2d/MaxPool	[ 1 13 13 16]	(0.043397,-107.000000)
23	int8	max_pooling2d_1/MaxPool	[ 1  5  5 16]	(0.078118,-119.000000)
24	float32	conv2d_input	[ 1 28 28  1]	(0.000000,0.000000)
25	float32	dense_1/Softmax	[ 1 10]	(0.000000,0.000000)
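
The last column of the listing is the (scale, zero_point) pair of each tensor. In TF Lite's quantization scheme an int8 value q represents the real value scale * (q - zero_point). The sketch below applies that to the conv2d_input_int8 tensor, whose parameters (0.003922, -128) are roughly 1/255 and the int8 minimum, so pixel values in [0, 1] span the full int8 range. The helper names are hypothetical and not part of the TF Lite API.

```python
INT8_MIN, INT8_MAX = -128, 127

def quantize(real_value, scale, zero_point):
    """Map a real value to int8 using affine quantization, clamping to the int8 range."""
    q = int(round(real_value / scale)) + zero_point
    return max(INT8_MIN, min(INT8_MAX, q))

def dequantize(q, scale, zero_point):
    """Recover the approximate real value represented by an int8 value."""
    return scale * (q - zero_point)

# Parameters of conv2d_input_int8 from the listing above.
scale, zero_point = 0.003922, -128

for pixel in (0.0, 0.5, 1.0):
    q = quantize(pixel, scale, zero_point)
    print(pixel, "->", q, "->", round(dequantize(q, scale, zero_point), 4))
```

The round trip loses at most about one quantization step (one scale unit) per value, which is consistent with the small accuracy difference between the float and quantized models in the results table.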

Other quick comparison¶

In [0]:

print_inpt("tmp/with_data-opt(default)-ops(int8)-type(int8).tflite")

0	int8	batch_normalization/FusedBatchNormV3	[ 1 26 26 16]	(0.043397,-107.000000)
1	int8	batch_normalization/FusedBatchNormV3_add_param	[16]	(0.007060,0.000000)
2	int8	batch_normalization/FusedBatchNormV3_mul_0	[ 1 26 26 16]	(0.041571,-128.000000)
3	int8	batch_normalization/FusedBatchNormV3_mul_0_param	[16]	(0.239892,0.000000)
4	int8	batch_normalization_1/FusedBatchNormV3	[ 1 11 11 16]	(0.078118,-119.000000)
5	int8	batch_normalization_1/FusedBatchNormV3_add_param	[16]	(0.005670,0.000000)
6	int8	batch_normalization_1/FusedBatchNormV3_mul_0	[ 1 11 11 16]	(0.077301,-128.000000)
7	int8	batch_normalization_1/FusedBatchNormV3_mul_0_param	[16]	(0.021564,0.000000)
8	int32	conv2d/Conv2D_bias	[16]	(0.000000,0.000000)
9	int8	conv2d/Relu	[ 1 26 26 16]	(0.002887,-128.000000)
10	int8	conv2d/kernel	[ 1  3  3 16]	(0.000000,0.000000)
11	int32	conv2d_1/Conv2D_bias	[16]	(0.000000,0.000000)
12	int8	conv2d_1/Relu	[ 1 11 11 16]	(0.039788,-128.000000)
13	int8	conv2d_1/kernel	[16  3  3 16]	(0.000000,0.000000)
14	int8	conv2d_input_int8	[ 1 28 28  1]	(0.003922,-128.000000)
15	int32	dense/MatMul_bias	[128]	(0.000212,0.000000)
16	int8	dense/Relu	[  1 128]	(0.092764,-128.000000)
17	int8	dense/kernel/transpose	[128 400]	(0.002713,0.000000)
18	int8	dense_1/BiasAdd	[ 1 10]	(0.281922,-28.000000)
19	int32	dense_1/MatMul_bias	[10]	(0.000333,0.000000)
20	int8	dense_1/Softmax_int8	[ 1 10]	(0.003906,-128.000000)
21	int8	dense_1/kernel/transpose	[ 10 128]	(0.003590,0.000000)
22	int8	max_pooling2d/MaxPool	[ 1 13 13 16]	(0.043397,-107.000000)
23	int8	max_pooling2d_1/MaxPool	[ 1  5  5 16]	(0.078118,-119.000000)
24	float32	conv2d_input	[ 1 28 28  1]	(0.000000,0.000000)
25	float32	dense_1/Softmax	[ 1 10]	(0.000000,0.000000)

In [0]:

print_inpt("tmp/with_data-opt(size)-ops(tflite)-type(int8).tflite")

0	int8	batch_normalization/FusedBatchNormV3	[ 1 26 26 16]	(0.043397,-107.000000)
1	int8	batch_normalization/FusedBatchNormV3_add_param	[16]	(0.007060,0.000000)
2	int8	batch_normalization/FusedBatchNormV3_mul_0	[ 1 26 26 16]	(0.041571,-128.000000)
3	int8	batch_normalization/FusedBatchNormV3_mul_0_param	[16]	(0.239892,0.000000)
4	int8	batch_normalization_1/FusedBatchNormV3	[ 1 11 11 16]	(0.078118,-119.000000)
5	int8	batch_normalization_1/FusedBatchNormV3_add_param	[16]	(0.005670,0.000000)
6	int8	batch_normalization_1/FusedBatchNormV3_mul_0	[ 1 11 11 16]	(0.077301,-128.000000)
7	int8	batch_normalization_1/FusedBatchNormV3_mul_0_param	[16]	(0.021564,0.000000)
8	int32	conv2d/Conv2D_bias	[16]	(0.000000,0.000000)
9	int8	conv2d/Relu	[ 1 26 26 16]	(0.002887,-128.000000)
10	int8	conv2d/kernel	[ 1  3  3 16]	(0.000000,0.000000)
11	int32	conv2d_1/Conv2D_bias	[16]	(0.000000,0.000000)
12	int8	conv2d_1/Relu	[ 1 11 11 16]	(0.039788,-128.000000)
13	int8	conv2d_1/kernel	[16  3  3 16]	(0.000000,0.000000)
14	int8	conv2d_input_int8	[ 1 28 28  1]	(0.003922,-128.000000)
15	int32	dense/MatMul_bias	[128]	(0.000212,0.000000)
16	int8	dense/Relu	[  1 128]	(0.092764,-128.000000)
17	int8	dense/kernel/transpose	[128 400]	(0.002713,0.000000)
18	int8	dense_1/BiasAdd	[ 1 10]	(0.281922,-28.000000)
19	int32	dense_1/MatMul_bias	[10]	(0.000333,0.000000)
20	int8	dense_1/Softmax_int8	[ 1 10]	(0.003906,-128.000000)
21	int8	dense_1/kernel/transpose	[ 10 128]	(0.003590,0.000000)
22	int8	max_pooling2d/MaxPool	[ 1 13 13 16]	(0.043397,-107.000000)
23	int8	max_pooling2d_1/MaxPool	[ 1  5  5 16]	(0.078118,-119.000000)
24	float32	conv2d_input	[ 1 28 28  1]	(0.000000,0.000000)
25	float32	dense_1/Softmax	[ 1 10]	(0.000000,0.000000)

Remarks¶

Take-aways¶

Now, let's take another look at the workflow provided by TF.

Lite convert decision map

Optimization types are not fully implemented; see [here](https://github.com/tensorflow/tensorflow/blob/570206441717511720fdae9ac58dac16cc1d348a/tensorflow/lite/python/lite.py#L96).
Without data and without optimization, ops and types do not matter, except for the crashing cases (e.g. int8). The converter creates a float32 TF Lite model for the runtime. This corresponds to the "N" branch of "Optimize model?" in the decision map.
- Exception case 1: using int8 in types or ops.
- Exception case 2: with data and ops set to int8 (types is int8 or none), the model is quantized even without optimization.
With optimization and a type of float16, the model shrinks to about half its size (223952 to 114928 bytes).
With optimization (and without float16), some weights are quantized to int8.
With data and optimization, weights are stored as integer types. However, int8 is not strictly enforced: for example, the model input and output tensors remain float32.
When ops is int8, the type needs to be int8 (or none).
int8 and uint8 are quite different.
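
The take-aways above can be condensed into a small lookup that predicts what the converter produced in this experiment for a given combination of options. This is only a summary of the results table for TF 1.15 on this model, written as a hypothetical helper (`expected_outcome` is not a TF Lite function). The returned labels are my shorthand for the four observed file sizes: "float32" (223952 bytes), "float16" (114928), "weight-only int8" (59648) and "full int8" (60976). Other versions or models may behave differently.

```python
def expected_outcome(with_data, optimized, ops, dtype):
    """Summarize the outcome observed in the results table.

    with_data: a representative dataset was supplied.
    optimized: optimization was default, latency or size (not none).
    ops: "tf", "tflite" or "int8"; dtype: the type label used in the table.
    """
    if dtype in ("graphviz_dot", "tflite"):
        return "error"  # these labels are not tensor types
    if ops == "int8" and dtype not in ("int8", "none"):
        return "error"  # int8 ops require int8 as the smallest type
    if ops == "int8" or dtype == "int8":
        # Requesting int8 needs a representative dataset, with or without optimization.
        return "full int8" if with_data else "error"
    if not optimized:
        return "float32"
    if dtype == "float16":
        return "float16"
    return "full int8" if with_data else "weight-only int8"

print(expected_outcome(False, False, "tf", "float"))     # row 230
print(expected_outcome(False, True, "tflite", "none"))   # row 139
print(expected_outcome(True, True, "tf", "float16"))     # row 21
print(expected_outcome(True, False, "int8", "int8"))     # row 95
```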

Remaining mysteries¶

What's the difference between select and builtin?
What's the string or none op type?

An unexpected problem¶

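A model made only of Dense layers converts without complaint when given a representative dataset, but the interpreter then fails in allocate_tensors, as the traceback below shows. If you check the [source code](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/kernels/kernel_util.cc#L100), the failing check is in the GetQuantizedConvolutionMultipler function. So something interesting happens in the conversion of the fully connected layer. To save trouble and stay focused on the original goal, I do not investigate it further here.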

In [0]:

m = keras.Sequential(
    [keras.layers.Dense(100, input_shape=(5,)),
     keras.layers.Dense(100),
     keras.layers.Dense(3)]
)
m.save("model2.h5")
x = np.random.randn(100,5).astype('float32')

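def data_gen():
  # Yield one sample at a time with a leading batch dimension, shape (1, 5).
  for i in range(100):
    yield x[None, i]

def data_gen2():
  # Wrap each sample in the nested-list form used for representative_dataset.
  y = data_gen()
  for i in y:
    yield [list(i)]

# TF 1.x loads the saved .h5 file; TF 2.x converts the in-memory Keras model.
if ver1_flag:
  conv = lite.TFLiteConverter.from_keras_model_file("model2.h5")
else:
  conv = lite.TFLiteConverter.from_keras_model(m)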
conv.optimizations = [lite.Optimize.DEFAULT]
conv.representative_dataset = data_gen2
with open("problem.tflite", "wb") as f:
  f.write(conv.convert())

WARNING:tensorflow:No training configuration found in save file: the model was *not* compiled. Compile it manually.

INFO:tensorflow:Froze 6 variables.

INFO:tensorflow:Converted 6 variables to const ops.

In [0]:

intp = lite.Interpreter("problem.tflite")
intp.allocate_tensors()

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-33-5338b2d27dd6> in <module>()
      1 intp = lite.Interpreter("problem.tflite")
----> 2 intp.allocate_tensors()

/usr/local/lib/python3.6/dist-packages/tensorflow_core/lite/python/interpreter.py in allocate_tensors(self)
    242   def allocate_tensors(self):
    243     self._ensure_safe()
--> 244     return self._interpreter.AllocateTensors()
    245 
    246   def _safe_to_run(self):

/usr/local/lib/python3.6/dist-packages/tensorflow_core/lite/python/interpreter_wrapper/tensorflow_wrap_interpreter_wrapper.py in AllocateTensors(self)
    104 
    105     def AllocateTensors(self):
--> 106         return _tensorflow_wrap_interpreter_wrapper.InterpreterWrapper_AllocateTensors(self)
    107 
    108     def Invoke(self):

RuntimeError: tensorflow/lite/kernels/kernel_util.cc:106 std::abs(input_product_scale - bias_scale) <= 1e-6 * std::min(input_product_scale, bias_scale) was not true.Node number 2 (FULLY_CONNECTED) failed to prepare.
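
The failing assertion compares the bias scale with the product of the input and filter scales. Below is a minimal Python rendering of that check, assuming the product is formed as input_scale * filter_scale (the function name is hypothetical). The example values are scales from the CNN listing above: the dense kernel and the tensor assumed to feed the dense layer, whose printed bias scale 0.000212 is their product rounded to six decimals.

```python
def scales_consistent(input_scale, filter_scale, bias_scale):
    """Mirror the relative-tolerance check from the error message."""
    input_product_scale = input_scale * filter_scale
    tolerance = 1e-6 * min(input_product_scale, bias_scale)
    return abs(input_product_scale - bias_scale) <= tolerance

# Scales of max_pooling2d_1/MaxPool (assumed input of the dense layer)
# and dense/kernel/transpose, as printed in the listing.
input_scale, filter_scale = 0.078118, 0.002713

# A bias scale equal to the product passes the check.
print(scales_consistent(input_scale, filter_scale, input_scale * filter_scale))

# The six-decimal printed value 0.000212 is already outside the 1e-6 relative tolerance.
print(scales_consistent(input_scale, filter_scale, 0.000212))
```

This shows how tight the tolerance is: the stored bias scale has to match the product almost exactly, which apparently did not happen for one of the fully connected layers of this small model.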
