Interpreter interface for running TensorFlow Lite models.
tf.lite.Interpreter(
model_path=None,
model_content=None,
experimental_delegates=None,
num_threads=None,
experimental_op_resolver_type=tf.lite.experimental.OpResolverType.AUTO,
experimental_preserve_all_tensors=False,
experimental_disable_delegate_clustering=False,
experimental_default_delegate_latest_features=False
)
Models obtained from TFLiteConverter can be run in Python with Interpreter.
As an example, let's generate a simple Keras model and convert it to TFLite (TFLiteConverter also supports other input formats through from_saved_model and from_concrete_functions).
import numpy as np
import tensorflow as tf

x = np.array([[1.], [2.]])
y = np.array([[2.], [4.]])
model = tf.keras.models.Sequential([
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(units=1, input_shape=[1])
])
model.compile(optimizer='sgd', loss='mean_squared_error')
model.fit(x, y, epochs=1)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
tflite_model can be saved to a file and loaded later, or passed directly to the Interpreter. Since TensorFlow Lite pre-plans tensor allocations to optimize inference, the user needs to call allocate_tensors() before any inference.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors() # Needed before execution!
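The other option mentioned above, saving the converted model to a file and loading it later through model_path, might look like the following sketch. The Doubler module, the temporary directory, and the file name doubler.tflite are hypothetical and exist only to keep the example self-contained; any converted model would work the same way.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


class Doubler(tf.Module):
    """Hypothetical toy model used only for illustration: y = 2 * x."""

    @tf.function(input_signature=[tf.TensorSpec(shape=[1, 1], dtype=tf.float32)])
    def __call__(self, x):
        return 2.0 * x


module = Doubler()
concrete_fn = module.__call__.get_concrete_function()
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_fn], module)
tflite_model = converter.convert()  # Serialized flatbuffer, as bytes.

# Save the serialized model to disk (hypothetical file name).
model_file = os.path.join(tempfile.mkdtemp(), "doubler.tflite")
with open(model_file, "wb") as f:
    f.write(tflite_model)

# Later: load the model from the file instead of from memory.
interpreter = tf.lite.Interpreter(model_path=model_file)
interpreter.allocate_tensors()  # Still required before inference.

input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]
interpreter.set_tensor(input_details["index"], np.array([[3.0]], dtype=np.float32))
interpreter.invoke()
result = interpreter.get_tensor(output_details["index"])
print(result)
```

Loading from a file keeps conversion and inference in separate processes, which is usually how a deployed model is consumed.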
Sample execution:
output = interpreter.get_output_details()[0] # Model has single output.
input = interpreter.get_input_details()[0] # Model has single input.
input_data = tf.constant(1., shape=[1, 1])
interpreter.set_tensor(input['index'], input_data)
interpreter.invoke()
interpreter.get_tensor(output['index']).shape
(1, 1)
Use get_signature_runner() for a more user-friendly inference API.
Args | |
---|---|
model_path | Path to the TF-Lite Flatbuffer file. |
model_content | Serialized content of the model (for example, the value returned by converter.convert()). |
experimental_delegates | Experimental. Subject to change. List of TfLiteDelegate objects returned by lite.load_delegate(). |
num_threads | Sets the number of threads used by the interpreter and available to CPU kernels. If not set, the interpreter uses an implementation-dependent default number of threads. Currently, only a subset of kernels, such as conv, support multi-threading. num_threads should be >= -1. Setting num_threads to 0 disables multithreading, which is equivalent to setting num_threads to 1. If set to -1, the number of threads used is implementation-defined and platform-dependent. |
experimental_op_resolver_type | The op resolver used by the interpreter. It must be an instance of OpResolverType. By default, the built-in op resolver is used, which corresponds to tflite::ops::builtin::BuiltinOpResolver in C++. |
experimental_preserve_all_tensors | If true, intermediate tensors used during computation are preserved for inspection, and if the passed op resolver type is AUTO or BUILTIN, the type is changed to BUILTIN_WITHOUT_DEFAULT_DELEGATES so that no TensorFlow Lite default delegates are applied. If false, getting intermediate tensors could result in undefined values or None, especially when the graph is successfully modified by the TensorFlow Lite default delegate. |
experimental_disable_delegate_clustering | If true, don't perform delegate clustering during the delegate graph partitioning phase. Disabling delegate clustering makes the execution order of ops respect the explicitly inserted control dependencies in the graph (inserted via with tf.control_dependencies()), since the TF Lite converter drops control dependencies by default. Most users shouldn't set this flag to True if they don't insert explicit control dependencies, or if the graph execution order is already as expected. For automatically inserted control dependencies (with tf.Variable, tf.Print, etc.), the user doesn't need to set this flag to True since they are respected by default. Note that this flag is currently experimental, and it might be removed or updated if the TF Lite converter no longer drops such control dependencies in the model. Default is False. |
experimental_default_delegate_latest_features | If true, default delegates may enable all flag-protected features. Default is False. |
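As a rough illustration of the num_threads argument, the sketch below builds two interpreters for the same model with different thread counts and checks that they agree. The Doubler module and the run_once helper are hypothetical names introduced for this example; the thread counts 1 and 2 are arbitrary, and a model this small would not be expected to show any speed difference.

```python
import numpy as np
import tensorflow as tf


class Doubler(tf.Module):
    """Hypothetical toy model used only for illustration: y = 2 * x."""

    @tf.function(input_signature=[tf.TensorSpec(shape=[1, 1], dtype=tf.float32)])
    def __call__(self, x):
        return 2.0 * x


module = Doubler()
concrete_fn = module.__call__.get_concrete_function()
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_fn], module)
tflite_model = converter.convert()


def run_once(num_threads):
    """Builds an interpreter with the given thread count and runs one inference."""
    interpreter = tf.lite.Interpreter(
        model_content=tflite_model, num_threads=num_threads)
    interpreter.allocate_tensors()
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]
    interpreter.set_tensor(input_index, np.array([[3.0]], dtype=np.float32))
    interpreter.invoke()
    return interpreter.get_tensor(output_index)


single_threaded = run_once(num_threads=1)
multi_threaded = run_once(num_threads=2)

# The thread count affects how kernels are scheduled, not the result.
print(np.array_equal(single_threaded, multi_threaded))
```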
Raises | |
---|---|
ValueError | If the interpreter could not be created. |
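One way a caller might guard against this error is sketched below. It assumes that pointing model_path at a file that does not exist is one of the conditions that prevents the interpreter from being created; the path does_not_exist.tflite is hypothetical.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical path inside a fresh, empty temporary directory.
missing_path = os.path.join(tempfile.mkdtemp(), "does_not_exist.tflite")

try:
    interpreter = tf.lite.Interpreter(model_path=missing_path)
    created = True
except ValueError as err:
    # Construction failed, so there is no interpreter to use.
    created = False
    print(f"Could not create interpreter: {err}")
```

Catching the exception at construction time lets an application fall back to another model file or report a clear error before any inference is attempted.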
Methods
allocate_tensors
allocate_tensors()
get_input_details
get_input_details()
Gets model input tensor details.
Returns | |
---|---|
A list in which each item is a dictionary with details about an input tensor. Each dictionary contains the following fields that describe the tensor: |
get_output_details
get_output_details()
Gets model output tensor details.
Returns | |
---|---|
A list in which each item is a dictionary with details about an output tensor. The dictionary contains the same fields as described for get_input_details(). |
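To see what get_input_details() and get_output_details() return for a concrete model, a sketch along these lines can help. The Doubler module is a hypothetical toy model; the example reads only the index, shape, and dtype entries and prints whatever other keys the dictionaries contain rather than assuming them.

```python
import numpy as np
import tensorflow as tf


class Doubler(tf.Module):
    """Hypothetical toy model used only for illustration: y = 2 * x."""

    @tf.function(input_signature=[tf.TensorSpec(shape=[1, 1], dtype=tf.float32)])
    def __call__(self, x):
        return 2.0 * x


module = Doubler()
concrete_fn = module.__call__.get_concrete_function()
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_fn], module)
tflite_model = converter.convert()

interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

for kind, details in (("input", input_details), ("output", output_details)):
    for tensor in details:
        print(kind, tensor["index"], tensor["shape"], tensor["dtype"])
        print("  available fields:", sorted(tensor.keys()))

# The details can be used to build an input of the right shape and dtype.
first_input = input_details[0]
input_data = np.ones(first_input["shape"], dtype=first_input["dtype"])
interpreter.set_tensor(first_input["index"], input_data)
interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]["index"])
```

Building the input from the reported shape and dtype avoids mismatches when the model expects something other than float32.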
get_signature_list
get_signature_list()
Gets the list of SignatureDefs in the model.
Example:
signatures = interpreter.get_signature_list()
print(signatures)
# {
# 'add': {'inputs': ['x', 'y'], 'outputs': ['output_0']}
# }
Then, using the names in the signature list, you can get a callable from get_signature_runner().
Returns | |
---|---|
A list of SignatureDef details in a dictionary structure. It is keyed on the SignatureDef method name, and the value holds a dictionary of inputs and outputs. |
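A sketch of the full round trip, from listing signatures to calling one, could look like this. The Adder module, its add method, and the temporary export directory are hypothetical; the signature key 'add' is chosen to match the listing above and is set explicitly when the SavedModel is exported.

```python
import tempfile

import numpy as np
import tensorflow as tf


class Adder(tf.Module):
    """Hypothetical module exposing a single 'add' signature."""

    @tf.function(input_signature=[
        tf.TensorSpec(shape=[1], dtype=tf.float32),
        tf.TensorSpec(shape=[1], dtype=tf.float32),
    ])
    def add(self, x, y):
        return x + y


module = Adder()
export_dir = tempfile.mkdtemp()
# Export with an explicit signature key so it survives conversion.
tf.saved_model.save(module, export_dir, signatures={"add": module.add})

converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)
tflite_model = converter.convert()

interpreter = tf.lite.Interpreter(model_content=tflite_model)
signatures = interpreter.get_signature_list()
print(signatures)

# Inputs are passed by name; outputs come back as a dict keyed by output name.
add_runner = interpreter.get_signature_runner("add")
result = add_runner(
    x=np.array([1.0], dtype=np.float32),
    y=np.array([2.0], dtype=np.float32),
)
print(result)
```

Compared with set_tensor() and get_tensor(), the runner removes the need to look up tensor indices by hand, at the cost of requiring the model to carry signatures.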
get_signature_runner
get_signature_runner(
signature_key=None
)
Gets a callable for inference on a specific SignatureDef.
Example usage,