i.MX TensorFlow Lite on Android User's Guide

User guide for deploying TensorFlow Lite models on NXP i.MX 8 series processors using the Android NNAPI and eIQ software stack for hardware-accelerated NPU/GPU inference.

Overview

This document provides technical guidance for implementing TensorFlow Lite on NXP i.MX 8 series platforms running Android. It details the NXP eIQ software stack and the Neural Network Runtime (NNRT) middleware, which facilitates communication between inference frameworks and hardware accelerators via the Android NN HAL. The guide covers TensorFlow Lite v2.10.1 features, including support for ARM Neon SIMD instructions and both per-tensor and per-channel quantized models. It includes instructions for building and running benchmark applications using Bazel, the Android NDK, and ADB to evaluate performance on NPU and GPU hardware units.
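
As a concrete illustration of that flow, the sketch below routes a TensorFlow Lite interpreter through the NNAPI delegate, which NNRT then maps onto the i.MX NPU or GPU via the Android NN HAL. It uses the public TensorFlow Lite 2.x C++ API; the model filename and the allow_fp16 setting are illustrative assumptions rather than values taken from the guide.

    // Sketch: run a quantized .tflite model through the NNAPI delegate so
    // NNRT can dispatch supported operations to the NPU/GPU.
    #include <memory>

    #include "tensorflow/lite/delegates/nnapi/nnapi_delegate.h"
    #include "tensorflow/lite/interpreter.h"
    #include "tensorflow/lite/kernels/register.h"
    #include "tensorflow/lite/model.h"

    int main() {
      // Model path is a placeholder; any per-tensor or per-channel
      // quantized model would work here.
      auto model = tflite::FlatBufferModel::BuildFromFile(
          "mobilenet_v1_1.0_224_quant.tflite");
      if (!model) return 1;

      tflite::ops::builtin::BuiltinOpResolver resolver;
      std::unique_ptr<tflite::Interpreter> interpreter;
      tflite::InterpreterBuilder(*model, resolver)(&interpreter);
      if (!interpreter) return 1;

      // The NNAPI delegate hands supported parts of the graph to the
      // Android NN HAL, behind which NNRT selects the accelerator;
      // unsupported operations stay on the CPU kernels.
      tflite::StatefulNnApiDelegate::Options options;
      options.allow_fp16 = true;  // permit reduced precision where supported
      tflite::StatefulNnApiDelegate nnapi_delegate(options);

      if (interpreter->ModifyGraphWithDelegate(&nnapi_delegate) != kTfLiteOk)
        return 1;  // delegation failed entirely
      if (interpreter->AllocateTensors() != kTfLiteOk) return 1;

      // Fill input tensors here, then run inference on the accelerator.
      return interpreter->Invoke() == kTfLiteOk ? 0 : 1;
    }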

Use Cases

  • Deploying machine learning models on i.MX 8 series processors
  • Benchmarking neural network inference performance on Android (see the latency sketch after this list)
  • Optimizing TensorFlow Lite models for NPU and GPU hardware acceleration
  • Building Android-based edge AI applications using the NXP eIQ software stack
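
For the benchmarking item above, a minimal latency measurement can be built from repeated Invoke() calls, assuming an interpreter already delegated to NNAPI as in the previous sketch; the warm-up and iteration counts below are illustrative choices, not values from the guide. This is the same quantity the TensorFlow Lite benchmark application reports when built with Bazel and the Android NDK and pushed to the board over ADB.

    #include <chrono>

    #include "tensorflow/lite/interpreter.h"

    // Average inference latency in milliseconds over a number of timed
    // runs. Warm-up runs give the NNAPI driver time to finish graph
    // compilation and caching, so they are excluded from the measurement.
    double AverageLatencyMs(tflite::Interpreter* interpreter,
                            int warmup_runs = 5, int timed_runs = 50) {
      for (int i = 0; i < warmup_runs; ++i) interpreter->Invoke();

      auto start = std::chrono::steady_clock::now();
      for (int i = 0; i < timed_runs; ++i) interpreter->Invoke();
      auto end = std::chrono::steady_clock::now();

      return std::chrono::duration<double, std::milli>(end - start).count() /
             timed_runs;
    }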

Topics

NXP
i.MX 8
TensorFlow Lite
Android
NNAPI
eIQ
NPU
GPU acceleration
Neural Network Runtime
NNRT
OpenVX
Machine Learning

Referenced Parts

  • i.MX 8M Plus (NXP): cited in Table 1, "Comparison of inference time between CPU and NPU on i.MX 8M Plus EVK"
  • i.MX 8 series (NXP): cited in the passage "NNRT also acts as the heterogeneous compute platform for further distributing workloads efficiently across i.MX 8 series compute devices, such as NPU, GPU, and CPU."