These are my running notes for porting the TIDL demo app that comes with the BeagleBone AI to a Python interface.
Importing TIDL from Python
TIDL includes a Python3 interface which is included with the BeagleBone AI version of the TIDL API. However it's not obvious how to use it since naively trying to run the Python example apps fails:
$ cd /usr/share/ti/examples/tidl/pybind
$ ./one_eo_per_frame.py
Traceback (most recent call last):
File "./one_eo_per_frame.py", line 35, in <module>
from tidl import DeviceId, DeviceType, Configuration, Executor, TidlError
ModuleNotFoundError: No module named 'tidl'
Unlike most Python packages, the TIDL Python interface is a .so
, not a
directory full of .py
files so I had a hard time figuring out where to look
for it. It turns out the trick is to add the following to your PYTHONPATH
:
$ PYTHONPATH=/usr/share/ti/tidl/tidl_api ./imagenet.py
Input: ../test/testvecs/input/objects/cat-pet-animal-domestic-104827.jpeg
TIOCL FATAL: Failed to open file /dev/mem
This time TIDL was found, but we need to run as root due to TIDL's need to
directly manipulate /dev/mem
. So,
$ sudo PYTHONPATH=/usr/share/ti/tidl/tidl_api ./imagenet.py
[sudo] password for debian:
Input: ../test/testvecs/input/objects/cat-pet-animal-domestic-104827.jpeg
1: Egyptian_cat, prob = 34.12%
2: tabby, prob = 34.12%
3: Angora, prob = 9.41%
4: tiger_cat, prob = 7.84%
Since TIDL Python apps have to run as root, you may run into unexpected errors
even if you export PYTHONPATH=/usr/share/ti/tidl/tidl_api
in your .bashrc
.
TIDL's Python API
This TIDL package for Python does not expose the entire TIDL C++ API to Python. Instead, it only provides the following as of version 01.04.00:
Method/Class/etc | Type |
---|---|
tidl.ArgInfo | pybind11_builtins.pybind11_type |
tidl.Configuration | pybind11_builtins.pybind11_type |
tidl.DeviceId | pybind11_builtins.pybind11_type |
tidl.DeviceType | pybind11_builtins.pybind11_type |
tidl.ExecutionObject | pybind11_builtins.pybind11_type |
tidl.ExecutionObjectPipeline | pybind11_builtins.pybind11_type |
tidl.Executor | pybind11_builtins.pybind11_type |
tidl.TidlError | type (it's an exception) |
tidl.allocate_memory | builtin_function_or_method |
tidl.enable_time_stamps | builtin_function_or_method |
tidl.free_memory | builtin_function_or_method |
The namespace of the tidl.so
package is dumped out by doing something like
this:
import sys
sys.path.append("/usr/share/ti/tidl/tidl_api")
import tidl
for part in dir(tidl):
print(part, eval("str(type(tidl.{}))".format(part)))
Notably, the C++ tidl::imgutil
is not included so we don't have access to
tidl::imgutil::PreprocessImage
. This function applies the
OpenCV steps required to transform any old input image into the format expected
TIDL's models. So, we have to preprocess images ourselves in Python with
OpenCV; TI's imagenet.py example shows how this is done.
For completeness (and since it's not documented anywhere), I've included all the interfaces provided by the TIDL Python API below. The best documentation on what each member does can be found in the TIDL API Reference.
tidl.ArgInfo
Member | Type |
---|---|
tidl.ArgInfo.size | instancemethod |
tidl.Configuration
Member | Type |
---|---|
tidl.Configuration.channels | property |
tidl.Configuration.enable_api_trace | property |
tidl.Configuration.enable_layer_dump | property |
tidl.Configuration.height | property |
tidl.Configuration.in_data | property |
tidl.Configuration.layer_index_to_layer_group_id | property |
tidl.Configuration.network_binary | property |
tidl.Configuration.network_heap_size | property |
tidl.Configuration.num_frames | property |
tidl.Configuration.out_data | property |
tidl.Configuration.param_heap_size | property |
tidl.Configuration.parameter_binary | property |
tidl.Configuration.pre_proc_type | property |
tidl.Configuration.read_from_file | instancemethod |
tidl.Configuration.run_full_net | property |
tidl.Configuration.show_heap_stats | property |
tidl.Configuration.width | property |
tidl.DeviceID
Member | Type |
---|---|
tidl.DeviceId.ID0 | tidl.DeviceId |
tidl.DeviceId.ID1 | tidl.DeviceId |
tidl.DeviceId.ID2 | tidl.DeviceId |
tidl.DeviceId.ID3 | tidl.DeviceId |
tidl.DeviceType
Member | Type |
---|---|
tidl.DeviceType.DSP | tidl.DeviceType |
tidl.DeviceType.EVE | tidl.DeviceType |
tidl.ExecutionObject
Member | Type |
---|---|
tidl.ExecutionObject.get_device_name | instancemethod |
tidl.ExecutionObject.get_frame_index | instancemethod |
tidl.ExecutionObject.get_input_buffer | instancemethod |
tidl.ExecutionObject.get_output_buffer | instancemethod |
tidl.ExecutionObject.get_process_time_in_ms | instancemethod |
tidl.ExecutionObject.process_frame_start_async | instancemethod |
tidl.ExecutionObject.process_frame_wait | instancemethod |
tidl.ExecutionObject.set_frame_index | instancemethod |
tidl.ExecutionObject.write_layer_outputs_to_file | instancemethod |
tidl.ExecutionObjectPipeline
Member | Type |
---|---|
tidl.ExecutionObjectPipeline.get_device_name | instancemethod |
tidl.ExecutionObjectPipeline.get_frame_index | instancemethod |
tidl.ExecutionObjectPipeline.get_input_buffer | instancemethod |
tidl.ExecutionObjectPipeline.get_output_buffer | instancemethod |
tidl.ExecutionObjectPipeline.process_frame_start_async | instancemethod |
tidl.ExecutionObjectPipeline.process_frame_wait | instancemethod |
tidl.ExecutionObjectPipeline.set_frame_index | instancemethod |
tidl.Executor
Member | Type |
---|---|
tidl.Executor.at | instancemethod |
tidl.Executor.get_api_version | builtin_function_or_method |
tidl.Executor.get_num_devices | builtin_function_or_method |
tidl.Executor.get_num_execution_objects | instancemethod |
Image interface into Python
The BeagleBone demo app and the TI Python example use OpenCV, so let's use that too. PIL is an alternative used in the Jetson Nano classification demo, and it may be a better choice in the future since it's a simpler library than OpenCV, but PIL is not included with the BeagleBone AI OS whereas OpenCV is.
To import an image from a USB webcam:
#!/usr/bin/env python3
import cv2
camera = cv2.VideoCapture("/dev/video1")
_, image = camera.read()
cv2.imwrite('cv2capture.jpg', image)
Streaming OpenCV over HTTP
Flask comes with BeagleBone's OS and provides everything you need to stream video over HTTP as the BeagleBone AI's classification demo app does. There's just a little glue code needed to read images using OpenCV and write them to an mjpg stream:
#!/usr/bin/env python3
import flask
import cv2
app = flask.Flask(__name__)
def stream_camera(camera):
while True:
ret, frame = camera.read()
if ret:
_, imstr = cv2.imencode(".jpg", frame)
yield (b'--frame\r\nContent-Type: image/jpeg\r\n\r\n'
+ bytes(imstr) + b'\r\n')
@app.route('/')
def video_feed():
return flask.Response(
stream_camera(cv2.VideoCapture("/dev/video1")),
mimetype='multipart/x-mixed-replace; boundary=frame')
if __name__ == '__main__':
app.run(host='0.0.0.0')
The beauty here is that you can modify frame
after it is read in
stream_camera()
to manipulate the image before sending it to the HTTP video
stream. This is where you can insert functionality (such as using OpenCV to
add a text overlay) on each video frame.
Classifying video frames
You can intercept frames in stream_camera()
and apply an arbitrary filter_function()
before passing it to the mjpg stream:
def stream_camera(camera):
while True:
ret, frame = camera.read()
if ret:
frame = filter_function(frame)
_, imstr = cv2.imencode(".jpg", frame)
yield (b'--frame\r\nContent-Type: image/jpeg\r\n\r\n'
+ bytes(imstr) + b'\r\n')
Naively, this filter_function()
could do something like...
- run the frame through a neural net by way of TIDL or any other ML framework to classify the contents of image
- overlay a text message with the label we got from step 1
On BeagleBone AI this would be slow though; every frame would have to be processed and streamed before the next frame could begin, bottlenecking inference on the inference performance of a single frame. With more careful integration, you can round-robin image frames over all EVEs and DSPs to keep them all busy processing frames though.
I've written an example app that integrates TIDL, OpenCV, and Flask to do streaming classification. Its core loop looks at each ExecutionObjectPipeline in a round-robin fashion and does the following:
- Checks to see if an ExecutionObjectPipeline (EOP) is done processing its work
- If so, read the results from the EOP's EVE or DSP.
- Find out the label with the highest confidence from those results
- Use
cv2.putText()
to write that topmost label on the image - Send the image as the next video frame via Flask
- Reads an image frame from the webcam
- Squeezes the frame down to 224 × 224 pixels, which is what the TIDL model being used expects
- Rearranges the layout of the image in memory to match the in-memory image representation that TIDL expects
- Asynchronously launches the ExecutionObjectPipeline to classify the image
Remember: One EVE or DSP can have one ExecutionObject (EO) at most. It seems that EOs can be assembled into ExecutionObjectPipelines (EOPs) without much restriction, but one EOP typically has only one EO when doing image classification. This means that one EVE or DSP processes one image frame.
By using TIDL asynchronously and looping over ExecutionObjectPipelines, we can load one frame on to each DSP and EVE to be classified in parallel. As EVEs and DSPs finish classifying their frame, we can load up another frame and launch it asynchronously while we look at the results we just got back.
This is like having your washer and dryer going at the same time--if you've
got multiple loads of laundry to wash, you can let your dryer run while you
load your second load into the washer, and you can fold your laundry when both
washer and dryer are running. On BeagleBone AI, we have four EVEs and two DSPs
which allows us to have five other video frames being processed while we
use cv2.putText()
to overlay text on the image we're about to stream out.