Inference in C++ on Jetson Xavier AGX #782
christophezeinaty started this conversation in General
Hello, I trained my model using Larq and saved it like this:
with open("bnn.tflite", 'wb') as flatbuffer_file: flatbuffer_bytes = lce.convert_keras_model(model_b) flatbuffer_file.write(flatbuffer_bytes)
I wrote C++ code to run the inference using the LCE engine and compiled it directly on the Jetson Xavier, which is an arm64-based processor. I am not sure whether my inference really uses only bitwise operations or falls back to ordinary matrix operations. What bothers me is that this model has 76,000 parameters and takes 60 ms per image; compared to a smaller model with 12,000 int8-quantized parameters, the BNN is slower. The int8 model takes only 5 ms per image. Is that normal?
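For reference, my inference code is structured roughly like the sketch below, modelled on LCE's `lce_minimal` example. The header paths and the `RegisterLCECustomOps` call reflect my understanding of recent LCE versions and may need adjusting; listing the node registrations is how I tried to check whether the binary (`Lce*`) ops are actually in the graph:

```cpp
// Sketch only, based on LCE's examples/lce_minimal.cc; header paths and
// the registration call may differ between LCE versions.
#include <chrono>
#include <cstdio>
#include <memory>

#include "larq_compute_engine/tflite/kernels/lce_ops_register.h"
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"

int main() {
  auto model = tflite::FlatBufferModel::BuildFromFile("bnn.tflite");
  if (!model) { std::fprintf(stderr, "failed to load model\n"); return 1; }

  // Register LCE's binary custom ops next to the TFLite builtins; without
  // this the interpreter cannot resolve ops like LceBconv2d.
  tflite::ops::builtin::BuiltinOpResolver resolver;
  compute_engine::tflite::RegisterLCECustomOps(&resolver);

  std::unique_ptr<tflite::Interpreter> interpreter;
  tflite::InterpreterBuilder(*model, resolver)(&interpreter);
  interpreter->AllocateTensors();

  // List the ops in the graph: binarized layers appear as custom ops
  // ("LceBconv2d", "LceQuantize", ...); any node without an Lce* name
  // runs a regular, non-bitwise TFLite kernel.
  for (size_t i = 0; i < interpreter->nodes_size(); ++i) {
    const TfLiteRegistration& reg =
        interpreter->node_and_registration(static_cast<int>(i))->second;
    std::printf("node %zu: %s\n", i,
                reg.custom_name ? reg.custom_name : "builtin");
  }

  // ... fill the input tensor here ...

  auto t0 = std::chrono::steady_clock::now();
  interpreter->Invoke();
  auto t1 = std::chrono::steady_clock::now();
  std::printf("inference took %lld ms\n",
              static_cast<long long>(
                  std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0)
                      .count()));
  return 0;
}
```

If the `LceBconv2d` ops show up in that listing, the binary layers should be running LCE's bitwise kernels; as far as I understand, the first and last layers typically stay in higher precision.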
My hypothesis was that even a somewhat larger model, if binarized, should be faster, since the full-precision matrix multiplications are replaced by bitwise operations. Am I wrong?
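What I mean by "replacing the matrix multiplications": for weights and activations in {-1, +1} packed into machine words, a dot product reduces to one XOR plus one popcount. A toy illustration of that trick (not LCE's actual kernel code):

```cpp
#include <cstdint>
#include <cstdio>

// Dot product of two 64-element vectors with entries in {-1, +1}, each
// packed into one 64-bit word (bit 1 encodes +1, bit 0 encodes -1).
// Matching bits contribute +1, differing bits -1, so:
//   dot(a, b) = 64 - 2 * popcount(a XOR b)
// One XOR and one popcount replace 64 multiply-accumulates.
int binary_dot64(std::uint64_t a, std::uint64_t b) {
  return 64 - 2 * __builtin_popcountll(a ^ b);  // GCC/Clang builtin
}

int main() {
  std::uint64_t a = 0xF0F0F0F0F0F0F0F0ULL;
  std::uint64_t b = 0xFFFF0000FFFF0000ULL;
  std::printf("%d\n", binary_dot64(a, b));  // 32 bits match, 32 differ -> 0
  return 0;
}
```

That is why I expected the 76,000-parameter BNN to beat the 12,000-parameter int8 model despite having more parameters.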