ZHOU Lijun, LIU Yu, BAI Lu, LIU Fei, WANG Yawei. Using TensorRT for deep learning and inference applications[J]. Journal of Applied Optics, 2020, 41(2): 337-341. DOI: 10.5768/JAO202041.0202007

Using TensorRT for deep learning and inference applications

More Information
  • Received Date: June 16, 2019
  • Revised Date: September 19, 2019
  • Available Online: March 31, 2020
  • TensorRT is a high-performance deep learning inference platform. It includes an inference optimizer and a runtime that together provide low latency and high throughput for deep learning inference applications. This paper presents an example of using TensorRT to quickly build a computational pipeline for a typical application, intelligent video analysis. The example runs four concurrent video streams, using the on-chip decoder for decoding, the on-chip scaler for video scaling, and the GPU for computing. For simplicity of presentation, only one channel uses NVIDIA TensorRT to perform object identification and generate bounding boxes around the identified objects. The example also uses video converter functions for various format conversions, and EGLImage to demonstrate buffer sharing and image display. Finally, a V100 GPU card was used to test the TensorRT acceleration performance on a ResNet network. The results show that TensorRT can improve throughput by about 15 times.
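
The pipeline structure described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the functions `decode`, `scale`, and `detect` are hypothetical stand-ins for the on-chip decoder, the on-chip scaler, and TensorRT inference, and the frame sizes and the choice of stream 0 as the inference channel are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_STREAMS = 4        # the abstract describes four concurrent video streams
INFERENCE_STREAM = 0   # assumption: which single channel runs detection


def decode(stream_id, frame_no):
    # Hypothetical stand-in for the on-chip hardware decoder.
    return {"stream": stream_id, "frame": frame_no, "size": (1920, 1080)}


def scale(frame, size=(640, 360)):
    # Hypothetical stand-in for the on-chip scaler.
    return {**frame, "size": size}


def detect(frame):
    # Hypothetical stand-in for TensorRT inference; returns one dummy box
    # as (x, y, width, height).
    w, h = frame["size"]
    return [(0, 0, w // 2, h // 2)]


def run_stream(stream_id, num_frames):
    # Every stream is decoded and scaled; only one channel runs detection.
    results = []
    for n in range(num_frames):
        frame = scale(decode(stream_id, n))
        boxes = detect(frame) if stream_id == INFERENCE_STREAM else []
        results.append((frame["frame"], boxes))
    return results


def run_pipeline(num_frames=3):
    # One worker per stream, so the four streams are processed concurrently.
    with ThreadPoolExecutor(max_workers=NUM_STREAMS) as pool:
        futures = {s: pool.submit(run_stream, s, num_frames)
                   for s in range(NUM_STREAMS)}
        return {s: f.result() for s, f in futures.items()}


if __name__ == "__main__":
    for stream_id, frames in run_pipeline().items():
        print(stream_id, frames)
```

In a real deployment the three stand-in functions would be replaced by hardware decode, hardware scaling, and an actual TensorRT engine call; the sketch only shows how the four streams share one pipeline shape while a single channel carries the detection step.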