Similar articles
Article Id Title Prob Score
212209 VENTUREBEAT 2021-7-20:
Nvidia releases TensorRT 8 for faster AI inference
1.000
212447 ZDNET 2021-7-20:
Nvidia announces launch of TensorRT 8 designed for chatbots, recommendations, and search
0.983 0.733
212224 VENTUREBEAT 2021-7-20:
Untether AI nabs $125M for AI acceleration chips
0.440
211943 VENTUREBEAT 2021-7-16:
OpenAI disbands its robotics research team
0.400
212304 VENTUREBEAT 2021-7-16:
What to do when AI brings more questions than answers
0.386
212265 VENTUREBEAT 2021-7-20:
Employees want more AI to boost productivity, study finds
0.383
212480 VENTUREBEAT 2021-7-22:
Airbnb CTO says graph neural networks will be big in 2021
0.373
212242 VENTUREBEAT 2021-7-20:
Lucata raises $11.9M to accelerate graph analytics with specialized hardware
0.373
212346 VENTUREBEAT 2021-7-19:
Scaling AI and data science – 10 smart ways to move from pilot to production
0.370
212096 VENTUREBEAT 2021-7-16:
Facebook’s BlenderBot 2.0 bot surfs the web for knowledge
0.366
212307 VENTUREBEAT 2021-7-16:
AI Weekly: Can AI predict labor market trends?
0.364
212293 VENTUREBEAT 2021-7-16:
Announcing the AI Innovation Awards winners at Transform 2021
0.363
212593 ZDNET 2021-7-23:
Contentsquare acquires Upstride to speed up AI innovation for digital business
0.361
212312 VENTUREBEAT 2021-7-16:
AI and financial processes: Balancing risk and reward
0.355
212477 ARSTECHNICA 2021-7-22:
Ars AI headline experiment finale—we came, we saw, we used a lot of compute time
0.354
212568 VENTUREBEAT 2021-7-22:
Bias in AI isn’t an enterprise priority, but it should be, survey warns
0.353
212271 VENTUREBEAT 2021-7-20:
Freshworks: 93% of IT managers have deployed AI, or plan to soon
0.349
212330 VENTUREBEAT 2021-7-18:
OpenAI Codex shows the limits of large language models
0.347
212178 VENTUREBEAT 2021-7-19:
Unit4: 83% of finance pros expect to upskill on AI in 2 years
0.342
212369 VENTUREBEAT 2021-7-21:
DNSFilter nabs $30M to fight DNS threats with AI
0.339
212475 VENTUREBEAT 2021-7-22:
Equipping AI with emotional intelligence can improve outcomes
0.336
212520 VENTUREBEAT 2021-7-21:
Algorithmia founder on MLOps’ promise and pitfalls
0.333
212236 VENTUREBEAT 2021-7-20:
AI adoption and analytics are rising, survey finds
0.329
212300 VENTUREBEAT 2021-7-16:
DeepMind open-sources AlphaFold 2 for protein structure predictions
0.323
212375 VENTUREBEAT 2021-7-21:
BlueOcean raises $15M to measure brand sentiment with AI
0.321

ID: 212209

URL: https://venturebeat.com/2021/07/20/nvidia-releases-tensorrt-8-for-faster-ai-inference/

Date: 2021-07-20

Nvidia releases TensorRT 8 for faster AI inference

Nvidia today announced the release of TensorRT 8, the latest version of its software development kit (SDK) designed for AI and machine learning inference. Built for deploying AI models that can power search engines, ad recommendations, chatbots, and more, TensorRT 8 cuts inference time in half for language queries compared with the previous release of TensorRT, Nvidia claims.

Models are growing increasingly complex, and demand is on the rise for real-time deep learning applications. According to a recent O'Reilly survey, 86.7% of organizations are now considering, evaluating, or putting into production AI products. And Deloitte reports that 53% of enterprises adopting AI spent more than $20 million in 2019 and 2020 on technology and talent.

TensorRT essentially dials a model's mathematical coordinates to a balance of the smallest model size with the highest accuracy for the system it'll run on. Nvidia claims that TensorRT-based apps perform up to 40 times faster than CPU-only platforms during inference, and that TensorRT 8-specific optimizations allow BERT-Large, one of the most popular Transformer-based models, to run in 1.2 milliseconds.

Sparsity, a performance technique leveraged by Nvidia's Ampere architecture GPUs, among others, increases efficiency in TensorRT 8 by reducing computational operations. Meanwhile, quantization-aware training enables developers to use trained models to run inference in INT8 precision without sacrificing much accuracy.

"[It's] imperative for enterprises to deploy state-of-the-art inferencing solutions," Nvidia VP of developer programs Greg Estes said in a press release. "The latest version of TensorRT introduces new capabilities that enable companies to deliver conversational AI applications to their customers with a level of quality and responsiveness that was never before possible."
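
For readers unfamiliar with quantization, the trade-off between model size and accuracy described above can be illustrated with a minimal sketch. This is not TensorRT's API or Nvidia's implementation: the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical, and the scheme shown (symmetric, per-tensor scaling to 8-bit integers) is only one common approach, assumed here for illustration.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127].

    Illustrative only; a real inference engine chooses scales far more carefully.
    """
    max_abs = max(abs(v) for v in values)
    # One shared scale for the whole list; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the 8-bit integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only to make the rounding visible.
weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The largest round-trip error is bounded by half of one quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Each value is stored in 8 bits instead of 32, at the cost of a rounding error of at most half a quantization step. Small weights such as 0.003 collapse to zero, which is the kind of accuracy loss that quantization-aware training is meant to let a model compensate for.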
Nvidia claims that in the five years since its initial release, TensorRT has been downloaded nearly 2.5 million times and used by more than 350,000 developers across 27,500 companies in domains including health care, automotive, finance, and retail. Hugging Face worked with Nvidia to launch AI text analysis, neural search, and conversational AI services, while GE Healthcare tapped the SDK to bolster its computer vision systems for ultrasounds, improving the performance of its cardiac view detection algorithm.

"We're closely collaborating with Nvidia to deliver the best possible performance for state-of-the-art models on Nvidia GPUs," Hugging Face product director Jeff Boudier said in a statement. "With TensorRT 8, Hugging Face achieved 1-millisecond inference latency on BERT, and we're excited to offer this performance to our customers later this year."

TensorRT 8 is now generally available to members of the Nvidia Developer program. The latest versions of plug-ins, parsers, and samples are also available as open source from the TensorRT GitHub repository.