Skip to main content

Windows 11 will soon harness your GPU for generative AI

Following the introduction of Copilot, its latest smart assistant for Windows 11, Microsoft is yet again advancing the integration of generative AI with Windows. At the ongoing Ignite 2023 developer conference in Seattle, the company announced a partnership with Nvidia on TensorRT-LLM that promises to elevate user experiences on Windows desktops and laptops with RTX GPUs.

The new release is set to introduce support for new large language models, making demanding AI workloads more accessible. Particularly noteworthy is its compatibility with OpenAI’s Chat API, which enables local execution (rather than the cloud) on PCs and workstations with RTX GPUs starting at 8GB of VRAM.

Nvidia’s TensorRT-LLM library was released just last month and is said to help improve the performance of large language models (LLMs) using the Tensor Cores on RTX graphics cards. It provides developers with a Python API to define LLMs and build TensorRT engines faster without deep knowledge of C++ or CUDA.

With the release of TensorRT-LLM v0.6.0, navigating the complexities of custom generative AI projects will be simplified thanks to the introduction of AI Workbench. This is a unified toolkit facilitating the quick creation, testing, and customization of pretrained generative AI models and LLMs. The platform is also expected to enable developers to streamline collaboration and deployment, ensuring efficient and scalable model development.

A graph showing TensorRT-LLM inference performance on Windows 11.
Nvidia

Recognizing the importance of supporting AI developers, Nvidia and Microsoft are also releasing DirectML enhancements. These optimizations accelerate foundational AI models like Llama 2 and Stable Diffusion, providing developers with increased options for cross-vendor deployment and setting new standards for performance.

The new TensorRT-LLM library update also promises a substantial improvement in inference performance, with speeds up to five times faster. This update also expands support for additional popular LLMs, including Mistral 7B and Nemotron-3 8B, and extends the capabilities of fast and accurate local LLMs to a broader range of portable Windows devices.

The integration of TensorRT-LLM for Windows with OpenAI’s Chat API through a new wrapper will allow hundreds of AI-powered projects and applications to run locally on RTX-equipped PCs. This will potentially eliminate the need to rely on cloud services and ensure the security of private and proprietary data on Windows 11 PCs.

The future of AI on Windows 11 PCs still has a long way to go. With AI models becoming increasingly available and developers continuing to innovate, harnessing the power of Nvidia’s RTX GPUs could be a game-changer. However, it is too early to say whether this will be the final piece of the puzzle that Microsoft desperately needs to fully unlock the capabilities of AI on Windows PCs.

Editors' Recommendations

Kunal Khullar
A PC hardware enthusiast and casual gamer, Kunal has been in the tech industry for almost a decade contributing to names like…
If you have an AMD GPU, stay away from the latest Windows Update
Two AMD Radeon RX 7000 graphics cards on a pink surface.

A quick PSA: If you own one of AMD's best graphics cards and you like to tweak the settings, now is not a good time to download the latest Windows Update. According to users on the AMD forums, the KB5030310 update really doesn't agree with AMD's Adrenalin Control Panel. While it's not the end of the world, this isn't the first Windows update in the last few months that has caused problems.

It appears that every time people restart their PCs, their Adrenalin settings are all reset back to default. This means that any changes made to things like AMD's Anti-Lag or Hyper RX will disappear upon every boot. Fortunately, the graphics driver itself is unaffected.

Read more
Meta just created a Snoop Dogg AI for your text RPGs
Meta AI's Dungeon Master looks like Snoop Dogg.

Meta Connect started with the Quest 3 announcement but that’s not the only big news. The metaverse company is also a leader in AI and has released several valuable models to the open-source community. Today, Meta announced its generative AI is coming soon to its social media apps, and it looks both fun and useful.
Meta AI for text
When CEO Mark Zuckerberg announced Meta AI for social media, it seemed interesting. When one of the custom AIs looked like Snoop Dogg wearing Dungeons and Dragons gear, there was a gasp from the live audience, followed by whoops of joy and applause.

Meta AI's Dungeon Master looks like Snoop Dogg. Meta

Read more
Newegg wants your old GPU — here’s how much you could get
Three graphics cards on a gray background.

Upgrading to a new graphics card can be a hassle, and it has been even more difficult ever since the GPU shortage. Today, there are way too many models to choose from, and keeping track of prices is not easy. In an effort to make things a bit simpler, Newegg has announced a new trade-in program. The online retailer is offering customers a deal in which they send in their existing eligible GPU and receive a trade-in credit amount toward the purchase of a new qualifying graphics card.

According to Amir Asadibagheri, product manager of customer experience for Newegg, “the benefit of our trade-in program is the ease to send a used graphics card and buy a new one all within the same platform and avoiding the hassle of selling through a secondary market.” Newegg has given a list of all Nvidia and AMD graphics cards that are eligible, along with an estimated trade-in value. Notably, the trade-in is limited to Nvidia’s RTX series and AMD’s Radeon 5000 series and beyond.

Read more