AMD’s latest graphics card, the Radeon RX 7900 XTX, has made waves in the computing world by outperforming NVIDIA’s GeForce RTX 4090 at running the DeepSeek R1 AI model in inference benchmarks.
Let’s dive into why AMD’s quick move to support DeepSeek’s R1 LLM models is a game-changer. With the arrival of this new AI model from DeepSeek, industry insiders and enthusiasts have been curious about the hardware required to run it. Surprisingly, AMD’s “RDNA 3” Radeon RX 7900 XTX is more than capable, delivering top-tier performance that surpasses expectations for a consumer card. AMD recently released inference benchmarks pitting its flagship RX 7000 series card against NVIDIA’s equivalent, and the results are impressive: AMD comes out on top across a range of the distilled models.
David McAfee from AMD tweeted about how well DeepSeek performs on the 7900 XTX, directing users to resources for getting these models up and running on Radeon GPUs and Ryzen AI APUs.
For those looking to explore AI workloads on consumer GPUs, AMD has certainly made a compelling case. Many users have opted for these setups because they offer a solid performance-to-cost ratio compared with specialized AI accelerators. Running these models locally also spares you cloud round-trips and keeps your data on your own machine, a significant concern with burgeoning AI applications like DeepSeek’s. To help users get started, AMD has put together a comprehensive guide for deploying DeepSeek R1 distillations on its GPUs:
1. Ensure your system is running the Adrenalin 25.1.1 Optional driver or newer.
2. Download LM Studio version 0.3.8 or higher from lmstudio.ai/ryzenai.
3. Install LM Studio; you can skip past the onboarding screen.
4. Navigate to the Discover tab.
5. Select your desired DeepSeek R1 Distill. Starting with a smaller model such as the Qwen 1.5B distill is advisable for its excellent speed, while larger models offer stronger reasoning; all are quite capable.
6. Check that “Q4_K_M” quantization is selected on the right (a 4-bit format that trades a little quality for a much smaller memory footprint), then click “Download”.
7. After the download completes, head back to the Chat tab, select the DeepSeek R1 distill from the drop-down menu, and make sure “manually select parameters” is checked.
8. Push the “GPU offload layers” slider to its maximum setting so every layer runs on the GPU.
9. Click to load the model.
10. You’re all set to interact with a reasoning model running entirely on your AMD setup! If you’d rather drive the model from code, see the sketch after this list.
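For readers who want scripted access rather than the chat window: LM Studio can also serve a loaded model through a local, OpenAI-compatible API, enabled from its local server view. The sketch below is a minimal example, assuming that server is running on LM Studio’s default port 1234 and that the openai Python package is installed; the model identifier is a placeholder, so substitute whatever name LM Studio displays for your download.

```python
# Minimal sketch: chat with a locally served DeepSeek R1 distill through
# LM Studio's OpenAI-compatible endpoint. Assumes the local server is
# enabled in LM Studio and listening on the default port 1234.
from openai import OpenAI

# The API key is ignored by the local server, but the client requires one.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    # Placeholder identifier; use the model name LM Studio shows for your download.
    model="deepseek-r1-distill-qwen-1.5b",
    messages=[{"role": "user", "content": "Walk me through 17 * 24 step by step."}],
    temperature=0.6,
)

print(response.choices[0].message.content)
```

One practical note: the R1 distills typically print their chain of thought between <think> tags before giving the final answer, so a script that only needs the conclusion may want to strip that block.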
If you find these instructions a bit challenging, worry not: AMD has also released a YouTube tutorial breaking down each step, so you can run DeepSeek’s language models on your local AMD hardware with your privacy intact. Looking ahead, with new GPU releases from both NVIDIA and AMD, we anticipate significant advances in inference performance, thanks in large part to the dedicated AI engines built to accelerate such workloads.