NVIDIA GeForce RTX 5090 & 5080 AI Review

Posted on February 21, 2025 (updated April 1, 2025) by Jon Allman

Table of Contents

  • Introduction
  • What is FP4?
  • Test Setup
  • LLM: llama.cpp
  • LLM: MLPerf Client
  • Image Generation: SD.Next
  • Video: DaVinci Resolve Studio AI Features
  • Video: Topaz Video AI
  • How Good is the NVIDIA GeForce RTX 50-Series for AI?

Introduction

At CES, NVIDIA announced its next generation of consumer desktop graphics cards: the GeForce RTX™ 50 series, based on the Blackwell architecture. As we covered in our recent GeForce RTX™ 5090 review, these GPUs promise improved performance in gaming and content creation. This boost comes from more (and improved) CUDA cores, fourth-generation Ray Tracing Cores, fifth-generation Tensor Cores, the latest NVIDIA NVENC/NVDEC media engines, and higher power budgets. These hardware updates are packaged alongside a multitude of new technologies, such as DLSS 4, RTX Mega Geometry, and new NVIDIA Broadcast features.

In this article, we will examine how the new GeForce RTX 5090 and 5080 compare against previous-gen GPUs in various AI-based workloads, both open-source and closed.


Below, we have listed the most relevant specifications from recent AMD, Intel, and NVIDIA GPUs. For more information, visit Intel Ark, NVIDIA’s 40-series GeForce page, NVIDIA’s 50-series GeForce page, or AMD’s Radeon RX Page.

| GPU | MSRP | VRAM | AI TOPS (INT8) | Boost Clock | VRAM Bandwidth | TDP | Release Date |
| --- | --- | --- | --- | --- | --- | --- | --- |
| NVIDIA GeForce RTX™ 5090 | $2,000 | 32 GB | 838 | 2.41 GHz | 1792 GB/sec | 575 W | Jan. 2025 |
| NVIDIA GeForce RTX™ 3090 Ti | $2,000 | 24 GB | 320 | 1.86 GHz | 1001 GB/sec | 450 W | Jan. 2022 |
| NVIDIA GeForce RTX™ 4090 | $1,600 | 24 GB | 660.6 | 2.52 GHz | 1001 GB/sec | 450 W | Oct. 2022 |
| NVIDIA GeForce RTX™ 4080 | $1,200 | 16 GB | 389.9 | 2.51 GHz | 716.8 GB/sec | 320 W | Nov. 2022 |
| NVIDIA GeForce RTX™ 3080 Ti | $1,200 | 12 GB | 272.8 | 1.67 GHz | 912 GB/sec | 350 W | June 2021 |
| AMD Radeon™ RX 7900 XTX | $1,000 | 24 GB | 122.8 | 2.5 GHz | 960 GB/sec | 355 W | Dec. 2022 |
| NVIDIA GeForce RTX™ 5080 | $1,000 | 16 GB | 450.2 | 2.62 GHz | 960 GB/sec | 360 W | Jan. 2025 |
| NVIDIA GeForce RTX™ 4080 SUPER | $1,000 | 16 GB | 418 | 2.55 GHz | 736 GB/sec | 320 W | Jan. 2024 |
| NVIDIA GeForce RTX™ 2080 Ti | $1,000 | 11 GB | 227.7 | 1.55 GHz | 616 GB/sec | 250 W | Sept. 2018 |
| NVIDIA GeForce RTX™ 4070 Ti SUPER | $800 | 16 GB | 353 | 2.61 GHz | 706 GB/sec | 285 W | Jan. 2024 |
| NVIDIA GeForce RTX™ 5070 Ti | $750 | 16 GB | 351.5 | 2.45 GHz | 896 GB/sec | 300 W | Feb. 2025 |
| NVIDIA GeForce RTX™ 4070 SUPER | $600 | 12 GB | 284 | 2.48 GHz | 504 GB/sec | 220 W | Jan. 2024 |
| NVIDIA GeForce RTX™ 5070 | $550 | 12 GB | 246.9 | 2.51 GHz | 672 GB/sec | 250 W | TBD |
| Intel Arc™ B580 | $250 | 12 GB | 233 | 2.67 GHz | 456 GB/sec | 190 W | Dec. 2024 |

Compared to the RTX 4090’s 512 fourth-generation Tensor Cores, the RTX 5090 features 680 fifth-generation Tensor Cores, an increase of about 33%. This translates to an increase in FP/BF16 performance of about 27%, from 165.2 TFLOPS to 209.5 TFLOPS. That is somewhat disappointing for a new generation of Tensor Core, but the real architectural change comes via support for FP4 precision (more on that later) rather than performance improvements to previously supported precisions. The most eye-catching improvement, however, is the RTX 5090’s new memory configuration. Not only has the RTX 5090 received an extra 8 GB of VRAM for a total of 32 GB, but it also features an incredible 1.79 TB/s of memory bandwidth versus the RTX 4090’s 1.01 TB/s, made possible by the switch from GDDR6X to GDDR7. This is an exciting development for AI workloads, which are often constrained by memory bandwidth.
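As a quick sanity check, VRAM bandwidth is simply bus width times per-pin data rate. Plugging in NVIDIA’s published specs (a 512-bit bus of 28 Gbps GDDR7 on the RTX 5090, and a 384-bit bus of 21 Gbps GDDR6X on the RTX 4090), a couple of lines of Python reproduce the figures above:

```python
# Back-of-envelope bandwidth math: bus width (bits) / 8 * data rate (Gbps) = GB/s.
# Bus widths and per-pin data rates below are NVIDIA's published specifications.
def vram_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    return bus_width_bits / 8 * data_rate_gbps

print(vram_bandwidth_gb_s(512, 28.0))  # RTX 5090: 1792.0 GB/s (~1.79 TB/s)
print(vram_bandwidth_gb_s(384, 21.0))  # RTX 4090: 1008.0 GB/s (~1.01 TB/s)
```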

In contrast to the RTX 5090, the RTX 5080 presents a much more modest improvement over the equivalent cards of the previous generation. It has a total of 336 Tensor Cores compared to the RTX 4080’s 304, an increase of about 10.5%, which translates to an FP/BF16 performance of 112.6 TFLOPS versus the RTX 4080’s 97.5 TFLOPS. Although the RTX 5080 also features GDDR7 memory, the total capacity remains unchanged at 16 GB, as does the 256-bit memory bus width. The faster GDDR7 still delivers a respectable 34% increase in memory bandwidth over the RTX 4080 (from 716.8 GB/s to 960 GB/s). Still, it’s hard not to be disappointed when considering that the improvement in memory bandwidth from the RTX 4090 to the RTX 5090 is a whopping 77%.

What is FP4?

Regarding AI/ML workloads, Blackwell’s big news is the inclusion of support for 4-bit precision floating point calculations, AKA FP4. Broadly speaking, lower precision calculations can be performed more quickly with less hardware real estate than those of higher precisions. This means that to optimize for performance, it’s best to use the lowest precision available that does not meaningfully affect the result of a calculation. 

Particularly in the large language model (LLM) space, evidence shows that quantizing models (i.e., reducing their precision) can provide immense savings in computational and memory requirements without drastically impacting their accuracy. Assuming the technical challenges can be overcome, pre-training models at lower precision also has benefits, as evidenced by recent releases like DeepSeek-V3 and R1.
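To make the idea concrete, here is a minimal sketch of symmetric 4-bit integer quantization of a weight tensor. This is a toy illustration only; real LLM quantization schemes, like the Q4_K_M format used in our llama.cpp testing, are block-wise and considerably more sophisticated:

```python
# Toy symmetric 4-bit quantization: map float weights onto 16 integer levels.
# Stored as 4-bit values, the weights take 1/8 the space of FP32.
import numpy as np

def quantize_int4(w: np.ndarray):
    scale = np.abs(w).max() / 7.0  # symmetric int4 range is [-8, 7]
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4096).astype(np.float32)
q, s = quantize_int4(w)
print("max abs reconstruction error:", np.abs(w - dequantize(q, s)).max())
```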

Therefore, it seems natural for hardware manufacturers like NVIDIA to look ahead and provide hardware with dedicated support for lower precisions, like FP4. However, the reality is that software support is basically nonexistent, so for end-users with existing workflows, there’s currently not much benefit to having hardware that can perform FP4 calculations. Although there are teams like Black Forest Labs who have worked with NVIDIA to provide an FP4 version of their FLUX.1 image generation model, actually using the model is not as simple as dropping it into your favorite image generation application.

So for now, the benefits of native FP4 support are primarily theoretical – but as the ecosystem continues to mature and develop, we fully expect to see more utilization of this capability.

Test Setup

Test Platform

CPU: AMD Ryzen™ 9 9950X
CPU Cooler: Noctua NH-U12A
Motherboard: ASUS ProArt X670E-Creator WiFi
BIOS Version: 2604
RAM: 2x DDR5-5600 32GB (64 GB total)
PSU: Super Flower LEADEX Platinum 1600W
Storage: Samsung 980 Pro 2TB
OS: Windows 11 Pro 64-bit (26100)
Power Profile: Balanced

GPUs

NVIDIA GeForce RTX™ 5090
Driver: 571.86

NVIDIA GeForce RTX™ 5080
Driver: 572.12

NVIDIA GeForce RTX™ 4090
NVIDIA GeForce RTX™ 4080 SUPER
NVIDIA GeForce RTX™ 4080
NVIDIA GeForce RTX™ 3090 Ti
NVIDIA GeForce RTX™ 3080 Ti
NVIDIA GeForce RTX™ 3080
NVIDIA GeForce RTX™ 2080 Ti
Driver: 566.36

AMD Radeon™ RX 7900 XTX
Driver: Adrenalin 24.12.1

Benchmark Software

llama.cpp (b4493) – Phi-3 Mini Q4_K_M
MLPerf Client v.0.5.0 – Llama 2 7B Chat DML
SD.Next (2025-01-29) – SDXL Base 1.0
DaVinci Resolve 19.1 – PugetBench for DaVinci Resolve 1.1.0
Topaz Video AI 6.0.3.0

For our GPU testing, we have shifted to an AMD Ryzen 9 9950X-based platform from our traditional Threadripper platform. The 9950X has fantastic all-around performance in most of our workflows and should let the video cards be the primary limiting factor wherever a GPU bottleneck is possible. This means the results are more comparable to our recent Intel Arc B580 review, but less so to our past GPU reviews. However, we are doing our best to test a variety of past GPUs throughout our RTX 50-series reviews, so we should have a lot of updated results in the near future. For testing, we used the latest available GPU drivers and ran everything on the “balanced” Windows power profile. Resizable BAR and “Above 4G Decoding” were enabled for every GPU as well.

For our first hardware review focused on AI workloads, we chose a suite of five tests comprising a mix of open-source and commercial software. On the open-source side, llama.cpp and MLPerf Client test how the GPUs handle LLM workloads, while SD.Next tests image generation. On the commercial side, we used our PugetBench for DaVinci Resolve benchmark, focusing on Resolve Studio’s AI-based features found in the “Extended” version of the test, along with the benchmark included within Topaz Video AI. More details about each test can be found in their respective sections below.


LLM: llama.cpp

[Charts: llama.cpp prompt processing (Chart 1) and token generation (Chart 2) results]

Starting with the ubiquitous llama.cpp, we immediately see some unexpected results from both the RTX 5090 and the RTX 5080. During the prompt processing test (Chart 1), both GPUs scored considerably lower than expected. Considering that prompt processing speed correlates closely with compute performance, we should expect to see the RTX 5090 score noticeably higher than the RTX 4090, but it falls short of that expectation, achieving performance similar to the RTX 4080 SUPER. Likewise, the RTX 5080 should come out just ahead of the RTX 4080 SUPER, but it instead achieves performance similar to that of the RTX 3080 Ti.

It’s clear that these results are not actually indicative of the relative performance of the 50-series cards, and we suspect that a driver issue may be the cause, as performance testing using llama.cpp in Linux does not show these same discrepancies.

Moving on to the token generation test (Chart 2), we thankfully did not encounter any anomalous results with the new GPUs. As shown previously, token generation performance is more dependent on a GPU’s memory bandwidth, and with an astounding 1.79 TB/s, the RTX 5090 handily takes the top spot on the token generation chart. The RTX 5090 leads the previous flagship, the RTX 4090, by a healthy margin of about 29%. The RTX 5080 also scores very well, landing just short of the RTX 4090’s performance.
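For readers who want to run a similar test themselves, llama.cpp ships a llama-bench utility whose -p and -n flags correspond to the prompt processing and token generation tests shown above. A minimal sketch follows; the model filename is a placeholder, and exact flags may vary between builds:

```python
# Sketch: invoking llama.cpp's llama-bench tool via a Python subprocess wrapper.
import subprocess

subprocess.run(
    [
        "llama-bench",
        "-m", "phi-3-mini-4k-instruct-Q4_K_M.gguf",  # quantized model (placeholder path)
        "-p", "512",   # prompt processing test: time a 512-token prompt
        "-n", "128",   # token generation test: time 128 generated tokens
    ],
    check=True,
)
```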

LLM: MLPerf Client

[Charts: MLPerf Client Time to First Token (Chart 1) and token generation (Chart 2) results, plus raw results table]

Next up is MLPerf Client, a new addition to our AI benchmarks. Although it’s still in early development with some kinks to work out, MLPerf Client already offers good flexibility in testing LLM workloads, and its use of ONNX Runtime with DirectML gives it broad support for different GPU manufacturers in a single package. While it runs a few different styles of tests featuring a mix of prompt sizes and response token limits, we opted to show only the overall results because the shape of the scores is functionally identical across each workload.

This is the first time we’ve recorded and presented Time To First Token (TTFT) in our benchmarks (Chart 1). TTFT is an important metric for the end-user experience of interacting with an LLM because it measures how long a user has to wait after submitting a query before a response begins. However, with small models hosted locally on a single GPU, the delay is so minuscule that it’s worth asking how valuable the metric is at this scale. In any case, the RTX 5090 was the only GPU to achieve a sub-100ms result, with the RTX 4090 lagging behind by about 33%. Surprisingly, the RTX 5080 also came out ahead of the RTX 4080 and 4080 SUPER by about the same 33% margin. This seems to indicate a preference for higher memory bandwidth in this test, but the lackluster results of the 3080 Ti and 3090 Ti point towards compute being a limiting factor as well. This makes sense, considering that prompt processing is known to be compute-bound while token generation is memory bandwidth-bound, and TTFT includes both stages. So, although most users would be hard-pressed to notice differences measured in tens of milliseconds, TTFT does prove itself to be an excellent measure of overall GPU performance for LLM usage.

Just like we saw with llama.cpp, the RTX 5090’s memory bandwidth allows it to shine during token generation (Chart 2), far outpacing any other GPU in this round of testing. As expected, with memory bandwidth similar to the RTX 4090, the RTX 5080 approaches the 4090’s performance in this test and outpaces the RTX 4080 and 4080 SUPER by around 20%.
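TTFT is also straightforward to approximate outside of MLPerf Client. As a rough sketch using the llama-cpp-python bindings (rather than MLPerf’s ONNX Runtime/DirectML path, and with a placeholder model path), streaming the output and timing the first chunk gives a serviceable estimate:

```python
# Approximate TTFT by timing the first streamed chunk; the pace of chunks
# after the first one approximates token generation speed.
import time
from llama_cpp import Llama

llm = Llama(model_path="llama-2-7b-chat.Q4_K_M.gguf",  # placeholder path
            n_gpu_layers=-1, n_ctx=2048, verbose=False)

start = time.perf_counter()
first = last = None
count = 0
for _ in llm("Summarize the plot of Hamlet.", max_tokens=128, stream=True):
    last = time.perf_counter()
    if first is None:
        first = last
    count += 1

print(f"TTFT: {(first - start) * 1000:.0f} ms")
if count > 1:
    print(f"generation: {(count - 1) / (last - first):.1f} tokens/sec")
```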

Image Generation: SD.Next

[Charts: SD.Next iterations-per-second and seconds-per-image results at batch sizes 1, 2, 4, 8, and 16 (Charts 1-10)]

Switching gears from text to image generation, we have the SD.Next benchmark. We’ve chosen to stick with SDXL as our model, SDPA as the cross-attention method, 50 steps per batch, and a resolution of 512×512. The relatively low resolution may strike some as an odd choice, but since it’s the default resolution for SD.Next’s benchmark, it should make it simpler for others to replicate our results. Note that we omitted the 2080 Ti’s results from the “seconds per image” charts to help with chart scaling, as it took considerably longer to complete the tests than the other GPUs. Finally, it’s worth keeping in mind that we used early-access PyTorch wheels (torch-2.6.0+cu128.nv) to enable support for the 50-series GPUs, and these are likely not fully optimized quite yet. We are eagerly awaiting the full release of PyTorch with CUDA 12.8 support and plan to publish updated results when it becomes available.
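For context, a roughly equivalent workload can be reproduced with the Hugging Face diffusers library, though this is not SD.Next’s own benchmark harness, so the numbers will not match ours exactly. A sketch along these lines, using the same model, step count, and resolution:

```python
# Sketch of an SDXL throughput test similar in spirit to our SD.Next settings
# (SDXL Base 1.0, 50 steps, 512x512) using Hugging Face diffusers.
import time
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

for batch in (1, 2, 4, 8, 16):
    start = time.perf_counter()
    pipe(prompt=["a photo of a red fox in a forest"] * batch,
         num_inference_steps=50, height=512, width=512)
    elapsed = time.perf_counter() - start
    print(f"batch {batch}: {elapsed / batch:.2f} s/image")
```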

Right off the bat, we encountered some strange results at the minimum batch size of 1 (Charts 1 & 2). We unexpectedly found the RTX 5080 taking the top spot among all of the GPUs tested, somehow beating out both the RTX 5090 and 4090. We also see both RTX 4080 variants come out slightly ahead of the RTX 5090 and 4090. The most likely explanation for this is a CPU bottleneck somewhere in the generation pipeline, which most prominently affects these shorter runs at smaller batch sizes. Indeed, Task Manager shows 100% usage on a single CPU thread at the start of the generation. Once the batch size is increased sufficiently, the workload stays on the GPU long enough that the time spent waiting on the CPU thread becomes less impactful.

In any case, once we progress to batch sizes above 1, things start to take a more familiar shape. At a batch size of 2 (Charts 3 & 4), we still encountered some odd results, such as the 4080 SUPER, 4090, and 5080 all achieving very similar scores. However, at batch sizes of 4 and above, the results align more closely with expected performance. At a batch size of 4 (Charts 5 & 6), the RTX 5090 and RTX 4090 deliver nearly identical performance, but as the batch size continues to increase, the RTX 5090 pulls further ahead, ultimately translating to roughly 0.15 seconds less per image generated. In contrast, the RTX 5080 and RTX 4080 SUPER achieve practically identical scores at all batch sizes above 1.

Video: DaVinci Resolve Studio AI Features

[Charts: PugetBench for DaVinci Resolve overall AI score, Super Scale, Face Refinement, Person Mask, Relight, and Smart Reframe results, plus raw results table]

DaVinci Resolve is the first example of a closed-source commercial application included in our AI review. These tests are performed as part of the “Extended” preset in PugetBench for DaVinci Resolve. Instead of including charts for every individual test, we chose a selection that highlights the results with the largest differences between GPUs. However, a table with all the results is included in the last image.

Overall, the RTX 5090 performed well, consistently outpacing the RTX 4090 in the “Render” class of tests. Compared to the RTX 4090, the RTX 5090 achieved a performance lead of about 20-25% in five of the eight Render tests: Super Scale, Face Refinement, Person Mask (Faster), Relight, and Optical Flow. In the “Runtime” tests, however, the RTX 5090 was typically within about ±5% of the RTX 4090, and it fell behind by about 10% in two tests: Create Subtitles from Audio and Scene Cut Detection. The results from those particular tests fall outside expectations, though; for example, the RTX 3080 took the top spot in the Create Subtitles from Audio test.

The RTX 5080 was unable to strongly differentiate itself from the RTX 4080 variants and had very similar performance overall. It pulled ahead in the Face Refinement and Optical Flow tests, outperforming the RTX 4080 SUPER by about 13.5% and 20%, respectively, but by and large, the RTX 5080 performs in line with the 4080 variants in this benchmark. Unlike the RTX 5090, which offers the additional benefit of increased VRAM capacity compared to its predecessor, the RTX 5080 doesn’t distinguish itself as an attractive upgrade from the RTX 4080 SUPER and RTX 4080 for DaVinci Resolve AI workloads.

Video: Topaz Video AI

[Charts: Topaz Video AI Overall, Enhancement, and Slowmo score results, plus raw results table]

As our final benchmark in this review, we have another commercial application: Topaz Video AI, which is popular for tasks like upscaling and frame interpolation. For more details on how our scoring works for Topaz Video AI, please see our previous analysis here. Just like in that article, we provide an “Overall” result, along with “Enhancement” and “Slowmo” subscores. Additionally, we have provided the raw results for anyone interested in the details of specific tests.

Overall, the Topaz Video AI results for both of the 50-series GPUs were disappointing. In many cases, the RTX 5090 was unable to keep up with the RTX 4090, much less beat it by a meaningful margin, and the same can be said for the RTX 5080 versus the RTX 4080 & 4080 SUPER. It’s clear that, while Blackwell support has been implemented, there is still room for optimization.

How Good is the NVIDIA GeForce RTX 50-Series for AI?

With 32 GB of GDDR7 running at 1.79 TB/s and impressive compute capabilities, the RTX 5090 is, on paper, the best consumer GPU for AI workloads that money can buy. Likewise, although the RTX 5080 has disappointed some by retaining a total of 16 GB of memory, its compute and memory bandwidth improvements provide a respectable performance boost over the RTX 4080 and RTX 4080 SUPER. However, given the current immaturity of software support, it’s difficult to give a proper assessment of the RTX 5090 and RTX 5080 for AI workloads. Hopefully, we will soon see better support in libraries like PyTorch, along with proper FP4 implementations in our favorite AI applications and models.

Until then, it’s hard to say definitively that the average user with existing 40-series hardware will benefit significantly from an upgrade. That will almost certainly change with time, and those with 30-series and older GPUs can be assured that an upgrade to Blackwell will net them significant performance improvements in AI/ML workloads even today.


If you need a powerful workstation to tackle the applications we’ve tested, the Puget Systems workstations on our solutions page are tailored to excel in various software packages. If you prefer a more hands-on approach, our custom configuration page helps you configure a workstation that matches your exact needs. Otherwise, if you would like more guidance in configuring a workstation that aligns with your unique workflow, our knowledgeable technology consultants are here to lend their expertise.

