GPU FLOPS: Theory vs Reality
Author | Message |
---|---|
Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
I'm trying to collect data to make the best computation/power-usage choices possible for upgrading my modest farm. I was hoping to get some help to fill in the blanks. Here's my observed / theoretical performance for my cards on SETI@home tasks:
|
Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
I'm trying to collect data to make the best computation/power-usage choices possible for upgrading my modest farm. I was hoping to get some help to fill in the blanks. I'm not sure using Device peak FLOPS in a task result is the best way to determine the application efficiency. Taking the value displayed in Flopcounter: and dividing by the number of seconds the task took might give a more accurate value to compare to the manufacturer's max theoretical FLOPs. If you are running multiple tasks per GPU you would also want to correct for that. I don't see anything in the Radeon 400 series wiki page that points out they will be using a different way to calculate Single Precision FLOPs. For many years (Shaders*2)*clock has been used for Nvidia & Radeon GPUs. SETI@home classic workunits: 93,865 CPU time: 863,447 hours |
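For anyone who wants to try the Flopcounter approach suggested above, here is a minimal Python sketch. It assumes you have already copied the Flopcounter value and the elapsed seconds out of a task result by hand. The function names and the sample numbers are made up for illustration and do not come from any real task.

```python
def achieved_flops(flop_count: float, elapsed_seconds: float,
                   concurrent_tasks: int = 1) -> float:
    """Estimate sustained device FLOPS from one task's counted operations.

    concurrent_tasks scales the result up when several tasks shared the GPU,
    since each task then only saw part of the device's throughput.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    if concurrent_tasks < 1:
        raise ValueError("concurrent_tasks must be at least 1")
    return flop_count / elapsed_seconds * concurrent_tasks


def efficiency(achieved: float, theoretical_peak: float) -> float:
    """Fraction of the theoretical peak that was actually sustained."""
    if theoretical_peak <= 0:
        raise ValueError("theoretical_peak must be positive")
    return achieved / theoretical_peak


# Hypothetical values, purely for illustration.
sample = achieved_flops(flop_count=2.0e13, elapsed_seconds=400.0)
print(f"achieved: {sample:.3e} FLOPS")
print(f"efficiency vs a 5.0e12 peak: {efficiency(sample, 5.0e12):.1%}")
```

Passing `concurrent_tasks=2` doubles the estimate, which is the correction for running two tasks per GPU.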
Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
I'm not sure using Device peak FLOPS in a task result is the best way to determine the application efficiency. Taking the value displayed in Flopcounter: and dividing by the number of seconds the task took might give a more accurate value to compare to the manufacturer's max theoretical FLOPs. Ok -- so how about if I took the credit for a task and divided it by the run-time? I can go scrape that together and see. That's probably closer to what I want anyway -- Credit/kWh. If you are running multiple tasks per GPU you would also want to correct for that. How would I tell if this were happening? BOINC shows at most one task active on my GPU at any given time. This was also why I was hoping to get single-GPU task stats from other people to form a basis of comparison. I don't see anything in the Radeon 400 series wiki page that points out they will be using a different way to calculate Single Precision FLOPs. For many years (Shaders*2)*clock has been used for Nvidia & Radeon GPUs. The Nvidia page says "Single precision performance is calculated as 2 times the number of shaders multiplied by the base core clock speed" but the AMD page says "Single precision performance is calculated as based on a FMA operation" which is probably a single-cycle instruction. Same thing? I don't know -- but that's beside the point really: I want work-unit stats so I can work out throughput/electricity estimates! |
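The credit-divided-by-run-time idea could be sketched roughly like this in Python. It assumes each task is available as a (credit, run-time in seconds) pair. The function name and the sample values are hypothetical. Summing over many tasks before dividing smooths out the task-to-task variation in granted credit.

```python
from typing import Iterable, Tuple


def credits_per_hour(tasks: Iterable[Tuple[float, float]]) -> float:
    """Aggregate credit rate over (credit, run_time_seconds) pairs."""
    total_credit = 0.0
    total_seconds = 0.0
    for credit, run_time_seconds in tasks:
        total_credit += credit
        total_seconds += run_time_seconds
    if total_seconds <= 0:
        raise ValueError("total run time must be positive")
    # Credit per second, scaled to one hour.
    return total_credit / total_seconds * 3600.0


# Hypothetical task list: (granted credit, run time in seconds).
sample_tasks = [(80.0, 600.0), (100.0, 900.0), (60.0, 300.0)]
print(f"{credits_per_hour(sample_tasks):.0f} cr/hr")
```

This gives the rate for a single task stream only, so a host running several tasks at once on one GPU would need a further correction.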
Kiska Joined: 31 Mar 12 Posts: 302 Credit: 3,067,762 RAC: 0 |
OK, I'll answer your question on credit per kWh: you really can't determine that, because credit is assigned differently and depends on what hardware your wingmen have. Running multiple WUs concurrently is done through the anonymous platform, possibly with the Lunatics app package. The current correct formula is (Shaders*2)*clock speed. E.g. RX 480: (2304*2)*1,120,000,000 Hz = 5.16096e12 FLOPS ≈ 5.16 TFLOPS |
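As a quick check of that formula, here is a small Python sketch using the RX 480 figures quoted in the post (2304 shaders, 1120 MHz). The function name is just an illustrative choice. The factor of 2 comes from counting a fused multiply-add as two floating-point operations, which is why the Nvidia and AMD wiki wordings should work out to the same number.

```python
def peak_sp_flops(shaders: int, clock_hz: float) -> float:
    """Theoretical single-precision peak: one FMA (2 FLOPs) per shader per clock."""
    return shaders * 2 * clock_hz


# RX 480 figures as quoted in the post above.
rx480 = peak_sp_flops(shaders=2304, clock_hz=1.12e9)
print(f"RX 480 theoretical peak: {rx480 / 1e12:.2f} TFLOPS")  # 5.16 TFLOPS
```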
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Check out the bottom of this thread; I just posted a wealth of data from my 1080 FTW card, which I have been running for a week now. Hope it is helpful to you. |
Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
Thanks Al! I think I'm going to try to write a Perl script to generate CR/W and I'll see if I can aim it at your stats; it looks like just the data point I'm after! |
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
If you would like to PM me your email addy, I will just send you the text file and save you all the work. :-) |
Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
I wrote a quick script for aggregating stats results. It's not fancy but it gave me some perplexing data: CPUs
|
Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
If you are using Users' stats page to calculate runtimes - They are wrong. You have no idea how many tasks an individual is running simultaneously, which greatly affects the times shown. |
Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
If you are using Users' stats page to calculate runtimes - They are wrong. Ok, well, I know how many tasks I'm running (1 per GPU), and it looks like Al is running 4 at once for some reason, so I can account for that (so maybe 1028 cr/hr for his GTX 1080, which is consistent). This is why I'm looking for people with other hardware to help me build a picture here -- if anyone is running the default of one task at a time on a 1080, RX 480 or R9 Nano I'd love to analyze your stats! |
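The concurrency correction described here can be sketched in a few lines of Python. The per-task credit and run-time below are hypothetical values picked so the result lands near the 1028 cr/hr estimate. They are not taken from Al's actual results, and the function name is made up.

```python
def gpu_credits_per_hour(credit_per_task: float, task_seconds: float,
                         concurrent_tasks: int = 1) -> float:
    """Per-GPU credit rate when several tasks run on the card at once.

    The stats page shows the wall-clock time of each task, so a single task's
    rate understates the card's throughput by the number of tasks sharing it.
    """
    if task_seconds <= 0:
        raise ValueError("task_seconds must be positive")
    if concurrent_tasks < 1:
        raise ValueError("concurrent_tasks must be at least 1")
    per_task_rate = credit_per_task / task_seconds * 3600.0
    return per_task_rate * concurrent_tasks


# Hypothetical task: 90 credits in 1260 s of wall-clock time.
naive = gpu_credits_per_hour(90.0, 1260.0)
corrected = gpu_credits_per_hour(90.0, 1260.0, concurrent_tasks=4)
print(f"naive: {naive:.0f} cr/hr, corrected for 4 tasks: {corrected:.0f} cr/hr")
```

With these made-up inputs the naive figure is about 257 cr/hr and the corrected one about 1029 cr/hr. That shows how much the runtimes on the stats page mislead if the task count is unknown.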
Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
I scrolled through the top 250 hosts to fish out host IDs for single-GPU systems and ran them through my script; the results were remarkably consistent! Sadly, all the AMD GPUs were named by only their series name (Hawaii, Fiji, etc.), so I couldn't cross-reference the observed results with a specific video card. It would be super helpful if people could call out their single-GPU hosts with the model of their card so I can fill in the blanks. |
Admiral Gloval Joined: 31 Mar 13 Posts: 21790 Credit: 5,308,449 RAC: 0 |
I have an AMD Radeon FX 260X GPU (Bonaire). Feel free to check my results. |
Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
I have an AMD Radeon FX 260X GPU (Bonaire). Feel free to check my results. Thanks! It looks like you're getting about 260 CR/h, which seems low. How many GPU tasks do you have running at once? |
Admiral Gloval Joined: 31 Mar 13 Posts: 21790 Credit: 5,308,449 RAC: 0 |
I have an AMD Radeon FX 260X GPU (Bonaire). Feel free to check my results. One WU. Three CPU. |
Joined: 14 Feb 16 Posts: 492 Credit: 378,512,430 RAC: 785 |
If you are using Users' stats page to calculate runtimes - They are wrong. My main system has 4 Nanos, each running only one task at a time. I did have a system crash and issues with Crimson 16.6.1, so I only got it stable again last night. I really look forward to seeing how the Nano compares. I chose them since HBM should give a power advantage. GitHub: Ricks-Lab Instagram: ricks_labs |
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Shaggie, just for complete accuracy, you may want to add FTW to the 1080 in your chart. It isn't a massive difference, but it should probably be noted, as it is a factory-overclocked version, which I have then bumped up even higher. I am still amazed at how cool it is running compared to my 980Ti; it's night and day. I wonder if I have a bit more headroom to OC it some more? I thought that heat was the limiting factor when overclocking, and if so, it appears there is more room to run. But I don't feel the need to go nuts with it; I would like it to do long-term, reliable crunching service for me. Now that these are starting to appear out in the real world, I'll have to do a little looking around to see what others' experiences are and compare notes. |
Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
My main system has 4 Nanos, each running only one task at a time. I did have a system crash and issues with Crimson 16.6.1, so I only got it stable again last night. I really look forward to seeing how the Nano compares. I chose them since HBM should give a power advantage. Thanks! The script puts that at about 1000 cr/hr for each card, which is pretty amazing! |
Joined: 14 Feb 16 Posts: 492 Credit: 378,512,430 RAC: 785 |
And it has a TDP of only 175W! It will be thermally throttled with the original cooling solution, so the cards have to be waterblocked. GitHub: Ricks-Lab Instagram: ricks_labs |
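Putting the roughly 1000 cr/hr estimate together with the 175 W figure gives a rough Credit/kWh number, sketched below in Python. This assumes the card draws its full TDP continuously, which is only a nominal upper bound and not a measured wall-socket reading. The function name is illustrative.

```python
def credits_per_kwh(credits_per_hour: float, watts: float) -> float:
    """Credit earned per kilowatt-hour at a constant power draw."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    # One hour at `watts` consumes watts / 1000 kWh.
    return credits_per_hour / (watts / 1000.0)


# ~1000 cr/hr per Nano, 175 W TDP assumed as the sustained draw.
nano = credits_per_kwh(credits_per_hour=1000.0, watts=175.0)
print(f"R9 Nano (TDP-based estimate): {nano:.0f} cr/kWh")  # 5714 cr/kWh
```

A measured power figure would give a more trustworthy comparison between cards than TDP, and it should include the rest of the system if that is part of the upgrade decision.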
Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
Yeah, I think I actually found your video about that before you'd mentioned it -- it's pretty awesome but I'm kind of leery about doing that much surgery myself. I'm surprised nobody is selling R9 Nanos with waterblocks pre-installed like the ASUS ROG Poseidons. |
Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Yeah, I think I actually found your video about that before you'd mentioned it -- it's pretty awesome but I'm kind of leery about doing that much surgery myself. I'm surprised nobody is selling R9 Nanos with waterblocks pre-installed like the ASUS ROG Poseidons. Isn't that basically what a Fury X is? (I don't actually know; that's just what I thought it was...) "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to Live By: The Computer Science of Human Decisions. |
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.