GPU FLOPS: Theory vs Reality
Author | Message |
---|---|
Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
It's been a few weeks since 8.19 was released, so I've run another scan. New cards in the charts today: the NVIDIA GTX Titan Black and the AMD Baffin (ie: RX 470?). Here is the median 60% of the work units scanned: [chart] I'm not sure if it's the new GPU optimizations that made it into stock 8.19 or if the recent mix of work-units has somehow been favorable to the slightly higher memory bandwidth in the 980 Ti -- either way I'm surprised to see it at the top. I'll probably run an incremental scan next weekend and see if it's still ahead. The number of hosts and work-units aggregated is posted as well, if you want some idea of the confidence: [chart] |
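One possible reading of "median 60%" is the middle 60% of results: sort each card's per-work-unit credit rates and drop the lowest and highest 20% before summarizing. The sketch below assumes that reading. The function name and the sample numbers are hypothetical and are not taken from the actual scan script.

```python
def middle_percent(values, keep_percent=60):
    """Return the central `keep_percent` percent of `values` after sorting."""
    if not 0 < keep_percent <= 100:
        raise ValueError("keep_percent must be in (0, 100]")
    ordered = sorted(values)
    n = len(ordered)
    # Integer arithmetic: number of samples trimmed from each end.
    drop = n * (100 - keep_percent) // 200
    return ordered[drop:n - drop]


# Hypothetical credit/hour samples for one GPU model (made-up numbers).
samples = [310, 655, 690, 700, 705, 710, 720, 735, 760, 1900]
core = middle_percent(samples)
print(min(core), max(core))
```

Trimming both tails this way keeps one very slow or very fast host from dominating the bar for a card.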
Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
"I'm not sure if it's the new GPU optimizations that made it into stock 8.19 or if the recent mix of work-units has somehow been favorable to the slightly higher memory bandwidth in the 980 Ti -- either way I'm surprised to see it at the top. I'll probably run an incremental scan next weekend and see if it's still ahead." I'm not surprised. I've always said the 980 Ti was more productive than the 1080 or 1070s, based on my testing of them. What will be interesting to see is the 1080 Tis in January. If they prove to be as productive as they sound, it might be an option to upgrade a few machines. |
Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
I'm not so sure about the 1080 Tis -- my guess is they'll still be a 250W card and perform somewhere between the 1080s and the Pascal Titans. From what I've seen in my scans the Titans are between 1200 and 1500 CR/hr, but there aren't enough of them to qualify for the charts. Personally I'm thinking about a set of four 1070s @ 600W rather than a pair of 1080 Tis. My guess is it'll be about the same price as a pair of 1080 Tis, a bit faster, and a bit better credit/watt. Of course if they're a 200W card instead I might be willing to eat the extra cost and just run three. |
Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
Hey Shaggie! I SOooo love to see the updates to GPU outputs. Great work as always! Have you given any thought about putting that talent of yours towards promoting the importance of optimization with Lunatics v0.45 and MrK's prog? ...since running stock is very inefficient (especially because of the CPU stock app)! My GTX1060 and my 2 GTX750Ti seem to have a 30%-45% better throughput than what I would do with stock...and that's without mentioning the almost 100% improvement on the CPU tasks with the Lunatics CPU app for my CPUs (Xeon W3550). The way I see it: by improving my throughput I'm improving the project's overall throughput ...and if more SETIzens could see that in some charts, that would be visual data worth acting upon with what they already have. I think your current charts are still incredible at showing the better buys since the electricity consumption is likely the greatest cost for dedicated crunchers. But optimization seems to me so important that I think it worthwhile to mention it to you again. Just me submitting my wish list...again...for the greater good (aka throughput)! ;-} Cheers, RobG :-D |
Joined: 14 Feb 16 Posts: 492 Credit: 378,512,430 RAC: 785 |
"It's been a few weeks since 8.19 was released so I've run another scan. New cards in the charts today: the NVIDIA GTX Titan Black and the AMD Baffin (ie: RX 470?)" Always look forward to your summary table! One question about the approach. Are you including only stock apps for 8.19? Since stock and optimized for 8.19 are identical at this time, maybe it is better to look at all work using r3528 apps. GitHub: Ricks-Lab Instagram: ricks_labs |
Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
"Are you including only stock apps for 8.19? Since stock and optimized for 8.19 are identical at this time, maybe it is better to look at all work using r3528 apps." The crude way I crawl the SETI website is quite limited -- it takes a few hours to scrape out a report, and getting enough detail to discriminate specific versions would require literally 20x the queries. The easiest (and probably most consistent) filter is to just take stock apps running on single-GPU hosts (ie: the vast majority). Sampling just the common case means I'm unlikely to have confounding results from weird custom builds like petri33's, command-line tweaks, concurrent tasks, overclocked parts, and hacks like the GUPPI Rescheduler. It's a good picture of baseline performance, which I find helpful as a basis of comparison when selecting new parts and comparing the results of system tweaks. It's not trying to be a leaderboard of the 'best' systems because we have that already. There is a periodic dump of some of the database that I use to push some aspects offline (eg hosts.gz) but there's no export of the tasks table; if I could get that I'd be able to do a lot more detailed analysis without grinding the SETI servers. |
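The baseline filter described above (stock apps on single-GPU hosts only) could look roughly like the sketch below. It is only an illustration: the record layout and every field name are hypothetical, since the scraper's real schema is not shown in the thread.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    # All field names are hypothetical; the real scraper's schema is not shown.
    host_id: int
    gpu_model: str
    gpu_count: int          # GPUs installed in the host
    stock_app: bool         # True if the task ran the stock application
    credit_per_hour: float


def baseline_only(results):
    """Keep only tasks from stock apps running on single-GPU hosts."""
    return [r for r in results if r.stock_app and r.gpu_count == 1]


# Made-up records for illustration only.
results = [
    TaskResult(1, "GTX 1070", 1, True, 700.0),
    TaskResult(2, "GTX 1070", 2, True, 650.0),     # multi-GPU host: excluded
    TaskResult(3, "GTX 980 Ti", 1, False, 900.0),  # custom build: excluded
]
kept = baseline_only(results)
print([r.host_id for r in kept])
```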
Joined: 14 Feb 16 Posts: 492 Credit: 378,512,430 RAC: 785 |
"Are you including only stock apps for 8.19? Since stock and optimized for 8.19 are identical at this time, maybe it is better to look at all work using r3528 apps." Just kind of a bummer to know my systems aren't in the mix. With the quick release of Lunatics into stock, I think it is more likely the case now that people running optimized apps would be behind stock. I have recommended two people I interact with to upgrade to stock! Lunatics only keeps you ahead if you go through the work of manual installs when a new app is available. That's probably a small number of people. Also, I noticed recently that stock apps with no arguments for Fury are giving nearly the same output as optimized arguments. Have defaults changed in the latest release? GitHub: Ricks-Lab Instagram: ricks_labs |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"Have defaults changed in the latest release?" The defaults are now only a starting point. The app adapts to real GPU performance in the PulseFind area. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Joined: 14 Feb 16 Posts: 492 Credit: 378,512,430 RAC: 785 |
"Have defaults changed in the latest release?" Very cool! Thanks for the update. GitHub: Ricks-Lab Instagram: ricks_labs |
Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
I did a new scan this weekend; taking only the results from this scan I get: [chart] But if I aggregate both this week's and last week's scans I still get the 980 Ti in the lead. [chart] Even with just one week's worth of data, the top cards cover over a hundred different hosts and around 10,000 work-units each, so the sampling is pretty thorough. |
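Pooling two scans and reporting the host and work-unit counts next to the headline figure might be sketched as follows. This is a guess at the general shape, not the actual scan code: the tuple layout, the function name, and all numbers are made up for illustration.

```python
from collections import defaultdict
from statistics import median


def summarize(*scans):
    """Pool scans; report hosts, work-units, and median credit/hour per GPU."""
    rates = defaultdict(list)
    hosts = defaultdict(set)
    for scan in scans:
        for gpu_model, host_id, credit_per_hour in scan:
            rates[gpu_model].append(credit_per_hour)
            hosts[gpu_model].add(host_id)  # a set counts each host once
    return {
        gpu: {
            "hosts": len(hosts[gpu]),
            "work_units": len(rates[gpu]),
            "median_cr_hr": median(rates[gpu]),
        }
        for gpu in rates
    }


# Made-up (gpu_model, host_id, credit_per_hour) tuples.
last_week = [("GTX 980 Ti", 1, 820.0), ("GTX 1080", 2, 790.0)]
this_week = [("GTX 980 Ti", 1, 800.0), ("GTX 980 Ti", 3, 840.0),
             ("GTX 1080", 2, 810.0)]
print(summarize(last_week, this_week))
```

Counting distinct hosts separately from work-units matters here, because a host that appears in both weeks adds samples without adding independent hardware.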
Joined: 20 May 04 Posts: 76 Credit: 45,752,966 RAC: 8 |
Thanks Shaggie. Any ideas on the 980 Ti/1080 case? According to nVidia, the 980 Ti is around 6 TFLOPS and the 1080 is around 9 TFLOPS. In raw memory bandwidth they are almost the same, but the 1080 should also benefit from better memory compression, of around 20% as claimed by nVidia. |
Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
"Any ideas on the 980 Ti/1080 case?" My guess is that you've got the answer right there: bandwidth. AFAIK memory compression is a trick for render-target transfers and won't help OpenCL. I think if you look at the relative bandwidth of the 1070 vs the 1080 the SETI credit/hour is suspiciously proportional, too. But that's just my guess. |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Parts like Pulse, Triplet, and Autocorr are basically summation, so they are memory-bound in most cases. SETI apps news We're not gonna fight them. We're gonna transcend them. |
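A rough roofline-style estimate shows why a summation kernel would track memory bandwidth rather than peak FLOPS. The sketch below rests on stated assumptions: a plain float32 sum doing about one add per 4-byte value read, the rounded peak figures quoted in this thread, and the cards' published bandwidth specifications (roughly 336 GB/s for the 980 Ti and 320 GB/s for the 1080). It is not a model of the actual SETI kernels.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Roofline estimate: the lower of the compute limit and the memory limit."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


# Assumption: a plain float32 sum does about one add per 4-byte value read.
SUM_INTENSITY = 1 / 4  # FLOP per byte

# Peak GFLOPS are the rounded figures quoted in the thread; bandwidths are
# the cards' published specifications (assumed here, not measured).
cards = {
    "GTX 980 Ti": (6000.0, 336.0),
    "GTX 1080": (9000.0, 320.0),
}
for name, (peak, bandwidth) in cards.items():
    print(name, attainable_gflops(peak, bandwidth, SUM_INTENSITY))
```

Under these assumptions both cards sit far below their compute peak on such a kernel, and the card with slightly more bandwidth comes out slightly ahead regardless of its lower FLOPS rating.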
Joined: 20 May 04 Posts: 76 Credit: 45,752,966 RAC: 8 |
To make matters worse, on Pascal nVidia limits compute work to the P2 power state, i.e. throttles the memory clock back by approx. 10%, without giving users any proper reason. This is easy to check with GPU-Z or a similar tool while GPU tasks are running. :( |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"To make matters worse, on Pascal nVidia limits compute work to the P2 power state, i.e. throttles the memory clock back by approx. 10%, without giving users any proper reason. This is easy to check with GPU-Z or a similar tool while GPU tasks are running. :(" And easy to overcome with available tools like NvidiaInspector. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Joined: 20 May 04 Posts: 76 Credit: 45,752,966 RAC: 8 |
"To make matters worse, on Pascal nVidia limits compute work to the P2 power state, i.e. throttles the memory clock back by approx. 10%, without giving users any proper reason. This is easy to check with GPU-Z or a similar tool while GPU tasks are running. :(" I have tried it (Win10 and GTX 1080) but I couldn't make it work. This was possible on Maxwell but not on Pascal, I think... |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"To make matters worse, on Pascal nVidia limits compute work to the P2 power state, i.e. throttles the memory clock back by approx. 10%, without giving users any proper reason. This is easy to check with GPU-Z or a similar tool while GPU tasks are running. :(" Don't know about that. I find it interesting that it doesn't work on Win10 and Pascal. It's worked on all my machines so far, but the Win10 machine has my old Maxwell cards and not the newer Pascal cards. It works fine on my GTX 970s on the Win10 machine. Just what kind of troubles or issues did you run into on Win10 and Pascal? I'd like to know because in the future I might upgrade the Win10 machine to Pascal cards. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Just be wary of the Pascals from EVGA, lots of issues with temps in their forums lately. |
AMDave Joined: 9 Mar 01 Posts: 234 Credit: 11,671,730 RAC: 0 |
"Just be wary of the Pascals from EVGA, lots of issues with temps in their forums lately" EVGA has problems with GTX 1080/1070 FTW: "... marginally within spec and needed to be addressed. To fix the bug, EVGA will be rolling out a VBIOS update, which should adjust the fan speed curve to ensure sufficient cooling of all components. EVGA claims that this will resolve potential thermal problems..." ► "EVGA also notes that all graphics cards shipped from EVGA after 1st of November will have the VBIOS update applied." |
Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
They will also send you thermal pads that the user has to apply to both the VRAM and the VRMs. I'm waiting on them to send me both before I tear mine apart to put these on. |