GPU FLOPS: Theory vs Reality

Message boards : Number crunching : GPU FLOPS: Theory vs Reality

Micky Badgero

Send message
Joined: 26 Jul 16
Posts: 44
Credit: 21,373,673
RAC: 83
United States
Message 1808066 - Posted: 9 Aug 2016, 2:49:41 UTC - in response to Message 1805969.  

Running faster. 12,670 ± ~10 right now. May have some speedup to go though. I see some 1080s are running at 16,000+.
ID: 1808066 · Report as offensive
Micky Badgero

Send message
Joined: 26 Jul 16
Posts: 44
Credit: 21,373,673
RAC: 83
United States
Message 1808176 - Posted: 9 Aug 2016, 22:00:54 UTC - in response to Message 1808066.  

Still hasn't stabilized. 13,500 now.
ID: 1808176 · Report as offensive
Profile Stubbles
Volunteer tester
Avatar

Send message
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1808187 - Posted: 9 Aug 2016, 22:33:33 UTC - in response to Message 1808176.  

Still hasn't stabilized. 13,500 now.

Hey Micky,
It will take more than another week for your RAC to stabilize, since you only started crunching with it 10 days ago.
RAC is not an appropriate metric for anything short-term, as can be seen in the 3rd graph.

Looking at the 3rd graph from the 1st link provided, it seems that it processes about 20k/day.

Let me know if you have Qs/comments.
Cheers,
Rob :-)
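
A rough sketch of why RAC is still climbing at this point. It assumes RAC behaves as an exponentially weighted average with a one-week half-life (the usual BOINC default, but treat it as an assumption here) and that the host earns a constant 20k credits per day from a standing start. The function name `expected_rac` is made up for this illustration.

```python
HALF_LIFE_DAYS = 7.0  # assumption: one-week RAC half-life


def expected_rac(steady_credit_per_day: float, days_running: float) -> float:
    """RAC of a host that started at zero and earns a constant daily credit."""
    # Fraction of the steady-state value still "missing" after days_running.
    decay = 0.5 ** (days_running / HALF_LIFE_DAYS)
    return steady_credit_per_day * (1.0 - decay)


if __name__ == "__main__":
    # 20,000/day is the estimate read off the graph in the post above.
    for day in (7, 10, 14, 21, 28):
        print(f"day {day:2d}: RAC ~ {expected_rac(20_000, day):,.0f}")
```

Under these assumptions the host is at half of its final RAC after one week and at roughly 12,600 after 10 days, which is in the same range as the numbers reported above. It only gets within about 10% of 20k after three to four weeks.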
ID: 1808187 · Report as offensive
Micky Badgero

Send message
Joined: 26 Jul 16
Posts: 44
Credit: 21,373,673
RAC: 83
United States
Message 1808321 - Posted: 10 Aug 2016, 14:53:12 UTC - in response to Message 1808187.  

Thanks, Rob.
ID: 1808321 · Report as offensive
Micky Badgero

Send message
Joined: 26 Jul 16
Posts: 44
Credit: 21,373,673
RAC: 83
United States
Message 1808323 - Posted: 10 Aug 2016, 14:59:22 UTC - in response to Message 1806227.  

Any idea why the AMD Ellesmere (Radeon RX 400 series) is slower than the AMD Fiji (Radeon RX 300 series)?

I was thinking about adding a $200 RX470, but the performance seems very low.
ID: 1808323 · Report as offensive
Profile M_M
Avatar

Send message
Joined: 20 May 04
Posts: 76
Credit: 45,752,966
RAC: 8
Serbia
Message 1808343 - Posted: 10 Aug 2016, 16:15:04 UTC - in response to Message 1808323.  
Last modified: 10 Aug 2016, 16:16:03 UTC

AMD Fiji is the R9 Fury series, i.e. still AMD's top performance series, waiting to be replaced soon since it cannot compete with the new nVidia GPUs.

However, the R9 Fury/Fury X is still more powerful than the Ellesmere RX 470/480, which are AMD's new power-efficient mid-range GPUs, with overall performance similar to the AMD Hawaii R9 290/290X.
ID: 1808343 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Avatar

Send message
Joined: 17 Feb 01
Posts: 34367
Credit: 79,922,639
RAC: 80
Germany
Message 1808346 - Posted: 10 Aug 2016, 16:38:31 UTC - in response to Message 1808343.  

AMD Fiji is the R9 Fury series, i.e. still AMD's top performance series, waiting to be replaced soon since it cannot compete with the new nVidia GPUs.

However, the R9 Fury/Fury X is still more powerful than the Ellesmere RX 470/480, which are AMD's new power-efficient mid-range GPUs, with overall performance similar to the AMD Hawaii R9 290/290X.


Errr ummm.

For SETI, the Fury X is still faster than all the new NV GPUs if set up correctly.


With each crime and every kindness we birth our future.
ID: 1808346 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1808347 - Posted: 10 Aug 2016, 16:49:55 UTC - in response to Message 1808346.  

Mike is right. The Fury is faster.

The issue is not being able to run more than 1 work unit in parallel on them.

If they could, they would tear up the stats.

I know a lot of people that wish that issue would be resolved.
ID: 1808347 · Report as offensive
Micky Badgero

Send message
Joined: 26 Jul 16
Posts: 44
Credit: 21,373,673
RAC: 83
United States
Message 1808348 - Posted: 10 Aug 2016, 16:54:15 UTC - in response to Message 1808343.  

Thanks, M_M,

I see you are running a system similar to mine. How long have you been using the 1080?
ID: 1808348 · Report as offensive
Micky Badgero

Send message
Joined: 26 Jul 16
Posts: 44
Credit: 21,373,673
RAC: 83
United States
Message 1808350 - Posted: 10 Aug 2016, 17:07:01 UTC - in response to Message 1808347.  

At least one web site claims that the RX 480 is about as fast as the GTX 1070: http://www.pcadvisor.co.uk/review/graphics-cards/nvidia-geforce-gtx-1080-1070-1060-vs-amd-radeon-rx-480-3641216/

Is the Fury the same as the AMD Compute Engine?

How do you set up the Fury to be faster than stock?

SETI does not use all the memory of the 1080 either, but the 1080 wasn't designed for SETI. Mine will not be SETI full time in the future. I will be running deep learning, then I will be learning to program CUDA for IPCA and genetic algorithms, so much of the SETI number crunching will slow down.
ID: 1808350 · Report as offensive
Micky Badgero

Send message
Joined: 26 Jul 16
Posts: 44
Credit: 21,373,673
RAC: 83
United States
Message 1808354 - Posted: 10 Aug 2016, 17:23:58 UTC - in response to Message 1808346.  
Last modified: 10 Aug 2016, 17:27:04 UTC

Is SETI integer or floating point? AMD used to have a reputation for faster integer and NVidia was faster for float. Don't know where they stand now.
ID: 1808354 · Report as offensive
Profile M_M
Avatar

Send message
Joined: 20 May 04
Posts: 76
Credit: 45,752,966
RAC: 8
Serbia
Message 1808366 - Posted: 10 Aug 2016, 18:19:05 UTC - in response to Message 1808354.  

As far as I know, it is 32-bit float, i.e. single precision (SP), where nVidia is in general slightly faster in the same price bracket. However, in DP AMD is usually faster, as nVidia is "saving" DP performance for much more expensive dedicated compute cards such as the Tesla P100.
ID: 1808366 · Report as offensive
Profile Shaggie76
Avatar

Send message
Joined: 9 Oct 09
Posts: 282
Credit: 271,858,118
RAC: 196
Canada
Message 1808390 - Posted: 10 Aug 2016, 20:57:01 UTC

I know a guy who collected data from people running SETI@home and graphed out the performance of different cards. Note in this case "Ellesmere" refers to RX480 (and probably 470 now although the last time the data was collected it was probably untainted by 470's).

Just sayin'
ID: 1808390 · Report as offensive
Profile Stubbles
Volunteer tester
Avatar

Send message
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1808394 - Posted: 10 Aug 2016, 22:07:59 UTC - in response to Message 1808390.  

I know a guy who collected data from people running SETI@home and graphed out the performance of different cards. Note in this case "Ellesmere" refers to RX480 (and probably 470 now although the last time the data was collected it was probably untainted by 470's).
Just sayin'

lmao!!! You'd be a great male researcher Shaggie
That puts a whole different spin on being a stud! ;-p
ID: 1808394 · Report as offensive
Profile Shaggie76
Avatar

Send message
Joined: 9 Oct 09
Posts: 282
Credit: 271,858,118
RAC: 196
Canada
Message 1808422 - Posted: 11 Aug 2016, 1:55:11 UTC

Since I was so busy last weekend fighting with the 1070s I didn't get to run another scan. Here's the latest results:



The 1060's have enough data now to show up. I also ran a large enough scan to get CUDA results from a lot of cards (running stock). They definitely brought the average down before I started filtering them out.
ID: 1808422 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Avatar

Send message
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1808426 - Posted: 11 Aug 2016, 2:09:40 UTC - in response to Message 1808422.  

Wow, I was hoping for more from the 1060's, they are still behind the 1070 in credit/watt-hour. Looks like we'll have to wait till they release (hopefully) the 1050Ti, and see if that can take the crown back from the 1070. Especially if the price can be competitive eventually with the 750, though that will probably take a while to drop to that level. I am just finishing setting up my new dual 1060 system, it'll be interesting to see how they do in a pair. Still need to read up on the best way to configure them, lots of .xml editing it seems to get the most out of them, above the standard Lunatics install.

ID: 1808426 · Report as offensive
Kiska
Volunteer tester

Send message
Joined: 31 Mar 12
Posts: 302
Credit: 3,067,762
RAC: 0
Australia
Message 1808432 - Posted: 11 Aug 2016, 3:50:14 UTC - in response to Message 1808422.  

Hmmm... I wonder where CPUs would end up on that graph. Would you mind running that wonderful script on my CPU-only machine? ID 8034200
The Xeon E3-1230 v3 TDP is 80 W, whilst I am measuring 63 W at the wall. I am only running 7 tasks at once.
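
For illustration, here is how much a credit/watt-hour figure shifts depending on whether the rated TDP (80 W) or the measured draw (63 W) goes in the denominator. This is only a sketch: the credit rate is a made-up placeholder, not data from this thread, and `credit_per_watt_hour` is a hypothetical helper rather than the script mentioned above.

```python
def credit_per_watt_hour(credit_per_hour: float, watts: float) -> float:
    """Credit earned per watt-hour of energy at a steady power draw."""
    # One hour at `watts` consumes `watts` watt-hours.
    return credit_per_hour / watts


# Placeholder value, purely hypothetical. It cancels out in the ratio below.
PLACEHOLDER_CREDIT_PER_HOUR = 100.0

by_tdp = credit_per_watt_hour(PLACEHOLDER_CREDIT_PER_HOUR, 80.0)   # rated TDP
by_wall = credit_per_watt_hour(PLACEHOLDER_CREDIT_PER_HOUR, 63.0)  # measured

if __name__ == "__main__":
    print(f"efficiency using TDP:      {by_tdp:.3f} credit/Wh")
    print(f"efficiency using measured: {by_wall:.3f} credit/Wh")
    print(f"measured / TDP ratio:      {by_wall / by_tdp:.2f}x")
```

Whatever the real credit rate is, the measured figure makes the machine look about 1.27x more efficient than the TDP-based one. Bear in mind that a wall reading covers the whole system, while TDP is a rating for the CPU alone, so the two are not strictly comparable.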
ID: 1808432 · Report as offensive
Profile M_M
Avatar

Send message
Joined: 20 May 04
Posts: 76
Credit: 45,752,966
RAC: 8
Serbia
Message 1808446 - Posted: 11 Aug 2016, 5:41:19 UTC
Last modified: 11 Aug 2016, 5:53:12 UTC

I am also a bit surprised that the difference between the GTX 1080 and GTX 1070 is so small, since the GTX 1080 has 33% more compute units (2560 vs 1920 shaders) and 25% faster memory (10 GHz vs 8 GHz), and it's even clocked a bit higher, so something is holding the GTX 1080 back? Even nVidia was advertising the GTX 1080 as 8.9 TFLOPS and the GTX 1070 as 6.5 TFLOPS.

I would guess that current application implementation is not using its extra resources well... Maybe time for some new, optimized application?
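
As a sanity check on those advertised figures, theoretical FP32 throughput is normally quoted as 2 FLOPs (one fused multiply-add) per shader per clock. The shader counts below come from the post. The boost clocks of 1733 MHz and 1683 MHz are assumed reference-card values that are not stated in this thread, so take this as a sketch and not as vendor data.

```python
def peak_fp32_tflops(shaders: int, clock_mhz: float) -> float:
    """Theoretical single-precision peak: 2 FLOPs per shader per clock."""
    flops = 2 * shaders * clock_mhz * 1e6
    return flops / 1e12


# Shader counts from the post; boost clocks are assumptions (reference specs).
gtx_1080 = peak_fp32_tflops(shaders=2560, clock_mhz=1733)
gtx_1070 = peak_fp32_tflops(shaders=1920, clock_mhz=1683)

if __name__ == "__main__":
    print(f"GTX 1080: {gtx_1080:.2f} TFLOPS")
    print(f"GTX 1070: {gtx_1070:.2f} TFLOPS")
    print(f"theoretical advantage: {gtx_1080 / gtx_1070:.2f}x")
```

With these inputs the two cards land close to the advertised 8.9 and 6.5 TFLOPS, for a paper advantage of roughly 1.37x. That is the gap the measured SETI results are not showing.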
ID: 1808446 · Report as offensive
Kiska
Volunteer tester

Send message
Joined: 31 Mar 12
Posts: 302
Credit: 3,067,762
RAC: 0
Australia
Message 1808450 - Posted: 11 Aug 2016, 6:25:51 UTC - in response to Message 1808446.  

My guess would be that the CUDA app, which so many people are still running, doesn't take full advantage of the card. With CUDA you'll need to run 3 or even 4 tasks concurrently to utilise the card to 100%. The maximum I have seen a single CUDA task use is 33% of a card. Whereas the SoG app uses close to 100% and doesn't need to run multiple concurrently
ID: 1808450 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13851
Credit: 208,696,464
RAC: 304
Australia
Message 1808455 - Posted: 11 Aug 2016, 6:46:27 UTC - in response to Message 1808450.  

Whereas the SoG app uses close to 100% and doesn't need to run multiple concurrently

Although doing so actually results in more work per hour, at least with the default stock settings.

On my GTX 1070:
1 WU, 6 per hr
2 WUs, 6.5 per hr
3 WUs, 7.2 per hr

With optimised settings, 1 WU at a time may very well give the most output per hour.
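
The figures above can be turned into a throughput gain and a per-task run time with a little arithmetic. This is a minimal sketch using only the three measurements quoted in this post; the helper names are made up for the example.

```python
# Concurrent WUs -> WUs completed per hour, as quoted in the post (GTX 1070).
measurements = {1: 6.0, 2: 6.5, 3: 7.2}


def minutes_per_wu(concurrent: int, per_hour: float) -> float:
    """Wall-clock minutes each WU takes when `concurrent` run side by side."""
    return concurrent * 60.0 / per_hour


def gain_percent(per_hour: float, baseline: float) -> float:
    """Throughput gain relative to running a single WU at a time."""
    return (per_hour / baseline - 1.0) * 100.0


if __name__ == "__main__":
    baseline = measurements[1]
    for n, rate in measurements.items():
        print(
            f"{n} WU(s): {rate:.1f}/hr, "
            f"{minutes_per_wu(n, rate):.1f} min each, "
            f"{gain_percent(rate, baseline):+.1f}% vs 1 WU"
        )
```

So three at a time gives about 20% more work per hour, while each individual WU takes about 25 minutes instead of 10.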
Grant
Darwin NT
ID: 1808455 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.