GPU FLOPS: Theory vs Reality

Micky Badgero

Joined: 26 Jul 16
Posts: 44
Credit: 21,373,673
RAC: 83
United States
Message 1806180 - Posted: 31 Jul 2016, 22:17:14 UTC - in response to Message 1805996.  

When I started, I was also running climateprediction.net for half the resources. Both were running fairly slowly, so I stopped SETI@home. The climate software loaded more tasks and had calculation errors on half of them at just a few percent complete.

I stopped the climate software and reloaded SETI@home. It is noticeably faster now, but SETI@home may only be seeing 4 GB of GPU RAM. I can't tell, because I updated my GPU driver and the card no longer shows up when I look at my computer on SETI@home.

jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1806182 - Posted: 31 Jul 2016, 22:31:24 UTC - in response to Message 1806180.  

When I started, I was also running climateprediction.net for half the resources. Both were running fairly slowly, so I stopped SETI@home. The climate software loaded more tasks and had calculation errors on half of them at just a few percent complete.

I stopped the climate software and reloaded SETI@home. It is noticeably faster now, but SETI@home may only be seeing 4 GB of GPU RAM. I can't tell, because I updated my GPU driver and the card no longer shows up when I look at my computer on SETI@home.


In Windows, most of the applications are 32-bit, so memory amounts above 4 GiB make no difference to them. A 64-bit build would allow access to large amounts of memory that the current work/apps don't need. Windows on Windows (i.e. running 32-bit apps on a 64-bit system) uses smaller addresses, which on the GPU saves precious register space, and the host CPU has a feature called 'register renaming' that lets the 32-bit apps run quite efficiently.

Things are changing: new hardware has so many registers that a lot of that doesn't matter anymore, and there are moves to deprecate 32-bit OS support entirely. The rub then becomes whether to supply a slower native 64-bit application, or multiple builds. Linux made this choice pretty easy by breaking 32-bit app compilation on 64-bit hosts, and by deprecating various CUDA devices early.
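
A back-of-the-envelope illustration of that 4 GiB ceiling (a minimal Perl sketch, not from the SETI@home apps; $Config{ptrsize} is the pointer size of the running Perl build): pointer width fixes the addressable range, so a 32-bit build tops out at 2^32 bytes no matter how much RAM is installed.

    use strict;
    use warnings;
    use Config;

    # Pointer width of this Perl build: 4 bytes when 32-bit, 8 when 64-bit.
    my $bits = 8 * $Config{ptrsize};

    # Addressable bytes = 2**$bits; divide by 2**30 to express that in GiB.
    printf "%d-bit build: %s GiB of address space\n", $bits, 2 ** ($bits - 30);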
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.

Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14654
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1806190 - Posted: 31 Jul 2016, 22:55:00 UTC

climateprediction.net doesn't have, has never had, and probably will never have any GPU applications (VM apps are more likely).

The climateprediction.net failures are unlikely to have any direct connection with the SETI@home GPU applications. Indirect connections are certainly possible (CPDN saves large files to hard disk, frequently, for example), but I don't think they justify this juxtaposition.

Shaggie76
Joined: 9 Oct 09
Posts: 282
Credit: 271,858,118
RAC: 196
Canada
Message 1806227 - Posted: 1 Aug 2016, 1:03:42 UTC

I updated my script to check which API was used for each work unit; this revealed that a small number of hosts were running CUDA when most others were running OpenCL (those running CUDA tasks were generally earning lower credit and brought the average down).

After the regular scan didn't find enough work units to qualify, I ran a second, complementary scan to collect data from different hosts; interestingly, this gathered enough data from the rarer GPU types to include them, so my graphs are bigger than normal.


Micky Badgero
Joined: 26 Jul 16
Posts: 44
Credit: 21,373,673
RAC: 83
United States
Message 1806238 - Posted: 1 Aug 2016, 1:39:29 UTC - in response to Message 1806182.  

Is SETI@home a 32 bit app?

Micky Badgero
Joined: 26 Jul 16
Posts: 44
Credit: 21,373,673
RAC: 83
United States
Message 1806239 - Posted: 1 Aug 2016, 1:45:03 UTC - in response to Message 1806190.  

Too bad about climateprediction.net not using GPU. Testing out my new GPU was one of my reasons for joining BOINC.

I was doing SETI@home before BOINC many years ago under my starman2020 user name, but the accounts are not connected today and I do not have access to that yahoo email anymore.

Whatever the case, SETI@home was doing 4,000 credits a day while running alongside climateprediction.net, and without it, it is easily doing twice that.

rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22227
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1806262 - Posted: 1 Aug 2016, 5:00:16 UTC

For MultiBeam there are 32-bit and 64-bit CPU applications; however, the CPU part of the GPU applications is 32-bit (for Windows).
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Rockhount
Joined: 29 May 00
Posts: 34
Credit: 31,935,954
RAC: 29
Germany
Message 1806340 - Posted: 1 Aug 2016, 11:13:05 UTC

Hi Shaggie76,
great work with your script. How big is your database already?

Could you check these results too? They were made with Lunatics 0.44.

Host:
https://setiathome.berkeley.edu/results.php?hostid=7450977

Intel Xeon L5640 HT on, but 7 tasks max.
https://setiathome.berkeley.edu/result.php?resultid=4970894174

Nvidia Quadro 4000, 1 task only
https://setiathome.berkeley.edu/result.php?resultid=4970804541
https://setiathome.berkeley.edu/result.php?resultid=4970804543

I'm switching BOINC between two machines @home (up & download) and the Xeon @work (crunching).
Regards from northern Germany
Roman

SETI@home classic workunits 207,059
SETI@home classic CPU time 1,251,095 hours


Shaggie76
Joined: 9 Oct 09
Posts: 282
Credit: 271,858,118
RAC: 196
Canada
Message 1806349 - Posted: 1 Aug 2016, 12:35:52 UTC - in response to Message 1806340.  

How big is your database already?

I reset the collection every week when I get a fresh host DB from SETI.

Roughly speaking, here's how many hosts it found for each GPU in the last scan (minus 1 or 2 lines for the CSV headers):

wc -l *.csv
      6 AMD Radeon R9 200 Series.csv
     69 Bonaire.csv
      4 Carrizo.csv
     23 Ellesmere.csv
     20 Fiji.csv
     23 GeForce GTX 1060.csv
     81 GeForce GTX 1070.csv
     93 GeForce GTX 1080.csv
     48 GeForce GTX 260.csv
     34 GeForce GTX 275.csv
      6 GeForce GTX 280.csv
     27 GeForce GTX 285.csv
     75 GeForce GTX 460.csv
     12 GeForce GTX 465.csv
     55 GeForce GTX 470.csv
     27 GeForce GTX 480.csv
     91 GeForce GTX 550 Ti.csv
     10 GeForce GTX 555.csv
     68 GeForce GTX 560 Ti.csv
     77 GeForce GTX 560.csv
     73 GeForce GTX 570.csv
     57 GeForce GTX 580.csv
     65 GeForce GTX 645.csv
     79 GeForce GTX 650 Ti.csv
     74 GeForce GTX 650.csv
     74 GeForce GTX 660 Ti.csv
     80 GeForce GTX 660.csv
     87 GeForce GTX 670.csv
     86 GeForce GTX 680.csv
     81 GeForce GTX 745.csv
     86 GeForce GTX 750 Ti.csv
     92 GeForce GTX 750.csv
     96 GeForce GTX 760.csv
     81 GeForce GTX 770.csv
     68 GeForce GTX 780 Ti.csv
     65 GeForce GTX 780.csv
     76 GeForce GTX 950.csv
     87 GeForce GTX 960.csv
     87 GeForce GTX 970.csv
     78 GeForce GTX 980 Ti.csv
     81 GeForce GTX 980.csv
      9 GeForce GTX TITAN Black.csv
     43 GeForce GTX TITAN X.csv
     25 GeForce GTX TITAN.csv
     20 Hainan.csv
     71 Hawaii.csv
      9 Iceland.csv
     54 Kalindi.csv
     56 Mullins.csv
     70 Oland.csv
     73 Spectre.csv
     65 Tonga.csv
   2997 total


Could you check these results too? They were made with Lunatics 0.44.

Host:
https://setiathome.berkeley.edu/results.php?hostid=7450977

Intel Xeon L5640 HT on, but 7 tasks max.
https://setiathome.berkeley.edu/result.php?resultid=4970894174

Nvidia Quadro 4000, 1 task only
https://setiathome.berkeley.edu/result.php?resultid=4970804541
https://setiathome.berkeley.edu/result.php?resultid=4970804543

I'm switching BOINC between two machines @home (up & download) and the Xeon @work (crunching).


The website only shows 2 validated tasks so there isn't much to take an average of:

C:\SETI>aggregate.pl -anon -v 7450977
Host, API, Device, Credit, Seconds, Credit/Hour, Work Units
wuid=2178135221 25.1364450165141 CR/h cpu
wuid=2178093785 29.3952630946179 CR/h gpu
7450977, cpu, Intel Core i5-2500K @ 3.30GHz, 98.24, 3517.4425, 100.545780066057, 1
7450977, gpu, NVIDIA GeForce GTX 750 Ti, 89.65, 10979.32, 29.3952630946179, 1

Shaggie76
Joined: 9 Oct 09
Posts: 282
Credit: 271,858,118
RAC: 196
Canada
Message 1806358 - Posted: 1 Aug 2016, 13:14:36 UTC

Someone privately asked me how the data was extracted and I figured I'd share the reply here as well.

The scripts are pretty simple (which has serious limitations).

  • get a list of all hosts from the SETI XML dump
  • distill that list into a list of Host Ids for each GPU
  • filter list to single-GPU setups with a minimum activity and total credit
  • select 50 random hosts for each GPU and scan the website task list to get credit & time stats
  • ignore hosts without a minimum number of work-units validated (10)
  • discard GPUs that have less than 10 hosts with enough work validated
  • "winsorize" the stats for each GPU: sort them by credit/hr and average the top 25% if there are 20+ hosts or the top 50% otherwise (trying to avoid multiprocessing).
  • fart around in Excel for a few minutes to turn the CSV into pictures
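
A minimal sketch of that "winsorize" step (an illustration only, not the published script -- the real Perl is on Shaggie76's GitHub, mentioned below): sort the per-host credit/hour figures and average just the top slice, so hosts quietly running several tasks at once fall into the discarded tail.

    use strict;
    use warnings;

    # Average the top 25% of samples when there are 20+ hosts, else the top
    # 50% -- hosts running multiple tasks concurrently report lower
    # credit/hour and land in the tail that gets dropped.
    sub top_slice_mean {
        my @rates = sort { $b <=> $a } @_;       # fastest hosts first
        my $keep  = @rates >= 20 ? 0.25 : 0.50;
        my $n     = int(@rates * $keep) || 1;    # keep at least one sample
        my $sum   = 0;
        $sum += $rates[$_] for 0 .. $n - 1;
        return $sum / $n;
    }

    # Hypothetical credit/hour samples for one GPU model:
    printf "%.1f CR/h\n", top_slice_mean(29.4, 31.0, 14.8, 30.2, 15.1);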


Note that my graphs currently omit tasks run under the anonymous platform:


  • anon users are more likely to run multiple tasks concurrently (and I can't tell how many they're doing at once)
  • anon users might be using the GUPPI rescheduler (which misleads the server about whether the task actually ran on CPU).


Although Perl tends to be write-only, I keep the source published on GitHub in case anyone wants to see how I did it.


Stubbles
Volunteer tester
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1806461 - Posted: 1 Aug 2016, 21:03:21 UTC - in response to Message 1806358.  

Hey Shaggie,
Great stats again...but this time I have a few Qs!

Don't know if I understand the big picture wrt your GPU count stats above.
Are there only ~3k GPUs out of the 100k+ active hosts crunching for S@h?

As for:
    * select 50 random hosts for each GPU and scan the website task list to get credit & time stats

Why are you not selecting those with the top RAC?
I'm certain there's a good reason. I just can't seem to see one! :-S
Are you trying to avoid getting too many "anonymous platforms"?

Cheers,
Rob :-)

Shaggie76
Joined: 9 Oct 09
Posts: 282
Credit: 271,858,118
RAC: 196
Canada
Message 1806472 - Posted: 1 Aug 2016, 21:59:27 UTC - in response to Message 1806461.  
Last modified: 1 Aug 2016, 21:59:53 UTC


Don't know if I understand the big picture wrt your GPU count stats above.
Are there only ~3k GPUs out of the 100k+ active hosts crunching for S@h?

No no -- I'm just taking a random sampling and that was how many samples I collected (because I was asked "How big is your database already?").

Why are you not selecting those with the top RAC?
I'm certain there's a good reason. I just can't seem to see one! :-S
Are you trying to avoid getting too many "anonymous platforms"?

I guess I'm not looking for the best hosts but rather a measure of central tendency -- the fastest hosts might be overclocking, doing rescheduling tricks or something else I can't account for. The only reason I winsorize the mean at the end is to try to cull multiprocessing dragging the average down.

Zalster
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1806484 - Posted: 1 Aug 2016, 23:30:18 UTC - in response to Message 1806472.  

The only reason I winsorize the mean at the end is to try to cull multiprocessing dragging the average down.


Can you explain that statement?

My experience has shown multiprocessing to be faster than single processing.

Shaggie76
Joined: 9 Oct 09
Posts: 282
Credit: 271,858,118
RAC: 196
Canada
Message 1806503 - Posted: 2 Aug 2016, 0:28:38 UTC - in response to Message 1806484.  

The only reason I winsorize the mean at the end is to try to cull multiprocessing dragging the average down.


Can you explain that statement?

My experience has shown multiprocessing to be faster than single processing.


Because I can't tell how many tasks users are running concurrently, I can't account for it -- their tasks just seem to take longer, which makes it look like the GPU doesn't perform very well.
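
To make that concrete, a tiny worked example with made-up numbers (not data from the scans): running n tasks at once stretches each task's wall-clock time by roughly a factor of n, so per-task credit/hour shrinks even though total throughput is unchanged -- and the task list only reveals the per-task figure.

    use strict;
    use warnings;

    my $credit  = 90;      # credit per task (assumed)
    my $seconds = 3600;    # run time with the GPU to itself (assumed)

    for my $n (1, 2, 3) {
        my $elapsed  = $seconds * $n;                 # naive linear-scaling assumption
        my $per_task = $credit / ($elapsed / 3600);   # what the task list shows
        printf "%d at once: %5.1f CR/h per task, %.1f CR/h total\n",
            $n, $per_task, $per_task * $n;
    }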

Stubbles
Volunteer tester
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1806522 - Posted: 2 Aug 2016, 1:31:56 UTC - in response to Message 1806503.  
Last modified: 2 Aug 2016, 1:32:54 UTC

Instead of looking at credit, what about just doing a tally of tasks/day?
(and possibly ignoring shorties)
[edit] that way your stats wouldn't be affected by multiple tasks on GPUs [/edit]

That could give you better stats since:
1. CreditNew is skewed and screwed up; and
2. you're only capturing a subset of tasks: those that have been validated.

This way you could include all tasks during the last 24-or-so hrs.
Just a thought... but I have no idea what level of coding effort would be required.
R :-)

Shaggie76
Joined: 9 Oct 09
Posts: 282
Credit: 271,858,118
RAC: 196
Canada
Message 1806527 - Posted: 2 Aug 2016, 2:01:14 UTC - in response to Message 1806522.  

Instead of looking at credit, what about just doing a tally of tasks/day?
(and possibly ignoring shorties)
[edit] that way your stats wouldn't be affected by multiple tasks on GPUs [/edit]

That could give you better stats since:
1. CreditNew is skewed and screwed up; and
2. you're only capturing a subset of tasks: those that have been validated.

This way you could include all tasks during the last 24-or-so hrs.
Just a thought... but I have no idea what level of coding effort would be required.

Not everyone crunches 24/7 and this approach would favor full-time hosts and dilute the stats for everyone else.

I could bend over backwards and do silly stuff, but I expect it would take dramatically more PHP queries (i.e. slower for me, and more annoying for the people running the servers).

And again: people running the GUPPI rescheduler cause the server to mislead my stats, so I'm extremely apprehensive about trying to analyze anonymous data in bulk.

If task.gz is ever a thing I might dig deeper, but for now I'm pretty content -- my scripts give data that applies to the vast majority of users who don't mess around with the settings.

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13751
Credit: 208,696,464
RAC: 304
Australia
Message 1806572 - Posted: 2 Aug 2016, 7:24:53 UTC - in response to Message 1806522.  

2. you're only capturing a subset of tasks: those that have been validated.

Makes sense as only valid tasks matter.
Doesn't matter how fast you pump them out; if they aren't valid they're of no use, so there's not much point pumping out all those errors.
Grant
Darwin NT

Stubbles
Volunteer tester
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1806581 - Posted: 2 Aug 2016, 7:49:30 UTC - in response to Message 1806572.  

2. you're only capturing a subset of tasks: those that have been validated.

Makes sense as only valid tasks matter.
Doesn't matter how fast you pump them out; if they aren't valid they're of no use, so there's not much point pumping out all those errors.

There's a group named "Validation pending" whose count is often bigger than the "Valid" one.

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13751
Credit: 208,696,464
RAC: 304
Australia
Message 1806582 - Posted: 2 Aug 2016, 7:56:18 UTC - in response to Message 1806581.  

There's a group named "Validation pending" whose count is often bigger than the "Valid" one.

The validated WUs get cleared from the database after a set time so it doesn't come to a grinding halt.
Validation pendings aren't valid, so they're of no use when it comes to determining how well a GPU is performing.
Generally, the smaller the cache & the more powerful the GPU, the larger the number of pendings will be.
Grant
Darwin NT

Stubbles
Volunteer tester
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1806602 - Posted: 2 Aug 2016, 10:06:54 UTC - in response to Message 1806582.  

Validation pendings aren't valid, so they're of no use when it comes to determining how well a GPU is performing.

There's a 99+% chance that my "Validation pending" will become "Valid".
Obviously for hosts that have many "invalids" my suggested task tally approach would not be recommended.
IBM's World Community Grid already uses three metrics, and one of them is raw task count... so I don't think a variant (one that doesn't give much weight to shorties) should be dismissed outright.
Cheers,
Rob :-)