some basics in amd gpu computing

merle van osdol
Joined: 23 Oct 02
Posts: 809
Credit: 1,980,117
RAC: 0
United States
Message 1578328 - Posted: 26 Sep 2014, 18:39:23 UTC

From what I understand so far, we don't want to combine high end and low end amd gpu's in the same computer. For example, r7 265 with 7770 with 6570. Why is that? Because the amd drivers are different for the different cards and we cannot use multiple amd drivers on the same computer? Or what? Or what don't I understand?
ID: 1578328
woohoo
Volunteer tester
Joined: 30 Oct 13
Posts: 972
Credit: 165,671,404
RAC: 5
United States
Message 1578340 - Posted: 26 Sep 2014, 18:52:40 UTC

if i had all three of those cards i would at least try them with the latest driver, that might work. using lunatics i would only run one wu per gpu because trying to run multiple wus on the faster gpus while simultaneously running one wu on the slower gpus might require multiple instances of boinc and i'm too lazy for that. there are also command line parameters that you can optionally use but i don't know if you can apply different settings for each individual card. plus there could be timing issues as well.
ID: 1578340
merle van osdol
Joined: 23 Oct 02
Posts: 809
Credit: 1,980,117
RAC: 0
United States
Message 1578353 - Posted: 26 Sep 2014, 19:05:53 UTC - in response to Message 1578340.  

thanks woohoo,

Yeah, I am pretty sure that the reason I couldn't get that to work last time was because the 6570 was no longer supported by the current amd driver needed for the newer cards.

So my answer to myself then is that you can only use one graphics driver on the same computer. Seems strange to me but I'm not a CS person.
ID: 1578353
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1578360 - Posted: 26 Sep 2014, 19:09:30 UTC - in response to Message 1578328.  
Last modified: 26 Sep 2014, 19:17:04 UTC

From what I understand so far, we don't want to combine high end and low end amd gpu's in the same computer. For example, r7 265 with 7770 with 6570. Why is that? Because the amd drivers are different for the different cards and we cannot use multiple amd drivers on the same computer? Or what? Or what don't I understand?

You don't want to mix fast and slow GPUs from any vendor. Those 3 cards actually all use the same driver, so you can in fact use them all at the same time if you really want, but it isn't a good idea for a few reasons. The reasons you don't want to mix fast and slow cards come down to how BOINC treats the GPUs.

The top two issues that come to mind are:
- Configuration of a GPU is common across all GPUs for that app. So configuring a fast GPU to run 2 at once will also force the slower card to run 2 at once, which may cause it to choke and error.
- BOINC only stores one value for estimated times per app. So your really fast GPU would cause the estimated time to be very low. Then when the slower card runs a task, it will be stopped early because it "ran too long".
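
A minimal sketch of the two issues above, in Python. This is a toy model, not BOINC code: the `AppSettings` class, the `plan` function, and all the runtimes are made up for illustration. It only shows that one per-app setting and one per-app estimate end up applied to every card, fast or slow.

```python
from dataclasses import dataclass


@dataclass
class AppSettings:
    """Hypothetical per-app settings: one copy, shared by every GPU."""
    tasks_per_gpu: int       # "run 2 at once" applies to all cards
    estimated_hours: float   # the single runtime estimate kept for the app


def plan(solo_hours_by_gpu, settings):
    """Return what each card is asked to do under the shared settings.

    solo_hours_by_gpu maps a card name to an assumed runtime (hours)
    for one task running alone on that card.
    """
    result = {}
    for name, solo_hours in solo_hours_by_gpu.items():
        result[name] = {
            # Every card gets the same concurrency, suitable or not.
            "tasks_at_once": settings.tasks_per_gpu,
            # How far the card's real runtime is from the shared estimate.
            "runtime_vs_estimate": solo_hours / settings.estimated_hours,
        }
    return result


# Assumed numbers: the fast card sets the estimate, the slow card is 6x slower.
cards = {"R7 265": 1.0, "HD 6570": 6.0}
shared = AppSettings(tasks_per_gpu=2, estimated_hours=1.0)
report = plan(cards, shared)
for name, info in report.items():
    print(name, info)
```

With these assumed numbers the slow card is told to run 2 at once like the fast one, and each of its tasks already takes 6 times the shared estimate before any slowdown from doubling up.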
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1578360
merle van osdol
Joined: 23 Oct 02
Posts: 809
Credit: 1,980,117
RAC: 0
United States
Message 1578375 - Posted: 26 Sep 2014, 19:22:30 UTC - in response to Message 1578360.  
Last modified: 26 Sep 2014, 19:38:33 UTC

thanks Hal,
I understand the first point clearly and wouldn't want to run multiple wu's on a gpu.

That second item seems unclear. What would estimated times have to do with the actual running of a task? Why would the slower card even consider exiting based on a prefigured estimate to complete?

Is that one of the "features" of boinc that someone else mentioned before?
ID: 1578375
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1578426 - Posted: 26 Sep 2014, 20:09:47 UTC - in response to Message 1578375.  

thanks Hal,
I understand the first point clearly and wouldn't want to run multiple wu's on a gpu.

That second item seems unclear. What would estimated times have to do with the actual running of a task? Why would the slower card even consider exiting based on a prefigured estimate to complete?

Is that one of the "features" of boinc that someone else mentioned before?

Basically a mechanism thinks the task is taking much longer than it should, based on the estimate. Then it says "OK we are stopping this right now". IIRC the task is marked with an exit status of -177.
ID: 1578426
merle van osdol
Joined: 23 Oct 02
Posts: 809
Credit: 1,980,117
RAC: 0
United States
Message 1578453 - Posted: 26 Sep 2014, 20:35:17 UTC - in response to Message 1578426.  

lol

thanks Hal
ID: 1578453
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1578458 - Posted: 26 Sep 2014, 20:43:18 UTC - in response to Message 1578375.  

thanks Hal,
I understand the first point clearly and wouldn't want to run multiple wu's on a gpu.

That second item seems unclear. What would estimated times have to do with the actual running of a task? Why would the slower card even consider exiting based on a prefigured estimate to complete?

Is that one of the "features" of boinc that someone else mentioned before?

Yes, it's another feature. These features are the result of BOINC being App based versus Device based. All device results are lumped into the App statistics instead of BOINC having statistics for different devices. Therefore, two unequal cards result in BOINC using an average of the two devices. This will cause the fast card to receive low credits per task and the slow card to receive high credits per task, because the fast card finished faster than the estimate and the slower card finished slower than the estimate. As for the time_exceeded exit, it would take a very large difference because it allows for 10x longer than the estimate. You wouldn't have this problem unless you mixed in a card that took 10x longer to finish than the fastest card, and once the estimate was averaged over a number of runs you wouldn't have the problem.

I know a certain developer who believes BOINC should be Device Based...
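
To make the "average of the two devices" point concrete, here is a rough Python sketch. It assumes the per-app estimate is a plain arithmetic mean of every runtime from every device, which is only a stand-in; BOINC's real estimator is more involved. The function names and runtimes are hypothetical, and the credit calculation itself is not modelled.

```python
from statistics import mean


def shared_estimate(runtimes_by_device):
    """Pool every device's runtimes into one per-app estimate (plain mean)."""
    pooled = [t for runs in runtimes_by_device.values() for t in runs]
    return mean(pooled)


def runtime_ratio(runtimes_by_device):
    """Ratio of each device's own mean runtime to the pooled estimate."""
    estimate = shared_estimate(runtimes_by_device)
    return {dev: mean(runs) / estimate
            for dev, runs in runtimes_by_device.items()}


# Assumed runtimes in hours; not measurements from any real card.
runs = {"fast card": [1.0, 1.0, 1.0], "slow card": [5.0, 5.0, 5.0]}
estimate = shared_estimate(runs)
ratios = runtime_ratio(runs)
print(estimate)
print(ratios)
```

In this toy case the pooled estimate lands between the two cards, so the fast card always finishes well under it and the slow card always finishes over it. That mismatch is what the post above is describing.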
ID: 1578458
juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1578460 - Posted: 26 Sep 2014, 20:47:38 UTC

If you configure your host to run multiple instances of BOINC, you could work with any combination of GPUs (slow/mid/fast) without any problem.

But multiple instances are for advanced users only.
ID: 1578460
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1578485 - Posted: 26 Sep 2014, 21:17:48 UTC - in response to Message 1578453.  

lol

thanks Hal

The exit status is actually 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED & not 177. 177 is another one I remember but I forget the instance it occurs now.
ID: 1578485
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1578520 - Posted: 26 Sep 2014, 22:03:09 UTC - in response to Message 1578485.  

lol

thanks Hal

The exit status is actually 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED & not 177. 177 is another one I remember but I forget the instance it occurs now.

LoL, -177 *was* used for both time limit exceeded and memory limit exceeded. Now they're separate 197 and 198 errors.

@merle - The intended purpose of the time limit is to protect against an application running forever because it got hung in a loop on an unattended or seldom watched computer. The S@H setting used to be 10 times the estimate, but was changed to 20 times within the last year. That may be enough that few cases will be seen where a task is killed while nearing completion simply because the servers had an inflated idea of how fast the app version should run. But it is probably unsafe to use two GPUs of the same brand which differ in speed by a factor of 10 or more (I mean real speed, not the theoretical peak GFLOPS).
                                                                   Joe
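
The time limit described above can be sketched in a few lines of Python. This is a simplification under stated assumptions: the limit is modelled as a plain multiple of the estimated runtime, and the function names and the 0.5 h / 6 h figures are invented for the example. The exit numbers 197 and 198 and the name EXIT_TIME_LIMIT_EXCEEDED come from the thread; the constant name used for 198 is only a label for this sketch.

```python
# Exit statuses as given in the thread.
EXIT_TIME_LIMIT_EXCEEDED = 197    # 0xc5
EXIT_MEMORY_LIMIT_EXCEEDED = 198  # label chosen for this sketch


def time_limit_hours(estimated_hours, multiplier):
    """Longest a task may run before it is stopped."""
    return multiplier * estimated_hours


def exit_status(actual_hours, estimated_hours, multiplier):
    """Return 197 if the task overran its limit, else 0 for a normal finish."""
    if actual_hours > time_limit_hours(estimated_hours, multiplier):
        return EXIT_TIME_LIMIT_EXCEEDED
    return 0


# Assumed numbers: a 0.5 h estimate driven by a fast card,
# and a slow card that really needs 6 h for the same task.
old_rule = exit_status(6.0, 0.5, multiplier=10)  # limit is 5 h
new_rule = exit_status(6.0, 0.5, multiplier=20)  # limit is 10 h
print(old_rule, new_rule)
```

With these made-up numbers the slow card is 12 times slower than the estimate, so it would have been killed under the old 10x setting but finishes under 20x.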
ID: 1578520
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1578593 - Posted: 27 Sep 2014, 2:31:29 UTC

The most benefit might be to put the 6570 in place of the 4650 in your other machine. The lower power usage of 44W for the 6570 vs 48W of the 4650 isn't huge, but the processing times on the 6570 should be much better than ~18 hours. The 6370m in my notebook runs just under 6 hours.

Or perhaps your Core 2 Quad CPU Q8300 has a PCIe slot to put one of the cards in for crunching?
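
A quick back-of-the-envelope for the swap suggested above, as a Python sketch. The 44 W, 48 W and ~18 hour figures are from the post; the 6570 runtimes are pure assumptions, since no 6570 runtime is given in the thread, and the wattages are card ratings rather than measured draw.

```python
def energy_wh(watts, hours):
    """Energy used for one task, in watt-hours."""
    return watts * hours


# HD 4650 as described: 48 W for about 18 hours per task.
hd4650_wh = energy_wh(48, 18)

# HD 6570 at 44 W, for a few assumed runtimes (hours per task).
hd6570_wh = {hours: energy_wh(44, hours) for hours in (6, 9, 12)}

print(hd4650_wh)
print(hd6570_wh)
```

The 4 W difference barely matters on its own; what changes the energy per task is how long the card has to run.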
ID: 1578593
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1578597 - Posted: 27 Sep 2014, 2:49:30 UTC - in response to Message 1578593.  
Last modified: 27 Sep 2014, 2:54:01 UTC

The most benefit might be to put the 6570 in place of the 4650 in your other machine. The lower power usage of 44W for the 6570 vs 48W of the 4650 isn't huge, but the processing times on the 6570 should be much better than ~18 hours. The 6370m in my notebook runs just under 6 hours.

Or perhaps your Core 2 Quad CPU Q8300 has a PCIe slot to put one of the cards in for crunching?

I'm surprised his 4650 is even completing a task considering he has the unroll set over twice what it should be. It's so high it's running out of memory at unroll 18.
http://setiathome.berkeley.edu/result.php?resultid=3730825669
WARNING: current unroll value requires more memory than currently allowed for single buffer. Check your settings
ERROR: OpenCL kernel/call 'clCreateBuffer (ocl_global_buf1)' call failed (-61) in file ..\..\ap_science.cpp near line 128.
Waiting 30 sec before restart...

Normal run-time for an Unblanked AP on a 4650 is around 7 hours. All those restarts aren't helping one bit.
Also, since the ATI HD4650 can't run ATI MBs, why not rerun the Lunatics installer and NOT install ATI Multibeam so the server will stop sending ATI MB tasks which are then Aborted...
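
One way to spot this pattern without reading every result page by hand is to count the tell-tale lines in a task's stderr text. The Python sketch below is only an illustration: `summarize_stderr` is a hypothetical helper, and the sample is the three lines quoted above. For what it's worth, -61 is the value the OpenCL headers define as CL_INVALID_BUFFER_SIZE, which fits the "more memory than currently allowed for single buffer" warning.

```python
SAMPLE_STDERR = r"""WARNING: current unroll value requires more memory than currently allowed for single buffer. Check your settings
ERROR: OpenCL kernel/call 'clCreateBuffer (ocl_global_buf1)' call failed (-61) in file ..\..\ap_science.cpp near line 128.
Waiting 30 sec before restart...
"""


def summarize_stderr(text):
    """Count unroll warnings, failed OpenCL calls, and restarts in stderr text."""
    counts = {"unroll_warnings": 0, "opencl_failures": 0, "restarts": 0}
    for line in text.splitlines():
        if line.startswith("WARNING: current unroll value requires more memory"):
            counts["unroll_warnings"] += 1
        elif line.startswith("ERROR: OpenCL") and "call failed" in line:
            counts["opencl_failures"] += 1
        elif line.startswith("Waiting") and "before restart" in line:
            counts["restarts"] += 1
    return counts


summary = summarize_stderr(SAMPLE_STDERR)
print(summary)
```

A task whose stderr shows many restarts alongside the unroll warning is a good candidate for a lower -unroll value.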
ID: 1578597
merle van osdol
Joined: 23 Oct 02
Posts: 809
Credit: 1,980,117
RAC: 0
United States
Message 1578677 - Posted: 27 Sep 2014, 7:50:11 UTC

I don't have the -unroll line in this system at all.

The 6570 won't work in that computer. The mobo just spits it out.

I give up.
ID: 1578677
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1578679 - Posted: 27 Sep 2014, 7:55:18 UTC
Last modified: 27 Sep 2014, 7:56:16 UTC

1 of my i5 CPU cores can usually complete an AP in less than half that time and still quicker on a well running 1 on that hardware then. :-O

Cheers.
ID: 1578679
merle van osdol
Joined: 23 Oct 02
Posts: 809
Credit: 1,980,117
RAC: 0
United States
Message 1578706 - Posted: 27 Sep 2014, 9:17:40 UTC - in response to Message 1578679.  
Last modified: 27 Sep 2014, 9:24:36 UTC

Well I just retired that machine from seti. Maybe it will work somewhere else then. I didn't realize I was being such a burden. Sorry. :-)
ID: 1578706
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1578710 - Posted: 27 Sep 2014, 9:54:23 UTC

Don't let me put you off, but I just thought that I'd point that out. ;-)

Cheers.
ID: 1578710
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1578712 - Posted: 27 Sep 2014, 10:10:19 UTC - in response to Message 1578706.  
Last modified: 27 Sep 2014, 10:15:46 UTC

Well I just retired that machine from seti. Maybe it will work somewhere else then. I didn't realize I was being such a burden. Sorry. :-)

Well, no one called you a burden. That machine was just starting to produce more completed tasks than errors (aborts). No need to retire it :-)
You do realize some people crunch on their smartphone, so no matter how slow, completions are welcome. If you look at mister CPU you will see he used to run cards not much faster than your much cheaper 4650. As long as the card is faster than your CPU, and doesn't require much power, no harm.

The last 3 completions on that card show an -unroll setting of 18. It should run best around unroll 6 to 8. You might want to check it out. Not long ago an out of memory warning was fatal. The task would immediately error out with 'out_of_resources'. Nice to see it now allows the task a chance to complete.
ID: 1578712
merle van osdol
Joined: 23 Oct 02
Posts: 809
Credit: 1,980,117
RAC: 0
United States
Message 1578714 - Posted: 27 Sep 2014, 10:11:53 UTC - in response to Message 1578710.  

No problem my friend. I understand what you were saying. But my machine has been causing problems over and over again in the group and it was just time now to move it elsewhere. And thanks for your kind clarification. Peace.
ID: 1578714
merle van osdol
Joined: 23 Oct 02
Posts: 809
Credit: 1,980,117
RAC: 0
United States
Message 1578715 - Posted: 27 Sep 2014, 10:17:58 UTC

Hal,
That Acer is in a place, and it's one of those mini's, where I don't want a gpu card hanging out of it. But good idea though. Thanks.
ID: 1578715
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.