some basics in amd gpu computing
Author | Message |
---|---|
merle van osdol Joined: 23 Oct 02 Posts: 809 Credit: 1,980,117 RAC: 0 |
From what I understand so far, we don't want to combine high-end and low-end AMD GPUs in the same computer, for example an R7 265 with a 7770 and a 6570. Why is that? Is it because the AMD drivers are different for the different cards and we cannot use multiple AMD drivers on the same computer? Or is there something else I don't understand? |
woohoo Joined: 30 Oct 13 Posts: 972 Credit: 165,671,404 RAC: 5 |
if i had all three of those cards i would at least try them with the latest driver, that might work. using Lunatics i would only run one WU per GPU, because trying to run multiple WUs on the faster GPUs while simultaneously running one WU on the slower GPUs might require multiple instances of BOINC and i'm too lazy for that. there are also command line parameters that you can optionally use, but i don't know if you can apply different settings for each individual card. plus there could be timing issues as well. |
merle van osdol Joined: 23 Oct 02 Posts: 809 Credit: 1,980,117 RAC: 0 |
Thanks woohoo. Yeah, I am pretty sure that the reason I couldn't get that to work last time was that the 6570 was no longer supported by the current AMD driver needed for the newer cards. So my answer to myself is that you can only use one graphics driver on the same computer. Seems strange to me, but I'm not a CS person. |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
You don't want to mix fast and slow GPUs from any vendor. In the case of those 3 cards, they actually all use the same driver, so you can in fact use them all at the same time if you really want, but it isn't a good idea for a few reasons. The issues with mixing fast and slow cards come from how BOINC treats the GPUs. The top two that come to mind are: (1) The configuration for an app is common across all GPUs running that app, so configuring a fast GPU to run 2 tasks at once will also force the slower card to run 2 at once, which may cause it to choke and error. (2) BOINC only stores one estimated time per app, so your really fast GPU would cause the estimated time to be very low. Then when the slower card runs a task, it will be made to exit early because it "ran too long". SETI@home classic workunits: 93,865 CPU time: 863,447 hours. Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu |
merle van osdol Joined: 23 Oct 02 Posts: 809 Credit: 1,980,117 RAC: 0 |
Thanks Hal. I understand the first point clearly and wouldn't want to run multiple WUs on a GPU. The second item seems unclear. What would estimated times have to do with the actual running of a task? Why would the slower card even consider exiting based on a precomputed estimate of the time to complete? Is that one of the "features" of BOINC that someone else mentioned before? |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
Basically, a mechanism decides the task is taking much longer than it should, based on the estimate. Then it says "OK, we are stopping this right now". IIRC the task is marked with an exit status of -177. |
merle van osdol Joined: 23 Oct 02 Posts: 809 Credit: 1,980,117 RAC: 0 |
lol thanks Hal |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Yes, it's another feature. These features are the result of BOINC being app-based versus device-based. All device results are lumped into the app statistics instead of BOINC keeping statistics for the different devices. Therefore, two unequal cards result in BOINC using an average of the two devices. This will cause the fast card to receive low credits per task and the slow card to receive high credits per task, because the fast card finished faster than the estimate and the slow card finished slower than the estimate. As for the time_exceeded exit, it would take a very large difference, because a task is allowed to run 10x longer than the estimate. You wouldn't have this problem unless you mixed in a card that took 10x longer to finish than the fastest card, and even then, once the estimate had been averaged over a number of runs, the problem would go away. I know a certain developer who believes BOINC should be device-based... |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
If you configure your host to run multiple instances of BOINC, you can work with any combination of GPUs (slow/mid/fast) without any problem. But multiple instances are for advanced users only. |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
The exit status is actually 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED, not 177. 177 is another one I remember, but I forget now when it occurs. |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
LoL, -177 *was* used for both time limit exceeded and memory limit exceeded. Now they're separate errors, 197 and 198. @merle - The intended purpose of the time limit is to protect against an application running forever because it got hung in a loop on an unattended or seldom-watched computer. The S@H setting used to be 10 times the estimate, but was changed to 20 times within the last year. That may be enough that few cases will be seen where a task is killed while nearing completion simply because the servers had an inflated idea of how fast the app version should run. But it is probably unsafe to use two GPUs of the same brand which differ in speed by a factor of 10 or more (I mean real speed, not the theoretical peak GFLOPS). Joe |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
The biggest benefit might come from putting the 6570 in place of the 4650 in your other machine. The lower power usage of 44W for the 6570 vs 48W for the 4650 isn't huge, but the processing times on the 6570 should be much better than ~18 hours. The 6370m in my notebook runs just under 6. hours. Or perhaps your Core 2 Quad Q8300 machine has a PCIe slot to put one of the cards in for crunching? |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
I'm surprised his 4650 is even completing a task, considering he has the unroll set to over twice what it should be. It's so high that it's running out of memory at unroll 18. See http://setiathome.berkeley.edu/result.php?resultid=3730825669, which reports: "WARNING: current unroll value requires more memory than currently allowed for single buffer. Check your settings". Normal run time for an unblanked AP on a 4650 is around 7 hours. All those restarts aren't helping one bit. Also, since the ATI HD4650 can't run ATI MBs, why not rerun the Lunatics installer and NOT install ATI Multibeam, so the server will stop sending ATI MB tasks which are then aborted... |
merle van osdol Joined: 23 Oct 02 Posts: 809 Credit: 1,980,117 RAC: 0 |
I don't have the -unroll line in this system at all. The 6570 won't work in that computer. The mobo just spits it out. I give up. |
Wiggo Joined: 24 Jan 00 Posts: 34854 Credit: 261,360,520 RAC: 489 |
One of my i5 CPU cores can usually complete an AP in less than half that time, and it would still be quicker than a well-running one on that hardware. :-O Cheers. |
merle van osdol Joined: 23 Oct 02 Posts: 809 Credit: 1,980,117 RAC: 0 |
Well I just retired that machine from seti. Maybe it will work somewhere else then. I didn't realize I was being such a burden. Sorry. :-) |
Wiggo Joined: 24 Jan 00 Posts: 34854 Credit: 261,360,520 RAC: 489 |
Don't let me put you off, but I just thought that I'd point that out. ;-) Cheers. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Well, no one called you a burden. That machine was just starting to produce more completed tasks than errors (aborts). No need to retire it :-) You do realize some people crunch on their smartphones, so no matter how slow, completions are welcome. If you look at mister CPU you will see he used to run cards not much faster than your much cheaper 4650. As long as the card is faster than your CPU and doesn't require much power, no harm. The last 3 completions on that card show an -unroll setting of 18. It should run best at around unroll 6 to 8. You might want to check it out. Not long ago an out-of-memory warning was fatal: the task would immediately error out with 'out_of_resources'. Nice to see it now allows the task a chance to complete. |
merle van osdol Joined: 23 Oct 02 Posts: 809 Credit: 1,980,117 RAC: 0 |
No problem, my friend. I understand what you were saying. But my machine has been causing problems over and over again in the group, and it was just time to move it elsewhere. And thanks for your kind clarification. Peace. |
merle van osdol Joined: 23 Oct 02 Posts: 809 Credit: 1,980,117 RAC: 0 |
Hal, that Acer is one of those minis, and it sits in a place where I don't want a GPU card hanging out of it. Good idea though. Thanks. |
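
The time-limit behaviour that HAL9000, TBar and Josef W. Segur describe in the thread can be worked through with a little arithmetic. What follows is a rough Python sketch, not BOINC's actual code. It assumes the limit is simply the per-app estimate multiplied by the factor quoted in the thread (10x formerly, 20x now). The function names and the run times (0.5 hours for a fast card, 6 hours for a slow one) are made up for illustration; only the exit status 197 comes from the posts.

```python
# Simplified model of the per-app time limit discussed in the thread.
# Assumption: limit = estimate * factor. This is NOT BOINC's real code.

EXIT_TIME_LIMIT_EXCEEDED = 197  # exit status quoted in the thread (0xc5)


def time_limit_hours(estimate_hours, factor):
    """Longest run time tolerated before the task is stopped."""
    return estimate_hours * factor


def exceeds_limit(runtime_hours, estimate_hours, factor):
    """True if a task with this run time would be stopped."""
    return runtime_hours > time_limit_hours(estimate_hours, factor)


# Hypothetical run times for one task on each card (illustrative only).
fast_gpu_hours = 0.5
slow_gpu_hours = 6.0

# Worst case: the shared estimate reflects only the fast card.
estimate = fast_gpu_hours
aborted_at_10x = exceeds_limit(slow_gpu_hours, estimate, 10)  # 6.0 > 5.0
aborted_at_20x = exceeds_limit(slow_gpu_hours, estimate, 20)  # 6.0 > 10.0

# After both cards have reported results, the estimate drifts towards
# an average of the two, which raises the limit for the slow card.
averaged_estimate = (fast_gpu_hours + slow_gpu_hours) / 2
aborted_after_averaging = exceeds_limit(slow_gpu_hours, averaged_estimate, 10)

print("stopped at 10x, fast-card estimate:", aborted_at_10x)
print("stopped at 20x, fast-card estimate:", aborted_at_20x)
print("stopped at 10x, averaged estimate: ", aborted_after_averaging)
```

With these made-up numbers the slow card is 12 times slower than the fast one, so it is only stopped under the old 10x factor while the estimate still reflects the fast card alone. Under 20x, or once the estimate has been averaged, the task is allowed to finish, which matches the points made by TBar and Joe.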