How to set single host resource share? (CUDA!)

Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 20389
Credit: 7,508,002
RAC: 20
United Kingdom
Message 872022 - Posted: 4 Mar 2009, 13:48:18 UTC - in response to Message 871773.  
Last modified: 4 Mar 2009, 13:50:34 UTC

It has no s@h work.

It does have a few CPDN WUs lined up. I have half of those suspended to avoid running in EDF (earliest deadline first) mode. Two are running at present. If I 'resume' the suspended (not yet started) tasks, the two running CPDN tasks immediately go into EDF.


Is work fetch suspended for the GPU if the CPU is maxed out on work yet to be done?...

OK...

So I aborted the suspended CPDN work and... a few work fetch requests later, a few s@h WUs have been downloaded and are now being worked on.

So... Looks like a quirk of the scheduler logic. If the CPU is considered maxed out on current (and suspended) work, then the GPU can be starved of work. Yet "EDF" does not consider suspended work, so you can push the client out of EDF by suspending WUs.
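To make the suspected quirk concrete, here is a toy model in Python. It is only a sketch of the behaviour observed on this host, not BOINC's actual scheduler code: the `Task` class, the function names, the pooled-core simulation and all the numbers are made up for illustration. The one assumption that matters is that the work-fetch check counts suspended tasks while the EDF check ignores them.

```python
from dataclasses import dataclass


@dataclass
class Task:
    project: str
    remaining_hours: float   # estimated CPU hours still needed (made-up value)
    deadline_hours: float    # hours from now until the deadline (made-up value)
    suspended: bool = False  # suspended by the user


def misses_deadline(tasks, ncpus, include_suspended):
    """Crude simulation: run tasks in deadline order on ncpus pooled cores."""
    considered = [t for t in tasks if include_suspended or not t.suspended]
    considered.sort(key=lambda t: t.deadline_hours)
    elapsed = 0.0
    for t in considered:
        elapsed += t.remaining_hours / ncpus
        if elapsed > t.deadline_hours:
            return True
    return False


def in_edf(tasks, ncpus):
    # Assumption: the EDF check ignores user-suspended tasks.
    return misses_deadline(tasks, ncpus, include_suspended=False)


def may_fetch_work(tasks, ncpus):
    # Assumption: the work-fetch check counts suspended tasks too.
    return not misses_deadline(tasks, ncpus, include_suspended=True)


def make_queue(n_runnable, n_suspended):
    # Hypothetical CPDN tasks: 1000 CPU hours left, deadline 1500 hours away.
    runnable = [Task("CPDN", 1000.0, 1500.0) for _ in range(n_runnable)]
    parked = [Task("CPDN", 1000.0, 1500.0, suspended=True) for _ in range(n_suspended)]
    return runnable + parked


if __name__ == "__main__":
    cases = [
        ("all four runnable", make_queue(4, 0)),
        ("two suspended", make_queue(2, 2)),
        ("suspended tasks aborted", make_queue(2, 0)),
    ]
    for label, queue in cases:
        print(label, "| EDF:", in_edf(queue, 2), "| may fetch:", may_fetch_work(queue, 2))
```

Under these assumptions, suspending two tasks takes the host out of EDF but still blocks work fetch, and only aborting them lets new work in. That matches what happened here, but it says nothing about what the real client does internally.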

Also, the (expected) CPU fraction for feeding the GPU is preset in the xml... Reality varies and is different!

A good question is how the user-specified resource share should be balanced across multiple resources when there is more than one, or when there are alternative bottlenecks to performance (in this case the CPU is maxed out and GPU utilisation is dependent on the CPU).
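For reference, resource share is a relative weight: a project's nominal fraction of the host is its share divided by the sum of the shares of all attached projects. A minimal Python sketch of that arithmetic follows. The function name and the share values are hypothetical, and the sketch deliberately does not say how that fraction is then split between CPU and GPU, which is the open question.

```python
def share_fractions(resource_shares):
    """Turn relative resource shares into nominal fractions of the host."""
    total = sum(resource_shares.values())
    if total <= 0:
        raise ValueError("at least one project needs a positive resource share")
    return {name: share / total for name, share in resource_shares.items()}


if __name__ == "__main__":
    # Hypothetical shares for the two projects mentioned in this thread.
    shares = {"SETI@home": 300, "CPDN": 100}
    for name, fraction in share_fractions(shares).items():
        print(f"{name}: {fraction:.0%}")
```

With the hypothetical values above, SETI@home gets a nominal 75% and CPDN 25%.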


And there is still my original question of how you can set unique resource shares for a host rather than being limited to the default-home-school-work values?

Happy crunchin',
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 872022
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 872025 - Posted: 4 Mar 2009, 13:55:46 UTC - in response to Message 872022.  

> And there is still my original question of how you can set unique resource shares for a host rather than being limited to the default-home-school-work values?

BAM! has user-made venues. Otherwise, you're out of luck, I think.
ID: 872025
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 872292 - Posted: 5 Mar 2009, 2:38:10 UTC - in response to Message 871982.  

>> Any project that has a contact backoff will not be eligible.

> Even when it's the only project allowed to fetch work and it has no work at that moment?

Yes. If it has a backoff, it is not allowed to fetch work.
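
As a rough illustration of that rule (a sketch only, not the client's real code; the dictionary fields and the function name are hypothetical), eligibility can be thought of as a simple filter on the backoff deadline:

```python
def eligible_for_work_fetch(projects, now):
    """Return names of projects whose contact backoff has already expired."""
    return [p["name"] for p in projects if p["backoff_until"] <= now]


if __name__ == "__main__":
    now = 100.0  # seconds on an arbitrary clock
    # Hypothetical state: the only project allowed to fetch work is backed off.
    projects = [{"name": "SETI@home", "backoff_until": 3600.0}]
    print(eligible_for_work_fetch(projects, now))
```

In this sketch a backed-off project is skipped even when it is the only candidate, so the result can be an empty list until the backoff expires.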


BOINC WIKI
ID: 872292
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 872359 - Posted: 5 Mar 2009, 8:25:08 UTC - in response to Message 872292.  

Since we just learned that the anonymous platform mechanism is expecting to find CPU information only in the app_info.xml file, you think that that's at the start of Martin's problem?

(BOINC 6.6 does not support CUDA through the app_info.xml)
ID: 872359
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 20389
Credit: 7,508,002
RAC: 20
United Kingdom
Message 872378 - Posted: 5 Mar 2009, 10:20:28 UTC - in response to Message 872359.  

> Since we just learned that the anonymous platform mechanism is expecting to find CPU information only in the app_info.xml file, you think that that's at the start of Martin's problem?
>
> (BOINC 6.6 does not support CUDA through the app_info.xml)

Now that gets even more confusing because it appears to be getting WUs and returning results via CUDA...

For example: Task 1178537956

Happy crunchin',
Martin

ID: 872378
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 20389
Credit: 7,508,002
RAC: 20
United Kingdom
Message 872380 - Posted: 5 Mar 2009, 10:23:29 UTC
Last modified: 5 Mar 2009, 10:24:59 UTC

Aside: Note that the original problem lies in how the scheduler adds up work for CPU tasks and GPU tasks, and in how it treats user-suspended tasks.

The refusal to request more work was cleared by aborting the queued-up CPDN tasks. Suspending those tasks merely removed the EDF status; it did not allow any s@h WUs to be downloaded.

There are still the anomalous server-side messages about not completing work in time, even though BOINC gets 100% CPU time!


A local resource share override would be rather nice ;-)

Cheers,
Martin
ID: 872380
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 872627 - Posted: 5 Mar 2009, 21:25:24 UTC - in response to Message 872359.  

> Since we just learned that the anonymous platform mechanism is expecting to find CPU information only in the app_info.xml file, you think that that's at the start of Martin's problem?
>
> (BOINC 6.6 does not support CUDA through the app_info.xml)

The latest word is that the client does, but the server does not really support it yet. It should only take a couple of days to get the CUDA anonymous platform support to work correctly at the server.


ID: 872627
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.