How to set single host resource share? (CUDA!)

ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20331
Credit: 7,508,002
RAC: 20
United Kingdom
Message 871298 - Posted: 2 Mar 2009, 12:39:10 UTC

Well, I've just seen the RAC for my CUDA GPU test jump towards a RAC of 3000 and... then suddenly come to a halt.

I'm guessing that the boinc client isn't asking for any more s@h work because the RAC (or resource non-debt) has rocketed way past the resource share set for s@h.


So...

How can I set a resource share individually for just one host?

Happy crunchin',
Martin

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)

AspirationTower
Joined: 4 Jan 06
Posts: 97
Credit: 946,069
RAC: 0
United Kingdom
Message 871326 - Posted: 2 Mar 2009, 14:05:22 UTC
Last modified: 2 Mar 2009, 14:10:23 UTC

Perhaps, like me, you have Astropulse work units ticked ON in your settings on the SETI website under your Account, so the GPU is flying through the CUDA work and then slowing to a crawl on the Astropulse stuff. It will only get more CUDA work when the Astropulse work is low enough for the SETI part of your BOINC client to ask for more. Then again, your CUDA jobs will finish fast and leave your GPUs waiting in the wings for something to do.

If you want to set the SETI project to do all the CUDA work and stay fast (and also shove your credits through the roof), then you will need to take the ticks out of the Astropulse options. But if everyone does that, the project will have a hard time getting anyone to do the Astropulse work at all.
Resistence is futile! {evil cackle}

ML1
Message 871346 - Posted: 2 Mar 2009, 15:26:57 UTC - in response to Message 871326.  
Last modified: 2 Mar 2009, 15:28:09 UTC

Nice guess but that's not the case. APs are disabled in my preferences and there are none running.

As a wild guess, I'm now trying BOINC 6.6.11 on this Linux system. Same result.

I've also tried tweaking the client_state.xml LTD values for s@h... Which way should they go to have more work requested?

Interestingly, I'm now getting:

02-Mar-2009 15:24:42 [SETI@home] Sending scheduler request: To fetch work.
02-Mar-2009 15:24:42 [SETI@home] Requesting new tasks
02-Mar-2009 15:24:47 [SETI@home] Scheduler request completed: got 0 new tasks
02-Mar-2009 15:24:47 [SETI@home] Message from server: No work sent
02-Mar-2009 15:24:47 [SETI@home] Message from server: No work available for the applications you have selected.  Please check your settings on the web site.
02-Mar-2009 15:24:47 [SETI@home] Message from server: (won't finish in time) BOINC runs 99.7% of time, computation enabled 100.0% of that


?!

Help?!

Happy crunchin',
Martin

Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 871473 - Posted: 2 Mar 2009, 22:28:22 UTC - in response to Message 871346.  
Last modified: 2 Mar 2009, 22:29:36 UTC

Can you link to that computer, please?

And also state the lines for

% of time BOINC client is running
While BOINC running, % of time host has an Internet connection
While BOINC running, % of time work is allowed
Task duration correction factor

AspirationTower
Message 871542 - Posted: 3 Mar 2009, 0:18:16 UTC - in response to Message 871346.  

Well, I don't run Linux, and "client_state.xml LTD values for s@h" means nothing to me, so I shall bow out and let Ageless take over. It looks like he needs to see your computer, though, which is hidden by your settings. Good luck with solving the issue.

ML1
Message 871691 - Posted: 3 Mar 2009, 11:45:35 UTC - in response to Message 871473.  
Last modified: 3 Mar 2009, 11:49:11 UTC

Can you link to that computer, please?

And also state the lines for

% of time BOINC client is running
While BOINC running, % of time host has an Internet connection
While BOINC running, % of time work is allowed
Task duration correction factor

Host 4606186 (Can individual hosts be unhidden?)

Measured floating point speed                  2556.06 million ops/sec
Measured integer speed                         6391.82 million ops/sec
Average upload rate                            21.83 KB/sec
Average download rate                          39.1 KB/sec
Average turnaround time                        14.98 days
Maximum daily WU quota per CPU                 100/day
Tasks                                          197
Number of times client has contacted server    1711
Last time contacted server                     3 Mar 2009 1:33:16 UTC
% of time BOINC client is running              99.6199 %
While BOINC running, % of time work is allowed 99.9895 %
Average CPU efficiency                         0.985038
Task duration correction factor                0.27724


Note that there are no WUs waiting on the host, although there are quite a few pending credits on the Berkeley servers.

The DCF doesn't look to be so extreme as to be a problem... So...?

The average turnaround time is from a glut that the servers delivered when I first tried CUDA. I've actually got only a 0.5 day cache set.

Running BOINC 6.6.11

Cheers,
Martin

Jord
Message 871693 - Posted: 3 Mar 2009, 12:06:49 UTC - in response to Message 871691.  

Oh, hold on, where was I with my head? Linux... Seti doesn't have a Linux CUDA app yet. (That I know of, at least) .. are you using a 3rd party application? Ah, yes. I see you're using Crunch3r's version.

The other thing I see on that host is that you aborted a lot of tasks. ALL APs? (If so, why not just disable getting them for this machine through a different venue?)

Open up BOINC Manager->Advanced view->Projects tab
Select Seti->click Properties.

What are your scheduling entries here and their numbers?

ML1
Message 871697 - Posted: 3 Mar 2009, 12:35:06 UTC - in response to Message 871693.  
Last modified: 3 Mar 2009, 12:36:24 UTC

Oh, hold on, where was I with my head? Linux... Seti doesn't have a Linux CUDA app yet. (That I know of, at least) .. are you using a 3rd party application? Ah, yes. I see you're using Crunch3r's version.

Yep, crunching with Crunch3r's version.

The other thing I see on that host is that you aborted a lot of tasks. ALL APs? (If so, why not just disable getting them for this machine through a different venue?)

Not quite... I aborted a few APs but most were MBs... About 1000 or so. There were twice as many as could be completed in time.

The preferences are selected for MB only and GPU is enabled.


What are your scheduling entries here and their numbers?

5 for s@h, 40 for CPDN.

Hence the guess that the problem is that the GPU has clocked up an excess of time and so s@h is now on hold until the debt proportions settle. The host has been in EDF for about two months. Hence... Which way should the STD & LTD numbers go in the state file? I'm after a manual frig to get s@h crunching again on the GPU.


What this might show is that really we need separate resource shares preferences for each available resource. Or at least have each available resource considered separately with the one set resource shares values.

My intention here is to keep the GPU 100% utilised with s@h and the CPU fully utilised with CPDN...

Cheers,
Martin

Jord
Message 871707 - Posted: 3 Mar 2009, 13:28:58 UTC - in response to Message 871697.  

you wrote:
me wrote:
What are your scheduling entries here and their numbers?

5 for s@h, 40 for CPDN.

Hmmm... what I meant was, what are the actual entries under the scheduling section in those properties, for SETI only, and what are the numbers displayed there? Take a screenshot, or write them down on some paper.

They are in client_state.xml as well:
<short_term_debt>
<long_term_debt>
<cpu_backoff_interval>
<cpu_backoff_time>
<cuda_debt>
<cuda_backoff_interval>
<cuda_backoff_time>
<duration_correction_factor>
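
For anyone who would rather not hunt through the file by eye, here is a minimal Python sketch that pulls just the tags listed above out of one <project> block. It assumes the block parses as well-formed XML; the function name `scheduling_entries` and the sample values are made up for illustration and are not part of BOINC.

```python
import xml.etree.ElementTree as ET

# The scheduling-related tags named in the post above.
SCHEDULING_TAGS = [
    "short_term_debt",
    "long_term_debt",
    "cpu_backoff_interval",
    "cpu_backoff_time",
    "cuda_debt",
    "cuda_backoff_interval",
    "cuda_backoff_time",
    "duration_correction_factor",
]


def scheduling_entries(project_xml: str) -> dict:
    """Return the scheduling values found in one <project> element as floats."""
    project = ET.fromstring(project_xml)
    values = {}
    for tag in SCHEDULING_TAGS:
        text = project.findtext(tag)
        if text is not None:  # tags missing from the block are simply skipped
            values[tag] = float(text)
    return values


# Hypothetical sample block; the numbers are invented for this example.
SAMPLE = """
<project>
    <project_name>Example@home</project_name>
    <short_term_debt>0.000000</short_term_debt>
    <long_term_debt>0.000000</long_term_debt>
    <cuda_debt>-1234.500000</cuda_debt>
    <cuda_backoff_interval>600.000000</cuda_backoff_interval>
</project>
"""

if __name__ == "__main__":
    for tag, value in scheduling_entries(SAMPLE).items():
        print(f"{tag}: {value}")
```

This only reads values. Whether a whole client_state.xml loads cleanly with a strict XML parser is an assumption worth checking on a copy of the file first.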


ML1
Message 871722 - Posted: 3 Mar 2009, 14:09:03 UTC - in response to Message 871707.  
Last modified: 3 Mar 2009, 14:09:41 UTC

Sorry, misinterpreted...

<project>
    <master_url>http://setiathome.berkeley.edu/</master_url>
    <project_name>SETI@home</project_name>
    <symstore></symstore>
    <user_name>ML1</user_name>
    <team_name>The Planetary Society</team_name>
    <host_venue>home</host_venue>
    <email_hash>[...]</email_hash>
    <cross_project_id>[...]</cross_project_id>
    <cpid_time>1006694916.000000</cpid_time>
    <user_total_credit>917201.997884</user_total_credit>
    <user_expavg_credit>3000.027976</user_expavg_credit>
    <user_create_time>1006694916.000000</user_create_time>
    <rpc_seqno>1712</rpc_seqno>
    <hostid>4606186</hostid>
    <host_total_credit>47363.107736</host_total_credit>
    <host_expavg_credit>1132.604976</host_expavg_credit>
    <host_create_time>1223166759.000000</host_create_time>
    <nrpc_failures>0</nrpc_failures>
    <master_fetch_failures>0</master_fetch_failures>
    <min_rpc_time>1236044010.999537</min_rpc_time>
    <next_rpc_time>0.000000</next_rpc_time>
    <short_term_debt>0.000000</short_term_debt>
    <long_term_debt>0.000000</long_term_debt>
    <cpu_backoff_interval>49152.000000</cpu_backoff_interval>
    <cpu_backoff_time>1236093151.999537</cpu_backoff_time>
    <cuda_debt>-162932.570000</cuda_debt>
    <cuda_backoff_interval>86400.000000</cuda_backoff_interval>
    <cuda_backoff_time>1236105762.419611</cuda_backoff_time>
    <resource_share>5.000000</resource_share>
    <duration_correction_factor>0.277244</duration_correction_factor>
    <sched_rpc_pending>0</sched_rpc_pending>
    <send_time_stats_log>0</send_time_stats_log>
    <send_job_log>0</send_job_log>
    <ams_resource_share>0.000000</ams_resource_share>
    <scheduler_url>http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi</scheduler_url>
    <code_sign_key>
1024
[...]
.
</code_sign_key>
</project>


Soo... What do the numbers mean?

Cheers,
Martin

Jord
Message 871728 - Posted: 3 Mar 2009, 14:29:54 UTC - in response to Message 871722.  
Last modified: 3 Mar 2009, 14:32:44 UTC

I hope I am interpreting the numbers correctly, else I am sure JM7 will tell you and me what's wrong. :-)

<cpu_backoff_interval>49152.000000</cpu_backoff_interval>

The deferral time in seconds (13h 39m 12s), defined by BOINC. It completely ignores any back-off time set by the project. Fun, huh?

<cpu_backoff_time>1236093151.999537</cpu_backoff_time>

This one tells you that you're still backed off until Tue, 03 Mar 2009 15:12:31 GMT (it's Unix time; use a Unix time converter to read it).

If no work is gotten then, you'll be deferred for another interval, possibly those 13h 39m again, but probably more than that. After 11 consecutive failed attempts to get work, you'll be deferred for 24 hours. Constantly. (So you don't DDoS the project servers... but hum, who would ever want to update their BOINC when it'll do this?)

<cuda_debt>-162932.570000</cuda_debt>

The debt that CUDA has gotten after its last round of work. All debts now have a maximum of zero, the closer they are to zero, the more chance they have to get work and do work.

<cuda_backoff_interval>86400.000000</cuda_backoff_interval>

The deferral time of 24 hours on trying to get new work for CUDA, set by BOINC.
It'll only try to get work from Seti once every this amount of seconds... meaning that:

<cuda_backoff_time>1236105762.419611</cuda_backoff_time>

We still have till Tue, 03 Mar 2009 18:42:42 GMT to go before we try again... if no work for CUDA is gotten then, you're automatically backing off for another 24 hours (!!)
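
The Unix-time values and intervals above can also be checked locally. A minimal Python sketch, using only the standard library; the helper names `unix_to_utc` and `format_interval` are illustrative and not anything BOINC provides:

```python
from datetime import datetime, timezone


def unix_to_utc(timestamp: float) -> datetime:
    """Convert a Unix timestamp (seconds since 1970-01-01 UTC) to an aware datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def format_interval(seconds: float) -> str:
    """Render a back-off interval given in seconds as 'Hh MMm SSs'."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


if __name__ == "__main__":
    fmt = "%a, %d %b %Y %H:%M:%S UTC"
    # Values copied from the <project> block quoted in this thread.
    print(unix_to_utc(1236093151.999537).strftime(fmt))  # cpu_backoff_time  -> 03 Mar 2009 15:12:31
    print(unix_to_utc(1236105762.419611).strftime(fmt))  # cuda_backoff_time -> 03 Mar 2009 18:42:42
    print(format_interval(49152.0))  # cpu_backoff_interval  -> 13h 39m 12s
    print(format_interval(86400.0))  # cuda_backoff_interval -> 24h 00m 00s
```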

Soo... What do the numbers mean?

Thus, said the raven.

(edit: fixed BBCode tags)

ML1
Message 871736 - Posted: 3 Mar 2009, 15:02:58 UTC - in response to Message 871728.  

Thanks for all that.

Soo... I adjusted the backoff times and debt accordingly and the restarted Boinc didst quoth:

03-Mar-2009 14:59:17 [SETI@home] Sending scheduler request: To fetch work.
03-Mar-2009 14:59:17 [SETI@home] Requesting new tasks
03-Mar-2009 14:59:22 [SETI@home] Scheduler request completed: got 0 new tasks
03-Mar-2009 14:59:33 [SETI@home] Sending scheduler request: To fetch work.
03-Mar-2009 14:59:33 [SETI@home] Requesting new tasks
03-Mar-2009 14:59:38 [SETI@home] Scheduler request completed: got 0 new tasks
03-Mar-2009 14:59:38 [SETI@home] Message from server: No work sent
03-Mar-2009 14:59:38 [SETI@home] Message from server: No work available for the applications you have selected.  Please check your settings on the web site.
03-Mar-2009 14:59:38 [SETI@home] Message from server: (won't finish in time) BOINC runs 99.6% of time, computation enabled 100.0% of that



Have I stumbled across a server-side calculation glitch?

??

Cheers,
Martin

Jord
Message 871749 - Posted: 3 Mar 2009, 15:44:02 UTC - in response to Message 871736.  

and the restarted Boinc didst quoth

LOL, thanks for that. :-)

Can you turn on the <work_fetch_debug>, <debt_debug> and <sched_op_debug> flags in cc_config.xml and do a repeat request for work, please? Let's see what that one gives.

ML1
Message 871757 - Posted: 3 Mar 2009, 16:02:01 UTC - in response to Message 871749.  

Wheee, you don't ask for much! ;-)

With:

<cc_config>
<options>
        <save_stats_days>365</save_stats_days>
</options>
<log_flags>
        <work_fetch_debug>1</work_fetch_debug>
        <debt_debug>1</debt_debug>
        <sched_op_debug>1</sched_op_debug>
</log_flags>
</cc_config>


Boinc thence didst spew forth:

03-Mar-2009 15:55:49 [---] Re-reading cc_config.xml
03-Mar-2009 15:55:49 [---] [work_fetch_debug] Request work fetch: Core client configuration
03-Mar-2009 15:55:50 [---] [debt] CPU: no eligible projects
03-Mar-2009 15:55:50 [---] [debt] CUDA: no eligible projects
03-Mar-2009 15:55:53 [---] [wfd] ------- start work fetch state -------
03-Mar-2009 15:55:53 [---] [wfd] target work buffer: 86400.00 sec
03-Mar-2009 15:55:53 [---] [wfd] CPU: shortfall 0.00 nidle 0.00 est. delay 16928964.82 RS fetchable 0.00 runnable 40.00
03-Mar-2009 15:55:53 [Cosmology@Home] [wfd] CPU: runshare 0.00 debt -1722087.65 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
03-Mar-2009 15:55:53 [climateprediction.net] [wfd] CPU: runshare 1.00 debt 0.00 backoff dt 0.00 int 60.00 (no new tasks)
03-Mar-2009 15:55:53 [APS@Home] [wfd] CPU: runshare 0.00 debt -1644455.40 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
03-Mar-2009 15:55:53 [lhcathome] [wfd] CPU: runshare 0.00 debt -1631355.64 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
03-Mar-2009 15:55:53 [ABC@home] [wfd] CPU: runshare 0.00 debt -1722232.67 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
03-Mar-2009 15:55:53 [Artificial Intelligence System] [wfd] CPU: runshare 0.00 debt -1740096.00 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
03-Mar-2009 15:55:53 [orbit@home] [wfd] CPU: runshare 0.00 debt -1592398.42 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
03-Mar-2009 15:55:53 [Milkyway@home] [wfd] CPU: runshare 0.00 debt -1723367.91 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
03-Mar-2009 15:55:53 [SETI@home] [wfd] CPU: runshare 0.00 debt 0.00 backoff dt 83025.41 int 86400.00
03-Mar-2009 15:55:53 [PrimeGrid] [wfd] CPU: runshare 0.00 debt -1744595.04 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
03-Mar-2009 15:55:53 [boincsimap] [wfd] CPU: runshare 0.00 debt -1575645.35 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
03-Mar-2009 15:55:53 [---] [wfd] CUDA: shortfall 86400.00 nidle 1.00 est. delay 0.00 RS fetchable 0.00 runnable 0.00
03-Mar-2009 15:55:53 [Cosmology@Home] [wfd] CUDA: runshare 0.00 debt -87370.79 backoff dt 0.00 int 1920.00 (no new tasks)
03-Mar-2009 15:55:53 [climateprediction.net] [wfd] CUDA: runshare 0.00 debt 0.00 backoff dt 0.00 int 86400.00 (no new tasks)
03-Mar-2009 15:55:53 [APS@Home] [wfd] CUDA: runshare 0.00 debt -87370.79 backoff dt 0.00 int 1920.00 (no new tasks)
03-Mar-2009 15:55:53 [lhcathome] [wfd] CUDA: runshare 0.00 debt -87370.79 backoff dt 0.00 int 240.00 (no new tasks)
03-Mar-2009 15:55:53 [ABC@home] [wfd] CUDA: runshare 0.00 debt -87370.79 backoff dt 0.00 int 7680.00 (no new tasks)
03-Mar-2009 15:55:53 [Artificial Intelligence System] [wfd] CUDA: runshare 0.00 debt -87370.79 backoff dt 0.00 int 3840.00 (no new tasks)
03-Mar-2009 15:55:53 [orbit@home] [wfd] CUDA: runshare 0.00 debt -87370.79 backoff dt 0.00 int 1920.00 (no new tasks)
03-Mar-2009 15:55:53 [Milkyway@home] [wfd] CUDA: runshare 0.00 debt -87370.79 backoff dt 0.00 int 1920.00 (no new tasks)
03-Mar-2009 15:55:53 [SETI@home] [wfd] CUDA: runshare 0.00 debt -1.63 backoff dt 83009.31 int 86400.00
03-Mar-2009 15:55:53 [PrimeGrid] [wfd] CUDA: runshare 0.00 debt -87370.79 backoff dt 0.00 int 3840.00 (no new tasks)
03-Mar-2009 15:55:53 [boincsimap] [wfd] CUDA: runshare 0.00 debt -87370.79 backoff dt 0.00 int 1920.00 (no new tasks)
03-Mar-2009 15:55:53 [Cosmology@Home] [wfd] overall_debt -2291655
03-Mar-2009 15:55:53 [climateprediction.net] [wfd] overall_debt 0
03-Mar-2009 15:55:53 [APS@Home] [wfd] overall_debt -2214023
03-Mar-2009 15:55:53 [lhcathome] [wfd] overall_debt -2200923
03-Mar-2009 15:55:53 [ABC@home] [wfd] overall_debt -2291800
03-Mar-2009 15:55:53 [Artificial Intelligence System] [wfd] overall_debt -2309663
03-Mar-2009 15:55:53 [orbit@home] [wfd] overall_debt -2161966
03-Mar-2009 15:55:53 [Milkyway@home] [wfd] overall_debt -2292935
03-Mar-2009 15:55:53 [SETI@home] [wfd] overall_debt -11
03-Mar-2009 15:55:53 [PrimeGrid] [wfd] overall_debt -2314163
03-Mar-2009 15:55:53 [boincsimap] [wfd] overall_debt -2145213
03-Mar-2009 15:55:53 [---] [wfd] ------- end work fetch state -------
03-Mar-2009 15:55:53 [---] No project chosen for work fetch
03-Mar-2009 15:56:50 [---] [debt] CPU: no eligible projects
03-Mar-2009 15:56:50 [---] [debt] CUDA: no eligible projects
03-Mar-2009 15:56:51 [---] [debt] CPU: no eligible projects
03-Mar-2009 15:56:51 [---] [debt] CUDA: no eligible projects
03-Mar-2009 15:56:55 [---] [wfd] ------- start work fetch state -------
03-Mar-2009 15:56:55 [---] [wfd] target work buffer: 86400.00 sec
03-Mar-2009 15:56:55 [---] [wfd] CPU: shortfall 0.00 nidle 0.00 est. delay 16928487.80 RS fetchable 0.00 runnable 40.00
03-Mar-2009 15:56:55 [Cosmology@Home] [wfd] CPU: runshare 0.00 debt -1722087.65 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
03-Mar-2009 15:56:55 [climateprediction.net] [wfd] CPU: runshare 1.00 debt 0.00 backoff dt 0.00 int 60.00 (no new tasks)
03-Mar-2009 15:56:55 [APS@Home] [wfd] CPU: runshare 0.00 debt -1644455.40 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
03-Mar-2009 15:56:55 [lhcathome] [wfd] CPU: runshare 0.00 debt -1631355.64 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
03-Mar-2009 15:56:55 [ABC@home] [wfd] CPU: runshare 0.00 debt -1722232.67 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
03-Mar-2009 15:56:55 [Artificial Intelligence System] [wfd] CPU: runshare 0.00 debt -1740096.00 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
03-Mar-2009 15:56:55 [orbit@home] [wfd] CPU: runshare 0.00 debt -1592398.42 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
03-Mar-2009 15:56:55 [Milkyway@home] [wfd] CPU: runshare 0.00 debt -1723367.91 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
03-Mar-2009 15:56:55 [SETI@home] [wfd] CPU: runshare 0.00 debt 0.00 backoff dt 82963.15 int 86400.00
03-Mar-2009 15:56:55 [PrimeGrid] [wfd] CPU: runshare 0.00 debt -1744595.04 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
03-Mar-2009 15:56:55 [boincsimap] [wfd] CPU: runshare 0.00 debt -1575645.35 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
03-Mar-2009 15:56:55 [---] [wfd] CUDA: shortfall 86400.00 nidle 1.00 est. delay 0.00 RS fetchable 0.00 runnable 0.00
03-Mar-2009 15:56:55 [Cosmology@Home] [wfd] CUDA: runshare 0.00 debt -87370.79 backoff dt 0.00 int 1920.00 (no new tasks)
03-Mar-2009 15:56:55 [climateprediction.net] [wfd] CUDA: runshare 0.00 debt 0.00 backoff dt 0.00 int 86400.00 (no new tasks)
03-Mar-2009 15:56:55 [APS@Home] [wfd] CUDA: runshare 0.00 debt -87370.79 backoff dt 0.00 int 1920.00 (no new tasks)
03-Mar-2009 15:56:55 [lhcathome] [wfd] CUDA: runshare 0.00 debt -87370.79 backoff dt 0.00 int 240.00 (no new tasks)
03-Mar-2009 15:56:55 [ABC@home] [wfd] CUDA: runshare 0.00 debt -87370.79 backoff dt 0.00 int 7680.00 (no new tasks)
03-Mar-2009 15:56:55 [Artificial Intelligence System] [wfd] CUDA: runshare 0.00 debt -87370.79 backoff dt 0.00 int 3840.00 (no new tasks)
03-Mar-2009 15:56:55 [orbit@home] [wfd] CUDA: runshare 0.00 debt -87370.79 backoff dt 0.00 int 1920.00 (no new tasks)
03-Mar-2009 15:56:55 [Milkyway@home] [wfd] CUDA: runshare 0.00 debt -87370.79 backoff dt 0.00 int 1920.00 (no new tasks)
03-Mar-2009 15:56:55 [SETI@home] [wfd] CUDA: runshare 0.00 debt -1.63 backoff dt 82947.05 int 86400.00
03-Mar-2009 15:56:55 [PrimeGrid] [wfd] CUDA: runshare 0.00 debt -87370.79 backoff dt 0.00 int 3840.00 (no new tasks)
03-Mar-2009 15:56:55 [boincsimap] [wfd] CUDA: runshare 0.00 debt -87370.79 backoff dt 0.00 int 1920.00 (no new tasks)
03-Mar-2009 15:56:55 [Cosmology@Home] [wfd] overall_debt -2291655
03-Mar-2009 15:56:55 [climateprediction.net] [wfd] overall_debt 0
03-Mar-2009 15:56:55 [APS@Home] [wfd] overall_debt -2214023
03-Mar-2009 15:56:55 [lhcathome] [wfd] overall_debt -2200923
03-Mar-2009 15:56:55 [ABC@home] [wfd] overall_debt -2291800
03-Mar-2009 15:56:55 [Artificial Intelligence System] [wfd] overall_debt -2309663
03-Mar-2009 15:56:55 [orbit@home] [wfd] overall_debt -2161966
03-Mar-2009 15:56:55 [Milkyway@home] [wfd] overall_debt -2292935
03-Mar-2009 15:56:55 [SETI@home] [wfd] overall_debt -11
03-Mar-2009 15:56:55 [PrimeGrid] [wfd] overall_debt -2314163
03-Mar-2009 15:56:55 [boincsimap] [wfd] overall_debt -2145213
03-Mar-2009 15:56:55 [---] [wfd] ------- end work fetch state -------
03-Mar-2009 15:56:55 [---] No project chosen for work fetch
03-Mar-2009 15:57:51 [---] [debt] CPU: no eligible projects
03-Mar-2009 15:57:51 [---] [debt] CUDA: no eligible projects

...And continued for similar repeats.
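
A dump like this is easier to compare between runs once it is boiled down to one record per project and resource. The sketch below is a hedged illustration in Python: the regular expression is written against the per-project [wfd] lines exactly as they appear in this log, and the reading of "dt" as seconds left on the back-off is an assumption (it drops by about a minute between the two dumps above). The names `FetchState` and `parse_wfd_line` are made up for this example.

```python
import re
from typing import NamedTuple, Optional

# Matches only the per-project lines, e.g.
# "... [SETI@home] [wfd] CUDA: runshare 0.00 debt -1.63 backoff dt 83009.31 int 86400.00"
LINE_RE = re.compile(
    r"\[(?P<project>[^\]]+)\] \[wfd\] (?P<resource>CPU|CUDA): "
    r"runshare (?P<runshare>-?[\d.]+) debt (?P<debt>-?[\d.]+) "
    r"backoff dt (?P<dt>[\d.]+) int (?P<interval>[\d.]+)"
    r"(?P<flags>.*)$"
)


class FetchState(NamedTuple):
    project: str
    resource: str
    debt: float
    backoff_remaining: float  # "dt": assumed to be seconds left on the back-off
    backoff_interval: float   # "int": current back-off interval in seconds
    no_new_tasks: bool
    overworked: bool


def parse_wfd_line(line: str) -> Optional[FetchState]:
    """Parse one per-project [wfd] line; return None for any other line."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    flags = match["flags"]
    return FetchState(
        project=match["project"],
        resource=match["resource"],
        debt=float(match["debt"]),
        backoff_remaining=float(match["dt"]),
        backoff_interval=float(match["interval"]),
        no_new_tasks="(no new tasks)" in flags,
        overworked="(overworked)" in flags,
    )


if __name__ == "__main__":
    sample = (
        "03-Mar-2009 15:55:53 [SETI@home] [wfd] CUDA: runshare 0.00 "
        "debt -1.63 backoff dt 83009.31 int 86400.00"
    )
    print(parse_wfd_line(sample))
```

Run over the lines above, this shows that s@h is the only project without "(no new tasks)", and that it still has roughly 23 hours of back-off left on both CPU and CUDA.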

?

Cheers,
Martin

Jord
Message 871765 - Posted: 3 Mar 2009, 16:19:31 UTC - in response to Message 871757.  
Last modified: 3 Mar 2009, 16:19:44 UTC

03-Mar-2009 15:56:55 [---] No project chosen for work fetch

03-Mar-2009 15:57:51 [---] [debt] CPU: no eligible projects

03-Mar-2009 15:57:51 [---] [debt] CUDA: no eligible projects


Funny, that one. I see all your other projects are set to No New Tasks, correct?
Does this machine have any work at this time?

ML1
Message 871773 - Posted: 3 Mar 2009, 16:38:40 UTC - in response to Message 871765.  
Last modified: 3 Mar 2009, 16:39:39 UTC

03-Mar-2009 15:56:55 [---] No project chosen for work fetch

03-Mar-2009 15:57:51 [---] [debt] CPU: no eligible projects

03-Mar-2009 15:57:51 [---] [debt] CUDA: no eligible projects

Funny, that one. I see all your other projects are set to No New Tasks, correct?

Indeed so. Only s@h is set to allow new tasks.

Does this machine have any work at this time?

It has no s@h work.

It does have a few CPDN WUs lined up. I have half of those suspended to avoid running in EDF. Two are running at present. If I 'resume' the (not yet started) suspended tasks, then the two CPDNs running immediately go into EDF.


Is work fetch suspended for the GPU if the CPU is maxed out on work yet to be done?...

Cheers,
Martin

Jord
Message 871802 - Posted: 3 Mar 2009, 22:00:59 UTC - in response to Message 871773.  

I'd report this on the BOINC Alpha list, if I were you. Include a link to this thread and show what we've done already. Explain what you did in the meantime, and what you expect it to be doing that it isn't doing. ;-)

Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 871843 - Posted: 4 Mar 2009, 0:01:37 UTC - in response to Message 871773.  



<snip>

Is work fetch suspended for the GPU if the CPU is maxed out on work yet to be done?...

Cheers,
Martin

Not a direct answer to your question but perhaps another symptom of the same problem...
I have noted with 6.6.11 that CPU tasks going into EDF can cause running CUDA tasks to go into "waiting to run" mode, so the GPU becomes idle. Curiously, it also put 2 more CUDA tasks into "Ready to start(0.15 CPUs 1 CUDA)" mode but did not actually start them, and on the next server connect those 2 never-started tasks were reported as errors (no output file). However, I got so weary of the GPU cores sitting idle for several hours at a time while it finished an AP5.03 to get it out of EDF that I have reverted to 6.4.5.

F.

John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 871905 - Posted: 4 Mar 2009, 2:41:26 UTC - in response to Message 871765.  

03-Mar-2009 15:56:55 [---] No project chosen for work fetch

03-Mar-2009 15:57:51 [---] [debt] CPU: no eligible projects

03-Mar-2009 15:57:51 [---] [debt] CUDA: no eligible projects


Funny, that one. I see all your other projects are set to No New Tasks, correct?
Does this machine have any work at this time?

Any project that has a contact backoff will not be eligible.
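
To make that rule concrete against the log posted earlier, here is a deliberately simplified Python sketch. It is not the BOINC client's actual code: it models only the two conditions visible in this thread ("no new tasks" and a back-off still counting down), and the names `ProjectState` and `eligible_for_fetch` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProjectState:
    name: str
    no_new_tasks: bool
    backoff_remaining: float  # seconds; the "backoff dt" column in the [wfd] lines


def eligible_for_fetch(p: ProjectState) -> bool:
    """Simplified rule: a project may be asked for work only if new tasks are
    allowed and it has no back-off still counting down."""
    return not p.no_new_tasks and p.backoff_remaining <= 0.0


# CUDA-side values taken from the 15:55:53 log dump in this thread (subset of projects).
projects = [
    ProjectState("SETI@home", no_new_tasks=False, backoff_remaining=83009.31),
    ProjectState("climateprediction.net", no_new_tasks=True, backoff_remaining=0.0),
    ProjectState("Milkyway@home", no_new_tasks=True, backoff_remaining=0.0),
]

eligible = [p.name for p in projects if eligible_for_fetch(p)]
print(eligible or "no eligible projects")
```

Under this reading, s@h is ruled out by its back-off and everything else by "no new tasks", which would be consistent with the "CUDA: no eligible projects" lines.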


BOINC WIKI

Jord
Message 871982 - Posted: 4 Mar 2009, 10:55:05 UTC - in response to Message 871905.  

Any project that has a contact backoff will not be eligible.

Even when it's the only project allowed to fetch work and it has no work at that moment?