Tests of new scheduler features.

Profile Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 46171 - Posted: 1 Jun 2013, 21:33:51 UTC

Perhaps VLARs should be disabled for GPUs again.
There is a very negative attitude on the SETI main boards to VLARs on GPUs, even on ATi GPUs, though NV GPUs are mentioned more often.
It's worth distributing VLARs to a GPU only if the GPU is idle and the server can't offer other work: a kind of "backup work". Otherwise the GPU will sit idle or drift to another project. If such logic can be implemented, it's worth doing. If not, it may be worth disabling VLARs again.
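
A minimal sketch of the "backup work" rule described above. The `Task` type and `pick_gpu_task` function are hypothetical names for illustration only; this is not the BOINC scheduler's actual code.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    is_vlar: bool


def pick_gpu_task(queue, gpu_is_idle):
    """Return the task to send to a GPU, or None if nothing should be sent."""
    # Prefer any non-VLAR task that is available.
    for task in queue:
        if not task.is_vlar:
            return task
    # Fall back to a VLAR only when the GPU would otherwise sit idle.
    if gpu_is_idle:
        for task in queue:
            if task.is_vlar:
                return task
    return None
```

With this rule a busy GPU is never handed a VLAR, while an idle GPU still gets one if that is all the server has to offer.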
ID: 46171
Profile Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 15 Mar 05
Posts: 1547
Credit: 27,183,456
RAC: 0
United States
Message 46173 - Posted: 1 Jun 2013, 22:36:31 UTC - in response to Message 46170.  

Have you heard there's odd Credit awards going on for Astropulse v6 now at the Main project, around 15 to 25 Credits per AP Wu:


That's odd, since I haven't updated anything related to AP over there.
ID: 46173
jason_gee
Volunteer tester

Joined: 11 Dec 08
Posts: 198
Credit: 658,573
RAC: 0
Australia
Message 46175 - Posted: 2 Jun 2013, 0:08:28 UTC - in response to Message 46168.  
Last modified: 2 Jun 2013, 0:39:30 UTC

The right way to do this (and I indicated this to David a long time ago) is to use an estimate of the median rather than weighted averages, as medians are not strongly affected by outliers.

I could change the current code to make an estimate of the running median...
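
To illustrate the point about outliers, here is a small sketch comparing an exponentially weighted average with one simple way of estimating a running median (step toward each new sample by a fixed amount). The update rules, step size, and sample values are assumptions chosen for illustration, not the server's code.

```python
def update_mean(avg, x, alpha=0.1):
    """Exponentially weighted average: moves in proportion to the error."""
    return avg + alpha * (x - avg)


def update_median(med, x, step=1.0):
    """Running median estimate: moves a fixed step toward the sample."""
    if x > med:
        return med + step
    if x < med:
        return med - step
    return med


# Twenty ordinary samples, one large outlier, then five ordinary samples.
samples = [100.0] * 20 + [10000.0] + [100.0] * 5

avg = med = samples[0]
for x in samples[1:]:
    avg = update_mean(avg, x)
    med = update_median(med, x)

print(f"weighted average: {avg:.1f}, median estimate: {med:.1f}")
```

The single outlier drags the weighted average far from 100 and it is still recovering five samples later, whereas the median estimate moves by one step and is back at 100 almost immediately.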


Oooh, sounds much more robust :D

Now on the client side, having experimented with stabilising estimates for work fetch & task scheduling, I used a tuned PID controller for dead-reckoning with feedback. That made estimates much more stable, tuned for very slight overshoot for rapid convergence (system usage change etc) without ringing (sufficiently damped still).
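
For readers unfamiliar with the technique, a generic PID controller correcting a runtime estimate could look like the sketch below. The class, the `track` helper, and the gains are placeholders for illustration; they are not the tuned values or the code described in this post.

```python
class PID:
    """Textbook discrete PID controller."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt=1.0):
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def track(actual_runtimes, initial_estimate, pid):
    """Feed each observed runtime back into the estimate via the controller."""
    estimate = initial_estimate
    history = []
    for actual in actual_runtimes:
        estimate += pid.step(actual - estimate)
        history.append(estimate)
    return history


# Estimate starts at 60 s while tasks actually take 120 s.
history = track([120.0] * 60, 60.0, PID(kp=0.5, ki=0.1, kd=0.05))
```

With these placeholder gains the estimate overshoots somewhat before settling on the observed runtime; real tuning would trade overshoot against convergence speed.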

If I switch from using a (custom) per-application DCF as the control (fudge factor) over to adaptive flops, as Joe suggested some time back, does the server receive the (application) flops value on each contact? If so, could you combine the more robust longer-term median processing rate(s) with the flops using something like a Kalman filter? My thinking is: why recalculate something the client already knows if you don't have to? Alternatively, if the calculations are on different time scales, combine them (Kalman filter) to get the best of both.
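
As a rough illustration of the kind of combination meant here, the measurement update of a one-dimensional Kalman filter blends a prior estimate with a new measurement according to their variances. Treating the server's long-term rate as the prior and a client-reported value as the measurement is an assumption made for this sketch, as are the numbers.

```python
def kalman_update(estimate, variance, measurement, measurement_variance):
    """One scalar Kalman measurement update.

    Returns the blended estimate and its reduced variance.
    """
    gain = variance / (variance + measurement_variance)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1.0 - gain) * variance
    return new_estimate, new_variance


# Equal uncertainty on both sides: the result lands halfway between them.
rate, var = kalman_update(100.0, 25.0, 110.0, 25.0)
print(rate, var)
```

A noisier measurement (larger `measurement_variance`) lowers the gain, so the estimate leans more on the long-term figure.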

To me anyway, stable & adaptive estimates proved to solve a lot of problems... On a relatively fast system I typically see stable estimate convergence track system usage or hardware change on the order of minutes, as opposed to APR's days to weeks.
ID: 46175
TRuEQ & TuVaLu
Volunteer tester
Joined: 28 Jan 11
Posts: 619
Credit: 2,580,051
RAC: 0
Sweden
Message 46178 - Posted: 2 Jun 2013, 8:19:35 UTC - in response to Message 46170.  

Have you heard there's odd Credit awards going on for Astropulse v6 now at the Main project, around 15 to 25 Credits per AP Wu:

http://setiathome.berkeley.edu/forum_thread.php?id=71827&postid=1374823

Valid AstroPulse v6 tasks for computer 6910524

I grabbed some Stock OpenCL AP work, and got very low awarded Credit too:

All AstroPulse v6 tasks for computer 5427475

Claggy



I have a couple more with low credits on main.

http://setiathome.berkeley.edu/workunit.php?wuid=1257047105
http://setiathome.berkeley.edu/workunit.php?wuid=1257039458
ID: 46178
Claggy
Volunteer tester

Joined: 29 May 06
Posts: 1037
Credit: 8,440,339
RAC: 0
United Kingdom
Message 46179 - Posted: 2 Jun 2013, 8:46:53 UTC - in response to Message 46178.  

Have you heard there's odd Credit awards going on for Astropulse v6 now at the Main project, around 15 to 25 Credits per AP Wu:

http://setiathome.berkeley.edu/forum_thread.php?id=71827&postid=1374823

Valid AstroPulse v6 tasks for computer 6910524

I grabbed some Stock OpenCL AP work, and got very low awarded Credit too:

All AstroPulse v6 tasks for computer 5427475

Claggy



I have a couple more with low credits on main.

http://setiathome.berkeley.edu/workunit.php?wuid=1257047105
http://setiathome.berkeley.edu/workunit.php?wuid=1257039458

I wouldn't worry about that. We need to populate the project's app version 'Peak Flop Count Average' for all app versions; that has probably happened for the GPU versions already, but it'll be a few days before it's done for the CPU AP app.
Also, none of those hosts had reached their 10 validations yet.

Claggy
ID: 46179
TRuEQ & TuVaLu
Volunteer tester
Joined: 28 Jan 11
Posts: 619
Credit: 2,580,051
RAC: 0
Sweden
Message 46182 - Posted: 2 Jun 2013, 10:25:10 UTC - in response to Message 46179.  
Last modified: 2 Jun 2013, 10:25:26 UTC

Have you heard there's odd Credit awards going on for Astropulse v6 now at the Main project, around 15 to 25 Credits per AP Wu:

http://setiathome.berkeley.edu/forum_thread.php?id=71827&postid=1374823

Valid AstroPulse v6 tasks for computer 6910524

I grabbed some Stock OpenCL AP work, and got very low awarded Credit too:

All AstroPulse v6 tasks for computer 5427475

Claggy



I have a couple more with low credits on main.

http://setiathome.berkeley.edu/workunit.php?wuid=1257047105
http://setiathome.berkeley.edu/workunit.php?wuid=1257039458

I wouldn't worry about that. We need to populate the project's app version 'Peak Flop Count Average' for all app versions; that has probably happened for the GPU versions already, but it'll be a few days before it's done for the CPU AP app.
Also, none of those hosts had reached their 10 validations yet.

Claggy


Now I see, it's the wingman that needs 10 validations.
ID: 46182
Richard Haselgrove
Volunteer tester

Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 46189 - Posted: 2 Jun 2013, 22:39:39 UTC

YAY - cuda32 is speeding up - APR is 101, and cuda50 only 97. So of course, cuda32 rules.

Doesn't it?
ID: 46189
jason_gee
Volunteer tester

Joined: 11 Dec 08
Posts: 198
Credit: 658,573
RAC: 0
Australia
Message 46190 - Posted: 2 Jun 2013, 23:41:49 UTC - in response to Message 46189.  

YAY - cuda32 is speeding up - APR is 101, and cuda50 only 97. So of course, cuda32 rules.

Doesn't it?


Could do. Still doesn't make David any better at statistics than me, which is pretty bad.
ID: 46190
Profile Mike
Volunteer tester
Joined: 16 Jun 05
Posts: 2531
Credit: 1,074,556
RAC: 0
Germany
Message 46191 - Posted: 3 Jun 2013, 7:42:10 UTC

Something is definitely wrong.
Host stats on main:

number of completed tasks 1
consecutive valid tasks 646
APR 7740

With each crime and every kindness we birth our future.
ID: 46191
Richard Haselgrove
Volunteer tester

Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 46192 - Posted: 3 Jun 2013, 7:58:46 UTC - in response to Message 46191.  

Something is definitely wrong.
Host stats on main:

number of completed tasks 1
consecutive valid tasks 646
APR 7740

Link to host would help, please.
ID: 46192
jason_gee
Volunteer tester

Joined: 11 Dec 08
Posts: 198
Credit: 658,573
RAC: 0
Australia
Message 46193 - Posted: 3 Jun 2013, 8:04:38 UTC
Last modified: 3 Jun 2013, 8:23:50 UTC

Here's an interesting one on main.
http://setiathome.berkeley.edu/show_host_detail.php?hostid=6739873
It looks to me as if, when 'completed tasks' stays at zero (i.e. the host is spitting out only invalids), max tasks per day never goes down?
ID: 46193
Profile Mike
Volunteer tester
Joined: 16 Jun 05
Posts: 2531
Credit: 1,074,556
RAC: 0
Germany
Message 46195 - Posted: 3 Jun 2013, 8:51:01 UTC - in response to Message 46192.  

Something is definitely wrong.
Host stats on main:

number of completed tasks 1
consecutive valid tasks 646
APR 7740

Link to host would help, please.


http://setiathome.berkeley.edu/results.php?hostid=5735690
With each crime and every kindness we birth our future.
ID: 46195
William
Volunteer tester
Joined: 14 Feb 13
Posts: 606
Credit: 588,843
RAC: 0
Message 46196 - Posted: 3 Jun 2013, 8:52:52 UTC - in response to Message 46175.  

The right way to do this (and I indicated this to David a long time ago) is to use an estimate of the median rather than weighted averages, as medians are not strongly affected by outliers.

I could change the current code to make an estimate of the running median...


Oooh, sounds much more robust :D

Now on the client side, having experimented with stabilising estimates for work fetch & task scheduling, I used a tuned PID controller for dead-reckoning with feedback. That made estimates much more stable, tuned for very slight overshoot for rapid convergence (system usage change etc) without ringing (sufficiently damped still).

If I switch from using a (custom) per-application DCF as the control (fudge factor) over to adaptive flops, as Joe suggested some time back, does the server receive the (application) flops value on each contact? If so, could you combine the more robust longer-term median processing rate(s) with the flops using something like a Kalman filter? My thinking is: why recalculate something the client already knows if you don't have to? Alternatively, if the calculations are on different time scales, combine them (Kalman filter) to get the best of both.

To me anyway, stable & adaptive estimates proved to solve a lot of problems... On a relatively fast system I typically see stable estimate convergence track system usage or hardware change on the order of minutes, as opposed to APR's days to weeks.


Since you mention it here: my idea was always that APR should be calculated with the same control circuit you did for aDCF. You don't feel like doing a spot of server code, perhaps? ;)
A person who won't read has no advantage over one who can't read. (Mark Twain)
ID: 46196
William
Volunteer tester
Joined: 14 Feb 13
Posts: 606
Credit: 588,843
RAC: 0
Message 46197 - Posted: 3 Jun 2013, 8:58:15 UTC - in response to Message 46190.  

YAY - cuda32 is speeding up - APR is 101, and cuda50 only 97. So of course, cuda32 rules.

Doesn't it?


Could do. Still doesn't make David any better at statistics than me, which is pretty bad.

I still think David is using the completely wrong type of statistics for CreditNew.
With all modelling (and statistics are a type of modelling) you need to know the assumptions and the limitations of the model.

I maintain that the nature of the data distribution here is not one where the type of statistical analysis David does can be used.

But that's only my mathematician's gut feeling :(
A person who won't read has no advantage over one who can't read. (Mark Twain)
ID: 46197
jason_gee
Volunteer tester

Joined: 11 Dec 08
Posts: 198
Credit: 658,573
RAC: 0
Australia
Message 46198 - Posted: 3 Jun 2013, 9:07:33 UTC - in response to Message 46196.  
Last modified: 3 Jun 2013, 9:20:16 UTC

Since you mention it here: my idea was always that APR should be calculated with the same control circuit you did for aDCF. You don't feel like doing a spot of server code, perhaps? ;)


Since, by control theory, cascading different controllers in different time domains (i.e. one slow tracking, one rapid/fine) tends to be better than one, a server-side longer-term figure is fine (if done correctly and independently). Ideally the final control 'output' (currently raw APR) would instead be a fusion weighted by trust. You might, for example, trust a client generating good results more than one spitting out invalids or errors. The trusted host could then get responsive estimates, and the untrusted one a conservative, overdamped response.
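
One possible reading of "a fusion weighted by trust", sketched under the assumption that trust is simply the fraction of a host's results that validated. The function names and the linear blend are illustrative choices, not an existing server feature.

```python
def trust(valid, invalid, errors):
    """Fraction of a host's results that validated; 0.0 with no history."""
    total = valid + invalid + errors
    if total == 0:
        return 0.0
    return valid / total


def fused_rate(server_rate, client_rate, trust_level):
    """Blend the slow server-side figure with the fast client-side one.

    trust_level = 0.0 uses only the server figure (conservative),
    trust_level = 1.0 uses only the client figure (responsive).
    """
    return (1.0 - trust_level) * server_rate + trust_level * client_rate


# A host with 9 valid results out of 10 leans mostly on its own figure.
print(fused_rate(100.0, 200.0, trust(9, 1, 0)))
```

A host returning only invalids or errors gets a trust of zero and therefore falls back entirely on the slower server-side figure.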

Eric's move to median should help a lot. If after that, it proves too unresponsive when hosts up/downgrade, use machines heavily periodically while crunching, change number of tasks etc, I wouldn't mind taking a look.

The basic concept is the same as navigation systems using a fusion of erratic/noisy GPS readings with dead-reckoning. Neither on their own is perfect, but combined is stable, responsive and more accurate.
ID: 46198
Richard Haselgrove
Volunteer tester

Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 46207 - Posted: 4 Jun 2013, 10:07:05 UTC

Well, I got my wish with the VLARs for cuda32. By the time the next 30 have run through, that version should be dead and buried.
ID: 46207
Profile Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 46220 - Posted: 4 Jun 2013, 21:21:50 UTC
Last modified: 4 Jun 2013, 21:27:46 UTC

Hm....
This host http://setiweb.ssl.berkeley.edu/beta/host_app_versions.php?hostid=39394 has the following AP app table:

AstroPulse v6 6.06 windows_intelx86 (opencl_ati_100)
Number of tasks completed 83
Max tasks per day 163
Number of tasks today 34
Consecutive valid tasks 130
Average processing rate 618.37064112399
Average turnaround time 0.50 days

AstroPulse v6 6.06 windows_intelx86 (ati_opencl_100)
Number of tasks completed 2
Max tasks per day 45
Number of tasks today 0
Consecutive valid tasks 12
Average processing rate 742.57870281709
Average turnaround time 2.74 days

AstroPulse v6 6.06 windows_intelx86 (cal_ati)
Number of tasks completed 7
Max tasks per day 40
Number of tasks today 0
Consecutive valid tasks 7
Average processing rate 56.793048610545
Average turnaround time 1.12 days


Note that it is not the fastest app that receives all the work now.
The fastest app (by current APR) can't even collect enough tasks to pass the threshold of 10 eligible tasks!
Is that normal? I think not. All compatible apps should get their 10 eligible tasks before the best one is imprinted in the server's mind, right?

EDIT: also note that Mv7 GPU tasks were relatively fast and the host was configured with a big cache. So it had the opportunity to receive tasks for all apps simply because of quota limits. Here, with AP, a task takes longer and the cache was reduced (intentionally), so we have a different situation to test. And it looks like the test did not pass :/
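
The behaviour being asked for could be sketched as follows, using the three app versions from the table above (APR rounded). The `AppVersion` type and `pick_version` function are hypothetical, and using "Number of tasks completed" as the count that must reach 10 is an assumption; this is not the scheduler's actual selection code.

```python
from dataclasses import dataclass

MIN_ELIGIBLE = 10  # threshold mentioned in the post


@dataclass
class AppVersion:
    plan_class: str
    completed: int
    apr: float


def pick_version(versions):
    """Sample under-tested versions first, then settle on the highest APR."""
    untested = [v for v in versions if v.completed < MIN_ELIGIBLE]
    if untested:
        # Give work to the least-tested version until it reaches the threshold.
        return min(untested, key=lambda v: v.completed)
    return max(versions, key=lambda v: v.apr)


versions = [
    AppVersion("opencl_ati_100", 83, 618.37),
    AppVersion("ati_opencl_100", 2, 742.58),
    AppVersion("cal_ati", 7, 56.79),
]
print(pick_version(versions).plan_class)
```

Under this rule the host would keep receiving `ati_opencl_100` and `cal_ati` tasks until both have 10 completed, and only then would APR decide the winner.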
ID: 46220
juan BFB
Volunteer tester

Joined: 5 May 13
Posts: 2
Credit: 390,921
RAC: 0
Brazil
Message 46227 - Posted: 5 Jun 2013, 10:51:49 UTC - in response to Message 46171.  
Last modified: 5 Jun 2013, 10:52:07 UTC

Perhaps VLARs should be disabled for GPUs again.
There is a very negative attitude on the SETI main boards to VLARs on GPUs, even on ATi GPUs, though NV GPUs are mentioned more often.
It's worth distributing VLARs to a GPU only if the GPU is idle and the server can't offer other work: a kind of "backup work". Otherwise the GPU will sit idle or drift to another project. If such logic can be implemented, it's worth doing. If not, it may be worth disabling VLARs again.

+1

A simple <VlartoGPU>0|1</VlartoGPU> switch on the client side could let us choose whether or not to crunch VLARs on the GPUs, and solve a lot of problems.
ID: 46227
TRuEQ & TuVaLu
Volunteer tester
Joined: 28 Jan 11
Posts: 619
Credit: 2,580,051
RAC: 0
Sweden
Message 46229 - Posted: 5 Jun 2013, 14:33:11 UTC - in response to Message 46227.  

Perhaps VLARs should be disabled for GPUs again.
There is a very negative attitude on the SETI main boards to VLARs on GPUs, even on ATi GPUs, though NV GPUs are mentioned more often.
It's worth distributing VLARs to a GPU only if the GPU is idle and the server can't offer other work: a kind of "backup work". Otherwise the GPU will sit idle or drift to another project. If such logic can be implemented, it's worth doing. If not, it may be worth disabling VLARs again.

+1

A simple <VlartoGPU>0|1</VlartoGPU> switch on the client side could let us choose whether or not to crunch VLARs on the GPUs, and solve a lot of problems.


Or maybe make VLARs more attractive.

.vlar=3 times credit
ID: 46229
juan BFB
Volunteer tester

Joined: 5 May 13
Posts: 2
Credit: 390,921
RAC: 0
Brazil
Message 46231 - Posted: 5 Jun 2013, 16:12:48 UTC - in response to Message 46229.  
Last modified: 5 Jun 2013, 16:14:14 UTC

Perhaps VLARs should be disabled for GPUs again.
There is a very negative attitude on the SETI main boards to VLARs on GPUs, even on ATi GPUs, though NV GPUs are mentioned more often.
It's worth distributing VLARs to a GPU only if the GPU is idle and the server can't offer other work: a kind of "backup work". Otherwise the GPU will sit idle or drift to another project. If such logic can be implemented, it's worth doing. If not, it may be worth disabling VLARs again.

+1

A simple <VlartoGPU>0|1</VlartoGPU> switch on the client side could let us choose whether or not to crunch VLARs on the GPUs, and solve a lot of problems.


Or maybe make VLARs more attractive.

.vlar=3 times credit


I like that, but it is not just about credits. The main problem is the video lag that VLAR crunching produces on some hosts that are not dedicated crunchers, even with changes to the configuration file. It is especially bad when more than one VLAR is crunching on a multi-GPU host, which makes the host simply unusable for other simple tasks.

Maybe limiting the number of VLARs allowed to run simultaneously on the host could fix that problem on faster GPUs, but I'm not sure it would work well on slower models.
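
A minimal sketch of such a limit, assuming VLAR tasks can be recognised by ".vlar" in the task name. The function, its parameters, and the task names are hypothetical; no such client option is implied to exist.

```python
def select_to_run(ready_tasks, gpu_slots, max_vlar):
    """Fill GPU slots in queue order, running at most max_vlar VLARs at once."""
    running = []
    vlar_count = 0
    for name in ready_tasks:
        if len(running) == gpu_slots:
            break
        is_vlar = ".vlar" in name
        if is_vlar and vlar_count >= max_vlar:
            continue  # skip this VLAR, look for other work further down
        running.append(name)
        if is_vlar:
            vlar_count += 1
    return running


print(select_to_run(["a.vlar", "b.vlar", "c", "d.vlar"], gpu_slots=3, max_vlar=1))
```

Note the trade-off this exposes: with three slots and only one non-VLAR task in the queue, one GPU slot stays empty, which is the idle-GPU concern raised earlier in the thread.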
ID: 46231
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.