Tests of new scheduler features.
Author | Message |
---|---|
Joined: 18 Aug 05 Posts: 2423 Credit: 15,878,738 RAC: 0 |
Perhaps VLARs should be disabled for GPUs again. There is a very negative attitude on the SETI main boards toward VLARs on GPUs, even on ATI GPUs, though NV GPUs are mentioned more often. It is worth distributing VLARs to a GPU only if the GPU is idle and the server can't offer other work, as a kind of "backup work". Otherwise the GPU will sit idle or drift to another project. If such logic can be implemented, it is worth doing. If not, it may be worth disabling VLARs again. |
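The "backup work" rule proposed above could be sketched roughly as follows. This is only an illustration in Python, not actual BOINC scheduler code: the `Task` class, its `is_vlar` flag and the `pick_gpu_task` function are hypothetical names made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    name: str
    is_vlar: bool  # hypothetical flag marking a VLAR workunit


def pick_gpu_task(queue: List[Task], gpu_idle: bool) -> Optional[Task]:
    """Pick work for a GPU, treating VLARs strictly as backup work."""
    # First choice: any regular (non-VLAR) task that is available.
    for task in queue:
        if not task.is_vlar:
            return task
    # No regular work left: hand out a VLAR only if the GPU would
    # otherwise sit idle.
    if gpu_idle:
        for task in queue:
            if task.is_vlar:
                return task
    return None


mixed = [Task("wu_a.vlar", True), Task("wu_b", False)]
only_vlar = [Task("wu_c.vlar", True)]
```

With the sample queues, `pick_gpu_task(mixed, gpu_idle=True)` returns the regular task, and a VLAR is returned from `only_vlar` only when `gpu_idle` is true.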
Joined: 15 Mar 05 Posts: 1547 Credit: 27,183,456 RAC: 0 |
Have you heard there's odd Credit awards going on for Astropulse v6 now at the Main project, around 15 to 25 Credits per AP Wu: That's odd, since I haven't updated anything related to AP over there. |
Joined: 11 Dec 08 Posts: 198 Credit: 658,573 RAC: 0 |
The right way to do this (and I indicated this to David a long time ago) is to use an estimate of the median rather than weighted averages, as medians are not strongly affected by outliers. Oooh, sounds much more robust :D Now on the client side, having experimented with stabilising estimates for work fetch and task scheduling, I used a tuned PID controller for dead-reckoning with feedback. That made the estimates much more stable: it is tuned for very slight overshoot, so they converge rapidly (on system usage changes etc.) without ringing (still sufficiently damped). If I switch from using a (custom) per-application DCF as the control (fudge factor) over to adaptive flops, as suggested by Joe some time back, does the server receive the (application) flops value on each contact? And if so, could you possibly combine the more robust longer-term median processing rate(s) with the flops using something like a Kalman filter? My basis for thought there is: why recalculate something the client already knows if you don't have to? Alternatively, if the calculations are on different time scales, combine them (Kalman filter) to get the best of both. To me anyway, stable and adaptive estimates proved to solve a lot of problems... On a relatively fast system I typically see stable estimate convergence track system usage or hardware changes on the order of minutes, as opposed to APR's days to weeks. |
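As a rough illustration of the kind of PID-corrected estimate described in this post, here is a minimal Python sketch. It is not the poster's client code: the `PidEstimator` class, its gains and the sample runtimes are assumptions chosen only to show the idea of steering a multiplicative correction factor (a DCF-like fudge factor) toward the observed runtime ratio.

```python
class PidEstimator:
    """Steers a correction factor toward actual / raw_estimate.

    The gains are illustrative placeholders, not tuned values.
    """

    def __init__(self, kp: float = 0.5, ki: float = 0.1, kd: float = 0.05):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.factor = 1.0       # multiplicative correction applied to raw estimates
        self.integral = 0.0     # accumulated error (I term)
        self.prev_error = 0.0   # last error, for the D term

    def update(self, raw_estimate: float, actual: float) -> float:
        # Error between the observed runtime ratio and the current factor.
        error = actual / raw_estimate - self.factor
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        # Feedback step: nudge the factor by the PID output.
        self.factor += (self.kp * error
                        + self.ki * self.integral
                        + self.kd * derivative)
        return self.factor

    def estimate(self, raw_estimate: float) -> float:
        # Dead-reckoning step: corrected estimate for the next task.
        return raw_estimate * self.factor


est = PidEstimator()
for _ in range(100):
    # Hypothetical host: tasks estimated at 1000 s actually take 2000 s.
    est.update(raw_estimate=1000.0, actual=2000.0)
```

With these placeholder gains the factor overshoots somewhat before settling near 2.0. Real tuning would trade convergence speed against overshoot, as the post describes.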
Joined: 28 Jan 11 Posts: 619 Credit: 2,580,051 RAC: 0 |
Have you heard there's odd Credit awards going on for Astropulse v6 now at the Main project, around 15 to 25 Credits per AP Wu: I have a couple more with low credits on main: http://setiathome.berkeley.edu/workunit.php?wuid=1257047105 http://setiathome.berkeley.edu/workunit.php?wuid=1257039458 |
Joined: 29 May 06 Posts: 1037 Credit: 8,440,339 RAC: 0 |
Have you heard there's odd Credit awards going on for Astropulse v6 now at the Main project, around 15 to 25 Credits per AP Wu: I wouldn't worry about that. We need to populate the project's app version 'Peak Flop Count Average' for all app versions. That has probably happened for the GPU versions already, but it'll be a few days before it's done for the CPU AP app. Also, none of those hosts had reached their 10 validations yet. Claggy |
Joined: 28 Jan 11 Posts: 619 Credit: 2,580,051 RAC: 0 |
Have you heard there's odd Credit awards going on for Astropulse v6 now at the Main project, around 15 to 25 Credits per AP Wu: Now I see, it's the wingman that needs 10 validations. |
Joined: 3 Jan 07 Posts: 1451 Credit: 3,272,268 RAC: 0 |
YAY - cuda32 is speeding up - APR is 101, and cuda50 only 97. So of course, cuda32 rules. Doesn't it? |
Joined: 11 Dec 08 Posts: 198 Credit: 658,573 RAC: 0 |
YAY - cuda32 is speeding up - APR is 101, and cuda50 only 97. So of course, cuda32 rules. Could do. Still doesn't make David any better at statistics than me, which is pretty bad. |
Joined: 16 Jun 05 Posts: 2531 Credit: 1,074,556 RAC: 0 |
Something is definitely wrong. Host stats on main: number of completed tasks 1, consecutive valid tasks 646, APR 7740. With each crime and every kindness we birth our future. |
Joined: 3 Jan 07 Posts: 1451 Credit: 3,272,268 RAC: 0 |
Something is definitely wrong. Link to host would help, please. |
Joined: 11 Dec 08 Posts: 198 Credit: 658,573 RAC: 0 |
Here's an interesting one on main. http://setiathome.berkeley.edu/show_host_detail.php?hostid=6739873 Looks to me like if 'completed tasks' stays zero, i.e. it's spitting out only invalids, then max tasks per day never goes down? |
Joined: 16 Jun 05 Posts: 2531 Credit: 1,074,556 RAC: 0 |
Something is definitely wrong. http://setiathome.berkeley.edu/results.php?hostid=5735690 With each crime and every kindness we birth our future. |
Joined: 14 Feb 13 Posts: 606 Credit: 588,843 RAC: 0 |
The right way to do this (and I indicated this to David a long time ago) is to use an estimate of the median rather than weighted averages, as medians are not strongly affected by outliers. Since you mention it here: my idea was always that APR should be calculated with the same control circuit you did for aDCF. You don't feel like doing a spot of server code, perhaps? ;) A person who won't read has no advantage over one who can't read. (Mark Twain) |
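The quoted point about medians and outliers can be shown with a small Python sketch. The sample rates below are invented for illustration (they are not real host data), and the `ema` helper with its `alpha` value is a hypothetical stand-in for a weighted average, not the server's actual formula.

```python
from statistics import median


def ema(samples, alpha=0.1):
    """Exponentially weighted average, seeded with the first sample."""
    avg = samples[0]
    for x in samples[1:]:
        avg += alpha * (x - avg)
    return avg


# Invented processing rates: five ordinary results, then one wild outlier.
rates = [100, 98, 103, 101, 99, 7740]

weighted = ema(rates)    # pulled far away from ~100 by the single outlier
robust = median(rates)   # stays close to the typical value
```

Here the weighted average lands far above the typical rate of about 100, while the median barely moves.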
Joined: 14 Feb 13 Posts: 606 Credit: 588,843 RAC: 0 |
YAY - cuda32 is speeding up - APR is 101, and cuda50 only 97. So of course, cuda32 rules. I still think David is using the completely wrong type of statistics for CreditNew. With all modelling (and statistics are a type of modelling) you need to know the assumptions and the limitations of the model. I maintain that the nature of the data distribution here is not one where the type of statistical analysis David does can be used. But that's only my mathematician's gut feeling :( A person who won't read has no advantage over one who can't read. (Mark Twain) |
Joined: 11 Dec 08 Posts: 198 Credit: 658,573 RAC: 0 |
Since you mention it here: my idea was always that APR should be calculated with the same control circuit you did for aDCF. You don't feel like doing a spot of server code, perhaps? ;) Since, by control theory, cascading different controllers in different time domains (i.e. one slow tracking, one rapid/fine) tends to be better than one, a server-side longer-term figure is fine (if done correctly and independently). Ideally the final control 'output' (currently raw APR) would instead be a fusion weighted by trust. You might, for example, trust a client generating good results more than one spitting out invalids or errors. These host examples might then allow responsive estimates and a conservative/overdamped response respectively. Eric's move to the median should help a lot. If after that it proves too unresponsive when hosts upgrade or downgrade, are periodically used heavily while crunching, change their number of tasks etc., I wouldn't mind taking a look. The basic concept is the same as navigation systems using a fusion of erratic/noisy GPS readings with dead-reckoning. Neither on its own is perfect, but the combination is stable, responsive and more accurate. |
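One simple way to picture a "fusion weighted by trust" is the scalar Kalman measurement update, where each estimate carries a variance and less trusted values get less weight. The Python sketch below is only that textbook update under assumed numbers. The `fuse` function and the variances are hypothetical, and nothing here reflects how the server actually combines values.

```python
def fuse(slow_est, slow_var, fast_est, fast_var):
    """Scalar Kalman-style update: blend two estimates by their variances.

    A larger variance means less trust, so that estimate gets less weight.
    """
    gain = slow_var / (slow_var + fast_var)         # Kalman gain
    fused = slow_est + gain * (fast_est - slow_est)
    fused_var = (1.0 - gain) * slow_var             # uncertainty shrinks after fusion
    return fused, fused_var


# Equal trust: the result sits halfway between the two estimates.
balanced, balanced_var = fuse(100.0, 4.0, 110.0, 4.0)

# A host spitting out invalids could be given a huge variance on its
# client-side value, so the server-side figure dominates.
distrusted, _ = fuse(100.0, 4.0, 500.0, 1e9)
```

In the second call the fused value stays essentially at the server-side estimate, which matches the conservative response suggested for untrusted hosts.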
Joined: 3 Jan 07 Posts: 1451 Credit: 3,272,268 RAC: 0 |
Well, I got my wish with the VLARs for cuda32. By the time the next 30 have run through, that version should be dead and buried. |
Joined: 18 Aug 05 Posts: 2423 Credit: 15,878,738 RAC: 0 |
Hm... This host http://setiweb.ssl.berkeley.edu/beta/host_app_versions.php?hostid=39394 has this AP app table: AstroPulse v6 6.06 windows_intelx86 (opencl_ati_100): Number of tasks completed 83; Max tasks per day 163; Number of tasks today 34; Consecutive valid tasks 130; Average processing rate 618.37064112399; Average turnaround time 0.50 days. AstroPulse v6 6.06 windows_intelx86 (ati_opencl_100): Number of tasks completed 2; Max tasks per day 45; Number of tasks today 0; Consecutive valid tasks 12; Average processing rate 742.57870281709; Average turnaround time 2.74 days. AstroPulse v6 6.06 windows_intelx86 (cal_ati): Number of tasks completed 7; Max tasks per day 40; Number of tasks today 0; Consecutive valid tasks 7; Average processing rate 56.793048610545; Average turnaround time 1.12 days. Note that it is not the fastest app that receives all the work now. The fastest app (by current APR) can't even collect enough tasks to pass the threshold of 10 eligible tasks! Is that normal? I think not. All compatible apps should get their 10 eligible tasks before the best one is imprinted in the server's mind, right? EDIT: also note that Mv7 GPU tasks were relatively fast and the host was configured with a big cache. So it had the opportunity to receive tasks for all apps just because of quota limits. Here, with AP, a task takes longer and the cache was reduced (intentionally), so we have a different situation to test. And it looks like the test did not pass :/ |
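The behaviour this post expects (every compatible app version gets its 10 eligible tasks before the fastest one is preferred) could be sketched like this in Python. It is a guess at the intended rule, not the real scheduler logic: `choose_app_version` and `MIN_ELIGIBLE` are hypothetical names, and the "Number of tasks completed" figures from the table are used as a stand-in for the eligible-task count, which is an assumption.

```python
MIN_ELIGIBLE = 10  # threshold mentioned in the post


def choose_app_version(versions):
    """Pick an app version: sample under-tested ones first, then use APR."""
    under_sampled = [v for v in versions if v["eligible"] < MIN_ELIGIBLE]
    if under_sampled:
        # Still gathering data: feed the version with the fewest tasks so far.
        return min(under_sampled, key=lambda v: v["eligible"])
    # Every version has enough samples: prefer the highest processing rate.
    return max(versions, key=lambda v: v["apr"])


# Figures taken from the host table above; "eligible" is assumed to be
# the "Number of tasks completed" column.
host_versions = [
    {"plan_class": "opencl_ati_100", "eligible": 83, "apr": 618.37064112399},
    {"plan_class": "ati_opencl_100", "eligible": 2, "apr": 742.57870281709},
    {"plan_class": "cal_ati", "eligible": 7, "apr": 56.793048610545},
]

next_version = choose_app_version(host_versions)
```

Under this rule the next task would go to `ati_opencl_100`, the version that has not yet reached the threshold and also has the highest current APR.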
Joined: 5 May 13 Posts: 2 Credit: 390,921 RAC: 0 |
Perhaps VLARs should be disabled for GPUs again. +1 A simple <VlartoGPU>0|1</VlartoGPU> switch on the client side could let us choose whether or not we want to crunch VLARs on the GPUs, and solve a lot of problems. |
Joined: 28 Jan 11 Posts: 619 Credit: 2,580,051 RAC: 0 |
Perhaps VLARs should be disabled for GPUs again. Or maybe make VLARs more attractive: .vlar = 3 times the credit. |
Joined: 5 May 13 Posts: 2 Credit: 390,921 RAC: 0 |
Perhaps VLARs should be disabled for GPUs again. I like that, but it is not just about credits. The main problem is the video lag that VLAR crunching produces on some hosts that are not dedicated crunchers, even with changes in the configuration file, especially when more than one VLAR is crunching on a multi-GPU host. That makes the host simply unusable for other simple tasks. Maybe limiting the number of VLARs allowed to run simultaneously on the host could fix that problem on faster GPUs, but I'm not sure whether that would work well on the slower models. |
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.