Tests of new scheduler features.

Profile Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Avatar

Send message
Joined: 15 Mar 05
Posts: 1547
Credit: 27,183,456
RAC: 0
United States
Message 46138 - Posted: 30 May 2013, 18:15:10 UTC - in response to Message 46135.  

You're lucky. My home desktop hasn't gotten any work, or asked for any. Sometimes I wonder what BOINC 7 is thinking.
ID: 46138 · Report as offensive
Profile Byron Leigh Hatch @ team Carl ...
Volunteer tester
Avatar

Send message
Joined: 15 Jun 05
Posts: 970
Credit: 1,495,169
RAC: 0
Canada
Message 46139 - Posted: 30 May 2013, 20:06:24 UTC

Hello everyone,

Just now successfully crunched ... my first SETI@home Application Version 7 WorkUnit

on my old, slow, WXP, Intel Box ... (Circa 2004)

... over at Main:

as Eric K. said:

Many thanks to all the folks over at ... KWSN / Lunatics for getting this done and especially ... to Raistmer, Jason, Josef, Urs, Claggy, Mike, Richard, and too many beta testers to name.

Well done everyone!


Stderr output
<core_client_version>7.0.64</core_client_version>
<![CDATA[
<stderr_txt>
setiathome_v7 7.00 DevC++/MinGW/g++ 4.5.2
libboinc: 7.1.0

Work Unit Info:
...............
WU true angle range is : 0.376255
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_BaseLineSmooth (no other)
v_vGetPowerSpectrumUnrolled 0.000222 0.00000
sse2_ChirpData_ak8 0.014452 0.00000
v_vTranspose4x16ntw 0.007808 0.00000
BH SSE folding 0.001742 0.00000
Restarted at 37.91 percent.
Restarted at 66.65 percent.
Restarted at 84.21 percent.
Restarted at 97.51 percent.

Flopcounter: 52844596891914.078000

Spike count: 0
Autocorr count: 0
Pulse count: 1
Triplet count: 0
Gaussian count: 1
11:56:21 (5424): called boinc_finish

</stderr_txt>
]]>

"When Johannes Kepler found his long-cherished belief did not agree with the most precise observation,
he accepted the uncomfortable fact.
He preferred the hard truth to his dearest illusions,

that is the heart of science"
... Carl Sagan

Well done everyone!
Byron
ID: 46139 · Report as offensive
Urs Echternacht
Volunteer tester
Avatar

Send message
Joined: 18 Jan 06
Posts: 1038
Credit: 18,734,730
RAC: 0
Germany
Message 46144 - Posted: 31 May 2013, 17:03:39 UTC - in response to Message 45884.  
Last modified: 31 May 2013, 17:04:07 UTC

As a reminder, the first 9 CAL targets are

1 = ATI Radeon HD 2900 (RV600)
2 = ATI Radeon HD 2300/2400/3200/4200 (RV610)
3 = ATI Radeon HD 2600/3650 (RV630/RV635)
4 = ATI Radeon HD 3800 (RV670)
5 = ATI Radeon HD 4350/4550 (R710)
6 = ATI Radeon HD 4600 series (R730)
7 = ATI Radeon (RV700 class)
8 = ATI Radeon HD 4700/4800 (RV740/RV770)
9 = ATI Radeon HD 5800 series (Cypress)

Hi Eric,
Did you forget to set the opencl_ati5 plan classes to capability 8+ on main ?

See http://setiathome.berkeley.edu/forum_thread.php?id=71810&postid=1374253
and his hosts result list http://setiathome.berkeley.edu/results.php?hostid=5879391
_\|/_
U r s
ID: 46144 · Report as offensive
Father Ambrose
Volunteer tester

Send message
Joined: 1 May 07
Posts: 556
Credit: 6,470,846
RAC: 0
United Kingdom
Message 46145 - Posted: 31 May 2013, 18:19:40 UTC
Last modified: 31 May 2013, 18:21:11 UTC

I have just suspended 40 opencl_ati5_sah tasks (HD4600) downloaded this afternoon on Main.

I do not want to abort them at the moment if I can get away with a reset project.

Also received a batch of cuda 50 on the other host.

Michael.
ID: 46145 · Report as offensive
Profile Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Avatar

Send message
Joined: 15 Mar 05
Posts: 1547
Credit: 27,183,456
RAC: 0
United States
Message 46146 - Posted: 31 May 2013, 19:07:06 UTC - in response to Message 46144.  
Last modified: 31 May 2013, 19:08:17 UTC

I thought the recommendation was cal target 6+ for the ati5 versions. Did I miss a message somewhere?

Must have missed it. I'll set it to 8+
ID: 46146 · Report as offensive
Urs Echternacht
Volunteer tester
Avatar

Send message
Joined: 18 Jan 06
Posts: 1038
Credit: 18,734,730
RAC: 0
Germany
Message 46147 - Posted: 31 May 2013, 20:15:42 UTC - in response to Message 46146.  
Last modified: 31 May 2013, 20:16:04 UTC

I thought the recommendation was cal target 6+ for the ati5 versions. Did I miss a message somewhere?

Must have missed it. I'll set it to 8+

cal target 6+ was meant only for use here at Beta, and only for temporary experimentation purposes, if I recall that discussion correctly.
_\|/_
U r s
ID: 46147 · Report as offensive
Profile Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Avatar

Send message
Joined: 15 Mar 05
Posts: 1547
Credit: 27,183,456
RAC: 0
United States
Message 46151 - Posted: 31 May 2013, 22:46:55 UTC - in response to Message 46147.  

All the ati5 versions at the main project are now cal target 8+
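
The cal target 8+ restriction agreed above can be sketched as a simple threshold check. This is only an illustration built from the CAL target list quoted earlier in the thread; the names `CAL_TARGETS`, `MIN_ATI5_CAL_TARGET` and `ati5_eligible` are hypothetical and are not taken from the scheduler code.

```python
# CAL target numbers and names, as listed earlier in this thread.
CAL_TARGETS = {
    1: "ATI Radeon HD 2900 (RV600)",
    2: "ATI Radeon HD 2300/2400/3200/4200 (RV610)",
    3: "ATI Radeon HD 2600/3650 (RV630/RV635)",
    4: "ATI Radeon HD 3800 (RV670)",
    5: "ATI Radeon HD 4350/4550 (R710)",
    6: "ATI Radeon HD 4600 series (R730)",
    7: "ATI Radeon (RV700 class)",
    8: "ATI Radeon HD 4700/4800 (RV740/RV770)",
    9: "ATI Radeon HD 5800 series (Cypress)",
}

# Threshold from the thread; the constant name is an assumption.
MIN_ATI5_CAL_TARGET = 8


def ati5_eligible(cal_target, minimum=MIN_ATI5_CAL_TARGET):
    """Return True if a GPU's CAL target meets the ati5 minimum."""
    return cal_target >= minimum


if __name__ == "__main__":
    for target, name in sorted(CAL_TARGETS.items()):
        status = "eligible" if ati5_eligible(target) else "excluded"
        print(target, name, status)
```

Under this check an HD 4600 series card (target 6) would no longer be sent ati5 work, while HD 4700/4800 and HD 5800 cards would.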
ID: 46151 · Report as offensive
Profile Raistmer
Volunteer tester
Avatar

Send message
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 46153 - Posted: 31 May 2013, 23:35:14 UTC

Fine! And how about NV AP for Linux here ?
ID: 46153 · Report as offensive
Profile Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Avatar

Send message
Joined: 15 Mar 05
Posts: 1547
Credit: 27,183,456
RAC: 0
United States
Message 46155 - Posted: 1 Jun 2013, 0:10:32 UTC - in response to Message 46153.  

I hope to have it and the latest AMD version released today.
ID: 46155 · Report as offensive
Richard Haselgrove
Volunteer tester

Send message
Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 46157 - Posted: 1 Jun 2013, 10:01:35 UTC

I made it! Cuda32 now has a higher APR than cuda42 for my Kepler, and was chosen above cuda50 for the latest work fetch.

Application details for host 63280
ID: 46157 · Report as offensive
jason_gee
Volunteer tester

Send message
Joined: 11 Dec 08
Posts: 198
Credit: 658,573
RAC: 0
Australia
Message 46159 - Posted: 1 Jun 2013, 12:07:32 UTC - in response to Message 46157.  

I made it! Cuda32 now has a higher APR than cuda42 for my Kepler, and was chosen above cuda50 for the latest work fetch.

Application details for host 63280


I must have looked too late, Cuda5 says highest APR there at the moment :)
ID: 46159 · Report as offensive
Richard Haselgrove
Volunteer tester

Send message
Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 46160 - Posted: 1 Jun 2013, 13:25:39 UTC - in response to Message 46159.  

I made it! Cuda32 now has a higher APR than cuda42 for my Kepler, and was chosen above cuda50 for the latest work fetch.

Application details for host 63280

I must have looked too late, Cuda5 says highest APR there at the moment :)

So what? I never said it wasn't. I only stated cuda32 was higher than cuda42 - I didn't make any comparison with cuda 50.

But now you come to mention it, you looked too soon. Here's a full show as at the time I started to type this post (13:20 UTC)

SETI@home v7 7.00 windows_intelx86 (cuda32)
Number of tasks completed 102
Max tasks per day 131
Number of tasks today 11
Consecutive valid tasks 104
Average processing rate 96.845328974813
Average turnaround time 0.63 days

SETI@home v7 7.00 windows_intelx86 (cuda42)
Number of tasks completed 338
Max tasks per day 106
Number of tasks today 8
Consecutive valid tasks 73
Average processing rate 86.39745140371
Average turnaround time 0.78 days

SETI@home v7 7.00 windows_intelx86 (cuda50)
Number of tasks completed 963
Max tasks per day 1012
Number of tasks today 7
Consecutive valid tasks 988
Average processing rate 95.436230768859
Average turnaround time 0.74 days
ID: 46160 · Report as offensive
jason_gee
Volunteer tester

Send message
Joined: 11 Dec 08
Posts: 198
Credit: 658,573
RAC: 0
Australia
Message 46162 - Posted: 1 Jun 2013, 14:18:08 UTC - in response to Message 46160.  
Last modified: 1 Jun 2013, 14:22:37 UTC

Yep different numbers than when I looked (which is certainly a factor in trying to work out if the mechanism is remotely working).

With nearly 1000 consecutive valid on Cuda5, are you suggesting the average processing rate for that is more or less accurate than the ~100 & ~300 for the lower Cuda revisions ?

With a given more or less random mix of tasks in a large enough population, which APR is correct ? The measured one or one concocted from a synthetic benchmark?

I'm not attempting to answer those questions myself, other than to suggest perhaps 100-300 tasks for a given app version isn't enough AR spread to dial in.

How many would make averages relatively stable, and what would happen if you upgraded to 2 x classified 780 water cooled, or downgraded to an 8400GS ? .... and should the system handle that without reset of some sort ?
ID: 46162 · Report as offensive
Richard Haselgrove
Volunteer tester

Send message
Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 46163 - Posted: 1 Jun 2013, 15:09:21 UTC - in response to Message 46162.  

It's all a bit complicated. Before we start, I suggest you try and skim the VLAR conversation I've been having with Eric, ever since his release announcement in message 46039.

Basically, the 'dialling-in' process relies - critically - on the fpops/time curve being reasonably accurate over all ARs. The original curve that Josef and I researched all those years ago was for CPU tasks only: it has been updated by Eric to compensate with an additional (non-linear) component for autocorrs (I think he added the fudge-factor, instead of multiplying by it).

That means that the perceived speed of computation, as averaged into APR, is good for all CPU builds and tasks (E & OE), and good for CUDA builds at mid-AR through VHAR. When I started this test, the APRs were (as recorded in message 46045):

cuda50: 163
cuda42: 149
cuda32: 96

which is pretty reasonable. Maybe 163 is a little high for cuda50 - it had been pushed that way by a recent shorty storm (VHAR), but cuda42 hadn't - but I've been watching, and I'd say it's working, once the initial boundary conditions have been left behind.

But as we found at the very beginning, APR can be skewed by outliers - the dreaded 30/30 AP storm, if you remember.

VLAR on Kepler haven't been treated as outliers, but perhaps they should be - they certainly behave like outliers. With a runtime of the order of x5 the time predicted by the calibration curve, APR plummets - as I predicted to Eric.

Both cuda50 and cuda42 have run long, contiguous, blocks of VLAR - cuda50 as the result of deliberate micro-management by me, cuda42 by the natural run of work allocation. That's what has driven both APRs southwards - cuda42 faster than cuda50, because (as you rightly note) it has fewer completed tasks incorporated in the average. A 'young' card or app_version will always be more dynamic than an 'older' one (like teenagers anywhere, I suspect) - 'mature' cards, with 10,000 or 100,000 tasks completed, will be much more stable.
ID: 46163 · Report as offensive
jason_gee
Volunteer tester

Send message
Joined: 11 Dec 08
Posts: 198
Credit: 658,573
RAC: 0
Australia
Message 46164 - Posted: 1 Jun 2013, 16:03:44 UTC - in response to Message 46163.  
Last modified: 1 Jun 2013, 16:05:17 UTC

...
But as we found at the very beginning, APR can be skewed by outliers - the dreaded 30/30 AP storm, if you remember.

VLAR on Kepler haven't been treated as outliers, but perhaps they should be - they certainly behave like outliers. With a runtime of the order of x5 the time predicted by the calibration curve, APR plummets - as I predicted to Eric.


See, my engineering perspective suggests this: "There Are no Runtime Outliers" ... they are tasks, i.e. they take what they take.

The fact is that these 'runtime outliers' break the estimate system (which we knew was already wonky, and which controls task scheduling & all manner of work fetch issues).

Remove the artificially imposed limits (at 10x then kill a task) and award credit by an absolute figure, or alternatively some scale depending on project total throughput.

Hard limits are meant as fail-safes. If they come into normal operation then they introduce instability in the system. An unstable system inherently will not stabilise.

It's the 'control freak' factor again, which says if you hold your pet hamster too tight you will kill it.
ID: 46164 · Report as offensive
Josef W. Segur
Volunteer tester

Send message
Joined: 14 Oct 05
Posts: 1137
Credit: 1,848,733
RAC: 0
United States
Message 46165 - Posted: 1 Jun 2013, 18:50:21 UTC - in response to Message 46163.  

Richard Haselgrove wrote:
...
Both cuda50 and cuda42 have run long, contiguous, blocks of VLAR - cuda50 as the result of deliberate micro-management by me, cuda42 by the natural run of work allocation. That's what has driven both APRs southwards - cuda42 faster than cuda50, because (as you rightly note) it has fewer completed tasks incorporated in the average. A 'young' card or app_version will always be more dynamic than an 'older' one (like teenagers anywhere, I suspect) - 'mature' cards, with 10,000 or 100,000 tasks completed, will be much more stable.

The exponential average underlying APR has a fixed 0.01 factor (once the app version has 20 "completed"), and the host pfc average does the same. That is in effect a half-life of 69.4 non-outlier task validations. For high end GPU work that makes a fairly volatile APR unless the work is exceptionally well spread across angle ranges. For legacy CPU work it is extremely slow to adapt.
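
A minimal sketch of the exponential average described above, assuming a plain update of the form `avg += 0.01 * (sample - avg)` applied directly to a rate. The server code itself is not shown in this thread, so the function names are hypothetical and averaging the rate directly is a simplification. The half-life is computed rather than quoted, and comes out at roughly 69 updates.

```python
import math

ALPHA = 0.01  # fixed weight quoted in the post above


def update_average(avg, sample, alpha=ALPHA):
    """Fold one new sample into an exponential average."""
    return avg + alpha * (sample - avg)


def half_life(alpha=ALPHA):
    """Number of updates after which an old value's weight has halved."""
    return math.log(0.5) / math.log(1.0 - alpha)


def run_block(avg, sample, count, alpha=ALPHA):
    """Apply the same sample `count` times, e.g. a contiguous block of VLARs."""
    for _ in range(count):
        avg = update_average(avg, sample, alpha)
    return avg


if __name__ == "__main__":
    print("half-life in updates:", half_life())
    # Hypothetical numbers: an APR of 160 hit by 69 tasks that run at 1/5 speed.
    print("APR after the block:", run_block(160.0, 32.0, 69))
```

With these made-up numbers, one half-life's worth of slow tasks moves the average about halfway from 160 to 32, which is the kind of volatility described for high end GPU work.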

OTOH, the code for choosing the "best" app version is affected by larger counts, so once there are 10,000 or more completed whichever app version has the higher APR at the time of a work request will have a very high probability of being chosen.
                                                                  Joe
ID: 46165 · Report as offensive
Claggy
Volunteer tester

Send message
Joined: 29 May 06
Posts: 1037
Credit: 8,440,339
RAC: 0
United Kingdom
Message 46166 - Posted: 1 Jun 2013, 19:34:16 UTC
Last modified: 1 Jun 2013, 19:37:43 UTC

After only ever getting Cuda5 WUs for my GTX460, I aborted them all to drive my Max Tasks per Day low enough (down to 3) that I couldn't get any more Cuda5 WUs.
When I tried again (it was a CPU & Nvidia & AMD request) I got AMD OpenCL tasks (the AMD app versions' APRs are higher than the non-existent Cuda32 and Cuda42 app versions' APRs). Ahhhhhh.
So I removed some of those WUs from my client_state.xml and set my preferences so BOINC could only do an Nvidia work request. Now I get the Cuda32 app and WUs, and the next request gets me the Cuda42 app and WUs.
After removing the remainder of those recently sent AMD WUs from my client_state.xml, I got some of the remainder resent, before the server expires the rest; that is annoying too.

All tasks for computer 5427475

Claggy
ID: 46166 · Report as offensive
Richard Haselgrove
Volunteer tester

Send message
Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 46167 - Posted: 1 Jun 2013, 19:42:17 UTC - in response to Message 46165.  

...
OTOH, the code for choosing the "best" app version is affected by larger counts, so once there are 10,000 or more completed whichever app version has the higher APR at the time of a work request will have a very high probability of being chosen.
                                                                  Joe

So, now my cuda50 APR is below my cuda32 APR, I need to pray for a block of cuda32 VLARs (ugh!) to let me pick cuda50 again?
ID: 46167 · Report as offensive
Profile Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Avatar

Send message
Joined: 15 Mar 05
Posts: 1547
Credit: 27,183,456
RAC: 0
United States
Message 46168 - Posted: 1 Jun 2013, 20:37:19 UTC - in response to Message 46167.  

The right way to do this (and I indicated this to David a long time ago) is to use an estimate of the median rather than weighted averages, as medians are not strongly affected by outliers.

I could change the current code to make an estimate of the running median...
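
To illustrate why a median estimate resists outliers, here is a rough sketch that compares the 0.01 exponential average with one simple running-median estimator (nudge the estimate a fixed step towards each new sample). The estimator, the step size and the sample data are all assumptions chosen for illustration; this is not the code proposed above, and other running-median schemes exist.

```python
def ema_update(avg, sample, alpha=0.01):
    """Exponential average: every sample pulls in proportion to its distance."""
    return avg + alpha * (sample - avg)


def median_update(est, sample, step=0.5):
    """Running-median estimate: move a fixed step towards the sample,
    so an extreme value pulls no harder than a mild one."""
    if sample > est:
        return est + step
    if sample < est:
        return est - step
    return est


def make_samples(count=1000, normal=100.0, slow=20.0, period=10):
    """Hypothetical rates: every `period`-th task runs at 1/5 of normal speed."""
    return [slow if i % period == period - 1 else normal for i in range(count)]


def compare(samples, start):
    """Feed the same samples to both estimators and return (average, median estimate)."""
    avg = est = start
    for s in samples:
        avg = ema_update(avg, s)
        est = median_update(est, s)
    return avg, est


if __name__ == "__main__":
    avg, est = compare(make_samples(), 100.0)
    print("exponential average:", avg)
    print("median estimate:", est)
```

With one slow task in ten, the exponential average settles well below the normal rate, while the median estimate stays within one step of it.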
ID: 46168 · Report as offensive
Claggy
Volunteer tester

Send message
Joined: 29 May 06
Posts: 1037
Credit: 8,440,339
RAC: 0
United Kingdom
Message 46170 - Posted: 1 Jun 2013, 21:28:27 UTC - in response to Message 46168.  
Last modified: 1 Jun 2013, 21:45:37 UTC

Have you heard there are odd Credit awards going on for Astropulse v6 now at the Main project, around 15 to 25 Credits per AP WU?

http://setiathome.berkeley.edu/forum_thread.php?id=71827&postid=1374823

Valid AstroPulse v6 tasks for computer 6910524

I grabbed some Stock OpenCL AP work, and got very low awarded Credit too:

All AstroPulse v6 tasks for computer 5427475

Claggy
ID: 46170 · Report as offensive


 
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.