Monitoring inconclusive GBT validations and harvesting data for testing
Author | Message |
---|---|
Jeff Buck Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
Because of overflow IMHO Okay, thanks. And I guess what I'm also seeing, from looking at a few result files from apps other than Cuda, is that even though a "score" value is reported on the Stderr, it never actually gets stored in the <score> field in the result file, which explains why they all seem to be 0, regardless of the app. Another bit of trivia to occupy my dwindling supply of brain cells. ;^) |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
So what version restrictions should be for plan classes? It should be the same as current. The only question is if the nVidia App will work down to Darwin 11.4.2 since it is using the Intel path. According to the Configure File you need Darwin 13.x for the Intel App, https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt/AKv8/ConfigureOSX_AKv8d_OPENCL_SSE3_MBv8.txt. Someone will just have to try it. I'm more concerned about it working in Darwin 15.x and above. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Do any of them use the SoG path? If so, please rebuild with today's r3556. It has improved overflow handling, so it would be better to start beta with it included. I see r3552 in that pack. SETI apps news We're not gonna fight them. We're gonna transcend them. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
All of the OSX Apps were compiled with version r3551. They have different numbers so they have different Wisdom files. None of them use the SoG path. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
All of the OSX Apps were compiled with version r3551. They have different numbers so they have different Wisdom files. None of them use the SoG path. ok, thanks. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Jeff Buck Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
None of the 6 Inconclusives showing up for r3556 on my list this evening appear to be of any significance. And, on a positive note, as soon as I identified each of the following two incoming Inconclusive WUs on the list, I went ahead and promptly ran my r3556 tiebreaker task. In both cases, r3556 agreed with the stock CPU app and not r3528, basically dropping a Pulse in favor of an additional Autocorr. Workunit 2316599294 (blc3_2bit_guppi_57451_27034_HIP69732_0025.22902.416.17.26.74.vlar) Task 5266041033 (S=1, A=29, P=0, T=0, G=0) v8.00 windows_intelx86 Task 5266041034 (S=1, A=28, P=1, T=0, G=0) v8.19 (opencl_nvidia_SoG) windows_intelx86 Workunit 2317552749 (blc3_2bit_guppi_57451_28083_HIP69824_OFF_0028.14923.0.18.27.141.vlar) Task 5268056029 (S=0, A=29, P=1, T=0, G=0) v8.00 windows_intelx86 Task 5268056030 (S=0, A=28, P=2, T=0, G=0) v8.19 (opencl_nvidia_SoG) windows_intelx86 Off to a good start. :^) |
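Comparisons like the ones above can be done mechanically. The following is only a minimal sketch, assuming the tallies are written exactly in the `(S=…, A=…, P=…, T=…, G=…)` form used in this thread; the helper names `parse_tally` and `tally_diff` are made up for illustration and are not part of any SETI@home tool.

```python
import re

# Matches tallies written the way they appear in this thread,
# e.g. "(S=1, A=29, P=0, T=0, G=0)". This format is an assumption
# taken from the posts, not from any official output.
TALLY_RE = re.compile(r"\(S=(\d+), A=(\d+), P=(\d+), T=(\d+), G=(\d+)\)")
KEYS = ("S", "A", "P", "T", "G")


def parse_tally(text):
    """Return the signal counts found in one task line as a dict."""
    match = TALLY_RE.search(text)
    if match is None:
        raise ValueError(f"no signal tally found in: {text!r}")
    return dict(zip(KEYS, map(int, match.groups())))


def tally_diff(a, b):
    """Return only the signal types whose counts differ, as b minus a."""
    return {k: b[k] - a[k] for k in KEYS if a[k] != b[k]}


stock = parse_tally(
    "Task 5266041033 (S=1, A=29, P=0, T=0, G=0) v8.00 windows_intelx86"
)
sog = parse_tally(
    "Task 5266041034 (S=1, A=28, P=1, T=0, G=0) "
    "v8.19 (opencl_nvidia_SoG) windows_intelx86"
)

print(tally_diff(stock, sog))  # {'A': -1, 'P': 1}
print(sum(stock.values()), sum(sog.values()))  # 30 30
```

Both tallies sum to 30, as does every other tally quoted in this thread, which fits these being overflow results. The diff shows the same thing described above: one Autocorr fewer and one Pulse more in the SoG result.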
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13778 Credit: 208,696,464 RAC: 304 |
Another late overflow disagreement. blc3_2bit_guppi_57451_27034_HIP69732_0025.22306.831.18.27.124.vlar Me Spike count: 1 Autocorr count: 0 Pulse count: 28 Triplet count: 1 Gaussian count: 0 The others Spike count: 20 Autocorr count: 0 Pulse count: 9 Triplet count: 1 Gaussian count: 0 Spike count: 20 Autocorr count: 0 Pulse count: 9 Triplet count: 1 Gaussian count: 0 Grant Darwin NT |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
All of the OSX Apps were compiled with version r3551. They have different numbers so they have different Wisdom files. None of them use the SoG path. So... I built an ATI version of SoG from r3556 and it appears to be a little faster than the non-SoG version. I tried it with the Intel path and I'm receiving an Error on Kernel build, this happens on the ATI card; Build features: SETI8 Non-graphics OpenCL USE_OPENCL_INTEL OCL_ZERO_COPY SIGNALS_ON_GPU OCL_CHIRP3 FFTW JSPF SSSE3 64bit OpenCL-kernels filename : MultiBeam_Kernels_r3557.cl On the NV cards it's not leaving a stderr, but saying, ./benchmark: line 145: 589800 / 0: division by 0 (error token is "0") I tried it without using the benchmark App and it says; OpenCL-kernels filename : MultiBeam_Kernels_r3557.cl It works fine with the ATI HD5 build. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
r3556 didn't make friends with GT470 so builds updated: https://cloud.mail.ru/public/HwkP/tvN9YmvVp SETI apps news We're not gonna fight them. We're gonna transcend them. |
Jeff Buck Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
These three Inconclusive overflow WUs might be worth watching. On each, the r3556 on my host disagreed with a host running r3528. The potential tiebreakers are assigned as stock Windows CPU tasks on what appear to be reliable hosts. Workunit 2318052329 (18no09ai.8385.9065.6.33.251) Task 5269109797 (S=24, A=6, P=0, T=0, G=0) v8.19 (opencl_nvidia_SoG) windows_intelx86 Task 5269109798 (S=25, A=5, P=0, T=0, G=0) SSE3xj Win32 Build 3556 Workunit 2318310141 (17dc09ab.4125.21340.12.39.4) Task 5269651782 (S=14, A=13, P=0, T=3, G=0) v8.19 (opencl_nvidia_SoG) windows_intelx86 Task 5269651783 (S=16, A=11, P=0, T=3, G=0) SSE3xj Win32 Build 3556 Workunit 2318633990 (blc2_2bit_guppi_57424_80118_HIP9480_0003.6148.416.17.26.240.vlar) Task 5270333468 (S=21, A=0, P=7, T=2, G=0) v8.19 (opencl_nvidia_SoG) windows_intelx86 Task 5270333469 (S=30, A=0, P=0, T=0, G=0) SSE3xj Win32 Build 3556 EDIT: I also have two more examples where a tiebreaker r3556 task that ran on one of my hosts for an Inconclusive overflow WU matched a stock CPU app result (one Windows, one Linux), rather than the r3528 result. (See WUs for details.) Workunit 2318235909 (17dc09ab.20636.16841.11.38.248) Task 5269495978 (S=28, A=2, P=0, T=0, G=0) v8.00 windows_intelx86 Task 5269495979 (S=27, A=3, P=0, T=0, G=0) v8.19 (opencl_nvidia_SoG) windows_intelx86 Workunit 2318768911 (18no09ai.27410.13564.15.42.0) Task 5270620630 (S=29, A=0, P=0, T=1, G=0) v8.00 x86_64-pc-linux-gnu Task 5270620631 (S=25, A=0, P=4, T=1, G=0) v8.19 (opencl_nvidia_SoG) windows_intelx86 Still looking very good and consistent in matching the stock apps, albeit only 2 days in. |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
... As a side note to the petri special features, in digging into potential reasons for my test Windows build host 'freaking out', I stumbled across the new telemetry debacle surrounding newish nVidia drivers. Frankly, as a developer, I'm ropable (and I believe most nVidia users should be too). While I doubt the specific instability in my case is directly related to the said telemetry injection, with some BOINC and OS/driver components known to be 'magic number' time sensitive, there is a small chance the new data mining features induce faults in a way similar to BoincApi's near useless stderr crash reports... i.e. looking for internet resources when it shouldn't be, inducing unexpected delays. [and of course a layer of complexity that isn't supposed to be there...] I've used the reddit/Sysinternals Autoruns workaround to disable telemetry; now to see if this data mining driver/code injection has compromised otherwise working code. If it proves to be the case, you may find this developer invoicing nVidia for lost time. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Jeff Buck Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
I suppose that first I should finally get around to updating my copy of Autoruns (v9.02, (c) 2007). I was thinking I did that last year, but guess I never got a round tuit. :^) I don't see anything in those NVIDIA telemetry articles that specify exactly what the "latest" drivers are that they're so worked up about. However, since the newest on any of my boxes is 359.00, I don't think I'll need to worry for awhile. Wouldn't you be able to rule out the telemetry as a possible cause of those differing Pulse periods simply by an offline run of those WUs with the special app under different conditions? I'm pretty sure Raistmer saved the WU from the first one I posted a few weeks ago, and I've got this latest one squirreled away, if you want it. |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
I suppose that first I should finally get around to updating my copy of Autoruns (v9.02, (c) 2007). I was thinking I did that last year, but guess I never got a round tuit. :^) The main reference is in This reddit thread which is discussing This Article on major geeks. In my special case (previously perfectly fine running old Core2Duo driving an obviously unbalanced GTX 980) I'd say I can't rule anything out just yet. The reason a 'laboratory' offline run may not reproduce such faults is simply that the conditions are different to live running (nothing more). 'normally' Boinc would be doing network stuff, and I'd probably be watching youtube or twitch streams. Laboratory conditions prove laboratory conditions work, as opposed to throwing in extra variables. As soon as we introduce hidden telemetry, then we introduce undocumented unknowns, so variables into the mix. I've yet to see a driver reset since the autoruns workaround described, but we'll see. It could be a case of m$ malevolent narcissism having finally mutated into cancer. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
I added the ATI/AMD HD5 SoG App to that pack. It is about 3-5% faster on my machine, but, it still has the Progress bar anomaly. It was compiled with the MacOSX10.9.sdk so probably needs Darwin 13.x to work. The r3556 build works slightly better than r3557. I still haven't been able to compile a working Intel App with the SoG path even with r3557. My guess is the SoG path doesn't work with the Intel iGPU path...on a Mac anyway. |
rob smith Send message Joined: 7 Mar 03 Posts: 22300 Credit: 416,307,556 RAC: 380 |
Two "validation errors", one waiting, and one just sent out... http://setiathome.berkeley.edu/workunit.php?wuid=2316673486 Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
I'm afraid this particular issue is simpler: just a lack of required synchronisation in the result reduction. So, from time to time, it picks the wrong one. I know that some ways of parallelizing pulsefind are very tempting, but proper synching then kills the benefits. But it should be fixed anyway. Non-overflows are affected too. SETI apps news We're not gonna fight them. We're gonna transcend them. |
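As a rough illustration of the kind of problem described here, the sketch below contrasts a reduction whose answer depends on the order in which candidates arrive with one that does not. It is a toy model in plain Python, not the special app's code: the `Candidate` fields, the two reduce functions, and the sample values are all hypothetical, and a real GPU race involves unsynchronised memory writes rather than a shuffled list.

```python
from itertools import permutations
from typing import NamedTuple


class Candidate(NamedTuple):
    score: float   # hypothetical pulse score
    period: float  # hypothetical pulse period
    index: int     # position in the original, serial search order


def arrival_order_reduce(cands):
    """Keep the best score; on a tie, whichever candidate arrives later wins."""
    best = cands[0]
    for c in cands[1:]:
        if c.score >= best.score:
            best = c
    return best


def ordered_reduce(cands):
    """Keep the best score; on a tie, the lowest original index always wins."""
    return max(cands, key=lambda c: (c.score, -c.index))


cands = [
    Candidate(1.5, 0.25, 0),
    Candidate(2.0, 0.5, 1),   # tied for the best score
    Candidate(2.0, 0.75, 2),  # tied for the best score
    Candidate(1.0, 1.0, 3),
]

# Try every arrival order, standing in for unpredictable thread timing.
arrival_periods = set()
ordered_periods = set()
for order in permutations(cands):
    arrival_periods.add(arrival_order_reduce(list(order)).period)
    ordered_periods.add(ordered_reduce(list(order)).period)

print(sorted(arrival_periods))  # [0.5, 0.75]
print(sorted(ordered_periods))  # [0.5]
```

The first reduction reports either period depending on timing, which is one way two runs of the same workunit could disagree on a Pulse. The second gives the same answer for every ordering, at the cost of carrying an extra key through the reduction.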
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Two "validation errors", one waiting, and one just sent out... These are server-related, so not very interesting in the context of this thread. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
On Windows too. Please do a full rebuild with the latest r3557 to avoid any of the issues that this last rev fixes. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Jeff Buck Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
I don't know that there's anything to be learned from this one, but it's hard to ignore a quintuple Inconclusive. Workunit 2262356860 (blc4_2bit_guppi_57449_42556_HIP78775_0009.14621.416.18.27.57.vlar) Task 5151207901 (S=23, A=0, P=7, T=0, G=0) SSE3xj Win32 Build 3500 Task 5265978903 (S=12, A=0, P=18, T=0, G=0) v8.19 (opencl_ati5_cat132) windows_intelx86 Task 5267523392 (S=19, A=0, P=11, T=0, G=0) v8.19 (opencl_nvidia_SoG) windows_intelx86 Task 5268933831 (S=26, A=0, P=4, T=0, G=0) v8.00 (opencl_intel_gpu_sah) x86_64-apple-darwin Task 5270421343 (S=12, A=0, P=18, T=0, G=0) v8.19 (opencl_nvidia_SoG) windows_intelx86 My host is the first one, which ran r3500 way back on 12 Sep, but my original wingman finally timed out last Friday. Since then, one host per day has been added to the Inconclusive gaggle. The latest potential tiebreaker is set to run with the stock Windows CPU app, so hopefully it will match with something. I just have no idea which one! |
Jeff Buck Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
I don't know that there's anything to be learned from this one, but it's hard to ignore a quintuple Inconclusive. Crappola! The Windows CPU app didn't match any of them, with (S=25, A=0, P=5, T=0, G=0), and now the WU is marked as "Too many results (may be nondeterministic)". Each individual task ended up with a status of "Completed, can't validate", with 0.00 credit for all their wasted effort. 5151207901 8064262 12 Sep 2016, 1:18:42 UTC 12 Sep 2016, 20:37:05 UTC Completed, can't validate 1,345.85 196.03 0.00 SETI@home v8 Anonymous platform (NVIDIA GPU) 5265978903 8103780 4 Nov 2016, 10:37:14 UTC 4 Nov 2016, 20:43:10 UTC Completed, can't validate 1,091.56 213.80 0.00 SETI@home v8 v8.19 (opencl_ati5_cat132) windows_intelx86 5267523392 8083187 5 Nov 2016, 1:49:15 UTC 5 Nov 2016, 10:36:02 UTC Completed, can't validate 1,010.88 1,005.91 0.00 SETI@home v8 v8.19 (opencl_nvidia_SoG) windows_intelx86 5268933831 7929271 5 Nov 2016, 14:58:45 UTC 6 Nov 2016, 0:12:22 UTC Completed, can't validate 1,205.80 92.72 0.00 SETI@home v8 v8.00 (opencl_intel_gpu_sah) x86_64-apple-darwin 5270421343 7433014 6 Nov 2016, 5:08:07 UTC 7 Nov 2016, 18:21:52 UTC Completed, can't validate 893.83 881.06 0.00 SETI@home v8 v8.19 (opencl_nvidia_SoG) windows_intelx86 5273827969 8011878 7 Nov 2016, 20:02:37 UTC 8 Nov 2016, 13:53:27 UTC Completed, can't validate 7,101.45 7,089.08 0.00 SETI@home v8 v8.00 windows_intelx86 I eagerly await the rollout of the latest SoG and Intel GPU apps to full production status. |
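For a WU like this one, it can help to group the reported tallies and see which tasks at least agree on signal counts. The sketch below does only that, using the six (S, A, P, T, G) tallies quoted in the two posts above; `group_by_tally` is a made-up helper, and comparing counts alone is an assumption for illustration, not how the project's validator decides a match.

```python
from collections import defaultdict

# (S, A, P, T, G) tallies for Workunit 2262356860, as quoted in this thread.
results = {
    "5151207901": (23, 0, 7, 0, 0),
    "5265978903": (12, 0, 18, 0, 0),
    "5267523392": (19, 0, 11, 0, 0),
    "5268933831": (26, 0, 4, 0, 0),
    "5270421343": (12, 0, 18, 0, 0),
    "5273827969": (25, 0, 5, 0, 0),
}


def group_by_tally(results):
    """Return only the tallies reported by more than one task."""
    groups = defaultdict(list)
    for task_id, tally in results.items():
        groups[tally].append(task_id)
    return {tally: ids for tally, ids in groups.items() if len(ids) > 1}


agreeing = group_by_tally(results)
print(agreeing)  # {(12, 0, 18, 0, 0): ['5265978903', '5270421343']}
```

Two of the six tasks report identical counts, yet the WU still ended as "Completed, can't validate". That suggests matching counts are not enough on their own and that the reported signals themselves presumably differed between those two results.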