Monitoring inconclusive GBT validations and harvesting data for testing

Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1810665 - Posted: 20 Aug 2016, 7:48:18 UTC

Having this 'pause' in GBT splitting is really useful. Normally, the vast majority of WUs pass straight through the system and down the pan at the first attempt. But a period of 'resends only' allows the real turds to float to the top for screening. Once all my 'initial split' tasks (_0 and _1) are out of the way - a few more hours yet - I might harvest the remaining data files so we have a representative crop for use in future application testing.
ID: 1810665
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1810669 - Posted: 20 Aug 2016, 7:54:01 UTC - in response to Message 1810665.  

Good thanks. Will be delighted to stand up for my portion of the floaters, and justify my growing saltiness.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1810669
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1810688 - Posted: 20 Aug 2016, 10:00:22 UTC

Down to three Messier tapes at 09:40 UTC, so one of the GBT splitters will have switched back to HIPs - and it won't take long for the rest to follow.

With a reduced RTS, and faster returns for pure Arecibo work, I reckon the optimum time for a floater harvest will be around 15:00 UTC. I'll try and clear the pipes by then, if I can work around yet another hardware failure. The single most critical (and frequent) failure point for my little shrubbery is the £5 wall-wart that powers the KVM switch. This one's "lifetime warranty" lasted about six months - and I need it for real-world, time-sensitive work tomorrow (Sunday). Bah.
ID: 1810688
Kiska
Volunteer tester

Joined: 31 Mar 12
Posts: 302
Credit: 3,067,762
RAC: 0
Australia
Message 1810695 - Posted: 20 Aug 2016, 11:18:49 UTC - in response to Message 1810688.  

Are there any particular files that need harvesting? I can see if I can get some downloaded and uploaded to Google Drive.
ID: 1810695
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1810700 - Posted: 20 Aug 2016, 12:00:50 UTC - in response to Message 1810695.  

Thanks, but I don't think that will be necessary.

I'm not planning to download anything - simply to copy the datafiles which are already sitting on my home machines, waiting to be processed. But choosing specifically those tasks where there's already some evidence of a validation difficulty.
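
A minimal sketch of that selection step, assuming task names follow the <workunit>_<N> pattern used in this thread (_0 and _1 for the initial split, _2 and up for resends) and that each data file is named after its workunit. The function names and the directory arguments are hypothetical illustrations, not part of BOINC or any SETI@home tool:

```python
import re
import shutil
from pathlib import Path

# Assumption: a task name is "<workunit name>_<N>", where N is the
# replication index (_0 and _1 = initial split, _2 and up = resends).
TASK_RE = re.compile(r"^(?P<wu>.+)_(?P<index>\d+)$")


def resend_workunits(task_names, min_index=2):
    """Return the workunit names of tasks whose index is >= min_index."""
    selected = []
    for name in task_names:
        match = TASK_RE.match(name)
        if match and int(match.group("index")) >= min_index:
            selected.append(match.group("wu"))
    return selected


def harvest(task_names, project_dir, archive_dir):
    """Copy the data file of each resend workunit into archive_dir."""
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    copied = []
    for wu in resend_workunits(task_names):
        # Assumption: the data file in project_dir is named after the workunit.
        source = Path(project_dir) / wu
        if source.is_file():
            shutil.copy2(source, archive / wu)
            copied.append(wu)
    return copied
```

Copying rather than moving leaves the originals in place, so the tasks can still be processed and reported normally.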

Then, I'll be following the progress of those workunits, and looking for patterns. Not for evidence of individual badly-maintained or over-stressed hosts, but of systemic errors in particular application builds.

Top of the list will be Petri's "special" code, because that is still under active development out here in the volunteer community, and shows great promise - if only the inaccuracies can be ironed out. I'm hoping that having a stock of WUs known to trigger the 'inconclusive' outcome will allow the users and developers affected - perhaps after the challenge is over - to run the harvested WUs offline under bench conditions, and find out exactly what the differences in the result files are. That's the first step in responsible debugging.

I'll also be keeping an eye open for examples of the other turd-droppers listed by Jason - such as those awful stock apple-darwin apps - but most of them are less amenable to fixing by the external community.

All assuming I can get my monitor to light up again. Off into town now to try and source a replacement for that wall-wart - probably end up paying £20 at Maplin for a £5 part.
ID: 1810700
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1810741 - Posted: 20 Aug 2016, 15:22:14 UTC - in response to Message 1810700.  

I would suggest you concentrate on the older Apps that have already been identified as troublesome rather than Apps that are updated regularly. This has already been posted;
So, Darwin 15.4, 15.5.
Ok, this match perfectly with what Urs supplied to me yesterday.
Will try to get exclusion of these OS versions.

You can add Darwin 15.6 & 16.0 to that list now. For the Laptops the list would be 15.0-16.0.
Here are a few machines I've got marked,
http://setiathome.berkeley.edu/results.php?hostid=1575265
http://setiathome.berkeley.edu/results.php?hostid=6787046
http://setiathome.berkeley.edu/results.php?hostid=6134063
There are many others.

The problems with the AVX CPUs in Darwin 11.4.2 have been known about since the MBv7 days.
I'm still trying to figure out why a handful of ATI HD4 cards can't be excluded from being sent the HD5 App.
Those few items would cure many problems with Inconclusive results.
Then there are all those Intel iGPUs...
ID: 1810741
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1810747 - Posted: 20 Aug 2016, 15:39:34 UTC - in response to Message 1810741.  

I would suggest you concentrate on the older Apps that have already been identified as troublesome rather than Apps that are updated regularly.

I'll gladly identify task data files which appear to trigger the instabilities, and make them available to testers with offline capability. But could you help me identify active developers who would be capable of fixing the code? If we can demonstrate that we have supplied better-performing applications (measured by percentage of first-time validation, nothing to do with speed) - perhaps through anonymous platform running - I'd feel more confident about approaching Eric and the other lab staff to seek a stock upgrade.
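
The "percentage of first-time validation" measure could be tallied along these lines. This is only a sketch: the input format (pairs of application name and a flag saying whether the task validated at the first comparison) is an assumption, and the application labels in any real run would come from whatever notes the tester keeps:

```python
from collections import defaultdict


def first_time_validation_rate(results):
    """Map app name -> percentage of tasks validated at the first comparison.

    `results` is an iterable of (app_name, validated_first_time) pairs;
    this input shape is a hypothetical choice for the sketch.
    """
    totals = defaultdict(int)
    passed = defaultdict(int)
    for app, ok in results:
        totals[app] += 1
        if ok:
            passed[app] += 1
    return {app: 100.0 * passed[app] / totals[app] for app in totals}
```

Speed does not enter into it at all, which matches the point above: an application only counts as better here if more of its results validate at the first attempt.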
ID: 1810747
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1810750 - Posted: 20 Aug 2016, 15:57:42 UTC - in response to Message 1810747.  
Last modified: 20 Aug 2016, 16:01:10 UTC

I believe you are aware of the developers.
Look at these results, http://setiweb.ssl.berkeley.edu/beta/results.php?hostid=73656
Note how the one App was producing Valid results and then the machine hit two opencl_nvidia_mac tasks resulting in two Inconclusive results. There are a few of those machines at Beta; unfortunately there was never an announcement on the App's availability. Currently there are few people testing the App at Beta. The SSSE3 CPU App that could run on those Darwin 11.4.2 AVX CPUs was offered to Beta at the same time as the CUDA Apps, but the CPU App didn't make it.
ID: 1810750
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1810754 - Posted: 20 Aug 2016, 16:05:14 UTC - in response to Message 1810747.  

Also still wrestling with the challenges of being down a key man especially with the CPU builds. No solutions from my direction yet, though might come across something during x42 Cuda that could aid some of the issues.
ID: 1810754
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1810758 - Posted: 20 Aug 2016, 16:16:15 UTC - in response to Message 1810754.  

Also still wrestling with the challenges of being down a key man ...

Two key men if you include Charlie Fenton, who would be Eric's go-to man for stock Mac builds, but was lost in the BOINC NSF cull - he hung around for a while, but I haven't seen him posting since early May.
ID: 1810758
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1810761 - Posted: 20 Aug 2016, 16:32:41 UTC - in response to Message 1810758.  

... BOINC NSF cull ...

Ugh, oh yeah, that... might be something worth bouncing off the committee then. Maybe some compute-based project with Mac developers would be amenable to some exchange? Just musing.
ID: 1810761
Jeff Buck Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester

Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1810776 - Posted: 20 Aug 2016, 17:39:01 UTC - in response to Message 1810642.  

So far, a Windows CPU, a Mac NVIDIA GPU and a Windows Intel GPU all disagree, while a Windows Cuda50 timed out and a Mac ATI GPU crapped out. My Win7 host is next in line and will probably run it as Cuda50 sometime tomorrow, unless I reschedule it to the CPU or to SoG. Let's see....what might produce the most interesting result? Hmmm...

Just a quick follow-up on WU 2192117866. I chose to run my _5 task with SoG this morning. The result agreed with the Windows CPU (_0 host). The Mac NVIDIA and the Intel GPU were close enough to also get validated. Richard, if this is the sort of mixed-results WU you're looking to archive, I have it saved, just in case.
ID: 1810776
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1810851 - Posted: 20 Aug 2016, 22:37:56 UTC - in response to Message 1810776.  

Sorry, took a bit of a breather while my machines flushed the remainder of the uninteresting caches (i.e. not resends). A couple are still doing that - may have to wait until morning.

I think you've got the right idea - make a note of what the original reason for the inconclusiveness was, and return later to see how many validate in the end. The most interesting one I've got so far is WU 2239728586, which is a triple inconclusive between stock CPU vs. stock nvidia_mac vs. petri special being run by -= Vyper =-. I've got the tie-breaker on an optimised AVX. The Mac and the Petri both record one more pulse than the stock, but must differ somewhere - they both have printed signal summaries, so if no-one beats me to it, I can try a visual comparison in the morning.

(That phrase "stock nvidia_mac" is appearing far too often in my working notes already)
ID: 1810851
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1811304 - Posted: 22 Aug 2016, 9:05:17 UTC
Last modified: 22 Aug 2016, 9:51:36 UTC

Placeholder post for thread separation - please don't post here until threads rebuilt.

Edit - OK, thread separation complete, feel free to join the conversation.
ID: 1811304
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1811307 - Posted: 22 Aug 2016, 9:53:26 UTC

I think it would only be fair to link back to Jeff Buck's message 1810642, which sparked the whole idea off.
ID: 1811307
Kiska
Volunteer tester

Joined: 31 Mar 12
Posts: 302
Credit: 3,067,762
RAC: 0
Australia
Message 1811335 - Posted: 22 Aug 2016, 12:55:13 UTC

I have harvested some workunits that have tasks numbered _2 or higher and have them in a Google Drive folder. The folder is set to read-only, but I can adjust that if needed.
Included are the data file and the results file.
Drive
ID: 1811335
Kiska
Volunteer tester

Joined: 31 Mar 12
Posts: 302
Credit: 3,067,762
RAC: 0
Australia
Message 1811336 - Posted: 22 Aug 2016, 13:03:23 UTC
Last modified: 22 Aug 2016, 13:17:33 UTC

Workunit analysis via stderr.

Pulse: peak=1.450148, time=45.82, period=2.1, d_freq=1209298291.57, score=1.002, chirp=-48.902, fft_len=256
Pulse: peak=6.229137, time=45.99, period=16.37, d_freq=1209299147.79, score=1.012, chirp=-54.71, fft_len=4k

The pulses above appear on the AVX CPU build but not on the GPU.
Pulse: peak=1.284491, time=45.82, period=1.556, d_freq=1209295817.29, score=1.005, chirp=-56.07, fft_len=256
Pulse: peak=7.644391, time=45.9, period=22.43, d_freq=1209299911.74, score=1.003, chirp=60.614, fft_len=2k
Pulse: peak=5.469792, time=45.9, period=13.96, d_freq=1209300442.47, score=1.003, chirp=95.306, fft_len=2k

The pulses above appear on the GPU but not on the AVX CPU build.

5 differing pulses on both sides are not recorded here.
I am only comparing the frequency of each pulse/triplet.

Phenom running SSE3
Spike count: 0
Autocorr count: 0
Pulse count: 24
Triplet count: 1
Gaussian count: 0

Mac OpenCL Nvidia
Spike count: 0
Autocorr count: 0
Pulse count: 25
Triplet count: 1
Gaussian count: 0

Petri special code
Spike count: 0
Autocorr count: 0
Pulse count: 25
Triplet count: 1
Gaussian count: 0

AVX optimised app
Spike count: 0
Autocorr count: 0
Pulse count: 24
Triplet count: 1
Gaussian count: 0

Looks like the CPU apps report one less pulse than the GPU apps. All CPUs agree on 0/0/24/1/0, and all GPUs agree on 0/0/25/1/0. Hmmm... interesting.
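
A frequency-only comparison like the one described above might be scripted as follows. This is a sketch under two assumptions: that the stderr lines have the "Pulse: key=value, ..." shape shown in this post, and that matching on the printed d_freq text is good enough. The function names are made up for illustration:

```python
import re

PULSE_RE = re.compile(r"^Pulse: (?P<fields>.+)$")


def parse_pulses(stderr_text):
    """Return a dict mapping d_freq (as printed) to the pulse's field dict."""
    pulses = {}
    for line in stderr_text.splitlines():
        match = PULSE_RE.match(line.strip())
        if not match:
            continue
        # Each field looks like "name=value"; values are kept as text.
        fields = dict(
            item.split("=", 1) for item in match.group("fields").split(", ")
        )
        pulses[fields["d_freq"]] = fields
    return pulses


def diff_pulses(stderr_a, stderr_b):
    """Return (only_in_a, only_in_b) as sorted lists of d_freq strings."""
    a, b = parse_pulses(stderr_a), parse_pulses(stderr_b)
    return sorted(a.keys() - b.keys()), sorted(b.keys() - a.keys())
```

Because the frequencies are compared as exact text, two apps that print the same signal with different rounding would show up as a mismatch; a numeric comparison with a tolerance would be needed to rule that out.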
ID: 1811336
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1811340 - Posted: 22 Aug 2016, 13:34:01 UTC

Here's an interesting one;
blc5_2bit_guppi_57451_69387_HIP117559_0023.13978.0.18.27.23.vlar
All three Hosts report;
Spike count:    3
Autocorr count: 1
Pulse count:    22
Triplet count:  4
Gaussian count: 0

All three Hosts are Inconclusive.
ID: 1811340
Kiska
Volunteer tester

Joined: 31 Mar 12
Posts: 302
Credit: 3,067,762
RAC: 0
Australia
Message 1811347 - Posted: 22 Aug 2016, 13:47:52 UTC - in response to Message 1811340.  

Analysis of workunit via stderr

Pulse: peak=3.748531, time=45.99, period=8.545, d_freq=1163343856.57, score=1.051, chirp=4.6176, fft_len=4k
Pulse: peak=5.39325, time=45.99, period=14.5, d_freq=1163343830.48, score=1.013, chirp=5.3259, fft_len=4k
Pulse: peak=5.36132, time=45.99, period=14.05, d_freq=1163343598.39, score=1.008, chirp=11.761, fft_len=4k


The ATI host had these pulses but the Intel one didn't.

Pulse: peak=5.636717, time=45.86, period=13, d_freq=1163348386.64, score=1.012, chirp=83.249, fft_len=1024
Pulse: peak=7.938285, time=45.86, period=22.85, d_freq=1163353871.61, score=1.016, chirp=-83.742, fft_len=1024
Pulse: peak=2.082857, time=45.84, period=3.294, d_freq=1163350491.92, score=1.036, chirp=84.358, fft_len=512


The Intel host had these pulses but the ATI one didn't.

The CPU result cannot be analysed because it doesn't print out where it found its results.
ID: 1811347
Kiska
Volunteer tester

Joined: 31 Mar 12
Posts: 302
Credit: 3,067,762
RAC: 0
Australia
Message 1811354 - Posted: 22 Aug 2016, 14:08:06 UTC
Last modified: 22 Aug 2016, 14:08:20 UTC

Hey, look: another one where the 3 PCs do not agree with each other. Unfortunately there is no stderr output showing where each found its results.
Workunit

Cuda42 says:
Spike count: 30
Autocorr count: 0
Pulse count: 0
Triplet count: 0
Gaussian count: 0

Intel Xeon E3-1230 v3 says:
Spike count: 0
Autocorr count: 1
Pulse count: 5
Triplet count: 4
Gaussian count: 0

Unknown Nvidia GPU says:

Spike count: 0
Autocorr count: 1
Pulse count: 3
Triplet count: 4
Gaussian count: 0

So many differences...
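
For a quick first pass over summaries like these, the five counts can be reduced to a tuple and the hosts grouped by it. A rough sketch, assuming the "<Kind> count: <n>" lines appear as printed above; the helper names and the host labels passed in are hypothetical:

```python
import re
from collections import defaultdict

COUNT_RE = re.compile(r"^(Spike|Autocorr|Pulse|Triplet|Gaussian) count:\s*(\d+)$")
ORDER = ("Spike", "Autocorr", "Pulse", "Triplet", "Gaussian")


def count_signature(summary_text):
    """Return the counts as a tuple in ORDER, e.g. (0, 0, 24, 1, 0)."""
    counts = {}
    for line in summary_text.splitlines():
        match = COUNT_RE.match(line.strip())
        if match:
            counts[match.group(1)] = int(match.group(2))
    # A kind that is missing from the text is treated as 0 in this sketch.
    return tuple(counts.get(kind, 0) for kind in ORDER)


def group_by_signature(summaries):
    """Map each signature to the list of host labels that reported it."""
    groups = defaultdict(list)
    for host, text in summaries.items():
        groups[count_signature(text)].append(host)
    return dict(groups)
```

Matching counts are only a coarse filter: in TBar's example earlier in the thread all three hosts report identical counts and are still inconclusive, so the individual signals have to be compared as well.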
ID: 1811354
 