SETI@home | Task postponed: Suspicious gaussian results, host needs reboot or maintenance

Microns
Joined: 19 May 99
Posts: 12
Credit: 13,074,878
RAC: 14
United States
Message 1789649 - Posted: 23 May 2016, 2:01:34 UTC

Have now seen this error message in the Event Log several times in the last 3 days. What is the cause? Thanks!
ID: 1789649 · Report as offensive
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1789744 - Posted: 23 May 2016, 9:54:08 UTC - in response to Message 1789649.  

Have you followed its advice, and rebooted?

When several tasks have a dubious outcome, something in the computer may be in a bad state that a reboot can correct. Think of a flipped bit in memory, or something stuck on the hard drive.
ID: 1789744 · Report as offensive
Microns
Joined: 19 May 99
Posts: 12
Credit: 13,074,878
RAC: 14
United States
Message 1789753 - Posted: 23 May 2016, 11:50:08 UTC - in response to Message 1789744.  

Yes, with each occurrence.
ID: 1789753 · Report as offensive
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1789784 - Posted: 23 May 2016, 15:13:58 UTC - in response to Message 1789753.  
Last modified: 23 May 2016, 15:14:26 UTC

Did any of those tasks finish normally or in error? Or do they validate correctly?
Can you link to a result file? You have your computer(s) hidden so I cannot check whether you have errors or not.
ID: 1789784 · Report as offensive
Microns
Joined: 19 May 99
Posts: 12
Credit: 13,074,878
RAC: 14
United States
Message 1789896 - Posted: 24 May 2016, 0:12:28 UTC - in response to Message 1789784.  

Several recent Validation Inconclusive:
http://setiathome.berkeley.edu/results.php?userid=8150&offset=0&show_names=0&state=3&appid=
For example:
http://setiathome.berkeley.edu/result.php?resultid=4939723313

Several recent Invalid:
http://setiathome.berkeley.edu/results.php?userid=8150&offset=0&show_names=0&state=5&appid=
For example:
http://setiathome.berkeley.edu/result.php?resultid=4939722803

All appear to be GPU accelerated tasks.
ID: 1789896 · Report as offensive
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1789992 - Posted: 24 May 2016, 9:27:06 UTC - in response to Message 1789896.  

OK, I'll try to get the Mac developer of the Seti v8 application to come and look.

Do you know what driver you use for the AMD GPU?
Did you install it yourself from the AMD site, or is it included in your OS X?
Which OS X is this?
ID: 1789992 · Report as offensive
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1790040 - Posted: 24 May 2016, 14:24:36 UTC

For the attention of the developer(s) of the opencl_ati5_SoG_mac AMD GPU OpenCL application, this is what it returns on invalids:

Host: http://setiathome.berkeley.edu/show_host_detail.php?hostid=7962642

<core_client_version>7.6.22</core_client_version>
<![CDATA[
<stderr_txt>
OpenCL platform detected: Apple
Number of OpenCL devices found : 1 
BOINC assigns slot on device #0.
Info: BOINC provided OpenCL device ID used

Build features: SETI8 Non-graphics OpenCL USE_OPENCL_HD5xxx OCL_ZERO_COPY SIGNALS_ON_GPU OCL_CHIRP3 FFTW SSE3 64bit 
 System: Darwin  x86_64  Kernel: 15.5.0
CPU : Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz 
 GenuineIntel x86, Family 6 Model 94 Stepping 3
 Features : FPU TSC PAE APIC MTRR MMX SSE  SSE2 HT  SSE3 SSSE3 SSE4.1 SSE4.2 AVX1.0  

OpenCL-kernels filename : MultiBeam_Kernels_r3430.cl 
ar=0.470383  NumCfft=186671  NumGauss=1005077640  NumPulse=112911491163  NumTriplet=225881010637
Currently allocated 229 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
OS X optimized setiathome_v8 application
Version info: SSE3x (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSE3x OS X 64bit Build 3430 , Ported by : Raistmer, JDWhale, Urs Echternacht


OpenCL version by Raistmer, r3430

AMD HD5 version by Raistmer

Number of OpenCL platforms:				 1


 OpenCL Platform Name:					 Apple
Number of devices:				 1
  Max compute units:				 32
  Max work group size:				 256
  Max clock frequency:				 909Mhz
  Max memory allocation:			 1073741824
  Cache type:					 None
  Cache line size:				 0
  Cache size:					 0
  Global memory size:				 4294967296
  Constant buffer size:				 65536
  Max number of constant args:			 8
  Local memory type:				 Scratchpad
  Local memory size:				 32768
  Queue properties:				 
    Out-of-Order:				 No
  Name:						 AMD Radeon R9 M395X Compute Engine
  Vendor:					 AMD
  Driver version:				 1.2 (Apr 26 2016 00:27:37)
  Version:					 OpenCL 1.2 
  Extensions:					 cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_image2d_from_buffer cl_khr_depth_images cl_APPLE_command_queue_priority cl_APPLE_command_queue_select_compute_units cl_khr_fp64


Work Unit Info:
...............
Credit multiplier is :  2.85
WU true angle range is :  0.470383
Used GPU device parameters are:
	Number of compute units: 32
	Single buffer allocation size: 128MB
	Total device global memory: 4096MB
	max WG size: 256
	local mem type: Real
	LotOfMem path: yes
	LowPerformanceGPU path: no
period_iterations_num=50
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=2.103e+05, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=4.302e+05, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=6.501e+05, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=8.7e+05, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=1.09e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=1.31e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=1.53e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=1.75e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=1.97e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=2.189e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=2.409e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=2.629e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=2.849e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=3.069e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=3.289e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=3.509e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=3.729e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=3.949e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=4.169e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=4.388e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=4.608e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=4.828e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=5.048e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=5.268e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=5.488e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=5.708e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=5.928e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=6.148e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=6.368e+06, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Autocorr: peak=18.43179, time=73.82, delay=2.63, d_freq=1419679888.29, chirp=29.178, fft_len=128k
OpenCL queue synchronized
SETI@Home Informational message -9 result_overflow
NOTE: The number of results detected equals the storage space allocated.

Best spike: peak=22.93343, time=87.24, d_freq=1419679887.7, chirp=12.331, fft_len=128k
Best autocorr: peak=18.43179, time=73.82, delay=2.63, d_freq=1419679888.29, chirp=29.178, fft_len=128k
Best gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, time=2.103e+05, d_freq=1419677677.15,
	score=3.860626, null_hyp=2.295059, chirp=0, fft_len=2k
Best pulse: peak=1.374034, time=39.96, period=0.2771, d_freq=1419677733.43, score=0.9924, chirp=26.706, fft_len=64 
Best triplet: peak=0, time=-2.121e+11, period=0, d_freq=0, chirp=0, fft_len=0 


Flopcounter: 2745352304217.902344

Spike count:    0
Autocorr count: 1
Pulse count:    0
Triplet count:  0
Gaussian count: 29
Time cpu in use since last restart: 115.6 seconds

 Gaussian_transfer_not_needed       	 total=0.0000E+00, N=0         , <>=0         , min=0         , max=0          
 Gaussian_transfer_needed           	 total=0.0000E+00, N=0         , <>=0         , min=0         , max=0          


 Gaussian_skip1_no_peak             	 total=0         , N=0         , <>=0         , min=0         , max=0          
 Gaussian_skip2_bad_group_peak      	 total=0         , N=0         , <>=0         , min=0         , max=0          
 Gaussian_skip3_too_weak_peak       	 total=0         , N=0         , <>=0         , min=0         , max=0          
 Gaussian_skip4_too_big_ChiSq       	 total=0         , N=0         , <>=0         , min=0         , max=0          
 Gaussian_skip6_low_power           	 total=0         , N=0         , <>=0         , min=0         , max=0          


 Gaussian_new_best                  	 total=1         , N=1         , <>=1         , min=1         , max=1          
 Gaussian_report                    	 total=29        , N=29        , <>=1         , min=1         , max=1          
 Gaussian_miss                      	 total=0         , N=0         , <>=0         , min=0         , max=0          


 PC_triplet_find_hit                	 total=4.7450E+03, N=4745      , <>=1         , min=1         , max=1          
 PC_triplet_find_miss               	 total=9.9000E+01, N=99        , <>=1         , min=1         , max=1          


 PC_pulse_find_hit                  	 total=2.4140E+03, N=2414      , <>=1         , min=1         , max=1          
 PC_pulse_find_miss                 	 total=6.0000E+00, N=6         , <>=1         , min=1         , max=1          
 PC_pulse_find_early_miss           	 total=4.0000E+00, N=4         , <>=1         , min=1         , max=1          
 PC_pulse_find_2CPU                 	 total=1.0000E+00, N=1         , <>=1         , min=1         , max=1          


 PoT_transfer_not_needed            	 total=4.7410E+03, N=4741      , <>=1         , min=1         , max=1          
 PoT_transfer_needed                	 total=1.0400E+02, N=104       , <>=1         , min=1         , max=1          

GPU device sync requested...  ...GPU device synched
12:10:53 (1995): called boinc_finish(0)

</stderr_txt>
]]>
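
For anyone who wants to check their own results for the same pattern, here is a minimal Python sketch. It is not part of the SETI@home application. It assumes the stderr text uses the `Gaussian: peak=..., mean=..., ChiSq=..., time=..., d_freq=...,` layout shown in the log above; the function name `repeated_gaussians` and the `min_repeats` threshold are made up for illustration. It groups the reported Gaussians by peak, mean, ChiSq and d_freq, ignoring time, and returns any group that repeats often, as the 29 identical signals in this result do.

```python
import re
from collections import Counter

# Matches the first line of a "Gaussian:" report as printed in the stderr above.
# "Best gaussian:" lines are skipped because they do not start with "Gaussian:".
GAUSSIAN_RE = re.compile(
    r"^Gaussian: peak=(?P<peak>[^,]+), mean=(?P<mean>[^,]+), "
    r"ChiSq=(?P<chisq>[^,]+), time=(?P<time>[^,]+), d_freq=(?P<d_freq>[^,]+),",
    re.MULTILINE,
)


def repeated_gaussians(stderr_text, min_repeats=5):
    """Return {(peak, mean, chisq, d_freq): count} for signals reported
    at least min_repeats times.

    min_repeats is an arbitrary illustrative threshold, not a project value.
    """
    counts = Counter(
        (m["peak"], m["mean"], m["chisq"], m["d_freq"])
        for m in GAUSSIAN_RE.finditer(stderr_text)
    )
    return {key: n for key, n in counts.items() if n >= min_repeats}


if __name__ == "__main__":
    # Three report lines built from values in the log above, differing only in time.
    sample = "\n".join(
        "Gaussian: peak=5.100965, mean=0.3546413, ChiSq=1.105278, "
        f"time={t}, d_freq=1419677677.15,"
        for t in ("2.103e+05", "4.302e+05", "6.501e+05")
    )
    print(repeated_gaussians(sample, min_repeats=3))
```

Keeping the values as strings avoids any floating-point comparison: the check only asks whether the application printed exactly the same numbers more than once.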

ID: 1790040 · Report as offensive
Chris Adamek
Volunteer tester
Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1790041 - Posted: 24 May 2016, 14:37:45 UTC - in response to Message 1790040.  
Last modified: 24 May 2016, 14:51:23 UTC

There is an issue with the SoG app (possibly all the 8.10 apps) on OS X 10.11.4 and 10.11.5 with the Mac D300, D500, D700 and R9 M3xx series cards. It doesn't happen on 10.11.3 or below and apparently doesn't happen on the unreleased 10.12.0. If you can downgrade your OS, possibly to 10.11.3 or for sure to 10.10.x, you won't have any issues. If you don't want to downgrade, you can create an app_info.xml file and use the previous app. The issue is being looked at, but right now it seems to mostly be a driver/OS incompatibility. The bad news is that the only driver is supplied with the OS, so you can't update it by itself. The good news is that it looks like it's fixed in the next OS, which will be out to developers and previewers after the Apple developer conference in a couple of weeks.

That said, the known issue has so far presented itself as not reporting any Gaussians. Finding too many of anything is more often attributed to problems with the GPU itself, maybe overheating, maybe glitched memory, which a reboot will often resolve. You haven't run enough work units to really tell what's going on. The ati5 version of 8.10 seems to be finding Gaussians fine, as were the 8.00 versions of the app. One errant WU that does something crazy just happens sometimes. If you start getting tons of these, then it's something to worry about.
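
As a rough illustration of the affected combination, here is a small Python sketch that reads the kernel version and GPU name out of a task's stderr and checks them against the list in this post. The function names `parse_host` and `looks_affected` are hypothetical, and the mapping of Darwin kernels 15.4.0 and 15.5.0 to OS X 10.11.4 and 10.11.5 is an assumption (the log in this thread shows Kernel 15.5.0 on a 10.11.5 host). Treat it as a sketch, not a diagnostic.

```python
import re

# Assumption: Darwin kernels 15.4.0 and 15.5.0 correspond to OS X 10.11.4 and 10.11.5.
AFFECTED_KERNELS = {"15.4.0", "15.5.0"}

# GPU families named in this thread: D300, D500, D700 and the R9 M3xx series.
AFFECTED_GPU_RE = re.compile(r"\b(D300|D500|D700|R9 M3\d\d\w*)\b")


def parse_host(stderr_text):
    """Extract (kernel_version, gpu_name) from stderr text.

    Fields that cannot be found are returned as None.
    """
    kernel = re.search(r"Kernel:\s*(\S+)", stderr_text)
    # The device line starts with "Name:"; "OpenCL Platform Name:" is not matched.
    gpu = re.search(r"^\s*Name:\s*(.+?)\s*$", stderr_text, re.MULTILINE)
    return (
        kernel.group(1) if kernel else None,
        gpu.group(1) if gpu else None,
    )


def looks_affected(stderr_text):
    """True if both the kernel and the GPU match the combination described above."""
    kernel, gpu = parse_host(stderr_text)
    if kernel not in AFFECTED_KERNELS or gpu is None:
        return False
    return AFFECTED_GPU_RE.search(gpu) is not None


if __name__ == "__main__":
    log = (
        " System: Darwin  x86_64  Kernel: 15.5.0\n"
        "  Name:\t\t\t AMD Radeon R9 M395X Compute Engine\n"
    )
    print(parse_host(log), looks_affected(log))
```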

Thanks,

Chris
ID: 1790041 · Report as offensive
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1790057 - Posted: 24 May 2016, 21:12:44 UTC

The OS X apps have been reverted to the previous version, so there should be no issues with Gaussian finding now.
ID: 1790057 · Report as offensive
Microns
Joined: 19 May 99
Posts: 12
Credit: 13,074,878
RAC: 14
United States
Message 1790104 - Posted: 24 May 2016, 23:27:49 UTC - in response to Message 1789992.  

AMD driver as supplied by Mac OS X.
Currently running Mac OS version 10.11.5
These failures are correlated with the update from 10.11.4 to 10.11.5.
ID: 1790104 · Report as offensive
Microns
Joined: 19 May 99
Posts: 12
Credit: 13,074,878
RAC: 14
United States
Message 1790107 - Posted: 24 May 2016, 23:36:59 UTC - in response to Message 1790041.  

Frequency is difficult to determine. I discontinued processing GPU work units when it became obvious that they were failing.
ID: 1790107 · Report as offensive
Urs Echternacht
Volunteer tester
Joined: 15 May 99
Posts: 692
Credit: 135,197,781
RAC: 211
Germany
Message 1790112 - Posted: 24 May 2016, 23:43:15 UTC - in response to Message 1790104.  

AMD driver as supplied by Mac OS X.
Currently running Mac OS version 10.11.5
These failures are correlated with the update from 10.11.4 to 10.11.5.

No, the failures started with 10.11.4 and continue to happen with 10.11.5!
_\|/_
U r s
ID: 1790112 · Report as offensive
Microns
Joined: 19 May 99
Posts: 12
Credit: 13,074,878
RAC: 14
United States
Message 1790114 - Posted: 24 May 2016, 23:45:50 UTC - in response to Message 1790057.  

I will resume GPU processing and notify if failures reoccur.
Thanks to all, 𝝻
ID: 1790114 · Report as offensive
Microns
Joined: 19 May 99
Posts: 12
Credit: 13,074,878
RAC: 14
United States
Message 1790117 - Posted: 24 May 2016, 23:47:02 UTC - in response to Message 1790112.  

I don't believe that I indicated when problems started on your system.
ID: 1790117 · Report as offensive
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1790121 - Posted: 24 May 2016, 23:58:34 UTC - in response to Message 1790041.  

You haven't run enough work units to really tell what's going on.

Point of attention, I was trying to help, not running the work. I don't have a Mac. :)
ID: 1790121 · Report as offensive
Chris Adamek
Volunteer tester
Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1790146 - Posted: 25 May 2016, 2:01:02 UTC - in response to Message 1790121.  

You haven't run enough work units to really tell what's going on.

Point of attention, I was trying to help, not running the work. I don't have a Mac. :)


No worries.:)
ID: 1790146 · Report as offensive



 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.