SETI@home v8 beta to begin on Tuesday

Profile Mr. Kevvy
Volunteer moderator
Volunteer tester
Avatar

Send message
Joined: 21 Apr 13
Posts: 23
Credit: 2,253,909
RAC: 0
Canada
Message 55781 - Posted: 9 Jan 2016, 13:47:31 UTC
Last modified: 9 Jan 2016, 14:38:21 UTC

Here are three of my error tasks, all SETI@home v8 v8.05 (opencl_nvidia_sah) windows_intelx86, from the past two days with the same access violation error from the same address. All three were on different machines, two Win7 32-bit and one Win7 64-bit.

21681801
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x0041A526 read attempt to address 0x069B901C

21678723
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x0041A526 read attempt to address 0x071D901C

21669345
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x0041A526 read attempt to address 0x0A11E01C

Edit: I checked the workunit pages for these tasks and they appear to have crashed with the same error on every Windows host they were on. This one completed on Darwin.
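For anyone wanting to check a larger batch of error reports the same way, here is a minimal Python sketch that groups exception records by faulting address. The three lines are copied from the tasks above; the names `REPORTS`, `PATTERN` and `group_by_fault_address` are made up for illustration, and the regular expression assumes the exact wording shown in these three reports.

```python
import re
from collections import defaultdict

# Exception lines copied from the three task reports above, keyed by task ID.
REPORTS = {
    21681801: "Reason: Access Violation (0xc0000005) at address 0x0041A526 read attempt to address 0x069B901C",
    21678723: "Reason: Access Violation (0xc0000005) at address 0x0041A526 read attempt to address 0x071D901C",
    21669345: "Reason: Access Violation (0xc0000005) at address 0x0041A526 read attempt to address 0x0A11E01C",
}

# Assumed line format: matches only the wording seen in these reports.
PATTERN = re.compile(
    r"Access Violation \((0x[0-9a-fA-F]+)\) at address (0x[0-9a-fA-F]+) "
    r"(read|write) attempt to address (0x[0-9a-fA-F]+)"
)


def group_by_fault_address(reports):
    """Map each faulting instruction address to the task IDs that crashed there."""
    groups = defaultdict(list)
    for task_id, line in reports.items():
        match = PATTERN.search(line)
        if match:
            groups[match.group(2).lower()].append(task_id)
    return dict(groups)


groups = group_by_fault_address(REPORTS)

# Low 12 bits of each address the code tried to read (offset within a 4 KiB page).
page_offsets = {
    int(PATTERN.search(line).group(4), 16) & 0xFFF for line in REPORTS.values()
}

for address, tasks in groups.items():
    print(address, tasks)
print("page offsets:", sorted(hex(offset) for offset in page_offsets))
```

All three tasks land in one group, and the three read targets share the same low 12 bits. That is consistent with a single code path failing, but it does not prove it.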

All GPU and opt CPU builds updated. Should fix random crash at the end of task processing.


Now that's what I call responsive. :^)
ID: 55781 · Report as offensive
MarkJ
Volunteer tester

Send message
Joined: 18 Oct 09
Posts: 48
Credit: 73,283
RAC: 0
Australia
Message 55782 - Posted: 9 Jan 2016, 14:00:28 UTC - in response to Message 55763.  
Last modified: 9 Jan 2016, 14:07:09 UTC

Just attached this host

It's an i7-6700 with HD Graphics 530. My main aim is to try and shake out the OpenCL multi-beam app. I have this host and 5 others doing CPU work on main without any issues.

27 tasks completed so far, some of them CPU. Of those, 5 are inconclusive, all OpenCL Intel GPU. It's using the latest (Dec 2015) 4352 driver from Intel.
Inconclusive tasks
ID: 55782 · Report as offensive
Profile Raistmer
Volunteer tester
Avatar

Send message
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 55783 - Posted: 9 Jan 2016, 14:01:57 UTC

All GPU and opt CPU builds updated. Should fix random crash at the end of task processing.
Link: https://cloud.mail.ru/public/3nxq/SgQBZXcM7
News about SETI opt app releases: https://twitter.com/Raistmer
ID: 55783 · Report as offensive
Brent Norman
Volunteer tester

Send message
Joined: 5 Jan 16
Posts: 9
Credit: 552,318
RAC: 0
Canada
Message 55784 - Posted: 9 Jan 2016, 15:11:28 UTC - in response to Message 55783.  

Is this the link where future test apps will be posted also?
ID: 55784 · Report as offensive
Profile Raistmer
Volunteer tester
Avatar

Send message
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 55785 - Posted: 9 Jan 2016, 15:25:58 UTC - in response to Message 55784.  

Is this the link where future test apps will be posted also?

Perhaps so, because making a link to each separate file takes too much precious time...
News about SETI opt app releases: https://twitter.com/Raistmer
ID: 55785 · Report as offensive
Profile Jeff Buck
Volunteer tester

Send message
Joined: 11 Dec 14
Posts: 96
Credit: 1,240,941
RAC: 0
United States
Message 55787 - Posted: 9 Jan 2016, 16:56:31 UTC - in response to Message 55775.  

This report is going to be long because I want to be thorough and record any details that I've been able to cobble together.

Had an apparent NVIDIA driver (361.43) crash on my T7400 earlier this evening.

This doesn't look like the usual driver restart caused by a watchdog timer issue. Such restarts will eventually cause an OS reboot, but a number of driver restarts should happen before that.
What's interesting: BOINC appears to have done nothing for more than an hour, just as if the computer had been switched off or suspended.
The bugcheck number also supports this:
"0x0000009F" Stop error in Windows 7 or in Windows Server 2008 R2 when the computer enters or resumes from the Soft Off (S5) power state
https://support.microsoft.com/en-us/kb/2459268
So, please check whether your PC could have entered sleep mode (suspend).

Thanks for looking into that bugcheck number. I just didn't have time last night to dig any deeper than what I did. The symptoms certainly seem to partially fit, although that box is running Win 8.1. The machine should not have entered suspend or sleep mode under normal operation. My power option is set to "Turn off the display: 3 minutes; Put the computer to sleep: Never", which is the way it has always been run, so there's been no change there.

When I went in to physically check on the machine, it sounded like it was still running normally, not in a sleep state. However, the timing of the BOINC scheduler request at 19:40:13 appears to match exactly with the time of the driver restart, and probably also corresponds with the time I tried to wake the monitor with the mouse (although I can't be precise about that time). So, perhaps that mouse movement did wake something up, including BOINC. Then again, BOINC Manager here on my daily driver didn't report any loss of connection to the T7400 until it actually crashed and rebooted.
ID: 55787 · Report as offensive
Richard Haselgrove
Volunteer tester

Send message
Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 55789 - Posted: 9 Jan 2016, 17:16:45 UTC

First availability of Windows SETI MB v8 CUDA applications for NVidia cards.

These are for use on the BETA SITE ONLY.

Notes:
- Each has the same requirements and limitations as the corresponding v7 version,
-----> e.g. Cuda 2.3 is for pre-Fermi (< compute capability 2.0)
-----> Maxwell GPUs (compute capability 5.0+) don't like Cuda 3.2

Each package is self-contained, with the required CUDA runtime DLL files, an aistub file suitable for use as a standalone app_info.xml file (or merging with an existing app_info.xml), and a skeleton configuration file.

Links are to my personal OneDrive account.

Lunatics_x41zh_win32_cuda50.zip
Lunatics_x41zh_win32_cuda42.zip
Lunatics_x41zh_win32_cuda32.zip
Lunatics_x41zh_win32_cuda23.zip

Jason has submitted these builds to Eric for deployment as stock Beta applications: I am making this preliminary release with his permission.
ID: 55789 · Report as offensive
Profile Jeff Buck
Volunteer tester

Send message
Joined: 11 Dec 14
Posts: 96
Credit: 1,240,941
RAC: 0
United States
Message 55791 - Posted: 9 Jan 2016, 17:22:51 UTC - in response to Message 55777.  

This report is going to be long because I want to be thorough and record any details that I've been able to cobble together.

Had an apparent NVIDIA driver (361.43) crash on my T7400 earlier this evening. I discovered it when I went to look at BOINC Manager here on my daily driver, which was remotely monitoring the T7400, and found:

All 3 GPUs were showing greater than 99% complete, with 0% remaining, but were showing run times that were about 10 times what they should have taken to complete (and still incrementing).

To me, that shows at least two things. First, although I took the screenshot at about 19:45 or so, which showed about 40-44 minutes of run time for the tasks, they had actually been started more than an hour earlier and must have been stalled for almost all of that time since they never hit their first checkpoint. Second, BOINC continued to run for long after the tasks stalled, reporting and downloading tasks at 19:40.

That machine (it says it's an E5430, but you know them better than me) is running BOINC v7.6.9. That, and subsequent, versions of BOINC display a pseudo-progress value, designed to reassure users that something is happening, even when a rogue application (from another project, of course, not here!) is incapable of reporting its own progress. The pseudo-progress displayed is designed to converge asymptotically towards 100%, at a rate determined by the initial estimate for the job. It looks as if that's what you're seeing, and the tasks indeed never started running.
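To illustrate what "converge asymptotically towards 100%" means, here is a minimal Python sketch. The formula `1 - exp(-elapsed / estimate)` is an assumption chosen because it has the described shape; it is not claimed to be the exact expression the BOINC client uses, and `pseudo_progress` is a made-up name.

```python
import math


def pseudo_progress(elapsed_s, estimated_s):
    """Return a fraction in [0, 1) that approaches 1 but never reaches it.

    Assumed formula for illustration only, not taken from the BOINC source.
    """
    if estimated_s <= 0:
        raise ValueError("estimated_s must be positive")
    return 1.0 - math.exp(-elapsed_s / estimated_s)


# Hypothetical task with a 4-minute initial estimate.
estimate = 240.0
for multiple in (0.5, 1, 2, 5, 10):
    fraction = pseudo_progress(multiple * estimate, estimate)
    print(f"{multiple:>4} x estimate -> {fraction:.4%}")
```

Under this assumed formula, a task that has run for ten times its estimate would show more than 99% done while still never reaching 100%, which matches the symptom described: over 99% complete with the elapsed time still counting up.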

It's actually a Dell Precision T7400 Workstation with Xeon E5430 CPUs, but I tend to reference the model number, rather than the CPU. Seems to be easier for me to remember, for some reason. ;^)

I figured that pseudo-progress "feature" must have been in play, although the Elapsed time figure is still rather puzzling. It's difficult to tell where that was measured from.

I have seen similar problems on one of my Windows 7 machines. Occasional GPU task stalls, accompanied by a complete screen freeze (even the clock stops updating), and no response to mouse or keyboard input. It appears that the complete Windows graphics sub-system goes south, but the kernel processes keep running. That includes the BOINC client, and CPU applications launched by it. I've been able to connect to the machine via a remote BOINC Manager, shut down the running client cleanly, and restart the whole computer with the Big Red Switch.

Unfortunately, I wasn't able to see what state the screen was in because I couldn't wake the monitor, and I never got a chance to see if I could do anything remotely with the BOINC client because the whole thing crashed just shortly after I took the screenshot from here on my daily driver. It did its own rebooting, although that in itself is strange because my recollection is that I have the BIOS set to not automatically restart after a crash. Perhaps that only takes effect after a power loss, rather than a crash. I'll have to remember to recheck it sometime when I'm manually booting it up.

You mention remote monitoring. I use BoincView, which still calculates and displays CPU efficiency (I believe BoincTasks may do the same - unfortunately the concept was removed from BOINC itself some years ago). A BOINC CPU task will show 97-100% efficiency, and a good GPU task in the low single figure range. But one of the stalled tasks we're talking about will display exactly 0.0000 CPU efficiency - that's a useful warning that a visit to the remote machine is required urgently.
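The efficiency check is simple enough to sketch in Python. This assumes efficiency means CPU time divided by elapsed wall-clock time, which is how the figures above read; the function names, the grace period and the sample values are all hypothetical, and nothing here is taken from BoincView or BoincTasks.

```python
def cpu_efficiency(cpu_time_s, elapsed_s):
    """CPU time divided by wall-clock time; None until some time has elapsed."""
    if elapsed_s <= 0:
        return None
    return cpu_time_s / elapsed_s


def looks_stalled(cpu_time_s, elapsed_s, min_elapsed_s=300.0):
    """Flag a task whose efficiency is exactly zero after a grace period.

    min_elapsed_s is an arbitrary grace period so a freshly started task
    is not reported as stalled.
    """
    efficiency = cpu_efficiency(cpu_time_s, elapsed_s)
    return (
        efficiency is not None
        and elapsed_s >= min_elapsed_s
        and efficiency == 0.0
    )


# Hypothetical (cpu_time_s, elapsed_s) samples, not taken from any real task.
samples = {
    "healthy CPU task": (2910.0, 3000.0),
    "healthy GPU task": (60.0, 3000.0),
    "stalled GPU task": (0.0, 2640.0),
}

for name, (cpu_s, elapsed_s) in samples.items():
    print(name, f"{cpu_efficiency(cpu_s, elapsed_s):.4f}", looks_stalled(cpu_s, elapsed_s))
```

Testing for exactly zero rather than "low" matters here, because a healthy GPU task legitimately sits in the low single figures.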

I've just been using BOINC Manager for my remote monitoring since it was already on hand and was so easy to make the connections once I started running more than one machine. I tend to not add extra tools to the kit unless I have a compelling need to do so, but I'll take a look at BoincView when I get a chance. Thanks for the recommendation.
ID: 55791 · Report as offensive
Richard Haselgrove
Volunteer tester

Send message
Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 55792 - Posted: 9 Jan 2016, 17:34:27 UTC - in response to Message 55791.  

I'll take a look at BoincView when I get a chance. Thanks for the recommendation.

You'll find that difficult, because the developer (and his website) disappeared without warning many years ago - sometime between early 2006 and the start of the GPU revolution. (I can dig out a copy of the final release version, if anyone else prefers it to BoincTasks)

BoincTasks, on the other hand, is still actively supported. It does have the efficiency measure, shown on the screenshot as 'CPU %'.
ID: 55792 · Report as offensive
Profile Jeff Buck
Volunteer tester

Send message
Joined: 11 Dec 14
Posts: 96
Credit: 1,240,941
RAC: 0
United States
Message 55793 - Posted: 9 Jan 2016, 17:40:04 UTC - in response to Message 55792.  

I'll take a look at BoincView when I get a chance. Thanks for the recommendation.

You'll find that difficult, because the developer (and his website) disappeared without warning many years ago - sometime between early 2006 and the start of the GPU revolution. (I can dig out a copy of the final release version, if anyone else prefers it to BoincTasks)

BoincTasks, on the other hand, is still actively supported. It does have the efficiency measure, shown on the screenshot as 'CPU %'.

Okay, thanks! Will redirect my attention to BoincTasks.
ID: 55793 · Report as offensive
Profile Jeff Buck
Volunteer tester

Send message
Joined: 11 Dec 14
Posts: 96
Credit: 1,240,941
RAC: 0
United States
Message 55796 - Posted: 9 Jan 2016, 19:53:48 UTC - in response to Message 55789.  

My first look at v8 cuda50 vs. opencl on my T7400's NVIDIA GPUs, with "normal" AR tasks, shows:

GTX 660
opencl: 21731337 AR=0.414587, RT=15:01, CPU=14:33
cuda50: 21733502 AR=0.414664, RT=16:52, CPU=5:27

GTX 670
opencl: 21731340 AR=0.414587, RT=12:31, CPU=12:06
cuda50: 21733503 AR=0.414664, RT=14:23, CPU=5:23

GTX 780
opencl: 21731336 AR=0.414587, RT=8:42, CPU=8:24
cuda50: 21733373 AR=0.414664, RT=12:47, CPU=6:50

All of those were running 1 task per GPU, with no CPU tasks running.
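For comparison, the figures above can be reduced to two ratios per card: CPU time as a fraction of run time, and cuda50 run time relative to opencl. A minimal Python sketch, using only the numbers listed above (the names `TIMINGS`, `cpu_fraction` and `runtime_ratio` are just for illustration):

```python
def to_seconds(mmss):
    """Convert an 'M:SS' string to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# (GPU, app) -> (run time, CPU time), copied from the figures above.
TIMINGS = {
    ("GTX 660", "opencl"): ("15:01", "14:33"),
    ("GTX 660", "cuda50"): ("16:52", "5:27"),
    ("GTX 670", "opencl"): ("12:31", "12:06"),
    ("GTX 670", "cuda50"): ("14:23", "5:23"),
    ("GTX 780", "opencl"): ("8:42", "8:24"),
    ("GTX 780", "cuda50"): ("12:47", "6:50"),
}


def cpu_fraction(gpu, app):
    """CPU time as a fraction of run time for one task."""
    run_time, cpu_time = TIMINGS[(gpu, app)]
    return to_seconds(cpu_time) / to_seconds(run_time)


def runtime_ratio(gpu):
    """cuda50 run time divided by opencl run time on the same GPU."""
    cuda_rt = to_seconds(TIMINGS[(gpu, "cuda50")][0])
    opencl_rt = to_seconds(TIMINGS[(gpu, "opencl")][0])
    return cuda_rt / opencl_rt


for gpu in ("GTX 660", "GTX 670", "GTX 780"):
    print(
        f"{gpu}: opencl CPU/RT={cpu_fraction(gpu, 'opencl'):.0%}, "
        f"cuda50 CPU/RT={cpu_fraction(gpu, 'cuda50'):.0%}, "
        f"cuda50/opencl RT={runtime_ratio(gpu):.2f}"
    )
```

On these six tasks the opencl app spends about 97% of its run time on the CPU, against roughly 32% to 53% for cuda50, while the cuda50 run times are about 12% to 47% longer. This is one task per app per card, at slightly different ARs, so it is a first look only.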

A couple of other observations on the cuda50 that I saw while monitoring the first batch of tasks. The core usage seemed to fluctuate quite a bit, from less than 10% to over 60%, with the GTX780 consistently using more CPU than the other two. Those numbers are much less than the 95%+ core usage of the opencl apps.

Also, the GPU Load was much less for cuda50, with the load appearing to increase in the later stages of each task run. For the GTX660, it averaged 51% early, rising to 67% late. For the GTX 670, it was about 43% to 51%, and for the GTX 780, it was about 31% to 48%. That compares to about 90%, 88% and 84%, respectively for the opencl apps. Theoretically, that should make it more practical to run multiple instances per GPU with the cuda50, which should, I think, make the overall throughput better.
ID: 55796 · Report as offensive
jason_gee
Volunteer tester

Send message
Joined: 11 Dec 08
Posts: 198
Credit: 658,573
RAC: 0
Australia
Message 55797 - Posted: 9 Jan 2016, 20:17:26 UTC - in response to Message 55796.  
Last modified: 9 Jan 2016, 20:26:17 UTC

Also, the GPU Load was much less for cuda50, with the load appearing to increase in the later stages of each task run. For the GTX660, it averaged 51% early, rising to 67% late. For the GTX 670, it was about 43% to 51%, and for the GTX 780, it was about 31% to 48%. That compares to about 90%, 88% and 84%, respectively for the opencl apps. Theoretically, that should make it more practical to run multiple instances per GPU with the cuda50, which should, I think, make the overall throughput better.


That's pretty much the route I've taken here (running 2 up), while leaving things at default settings. Alternatively, the priority and pulsefind settings would likely stabilise the load a fair bit. Time will tell whether fewer instances and more aggressive settings + code is a better option than treading lightly and running more instances. Either way, both will ultimately be an option once we have reliable baseline functionality to prove the new code against. [I'll be happy enough to start shovelling in alternate codepaths once I'm convinced inconclusive to pending ratios against stock CPU are converging on 5% or better.]
Chaos: When the present determines the future, but the approximate present does not approximately determine the future.
Edward Lorenz
ID: 55797 · Report as offensive
Grumpy Swede
Volunteer tester
Avatar

Send message
Joined: 10 Mar 12
Posts: 1700
Credit: 13,216,373
RAC: 0
Sweden
Message 55798 - Posted: 9 Jan 2016, 20:25:10 UTC

OK, here's my NVIDIA GeForce GTX 980, running 3 tasks at a time, and with processpriority = abovenormal.

Looking good so far:

https://setiweb.ssl.berkeley.edu/beta/results.php?hostid=75292&offset=0&show_names=0&state=0&appid=38
ID: 55798 · Report as offensive
Profile Jimbocous
Volunteer tester
Avatar

Send message
Joined: 9 Jan 16
Posts: 51
Credit: 1,038,205
RAC: 0
United States
Message 55799 - Posted: 9 Jan 2016, 20:46:49 UTC - in response to Message 55745.  
Last modified: 9 Jan 2016, 20:53:04 UTC

Anyone want some good news?

I am quite happily running Nvidia V8 GPU work on my rig :-))

Ditto.
8.05 on three machines with 2 750tis each, Win7x64.
If I can help out by testing something, please let me know.
Available hardware and software is listed in my profile here.
ID: 55799 · Report as offensive
Profile Jimbocous
Volunteer tester
Avatar

Send message
Joined: 9 Jan 16
Posts: 51
Credit: 1,038,205
RAC: 0
United States
Message 55800 - Posted: 9 Jan 2016, 20:59:11 UTC
Last modified: 9 Jan 2016, 21:04:28 UTC

New to the party here, so I'm not sure what the protocol is.
Should I mention all error tasks here? If so, only 1 so far:

https://setiweb.ssl.berkeley.edu/beta/result.php?resultid=21719471

8.05 opencl_nvidia_sah on 750ti under Win7x64

<message>
One or more arguments are invalid
(0x80000003) - exit code -2147483645 (0x80000003)
</message>
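The decimal and hex values in that message are the same 32-bit number, once read as signed and once as unsigned. A minimal Python sketch of the conversion (the helper names are made up for illustration):

```python
def to_unsigned32(code):
    """Reinterpret a signed 32-bit exit code as an unsigned value."""
    return code & 0xFFFFFFFF


def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed exit code."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


exit_code = -2147483645  # value reported in the task's <message> block
print(f"0x{to_unsigned32(exit_code):08x}")
print(to_signed32(0x80000003))
```

For what it's worth, 0x80000003 is the value Windows defines as STATUS_BREAKPOINT. Whether that is what actually happened in this task is not something the conversion can tell us.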

At this point, over 400 completed valids.
If there's anything I can do, or not do, to be helpful, please let me know.
Regards, Jim ...

[edit] In case it was unclear, the above is SAH Beta. [/edit]
If I can help out by testing something, please let me know.
Available hardware and software is listed in my profile here.
ID: 55800 · Report as offensive
jason_gee
Volunteer tester

Send message
Joined: 11 Dec 08
Posts: 198
Credit: 658,573
RAC: 0
Australia
Message 55801 - Posted: 9 Jan 2016, 21:01:57 UTC - in response to Message 55800.  

Probably something for the setiathome_enhanced forum section. Not sure what that error is myself, but I suspect the OpenCL gurus could help out with settings or some such.
Chaos: When the present determines the future, but the approximate present does not approximately determine the future.
Edward Lorenz
ID: 55801 · Report as offensive
TRuEQ & TuVaLu
Volunteer tester
Avatar

Send message
Joined: 28 Jan 11
Posts: 619
Credit: 2,580,051
RAC: 0
Sweden
Message 55802 - Posted: 9 Jan 2016, 22:19:54 UTC

This task behaved kinda strangely....
It didn't show any GPU usage but continued to run as if it was using the GPU...

http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=21737325

And the WU after it worked as it should..... I hope.

There is some error information in the link above for anyone who understands it.
ID: 55802 · Report as offensive
MarkJ
Volunteer tester

Send message
Joined: 18 Oct 09
Posts: 48
Credit: 73,283
RAC: 0
Australia
Message 55803 - Posted: 9 Jan 2016, 23:07:02 UTC - in response to Message 55789.  

First availability of Windows SETI MB v8 CUDA applications for NVidia cards.

These are for use on the BETA SITE ONLY.

<snip>

Links are to my personal OneDrive account.

Jason has submitted these builds to Eric for deployment as stock Beta applications: I am making this preliminary release with his permission.

Thank you Jason and Richard. I have attached a dual GTX750Ti host (Win7 x64) here and it's done the first couple.
ID: 55803 · Report as offensive
TRuEQ & TuVaLu
Volunteer tester
Avatar

Send message
Joined: 28 Jan 11
Posts: 619
Credit: 2,580,051
RAC: 0
Sweden
Message 55804 - Posted: 9 Jan 2016, 23:30:46 UTC

Ok.

I suspended GPU 0 and kept running on GPUs 1 and 2 on the 5970 while watching another episode of Farscape. The program paused for 4 seconds, then fast-forwarded to where it should have been and continued to play.
What happened was a driver restart, and GPU 1 dropped to 0% usage.
Hmm, I thought... nah, I'll do what I shouldn't: exit BM and then restart it. This normally gives a blue screen within 4-12 seconds.
But this time it didn't, and the 5970 kept crunching on both GPUs and still does.
The Farscape episode is nearly done, and I'm going back to watching it now.
ID: 55804 · Report as offensive
Profile Jimbocous
Volunteer tester
Avatar

Send message
Joined: 9 Jan 16
Posts: 51
Credit: 1,038,205
RAC: 0
United States
Message 55805 - Posted: 10 Jan 2016, 0:18:53 UTC - in response to Message 55804.  
Last modified: 10 Jan 2016, 0:37:21 UTC

The program paused for 4 seconds, then fast-forwarded to where it should have been and continued to play.
What happened was a driver restart ...

I've had 3-4 driver restarts on 77531 here this afternoon.

Looking at the tasks that were running at the restart I either see:
ERROR: OpenCL kernel/call 'clEnqueueMapBuffer(cpu_MeanMaxIdx_buf)' call failed (-36) in file ..\analyzeFuncs.cpp near line 4950.
Waiting 30 sec before restart...

or

OpenCL queue synchronized
SETI@Home Informational message -9 result_overflow
NOTE: The number of results detected equals the storage space allocated.

in each task. These were all tasks completed on the above host right around 0000 GMT on 10 Jan.
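In case it helps whoever looks at this, here is a minimal Python sketch that pulls the failing call and the numeric code out of that stderr line and looks up the OpenCL constant name. The lookup table is deliberately partial (values as defined in the OpenCL `cl.h` header), and the regular expression assumes the exact wording of the line above; `decode_opencl_error` is a made-up name.

```python
import re

# Partial table of OpenCL error codes, as defined in the OpenCL cl.h header.
OPENCL_ERRORS = {
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -36: "CL_INVALID_COMMAND_QUEUE",
    -38: "CL_INVALID_MEM_OBJECT",
}

# Line copied from the task stderr quoted above.
LINE = (
    r"ERROR: OpenCL kernel/call 'clEnqueueMapBuffer(cpu_MeanMaxIdx_buf)' "
    r"call failed (-36) in file ..\analyzeFuncs.cpp near line 4950."
)

# Assumed message format: matches only the wording seen in this stderr.
PATTERN = re.compile(r"OpenCL kernel/call '([^']+)' call failed \((-?\d+)\)")


def decode_opencl_error(line):
    """Return (call, code, name) for a failure line, or None if it does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    code = int(match.group(2))
    return match.group(1), code, OPENCL_ERRORS.get(code, "unknown")


print(decode_opencl_error(LINE))
```

Code -36 is CL_INVALID_COMMAND_QUEUE, which would fit a command queue that stopped being usable when the driver restarted, though that reading is my assumption rather than anything the log states.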
Jim ...
If I can help out by testing something, please let me know.
Available hardware and software is listed in my profile here.
ID: 55805 · Report as offensive