SETI@home v8 beta to begin on Tuesday
Author | Message |
---|---|
Joined: 21 Apr 13 Posts: 23 Credit: 2,253,909 RAC: 0 |
Here are three of my error tasks, all SETI@home v8 v8.05 (opencl_nvidia_sah) windows_intelx86, from the past two days with the same access violation error from the same address. All three were on different machines, two Win7 32-bit and one Win7 64-bit. 21681801: Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x0041A526 read attempt to address 0x069B901C. 21678723: Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x0041A526 read attempt to address 0x071D901C. 21669345: Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x0041A526 read attempt to address 0x0A11E01C. Edit: I checked the workunit pages for these tasks and they appear to have crashed with the same error on every Windows host they were on. This one completed on Darwin. "All GPU and opt CPU builds updated. Should fix random crash at the end of task processing." Now that's what I call responsive. :^) |
Joined: 18 Oct 09 Posts: 48 Credit: 73,283 RAC: 0 |
Just attached this host. 27 tasks completed so far; some were CPU. Of those, 5 are inconclusive, all OpenCL Intel GPU. It's using the latest (Dec 2015) 4352 driver from Intel. Inconclusive tasks |
Joined: 18 Aug 05 Posts: 2423 Credit: 15,878,738 RAC: 0 |
All GPU and opt CPU builds updated. Should fix random crash at the end of task processing. Link: https://cloud.mail.ru/public/3nxq/SgQBZXcM7 News about SETI opt app releases: https://twitter.com/Raistmer |
Joined: 5 Jan 16 Posts: 9 Credit: 552,318 RAC: 0 |
Is this the link where future test apps will be posted also? |
Joined: 18 Aug 05 Posts: 2423 Credit: 15,878,738 RAC: 0 |
"Is this the link where future test apps will be posted also?" Perhaps so, because making a link to each separate file takes too much precious time... News about SETI opt app releases: https://twitter.com/Raistmer |
Joined: 11 Dec 14 Posts: 96 Credit: 1,240,941 RAC: 0 |
This report is going to be long because I want to be thorough and record any details that I've been able to cobble together. Thanks for looking into that bugcheck number. I just didn't have time last night to dig any deeper than what I did. The symptoms certainly seem to partially fit, although that box is running Win 8.1. The machine should not have entered suspend or sleep mode under normal operation. My power option is set to "Turn off the display: 3 minutes; Put the computer to sleep: Never", which is the way it has always been run, so there's been no change there. When I went in to physically check on the machine, it sounded like it was still running normally, not in a sleep state. However, the timing of the BOINC scheduler request at 19:40:13 appears to match exactly with the time of the driver restart, and probably also corresponds with the time I tried to wake the monitor with the mouse (although I can't be precise about that time). So, perhaps that mouse movement did wake something up, including BOINC. Then again, BOINC Manager here on my daily driver didn't report any loss of connection to the T7400 until it actually crashed and rebooted. |
Joined: 3 Jan 07 Posts: 1451 Credit: 3,272,268 RAC: 0 |
First availability of Windows SETI MB v8 CUDA applications for NVidia cards. These are for use on the BETA SITE ONLY. Notes: Each has the same requirements and limitations as the corresponding v7 version, e.g. Cuda 2.3 is for pre-Fermi GPUs (compute capability < 2.0), and Maxwell GPUs (compute capability 5.0+) don't like Cuda 3.2. Each package is self-contained, with the required CUDA runtime DLL files, an aistub file suitable for use as a standalone app_info.xml file (or merging with an existing app_info.xml), and a skeleton configuration file. Links are to my personal OneDrive account. Lunatics_x41zh_win32_cuda50.zip Lunatics_x41zh_win32_cuda42.zip Lunatics_x41zh_win32_cuda32.zip Lunatics_x41zh_win32_cuda23.zip Jason has submitted these builds to Eric for deployment as stock Beta applications: I am making this preliminary release with his permission. |
Joined: 11 Dec 14 Posts: 96 Credit: 1,240,941 RAC: 0 |
This report is going to be long because I want to be thorough and record any details that I've been able to cobble together. It's actually a Dell Precision T7400 Workstation with Xeon E5430 CPUs, but I tend to reference the model number, rather than the CPU. Seems to be easier for me to remember, for some reason. ;^) I figured that pseudo-progress "feature" must have been in play, although the Elapsed time figure is still rather puzzling. It's difficult to tell where that was measured from. I have seen similar problems on one of my Windows 7 machines. Occasional GPU task stalls, accompanied by a complete screen freeze (even the clock stops updating), and no response to mouse or keyboard input. It appears that the complete Windows graphics sub-system goes south, but the kernel processes keep running. That includes the BOINC client, and CPU applications launched by it. I've been able to connect to the machine via a remote BOINC Manager, shut down the running client cleanly, and restart the whole computer with the Big Red Switch. Unfortunately, I wasn't able to see what state the screen was in because I couldn't wake the monitor, and I never got a chance to see if I could do anything remotely with the BOINC client because the whole thing crashed just shortly after I took the screenshot from here on my daily driver. It did its own rebooting, although that in itself is strange because my recollection is that I have the BIOS set to not automatically restart after a crash. Perhaps that only takes effect after a power loss, rather than a crash. I'll have to remember to recheck it sometime when I'm manually booting it up. You mention remote monitoring. I use BoincView, which still calculates and displays CPU efficiency (I believe BoincTasks may do the same - unfortunately the concept was removed from BOINC itself some years ago). A BOINC CPU task will show 97-100% efficiency, and a good GPU task in the low single figure range. 
But one of the stalled tasks we're talking about will display exactly 0.0000 CPU efficiency - that's a useful warning that a visit to the remote machine is required urgently. I've just been using BOINC Manager for my remote monitoring since it was already on hand and was so easy to make the connections once I started running more than one machine. I tend to not add extra tools to the kit unless I have a compelling need to do so, but I'll take a look at BoincView when I get a chance. Thanks for the recommendation. |
Joined: 3 Jan 07 Posts: 1451 Credit: 3,272,268 RAC: 0 |
I'll take a look at BoincView when I get a chance. Thanks for the recommendation. You'll find that difficult, because the developer (and his website) disappeared without warning many years ago - sometime between early 2006 and the start of the GPU revolution. (I can dig out a copy of the final release version, if anyone else prefers it to BoincTasks) BoincTasks, on the other hand, is still actively supported. It does have the efficiency measure, shown on the screenshot as 'CPU %'. |
Joined: 11 Dec 14 Posts: 96 Credit: 1,240,941 RAC: 0 |
I'll take a look at BoincView when I get a chance. Thanks for the recommendation. Okay, thanks! Will redirect my attention to BoincTasks. |
Joined: 11 Dec 14 Posts: 96 Credit: 1,240,941 RAC: 0 |
My first look at v8 cuda50 vs. opencl on my T7400's NVIDIA GPUs, with "normal" AR tasks, shows: GTX 660: opencl 21731337 (AR=0.414587, RT=15:01, CPU=14:33), cuda50 21733502 (AR=0.414664, RT=16:52, CPU=5:27); GTX 670: opencl 21731340 (AR=0.414587, RT=12:31, CPU=12:06), cuda50 21733503 (AR=0.414664, RT=14:23, CPU=5:23); GTX 780: opencl 21731336 (AR=0.414587, RT=8:42, CPU=8:24), cuda50 21733373 (AR=0.414664, RT=12:47, CPU=6:50). All of those were running 1 task per GPU, with no CPU tasks running. A couple of other observations on the cuda50 that I saw while monitoring the first batch of tasks. The core usage seemed to fluctuate quite a bit, from less than 10% to over 60%, with the GTX 780 consistently using more CPU than the other two. Those numbers are much less than the 95%+ core usage of the opencl apps. Also, the GPU Load was much less for cuda50, with the load appearing to increase in the later stages of each task run. For the GTX 660, it averaged 51% early, rising to 67% late. For the GTX 670, it was about 43% to 51%, and for the GTX 780, it was about 31% to 48%. That compares to about 90%, 88% and 84%, respectively, for the opencl apps. Theoretically, that should make it more practical to run multiple instances per GPU with the cuda50, which should, I think, make the overall throughput better. |
Joined: 11 Dec 08 Posts: 198 Credit: 658,573 RAC: 0 |
"Also, the GPU Load was much less for cuda50, with the load appearing to increase in the later stages of each task run. For the GTX660, it averaged 51% early, rising to 67% late. For the GTX 670, it was about 43% to 51%, and for the GTX 780, it was about 31% to 48%. That compares to about 90%, 88% and 84%, respectively for the opencl apps. Theoretically, that should make it more practical to run multiple instances per GPU with the cuda50, which should, I think, make the overall throughput better." That's pretty much the route I've taken here (running 2 up), while leaving things at default settings. Alternatively, the priority and pulsefind settings would likely stabilise the load a fair bit. Time will tell whether fewer instances and more aggressive settings + code is a better option than treading lightly and running more instances. Either way, both will ultimately be an option once we have reliable baseline functionality to prove the new code against. [I'll be happy enough to start shovelling in alternate codepaths once I'm convinced inconclusive to pending ratios against stock CPU are converging on 5% or better] Chaos: When the present determines the future, but the approximate present does not approximately determine the future. Edward Lorenz |
Joined: 10 Mar 12 Posts: 1700 Credit: 13,216,373 RAC: 0 |
OK, here's my NVIDIA GeForce GTX 980, running 3 tasks at a time, and with processpriority = abovenormal. Looking good so far: https://setiweb.ssl.berkeley.edu/beta/results.php?hostid=75292&offset=0&show_names=0&state=0&appid=38 |
Joined: 9 Jan 16 Posts: 51 Credit: 1,038,205 RAC: 0 |
Anyone want some good news? Ditto. 8.05 on three machines with 2 750tis each, Win7x64. If I can help out by testing something, please let me know. Available hardware and software is listed in my profile here. |
Joined: 9 Jan 16 Posts: 51 Credit: 1,038,205 RAC: 0 |
New to the party here, so I'm not sure what the protocol is. Should I mention all error tasks here? If so, only 1 so far: https://setiweb.ssl.berkeley.edu/beta/result.php?resultid=21719471 8.05 opencl_nvidia_sah on 750ti under Win7x64 <message> One or more arguments are invalid (0x80000003) - exit code -2147483645 (0x80000003) </message> At this point, over 400 completed valids. If there's anything I can do, or not do, to be helpful please let me know. Regards, Jim ... [edit] In case it was unclear, the above is SAH Beta. [/edit] If I can help out by testing something, please let me know. Available hardware and software is listed in my profile here. |
Joined: 11 Dec 08 Posts: 198 Credit: 658,573 RAC: 0 |
Probably something for the setiathome_enhanced forum section. Not sure what that error is myself, but I suspect the OpenCL gurus could help out with settings or somesuch. Chaos: When the present determines the future, but the approximate present does not approximately determine the future. Edward Lorenz |
Joined: 28 Jan 11 Posts: 619 Credit: 2,580,051 RAC: 0 |
This task behaved kinda strangely... It didn't show any GPU usage but continued to run as if it was using the GPU... http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=21737325 And the WU after it worked as it should... I hope there is some error information in the link above for anyone who understands it. |
Joined: 18 Oct 09 Posts: 48 Credit: 73,283 RAC: 0 |
"First availability of Windows SETI MB v8 CUDA applications for NVidia cards." Thank you Jason and Richard. I have attached a dual GTX750Ti host (Win7 x64) here and it's done the first couple. |
Joined: 28 Jan 11 Posts: 619 Credit: 2,580,051 RAC: 0 |
Ok. Suspending GPU 0 and keeping GPUs 1 and 2 running on the 5970 while watching another episode of Farscape. The program paused for 4 seconds and then played fast-forward to where it was and continued to play. What happened was a driver restart, and GPU 1 got down to 0% usage. Hmm, I thought... nah, I'll do as I shouldn't and exit BM and then restart it. This normally gives a blue screen within 4-12 seconds. But this time it didn't, and the 5970 kept crunching on both GPUs and still does. The Farscape episode is nearly done, and I've resumed watching it now. |
Joined: 9 Jan 16 Posts: 51 Credit: 1,038,205 RAC: 0 |
"The program paused for 4 seconds and then played fastforward to where it was and continued to play." I've had 3-4 driver restarts on 77531 here this afternoon. Looking at the tasks that were running at the restart, I see either: "ERROR: OpenCL kernel/call 'clEnqueueMapBuffer(cpu_MeanMaxIdx_buf)' call failed (-36) in file ..\analyzeFuncs.cpp near line 4950. Waiting 30 sec before restart..." or "OpenCL queue synchronized SETI@Home Informational message -9 result_overflow NOTE: The number of results detected equals the storage space allocated." in each task. These were all tasks completed on the above host right around 0000 GMT 10 Jan. Jim ... If I can help out by testing something, please let me know. Available hardware and software is listed in my profile here. |
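
The CPU efficiency figure discussed in the thread can be approximated from the run times (RT) and CPU times posted for the T7400's three GPUs. The sketch below assumes efficiency is simply CPU time divided by elapsed run time; that definition is an assumption about how BoincView and BoincTasks compute it and is not confirmed anywhere in the thread. The function names are hypothetical, and the timings are copied from the cuda50 vs. opencl post.

```python
# Rough CPU-efficiency estimate from the RT/CPU figures posted in the thread.
# Assumption: efficiency = CPU time / elapsed run time (not confirmed by the source).

def to_seconds(mmss: str) -> int:
    """Convert an 'MM:SS' string such as '15:01' to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def cpu_efficiency(run_time: str, cpu_time: str) -> float:
    """Return CPU time as a fraction of elapsed run time."""
    return to_seconds(cpu_time) / to_seconds(run_time)


# (GPU, app, RT, CPU) as posted for "normal" AR tasks, 1 task per GPU.
TIMINGS = [
    ("GTX 660", "opencl", "15:01", "14:33"),
    ("GTX 660", "cuda50", "16:52", "5:27"),
    ("GTX 670", "opencl", "12:31", "12:06"),
    ("GTX 670", "cuda50", "14:23", "5:23"),
    ("GTX 780", "opencl", "8:42", "8:24"),
    ("GTX 780", "cuda50", "12:47", "6:50"),
]


def report(timings=TIMINGS) -> list:
    """Build one summary line per (GPU, app) pair."""
    lines = []
    for gpu, app, run_time, cpu_time in timings:
        eff = cpu_efficiency(run_time, cpu_time)
        lines.append(f"{gpu} {app:<6} RT={run_time:>5} CPU={cpu_time:>5} CPU/RT={eff:.1%}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

On these figures the opencl tasks come out at roughly 97% of a core each, and the cuda50 tasks at about 32% to 53%, which is in line with the core usage observations reported in that post.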
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.