Panic Mode On (106) Server Problems?
Author | Message |
---|---|
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
Yes, amazed too. Caught me off guard. It looks like I banked enough tasks to get through the outage finally. And didn't make any ghosts. I barely squeaked through with the Ryzen system for CPU tasks. Probably wouldn't have made it through the typical 13-hour outages of late. . . With a working rescheduler for Windows, the Windows boxes survive the outage, but neither Linux box comes even close. The little fella (Mi-Burrito) is out of work in about 7 hours and La-Bamba is out in about 4 hours. That rig needs a cache of about 600 to 800 tasks to make it all the way through an outage, but E@H probably doesn't mind :) I really need to learn more about Laurent's app. Stephen |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Well, I guess I jinxed myself. As soon as I said I had made no ghosts, I dumped the entire machine because of a TDR crash. I corrupted both the MilkyWay and SETI files and dumped all their tasks. I have had Einstein tasks working in High Priority mode because I goofed and forgot to toggle NNT off for about an hour a week ago. I've been working through them and actually would not have had any problem clearing them before their deadline on the 26th. But BOINC thought differently and forced the two tasks I have running at all times to HP mode. I can't even suspend the project or tasks without BSOD'ing the machine now. I should just have aborted the tasks and been done with them. This has happened a couple of times now. I corrupt the account file and the statistics files for MW and SETI. Einstein always escaped unscathed for some reason. So ..... OH JOY, now I get to spend the evening recovering 400 ghosts. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Well, I just discovered something interesting, I think. It seems you can't recover your ghosts if you originally downloaded them as CPU tasks and the system is currently loading GPU tasks. My attempts at recovering my 438 ghosts succeeded only on the first try, which returned 16 tasks that were assigned as GPU tasks. Each ghost recovery try since (about 8 so far) has not recovered any ghosts, just received normal tasks. I think I will have to wait until I have filled up my 200-task GPU quota before I try to recover my CPU ghosts. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Jeff Buck Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
That seems odd. Except for Arecibo VLARs, I don't know that it really matters what the original task assignments were. In my experience, the scheduler just sends them back based on the current request. That's why it sometimes tries to send ghosted Arecibo VLARs to NVIDIA GPUs before it realizes that it can't and then immediately marks them as errors. |
Brent Norman Send message Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
Whenever I do a recovery I make sure I have an opening for 20 tasks on both CPU and GPU, since those Arecibo VLARs will say 'can't resend' if there is no room for them. You never know what will be in the batch they are sending, so make room for them. EDIT: If you are seeing a 'normal' download after the communication interrupt, you either a) didn't have NNT set first, b) didn't wait 5 minutes before the next request, or c) didn't restart BOINC. EDIT 2: d) You don't have ghosts, or the server doesn't think you do. After this maintenance and server catch-up, it may be that the server hasn't flagged them yet. |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"Whenever I do recovery I make sure I have an opening for 20 tasks on both CPU and GPU since those Arecibo vlars will say 'can't resend' if there is no room for them. You never know what will be in the batch they are sending, so make room for them." NOPE. I did have NNT set. I did wait 5 minutes after shutting down BOINC before restarting. Definitely fully exited BOINC. Well, my theory about having to wait until my 200-task GPU cache was filled before it started filling the missing CPU tasks ... just went out the window. I had 1 CPU task on board and room for 99 new tasks. My ghost recovery just got me new tasks. No resends of lost tasks. So unless SETI turned off resends just this evening, I don't have any answer for why it is not working. I can only think it was because all of my ghosted tasks are CPU tasks in the server's database. Anyone want to try to explain why ghost recovery isn't working? [EDIT] I can prove it. Keith-Windows7 1 5/23/2017 20:42:42 Starting BOINC client version 7.6.33 for windows_x86_64 2 5/23/2017 20:42:42 log flags: file_xfer, sched_ops, task 3 5/23/2017 20:42:42 Libraries: libcurl/7.47.1 OpenSSL/1.0.2g zlib/1.2.8 4 5/23/2017 20:42:42 Data directory: C:\ProgramData\BOINC 5 5/23/2017 20:42:42 Running under account Keith 6 5/23/2017 20:42:43 CUDA: NVIDIA GPU 0: GeForce GTX 1070 (driver version 378.92, CUDA version 8.0, compute capability 6.1, 4096MB, 3046MB available, 6463 GFLOPS peak) 7 5/23/2017 20:42:43 CUDA: NVIDIA GPU 1: GeForce GTX 1070 (driver version 378.92, CUDA version 8.0, compute capability 6.1, 4096MB, 3046MB available, 6463 GFLOPS peak) 8 5/23/2017 20:42:43 OpenCL: NVIDIA GPU 0: GeForce GTX 1070 (driver version 378.92, device version OpenCL 1.2 CUDA, 8192MB, 3046MB available, 6463 GFLOPS peak) 9 5/23/2017 20:42:43 OpenCL: NVIDIA GPU 1: GeForce GTX 1070 (driver version 378.92, device version OpenCL 1.2 CUDA, 8192MB, 3046MB available, 6463 GFLOPS peak) 10 SETI@home 5/23/2017 20:42:43 Found app_info.xml; using anonymous platform 11 5/23/2017 20:42:43 Host name: Keith-Windows7 12 5/23/2017 20:42:43 Processor: 8 AuthenticAMD AMD FX-8370 Eight-Core Processor [Family 21 Model 2 Stepping 0] 13 5/23/2017 20:42:43 Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni ssse3 fma cx16 sse4_1 sse4_2 popcnt aes f16c syscall nx lm avx svm sse4a osvw ibs xop skinit wdt lwp fma4 tce tbm topx page1gb rd 14 5/23/2017 20:42:43 OS: Microsoft Windows 7: Home Premium x64 Edition, Service Pack 1, (06.01.7601.00) 15 5/23/2017 20:42:43 Memory: 15.90 GB physical, 31.80 GB virtual 16 5/23/2017 20:42:43 Disk: 238.47 GB total, 179.94 GB free 17 5/23/2017 20:42:43 Local time is UTC -7 hours 18 Einstein@Home 5/23/2017 20:42:43 Found app_config.xml 19 Milkyway@Home 5/23/2017 20:42:43 Found app_config.xml 20 SETI@home 5/23/2017 20:42:43 Found app_config.xml 21 5/23/2017 20:42:43 Config: GUI RPC allowed from any host 22 5/23/2017 20:42:43 Config: GUI RPCs allowed from: 23 5/23/2017 20:42:43 192.168.2.192 24 5/23/2017 20:42:43 keith-windows7 25 5/23/2017 20:42:43 Config: report completed tasks immediately 26 Einstein@Home 5/23/2017 20:42:43 URL http://einstein.phys.uwm.edu/; Computer ID 12444941; resource share 125 27 Milkyway@Home 5/23/2017 20:42:43 URL http://milkyway.cs.rpi.edu/milkyway/; Computer ID 257518; resource share 75 28 SETI@home 5/23/2017 20:42:43 URL http://setiathome.berkeley.edu/; Computer ID 5741129; resource share 800 29 Milkyway@Home 5/23/2017 20:42:43 General prefs: from Milkyway@Home (last modified 25-Apr-2017 12:04:39) 30 Milkyway@Home 5/23/2017 20:42:43 Host location: none 31 Milkyway@Home 5/23/2017 20:42:43 General prefs: using your defaults 32 5/23/2017 20:42:43 Reading preferences override file 33 5/23/2017 20:42:43 Preferences: 34 5/23/2017 20:42:43 max memory usage when active: 8141.04MB 35 5/23/2017 20:42:43 max memory usage when idle: 14653.88MB 36 5/23/2017 20:42:43 max disk usage: 1.00GB 37 5/23/2017 20:42:43 (to change preferences, visit a project web site or select Preferences in the Manager) 38 5/23/2017 20:42:43 Suspending network activity - user request 39 SETI@home 5/23/2017 20:42:50 work fetch resumed by user 40 5/23/2017 20:42:57 Resuming network activity 41 SETI@home 5/23/2017 20:42:57 Sending scheduler request: To report completed tasks. 42 SETI@home 5/23/2017 20:42:57 Reporting 1 completed tasks 43 SETI@home 5/23/2017 20:42:57 Requesting new tasks for CPU and NVIDIA GPU 44 SETI@home 5/23/2017 20:43:00 Scheduler request completed: got 14 new tasks 45 SETI@home 5/23/2017 20:43:02 Started download of blc03_2bit_guppi_57835_10813_HIP48714_0038.920.409.23.46.24.vlar Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Brent Norman Send message Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
EDIT 2: d) You don't have ghosts, or the server doesn't think you do. After this maintenance and server catch-up, it may be that the server hasn't flagged them yet. |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Well, that is the only plausible theory I can grasp at so far. I DID in fact receive about 16 resent tasks the first time I tried ghost recovery. I got 15 GPU tasks and 1 expired task. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Jeff Buck Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
"After this maintenance and server catch up, it may be the server hasn't flagged them yet." Yeah, that kinda makes sense to me, too. The replica's over 30,000 seconds behind and perhaps your ghosts are in that limbo. Wonder why some came back on the first try, though. |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I guess the way to test the theory is to wait until the replica recovers, then set NNT, make room for 20 resends, and try ghost recovery again. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Brent Norman Send message Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
Just set a small cache for now, then you will have time later on. |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
OK, just set my preferences for a 0.5 day cache. It's going to affect all of my machines, since none of them is assigned a venue. I am still working through the buffer I built up on the other Windows 7 machine. The Win 10 machine has already worked through everything, since with the Ryzen it is the fastest machine in the stable. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"Well, I guess I jinxed myself. As soon as I said I had made no ghosts, I dumped the entire machine because of a TDR crash. I have had Einstein tasks working in High Priority mode because I goofed and forgot to toggle NNT off for about an hour a week ago. I can't even suspend the project or tasks without BSOD'ing the machine now. I should just have aborted the tasks and been done with them. This has happened a couple of times now. I corrupt the account file and the statistics files for MW and SETI. Einstein always escaped unscathed for some reason. So ..... OH JOY, now I get to spend the evening recovering 400 ghosts." . . I have run Einstein at 0% resource priority so it only runs when SETI has no work (i.e. outages). But I had a hiccough with the rescheduler on Bertie last week and have spent that week recovering nearly 350 ghosts, so you have my sympathy ... . . Dare I tempt fate and state that I now have zero ghosts?? Stephen |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13839 Credit: 208,696,464 RAC: 304 |
Looks like someone is using Seti to test run some yet-to-be-released hardware. Lurking in the top hosts list. CPU type: GenuineIntel Intel(R) Xeon(R) Platinum 8168 CPU @ 2.70GHz [Family 6 Model 85 Stepping 4]; Number of processors: 96; Memory: 195245.19 MB. Considering the CPUs are only clocked at 2.70GHz, it's pumping out WUs in pretty good time. If it were running the Lunatics AVX application I suspect it would really churn out some work. Grant Darwin NT |
Wiggo Send message Joined: 24 Jan 00 Posts: 36468 Credit: 261,360,520 RAC: 489 |
Charles Long maybe? Cheers. |
Brent Norman Send message Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
This is new to me; it popped up while changing apps chasing AP tasks. Instead of the usual "AstroPulse v7: yes, SETI@home v8: yes", it showed "(all applications)". Other work was set to NO. I haven't been able to recreate it, but it does indicate to me there IS something lurking in the code which may cause strange behaviours regarding AP/MB selections. Anyone else ever see this? |
Advent42 Send message Joined: 23 Mar 17 Posts: 175 Credit: 4,015,683 RAC: 0 |
Well, I have about a full day of tasks to get through, but I can only seem to get 3 done at a time, whereas before it was 5... so not sure what it is... and no AstroPulse v7!! |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"This is new to me, it popped up while changing apps chasing AP tasks." Yes, this has been changed from what it used to do. This mechanism was in fact the way I toggled preferences to get work running again on my systems. I got AP tasks for the first time in a really long time on two systems. 12 on Numbskull. I didn't get any on Keith-Windows7 because it was set to NNT in order to make room for ghosts. I tried again to recover this morning. No dice. The system doesn't seem to think I have any ghosts, so it won't resend them. The 438 tasks that got abandoned in the project corruption are gone to me. They are going to take a long time to clear out naturally now. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
JaundicedEye Send message Joined: 14 Mar 12 Posts: 5375 Credit: 30,870,693 RAC: 1 |
And the lower 'quality' of the Kibble available to feed to my crunchers lately has affected my SETI RAC by 10%.............not in a good way. "Sour Grapes make a bitter Whine." <(0)> |
Jeff Buck Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
"The system doesn't seem to think I have any ghosts so won't resend them. The 438 tasks that got abandoned in the project corruption are gone to me. Going to take a long time clear out naturally now." Looks like they did get marked as Abandoned (https://setiathome.berkeley.edu/results.php?hostid=5741129&offset=0&show_names=0&state=6&appid=), so they've already been sent out to new hosts. No need to worry. :^) |
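
For anyone following along, the checklist Brent Norman gives in this thread for why a ghost-recovery attempt comes back with ordinary work can be sketched roughly as below. This is only an illustration, not anything BOINC provides: `RecoveryAttempt` and `likely_causes` are made-up names, the thresholds (5 minutes, room for 20 tasks) are simply the figures quoted in the posts, and the GPU room used in the example is an assumed value because the thread does not state it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RecoveryAttempt:
    """One ghost-recovery attempt, described by the conditions named in the thread.

    Hypothetical structure for illustration; nothing here talks to BOINC.
    """
    nnt_set_first: bool        # "No New Tasks" was set before the attempt
    minutes_waited: float      # wait before the next scheduler request
    boinc_restarted: bool      # BOINC was fully exited and restarted
    free_cpu_slots: int        # room left under the CPU task limit
    free_gpu_slots: int        # room left under the GPU task limit
    server_lists_ghosts: bool  # server still treats the tasks as lost (not abandoned)


def likely_causes(attempt: RecoveryAttempt,
                  min_wait_minutes: float = 5.0,
                  min_free_slots: int = 20) -> List[str]:
    """Return the checklist items (a-d, plus the room check) that the attempt fails."""
    causes = []
    if not attempt.nnt_set_first:
        causes.append("a) NNT was not set before the attempt")
    if attempt.minutes_waited < min_wait_minutes:
        causes.append(f"b) waited less than {min_wait_minutes:g} minutes before the next request")
    if not attempt.boinc_restarted:
        causes.append("c) BOINC was not restarted")
    if not attempt.server_lists_ghosts:
        causes.append("d) the server does not think this host has ghosts")
    if min(attempt.free_cpu_slots, attempt.free_gpu_slots) < min_free_slots:
        causes.append(f"no room for {min_free_slots} resends on both CPU and GPU")
    return causes


if __name__ == "__main__":
    # Keith's reported attempt: NNT set, 5 minute wait, full restart, room for 99 CPU tasks.
    # free_gpu_slots=20 is an assumption; the thread does not give the GPU figure.
    keith = RecoveryAttempt(True, 5.0, True, 99, 20, server_lists_ghosts=False)
    for cause in likely_causes(keith):
        print(cause)
```

With those inputs only item d) is reported, which matches where the thread ends up: the tasks had already been marked Abandoned, so the server had nothing left to resend.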
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.