Panic Mode On (94) Server Problems?

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874	Message 1622882 - Posted: 3 Jan 2015, 19:48:46 UTC - in response to Message 1622867.
Here's my run-through of my job_log:
First one's unix stamp is Aug 08, 2012:
1344411405.758627 ue 40443.513279 ct 44801.180000 fe 100366914056685.200000 nm ap_06my12ab_B6_P1_00397_20120728_06219.wu_0
1348411651.444180 ue 41462.286580 ct 41549.820000 fe 99312357715860.406000 nm ap_28jn12ab_B1_P1_00327_20120913_22283.wu_1
1353416891 ue 41154.808134 ct 44612.900000 fe 100748165223275 nm ap_27au12ab_B4_P1_00033_20121109_21887.wu_2 et 44755.875742
1354547709 ue 43096.733593 ct 41306.480000 fe 103747720196172 nm ap_27au12ab_B3_P0_00383_20121109_08471.wu_3 et 41408.402161
1354823446 ue 42485.311151 ct 36223.400000 fe 103782242333548 nm ap_01se12ab_B2_P1_00034_20121106_10784.wu_3 et 36360.880091
1365786085 ue 42041.303007 ct 43234.320000 fe 101221168606141 nm ap_30jn12ad_B0_P0_00323_20130331_14381.wu_2 et 43327.372242
1367166402 ue 41867.606117 ct 44128.500000 fe 102913816340793 nm ap_29ja13aa_B4_P0_00011_20130418_18429.wu_0 et 44210.891695
1367705446 ue 42201.410116 ct 44351.470000 fe 103734332406452 nm ap_26fe13ae_B6_P0_00170_20130424_29841.wu_0 et 44437.584740
1371698309 ue 43973.768278 ct 35040.840000 fe 107073863886764 nm ap_24no12aa_B3_P0_00177_20130610_00904.wu_0 et 35083.251799
1371782925 ue 43491.213005 ct 37562.650000 fe 105898866616610 nm ap_03ja12ai_B5_P0_00025_20130611_18331.wu_0 et 37606.444761
Last one's timestamp is June 21, 2013.
If you (or anybody else) feel you have a comprehensive job log/s with tasks from all AP tapes, I could set up a parallel database and compare them with my MB records. ID: 1622882 ·
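
For anyone who wants to check their own job_log against a list of tapes, here is a minimal sketch of how the entries quoted above could be read. It assumes each entry is a Unix timestamp followed by alternating key/value pairs (with `nm` as the task name and `et` only present in newer entries), and that the tape name is the second underscore-separated field of an AP task name. The function names are made up for illustration; this is not part of BOINC.

```python
from datetime import datetime, timezone


def parse_job_log_line(line):
    """Split one job_log entry into a dict (assumed layout: timestamp, then key/value pairs)."""
    parts = line.split()
    record = {"timestamp": float(parts[0])}
    # The remaining tokens alternate key, value; only "nm" (task name) stays a string.
    for key, value in zip(parts[1::2], parts[2::2]):
        record[key] = value if key == "nm" else float(value)
    return record


def tape_of(task_name):
    """Assumed naming: ap_<tape>_B?_P?_... so the tape is the second field."""
    return task_name.split("_")[1]


def completed_on(record):
    """Date of the entry in UTC."""
    return datetime.fromtimestamp(record["timestamp"], tz=timezone.utc).date()


first = parse_job_log_line(
    "1344411405.758627 ue 40443.513279 ct 44801.180000 "
    "fe 100366914056685.200000 nm ap_06my12ab_B6_P1_00397_20120728_06219.wu_0"
)
print(tape_of(first["nm"]), completed_on(first))
```

Collecting `tape_of(...)` for every AP entry into a set would give the list of tapes a host has already seen, which is the kind of record that could be compared with the MB list.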

Julie Volunteer moderator Volunteer tester Send message Joined: 28 Oct 09 Posts: 34053 Credit: 18,883,157 RAC: 18	Message 1622975 - Posted: 3 Jan 2015, 23:39:43 UTC - in response to Message 1622850. This was posted by Mark Sattler on the BOINC boards, he has apparently e-mailed all at the lab. Got word back from Matt that he believes these are datasets which have already had the AP work split and sent, but not the MB work. So the splitters run through them quickly and send no AP outbound because it has already been done. As a laboratory plan, that makes perfect sense. Unfortunately, Matt's recollection may not be perfect, as according to my records, all of the tapes (20) in the last two batches deployed have been processed for MB already, between 25 April 2012 and 04 May 2013 (hence all before sah v7, so at least the autocorrs are new). Would one of our AP specialists like to verify whether or not they have processed those tapes in the past, before we waste any more time redoing old work? I'll email my list to Matt. My personal feeling is that any data which involved blanking by AP v6 ought to be reprocessed by AP v7. The v7 processing is better, not just faster. OTOH, splitting data which has never had Astropulse analysis done would be even better. Joe I would reprocess only tasks with considerable blanking involved perhaps, to remove all those false positives from Berkeley's random noise generator :) Also, in MB6 => MB7 reprocessing would be good to insist on task modification to enable only those searches that really need to be redone. Any changes in Gaussians or Spikes or Triplets for example... I don't like the idea to redo ALL those searches just to get Autocorrelation too. Seems much bigger waste than to use unoptimized apps so can't tolerate such, LoL. +1 rOZZ Music Pictures ID: 1622975 ·

betreger Send message Joined: 29 Jun 99 Posts: 11361 Credit: 29,581,041 RAC: 66	Message 1622978 - Posted: 3 Jan 2015, 23:51:40 UTC I don't like the idea to redo ALL those searches just to get Autocorrelation too I'd rather do that than nothing. My personal feeling is that any data which involved blanking by AP v6 ought to be reprocessed by AP v7. The v7 processing is better, not just faster . + a lot OTOH, splitting data which has never had Astropulse analysis done would be even better. No it wouldn't, it would be best. ID: 1622978 ·

Mike Volunteer tester Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80	Message 1622985 - Posted: 4 Jan 2015, 0:25:57 UTC - in response to Message 1622975. This was posted by Mark Sattler on the BOINC boards, he has apparently e-mailed all at the lab. Got word back from Matt that he believes these are datasets which have already had the AP work split and sent, but not the MB work. So the splitters run through them quickly and send no AP outbound because it has already been done. As a laboratory plan, that makes perfect sense. Unfortunately, Matt's recollection may not be perfect, as according to my records, all of the tapes (20) in the last two batches deployed have been processed for MB already, between 25 April 2012 and 04 May 2013 (hence all before sah v7, so at least the autocorrs are new). Would one of our AP specialists like to verify whether or not they have processed those tapes in the past, before we waste any more time redoing old work? I'll email my list to Matt. My personal feeling is that any data which involved blanking by AP v6 ought to be reprocessed by AP v7. The v7 processing is better, not just faster. OTOH, splitting data which has never had Astropulse analysis done would be even better. Joe I would reprocess only tasks with considerable blanking involved perhaps, to remove all those false positives from Berkeley's random noise generator :) Also, in MB6 => MB7 reprocessing would be good to insist on task modification to enable only those searches that really need to be redone. Any changes in Gaussians or Spikes or Triplets for example... I don't like the idea to redo ALL those searches just to get Autocorrelation too. Seems much bigger waste than to use unoptimized apps so can't tolerate such, LoL. +1 I disagree. Little science is still science. With each crime and every kindness we birth our future. ID: 1622985 ·

betreger Send message Joined: 29 Jun 99 Posts: 11361 Credit: 29,581,041 RAC: 66	Message 1622999 - Posted: 4 Jan 2015, 0:49:17 UTC - in response to Message 1622985. OTOH, splitting data which has never had Astropulse analysis done would be even better. That would not be "little science", it would be the oldest data with the newest MB processing plus autocorrelation and a whole new source of APs. As I have stated previously rerunning old data with new filters has been proven to yield results over at Einstein, how is this project different in that aspect? ID: 1622999 ·

Ulrich Metzner Volunteer tester Send message Joined: 3 Jul 02 Posts: 1256 Credit: 13,565,513 RAC: 13	Message 1623004 - Posted: 4 Jan 2015, 1:01:50 UTC Ok, what now, should it recrunch on MB7 or just crunch Einstein ahead? :? Please, answer fast, or Einstein will take over... :/ Aloha, Uli ID: 1623004 ·

betreger Send message Joined: 29 Jun 99 Posts: 11361 Credit: 29,581,041 RAC: 66	Message 1623012 - Posted: 4 Jan 2015, 1:57:57 UTC - in response to Message 1623004. Ok, what now, should it recrunch on MB7 or just crunch Einstein ahead? :? Please, answer fast, or Einstein will take over... :/ For me I crunch a lot over at Einstein so I would prefer to make modest contributions here also. ID: 1623012 ·

Mike Volunteer tester Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80	Message 1623091 - Posted: 4 Jan 2015, 9:41:33 UTC - in response to Message 1622999. OTOH, splitting data which has never had Astropulse analysis done would be even better. That would not be "little science", it would be the oldest data with the newest MB processing plus autocorrelation and a whole new source of APs. As I have stated previously rerunning old data with new filters has been proven to yield results over at Einstein, how is this project different in that aspect? Doesn't matter IMHO. Compared to the distance in space, what are 2 years? With each crime and every kindness we birth our future. ID: 1623091 ·

Zalster Volunteer tester Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242	Message 1623258 - Posted: 4 Jan 2015, 19:28:32 UTC - in response to Message 1623091. Fresh APs....hmm.. gobble gobble!!! ID: 1623258 ·

JaundicedEye Send message Joined: 14 Mar 12 Posts: 5375 Credit: 30,870,693 RAC: 1	Message 1623271 - Posted: 4 Jan 2015, 19:59:42 UTC I REALLY gotta quit maxing out on MB's when the AP Splitters aren't working. :{{ I'm now at the dreaded "Computer has reached its limit on tasks in progress". Can't gobble any AP's. Grrr. "Sour Grapes make a bitter Whine." <(0)> ID: 1623271 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874	Message 1623277 - Posted: 4 Jan 2015, 20:10:57 UTC - in response to Message 1623272.
There you go, SETI's servers just hate producing AP's:
1711 SETI@home 2015-01-04 20:59:59 Requesting new tasks for GPU
1712 2015-01-04 21:00:22 Project communication failed: attempting access to reference site
1713 SETI@home 2015-01-04 21:00:22 Scheduler request failed: Couldn't connect to server
1714 2015-01-04 21:00:23 Internet access OK - project servers may be temporarily down.
Didn't last long:
04/01/2015 19:58:27 | SETI@home | Requesting new tasks for NVIDIA GPU
04/01/2015 19:59:38 | SETI@home | Scheduler request failed: Timeout was reached
04/01/2015 20:01:18 | SETI@home | Sending scheduler request: To fetch work.
04/01/2015 20:01:18 | SETI@home | Requesting new tasks for NVIDIA GPU
04/01/2015 20:01:43 | SETI@home | Scheduler request completed: got 3 new tasks
04/01/2015 20:01:43 | SETI@home | Resent lost task 24no12aa.301.18881.438086664199.12.162_1
04/01/2015 20:01:43 | SETI@home | Resent lost task 24no12aa.301.18881.438086664199.12.165_0
04/01/2015 20:01:43 | SETI@home | Resent lost task 24no12aa.301.18881.438086664199.12.168_0
The servers most commonly fail at midnight: that was midday (Pacific time). I wonder... ID: 1623277 ·
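
The "midday (Pacific time)" remark can be checked with a little arithmetic. The sketch below assumes the second log above is stamped in UTC (UK local time in January) and uses a fixed UTC-8 offset for Pacific Standard Time, so it does not handle daylight saving; the function name is invented for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC-8 offset, which is valid for PST in January (no DST handling).
PST = timezone(timedelta(hours=-8), name="PST")


def utc_log_time_to_pacific(stamp):
    """Parse a 'DD/MM/YYYY HH:MM:SS' stamp, assumed to be UTC, and return it in PST."""
    utc_time = datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S").replace(tzinfo=timezone.utc)
    return utc_time.astimezone(PST)


# The "Scheduler request failed: Timeout was reached" entry from the log above.
failure = utc_log_time_to_pacific("04/01/2015 19:59:38")
print(failure.strftime("%Y-%m-%d %H:%M:%S %Z"))
```

Under those assumptions the timeout lands just before noon at the lab, which is what the post means by midday.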

HAL9000 Volunteer tester Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57	Message 1623286 - Posted: 4 Jan 2015, 20:31:55 UTC - in response to Message 1623260. Nom, nom, nom.... While they last, which will not be for long this time. Here I am getting ready to switch my fastest machines to run mostly on their backup projects & then we get all of these new AP tasks. ~~That is rather inconvenient.~~ I mean... Yay! New AP work! SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu ID: 1623286 ·

Dena Wiltsie Volunteer tester Send message Joined: 19 Apr 01 Posts: 1628 Credit: 24,230,968 RAC: 26	Message 1623302 - Posted: 4 Jan 2015, 21:02:18 UTC I keep thinking about bumping my queue size from one day to two but at 48 APs a day, I am having problems getting enough work units to fill a one-day queue. Good thing I decided to run one GPU and the CPU instead of no CPU and both GPUs. ID: 1623302 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 1623309 - Posted: 4 Jan 2015, 21:17:02 UTC Oh wow. I'm amazed. The B3_P1's from 01dc10aa are *NOT* 100% blanked. It's been a while since I've had a B3_P1 that actually has useful data in it. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 1623309 ·

Julie Volunteer moderator Volunteer tester Send message Joined: 28 Oct 09 Posts: 34053 Credit: 18,883,157 RAC: 18	Message 1623327 - Posted: 4 Jan 2015, 21:38:40 UTC - in response to Message 1622985. This was posted by Mark Sattler on the BOINC boards, he has apparently e-mailed all at the lab. Got word back from Matt that he believes these are datasets which have already had the AP work split and sent, but not the MB work. So the splitters run through them quickly and send no AP outbound because it has already been done. As a laboratory plan, that makes perfect sense. Unfortunately, Matt's recollection may not be perfect, as according to my records, all of the tapes (20) in the last two batches deployed have been processed for MB already, between 25 April 2012 and 04 May 2013 (hence all before sah v7, so at least the autocorrs are new). Would one of our AP specialists like to verify whether or not they have processed those tapes in the past, before we waste any more time redoing old work? I'll email my list to Matt. My personal feeling is that any data which involved blanking by AP v6 ought to be reprocessed by AP v7. The v7 processing is better, not just faster. OTOH, splitting data which has never had Astropulse analysis done would be even better. Joe I would reprocess only tasks with considerable blanking involved perhaps, to remove all those false positives from Berkeley's random noise generator :) Also, in MB6 => MB7 reprocessing would be good to insist on task modification to enable only those searches that really need to be redone. Any changes in Gaussians or Spikes or Triplets for example... I don't like the idea to redo ALL those searches just to get Autocorrelation too. Seems much bigger waste than to use unoptimized apps so can't tolerate such, LoL. +1 I disagree. Little science is still science. +1 Mike, as long as it's not mainstream. rOZZ Music Pictures ID: 1623327 ·

Admiral Gloval Send message Joined: 31 Mar 13 Posts: 20306 Credit: 5,308,449 RAC: 0	Message 1623329 - Posted: 4 Jan 2015, 21:42:10 UTC I am Not complaining but... I upgraded to 7.4.36 and now I am getting a lot of mini wu files. They are 01:36:14 in length. Is this something new or just short wu's? ID: 1623329 ·

Donald L. Johnson Send message Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20	Message 1623411 - Posted: 5 Jan 2015, 3:19:26 UTC - in response to Message 1623329. I am Not complaining but... I upgraded to 7.4.36 and now I am getting a lot of mini wu files. They are 01:36:14 in length. Is this something new or just short wu's? More shorties, Admiral. Half my current queue on both Core2Duo boxes are 2-hour shorties. Donald Infernal Optimist / Submariner, retired ID: 1623411 ·

Sleepy Volunteer tester Send message Joined: 21 May 99 Posts: 219 Credit: 98,947,784 RAC: 28,360	Message 1623747 - Posted: 5 Jan 2015, 16:11:11 UTC Last modified: 5 Jan 2015, 16:48:25 UTC
Is it just me, or now when we get to the 100 WU limit for CPU, does the server also enforce the limit for GPU tasks, even though those are far from their limit?
05/01/2015 17:44:50 SETI@home Sending scheduler request: To fetch work.
05/01/2015 17:44:50 SETI@home Requesting new tasks for GPU
05/01/2015 17:44:55 SETI@home Scheduler request completed: got 0 new tasks
05/01/2015 17:44:55 SETI@home Message from server: No tasks sent
05/01/2015 17:44:55 SETI@home Message from server: No tasks are available for AstroPulse v6
05/01/2015 17:44:55 SETI@home Message from server: No tasks are available for AstroPulse v7
05/01/2015 17:44:55 SETI@home Message from server: Tasks for AMD/ATI GPU are available, but your preferences are set to not accept them
05/01/2015 17:44:55 SETI@home Message from server: Tasks for Intel GPU are available, but your preferences are set to not accept them
05/01/2015 17:44:55 SETI@home Message from server: This computer has reached a limit on tasks in progress
Though actually on this PC I am slowly reaching the 100+100 WUs, therefore I may be wrong, but the answers above from the server made me a bit (wrongly?) suspicious. I am using BOINC 6.10.55 (but this problem seems server side, though it may be triggered by a non up-to-date answer from the client). Cheers, Sleepy ID: 1623747 ·
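
To see why the reply above is ambiguous, it can help to pull out just the server messages. This is a rough sketch, assuming each event log line is laid out as date, time, project name without spaces, then free text, as in the excerpt; `server_messages` and `hit_task_limit` are hypothetical helper names, not BOINC functions.

```python
SERVER_PREFIX = "Message from server: "


def server_messages(log_lines):
    """Return the text of every 'Message from server' line (assumed layout: date time project text)."""
    messages = []
    for line in log_lines:
        parts = line.split(None, 3)  # date, time, project, remainder
        if len(parts) == 4 and parts[3].startswith(SERVER_PREFIX):
            messages.append(parts[3][len(SERVER_PREFIX):])
    return messages


def hit_task_limit(log_lines):
    """True if the server reported the tasks-in-progress limit."""
    return any("reached a limit on tasks in progress" in m for m in server_messages(log_lines))


sample = [
    "05/01/2015 17:44:50 SETI@home Requesting new tasks for GPU",
    "05/01/2015 17:44:55 SETI@home Scheduler request completed: got 0 new tasks",
    "05/01/2015 17:44:55 SETI@home Message from server: No tasks sent",
    "05/01/2015 17:44:55 SETI@home Message from server: This computer has reached a limit on tasks in progress",
]
print(server_messages(sample))
print(hit_task_limit(sample))
```

In the excerpt, the limit message itself does not name a resource, so from the log alone it is not obvious whether the CPU or the GPU quota was the one reached.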

HAL9000 Volunteer tester Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57	Message 1623766 - Posted: 5 Jan 2015, 16:48:42 UTC - in response to Message 1623747. Is it just me or now when we get at the CPU 100 WUs limit, the server issues and enforces the limit also for GPU tasks, though those are well far from the limit? Happening on both PCs of mine under BOINC 6.10.55 (but this problem seems server side, though it may be triggered by a non up-to-date answer from the client)... In case it is a general problem, in the meantime: Aborting (urghh...) CPU task to make room for GPU tasks (with all bad consequences in terms of penalties and for the whole system)? Rescheduling now almost necessary (added urghhh...)? Cheers, Sleepy My host 6727898 seems to be staying at its limit of 100 CPU + 100 GPU tasks. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu ID: 1623766 ·

Sleepy Volunteer tester Send message Joined: 21 May 99 Posts: 219 Credit: 98,947,784 RAC: 28,360	Message 1623773 - Posted: 5 Jan 2015, 17:13:42 UTC - in response to Message 1623766. Yes, it seems my mistake. Too many things happening lately and now that all is getting business as usual again I got scared by it! ;-) Sleepy ID: 1623773 ·

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.