Panic Mode On (94) Server Problems?

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14653
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1622882 - Posted: 3 Jan 2015, 19:48:46 UTC - in response to Message 1622867.  

Here's my run-through of my job_log:

The first one's Unix timestamp is Aug 08, 2012:
1344411405.758627 ue 40443.513279 ct 44801.180000 fe 100366914056685.200000 nm ap_06my12ab_B6_P1_00397_20120728_06219.wu_0
1348411651.444180 ue 41462.286580 ct 41549.820000 fe 99312357715860.406000 nm ap_28jn12ab_B1_P1_00327_20120913_22283.wu_1
1353416891 ue 41154.808134 ct 44612.900000 fe 100748165223275 nm ap_27au12ab_B4_P1_00033_20121109_21887.wu_2 et 44755.875742
1354547709 ue 43096.733593 ct 41306.480000 fe 103747720196172 nm ap_27au12ab_B3_P0_00383_20121109_08471.wu_3 et 41408.402161
1354823446 ue 42485.311151 ct 36223.400000 fe 103782242333548 nm ap_01se12ab_B2_P1_00034_20121106_10784.wu_3 et 36360.880091
1365786085 ue 42041.303007 ct 43234.320000 fe 101221168606141 nm ap_30jn12ad_B0_P0_00323_20130331_14381.wu_2 et 43327.372242
1367166402 ue 41867.606117 ct 44128.500000 fe 102913816340793 nm ap_29ja13aa_B4_P0_00011_20130418_18429.wu_0 et 44210.891695
1367705446 ue 42201.410116 ct 44351.470000 fe 103734332406452 nm ap_26fe13ae_B6_P0_00170_20130424_29841.wu_0 et 44437.584740
1371698309 ue 43973.768278 ct 35040.840000 fe 107073863886764 nm ap_24no12aa_B3_P0_00177_20130610_00904.wu_0 et 35083.251799
1371782925 ue 43491.213005 ct 37562.650000 fe 105898866616610 nm ap_03ja12ai_B5_P0_00025_20130611_18331.wu_0 et 37606.444761

Last one's timestamp is June 21, 2013.

If you (or anybody else) feel you have a *comprehensive* job log/s with tasks from all AP tapes, I could set up a parallel database and compare them with my MB records.
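
For anyone wanting to do the same run-through on their own log, here is a minimal Python sketch that parses lines in the format shown above. It assumes each line is a leading Unix timestamp followed by whitespace-separated key/value pairs, and that AP task names start with `ap_<tape>_B<n>_P<n>_`. The field meanings in the comments are my reading of the BOINC job log, not something stated in this thread, and the helper names `parse_job_log_line` and `ap_tape` are made up for illustration.

```python
import re
from datetime import datetime, timezone

# Assumed field meanings: ue = estimated runtime, ct = CPU time,
# fe = estimated flops, nm = task name, et = elapsed time (not on every line).
AP_NAME = re.compile(r"^ap_(?P<tape>[0-9a-z]+)_B(?P<beam>\d)_P(?P<pol>\d)_")

def parse_job_log_line(line):
    """Parse one job_log line into a dict (hypothetical helper)."""
    parts = line.split()
    record = {"completed": datetime.fromtimestamp(float(parts[0]), tz=timezone.utc)}
    # The remaining tokens come in key/value pairs: ue <x> ct <y> ...
    for key, value in zip(parts[1::2], parts[2::2]):
        record[key] = value if key == "nm" else float(value)
    return record

def ap_tape(task_name):
    """Return the tape name from an AP task name, or None if it does not match."""
    match = AP_NAME.match(task_name)
    return match.group("tape") if match else None

sample = [
    "1344411405.758627 ue 40443.513279 ct 44801.180000 fe 100366914056685.200000 nm ap_06my12ab_B6_P1_00397_20120728_06219.wu_0",
    "1371782925 ue 43491.213005 ct 37562.650000 fe 105898866616610 nm ap_03ja12ai_B5_P0_00025_20130611_18331.wu_0 et 37606.444761",
]

records = [parse_job_log_line(line) for line in sample]
for rec in records:
    # Completion date (UTC), tape name, CPU seconds
    print(rec["completed"].date(), ap_tape(rec["nm"]), rec["ct"])
```

Collecting `ap_tape(rec["nm"])` into a set would give the list of AP tapes a host has seen, which is the piece needed for a comparison against MB records.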
ID: 1622882
Julie
Volunteer moderator
Volunteer tester
Joined: 28 Oct 09
Posts: 34053
Credit: 18,883,157
RAC: 18
Belgium
Message 1622975 - Posted: 3 Jan 2015, 23:39:43 UTC - in response to Message 1622850.  

This was posted by Mark Sattler on the BOINC boards; he has apparently e-mailed everyone at the lab.

Got word back from Matt that he believes these are datasets which have already had the AP work split and sent, but not the MB work.
So the splitters run through them quickly and send no AP outbound because it has already been done.

As a laboratory plan, that makes perfect sense.

Unfortunately, Matt's recollection may not be perfect, as according to my records, all of the tapes (20) in the last two batches deployed have been processed for MB already, between 25 April 2012 and 04 May 2013 (hence all before sah v7, so at least the autocorrs are new). Would one of our AP specialists like to verify whether or not they have processed those tapes in the past, before we waste any more time redoing old work? I'll email my list to Matt.

My personal feeling is that any data which involved blanking by AP v6 ought to be reprocessed by AP v7. The v7 processing is better, not just faster.

OTOH, splitting data which has never had Astropulse analysis done would be even better.
                                                                  Joe


I would reprocess only tasks with considerable blanking involved perhaps, to remove all those false positives from Berkeley's random noise generator :)

Also, in MB6 => MB7 reprocessing would be good to insist on task modification to enable only those searches that really need to be redone. Any changes in Gaussians or Spikes or Triplets for example...

I don't like the idea to redo ALL those searches just to get Autocorrelation too. Seems much bigger waste than to use unoptimized apps so can't tolerate such, LoL.


+1
rOZZ
Music
Pictures
ID: 1622975
betreger Project Donor
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1622978 - Posted: 3 Jan 2015, 23:51:40 UTC

I don't like the idea to redo ALL those searches just to get Autocorrelation too

I'd rather do that than nothing.
My personal feeling is that any data which involved blanking by AP v6 ought to be reprocessed by AP v7. The v7 processing is better, not just faster.
+ a lot

OTOH, splitting data which has never had Astropulse analysis done would be even better.

No it wouldn't, it would be best.
ID: 1622978
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1622985 - Posted: 4 Jan 2015, 0:25:57 UTC - in response to Message 1622975.  

This was posted by Mark Sattler on the BOINC boards; he has apparently e-mailed everyone at the lab.

Got word back from Matt that he believes these are datasets which have already had the AP work split and sent, but not the MB work.
So the splitters run through them quickly and send no AP outbound because it has already been done.

As a laboratory plan, that makes perfect sense.

Unfortunately, Matt's recollection may not be perfect, as according to my records, all of the tapes (20) in the last two batches deployed have been processed for MB already, between 25 April 2012 and 04 May 2013 (hence all before sah v7, so at least the autocorrs are new). Would one of our AP specialists like to verify whether or not they have processed those tapes in the past, before we waste any more time redoing old work? I'll email my list to Matt.

My personal feeling is that any data which involved blanking by AP v6 ought to be reprocessed by AP v7. The v7 processing is better, not just faster.

OTOH, splitting data which has never had Astropulse analysis done would be even better.
                                                                  Joe


I would reprocess only tasks with considerable blanking involved perhaps, to remove all those false positives from Berkeley's random noise generator :)

Also, in MB6 => MB7 reprocessing would be good to insist on task modification to enable only those searches that really need to be redone. Any changes in Gaussians or Spikes or Triplets for example...

I don't like the idea to redo ALL those searches just to get Autocorrelation too. Seems much bigger waste than to use unoptimized apps so can't tolerate such, LoL.


+1


I disagree.
Little science is still science.


With each crime and every kindness we birth our future.
ID: 1622985
betreger Project Donor
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1622999 - Posted: 4 Jan 2015, 0:49:17 UTC - in response to Message 1622985.  

OTOH, splitting data which has never had Astropulse analysis done would be even better.

That would not be "little science"; it would be the oldest data with the newest MB processing plus autocorrelation, and a whole new source of APs. As I have stated previously, rerunning old data with new filters has been proven to yield results over at Einstein. How is this project different in that respect?
ID: 1622999
Ulrich Metzner
Volunteer tester
Joined: 3 Jul 02
Posts: 1256
Credit: 13,565,513
RAC: 13
Germany
Message 1623004 - Posted: 4 Jan 2015, 1:01:50 UTC

Ok, what now, should it recrunch on MB7 or just crunch Einstein ahead? :?
Please, answer fast, or Einstein will take over... :/
Aloha, Uli

ID: 1623004
betreger Project Donor
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1623012 - Posted: 4 Jan 2015, 1:57:57 UTC - in response to Message 1623004.  

Ok, what now, should it recrunch on MB7 or just crunch Einstein ahead? :?
Please, answer fast, or Einstein will take over... :/

For me I crunch a lot over at Einstein so I would prefer to make modest contributions here also.
ID: 1623012
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1623091 - Posted: 4 Jan 2015, 9:41:33 UTC - in response to Message 1622999.  

OTOH, splitting data which has never had Astropulse analysis done would be even better.

That would not be "little science"; it would be the oldest data with the newest MB processing plus autocorrelation, and a whole new source of APs. As I have stated previously, rerunning old data with new filters has been proven to yield results over at Einstein. How is this project different in that respect?


Doesn't matter IMHO.

Compared to the distances in space, what are 2 years?


With each crime and every kindness we birth our future.
ID: 1623091
Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1623258 - Posted: 4 Jan 2015, 19:28:32 UTC - in response to Message 1623091.  

Fresh APs....hmm.. gobble gobble!!!
ID: 1623258
JaundicedEye
Joined: 14 Mar 12
Posts: 5375
Credit: 30,870,693
RAC: 1
United States
Message 1623271 - Posted: 4 Jan 2015, 19:59:42 UTC

I REALLY gotta quit maxing out on MB's when the AP Splitters aren't working. :{{

I'm now at the dreaded "Computer has reached its limit on tasks in progress".

Can't gobble any AP's.

Grrr.

"Sour Grapes make a bitter Whine." <(0)>
ID: 1623271
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14653
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1623277 - Posted: 4 Jan 2015, 20:10:57 UTC - in response to Message 1623272.  

There you go, SETI's servers just hate producing AP's:

1711 SETI@home 2015-01-04 20:59:59 Requesting new tasks for GPU
1712 2015-01-04 21:00:22 Project communication failed: attempting access to reference site
1713 SETI@home 2015-01-04 21:00:22 Scheduler request failed: Couldn't connect to server
1714 2015-01-04 21:00:23 Internet access OK - project servers may be temporarily down.

Didn't last long:

04/01/2015 19:58:27 | SETI@home | Requesting new tasks for NVIDIA GPU
04/01/2015 19:59:38 | SETI@home | Scheduler request failed: Timeout was reached
04/01/2015 20:01:18 | SETI@home | Sending scheduler request: To fetch work.
04/01/2015 20:01:18 | SETI@home | Requesting new tasks for NVIDIA GPU
04/01/2015 20:01:43 | SETI@home | Scheduler request completed: got 3 new tasks
04/01/2015 20:01:43 | SETI@home | Resent lost task 24no12aa.301.18881.438086664199.12.162_1
04/01/2015 20:01:43 | SETI@home | Resent lost task 24no12aa.301.18881.438086664199.12.165_0
04/01/2015 20:01:43 | SETI@home | Resent lost task 24no12aa.301.18881.438086664199.12.168_0

The servers most commonly fail at midnight: that was midday (Pacific time). I wonder...
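
As a quick check of the midday remark, the failed request at 19:59:38 in the log above can be converted to Pacific time. This is only a sketch under two assumptions: that the log timestamps are UK local time (which equals UTC in January), and that Pacific Standard Time can be treated as a fixed UTC-8 offset rather than looked up in a timezone database.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the log is in UK local time, which equals UTC in January.
# Assumption: Pacific Standard Time is a fixed UTC-8 offset (no DST in January).
PST = timezone(timedelta(hours=-8), "PST")

failed_utc = datetime.strptime(
    "04/01/2015 19:59:38", "%d/%m/%Y %H:%M:%S"
).replace(tzinfo=timezone.utc)

failed_pacific = failed_utc.astimezone(PST)
print(failed_pacific.strftime("%Y-%m-%d %H:%M:%S %Z"))  # 2015-01-04 11:59:38 PST
```

Under those assumptions the timeout lands within a minute of noon at Berkeley.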
ID: 1623277
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1623286 - Posted: 4 Jan 2015, 20:31:55 UTC - in response to Message 1623260.  

Nom, nom, nom....
While they last, which will not be for long this time.

Here I am getting ready to switch my fastest machines to run mostly on their backup projects & then we get all of these new AP tasks. That is rather inconvenient.
I mean... Yay! New AP work!
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1623286
Dena Wiltsie
Volunteer tester
Joined: 19 Apr 01
Posts: 1628
Credit: 24,230,968
RAC: 26
United States
Message 1623302 - Posted: 4 Jan 2015, 21:02:18 UTC

I keep thinking about bumping my queue size from one day to two, but at 48 APs a day I am having problems getting enough work units to fill even a one-day queue. Good thing I decided to run one GPU and the CPU instead of no CPU and both GPUs.
ID: 1623302
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1623309 - Posted: 4 Jan 2015, 21:17:02 UTC

Oh wow. I'm amazed. The B3_P1's from 01dc10aa are *NOT* 100% blanked. It's been a while since I've had a B3_P1 that actually has useful data in it.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1623309
Julie
Volunteer moderator
Volunteer tester
Send message
Joined: 28 Oct 09
Posts: 34053
Credit: 18,883,157
RAC: 18
Belgium
Message 1623327 - Posted: 4 Jan 2015, 21:38:40 UTC - in response to Message 1622985.  

This was posted by Mark Sattler on the BOINC boards; he has apparently e-mailed everyone at the lab.

Got word back from Matt that he believes these are datasets which have already had the AP work split and sent, but not the MB work.
So the splitters run through them quickly and send no AP outbound because it has already been done.

As a laboratory plan, that makes perfect sense.

Unfortunately, Matt's recollection may not be perfect, as according to my records, all of the tapes (20) in the last two batches deployed have been processed for MB already, between 25 April 2012 and 04 May 2013 (hence all before sah v7, so at least the autocorrs are new). Would one of our AP specialists like to verify whether or not they have processed those tapes in the past, before we waste any more time redoing old work? I'll email my list to Matt.

My personal feeling is that any data which involved blanking by AP v6 ought to be reprocessed by AP v7. The v7 processing is better, not just faster.

OTOH, splitting data which has never had Astropulse analysis done would be even better.
                                                                  Joe


I would reprocess only tasks with considerable blanking involved perhaps, to remove all those false positives from Berkeley's random noise generator :)

Also, in MB6 => MB7 reprocessing would be good to insist on task modification to enable only those searches that really need to be redone. Any changes in Gaussians or Spikes or Triplets for example...

I don't like the idea to redo ALL those searches just to get Autocorrelation too. Seems much bigger waste than to use unoptimized apps so can't tolerate such, LoL.


+1


I disagree.
Little science is still science.


+1 Mike, as long as it's not mainstream.
rOZZ
Music
Pictures
ID: 1623327
Admiral Gloval
Joined: 31 Mar 13
Posts: 20306
Credit: 5,308,449
RAC: 0
United States
Message 1623329 - Posted: 4 Jan 2015, 21:42:10 UTC

I am Not complaining but... I upgraded to 7.4.36 and now I am getting a lot of mini wu files. They are 01:36:14 in length. Is this something new or just short wu's?

ID: 1623329
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1623411 - Posted: 5 Jan 2015, 3:19:26 UTC - in response to Message 1623329.  

I am Not complaining but... I upgraded to 7.4.36 and now I am getting a lot of mini wu files. They are 01:36:14 in length. Is this something new or just short wu's?

More shorties, Admiral. Half my current queue on both Core2Duo boxes are 2-hour shorties.
Donald
Infernal Optimist / Submariner, retired
ID: 1623411
Sleepy
Volunteer tester
Joined: 21 May 99
Posts: 219
Credit: 98,947,784
RAC: 28,360
Italy
Message 1623747 - Posted: 5 Jan 2015, 16:11:11 UTC
Last modified: 5 Jan 2015, 16:48:25 UTC

Is it just me, or does the server now enforce the limit for GPU tasks too once we reach the 100-WU CPU limit, even though the GPU tasks are far from their limit?

05/01/2015 17:44:50	SETI@home	Sending scheduler request: To fetch work.
05/01/2015 17:44:50	SETI@home	Requesting new tasks for GPU
05/01/2015 17:44:55	SETI@home	Scheduler request completed: got 0 new tasks
05/01/2015 17:44:55	SETI@home	Message from server: No tasks sent
05/01/2015 17:44:55	SETI@home	Message from server: No tasks are available for AstroPulse v6
05/01/2015 17:44:55	SETI@home	Message from server: No tasks are available for AstroPulse v7
05/01/2015 17:44:55	SETI@home	Message from server: Tasks for AMD/ATI GPU are available, but your preferences are set to not accept them
05/01/2015 17:44:55	SETI@home	Message from server: Tasks for Intel GPU are available, but your preferences are set to not accept them
05/01/2015 17:44:55	SETI@home	Message from server: This computer has reached a limit on tasks in progress


Though actually on this PC I am slowly reaching the 100+100 WUs, therefore I may be wrong, but the answers above from the server made me a bit (wrongly?) suspicious.
I am using BOINC 6.10.55 (but this problem seems server side, though it may be triggered by a non up-to-date answer from the client).

Cheers,

Sleepy
ID: 1623747
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1623766 - Posted: 5 Jan 2015, 16:48:42 UTC - in response to Message 1623747.  

Is it just me, or does the server now enforce the limit for GPU tasks too once we reach the 100-WU CPU limit, even though the GPU tasks are far from their limit?

Happening on both PCs of mine under BOINC 6.10.55 (but this problem seems server side, though it may be triggered by a non up-to-date answer from the client)...

In case it is a general problem, in the meantime:
Aborting (urghh...) CPU task to make room for GPU tasks (with all bad consequences in terms of penalties and for the whole system)?
Rescheduling now almost necessary (added urghhh...)?

Cheers,

Sleepy

My host 6727898 seems to be staying at its limit of 100 CPU + 100 GPU tasks.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1623766
Sleepy
Volunteer tester
Joined: 21 May 99
Posts: 219
Credit: 98,947,784
RAC: 28,360
Italy
Message 1623773 - Posted: 5 Jan 2015, 17:13:42 UTC - in response to Message 1623766.  

Yes, it seems it was my mistake.

Too many things have been happening lately, and now that everything is getting back to business as usual I got scared by it! ;-)

Sleepy
ID: 1623773


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.