Panic Mode On (88) Server Problems?
Author | Message |
---|---|
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
"18fe09ag has claimed its 2nd splitter, I see this morning, so we'll just have to make do running on 3 splitters until someone kicks that file loose again." Whatever has happened, for the last 3 hours splitter output has been lower than it was. It's dropped from the low 20s to the mid teens. And as a result the Ready-to-send buffer is very, very slowly (at this stage) running down. Grant Darwin NT |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
"18fe09ag has claimed its 2nd splitter, I see this morning, so we'll just have to make do running on 3 splitters until someone kicks that file loose again." 22fe09ah did finally start moving, but 18fe09ag has a hold of 2 splitters. So we are running on 3 out of 5 cylinders. We might just make it to maintenance, in about 16ish hours, without running dry. If no one touches anything & nothing else goes wonky... SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url] |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
"22fe09ah did finally start moving, but 18fe09ag has a hold of 2 splitters. So we are running on 3 out of 5 cylinders. We might just make it to maintenance, in about 16ish hours, without running dry. If no one touches anything & nothing else goes wonky..." At the rate the Ready-to-send buffer is currently running down, we'd probably get close to 48 hours (from the time it started running down) before it's empty. As you say, as long as nothing else happens... Grant Darwin NT |
Wiggo Send message Joined: 24 Jan 00 Posts: 36322 Credit: 261,360,520 RAC: 489 |
"18fe09ag has claimed its 2nd splitter, I see this morning, so we'll just have to make do running on 3 splitters until someone kicks that file loose again." 22fe09ah has made it to 3 done now so it must be alive. Cheers. |
Speedy Send message Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
I am unsure how you can tell how many splitters a tape is taking up. Could somebody please explain this to me? |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
There are 7 PFB (Multi Beam) splitters. Splitters 0 & 14 I've never seen running (always disabled), so that leaves 5 splitters available. Under Splitter Status are all the "tapes" loaded for work. They show as being Done, Channels in progress, Completed channels, Channels with errors. At the moment, all the AP work has been split, so they all show as Done. For MB, the light green blocks show completed channels (they have been split). The dark green ones are those that are in the process of being split. As there are 5 splitters running, there will be 5 dark green blocks to show which channels are being split. Generally, as the channels are split they will go from dark green (in progress) to light green (completed). If a block remains dark green after several hours, you can be sure that it's a "stuck tape". Since it's sitting there and not being split, that splitter isn't producing any work. At the moment 18fe09ag is the problem child: it shows 2 channels being split, but unfortunately it's been that way for quite a few hours now, which means those 2 splitters aren't producing any work, leaving only the other 3 to pump out new WUs. Grant Darwin NT |
Wiggo Send message Joined: 24 Jan 00 Posts: 36322 Credit: 261,360,520 RAC: 489 |
Notice how only 4 files are showing as being in progress instead of the usual 5, and also notice that the "channels in progress" dark green block beside the file 18fe09ag is twice the size of the others? ;-) Cheers. |
Speedy Send message Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
Thank you so much, Grant and Wiggo. It certainly makes sense to have one splitter on 20au08ag as it is close to finishing. Currently at (14) |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14672 Credit: 200,643,578 RAC: 874 |
"18fe09ag has claimed its 2nd splitter, I see this morning, so we'll just have to make do running on 3 splitters until someone kicks that file loose again." And it seems to be producing mainly shorties... |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
I hope the weekly outage clears the blockages. What looked like almost 2 days worth of work Ready-to-send now looks like not much more than 12 hours worth, if that. Grant Darwin NT |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
The 18fe09ag tape is stuck and the new work buffer is slowly going down. Hope they can kick it during today's outage. |
David S Send message Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
Yesterday, I looked at the Cricket and saw a big jump. I thought, oh good, APs are flowing. Then I got to the SSP and saw that APs are not flowing. So what's all that data going out? David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
"Yesterday, I looked at the Cricket and saw a big jump. I thought, oh good, APs are flowing. Then I got to the SSP and saw that APs are not flowing. So what's all that data going out?" Probably the processed data going from the colo back to the lab, so they can free up space to dump more data to the colo. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url] |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
Now let's see if 18fe09ag gets stuck on the 3rd channel. Maybe it will take off like the other tape did (once it was kicked enough). SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url] |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
"Now let's see if 18fe09ag gets stuck on the 3rd channel. Maybe it will take off like the other tape did (once it was kicked enough)." I'm not so confident. Normally, without problems, production is about 30/s, and now it is at about 21/s, which is what happens when one tape has a problem. But let's wait a few more hours to be sure. BTW, today's outage was really fast. Now we need more AP WUs to fill our caches. |
Wiggo Send message Joined: 24 Jan 00 Posts: 36322 Credit: 261,360,520 RAC: 489 |
"Now let's see if 18fe09ag gets stuck on the 3rd channel. Maybe it will take off like the other tape did (once it was kicked enough)." It certainly looks like that file is stuck again and I'm not getting many of my requests for more work answered. :-( Cheers. |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
"Now let's see if 18fe09ag gets stuck on the 3rd channel. Maybe it will take off like the other tape did (once it was kicked enough)." Could somebody who has access to the lab people ping them about it? They must still be around for a couple of hours, since it must be about 14:30 in CA now (if I have not made a mistake in the time zone conversion again). |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
MB splitter output still borked. Initially started off OK after the outage, but didn't take long to drop down to 20/s or less again. At least there's some AP work going out, so that will help reduce the demand for MB work. Grant Darwin NT |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
"MB splitter output still borked. Initially started off OK after the outage, but didn't take long to drop down to 20/s or less again." Of course it's broken; the 18fe09ag tape is still stuck. And since nobody kicks it, slow hosts start to get AP WUs, and that is bad for us since they normally return the crunched WUs close to the time limit, which makes our pendings rise. I still don't understand how anybody could write software that theoretically works stand-alone and not provide a watchdog to avoid exactly this kind of thing. That's one of the first things you learn in elementary software school. |
David S Send message Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
http://setiathome.berkeley.edu/show_host_detail.php?hostid=7352368 How is it possible for "Number of times client has contacted server" to be 0? Doesn't it have to contact at least once to get the 127 tasks that are probably going to start timing out in another 10 days? (This is just one of the TWO cases I currently have where an inconclusive has gone out to a _2 host, only to have said host disappear. The other one was working for five years, though, so I have a small bit of hope for its return.) David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
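
The stuck-tape check Grant describes earlier in the thread (a block that stays dark green for several hours means its splitters are producing nothing) can be sketched in a few lines of Python. This is only an illustration: the `TapeSnapshot` record, the `stuck_tapes` and `working_splitters` helpers, the 3-hour threshold, and the sample readings are all assumptions made up for the example, not anything the server status page or the splitter code actually provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TapeSnapshot:
    """One hand-recorded reading of the status page (hypothetical format)."""
    hours: float       # time of the reading, in hours since the first one
    tape: str          # tape/file name, e.g. "18fe09ag"
    in_progress: int   # dark green blocks: channels currently being split
    completed: int     # light green blocks: channels already split


def stuck_tapes(snapshots, min_hours=3.0):
    """Return {tape: splitters_held} for tapes showing no progress.

    A tape counts as stuck when its latest reading has channels in
    progress and both counts have been unchanged for >= min_hours.
    """
    history = {}
    for snap in sorted(snapshots, key=lambda s: s.hours):
        history.setdefault(snap.tape, []).append(snap)

    stuck = {}
    for tape, readings in history.items():
        last = readings[-1]
        if last.in_progress == 0:
            continue  # nothing is being split on this tape, so nothing is held
        unchanged_since = last.hours
        # Walk backwards while the readings look identical to the latest one.
        for snap in reversed(readings):
            if (snap.in_progress, snap.completed) != (last.in_progress, last.completed):
                break
            unchanged_since = snap.hours
        if last.hours - unchanged_since >= min_hours:
            stuck[tape] = last.in_progress
    return stuck


def working_splitters(total_enabled, stuck):
    """Splitters still producing work once the stuck ones are subtracted."""
    return total_enabled - sum(stuck.values())


# Invented readings that mimic the situation described in the thread.
snapshots = [
    TapeSnapshot(0.0, "18fe09ag", in_progress=2, completed=1),
    TapeSnapshot(2.0, "18fe09ag", in_progress=2, completed=1),
    TapeSnapshot(4.0, "18fe09ag", in_progress=2, completed=1),
    TapeSnapshot(0.0, "22fe09ah", in_progress=1, completed=2),
    TapeSnapshot(4.0, "22fe09ah", in_progress=1, completed=3),
]

stuck = stuck_tapes(snapshots)
print(stuck, working_splitters(5, stuck))
```

With these invented readings, 18fe09ag is flagged as holding 2 splitters, which leaves 3 of the 5 enabled splitters producing work, matching the "3 out of 5 cylinders" picture in the posts above. A real watchdog of the kind juan BFP asks for would also need some way to restart the stuck process, which this sketch does not attempt.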
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.