Panic Mode On (88) Server Problems?
Author | Message |
---|---|
Wiggo Send message Joined: 24 Jan 00 Posts: 35392 Credit: 261,360,520 RAC: 489 |
23ja09aa Again? It looks that way. :-( Cheers. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13786 Credit: 208,696,464 RAC: 304 |
23ja09aa looks to be running normally again & only has one splitter on it with 3 channels complete now. Creation rate is now at ~26/sec. In a few hours the RTS will probably be back at full capacity. Splitter output has increased, but only slightly. Ready-to-send buffer remains at 0. Grant Darwin NT |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
23ja09aa looks to be running normally again & only has one splitter on it with 3 channels complete now. Creation rate is now at ~26/sec. In a few hours the RTS will probably be back at full capacity. Yeah that tape is only nerfing 1 splitter instead of 2 now. The number of tasks in progress is rising. So once the tapes full of shorties are done we might build up a send buffer again. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[ |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Seems like our friend 23ja09aa is still having problems. |
Wiggo Send message Joined: 24 Jan 00 Posts: 35392 Credit: 261,360,520 RAC: 489 |
Straight back from the outage and 23ja09aa has tied up 2 splitters again. :-( Cheers. |
Josef W. Segur Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
One of the Berkeley campus core routers has a problem, so services failed over to a backup. An interesting side effect is that the usual Cricket graph from the data center inr-211 router shows only the blue line for our uploads, etc. The green for our downloads, etc., is shown on the alternate Cricket graph from data center router inr-210. Joe |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
The bandwidth graphs show the outbound is on router port 210 again, but inbound seems to still be on 211. http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=/router-interfaces/inr-211/gigabitethernet6_17&ranges=d%3Aw&view=Octets http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=/router-interfaces/inr-210/gigabitethernet6_17&ranges=d%3Aw&view=Octets So data is flying out normally after the outage. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[ |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13786 Credit: 208,696,464 RAC: 304 |
Straight back from the outage and 23ja09aa has tied up 2 splitters again. :-( Joy. Splitter output is less than it was before the outage, and before the outage it wasn't enough. Probably one in 15-20 requests results in work, but nowhere near enough to fill up the cache again. Before the outage about 1 in 7-10 requests got work, and usually enough to keep the cache close to full. Grant Darwin NT |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13786 Credit: 208,696,464 RAC: 304 |
EDIT- well the work going out should help offset how little there is. Everything I've got so far for the CPU is VLAR & all the GPU stuff is long running as well, no shorties or even mid range stuff in sight. Still a good 50+ short of a full cache. Grant Darwin NT |
Darth Beaver Send message Joined: 20 Aug 99 Posts: 6728 Credit: 21,443,075 RAC: 3 |
I hope someone is looking into it; it seems to be getting worse. |
Darth Beaver Send message Joined: 20 Aug 99 Posts: 6728 Credit: 21,443,075 RAC: 3 |
Thanks, HAL9000, for the Cricket links. I've been trying to find them, so thank you very much. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13786 Credit: 208,696,464 RAC: 304 |
Splitter output has dropped off even further. While not full, my caches have remained close to it, but now they are starting to drain down. Even more requests for work are resulting in none, and those that do get some work are getting less than before. Grant Darwin NT |
Darth Beaver Send message Joined: 20 Aug 99 Posts: 6728 Credit: 21,443,075 RAC: 3 |
Same here, Grant, fast running out... backup project till it's fixed if I run out. |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
23ja09aa still having problems. |
KWSN THE Holy Hand Grenade! Send message Joined: 20 Dec 05 Posts: 3187 Credit: 57,163,290 RAC: 0 |
Confirmed splitter problem(s): one or more of them need a re-boot! Hello, from Albany, CA!... |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Another night passes and another splitter problem appears, seems like this is the new constant. |
kittyman Send message Joined: 9 Jul 00 Posts: 51470 Credit: 1,018,363,574 RAC: 1,004 |
Another night passes and another splitter problem appears, seems like this is the new constant. It's a bum dataset, 23ja09aa, that keeps getting stuck and ties up a splitter or two working on it when that happens. Have sent off another message to headquarters. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Thanks, Mark. Something else caught my attention: don't they have a "watchdog" on the splitters precisely to avoid that? |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
Thanks Mark. I think the main problem is the complexity of the dependencies among the various bits of the back-end. Adding another process to the mix to watch for errors may introduce more errors. I believe that there is/was a system in place to prevent the db from getting too large, but it was not working correctly, so the db server would crash instead. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[ |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Maybe they could do that in a simpler way: watch the splitting process and, if it hangs, just stop splitting that particular tape until someone can verify the problem. |
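
For what it's worth, the "watch the splitter and park the tape if it hangs" idea could look roughly like the sketch below. Nothing here reflects how the real SETI@home back-end works: the class name `SplitterWatchdog`, the `stall_timeout` value, and the idea that each splitter reports a "channels done" counter per tape are all assumptions made up for illustration.

```python
import time


class SplitterWatchdog:
    """Hypothetical watchdog: parks a tape whose splitter stops making progress."""

    def __init__(self, stall_timeout=600.0, clock=time.monotonic):
        # stall_timeout: seconds without progress before a tape counts as hung
        # (600 is an arbitrary, assumed default).
        self.stall_timeout = stall_timeout
        self.clock = clock            # injectable so the logic can be tested
        self._last_progress = {}      # tape -> (channels_done, time of last advance)
        self.quarantined = set()      # tapes waiting for a human to look at them

    def report(self, tape, channels_done):
        """Record a progress counter; the timestamp moves only if the counter advanced."""
        previous = self._last_progress.get(tape)
        if previous is None or channels_done > previous[0]:
            self._last_progress[tape] = (channels_done, self.clock())

    def check(self):
        """Quarantine every tape that has not advanced within stall_timeout."""
        now = self.clock()
        for tape, (_, stamp) in self._last_progress.items():
            if now - stamp > self.stall_timeout:
                self.quarantined.add(tape)
        return set(self.quarantined)

    def may_split(self, tape):
        """Splitters would ask this before picking up a tape."""
        return tape not in self.quarantined

    def release(self, tape):
        """Called by a person once the tape has been verified."""
        self.quarantined.discard(tape)
        self._last_progress.pop(tape, None)
```

The point of keeping it this small is HAL9000's concern above: the watchdog only reads a counter and flips a flag, so it adds very little that could itself break, and the tape stays parked until `release` is called by hand.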