Two Fried Solaris Servers in less than a week.
Author | Message |
---|---|
Celtic Wolf Joined: 3 Apr 99 Posts: 3278 Credit: 595,676 RAC: 0 |
OK, I am hoping this is just a coincidence, but having two Solaris servers crash in less than a week at different locations has me wondering. Both were running the BOINC 4.25 client. One was running Solaris 8 (E-220-R, 2 GB RAM, dual processors), the other Solaris 9 (Ultra 10, 512 MB RAM, single processor). In the case of the Ultra 10 it appears to be a power supply issue. The reason for the E-220-R dying is still up in the air. Has anyone run into a similar issue with a Solaris system just dying? Both of these systems are heavily monitored, and until they died there were no indications of failure. I have shut down BOINC on my other Solaris systems until I can rule out BOINC being the issue. Before anyone asks, both systems are or were well cooled!! |
Ned Slider Joined: 12 Oct 01 Posts: 668 Credit: 4,375,315 RAC: 0 |
Well, if they were well cooled, it's not that. Running SETI does put extra load on the CPU, which in turn draws extra current from the supply. On PC-based systems I've observed this first hand, both in my UPS logs and by simply sticking a voltmeter in the supply lines and watching the voltages drop slightly. A quality supply will handle this without a problem, but a flaky supply could theoretically be pushed past its limits and fail. Ned *** My Guide to Compiling Optimised BOINC and SETI Clients *** *** Download Optimised BOINC and SETI Clients for Linux Here *** |
MikeSW17 Joined: 3 Apr 99 Posts: 1603 Credit: 2,700,523 RAC: 0 |
Bear in mind though that BOINC apps are not the only software that can fully use a CPU and memory to their designed limits. BOINC does not overclock or push a system beyond what it was designed or built for. If the fault was caused by full load conditions, BOINC just happened to be there at the time, and shouldn't get the blame. |
Ned Slider Joined: 12 Oct 01 Posts: 668 Credit: 4,375,315 RAC: 0 |
Quite right, Mike. I was just trying to put forward a possible explanation as to why the machines may have failed. Presumably if the original poster is able to run BOINC on them, then they weren't originally being run with a constantly full CPU load. Ned *** My Guide to Compiling Optimised BOINC and SETI Clients *** *** Download Optimised BOINC and SETI Clients for Linux Here *** |
Celtic Wolf Joined: 3 Apr 99 Posts: 3278 Credit: 595,676 RAC: 0 |
Neither system exceeded 3% use. Just strange that two systems whose only commonality was BOINC failed in the same manner. It is not yet proven that the power supplies are definitely at fault. If the CPU fried, the symptoms would be similar to a power supply output failure. |
Celtic Wolf Joined: 3 Apr 99 Posts: 3278 Credit: 595,676 RAC: 0 |
Well, for those that care: I have determined the cause of failure, and it was indeed coincidental that they both occurred at the same time. The primary drive on my E-220-R is failing, and the power supply failed on the Ultra 10. Unrelated events. I am once again BOINC'in on the other servers. |
Kathy Joined: 5 Jan 03 Posts: 338 Credit: 27,877,436 RAC: 0 |
Sorry that happened to ya, CW. Glad to read that you located the problems and are back to crunching! Kathy |