September 11, 2006 - 22:00 UTC
We're back on line, more or less, after the science database crash last Friday. Work is being generated, and the assimilators are catching up on the backlogged results. But we're far from out of the woods.
We aren't quite finished getting the data unloaded from the old (broken) database. There is no data corruption, but we are currently operating on only one half of a RAID 10 mirror. In other words, one more drive failure and the whole database is toast. What happened? The two mirrors are on two separate drive arrays, and one of the two enclosures went belly-up, causing all our headaches this past Friday, and our cautionary measures over the weekend.
The rest of the data should be unloaded within the next 24 hours. Then we have to check the data and load it onto the new server. Then we check to make sure the data on both databases match. And then we shut down the whole project for a day and migrate all the workunits created and results uploaded since we started this migration effort a week ago.
Finally, we turn things over to the new database and start the project back up. If all is well, we'll be completely on the new server and can fully retire the old one.
Meanwhile, there is positive news about the data recorder. We haven't been able to take much data with the new multi-beam recorder because of DLT drive problems galore. In an effort to move away from that technology, we successfully implemented and tested using swappable SATA drives to store the data. As of this morning we have all the parts working at Arecibo. Soon we will get the parts ordered/installed/tested up here at the lab. Eventually, instead of shipping data back and forth on tapes, we'll be shipping whole 500GB drives.