Harsh, but could help

Author	Message
Paul Shellien Joined: 6 Sep 03 Posts: 8 Credit: 1,171,683 RAC: 0	Message 77517 - Posted: 8 Feb 2005, 12:54:31 UTC
Just a suggestion. Why do they not take the project down for 48 hours and let the splitters work away and build up a good cache of WUs? That way, when the servers come back online and start accepting connections, everyone who requests a WU will get one, and we will not be stuck in the situation we are in now, with the splitters playing catch-up all the time.
Does anyone else have a view on my idea? Good or bad, please comment.
ID: 77517 ·

Astro Volunteer tester Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0	Message 77518 - Posted: 8 Feb 2005, 12:59:39 UTC Last modified: 8 Feb 2005, 13:18:35 UTC
From my understanding, the splitters are working fine. After a tape is split, the data is written to the database, and therein lies the problem. Half of the DB resources are being taken up by MySQL's internal garbage collection (see Matt's posts below), and that limits the number of split units the DB can handle. This, I think, is what's limiting our ability to get work.
tony
Matt Lebofsky wrote the following in the "Are we back, looks like it" thread; there's more info there. This was written AFTER the migration. For those who don't know Matt, he's the person who physically places the tapes on the splitters, and I'm sure he does other things.
Matt Lebofsky Joined: Mar 2, 1999 Posts: 60 ID: 122079 Posted: 8 Feb 2005 4:18:40 UTC
Well, the actual migration wasn't terribly exciting. We fired off a mysqldump that copied everything onto the new server. About 9 hours later it was done. Initial data checks look okay. But the fact it didn't crash during the data transfer is exciting. We'll run some tests tomorrow and then add the plumbing to make it a replica. If all goes well, it'll be the master by the end of the week.
- Matt BOINC/SETI@home
Matt Lebofsky Joined: Mar 2, 1999 Posts: 60 ID: 122079 Posted: 8 Feb 2005 4:27:39 UTC
> good news!...so we should expect things to function more or less the way they
> have been until the replica becomes the master?
Yup. I know it's slooow, but at least things work. There is still the perfectly valid hope that mysql's internal merge/purge, which is eating up a lot of I/O, could finish at any second, rendering our current database more than adequate for now. Until then, well, we'll just have to deal with a bit of sluggishness here and there.
- Matt BOINC/SETI@home
Matt Lebofsky Joined: Mar 2, 1999 Posts: 60 ID: 122079 Posted: 8 Feb 2005 4:45:16 UTC
> Any further info on the strange I/O activity? Will it continue to impact the
> current server performance for a while, and is it expected to continue on the
> new hardware?
We're pretty convinced at this point it has to do with innodb caching that we didn't have tuned perfectly (thanks to low memory and disk space on the server). Bob and Jeff understand this part better than me, but the basic gist of it is: the data were logically up to date, but not committed to disk in a manner that mysql finds pleasing (or efficient). Eventually this hits a limit where mysql goes, "okay - I'm going to take 50% of your I/O power and take care of some garbage collection," and there's nothing you can do about it until it's done.
- Matt BOINC/SETI@home
ID: 77518 ·

1mp0£173 Volunteer tester Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0	Message 77558 - Posted: 8 Feb 2005, 16:54:51 UTC - in response to Message 77517.
> Just a suggestion. Why do they not take the project down for 48 hours and let
> the splitters work away and build up a good cache of WUs? That way, when the
> servers come back online and start accepting connections, everyone who requests
> a WU will get one, and we will not be stuck in the situation we are in now, with
> the splitters playing catch-up all the time.
>
> Does anyone else have a view on my idea? Good or bad, please comment.
Based on what Matt said, the problem will persist as long as MySQL is struggling with garbage collection. Taking the rest of the project down may or may not let the splitters run.
... and as I read his messages, it's mostly RAM -- the current database server just doesn't have enough.
So, given the way most people panic at the slightest glitch, the best thing for our friends in Berkeley is to continue with the move to the new server as quickly as possible while also making sure that everything goes perfectly.
ID: 77558 ·

Paul Shellien Joined: 6 Sep 03 Posts: 8 Credit: 1,171,683 RAC: 0	Message 77647 - Posted: 8 Feb 2005, 22:11:17 UTC - in response to Message 77558.
I now stand corrected. The snags, it appears, are with the database. The same premise still stands: take it down and stop accepting requests until it is all sorted out. Surely that has to be the best option? Everyone who runs multiple projects on BOINC can then direct their SETI resources to the other projects during the downtime, and redirect them once SETI has sorted itself out and is ready to distribute WUs.
ID: 77647 ·

Astro Volunteer tester Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0	Message 77648 - Posted: 8 Feb 2005, 22:16:55 UTC - in response to Message 77647.
> The same premise still stands. Take it down and stop accepting requests until it
> is all sorted out. Surely that has to be the best option?
I have no problem with this.
tony
ID: 77648 ·

MattDavis Volunteer tester Joined: 11 Nov 99 Posts: 919 Credit: 934,161 RAC: 0	Message 77652 - Posted: 8 Feb 2005, 22:31:33 UTC
SETI needs a vacation 8)
ID: 77652 ·

1mp0£173 Volunteer tester Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0	Message 77664 - Posted: 8 Feb 2005, 23:30:30 UTC - in response to Message 77647.
> I now stand corrected. The snags, it appears, are with the database. The
> same premise still stands. Take it down and stop accepting requests until it
> is all sorted out. Surely that has to be the best option?
... because BOINC/SETI is successfully sending out work (or I would have run out last week).
But the biggest reason is: if they are going to test under load, they need a load for testing.
... and here we are, providing that load.
ID: 77664 ·

baalthazaar Joined: 23 Mar 03 Posts: 1 Credit: 10,194 RAC: 0	Message 77746 - Posted: 9 Feb 2005, 4:35:27 UTC - in response to Message 77558.
Maybe they ought to consider switching databases... I hear that Ingres is open source now...
> > Just a suggestion. Why do they not take the project down for 48 hours and let
> > the splitters work away and build up a good cache of WUs? That way, when the
> > servers come back online and start accepting connections, everyone who requests
> > a WU will get one, and we will not be stuck in the situation we are in now, with
> > the splitters playing catch-up all the time.
> >
> > Does anyone else have a view on my idea? Good or bad, please comment.
>
> Based on what Matt said, the problem will persist as long as MySQL is
> struggling with garbage collection. Taking the rest of the project down may or
> may not let the splitters run.
>
> ... and as I read his messages, it's mostly RAM -- the current database server
> just doesn't have enough.
>
> So, given the way most people panic at the slightest glitch, the best thing
> for our friends in Berkeley is to continue with the move to the new server as
> quickly as possible while also making sure that everything goes perfectly.
ID: 77746 ·

1mp0£173 Volunteer tester Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0	Message 77753 - Posted: 9 Feb 2005, 5:41:28 UTC - in response to Message 77746.
> Maybe they ought to consider switching databases..... I hear that Ingres is
> open source now...
Garbage collection is one of a class of problems that just seems to exist everywhere in some form or another. Ingres may very well have the same kind of issue somewhere -- especially if run on "small" hardware.
ID: 77753 ·

©2025 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.