Problem with work units... (possibly BOINC itself)

Message boards : Number crunching : Problem with work units... (possibly BOINC itself)

Scott0309
Joined: 9 Jan 01
Posts: 2
Credit: 1,351
RAC: 0
United States
Message 192356 - Posted: 23 Nov 2005, 7:13:36 UTC

Problem: When BOINC finishes running a work unit, it goes ahead and uploads the data set it was working on, but the "Work" tab still shows the result as "Ready to Report". Is there some sort of automatic report function that I need to enable?

Thanks,
- Scott
ID: 192356
Robert J
Joined: 30 Mar 00
Posts: 115
Credit: 20,087,874
RAC: 15
United States
Message 192360 - Posted: 23 Nov 2005, 7:30:48 UTC - in response to Message 192356.  

Hi Scott,

The work unit will be reported automatically the next time that Seti asks for work. This is to reduce the demands on the server.

It is possible to report the work unit manually via the update button on the projects tab. Just make sure that Seti is the project that is highlighted if you are running more than one project.

Happy Crunching!

ID: 192360
cjsoftuk
Volunteer tester
Joined: 3 Sep 04
Posts: 248
Credit: 183,721
RAC: 0
United Kingdom
Message 192361 - Posted: 23 Nov 2005, 7:32:21 UTC

BOINC automatically reports results when they are within one day of expiring, I think. Do not worry. You can manually click on the project and then Update on the Projects tab to force it to be reported.
ID: 192361
Bill Barto
Joined: 28 Jun 99
Posts: 864
Credit: 58,712,313
RAC: 91
United States
Message 192362 - Posted: 23 Nov 2005, 7:32:23 UTC
Last modified: 23 Nov 2005, 7:35:25 UTC

The next time the BOINC manager sends a request to the scheduler to download new work, the results in the "ready to report" status will be reported. You can also force them to be reported by opening the "projects" tab, selecting SETI@home and doing an update.

[edit] OK, so I missed being #2 by three seconds. At least we all agreed [edit]
ID: 192362
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 192484 - Posted: 23 Nov 2005, 13:47:21 UTC

The BOINC Daemon reports completed work at the first of:

1) The next work request for a project.
2) 24 hours before the deadline.
3) Immediately if the work is completed later than 24 hours before the deadline.
4) When the "connect every X days" interval has passed since the work was completed.
5) With the next trickle (CPDN only at this point).
6) When the user clicks update.
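
As a rough illustration of how these rules combine, here is a minimal Python sketch that picks the earliest applicable report time. It is not BOINC's actual code; the function name, its parameters, and the idea of passing known future events as optional timestamps are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import Optional


def earliest_report_time(
    completed: datetime,
    deadline: datetime,
    connect_interval: timedelta,
    next_work_request: Optional[datetime] = None,
    next_trickle: Optional[datetime] = None,
    manual_update: Optional[datetime] = None,
) -> datetime:
    """Hypothetical helper: return the first moment any rule would fire."""
    candidates = [
        # Rules 2 and 3: 24 hours before the deadline, or right away if
        # the work finished inside that final 24-hour window.
        max(completed, deadline - timedelta(hours=24)),
        # Rule 4: the "connect every X days" interval after completion.
        completed + connect_interval,
    ]
    # Rules 1, 5 and 6: events that may or may not be scheduled.
    for event in (next_work_request, next_trickle, manual_update):
        if event is not None and event >= completed:
            candidates.append(event)
    return min(candidates)


# Example values only: a result finished two weeks before its deadline,
# with a one-day "connect every" setting and no other event scheduled.
done = datetime(2005, 11, 23, 7, 0)
due = datetime(2005, 12, 7, 7, 0)
print(earliest_report_time(done, due, timedelta(days=1)))
```

With no work request, trickle or manual update pending, the sketch falls back to whichever of the deadline rule and the connect-interval rule comes first.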


BOINC WIKI
ID: 192484
Scott0309
Joined: 9 Jan 01
Posts: 2
Credit: 1,351
RAC: 0
United States
Message 192706 - Posted: 23 Nov 2005, 22:10:02 UTC - in response to Message 192360.  

Thanks for the help. I really appreciate it. Now, if I could only get the SETI/BOINC client to display the graphics exactly the way the classic client did (not the Classic setting for the BOINC project, but the exact graphics), then I would be totally happy with the SETI/BOINC project settings. My problem is that I can't read the silly display when it is in screen saver mode.

Thanks,
- Scott

ID: 192706
kevint
Volunteer tester
Joined: 17 May 99
Posts: 414
Credit: 11,680,240
RAC: 0
United States
Message 192710 - Posted: 23 Nov 2005, 22:14:50 UTC


I agree, I liked the screen saver on the classic much better, all that rotating back and forth with BOINC makes me crazy. I don't run the screen saver anymore anyways because the screen saver tends to kill productivity but it is fun to watch every once in awhile.
ID: 192710
Landroval
Joined: 7 Oct 01
Posts: 188
Credit: 2,098,881
RAC: 1
United States
Message 192724 - Posted: 23 Nov 2005, 22:33:33 UTC - in response to Message 192706.  

The screensaver is part of the science app, not boinc...but by "the classic setting for the boinc project" did you mean something like emulating the classic screensaver? It's not an exact match, but it's pretty close, and certainly closer than the default settings.

(And I usually leave the screensaver off to save clock cycles, but I agree, the rotation and gyration is, um, distracting.)

Cheers,

Brian
If you think education is expensive, try ignorance.
ID: 192724
Brian Silvers
Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 192762 - Posted: 23 Nov 2005, 23:45:55 UTC - in response to Message 192484.  

Can you explain to me, either here or in email, as to why two separate RPC calls to the scheduler are performed rather than one? Where I work we are limited to a 4K CIR (56K burst) frame relay circuit. All messaging we do, either via MSMQ or IBM MQ Series, is optimized for maximum message content given the bandwidth constraints. I simply do not understand what is gained on the scheduler by having it hit an extra time to report. To me, the upload should be all that is required of the client, then a process at UCB (or the other database locations for the other projects) would handle populating the necessary fields off of that one message. I guess, on a fundamental level, I have issues with what I see as a design flaw on the back-end server side and the subsequent "work-around" that was built into the client to deal with said design flaw. Why in the world would you want to intentionally slow down the reporting of results?

Thanks...
ID: 192762
Ingleside
Volunteer developer
Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 192890 - Posted: 24 Nov 2005, 2:03:10 UTC - in response to Message 192762.  
Last modified: 24 Nov 2005, 2:09:02 UTC

Well, AFAIK the original design was to fill the cache up to a high water-mark and not refill it until it dropped below a low water-mark. Both the high and low water-marks were configurable by the user. In this design there could be many saved-up results before the client finally asked for more work, so grouping everything together in one RPC made sense.
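
To make the high/low water-mark idea concrete, here is a small Python sketch of that kind of refill rule. It only illustrates the scheme as described in this post and is not the old client's code; the function name and the choice to measure the cache in hours of work are assumptions.

```python
def work_to_request(queued_hours: float, low_mark: float, high_mark: float) -> float:
    """Hypothetical refill rule: top up to the high mark only once the
    queue has dropped below the low mark; otherwise request nothing."""
    if low_mark > high_mark:
        raise ValueError("low water-mark must not exceed high water-mark")
    if queued_hours < low_mark:
        return high_mark - queued_hours
    return 0.0


# Example marks of 24h (low) and 72h (high), chosen only for illustration.
print(work_to_request(30.0, 24.0, 72.0))  # still above the low mark
print(work_to_request(10.0, 24.0, 72.0))  # below the low mark, so refill
```

While the queue sits between the two marks no scheduler contact is needed, which is why finished results could pile up and be reported together.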

Later (I'm not sure whether it was around v4.05 or so), this was changed to a single cache setting, but the client asked for 2x that amount of work, which again meant there could be multiple results reported each time it asked for more work.

In v4.20 the client stopped asking for 2x the work, which gives the current behaviour: you crunch result-1, start on result-2, and ask for more work during result-2. It doesn't work out exactly like this every time, but that is basically the current behaviour, except for users without a permanent connection.

With the introduction of the new CPU scheduler in v4.35, it became possible for days or weeks to pass between a result finishing and the client next asking for more work. Results will of course still be reported under the rule "if less than 24h from deadline", but because of the potentially very long wait it was decided to add a new rule: "report result if N days have passed since it finished, where N is the cache setting". With this new functionality a bug also snuck in, which meant v4.45 reported after each result instead of waiting as it should have done.

v5.2.x just removes the bug introduced in v4.4x, and you're now back to the v4.2x behaviour, normally with one RPC per "result" because the client is asking for more work.



Now, I haven't looked too closely into the workings of the scheduling server, but a quick look indicates each RPC costs 3 db-reads and 1 db-write: looking up user-info, host-info and team-info, and writing host-info. This is done once per connection.
On top of this, there is one db-read and one db-write for each result sent out or accepted back.

This means that if you add an extra RPC just to report a result, as v4.45 was doing because of the bug, there will be 8 db-reads and 4 db-writes per result.

If, on the other hand, it works as designed and waits to report until the next time it asks for work, you need 5 db-reads and 3 db-writes per result.


If you move the reporting to the upload server, you still have 4 reads and 2 writes for assigning a "result". The upload server will also need, at a minimum, 1 read and 1 write per result, and it must look up user-info, host-info and team-info to make sure it's not an impostor returning the result, meaning 3 more reads. Finally, since claimed credit relies on a benchmark that may have changed, updating host-info is also a good idea, meaning 1 more write.
So you're back at 8 db-reads and 4 db-writes per result... This is the worst-case database-load scenario in the current system...

Well, you could possibly get away without looking up team-info and choose not to update host-info, which would mean 7 db-reads and 3 db-writes.


In other words, waiting to report until the next time the client asks for more work gives a lower database load than you can get by making the reporting of a result part of the result-file upload.
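
The tallies above can be reproduced with a short Python sketch. The per-RPC and per-result costs are the figures quoted in this post (themselves based on "a quick look" at the scheduler); the class and function names are made up for the example and do not correspond to anything in the BOINC server code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbOps:
    reads: int
    writes: int

    def __add__(self, other: "DbOps") -> "DbOps":
        return DbOps(self.reads + other.reads, self.writes + other.writes)


# Per-connection cost: look up user-, host- and team-info, write host-info.
PER_RPC = DbOps(reads=3, writes=1)
# Cost for each result sent out or accepted back.
PER_RESULT = DbOps(reads=1, writes=1)


def separate_report_rpc() -> DbOps:
    # One RPC to fetch the result, a second RPC just to report it.
    return PER_RPC + PER_RESULT + PER_RPC + PER_RESULT


def report_with_work_request() -> DbOps:
    # A single RPC reports the old result and fetches the next one.
    return PER_RPC + PER_RESULT + PER_RESULT


def report_on_upload(check_team: bool = True, update_host: bool = True) -> DbOps:
    # The scheduler still assigns the result; the upload server then records
    # it and repeats the identity lookups (optionally skipping team-info and
    # the host-info update).
    scheduler = PER_RPC + PER_RESULT
    identity = DbOps(reads=3 if check_team else 2, writes=1 if update_host else 0)
    return scheduler + PER_RESULT + identity


for label, ops in [
    ("extra RPC just to report", separate_report_rpc()),
    ("report with next work request", report_with_work_request()),
    ("report on upload", report_on_upload()),
    ("report on upload, trimmed", report_on_upload(False, False)),
]:
    print(f"{label}: {ops.reads} reads, {ops.writes} writes")
```

Even the trimmed upload-side variant stays above the cost of reporting with the next work request, which is the point of the comparison.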



Moving the reporting into the file upload also has other weaknesses, such as:
1; It makes it more difficult to run multiple upload servers, since each needs database access. For example, CPDN runs multiple upload servers in the UK and Switzerland.
2a; The upload server can either "ok" everything and update the db later, but this will break the re-issue of "lost" results that projects can choose to use.
2b; Or it must update the database while the client waits for the "ok".
3; If there is, for example, a download error, you must either duplicate the reporting code in both the scheduling server and the upload handler, or add unnecessary load to the upload server just to report the error.
4; Opening a db connection before a result is fully uploaded can greatly increase the load on the db server, so if the upload server starts dropping connections you'll basically kill the db. Therefore, you need to finish a result upload before opening the db connection.
5; If you use 2b, there is a 25% chance in, for example, SETI@Home that an uploaded result is the 4th result. If for any reason the client uploading this last result did NOT get the "ok", it means the result file is deleted from disk, and because of #4 you need to re-upload the full result file just to get the "ok" from the server.


#5 isn't a problem in SETI@Home since the results are so small, but you can expect some angry dialup users if they're forced to redo an upload of maybe 30 minutes to 1 hour just to get an "ok"...



So, to sum it up, a re-design of BOINC to let the reporting be part of result-upload will give higher db-load, higher upload-server-load, and clients taking longer on uploads, meaning the upload server can handle fewer results/day.
Also, you will either break the re-issue of "ghost-results", get some angry dialup users, or program a "db-killer" as in #4...

The only gain is less load on the scheduling server, since the upload server handles the reporting instead and takes the higher load.

As for a project, reporting immediately after upload will often not change anything, since the other results for the same wu can be stuck in a 10-day cache, meaning that reporting now or a couple of hours later will not validate the wu any faster...
ID: 192890
Brian Silvers
Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 192911 - Posted: 24 Nov 2005, 2:16:56 UTC - in response to Message 192890.  


So, to sum it up


Just letting you know I've seen this and will read it and think about it tomorrow. My back is bothering me too much to sit here in this chair right now...

Brian

ID: 192911
ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 21681
Credit: 7,508,002
RAC: 20
United Kingdom
Message 192925 - Posted: 24 Nov 2005, 2:28:13 UTC - in response to Message 192890.  

Can you explain to me, ...


... So, to sum it up, a re-design of BOINC to let the reporting be part of result-upload will give higher db-load, higher upload-server-load, clients using longer on uploads, ...

Thanks for the descriptions and design history.

I hope PB sees this to add into the docs. Even I followed that!

Regards,
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 192925
W-K 666 Project Donor
Volunteer tester
Joined: 18 May 99
Posts: 19587
Credit: 40,757,560
RAC: 67
United Kingdom
Message 193007 - Posted: 24 Nov 2005, 4:05:55 UTC

Ingleside
Thank you for that explanation. And yes, it should go into PB's Wiki.
ID: 193007


 
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.