Is it possible to swap a guppi assigned to GPU with a Arecibo assigned to CPU?
Author | Message |
---|---|
Jeff Buck Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
... early in testing I found it was losing work when client_state was out of synch with sched_request and sched_reply but that may have been another cause then. Thank you. If the rescheduling routine is only run after the BOINC client has been shut down, isn't that 'reply' info already merged into the client_state.xml that's already on disk? I haven't had any problems with my own routine just manipulating the client_state.xml file. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14671 Credit: 200,643,578 RAC: 874 |
... early in testing I found it was losing work when client_state was out of synch with sched_request and sched_reply but that may have been another cause then. Thank you. Yes, exactly. You could conceivably get into a race condition if a reply was received from the server at the precise millisecond when you (or the rescheduler program) issued the shutdown command - I don't know how completely or robustly BOINC deals with that. |
Jeff Buck Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
Yes, exactly. You could conceivably get into a race condition if a reply was received from the server at the precise millisecond when you (or the rescheduler program) issued the shutdown command - I don't know how completely or robustly BOINC deals with that. I suppose in that situation, the downloads wouldn't complete anyway and might crap out on BOINC restart with one of those "Timed out - no response" errors. I wonder if BOINC reads the latest scheduler reply file on startup to ensure that the client_state.xml is in sync with it. Doesn't seem likely, but I suppose it's possible. |
Jeff Buck Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
I wonder if BOINC reads the latest scheduler reply file on startup to ensure that the client_state.xml is in sync with it.
I think I just answered my own question. While I was eating lunch, it occurred to me that I could use Process Monitor during a BOINC startup to see what files were actually being read by the BOINC client. So, here's a list of all the files that the client reads from the BOINC data directory ("C:\ProgramData\BOINC\" on my daily driver) from the time the client is launched (in my case, by BOINC Manager) and the time the active tasks are up and running. I've only listed the first ReadFile for each (although cc_config appears to be read twice).
12:55:12.2982954 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\cc_config.xml SUCCESS Offset: 0, Length: 4,096, Priority: Normal
12:55:12.5438900 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\daily_xfer_history.xml SUCCESS Offset: 0, Length: 4,096, Priority: Normal
12:55:12.7054181 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\account_setiathome.berkeley.edu.xml SUCCESS Offset: 0, Length: 2,625, Priority: Normal
12:55:12.7242364 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\statistics_setiathome.berkeley.edu.xml SUCCESS Offset: 0, Length: 4,096, Priority: Normal
12:55:13.0120202 PM boinc.exe 6060 ReadFile C:\ProgramData\BOINC\cc_config.xml SUCCESS Offset: 0, Length: 4,096, Priority: Normal
12:55:13.8964257 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\coproc_info.xml SUCCESS Offset: 0, Length: 2,666, Priority: Normal
12:55:13.9640905 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\projects\setiathome.berkeley.edu\app_info.xml SUCCESS Offset: 0, Length: 4,096, Priority: Normal
12:55:13.9875153 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\client_state.xml SUCCESS Offset: 0, Length: 4,096, Priority: Normal
12:55:14.0896113 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\projects\setiathome.berkeley.edu\app_config.xml SUCCESS Offset: 0, Length: 378, Priority: Normal
12:55:14.2975966 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\global_prefs_override.xml SUCCESS Offset: 0, Length: 1,480, Priority: Normal
12:55:14.3439112 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\global_prefs.xml SUCCESS Offset: 0, Length: 1,407, Priority: Normal
12:55:14.3635428 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\slots\2\boinc_task_state.xml SUCCESS Offset: 0, Length: 539, Priority: Normal
12:55:14.3641406 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\slots\1\boinc_task_state.xml SUCCESS Offset: 0, Length: 538, Priority: Normal
12:55:14.3646406 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\slots\0\boinc_task_state.xml SUCCESS Offset: 0, Length: 500, Priority: Normal
12:55:14.3677519 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\gui_rpc_auth.cfg SUCCESS Offset: 0, Length: 32, Priority: Normal
12:55:14.6029155 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\notices\feeds_setiathome.berkeley.edu.xml SUCCESS Offset: 0, Length: 274, Priority: Normal
12:55:14.6378302 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\notices\archive_setiathome.berkeley.edu_notices.php.xml SUCCESS Offset: 0, Length: 1,673, Priority: Normal
12:55:15.0231213 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\projects\setiathome.berkeley.edu\MB8_win_x86_SSE3_VS2008_r3330.exe SUCCESS Offset: 735,232, Length: 1,024, Priority: Normal
12:55:15.0655795 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\projects\setiathome.berkeley.edu\MB8_win_x86_SSE3_VS2008_r3330.exe SUCCESS Offset: 735,232, Length: 1,024, Priority: Normal
12:55:15.1006801 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\projects\setiathome.berkeley.edu\Lunatics_x41zi_win32_cuda50.exe SUCCESS Offset: 6,853,632, Length: 1,024, Priority: Normal
12:55:16.5113042 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\notices\setiathome.berkeley.edu_notices.php.xml SUCCESS Offset: 0, Length: 1,899, Priority: Normal
There's nothing indicating that any scheduler file is read, either before or after the client_state.xml is read. |
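For anyone who wants to repeat this check on their own capture, here is a minimal Python sketch that lists the distinct files read and flags any scheduler file. It assumes the log lines look exactly like the ones pasted above (not Process Monitor's native export format), and that scheduler files have names starting with `sched_request` or `sched_reply`; the function names `files_read` and `scheduler_files` are made up for illustration.

```python
import re

# Matches log lines in the layout pasted in the post above (an assumption,
# not Process Monitor's native export format).
LINE_RE = re.compile(
    r"^(?P<time>\d{1,2}:\d{2}:\d{2}\.\d+ [AP]M)\s+"
    r"(?P<process>\S+)\s+(?P<pid>\d+)\s+"
    r"(?P<operation>\S+)\s+"
    r"(?P<path>.+?)\s+SUCCESS\b"
)


def files_read(log_text):
    """Return the distinct paths that were read, in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and m.group("operation") == "ReadFile":
            path = m.group("path")
            if path not in seen:
                seen.append(path)
    return seen


def scheduler_files(paths):
    """Pick out paths whose file name starts with sched_request or sched_reply.

    The prefix test is an assumption about how the scheduler files are named.
    """
    hits = []
    for path in paths:
        name = path.rsplit("\\", 1)[-1].lower()
        if name.startswith(("sched_request", "sched_reply")):
            hits.append(path)
    return hits


# A few lines copied from the capture above.
SAMPLE = r"""
12:55:12.2982954 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\cc_config.xml SUCCESS Offset: 0, Length: 4,096, Priority: Normal
12:55:13.0120202 PM boinc.exe 6060 ReadFile C:\ProgramData\BOINC\cc_config.xml SUCCESS Offset: 0, Length: 4,096, Priority: Normal
12:55:13.9875153 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\client_state.xml SUCCESS Offset: 0, Length: 4,096, Priority: Normal
12:55:14.3439112 PM boinc.exe 8072 ReadFile C:\ProgramData\BOINC\global_prefs.xml SUCCESS Offset: 0, Length: 1,407, Priority: Normal
"""

if __name__ == "__main__":
    paths = files_read(SAMPLE)
    for p in paths:
        print(p)
    print("scheduler files read:", scheduler_files(paths))
```

An empty list from `scheduler_files` only says that no such file shows up in the capture; it says nothing about what the client does outside the captured window.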
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
I'm loving this exchange of neuron data! So much so that I tried a suggestion in a PM on reassigning tasks to the GPU or CPU by making minor changes in client_state.xml (while the Boinc Client has been shut down). It works, but I made a few mistakes, so I just let the cache nearly empty itself before aborting the last few and resetting the project in order to start fresh.
The trick I've come up with is to:
1. do 2 batches: 1 for sending from GPU to CPU, and the other from CPU to GPU
2. for each batch, "Suspend" the tasks to reassign in Boinc Manager (or BoincTasks in my case) before shutting it down.
3. using Find & Replace, either remove the line <plan_class>cuda50</plan_class> above the <suspended_via_gui/> (GPU to CPU), or replace <suspended_via_gui/> with <plan_class>cuda50</plan_class> (CPU to GPU)
The one abnormality I've come across is that a few reassigned-to-GPU tasks take much longer! I haven't looked into why yet (by doing smaller batch transfers with a much smaller cache); I just thought I'd share that now in case others have come across it. Cheers, Rob :-D |
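The Find & Replace in step 3 can be sketched in Python as below. This is only an illustration, not a tested rescheduler: it assumes each task sits in its own `<result>...</result>` block in client_state.xml, that the tasks to move were suspended first, and that the cuda50 plan class line is the only difference between a CPU and a GPU task in this setup. The function name `reassign_suspended` is hypothetical.

```python
import re

RESULT_RE = re.compile(r"<result>.*?</result>", re.DOTALL)
SUSPENDED = "<suspended_via_gui/>"
PLAN_CLASS = "<plan_class>cuda50</plan_class>"
# The plan_class line including its indentation and line ending.
PLAN_LINE_RE = re.compile(r"[ \t]*" + re.escape(PLAN_CLASS) + r"\r?\n?")


def reassign_suspended(state_text, to_gpu):
    """Edit only the <result> blocks that carry the suspended marker.

    to_gpu=True : replace the suspended marker with the cuda50 plan class.
    to_gpu=False: remove the cuda50 plan class line (marker stays in place).
    """

    def edit(match):
        block = match.group(0)
        if SUSPENDED not in block:
            return block  # not one of the tasks picked for this batch
        has_plan = PLAN_CLASS in block
        if to_gpu and not has_plan:
            return block.replace(SUSPENDED, PLAN_CLASS)
        if not to_gpu and has_plan:
            return PLAN_LINE_RE.sub("", block)
        return block  # already where it should be

    return RESULT_RE.sub(edit, state_text)


if __name__ == "__main__":
    sample = (
        "<result>\n"
        "    <name>task_a</name>\n"
        "    <plan_class>cuda50</plan_class>\n"
        "    <suspended_via_gui/>\n"
        "</result>\n"
    )
    print(reassign_suspended(sample, to_gpu=False))
```

Run it on a copy of the file, with the BOINC client shut down, one batch at a time, as in steps 1 and 2.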
Jeff Buck Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
The one abnormality I've come across is that a few reassigned-to-GPU tasks take much longer! It looks like those may be Arecibo VLARs that you moved from CPU to GPU. VLARs of any kind just don't do well on NVIDIA GPUs. I treat both GBT and Arecibo VLARs the same when it comes to rescheduling. |
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
@Mr Kevvy: is there an ETA for a beta test of your Windows script? ...cuz my Pavlov dog saliva is starting to run dry! ;-) |
Mr. Kevvy Joined: 15 May 99 Posts: 3797 Credit: 1,114,826,392 RAC: 3,319 |
@Mr Kevvy: Well, given you have the same little red and white icon on your profile that I do, you know what weekend this is. Yes, it's a weekend of rest, relaxation, good food preferably around a BBQ, and a celebration of what it is to be Canadian... ...for other people. For me, it's time to be put to work for a hellish three days of dust, dirt, sweat and moving heavy objects, most of which try to crush my fingers and toes. Hoping to have it ready sometime next week. :^p |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
To quote my favorite music station: It's the 1st of July Holiday weekend.... |
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
For me, it's time to be put to work for a hellish three days of dust, dirt, sweat and moving heavy objects, most of which try to crush my fingers and toes.
Sounds like you were helping one of the 70,000 households moving in Montreal on July 1st! I hope your fingers and toes are intact. While I wait impatiently ;-) for your script, is there something else I should be manually changing other than 1 or 2 lines within each <result>...</result> section of the file: C:\ProgramData\BOINC\client_state.xml ? |
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
On this rig 8010413, I am running Cuda50 with 2 tasks at a time on the GPU. I was using Notepad++ to do a Find&Replace on the tasks I had suspended. It's fairly easy when it only involves deleting or replacing 1 line.
On my other rig 7996377, I am running SoG (installed with Lunatics v0.45 beta3) with 1 task at a time running on the GPU. It wasn't easy with Notepad++ since the version number had to be changed from 800 to 812 (or vice-versa) in addition to the same line needing to be modified as described above for the Cuda50.
I then tried other text editors (that I had used in the past to prep host.gz to import into MS Access) and luckily, "Sublime Text" has a multiple line Find&Replace interface that doesn't even use a pop-up window (as Notepad++ does). In addition, it allows you to do a: Replace All!
All this to say: if you'd like to send tasks assigned to CPU or GPU to the other, I recommend using "Sublime Text".
For CPU to GPU, all you need to do is replace:
<version_num>800</version_num>
<suspended_via_gui/>
with
<version_num>812</version_num>
<plan_class>opencl_nvidia_SoG</plan_class>
...but don't forget to make a copy of client_state.xml before!
Cheers, Rob ;-) |
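The same two-line replacement, with the backup copy made first, might look like this in Python. It is a sketch under assumptions: the two lines are adjacent and separated only by whitespace (as the multi-line Find&Replace implies), the version numbers 800 and 812 are the ones on this particular rig, and the function name `cpu_to_gpu_sog` and the `.bak` suffix are invented for the example.

```python
import re
import shutil
from pathlib import Path

# Bytes patterns, so the file is rewritten without touching its encoding
# or line endings. The whitespace between the two lines is captured and
# reused in the replacement.
CPU_PAIR = re.compile(rb"<version_num>800</version_num>(\s*)<suspended_via_gui/>")
GPU_PAIR = rb"<version_num>812</version_num>\1<plan_class>opencl_nvidia_SoG</plan_class>"


def cpu_to_gpu_sog(state_path):
    """Back up client_state.xml, then move suspended CPU tasks to SoG.

    Returns the number of tasks changed. Meant to be run only while the
    BOINC client is shut down.
    """
    state_path = Path(state_path)
    backup = state_path.with_name(state_path.name + ".bak")
    shutil.copy2(state_path, backup)  # "don't forget to make a copy"

    data = state_path.read_bytes()
    new_data, count = CPU_PAIR.subn(GPU_PAIR, data)
    if count:
        state_path.write_bytes(new_data)
    return count


if __name__ == "__main__":
    # Hypothetical usage; point it at a copy of the file first.
    print(cpu_to_gpu_sog("client_state.xml"), "task(s) moved to GPU")
```

Going the other way (GPU to CPU) would swap the pattern and the replacement; that direction is not shown here.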
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . I don't have a file by that name in my setiathome directory. . . Should it be there? Stephen |
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
. . I don't have a file by that name in my setiathome directory. It's in: C:\ProgramData\BOINC |
Mr. Kevvy Joined: 15 May 99 Posts: 3797 Credit: 1,114,826,392 RAC: 3,319 |
Some updates: the work-in-progress detection is fixed now. The app. works fine on Linux and our Windows 64-bit box, but a couple of the testers didn't have good results on Windows, possibly 32-bit. One of them indicated that the file(s?) have a different format in Win32, which I had no idea of. I was in a conundrum until I remembered I have a spare machine and a small PCIe-powered card to go in it. So I will be imaging that with Win7 32-bit this weekend starting Friday evening, and then I can finally test it properly with stock and Lunatics. Let's hope I can finally get it out there this weekend. |
Jeff Buck Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
One of them indicated that the file(s?) have a different format in Win32 which I had no idea of. I have one 64-bit and four 32-bit machines on which I've been doing VLAR rescheduling for the last couple weeks, and I haven't noticed any format differences, at least not for the client_state.xml, which is the only file that I'm touching. |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
Some updates: the work-in-progress detection is fixed now. The app. works fine on Linux and our Windows 64-bit box, but a couple of the testers didn't have good results on Windows, possibly 32-bit. One of them indicated that the file(s?) have a different format in Win32, which I had no idea of. . . Your efforts are much appreciated and I look forward to trying it out. I have a little Windows 10 64-bit box (Core2 Duo) with a GT 730 card that is itching for it :) |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . I don't have a file by that name in my setiathome directory. . . OK, . . I have tried the batch file with no luck, it crashes badly on my Core2 Duo. I cannot tell exactly what is failing because it scrolls the window before I can read anything and then goes to a red screen saying I need to rerun it. . . One result message I have seen is a message saying it cannot find Boinc Tasks. Apparently it is not compatible with Boinc Manager at all. . . Also I think it would help to have a single instructions file outlining the process and then taking you through the steps from start to finish. But I cannot even begin to create one when it fails almost right away. :( . |
Jeff Buck Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
For those working on developing a VLAR rescheduler (and I know there are at least 2 of you), I want to mention a situation that I ran into on one of my own boxes last evening, which probably should be taken into consideration in a rescheduler.
I'm currently running my own little home-grown VLAR rescheduler on each of my crunch-only machines as a scheduled task at user logon. After the rescheduling is completed, the routine then launches BOINC Manager (instead of having BM launch itself at startup). Last evening, the rescheduler's log on my WinVISTA machine showed that 7 VLARs had been moved to the CPU. However, BOINC Manager was showing me that those 7 VLARs were still scheduled to run on the GPU. I didn't have time to look into it then, but this morning I reviewed the BOINC Event Log and found a curious line:
7/14/2016 9:00:57 PM | | Using state file client_state_next.xml
It seems that when the system shut down yesterday afternoon (a normal weekday occurrence), BOINC hadn't finished cleanly writing and renaming its assorted client_state files, probably because that machine experienced one of those "restarting tasks during shutdown" episodes that has been discussed in several threads here. No tasks actually failed, but apparently the OS ultimately terminated BOINC "with prejudice" while it was still busy with client_state activities.
So, it seems that during BOINC startup, if it finds a client_state_next.xml file, it uses it to the exclusion of any client_state.xml file that exists. I think that makes sense since, assuming the client_state_next file is complete (and was cleanly closed before the previous BOINC shutdown), it would contain the most up-to-date client_state data.
Ultimately, that probably means that any alterations made to a client_state.xml file (while BOINC is shut down, of course), whether for rescheduling or any other purpose, should only be performed when a client_state_next.xml file doesn't currently exist. Food for thought! :^) |
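A rescheduler could apply that rule with a small pre-flight check along these lines. This is a minimal Python sketch, not taken from anyone's actual rescheduler: the function name `safe_to_edit` and its return shape are invented, and it only looks at the two file names mentioned in this post. It does not verify that the BOINC client is really shut down.

```python
from pathlib import Path


def safe_to_edit(data_dir):
    """Decide whether client_state.xml in data_dir may be altered.

    Returns (ok, reason). Editing is refused when a client_state_next.xml
    is present, since on the next startup the client may use that file
    instead of the edited client_state.xml.
    """
    data_dir = Path(data_dir)
    state = data_dir / "client_state.xml"
    pending = data_dir / "client_state_next.xml"

    if pending.exists():
        return False, "client_state_next.xml exists; edits could be ignored"
    if not state.exists():
        return False, "client_state.xml not found"
    return True, "ok"


if __name__ == "__main__":
    # The data directory path is the one quoted earlier in this thread.
    ok, reason = safe_to_edit(r"C:\ProgramData\BOINC")
    print("safe to reschedule" if ok else "skipping: " + reason)
```

Skipping the run when the check fails seems the cautious choice; whether it would be safe to edit client_state_next.xml instead is not something this thread establishes.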
petri33 Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
146 points in under 4-6 minutes. I wouldn't swap. Just wait -- it is coming to you all NV people. see this for an example. And my inconclusives are falling at the same rate my valids are increasing. An error or two with (tens of) thousands a day is a ... Next 20 State: All (6472) · In progress (500) · Validation pending (3092) · Validation inconclusive (1023) · Valid (1854) · Invalid (0) · Error (3) Please, do not take this too seriously. The improvements are coming. I'm going -- to sleep. To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13831 Credit: 208,696,464 RAC: 304 |
146 points in under 4-6 minutes. I wouldn't swap. Just wait -- it is coming to you all NV people. Soon? Please say "very soon." Grant Darwin NT |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.