Need some guidance and advice on optimizing my optimized app. :-)

David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1611480 - Posted: 10 Dec 2014, 1:40:20 UTC - in response to Message 1602260.  
Last modified: 10 Dec 2014, 2:11:23 UTC

My computer finally did the one GPU AP7 it managed to get last week. The outcome looks perfectly normal. (I'm the _3. The _0 had an error and the _2 aborted. Still waiting for the _1.)

Now I guess I'll put the switches back in the text file and wait for it to get another one.

-use_sleep -unroll 6 -oclFFT_plan 256 16 256 -ffa_block 4096 -ffa_block_fetch 2048

has been inserted into my file again. Now I just have to wait for an AP7 to float my GPU's way and see what happens.

Well, now that we have APs again, I do indeed have the same problem I did before.

My GT440 was assigned 6 new AP7s. The 2nd through 6th show estimated times of just under 3 hours, but the first one has been running for over 8 hours now and is showing 93.4% complete.

Also, I was briefly using the computer today and at one point the display went black for a second. When it came back, a Windows tray balloon told me the display driver had crashed and restarted.

I'm letting this one finish, but I suspended the other 5.

Anyone have any suggestions to modify the command line switches?

[edit]
Here's the link to the task in question. Since I'll be going to bed soon, you guys will probably see its final report before I do.

It's currently showing 94.57% at 8:34 running time.
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1611480 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1611519 - Posted: 10 Dec 2014, 2:52:49 UTC - in response to Message 1611480.  
Last modified: 10 Dec 2014, 2:58:30 UTC

David,

Am I understanding you right? You are running 6 APs on a GT 440?

That has 1 GB of memory?

I don't even do that many on my GTX 980 with 4 GB.

My first guess is that you are running out of memory on your GPU.

http://setiathome.berkeley.edu/forum_thread.php?id=76244&postid=1611509

Even though SIV or whatever you use tells you that it's only using 125 MB of memory, it's best to assume they use more.

I would assume that an AP could use anywhere between 250 and 500 MB and plan accordingly.

Some APs were using more than 1 GB of memory. That's what led to those Zombie APs.

With the changes, they were able to limit them under 1 GB but I would still play it safe and give yourself some buffer room.
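
The budgeting rule above can be sketched as a quick calculation. This is only an illustration: the function name and the `reserve_mb` headroom (128 MB here, for the display and driver) are assumptions, not figures from the app or its readme. The 250 and 500 MB per-task estimates and the 1 GB card are the numbers mentioned in this post.

```python
def max_concurrent_tasks(vram_mb, per_task_mb, reserve_mb=0):
    """Return how many tasks fit in GPU memory under a per-task estimate."""
    usable = vram_mb - reserve_mb  # reserve_mb is an assumed safety buffer
    if usable <= 0 or per_task_mb <= 0:
        return 0
    return int(usable // per_task_mb)


# A 1 GB card with the pessimistic 250-500 MB per-AP estimates.
for per_task in (250, 500):
    fits = max_concurrent_tasks(1024, per_task, reserve_mb=128)
    print(f"{per_task} MB per task -> {fits} task(s) at once")
```

Under the pessimistic 500 MB estimate with some headroom, a 1 GB card has room for only one AP at a time, which is the kind of margin being suggested here.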

Just my 2 cents..


Zalster
ID: 1611519 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1611527 - Posted: 10 Dec 2014, 3:14:36 UTC - in response to Message 1611519.  

No, I'm not running 6 at once.

I have the 440 set to run 2 at a time of anything (except Beta; I haven't changed any settings for it). That can be 2 MBs, 2 APs, 2 of any flavor of Einstein, or 1 each of 2 different things. Right now, it's running 1 AP and 1 MB.

I will also remind you that it runs them just fine without any of the command line switches that are supposed to make it run more efficiently.

Currently at 96.09% and 9:34.
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1611527 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1611960 - Posted: 10 Dec 2014, 23:20:43 UTC

Just in the last few minutes, this task finally ended and reported. It came up as a computation error.

Specifically, it's
Exit status  197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED
Run time     1 days 5 hours 32 min 8 sec
CPU time     2 sec


(In the meantime, the host finished every other Seti task it had, besides the 5 APs I suspended. Just now, I unsuspended those 5, had it ask for more work, then suspended them again.)
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1611960 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1611977 - Posted: 10 Dec 2014, 23:46:23 UTC - in response to Message 1611480.  
Last modified: 10 Dec 2014, 23:47:51 UTC

Also, I was briefly using the computer today and at one point the display went black for a second. When it came back, a Windows tray balloon told me the display driver had crashed and restarted.

I was going to say (but didn't get a chance to, it was 5:30am) that because of the driver restart the OpenCL context will be lost, and the app won't make any progress. Only an app restart (by suspending and resuming GPU usage, or by restarting BOINC) will get it going again. Recent BOINC versions will estimate progress if the app isn't reporting any, giving the illusion that progress is being made when it isn't.
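
One rough way to spot that situation from the task's own numbers is to compare run time with CPU time. The sketch below is a heuristic only: `looks_stalled` and both thresholds are made up for illustration, and an app running with -use_sleep will legitimately show low CPU time, so the cutoff may need adjusting. The sample figures are the run time and CPU time of the failed task reported earlier in this thread.

```python
def looks_stalled(elapsed_s, cpu_s, min_elapsed_s=3600, min_cpu_ratio=0.001):
    """Heuristic: long wall-clock time with almost no CPU time suggests a stall."""
    if elapsed_s < min_elapsed_s:
        return False  # too early to judge (assumed grace period)
    return (cpu_s / elapsed_s) < min_cpu_ratio  # assumed threshold


# Failed task from this thread: 1 day 5 h 32 min 8 s run time, 2 s CPU time.
elapsed = ((1 * 24 + 5) * 60 + 32) * 60 + 8
print(elapsed, looks_stalled(elapsed, cpu_s=2))
```

If the check flags a task, suspending and resuming GPU usage (as described above) is the thing to try; the percentage shown in the manager is not a reliable sign on its own.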

Claggy
ID: 1611977 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1611990 - Posted: 11 Dec 2014, 0:28:21 UTC - in response to Message 1611977.  

Okay, that makes sense. But why is it crashing???

Another question: one of the other 5 started running before I suspended it. If I remove the switches now and then let it run, will it run normally or will it still be affected by whatever the problem is?
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1611990 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1612442 - Posted: 11 Dec 2014, 22:46:58 UTC

Okay, since no one is offering any suggestions for different switches, I just deleted the whole line. The text file is now blank.

Question: do I need to restart Boinc for it to read that file again, or will it read it when I unsuspend the AP tasks?
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1612442 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1612455 - Posted: 11 Dec 2014, 23:00:58 UTC - in response to Message 1612442.  

Question: do I need to restart Boinc for it to read that file again, or will it read it when I unsuspend the AP tasks?

It'll be read the next time the app starts, either by suspending/resuming GPU usage, or by waiting for the next WU to start.

Claggy
ID: 1612455 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1612469 - Posted: 11 Dec 2014, 23:13:07 UTC

What would happen if I put in -use_sleep and nothing else?
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1612469 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1612477 - Posted: 11 Dec 2014, 23:16:41 UTC - in response to Message 1612469.  

What would happen if I put in -use_sleep and nothing else?

You'd get reduced CPU usage but extended runtime. Increasing just the -unroll value can bring the runtime back down (at the expense of using more memory).

Claggy
ID: 1612477 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1612488 - Posted: 11 Dec 2014, 23:33:24 UTC - in response to Message 1612477.  
Last modified: 11 Dec 2014, 23:36:03 UTC

At the moment (not sure if it read the file or not) Task Manager is showing 48% of system memory being used. It's running 2 AP7s on the GPU.

[edit]
And 7 MBs on the CPU.

SIV shows all 8 cores running at max.
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1612488 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 34814
Credit: 261,360,520
RAC: 489
Australia
Message 1612494 - Posted: 11 Dec 2014, 23:43:58 UTC - in response to Message 1612488.  

Try reducing the cores being used to 6, David, and see if that helps. ;-)

Cheers.
ID: 1612494 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1612974 - Posted: 12 Dec 2014, 21:11:48 UTC

My other 5 APs, plus one more that was assigned, have all finished normally with all the command line switches removed. I guess I'll stick with that.
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1612974 · Report as offensive
Profile Fawkesguy
Volunteer tester
Joined: 8 Jan 01
Posts: 108
Credit: 188,578,766
RAC: 0
United States
Message 1613939 - Posted: 14 Dec 2014, 19:15:24 UTC

I've decided to try running GPU only for a while. What do I need to change (if anything) in my config files? I want to make sure I'm not holding my machines back with my current settings. This is what I have in app_config:

<app_config>
  <app>
    <name>setiathome_v7</name>
    <gpu_versions>
      <gpu_usage>0.33</gpu_usage>
      <cpu_usage>0.1</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>astropulse_v6</name>
    <gpu_versions>
      <gpu_usage>0.33</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>astropulse_v7</name>
    <gpu_versions>
      <gpu_usage>0.33</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
ID: 1613939 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1613945 - Posted: 14 Dec 2014, 19:26:03 UTC - in response to Message 1613939.  
Last modified: 14 Dec 2014, 19:26:27 UTC

I've decided to try running GPU only for a while. What do I need to change (if anything) in my config files? I want to make sure I'm not holding my machines back with my current settings.

Just go to your project preferences and set 'Use CPU' to 'No'; BOINC will no longer ask for CPU work.
You have four locations/venues available, so you can tailor settings for different groups of hosts.

Claggy
ID: 1613945 · Report as offensive
Profile Fawkesguy
Volunteer tester
Joined: 8 Jan 01
Posts: 108
Credit: 188,578,766
RAC: 0
United States
Message 1614055 - Posted: 14 Dec 2014, 23:50:46 UTC - in response to Message 1613945.  
Last modified: 14 Dec 2014, 23:51:34 UTC

OK, so the "cpu_usage" lines in app_config will be ignored?
ID: 1614055 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1614057 - Posted: 15 Dec 2014, 0:16:50 UTC - in response to Message 1614055.  

Leave your app_config file the way it is.

You always need some CPU support when running the GPUs.

Do as Claggy suggests; that way you are only running apps on the GPUs, with a very small portion of the CPU being used to support them.
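
To see what the posted app_config values work out to, here is a small sketch. It assumes the usual reading of those fields: gpu_usage is the fraction of one GPU a task claims, and cpu_usage is the fraction of a core BOINC budgets for each GPU task (a scheduling figure, not a throttle). The function names are made up for illustration.

```python
import math


def tasks_per_gpu(gpu_usage):
    """Number of tasks that fit on one GPU when each claims `gpu_usage` of it."""
    if not 0 < gpu_usage <= 1:
        raise ValueError("gpu_usage is expected to be in (0, 1]")
    # Small epsilon guards against floating-point results like 2.9999999.
    return math.floor(1.0 / gpu_usage + 1e-9)


def cpu_reserved(gpu_usage, cpu_usage):
    """CPU cores budgeted to support the tasks running on one GPU."""
    return tasks_per_gpu(gpu_usage) * cpu_usage


# Values from the app_config posted above.
for name, gpu, cpu in [("setiathome_v7", 0.33, 0.1),
                       ("astropulse_v6", 0.33, 1.0),
                       ("astropulse_v7", 0.33, 1.0)]:
    print(f"{name}: {tasks_per_gpu(gpu)} per GPU, "
          f"{cpu_reserved(gpu, cpu):.1f} core(s) budgeted")
```

With 0.33, three tasks share each GPU, and a cpu_usage of 1.0 sets aside one core for each AP task, which matters less once CPU work is switched off in the project preferences.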

Happy Crunching....

Zalster
ID: 1614057 · Report as offensive
Profile Fawkesguy
Volunteer tester
Joined: 8 Jan 01
Posts: 108
Credit: 188,578,766
RAC: 0
United States
Message 1614078 - Posted: 15 Dec 2014, 2:36:36 UTC - in response to Message 1614057.  

Great, thank you. I just wanted to make sure I wasn't making anything run less efficiently by leaving those lines in there.
ID: 1614078 · Report as offensive
JarrettH

Joined: 14 Nov 02
Posts: 97
Credit: 25,385,250
RAC: 95
Canada
Message 1617975 - Posted: 23 Dec 2014, 23:40:07 UTC
Last modified: 23 Dec 2014, 23:46:33 UTC

What readme file are those switches in? I can't find it on this machine.

-unroll 12 -ffa_block 12288 -ffa_block_fetch 6144

---^ I remember there were recommended settings based on GPU.
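
For comparing switch lines like the one above with the one posted earlier in the thread, a small parser can help. This is a hypothetical helper, not part of the app or the Lunatics installer; it assumes every value is an integer and that anything starting with "-" followed by a non-digit is a switch name.

```python
def parse_switches(line):
    """Map each -switch to the list of integer values that follow it."""
    switches = {}
    current = None
    for token in line.split():
        if token.startswith("-") and not token[1:].isdigit():
            current = token
            switches[current] = []  # flags such as -use_sleep keep an empty list
        elif current is not None:
            switches[current].append(int(token))
    return switches


earlier = parse_switches(
    "-use_sleep -unroll 6 -oclFFT_plan 256 16 256 "
    "-ffa_block 4096 -ffa_block_fetch 2048"
)
mine = parse_switches("-unroll 12 -ffa_block 12288 -ffa_block_fetch 6144")

# Print the switches both lines share, side by side.
for name in sorted(set(earlier) & set(mine)):
    print(name, earlier[name], mine[name])
```

In both lines the -ffa_block_fetch value happens to be half of -ffa_block; whether that is a recommendation is something to confirm in the readme rather than assume from two examples.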

Thanks

Edit: It's not on this machine because there is no GPU cruncher yet
ID: 1617975 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1618061 - Posted: 24 Dec 2014, 2:08:22 UTC - in response to Message 1617975.  

If you use the Lunatics installer, it will create a folder called docs in the setiathome.berkeley.edu folder. Look at the ReadMe_Astropulse_OpenCl_NV.txt file.

I'm assuming it's the computer with the GTX 550?



Good luck.


Zalster
ID: 1618061 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.