Need some guidance and advice on optimizing my optimized app. :-)

Profile Fawkesguy
Volunteer tester
Joined: 8 Jan 01
Posts: 108
Credit: 188,578,766
RAC: 0
United States
Message 1594178 - Posted: 30 Oct 2014, 2:39:39 UTC
Last modified: 30 Oct 2014, 2:40:26 UTC

Hello everyone,

First post. :-)

I've been running SETI@home on and off for a fairly long time, and I've recently built a machine that crunches around 20 hours a day. I've run the Lunatics 0.43 64-bit installer, and I need some guidance on how to ensure my system is truly optimized. I've read the readmes, but it's all a bit fuzzy to me. Anyway, I'm going to dump a bunch of info here and I'm hoping you'll be able to offer some suggestions. Thank you in advance. :-)

The machine: http://setiathome.berkeley.edu/show_host_detail.php?hostid=7370555

In case that link doesn't work, it's Windows 7 64-bit, 16 GB RAM, two GTX 780s, and an Intel i7-4930K (6 cores).

The only thing I've done is add these lines to ap_cmdline_win_x86_SSE2_OpenCL_NV.txt, based on a bit of reading and very little understanding. ;-)

-unroll 12 -ffa_block 12288 -ffa_block_fetch 6144
-cpu_lock

I've also edited app_info.xml so every instance of <count>1</count> now reads <count>0.5</count>. That seems to give me around 80% utilization on both cards.
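
For anyone who would rather script that change than edit by hand, here is a minimal sketch of the same edit. The XML fragment is a hypothetical, heavily trimmed stand-in (the element layout and names such as astropulse_v7 are assumptions for illustration, not copied from the Lunatics installer), and the helper name set_gpu_count is made up for this example.

```python
import xml.etree.ElementTree as ET


def set_gpu_count(app_info_xml: str, new_count: str = "0.5") -> str:
    """Return the XML text with every <count> element set to new_count."""
    root = ET.fromstring(app_info_xml)
    for count in root.iter("count"):
        count.text = new_count
    return ET.tostring(root, encoding="unicode")


# Hypothetical fragment for illustration; a real app_info.xml is much longer.
sample = """<app_info>
  <app_version>
    <app_name>astropulse_v7</app_name>
    <coproc>
      <type>NVIDIA</type>
      <count>1</count>
    </coproc>
  </app_version>
</app_info>"""

updated = set_gpu_count(sample)
print(updated)
```

A count of 0.5 asks BOINC to run two tasks per GPU. ElementTree drops XML comments when it parses, so keep a backup of the original file before writing anything back.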

That's it. Not sure what else to change.

Below is some info from the BOINC log. Not sure if that'll be useful.

Again, thanks for your help.



10/29/2014 8:10:54 PM | | cc_config.xml not found - using defaults
10/29/2014 8:10:54 PM | | Starting BOINC client version 7.2.42 for windows_x86_64
10/29/2014 8:10:54 PM | | log flags: file_xfer, sched_ops, task
10/29/2014 8:10:54 PM | | Libraries: libcurl/7.25.0 OpenSSL/1.0.1 zlib/1.2.6
10/29/2014 8:10:54 PM | | Data directory: C:\BOINC\DATA
10/29/2014 8:10:54 PM | | Running under account
10/29/2014 8:10:54 PM | | CUDA: NVIDIA GPU 0: GeForce GTX 780 (driver version 344.48, CUDA version 6.5, compute capability 3.5, 3072MB, 2937MB available, 4576 GFLOPS peak)
10/29/2014 8:10:54 PM | | CUDA: NVIDIA GPU 1: GeForce GTX 780 (driver version 344.48, CUDA version 6.5, compute capability 3.5, 3072MB, 2806MB available, 4878 GFLOPS peak)
10/29/2014 8:10:54 PM | | OpenCL: NVIDIA GPU 0: GeForce GTX 780 (driver version 344.48, device version OpenCL 1.1 CUDA, 3072MB, 2937MB available, 4576 GFLOPS peak)
10/29/2014 8:10:54 PM | | OpenCL: NVIDIA GPU 1: GeForce GTX 780 (driver version 344.48, device version OpenCL 1.1 CUDA, 3072MB, 2806MB available, 4878 GFLOPS peak)
10/29/2014 8:10:54 PM | SETI@home | Found app_info.xml; using anonymous platform
10/29/2014 8:10:54 PM | | Host name: seti
10/29/2014 8:10:54 PM | | Processor: 12 GenuineIntel Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz [Family 6 Model 62 Stepping 4]
10/29/2014 8:10:54 PM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes syscall nx lm vmx tm2 dca pbe
10/29/2014 8:10:54 PM | | OS: Microsoft Windows 7: Professional x64 Edition, Service Pack 1, (06.01.7601.00)
10/29/2014 8:10:54 PM | | Memory: 15.94 GB physical, 16.13 GB virtual
10/29/2014 8:10:54 PM | | Disk: 465.27 GB total, 438.23 GB free
10/29/2014 8:10:54 PM | | Local time is UTC -4 hours
10/29/2014 8:10:54 PM | | VirtualBox version: 4.3.16
10/29/2014 8:10:54 PM | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 7370555; resource share 100
10/29/2014 8:10:54 PM | SETI@home | General prefs: from SETI@home (last modified 29-Oct-2014 18:49:43)
10/29/2014 8:10:54 PM | SETI@home | Host location: none
10/29/2014 8:10:54 PM | SETI@home | General prefs: using your defaults
10/29/2014 8:10:54 PM | | Preferences:
10/29/2014 8:10:54 PM | | max memory usage when active: 8161.92MB
10/29/2014 8:10:54 PM | | max memory usage when idle: 14691.45MB
10/29/2014 8:10:54 PM | | max disk usage: 10.00GB
10/29/2014 8:10:54 PM | | don't compute while active
10/29/2014 8:10:54 PM | | don't use GPU while active
10/29/2014 8:10:54 PM | | (to change preferences, visit a project web site or select Preferences in the Manager)
10/29/2014 8:10:54 PM | | Not using a proxy
10/29/2014 8:10:55 PM | | Suspending computation - computer is in use
10/29/2014 8:32:55 PM | SETI@home | Sending scheduler request: To fetch work.
10/29/2014 8:32:55 PM | SETI@home | Requesting new tasks for CPU and NVIDIA
10/29/2014 8:32:58 PM | SETI@home | Scheduler request completed: got 0 new tasks
10/29/2014 8:32:58 PM | SETI@home | No tasks sent
10/29/2014 8:32:58 PM | SETI@home | No tasks are available for SETI@home Enhanced
10/29/2014 8:32:58 PM | SETI@home | No tasks are available for SETI@home v7
10/29/2014 8:32:58 PM | SETI@home | No tasks are available for AstroPulse v6
10/29/2014 8:32:58 PM | SETI@home | No tasks are available for AstroPulse v7
10/29/2014 8:32:58 PM | SETI@home | Tasks for AMD/ATI GPU are available, but your preferences are set to not accept them
10/29/2014 8:32:58 PM | SETI@home | Tasks for Intel GPU are available, but your preferences are set to not accept them
10/29/2014 8:32:58 PM | SETI@home | This computer has reached a limit on tasks in progress
ID: 1594178 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1594199 - Posted: 30 Oct 2014, 4:21:09 UTC - in response to Message 1594178.  

You might be interested in this thread. There seem to be some issues with the NVIDIA app for APs. Mike is recommending reducing the -ffa_block and -ffa_block_fetch values.

http://setiathome.berkeley.edu/forum_thread.php?id=75863&postid=1592389

They are working on a fix for this, so hopefully we will see it in a few weeks. For now, it's best to lower the values for the -ffa switches. Keep your -unroll as you have it.

Hope this helps.


Zalster
ID: 1594199 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1594242 - Posted: 30 Oct 2014, 8:14:06 UTC - in response to Message 1594178.  

It's entirely up to you how you configure the machine, but you might want to look at your BOINC preferences.

10/29/2014 8:10:55 PM | | Suspending computation - computer is in use

Suspending crunching every time you touch the mouse or keyboard doesn't sound fully optimal for a 20-hour-per-day build.
ID: 1594242 · Report as offensive
Profile Fawkesguy
Volunteer tester
Joined: 8 Jan 01
Posts: 108
Credit: 188,578,766
RAC: 0
United States
Message 1594366 - Posted: 30 Oct 2014, 13:15:52 UTC - in response to Message 1594242.  

Hi Richard,

I understand that it's up to me - I just don't know what I'm doing. That's why I'm looking for advice/tips/etc. to ensure I'm cranking out as much work as possible.

The machine is untouched for about 20 hours per day.
ID: 1594366 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1594388 - Posted: 30 Oct 2014, 14:33:22 UTC

I read through the whole readme about optimizers yesterday, but I'm still too clueless to actually try anything. And I'm still not even sure how to do anything. I probably missed it, but where would I put whatever switches I decide to use?

I seriously want to do something to speed up my GPU-only cruncher. It seems (but my analysis could be wrong) like AP7 is taking considerably longer to run but only getting slightly more credit compared to MB7. I had AP6 disallowed on this box because it used more CPU than I wanted to assist the GPU. I allowed AP7 because I was assured it would use less of the CPU.

(Right now, it has nothing in progress after taking a big gulp of Einstein.)
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1594388 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34255
Credit: 79,922,639
RAC: 80
Germany
Message 1594400 - Posted: 30 Oct 2014, 14:54:14 UTC - in response to Message 1594388.  

Check for a file called ap_cmdline_win_x86_SSE2_OpenCL_NV.txt.
It's in your project folder.

I wrote that in the readme.
These switches can also be placed into ap_cmdline_win_x86_SSE2_OpenCL_NV.txt.


For the 440 you can use
-use_sleep -unroll 6 -oclFFT_plan 256 16 256 -ffa_block 4096 -ffa_block_fetch 2048
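
Before pasting a line like this into the cmdline file, it can help to check how it splits into switches and values. Below is a small sketch, not part of the app or the readme, that parses the line above. The divisibility check at the end is an assumption based only on the pattern in this thread, where every quoted -ffa_block_fetch is exactly half of -ffa_block; the function name parse_switches is made up for the example.

```python
def parse_switches(line: str) -> list[tuple[str, list[str]]]:
    """Split a switch line into (flag, [values]) pairs, keeping order and repeats."""
    parsed: list[tuple[str, list[str]]] = []
    for token in line.split():
        # A token that starts with "-" followed by a letter opens a new switch.
        if token.startswith("-") and token[1:2].isalpha():
            parsed.append((token, []))
        elif parsed:
            parsed[-1][1].append(token)
        else:
            raise ValueError(f"value {token!r} appears before any switch")
    return parsed


line_440 = "-use_sleep -unroll 6 -oclFFT_plan 256 16 256 -ffa_block 4096 -ffa_block_fetch 2048"

# Converting to a dict is fine here because no switch is repeated in this line.
switches = dict(parse_switches(line_440))
block = int(switches["-ffa_block"][0])
fetch = int(switches["-ffa_block_fetch"][0])

print(switches)
print("fetch divides block:", block % fetch == 0)
```

For lines that repeat a switch (for example two -tune entries), use the list returned by parse_switches directly, since a dict would keep only the last occurrence.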


With each crime and every kindness we birth our future.
ID: 1594400 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1594405 - Posted: 30 Oct 2014, 14:57:23 UTC - in response to Message 1594400.  

Okay, but how about for the 630?
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1594405 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34255
Credit: 79,922,639
RAC: 80
Germany
Message 1594419 - Posted: 30 Oct 2014, 15:10:50 UTC - in response to Message 1594405.  

-use_sleep -unroll 4 -oclFFT_plan 256 16 256 -ffa_block 2048 -ffa_block_fetch 1024


With each crime and every kindness we birth our future.
ID: 1594419 · Report as offensive
Profile Fawkesguy
Volunteer tester
Joined: 8 Jan 01
Posts: 108
Credit: 188,578,766
RAC: 0
United States
Message 1594426 - Posted: 30 Oct 2014, 15:28:05 UTC

Hi Mike,

I added:
-use_sleep -unroll 16 -oclFFT_plan 256 16 256 -ffa_block 8192 -ffa_block_fetch 4096 -tune 1 64 4 1 -tune 2 64 4 1

to my ap_cmdline_win_x86_SSE2_OpenCL_NV.txt, based on your post that was directed at another user with two 780s: http://setiathome.berkeley.edu/forum_thread.php?id=75863&postid=1592389

Hopefully that is correct.

Another question - I've seen it mentioned in several places that one should free up a core to be used by the GPU. Does the "-cpu_lock" line take care of that, or is there something else I need to set?
ID: 1594426 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34255
Credit: 79,922,639
RAC: 80
Germany
Message 1594428 - Posted: 30 Oct 2014, 15:46:31 UTC - in response to Message 1594426.  

Depends on which CPU is used.

A free CPU core always helps feed the GPU,
especially if more than one GPU is running.
On Intel CPUs, -cpu_lock also helps.


With each crime and every kindness we birth our future.
ID: 1594428 · Report as offensive
Profile Fawkesguy
Volunteer tester
Joined: 8 Jan 01
Posts: 108
Credit: 188,578,766
RAC: 0
United States
Message 1594436 - Posted: 30 Oct 2014, 16:04:58 UTC - in response to Message 1594428.  

Well, my CPU is an Intel 4930K. Six cores. Do I need to set anything else to ensure my two 780s are being fed properly?
ID: 1594436 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34255
Credit: 79,922,639
RAC: 80
Germany
Message 1594591 - Posted: 30 Oct 2014, 21:05:30 UTC - in response to Message 1594436.  

No, those params should work fine on your system.


With each crime and every kindness we birth our future.
ID: 1594591 · Report as offensive
Profile Fawkesguy
Volunteer tester
Joined: 8 Jan 01
Posts: 108
Credit: 188,578,766
RAC: 0
United States
Message 1594636 - Posted: 30 Oct 2014, 22:25:20 UTC - in response to Message 1594591.  

Mike, thanks for your help!
ID: 1594636 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1594647 - Posted: 30 Oct 2014, 22:40:41 UTC - in response to Message 1594419.  

Okay, I inserted the text into the respective files on each computer. I wasn't sure if I had to do anything else, so I exited and restarted BOINC on them. Anything else I need to do? (Besides wait until they decide to do APs on their GPUs...)
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1594647 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1594726 - Posted: 31 Oct 2014, 3:17:05 UTC - in response to Message 1594647.  

Mike,

Do you have recommendations for the command line of the 980? This is my current one

-use_sleep -unroll 18 -oclFFT_plan 256 16 256 -ffa_block 8192 -ffa_block_fetch 4096 -tune 1 64 4 1 -tune 2 64 4 1 -hp

Thanks in advance

Zalster
ID: 1594726 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34255
Credit: 79,922,639
RAC: 80
Germany
Message 1594757 - Posted: 31 Oct 2014, 4:44:28 UTC - in response to Message 1594726.  

The 980 can probably handle a bigger unroll, but I would wait until the new app is available.

Just let it run a few days.


With each crime and every kindness we birth our future.
ID: 1594757 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1595935 - Posted: 2 Nov 2014, 14:09:10 UTC

My first task reported since I added the switches came back with an error.

http://setiathome.berkeley.edu/result.php?resultid=3808622132
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1595935 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1595936 - Posted: 2 Nov 2014, 14:14:59 UTC - in response to Message 1595935.  
Last modified: 2 Nov 2014, 14:16:24 UTC

Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED
Computer ID: 5947619
Run time: 1 day 6 hours 34 min 4 sec
CPU time: 11 sec
Validate state: Invalid
Credit: 0.00
Device peak FLOPS: 171.22 GFLOPS
Application version: AstroPulse v7, anonymous platform (NVIDIA GPU)


Is your time estimate for completion maybe too low? What does it say? I think you have 10X the time estimate to complete; otherwise it errors out. If that is the case, you might need to change the estimate. Before I post how to do that, I want to see your answer to the above.
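
As a rough worked example of that reasoning: if the limit really is 10 times the estimate (a figure recalled in this post, not verified against BOINC itself), then the run time reported for the failed task puts an upper bound on what the estimate could have been. The sketch below only does that arithmetic.

```python
def to_seconds(days: int, hours: int, minutes: int, seconds: int) -> int:
    """Convert a days/hours/minutes/seconds duration to whole seconds."""
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


# Run time reported for the failed task: 1 day, 6 hours, 34 min, 4 sec.
run_time = to_seconds(1, 6, 34, 4)

# Assumption taken from the post above: limit = 10 x estimated run time.
assumed_factor = 10
max_estimate = run_time / assumed_factor

print(run_time)      # 110044
print(max_estimate)  # 11004.4
```

Under that assumption, the task would only have hit the limit at this point if its estimate was about 11,000 seconds (roughly 3 hours) or less.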



Zalster
ID: 1595936 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34255
Credit: 79,922,639
RAC: 80
Germany
Message 1595944 - Posted: 2 Nov 2014, 14:46:34 UTC - in response to Message 1595935.  
Last modified: 2 Nov 2014, 15:02:37 UTC

Something is completely wrong here.

Run time over 1 day.

Do you have one CPU core free?

Did you exclude the SETI data folder from your antivirus program?

It seems your GPU stalled.


With each crime and every kindness we birth our future.
ID: 1595944 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1596273 - Posted: 3 Nov 2014, 2:47:44 UTC

Okay, I may have had something else also making intensive use of the GPU at the same time, and between them they crashed both the driver and the other app. The next time I run that app, I'll suspend BOINC first.

After the second time, I stopped and restarted BOINC. It has not reported another AP7 since then. (It probably won't for a day or two, either; it's running Einsteins HP right now.)

However, my other host (the 630) has now finished two tasks with the new settings and it's a major improvement in the run time. I hope the 440 ends up showing the same improvement.
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1596273 · Report as offensive
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.