Spicing UP the RAC a little...

I3APR

Joined: 23 Apr 16
Posts: 99
Credit: 70,717,488
RAC: 0
Italy
Message 1798428 - Posted: 24 Jun 2016, 14:51:46 UTC

I thought it was time to play a little more with GPUs than CPUs, to spice up the RAC, so I borrowed some ideas from the bitcoin miners and fought with my limited wallet...:

MB : Asrock H81-BTC (pretty sure BTC stands for BiTCoin), which according to the mining community is a rather stable mobo when hosting multiple GPUs (up to 6)
CPU : Core i7-4790K quad-core 4 GHz, unlocked
RAM : CORSAIR - Dominator Platinum 16 GB ( 2x8 )
SSD&DVD : lame
PSU: Corsair AX1500i Titanium PSU 1500W
CASE : fresh air ;-)
RISER: 6x PCI-E 1x to 16x Riser USB 3.0
GPU : 4 x Nvidia 660ti + 1 x Nvidia 780ti + 1 x Nvidia TBD

Now, with the current rig (i7-3770 and Radeon 7850), I've been playing with app_config.xml, and it seems that the optimal configuration, leaving 10% of the CPU free in the computing preferences, is:

<app_config>
  <app>
    <name>astropulse_v7</name>
    <gpu_versions>
      <gpu_usage>.5</gpu_usage>
      <cpu_usage>.67</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>setiathome_v8</name>
    <max_concurrent>16</max_concurrent>
    <gpu_versions>
      <gpu_usage>.33</gpu_usage>
      <cpu_usage>.2</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

Now, from all my reading, if I'm not wrong, this would let the GPU use the computational power of 3 threads (1 physical core and 1 thread of a virtual core), but I'm not sure. I admit I copied and pasted that; I don't really understand why "0.2", or how to relate it to the number of GPU instances.
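
My best guess at the arithmetic, assuming I've understood the BOINC docs correctly (please correct me if not): <gpu_usage>.33</gpu_usage> means each task claims a third of a GPU, so BOINC runs up to 3 tasks per GPU, and <cpu_usage>.2</cpu_usage> means each of those GPU tasks reserves 0.2 of a CPU in the scheduler's budget. So one GPU running 3 tasks reserves 3 x 0.2 = 0.6 CPUs, and a 6-GPU rig would reserve 6 x 3 x 0.2 = 3.6 CPUs just for GPU support.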

So... when running 5 or even 6 GPUs (3 instances per GPU) with an 8-thread CPU (4 cores + HT), how would you set the cpu_usage AND the "use at most %" in the computing preferences in BOINC?

Please, help me get out of the fog...

Andy
ID: 1798428
Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1798433 - Posted: 24 Jun 2016, 15:18:42 UTC - in response to Message 1798428.  

I see a problem off the bat..

Not enough CPU cores.

Mind you, before we had the new apps you could get away with a smaller percentage for each work unit, but not anymore.

The 0.2 CPU is how much of a core you are asking the computer to use for each work unit.

Unfortunately, that doesn't mean that's how much it will actually use.

The new blc (Breakthrough Listen) work units tend to require a higher share of a core (up to and including a full core for each work unit).

That creates a bottleneck: there aren't enough CPU cores to feed all those work units.
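
To put rough numbers on it (assuming the 6-GPU plan with 3 work units per GPU): that's 18 GPU tasks, and if the blc units each want something close to a full core, you would need on the order of 18 cores just to feed the GPUs, while the 4790K only has 8 threads (4 real cores).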
ID: 1798433
I3APR

Joined: 23 Apr 16
Posts: 99
Credit: 70,717,488
RAC: 0
Italy
Message 1798632 - Posted: 25 Jun 2016, 13:59:47 UTC - in response to Message 1798433.  

I see a problem off the bat..
Not enough CPU cores.

Well... that was a helluva anticlimax, if you ask me... ;-)

So, the best scenario would be having a CPU core per WU, almost no CPU number crunching, all GPU work.

The app_config.xml would then be ( optimistically )

<gpu_usage>.5</gpu_usage>
<cpu_usage>.2</cpu_usage>

Given that "The 0.2 CPU is what you want the computer to use of each core for a work unit... Unfortunately, it doesn't mean that is actually how much it will use", it seems to me we're in a "grey zone" where parameters are set by the config files but the app doesn't strictly follow them, and that makes me feel a little less stupid about not grasping the logic beneath...
Anyway, how much CPU should be left unused, both in cpu_usage AND in the "use at most %" computing preferences in BOINC? Because it seems to me it should be almost zero, if all the CPU power must be set aside to support the GPUs...
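
(If I run the numbers on that guess: with <gpu_usage>.5</gpu_usage>, six GPUs would run 2 tasks each, 12 GPU tasks in total, and at <cpu_usage>.2</cpu_usage> they would only reserve 12 x 0.2 = 2.4 CPUs, which doesn't sound like enough if each task can really want a whole core...)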

I'm frankly embarrassed to admit I'm at a loss here, and I thank everyone who can be clear about that...

A.
ID: 1798632
Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1798674 - Posted: 25 Jun 2016, 19:27:08 UTC - in response to Message 1798632.  

If this is to be a pure cruncher, then I would put the following

Use at most 100% of the CPUs

Use at most 100% of the CPU time

This will allow the computer to use all of the CPU for your GPUs and allow all of the time for your project.


I have a laptop that I use for other things as well, so there I only want BOINC to use 50% of the CPUs, leaving half for everything else, but I leave "use at most % of CPU time" set to 100%.

For my pure crunchers, it's set to 100 on both.
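
For example, assuming an 8-thread machine: "use at most 50% of the CPUs" means BOINC will only run 4 tasks at a time, while "use at most 100% of CPU time" means those tasks run continuously instead of being throttled on and off.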
ID: 1798674
EdwardPF
Volunteer tester

Joined: 26 Jul 99
Posts: 389
Credit: 236,772,605
RAC: 374
United States
Message 1798694 - Posted: 25 Jun 2016, 20:43:59 UTC - in response to Message 1798428.  

This reply MAY or MAY NOT be germane to the question ... but here goes ...

I am using an old Dell Studio XPS with a Gen-1 Core i7, Win 7, and an Nvidia GeForce GTX 770.

I have found that the CPUs (I'll call them CPU-0, CPU-2, CPU-4, and CPU-6) run faster than the Hyper-Threaded analogs (I'll call them HPU-1, HPU-3, HPU-5, and HPU-7). To be more correct (ha), I have found that the HPUs are worth about 20% of a CPU.

After experimenting with the various combinations, I have found that for this computer, running the CPU-only work units on the CPUs (0, 2, 4 & 6) and the GPU-only work units on the HPUs (1, 3 & 5) gives me the most RAC AND the most responsive computer. I leave HPU-7 for Windows to use for its scheduling, etc.

Each BOINC instance is dedicated to 1 (and only 1) CPU or HPU, as indicated.

My usage/results look something like this ... (if I can paste this link correctly here): https://dl.dropboxusercontent.com/u/42596478/boinc_x_7.jpg


I hope this is SOME?? help in answering your question

Ed F
ID: 1798694
Al (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1798704 - Posted: 25 Jun 2016, 21:36:24 UTC - in response to Message 1798694.  

Wow, learned something today, Ed. I had no idea that you could be that granular and assign WUs to either the HT cores or the 'real' cores. How did you go about assigning them like you did?

ID: 1798704
EdwardPF
Volunteer tester

Joined: 26 Jul 99
Posts: 389
Credit: 236,772,605
RAC: 374
United States
Message 1798732 - Posted: 25 Jun 2016, 22:56:11 UTC - in response to Message 1798704.  
Last modified: 25 Jun 2016, 22:56:42 UTC

Oh good grief! What have I done?? I'll have to remember the original setup ... This may take some time ... and not be exact ... but once the setup is done, the startup is thus:

rem affinity 08 = logical CPU 3
start /affinity 08 Z:\BOINC_test_programs\boinc.exe --gui_rpc_port 31417 --dir Z:\BOINC_test_data_2 --allow_multiple_clients --detach
rem affinity 20 = logical CPU 5
start /affinity 20 Z:\BOINC_test_programs\boinc.exe --gui_rpc_port 31418 --dir Z:\BOINC_test_data_3 --allow_multiple_clients --detach
rem affinity 01 = logical CPU 0
start /affinity 01 Z:\BOINC_test_programs\boinc.exe --gui_rpc_port 31419 --dir Z:\BOINC_test_data_4 --allow_multiple_clients --detach
rem affinity 04 = logical CPU 2
start /affinity 04 Z:\BOINC_test_programs\boinc.exe --gui_rpc_port 31420 --dir Z:\BOINC_test_data_5 --allow_multiple_clients --detach
rem affinity 10 = logical CPU 4
start /affinity 10 Z:\BOINC_test_programs\boinc.exe --gui_rpc_port 31421 --dir Z:\BOINC_test_data_6 --allow_multiple_clients --detach
rem affinity 40 = logical CPU 6
start /affinity 40 Z:\BOINC_test_programs\boinc.exe --gui_rpc_port 31422 --dir Y:\MC_at_Home --allow_multiple_clients --detach
rem affinity 02 = logical CPU 1
start /affinity 02 Z:\BOINC_test_programs\boinc.exe --gui_rpc_port 31423 --dir Z:\BOINC_test_data_8 --allow_multiple_clients --detach
rem start the BOINC Manager from the first data directory
cd /D Z:\BOINC_test_data_1
start Z:\BOINC_test_programs\boincmgr.exe /s
cd ..
rem
cd ..\users\me\boinc_starts
exit

(as edited for your own computer setup) ...
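
In case it helps: the number after /affinity is a hexadecimal bitmask of logical CPUs, where bit 0 is CPU 0, bit 1 is CPU 1, and so on. So 01 pins a client to CPU 0, 02 to CPU 1, 04 to CPU 2, 08 to CPU 3, 10 to CPU 4, 20 to CPU 5, 40 to CPU 6, and 80 would be CPU 7; you can also combine bits (e.g. 03 = CPUs 0 and 1) if you want one client to span more than one logical CPU.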

As I hope was evident from the pic I attached, CPUs 0, 2, 4, and 6 were 100% busy (running "CPU" WUs), while HPUs 1, 3, and 5 were not so busy, feeding the GPU.
HPU 7 was running ~100% Win-7 code.

If you NEED the setup ... I'll try to remember ... or perhaps other SETI folks who actually know what they are doing can give you an exact cookbook for it.

Ed F
ID: 1798732
Ninos Y

Joined: 26 Aug 99
Posts: 15
Credit: 55,831,116
RAC: 0
Canada
Message 1798759 - Posted: 26 Jun 2016, 2:49:04 UTC - in response to Message 1798732.  

I agree with Zalster, it's ALL about the CPU cores now. How things have come full circle, eh?

I'm sorry, but you need a real CPU core to drive your GPU instances, not a make-believe one.

As for the physical-core vs. HT-core issue, it's true that an HT core shares its cache with the physical core. But the later-model CPUs have "smarter" HT technology and less cache thrashing. It's just a way of running 2 threads on one core, so yes, I can see how the HT cores won't perform as well.

More physical cores = more WUs done = higher RAC, no doubt about it.
ID: 1798759
I3APR

Joined: 23 Apr 16
Posts: 99
Credit: 70,717,488
RAC: 0
Italy
Message 1798974 - Posted: 27 Jun 2016, 9:26:23 UTC

OK guys, the damage is done... guilty of not going deeper into the subject; the new CPU arrived a few days ago and I had already installed it with thermal paste and a big cooler, so I'm not returning it.
I could have gone with an older multi-core Xeon instead for the same price (about $400), but here I am, and I have to get the most out of the current configuration; I can't go back... :-(

To be honest, almost ALL the posts I read about how to configure app_config.xml were discussing "<cpu_usage>.2</cpu_usage>" or other elaborate fractions of a CPU, not whole ones, and not even pointing out the different behaviour between the virtual and the physical cores. But maybe I'm blind... or I was simply biased by the bitcoin mining world, where only GPUs seem to matter (and Radeon GPUs, not Nvidia, btw).

Anyhow...

The PC was intended to be 100% dedicated to SETI@home, and I was hoping to get between 20,000 and 50,000 additional credit/day, since I was planning to install a GTX 1080 side by side with a 780ti and four 660ti, but it seems that I will have to change plans a little, so...

@Zalster
Yes, the PC is 100% dedicated to seti, and ok : 100% CPU on both sliders.

@EdwardPF
I too wasn't aware it was possible to manually assign WUs selectively to physical or virtual cores.
At this point, I'm willing to run single-instance GPU tasks, so I believe I could assign the 780ti to the first physical core, the GTX 1080 to the second, two 660ti to the third and fourth, then use a couple of virtual cores for the two remaining 660ti and two virtual cores for CPU crunching. Did I get it right?
Care to explain a bit more in depth how to do it? I believe it would be useful not only for me but for other folks here as well...

@All
While I'm here: are there any alternative, optimized apps to use with the SETI@home project, ones that maybe use the CPU the way the config files were meant to be used, or should I stick to the stock one?

Thank you folks, I'm learning a lot here...

A.

P.S.
I know my English may seem a little buggy sometimes, but yeah, you've got it: I'm not a native English speaker, and it shows... :-D
ID: 1798974
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1798976 - Posted: 27 Jun 2016, 10:06:31 UTC - in response to Message 1798974.  

At this point, I'm willing to run single-instance GPU tasks, so I believe I could assign the 780ti to the first physical core, the GTX 1080 to the second, two 660ti to the third and fourth, then use a couple of virtual cores for the two remaining 660ti and two virtual cores for CPU crunching. Did I get it right?

If you run the CUDA applications in their default configuration then there's no need to reserve a core, no matter how many WUs you run on each GPU.

However, if you are after maximum output from the system, then you will want to make use of the OpenCL SoG application. For it to give the maximum possible output, it requires 1 CPU core per WU crunched. And with the current applications, maximum possible output requires crunching multiple WUs per GPU.
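
Something along these lines, for example (just a sketch, tune gpu_usage to however many WUs per GPU you settle on):

<app_config>
  <app>
    <name>setiathome_v8</name>
    <gpu_versions>
      <gpu_usage>.5</gpu_usage>
      <cpu_usage>1</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

With <gpu_usage>.5</gpu_usage> you get 2 WUs per GPU, and <cpu_usage>1</cpu_usage> tells the scheduler to budget a full core for each of them.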
Grant
Darwin NT
ID: 1798976
