Fastest MB/AP cmdline settings for a NV GTX980Ti?

Message boards : Number crunching : Fastest MB/AP cmdline settings for a NV GTX980Ti?
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1735886 - Posted: 21 Oct 2015, 9:47:46 UTC
Last modified: 21 Oct 2015, 9:50:36 UTC

Yes, I have added the tuning section to the readme files.

I last modified those files in June, before Installer 0.43b was released.

The work group size has nothing to do with CUDA cores.
The 980TI also has a work group size of 1024.
Changing it will have no positive effect on the 980TI.
Using a larger work group size on high-end cards results in longer run times.
It only helps entry-level cards with fewer than 10 compute units, such as the 720.

For high-end cards like the Titan X or 980TI, the best way to speed up the card further is to increase -unroll when -use_sleep is in place.
The FFA_fetch values are the maximum possible without getting false overflows.

Increasing -unroll might require reducing the ffa_fetch values on some GPU/host combos.


With each crime and every kindness we birth our future.
ID: 1735886
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1735893 - Posted: 21 Oct 2015, 10:29:26 UTC - in response to Message 1735886.  


Increasing -unroll might require reducing the ffa_fetch values on some GPU/host combos.

And this possible relation can't be explained from the algorithm's point of view :)
Maybe kernel calls are batched in some indivisible blocks there...
Big -ffa_block sizes could reasonably require a drop in the -ffa_block_fetch value.
ID: 1735893
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1735894 - Posted: 21 Oct 2015, 10:51:14 UTC - in response to Message 1735893.  
Last modified: 21 Oct 2015, 10:57:55 UTC


I mean both the -ffa_block and -ffa_block_fetch values.
With 16384 and 8192 I got some lag using -unroll values above 24 on some NV cards during my tests, on the 780TI for example.
The card can handle -ffa_block 12288 and -ffa_block_fetch 6144 with -unroll 28.
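
As a rough illustration of the values quoted here, the following minimal Python sketch assembles an option string from them. The helper name `build_cmdline` is hypothetical and not part of any SETI@home tool. The check that -ffa_block is a multiple of -ffa_block_fetch is only an assumption drawn from the two pairs mentioned in this thread (16384/8192 and 12288/6144); the readme files remain the reference for valid ranges.

```python
def build_cmdline(unroll, ffa_block, ffa_block_fetch, use_sleep=True):
    """Assemble an option string from the flags discussed in this thread.

    Hypothetical helper for illustration only.
    """
    if unroll <= 0 or ffa_block <= 0 or ffa_block_fetch <= 0:
        raise ValueError("all values must be positive integers")
    # Assumption: both pairs quoted in the thread have ffa_block as a
    # multiple of ffa_block_fetch, so anything else is rejected here.
    if ffa_block % ffa_block_fetch != 0:
        raise ValueError("ffa_block should be a multiple of ffa_block_fetch")

    parts = []
    if use_sleep:
        parts.append("-use_sleep")
    parts += [
        "-unroll", str(unroll),
        "-ffa_block", str(ffa_block),
        "-ffa_block_fetch", str(ffa_block_fetch),
    ]
    return " ".join(parts)


# Values reported above as working on the 780TI.
print(build_cmdline(28, 12288, 6144))
```

Whether a given combination is actually faster still has to be measured per host, since the lag described above showed up only on some cards.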


ID: 1735894
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1735898 - Posted: 21 Oct 2015, 11:01:38 UTC - in response to Message 1735893.  


I've noticed that in Linux the smaller ffa numbers work better, at least on the cards I have tried it on.
I've also noticed that with some drivers you don't need the sleep option. Check out these times, not using sleep: http://setiathome.berkeley.edu/results.php?hostid=7771865&offset=60&appid=20
Then these CPU times: http://setiathome.berkeley.edu/results.php?hostid=7611256&appid=20
My NVIDIA card in Ubuntu, not using sleep: http://setiathome.berkeley.edu/results.php?hostid=7258715&offset=100
The same card behaves the same in OS X without sleep.
ID: 1735898
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1735899 - Posted: 21 Oct 2015, 11:04:22 UTC - in response to Message 1735898.  


Yes, we are talking about Windows only in this case.


ID: 1735899
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1735900 - Posted: 21 Oct 2015, 11:07:02 UTC - in response to Message 1735898.  

I've also noticed with some drivers you don't need the -Sleep cmd,

Same on Windows, but the drivers that don't need sleep are ancient by now ....
ID: 1735900
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1735901 - Posted: 21 Oct 2015, 11:08:54 UTC - in response to Message 1735900.  

Yes, but the drivers in Linux Are Not ancient.
ID: 1735901
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1735902 - Posted: 21 Oct 2015, 11:11:24 UTC

Can we stay on topic, please?

These guys are asking for optimisation on their hosts.
None of them is running Linux or OS X.


ID: 1735902
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1735908 - Posted: 21 Oct 2015, 11:21:11 UTC - in response to Message 1735902.  

If they discovered it would optimize better, they might.
It is on topic, just a topic you don't like.
ID: 1735908
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1735909 - Posted: 21 Oct 2015, 11:24:54 UTC - in response to Message 1735908.  

I run Linux, so why do you think I don't like it?


ID: 1735909
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1735912 - Posted: 21 Oct 2015, 11:29:51 UTC - in response to Message 1735909.  

Why are you telling me that Linux optimizations are off topic?
If the cards run better in Linux, why say it's off topic?
ID: 1735912
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1735913 - Posted: 21 Oct 2015, 11:43:42 UTC - in response to Message 1735912.  

Optimization by changing the OS is a viable approach, but then one should consider the overall performance gain, because it's hardly possible to put the NV card "under Linux" while staying with Windows for all other devices without unacceptable overhead. This makes a transition to Linux a more complex optimization than simple parameter-space tuning for a particular GPU.
It would be interesting to get full-host transition case studies, Linux vs Windows, in a separate thread (OS X should be excluded because it requires different hardware/firmware, AFAIK). Those would account for the CPU performance change as well as other possible intricacies for multi-GPU hosts.
Having such a thread would be really great in view of the Win10 concerns...
ID: 1735913
AyalaZero
Volunteer tester
Joined: 14 Aug 05
Posts: 21
Credit: 10,910,119
RAC: 0
United States
Message 1743434 - Posted: 19 Nov 2015, 23:36:25 UTC - in response to Message 1735913.  

Are you telling me I can set up my computer to run SETI on Linux while still being able to actively use my computer on Windows?
ID: 1743434
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1743631 - Posted: 20 Nov 2015, 17:35:11 UTC - in response to Message 1743434.  

It depends on how you would "actively use" it.
Playing 3D games - probably not. Browsing and office work - yes, it's quite possible.
Just run Windows in a VM under Linux.
ID: 1743631