low performance with dual GPU under linux

ChristianVirtual
Joined: 23 Jun 13
Posts: 21
Credit: 10,060,003
RAC: 0
Japan
Message 1878616 - Posted: 16 Jul 2017, 23:49:40 UTC

I need your collective knowledge and experience:

I have a CentOS 7 system with the latest NVIDIA driver, 384.47,
one 980Ti (MSI SeaHawk), and one 1080Ti (FE). The CPU is an i7-2600S on an Asus motherboard.

I have WUs (BLC) that run for more than 30 minutes, while the estimate says 6 minutes.

When looking around the system I saw this in "top": two irq processes linked to NVIDIA. I'm not sure whether I have seen those before; any idea where they come from?

29709 boinc     30  10 64.509g 209208  87984 S   4.0  0.6   2:25.89 setiathome_8.01                                                                                                   
29746 boinc     30  10 64.864g 218664  98296 S   2.0  0.7   1:39.73 setiathome_8.01                                                                                                   
 7275 root     -51   0       0      0      0 S   1.0  0.0 225:20.22 irq/49-nvidia                                                                                                     
 7282 root     -51   0       0      0      0 S   0.3  0.0 409:51.05 irq/50-nvidia   
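
For context on the irq/49-nvidia and irq/50-nvidia entries: on Linux, kernel threads named irq/<number>-<name> are threaded interrupt handlers, and the same IRQ numbers normally show up in /proc/interrupts. The following is a minimal Python sketch, assuming the usual /proc/interrupts layout, that lists the IRQ lines mentioning a given driver name. The helper name find_irqs is hypothetical and not part of any tool mentioned in this thread.

```python
from pathlib import Path


def find_irqs(interrupts_text, needle="nvidia"):
    """Return {irq_number: total_count} for rows that mention `needle`.

    Assumes the usual /proc/interrupts layout: a header row naming the
    CPUs, then rows like "49: <count per CPU> ... <chip> <device name>".
    """
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    result = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit() or needle not in rest:
            continue  # skip named rows (NMI, LOC, ...) and other devices
        counts = rest.split()[:n_cpus]
        result[int(label)] = sum(int(c) for c in counts if c.isdigit())
    return result


if __name__ == "__main__":
    proc = Path("/proc/interrupts")
    if proc.exists():
        for irq, total in sorted(find_irqs(proc.read_text()).items()):
            print(f"IRQ {irq}: {total} interrupts")
```

If the numbers reported match the 49 and 50 in the thread names above, the threads are simply the driver's interrupt handlers for the two cards.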


Any other ideas why performance might be poor? There are no other CPU tasks running.
ID: 1878616
Brent Norman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1878655 - Posted: 17 Jul 2017, 1:47:33 UTC - in response to Message 1878616.  

Greater than 30 minutes with those cards? How many tasks at a time are you running?

I believe my 750Ti was around 26 minutes with stock apps.

And your computers are hidden, so I can't see the details ...
ID: 1878655
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1878659 - Posted: 17 Jul 2017, 1:55:08 UTC - in response to Message 1878616.  
Last modified: 17 Jul 2017, 2:04:19 UTC

The irq/49-nvidia entry is a system process and is normal; I have something similar on my NVIDIA machines. The problem is the setiathome_8.01 part. I'm assuming that's the old baseline CUDA app? It's kind of hard to tell since your machines are hidden. The old baseline apps have a bug with VLAR tasks that causes them to use only one compute unit (CU), which is why they run slowly on VLARs. The new CUDA app fixes that problem with the unroll feature, which sends data to the unused CUs and results in much faster performance.

The only way to stop receiving tasks for that faulty app is to switch to Anonymous platform and supply either the stock OpenCL app or the new CUDA app. The new CUDA app is much faster than the stock OpenCL app. You can download a package designed to be plug and play here: Linux_zi3v-CUDA80_Special.7z. Just add the expanded files to the setiathome.berkeley.edu folder and, if you're using the repository version of BOINC, set the permissions. The boinc user in your top output (for example, on the 29709 line) indicates you are using the repository version. If you were using the BOINC version downloaded from Berkeley, running in your home folder, you wouldn't have to worry about permissions.
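
The permissions step for a repository install can be sketched roughly as follows. This is a minimal Python sketch, not the package's own install procedure: the helper make_executable, the example data directory, the boinc owner name, and the file name in the usage comment are all assumptions for illustration, so check where your distribution's BOINC package keeps its project folder and which files the archive actually contains.

```python
import os
import shutil
import stat
from pathlib import Path


def make_executable(project_dir, names, owner=None):
    """Add execute bits to the named files inside project_dir.

    names: file names to mark executable (hypothetical; use the names
           of the application binaries from the expanded archive).
    owner: if given, also chown each file to this user and group,
           which normally requires root.
    Returns the names whose mode was actually changed.
    """
    changed = []
    for name in names:
        path = Path(project_dir) / name
        if not path.is_file():
            continue  # silently skip names that are not present
        if owner is not None:
            shutil.chown(path, user=owner, group=owner)
        mode = path.stat().st_mode
        new_mode = mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
        if new_mode != mode:
            os.chmod(path, new_mode)
            changed.append(name)
    return changed


# Example call (path, file name, and owner are assumptions, not taken
# from the package):
# make_executable("/var/lib/boinc/projects/setiathome.berkeley.edu",
#                 ["<app binary name>"], owner="boinc")
```

Passing an explicit list of names keeps data files such as XML configuration from being marked executable by accident.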
ID: 1878659
ChristianVirtual
Joined: 23 Jun 13
Posts: 21
Credit: 10,060,003
RAC: 0
Japan
Message 1878694 - Posted: 17 Jul 2017, 5:39:46 UTC

Thanks both,

I just unhid the hosts and installed the version TBar suggested. It is now running CUDA 8. It is difficult for me to compare, though, since different WUs are being downloaded (I see some BLC further down the queue); I need to wait a bit.
ID: 1878694
Brent Norman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1878710 - Posted: 17 Jul 2017, 8:08:25 UTC - in response to Message 1878694.  
Last modified: 17 Jul 2017, 8:23:28 UTC

Thanks for unhiding your computers, I'm getting the picture now.

- It's a new computer ID created July 8.
- Up until July 16 it was still searching for the 'right' app to run on your computer. This is normal for Seti to do with stock apps.
- When it was testing the CUDA60 app you just happened to be running all BLC tasks (from what I see) and they were >30 minutes. CUDA60 is known to be REALLY slow.
- The r3584 SoG app looks to have run fairly well after that.
- Then you changed to x41p_zi3v on July 17.

So now your times look quite good to me. I don't think you will see a BLC task last longer than 3 minutes now :)

SoG would have run fine, but thanks to TBar you are now running the 'cream of the crop' "special sauce" app. Enjoy watching those tasks fly through.

PS: I would add -nobs to your command line for a little extra boost, but it requires more CPU: one thread per card.

EDIT: For your GPUs on BLC tasks now ...
980Ti ~3m 20s
1080Ti ~2m 20s
Not bad at all :D
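
As a rough worked comparison of the figures in this thread, the sketch below takes the original ">30 minutes" as exactly 30 minutes. That baseline is an assumption, and since the real runs were longer, it understates the gain.

```python
def to_seconds(minutes, seconds=0):
    """Convert a minutes/seconds run time to seconds."""
    return minutes * 60 + seconds


# Assumed baseline: the ">30 minutes" BLC run time, taken as exactly 30 min.
baseline = to_seconds(30)

# Approximate BLC run times quoted above for the new app.
new_times = {"980Ti": to_seconds(3, 20), "1080Ti": to_seconds(2, 20)}

speedups = {gpu: baseline / t for gpu, t in new_times.items()}
for gpu, factor in speedups.items():
    print(f"{gpu}: at least {factor:.1f}x faster")
```

With these inputs the 980Ti comes out at 9.0x and the 1080Ti at about 12.9x.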
ID: 1878710
ChristianVirtual
Joined: 23 Jun 13
Posts: 21
Credit: 10,060,003
RAC: 0
Japan
Message 1878721 - Posted: 17 Jul 2017, 9:44:00 UTC

Thanks, Brent, for the assessment ... I will try the noobs, I mean -nobs, option too. The CPU in this box is used exclusively to drive the GPUs, as it is for the other DC projects on this box, so it should be fine with that.
ID: 1878721

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.