Need help: trying to get 2 NVIDIA cards running under Linux

Eric B
Joined: 9 Mar 00
Posts: 88
Credit: 168,875,085
RAC: 762
United States
Message 2032169 - Posted: 13 Feb 2020, 3:27:07 UTC

I'm running openSUSE Tumbleweed. I had an NVIDIA GTX 1660 Ti as my primary GPU and it worked great. Now I have added a second NVIDIA card, an RTX 2060, and I moved my display to this card. When I run BOINC it sees and uses the RTX 2060 but doesn't see the 1660 Ti. I am using NVIDIA driver 440.31; I tried reinstalling that driver but it still doesn't work. Also, nvidia-smi only sees one card (the 2060), but nvidia-settings sees both. Finally, the CUDA version is 10.2 according to nvidia-smi, and my app_info.xml specifies the app setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101.
lspci shows:
# lspci|grep VGA
         03:00.0 VGA compatible controller: NVIDIA Corporation TU106 [GeForce RTX 2060 Rev. A] (rev a1)
         05:00.0 VGA compatible controller: NVIDIA Corporation TU116 [GeForce GTX 1660 Ti] (rev a1)

I have use_all_gpus set to 1 in cc_config.xml:
<cc_config>
 <options>
   <use_all_gpus>1</use_all_gpus>
 </options>
</cc_config>
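Since a typo in this file is one possible cause, a quick check of the XML can rule it out. The sketch below is only an illustration: the helper name use_all_gpus_enabled is made up here, and the sample text is the config quoted above. It confirms that the text is well-formed XML and that the use_all_gpus option under options is set to 1. It does not check anything else BOINC may require of the file.

```python
import xml.etree.ElementTree as ET


def use_all_gpus_enabled(xml_text: str) -> bool:
    """Return True if <cc_config><options><use_all_gpus> is set to 1.

    Hypothetical helper for illustration; raises ET.ParseError if the
    text is not well-formed XML (for example, a missing closing tag).
    """
    root = ET.fromstring(xml_text)
    if root.tag != "cc_config":
        return False
    value = root.findtext("options/use_all_gpus")
    return value is not None and value.strip() == "1"


# The configuration quoted in the post above.
SAMPLE = """<cc_config>
 <options>
   <use_all_gpus>1</use_all_gpus>
 </options>
</cc_config>"""

if __name__ == "__main__":
    print("use_all_gpus enabled:", use_all_gpus_enabled(SAMPLE))
```

A misspelled tag such as use_all_gpu would make the helper return False rather than raise, which is the kind of silent typo worth looking for.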

BOINC says:
12-Feb-2020 19:12:38 [---] CUDA: NVIDIA GPU 0: GeForce RTX 2060 (driver version 440.31, CUDA version 10.2, compute capability 7.5, 4096MB, 3970MB available, 6739 GFLOPS peak)
12-Feb-2020 19:12:38 [---] OpenCL: NVIDIA GPU 0: GeForce RTX 2060 (driver version 440.31, device version OpenCL 1.2 CUDA, 5932MB, 3970MB available, 6739 GFLOPS peak)
12-Feb-2020 19:12:38 [---] OpenCL CPU: pthread-Intel(R) Core(TM) i7-3960X CPU @ 3.30GHz (OpenCL driver vendor: The pocl project, driver version 1.4, device version OpenCL 1.2 pocl HSTR: pthread-x86_64-unknown-linux-gnu-sandybridge)
12-Feb-2020 19:12:38 [SETI@home] Found app_info.xml; using anonymous platform
12-Feb-2020 19:12:38 [---] [libc detection] gathered: 2.30, GNU libc


Any ideas?
TIA
ID: 2032169
Joseph Stateson (Project Donor)
Volunteer tester
Joined: 27 May 99
Posts: 309
Credit: 70,759,933
RAC: 3
United States
Message 2032171 - Posted: 13 Feb 2020, 3:41:09 UTC - in response to Message 2032169.  

Might there be a typo in cc_config.xml?
Is the file owned by "boinc"? For example:
-rw-rw-r-- 1 boinc boinc 2581 Feb  9 14:50 cc_config.xml

Issue the "read config" command and look for a line like this:
2/12/2020 9:35:16 PM	Re-reading cc_config.xml	
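
The ownership and permission check suggested here can also be scripted. This is a minimal sketch, not a BOINC tool: the function name describe_file is hypothetical, and the path to cc_config.xml has to be supplied by the user because it depends on where the BOINC data directory lives. It reports the same fields as the ls -l line above, plus whether the current user can read the file.

```python
import grp
import os
import pwd
import stat
import sys


def describe_file(path):
    """Return (mode string, owner, group, readable-by-current-user).

    Hypothetical helper for illustration. Owner and group fall back to
    the numeric id when no matching passwd/group entry exists.
    """
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)
    return stat.filemode(st.st_mode), owner, group, os.access(path, os.R_OK)


if __name__ == "__main__":
    # Usage: python check_perms.py /path/to/cc_config.xml
    for target in sys.argv[1:]:
        print(target, *describe_file(target))
```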
ID: 2032171
Eric B
Joined: 9 Mar 00
Posts: 88
Credit: 168,875,085
RAC: 762
United States
Message 2032191 - Posted: 13 Feb 2020, 7:28:45 UTC - in response to Message 2032171.  

I found it: something had removed me from the 'video' group. Once I put that back, everything worked.
My BOINC install is a bit different. I don't use the boinc user/group because I like to run boinc-client as
myself in $HOME/boinc and in the foreground, so I can see what's going on there. Another reason is that the
default SUSE install never worked properly: no matter what I tried, it always tried to start BOINC in the
wrong place, even with the -d or --dir switch, which it mostly ignored. There were other reasons as well, but
that's behind me now, and I have been running boinc-client in the foreground and in $HOME/boinc on 4
machines without issue until today, when I added the 2060. Seven days ago there was a broken update from
SUSE that may have been the root cause of the group change. Whatever, it's fixed now.
 Starting BOINC client version 7.16.3 for x86_64-suse-linux-gnu
 log flags: file_xfer, sched_ops, task
 Libraries: libcurl/7.68.0 OpenSSL/1.1.1d-fips zlib/1.2.11 libidn2/2.3.0 libpsl/0.21.0 (+libidn2/2.3.0) libssh/0.9.3/openssl/zlib nghttp2/1.40.0
 Data directory: /home/erbenton/boinc
 CUDA: NVIDIA GPU 0: GeForce RTX 2060 (driver version 440.59, CUDA version 10.2, compute capability 7.5, 4096MB, 3970MB available, 6739 GFLOPS peak)
 CUDA: NVIDIA GPU 1: GeForce GTX 1660 Ti (driver version 440.59, CUDA version 10.2, compute capability 7.5, 4096MB, 3972MB available, 5668 GFLOPS peak)
 OpenCL: NVIDIA GPU 0: GeForce RTX 2060 (driver version 440.59, device version OpenCL 1.2 CUDA, 5932MB, 3970MB available, 6739 GFLOPS peak)
 OpenCL: NVIDIA GPU 1: GeForce GTX 1660 Ti (driver version 440.59, device version OpenCL 1.2 CUDA, 5945MB, 3972MB available, 5668 GFLOPS peak)
 OpenCL CPU: pthread-Intel(R) Core(TM) i7-3960X CPU @ 3.30GHz (OpenCL driver vendor: The pocl project, driver version 1.4, device version OpenCL 1.2 pocl HSTR: pthread-x86_64-unknown-linux-gnu-sandybridge)
 Found app_info.xml; using anonymous platform
 [libc detection] gathered: 2.30, GNU libc
 Host name: erb1
 Processor: 12 GenuineIntel Intel(R) Core(TM) i7-3960X CPU @ 3.30GHz [Family 6 Model 45 Stepping 7]
 Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm epb ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts md_clear flush_l1d
 OS: Linux openSUSE: openSUSE Tumbleweed [5.4.12-1|libc 2.30 (GNU libc)]
 Memory: 31.29 GB physical, 2.00 GB virtual
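
The cause found in this post (missing membership in the 'video' group) can be checked with a short script. The sketch below rests on an assumption not stated in the thread: that the NVIDIA device nodes appear as /dev/nvidia* character devices and are restricted to the 'video' group, as is commonly the case on openSUSE. The function names are hypothetical. It lists the groups of the current process and any NVIDIA device nodes that process cannot open for reading and writing.

```python
import glob
import grp
import os
import stat


def current_group_names():
    """Return the names of all groups the current process belongs to."""
    names = set()
    for gid in set(os.getgroups()) | {os.getegid()}:
        try:
            names.add(grp.getgrgid(gid).gr_name)
        except KeyError:
            names.add(str(gid))  # no group entry for this gid; keep the number
    return names


def inaccessible_nodes(pattern="/dev/nvidia*"):
    """Return character devices matching pattern that lack read/write access.

    The default pattern is an assumption about where the driver puts its nodes.
    """
    blocked = []
    for path in sorted(glob.glob(pattern)):
        if not stat.S_ISCHR(os.stat(path).st_mode):
            continue  # skip directories and regular files
        if not os.access(path, os.R_OK | os.W_OK):
            blocked.append(path)
    return blocked


if __name__ == "__main__":
    print("in 'video' group:", "video" in current_group_names())
    print("inaccessible NVIDIA nodes:", inaccessible_nodes())
```

Group changes normally take effect only for new login sessions, so the script should be run from the same kind of session that starts boinc-client.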
ID: 2032191
