Need help trying to understand what happened on v7 Cuda50 WUs
Author | Message |
---|---|
Cliff Harding Joined: 18 Aug 99 Posts: 1432 Credit: 110,967,840 RAC: 67 |
I installed a 2nd GTX750Ti FTW yesterday. I did reinstall the NVidia driver and recycled the machine prior to restarting BOINC again. I'm running Win7 (x64), BOINC 7.4.8 (x64) with Lunatics 0.41. The app_config.xml is

    <app_config>
      <app>
        <name>astropulse_v6</name>
        <max_concurrent>8</max_concurrent>
        <gpu_versions>
          <gpu_usage>.5</gpu_usage>
          <cpu_usage>1</cpu_usage>
        </gpu_versions>
      </app>
      <app>
        <name>setiathome_v7</name>
        <gpu_versions>
          <gpu_usage>.25</gpu_usage>
          <cpu_usage>.5</cpu_usage>
        </gpu_versions>
      </app>
    </app_config>

Tonight I spotted this http://setiathome.berkeley.edu/results.php?hostid=5501972&offset=0&show_names=0&state=6&appid=11 showing 100 WUs aborted by user with an error status 201 (0xc9) EXIT_MISSING_COPROC. Snippet of STDOUTDAE from 22 July:

    22-Jul-2014 18:26:58 [---] Starting BOINC client version 7.4.8 for windows_x86_64
    22-Jul-2014 18:26:58 [---] log flags: file_xfer, sched_ops, task, cpu_sched, dcf_debug
    22-Jul-2014 18:26:58 [---] Libraries: libcurl/7.33.0 OpenSSL/1.0.1h zlib/1.2.8
    22-Jul-2014 18:26:58 [---] Data directory: D:\BOINC
    22-Jul-2014 18:26:58 [---] Running under account Cliff Harding
    22-Jul-2014 18:26:58 [---] OpenCL: NVIDIA GPU 0: GeForce GTX 750 Ti (driver version 340.43, device version OpenCL 1.1 CUDA, 2048MB, 2048MB available, 101 GFLOPS peak)
    22-Jul-2014 18:26:58 [---] OpenCL: NVIDIA GPU 1: GeForce GTX 750 Ti (driver version 340.43, device version OpenCL 1.1 CUDA, 2048MB, 2048MB available, 101 GFLOPS peak)
    22-Jul-2014 18:26:58 [---] OpenCL: Intel GPU 0 (ignored by config): Intel(R) HD Graphics 4600 (driver version 10.18.10.3621, device version OpenCL 1.2, 1195MB, 1195MB available, 200 GFLOPS peak)
    22-Jul-2014 18:26:58 [---] OpenCL CPU: Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz (OpenCL driver vendor: Intel(R) Corporation, driver version 3.0.1.10878, device version OpenCL 1.2 (Build 76413))
    22-Jul-2014 18:26:58 [Milkyway@Home] Found app_info.xml; using anonymous platform
    22-Jul-2014 18:26:58 [---] App version needs CUDA but GPU doesn't support it
    22-Jul-2014 18:26:58 [SETI@home] Found app_info.xml; using anonymous platform
    22-Jul-2014 18:26:58 [---] App version needs CUDA but GPU doesn't support it
    22-Jul-2014 18:26:58 [---] App version needs CUDA but GPU doesn't support it
    22-Jul-2014 18:26:58 [---] App version needs CUDA but GPU doesn't support it
    ... ...
    22-Jul-2014 18:26:58 [---] App version needs CUDA but GPU doesn't support it
    22-Jul-2014 18:26:58 [---] App version needs CUDA but GPU doesn't support it
    22-Jul-2014 18:26:58 [---] App version needs CUDA but GPU doesn't support it
    22-Jul-2014 18:26:58 [---] App version needs CUDA but GPU doesn't support it
    22-Jul-2014 18:26:58 [SETI@home] Missing coprocessor for task 21ja09ac.7210.12751.438086664200.12.145_1
    22-Jul-2014 18:26:58 [SETI@home] Missing coprocessor for task 22mr08ab.19719.17250.438086664196.12.236_0
    22-Jul-2014 18:26:58 [SETI@home] Missing coprocessor for task 21ja09ac.7210.19704.438086664200.12.191_0
    ... ...
    22-Jul-2014 18:26:58 [SETI@home] Missing coprocessor for task 22mr08ab.6744.15205.438086664205.12.90_0
    22-Jul-2014 18:26:58 [---] Host name: A-SYS
    22-Jul-2014 18:26:58 [---] Processor: 8 GenuineIntel Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz [Family 6 Model 60 Stepping 3]
    22-Jul-2014 18:26:58 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 vmx tm2 pbe fsgsbase bmi1 smep bmi2
    22-Jul-2014 18:26:58 [---] OS: Microsoft Windows 7: Ultimate x64 Edition, Service Pack 1, (06.01.7601.00)
    22-Jul-2014 18:26:58 [---] Memory: 7.67 GB physical, 8.06 GB virtual
    22-Jul-2014 18:26:58 [---] Disk: 119.24 GB total, 88.74 GB free
    22-Jul-2014 18:26:58 [---] Local time is UTC -4 hours
    22-Jul-2014 18:26:58 [Milkyway@Home] Found app_config.xml
    22-Jul-2014 18:26:58 [SETI@home] Found app_config.xml
    22-Jul-2014 18:26:58 [---] Config: report completed tasks immediately
    22-Jul-2014 18:26:58 [---] Config: use all coprocessors
    22-Jul-2014 18:26:58 [---] Config: ignoring Intel GPU 0
    22-Jul-2014 18:26:58 [---] Config: GUI RPCs allowed from:
    22-Jul-2014 18:26:58 [Milkyway@Home] URL http://milkyway.cs.rpi.edu/milkyway/; Computer ID 543798; resource share 0
    22-Jul-2014 18:26:58 [SETI@home] URL http://setiathome.berkeley.edu/; Computer ID 5501972; resource share 100
    22-Jul-2014 18:26:58 [SETI@home] General prefs: from SETI@home (last modified 18-Aug-2013 12:00:17)
    22-Jul-2014 18:26:58 [SETI@home] Host location: none
    22-Jul-2014 18:26:58 [SETI@home] General prefs: using your defaults
    22-Jul-2014 18:26:58 [---] Reading preferences override file
    22-Jul-2014 18:26:58 [---] Preferences:
    22-Jul-2014 18:26:58 [---] max memory usage when active: 7462.60MB
    22-Jul-2014 18:26:58 [---] max memory usage when idle: 7855.37MB
    22-Jul-2014 18:26:58 [---] max disk usage: 79.46GB
    22-Jul-2014 18:26:58 [---] max CPUs used: 6
    22-Jul-2014 18:26:58 [---] (to change preferences, visit a project web site or select Preferences in the Manager)
    22-Jul-2014 18:26:58 [---] Not using a proxy
    22-Jul-2014 18:27:00 Initialization completed

The machine has been recycled several times since then and there don't appear to be any more aborts. Temps seem to be within nominal range for both the CPU & GPUs. It does appear, though, that with both GPUs working nominally they are sucking up WUs like Pac-Man.
I don't buy computers, I build them!! |
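For readers wondering why two cards chew through work so quickly, here is a rough sketch of the arithmetic implied by the app_config.xml quoted in the post above. It assumes that the number of tasks sharing one GPU is simply 1 divided by `gpu_usage`, capped by `max_concurrent` where present; the real BOINC scheduler weighs other factors too. The function name `gpu_task_budget` and the `gpu_count` parameter are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The app_config.xml from the post, laid out one element per line.
APP_CONFIG = """<app_config>
  <app>
    <name>astropulse_v6</name>
    <max_concurrent>8</max_concurrent>
    <gpu_versions>
      <gpu_usage>.5</gpu_usage>
      <cpu_usage>1</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>setiathome_v7</name>
    <gpu_versions>
      <gpu_usage>.25</gpu_usage>
      <cpu_usage>.5</cpu_usage>
    </gpu_versions>
  </app>
</app_config>"""


def gpu_task_budget(xml_text, gpu_count):
    """Estimate concurrent GPU tasks and reserved CPU per app.

    Hypothetical helper: assumes tasks per GPU = floor(1 / gpu_usage),
    which is a simplification of what the BOINC client actually does.
    """
    budget = {}
    for app in ET.fromstring(xml_text).findall("app"):
        name = app.findtext("name")
        gpu_usage = float(app.findtext("gpu_versions/gpu_usage"))
        cpu_usage = float(app.findtext("gpu_versions/cpu_usage"))
        # Small epsilon guards against float rounding in 1 / gpu_usage.
        per_gpu = int(1 / gpu_usage + 1e-9)
        total = per_gpu * gpu_count
        cap = app.findtext("max_concurrent")
        if cap is not None:
            total = min(total, int(cap))
        budget[name] = {"per_gpu": per_gpu, "total": total, "cpu": total * cpu_usage}
    return budget


if __name__ == "__main__":
    for name, row in gpu_task_budget(APP_CONFIG, gpu_count=2).items():
        print(name, row)
```

Under those assumptions, two GPUs would run up to 8 setiathome_v7 tasks at once (4 per card) where a single card ran 4, so the work cache drains about twice as fast.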
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Where did you get your NVIDIA driver from? The current NVIDIA driver is 337.88. |
Cliff Harding Joined: 18 Aug 99 Posts: 1432 Credit: 110,967,840 RAC: 67 |
"where did you get your nvidia driver from? Current Nvidia driver is driver: 337.88" 340.43 is the beta driver dated 17 June from the NVidia web site, where I get all of my version upgrades. It has been on my machine since 13 July with no problems. I always d/l a beta version after it has been up for a while and have only had one problem that I can remember. If you look at WUs from this machine prior to yesterday, there were no problems. I don't buy computers, I build them!! |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Apparently something is broken or missing in your driver, or there is an incompatibility with your configuration. Try reinstalling the driver, but use the recommended (more tested and stable) GeForce 337.88 driver instead of the beta one. Download it directly from the NVIDIA site, of course. Don't forget: do a clean installation. |
Cliff Harding Joined: 18 Aug 99 Posts: 1432 Credit: 110,967,840 RAC: 67 |
I always do a clean driver install, and the machine has been steadily crunching on one GPU since 13 July with no errors, so I don't believe that is/was the problem. The problem only lasted for approx. 100 WUs. If the driver were at fault, it would have blown all GPU tasks down the drain, which is not the case: the problem lasted for less than 1 minute (clock time) and then ceased. If it had lasted longer, or if it had affected other types (OpenCL), I would have suspected the driver. The errors, some of which have already dropped off the list, are the only errors that I have seen on this machine with this driver; in fact, the only ones since the machine first came online in May. I don't buy computers, I build them!! |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
I'm going to include a link to another thread I saw from 2 years ago. While it doesn't describe exactly what is happening here, it's pretty close. You might want to read all the way through the thread and see what you think: http://boinc.berkeley.edu/dev/forum_thread.php?id=7600 |
Cliff Harding Joined: 18 Aug 99 Posts: 1432 Credit: 110,967,840 RAC: 67 |
"I'm going to include a link from another thread I saw from 2 years ago. While not completely describing what is happening, it's pretty close. Might want to read all the way thru the thread and see what you think" I read the thread and two things immediately pop out: 1) His driver for the card was not recognized upon startup, whereas mine was immediately detected. 2) He is using Linux and I'm using Win7 (x64). I also noticed that sometimes you have to play games to get BOINC to recognize NVidia devices on Linux or other OSs, whereas you don't have this problem with Windows, regardless of where the data directory is located. BOINC resides in C:\Program Files\BOINC. The data directory, and therefore all .EXEs, resides on D:\BOINC\Projects, with sub-folders for each project attached to the machine. This has been the configuration for all of my machines for several years. I don't buy computers, I build them!! |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
... Detection of the CUDA capability is separate from detection of OpenCL, and the snippet of STDOUTDAE in your original post was missing the expected lines for CUDA. From subsequent posts it seems that may have been a one-time glitch, and the cause may remain inscrutable. OTOH, it might make sense to report the incident to the BOINC developers since you're using the current alpha version. I doubt they have many alpha testers who have transitioned from one GPU to two. Joe |
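Joe's point can be checked mechanically. The sketch below counts GPU detection lines and "Missing coprocessor" lines in a startup log. The OpenCL and "Missing coprocessor" patterns are taken from the log quoted in the first post; the `CUDA: NVIDIA GPU n:` pattern is an assumption about what a healthy startup prints, since no such line appears anywhere in this thread. The function name `summarize_gpu_detection` is made up for this example.

```python
import re

# A few lines copied from the startup log in the first post.
SAMPLE_LOG = """\
22-Jul-2014 18:26:58 [---] OpenCL: NVIDIA GPU 0: GeForce GTX 750 Ti (driver version 340.43, device version OpenCL 1.1 CUDA, 2048MB, 2048MB available, 101 GFLOPS peak)
22-Jul-2014 18:26:58 [---] OpenCL: NVIDIA GPU 1: GeForce GTX 750 Ti (driver version 340.43, device version OpenCL 1.1 CUDA, 2048MB, 2048MB available, 101 GFLOPS peak)
22-Jul-2014 18:26:58 [---] App version needs CUDA but GPU doesn't support it
22-Jul-2014 18:26:58 [SETI@home] Missing coprocessor for task 21ja09ac.7210.12751.438086664200.12.145_1
"""

# Assumed format of a CUDA detection line (not shown in the thread).
CUDA_LINE = re.compile(r"\] CUDA: NVIDIA GPU \d+:")
# Format of an OpenCL detection line, as seen in the quoted log.
OPENCL_LINE = re.compile(r"\] OpenCL: NVIDIA GPU \d+:")


def summarize_gpu_detection(log_text):
    """Count CUDA/OpenCL detection lines and missing-coprocessor errors."""
    counts = {"cuda": 0, "opencl": 0, "missing": 0}
    for line in log_text.splitlines():
        if CUDA_LINE.search(line):
            counts["cuda"] += 1
        elif OPENCL_LINE.search(line):
            counts["opencl"] += 1
        elif "Missing coprocessor for task" in line:
            counts["missing"] += 1
    return counts


if __name__ == "__main__":
    counts = summarize_gpu_detection(SAMPLE_LOG)
    print(counts)
    if counts["opencl"] and not counts["cuda"]:
        print("OpenCL saw the NVIDIA cards but no CUDA detection lines were found")
```

Running it against the full stdoutdae.txt after each restart would show whether the CUDA lines came back, which is the detail worth including in any report to the BOINC developers.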