@Pre-FERMI nVidia GPU users: Important warning
Author | Message |
---|---|
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
We're currently referring to the demos (with source code) at https://developer.nvidia.com/opencl Yeah, sadly the old oclSimpleGL that demonstrates the technique (but doesn't do much) isn't there. The Ocean one isn't there either, and isn't in the SDKs I currently have installed, though oclSimpleGL shows up in at least 3.2 and 4. The Ocean one was removed from OpenCL prior to 3.2 and moved to the DirectCompute compute shader demos (Microsoft). There was a huge kerfuffle at the time, because MS's big thing was the synchronisation being discussed here (DirectX/DirectCompute, which Cuda still uses underneath for its synchronisation), and OpenGL/OpenCL's was (and probably still is) faster. Still, it wouldn't solve the issues here without different engineering. [*Edit*]: note that as far as I can see, the particles demo doesn't use any kind of blocking synchronisation, so it will spin the CPU at 100% as observed. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
I've been running test WUs with the previous build - both normally, and with -use_sleep. The normal run was usable, and passed validation - so far so good. With -use_sleep, screen lag was intolerable - especially during ap_18se08aa_B6_P1_00046_1LC25 and sigind_v5. I couldn't even reliably complete a drag'n'drop operation while those were running. And at one point, I got it to sleep for about half an hour: After killing it, the next one ran normally - note the extra TpCallbackIndependent threads: |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Curious. Who's using Windows Kernel Threadpools? The OpenCL DLL/Driver? |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
"Curious. Who's using Windows Kernel Threadpools? The OpenCL DLL/Driver?" Pass. Above my pay-grade :P |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
"Curious. Who's using Windows Kernel Threadpools? The OpenCL DLL/Driver?" Well, either way, it changes a lot if you can enter an alertable wait instead of sleeping or using Cuda blocking syncs. [Edit:] ROFL, so much for "impossible to leak kernel resources as boinc apps don't use them...." (highly paraphrased) |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"I couldn't even reliably complete a drag'n'drop operation while those were running. And at one point, I got it to sleep for about half an hour:" It's counterintuitive, because a call to Sleep(1) gives the scheduler a switching point before the quantum has expired, so it should increase GUI responsiveness, not decrease it. Check stderr to see whether anything unusual was reported there for that task. Anyway, to get clear results it's better to follow the pattern I described earlier: single task, 2 binaries, 3 sets of switches. |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
Any confirmation that new build works OK with 341.44 driver on pre-FERMI cards? Reading the conditional logic for a bad driver version from changeset 2867, it appears to me the rejection of 344+ has been lost. `if(driver_major_version_num>=340 && driver_major_version_num<341 || (driver_minor_version_num<44 && driver_major_version_num==341)` The first line of that logically reduces to `if(driver_major_version_num==340` Joe |
Jacob Klein Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
"Any confirmation that new build works OK with 341.44 driver on pre-FERMI cards?" Sooo... Doesn't it say "340.xx or (341.xx less than 341.44)"? Looks sorta correct to me. |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
"Any confirmation that new build works OK with 341.44 driver on pre-FERMI cards?" I don't know if checking for 344+ drivers is needed; probably they wouldn't even install on a system which had only a pre-Fermi GPU. If nVidia has considered all possibilities, adding a pre-Fermi card as a secondary GPU may also be handled gracefully. I just wanted to point out that the original `>= 340` did define everything higher as bad for pre-Fermi, while the present logic says nothing at all about anything above 341. Joe |
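To make Joe's point concrete, here is a minimal sketch in C comparing the two rules. It is an illustration only: the function names are hypothetical, the parentheses of the quoted condition are balanced by assumption (the quote above is cut off), and "the original rule" is modelled simply as `major >= 340`, which is how it is described in this thread rather than taken from the source.

```c
#include <stdbool.h>

/* Original rule as described in the thread: every driver from 340 upward
 * counts as bad for pre-Fermi cards. The minor number is ignored. */
bool bad_driver_original(int major, int minor)
{
    (void)minor;
    return major >= 340;
}

/* Condition quoted from changeset 2867, with parentheses balanced by
 * assumption. For integer major numbers the first clause is just
 * major == 340, so this flags 340.xx and 341.xx below 341.44 only. */
bool bad_driver_changeset_2867(int major, int minor)
{
    return (major >= 340 && major < 341) || (minor < 44 && major == 341);
}
```

With 344.11 as an example input, `bad_driver_original` returns true and `bad_driver_changeset_2867` returns false. Both agree on 340.52 (bad) and the newer rule lets 341.44 through, which is the intended change. Whether 342 and above need a rule at all is the open question in the posts that follow.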
Richard Haselgrove Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
"Any confirmation that new build works OK with 341.44 driver on pre-FERMI cards?" That's correct - 344(+) drivers won't install on legacy hardware. But that doesn't mean that we won't push them up incrementally to 342 and above before we've finished finding their bugs for them. Has anybody got a copy of Clean20 they can point me to? I know I downloaded it from "the other thread" the year before last, but the machine I'm testing on has a sacrificial OS for testing BOINC installers - and I've wiped it since then. |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
"Has anybody got a copy of Clean20 they can point me to? I know I downloaded it from "the other thread" the year before last, but the machine I'm testing on has a sacrificial OS for testing BOINC installers - and I've wiped it since then." GPU AP tuning: new set of test tasks for GPU AP Claggy |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
I suppose that if the bug is fixed in 341.44, it will not reappear in higher versions. At least, the probability is low enough that we shouldn't ban a version before the bug actually shows up in it. |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
I've also carried out a bench on my E8500/9800GTX+ host, its first since these ones: http://setiweb.ssl.berkeley.edu/beta/forum_thread.php?id=2195&postid=52381 AP7_win_x86_SSE2_OpenCL_NV_r2690.exe / single_pulses.wu : Now r2690 with 341.44 matches what was produced with 337.50 Beta drivers: Claggy |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
I reported back to NV that the fix works and closed the bug report. Awaiting info about the sync/async bug status. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
"Well, to re-test that sync/async issue this binary https://www.dropbox.com/s/nwtnomc5m6uhxen/AP7_win_x86_SSE2_OpenCL_NV_r2745_sleep_loop_shifted.exe.7z?dl=0 can be dropped along usual one into benchmark." Completed step (2) WU : Clean_20LC.wu I don't think "sleep_loop_shifted" is viable - it doesn't reduce lag, and it does consume CPU. The step (1) numbers were WU : Clean_20LC.wu (as near identical as makes no difference) |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
And for step (3) WU : Clean_20LC.wu The basic r2745 mostly waited 2 iterations in SinglePulse find (before buffer read) and 20 iterations in PC_inner_ffa (before buffer read). Sometimes more - the highest I've found is "Awaited 1219 iterations for completion" of an ffa. I may have been stress testing for screen lag at the time... sleep_loop_shifted waited 1 iteration for both SinglePulse and ffa - never more, never less. |
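For readers trying to picture what those "Awaited N iterations" counts measure, here is a rough sketch in C of a polling wait loop. Everything in it is an assumption for illustration: `await_completion`, `fake_event` and `fake_event_done` are hypothetical names, a simple countdown stands in for the GPU event, and the real r2745 source may be structured quite differently. The sleep call is passed in as a function pointer so the same loop models both cases discussed in this thread: given a wrapper around `Sleep(1)` it yields the CPU between polls, given `NULL` it spins.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GPU event: it "completes" after a fixed
 * number of unsuccessful polls. */
struct fake_event {
    unsigned polls_left;
};

/* Returns true once the simulated work is finished; otherwise uses up
 * one poll and returns false. */
bool fake_event_done(void *ctx)
{
    struct fake_event *ev = ctx;
    if (ev->polls_left == 0)
        return true;
    ev->polls_left--;
    return false;
}

typedef bool (*done_fn)(void *ctx);
typedef void (*sleep_fn)(unsigned milliseconds);

/* Poll is_done() until it reports completion or max_iterations is hit.
 * If sleep_ms is non-NULL it is called with 1 ms between polls (on
 * Windows this could wrap Sleep(1)); if it is NULL the loop busy-waits.
 * Returns the number of iterations spent waiting. */
unsigned await_completion(done_fn is_done, void *ctx,
                          sleep_fn sleep_ms, unsigned max_iterations)
{
    unsigned iterations = 0;
    while (iterations < max_iterations && !is_done(ctx)) {
        if (sleep_ms != NULL)
            sleep_ms(1);
        iterations++;
    }
    return iterations;
}
```

Called with a countdown of 2, this returns 2, the same shape as the "waited 2 iterations" figure above. It only models the counting and says nothing about why the real loop needed 2, 20 or 1219 passes on a given run.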
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
So, the bug is still there. |
RottenMutt Joined: 15 Mar 01 Posts: 1011 Credit: 230,314,058 RAC: 0 |
So it isn't fixed? Don't upgrade to the latest drivers? |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"So it isn't fixed?" The bug this thread was started about is fixed in 341.44. The other bug isn't, but it exists in older drivers too. |
Jacob Klein Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
So, maybe consider closing and locking the thread, since it is resolved, to prevent more confusion? |