Intel GPU Validation Inconclusive

Message boards : Number crunching : Intel GPU Validation Inconclusive

Saicere
Joined: 9 Jul 99
Posts: 7
Credit: 6,717,907
RAC: 0
Netherlands
Message 1788382 - Posted: 18 May 2016, 10:45:46 UTC

I recently added two new Skylake 6700K-based computers with the Intel GPU enabled and available for Seti@Home to use, but I'm seeing a lot of "Validation Inconclusive" results on both for work units using the Intel OpenCL application (specifically, SETI@home v8 v8.00 (opencl_intel_gpu_sah) windows_intelx86).

https://setiathome.berkeley.edu/results.php?hostid=8004096
https://setiathome.berkeley.edu/results.php?hostid=8004128

CPU-generated units seem fine, and most of the GPU results get updated to "Valid"; there's just one "Invalid" at this point in time.

Any idea what might be causing the discrepancy and large number of inconclusive results?
ID: 1788382
Saicere
Joined: 9 Jul 99
Posts: 7
Credit: 6,717,907
RAC: 0
Netherlands
Message 1788408 - Posted: 18 May 2016, 12:11:21 UTC - in response to Message 1788382.  
Last modified: 18 May 2016, 12:11:50 UTC

Was looking a bit more at this, specifically the one work unit that was marked as invalid:

https://setiathome.berkeley.edu/workunit.php?wuid=2160985056

The results look similar: the AMD GPU and CPU results found 9 pulses and a triplet, while the Intel GPU result found 10 pulses. Here are the actual results from the AMD and Intel GPUs, sorted by time, with the differing entry at the bottom:

AMD:

Pulse: peak=0.1869025, time=45.82, period=0.06417, d_freq=1763327896.77, score=1.011, chirp=-20.16, fft_len=128 
Pulse: peak=0.5658, time=45.82, period=0.4011, d_freq=1763332430.22, score=1.053, chirp=48.532, fft_len=256 
Pulse: peak=6.95606, time=45.84, period=15.58, d_freq=1763333460.1, score=1.052, chirp=-97.251, fft_len=512 
Pulse: peak=7.482055, time=45.84, period=15.58, d_freq=1763333451.55, score=1.131, chirp=-97.437, fft_len=512 
Pulse: peak=7.584933, time=45.99, period=22.91, d_freq=1763331222.83, score=1.02, chirp=51.262, fft_len=4k
Pulse: peak=4.384547, time=45.99, period=10.5, d_freq=1763326967.09, score=1.035, chirp=55.625, fft_len=4k
Pulse: peak=5.635067, time=45.9, period=14.58, d_freq=1763325916.18, score=1.032, chirp=-68.506, fft_len=2k
Pulse: peak=4.664864, time=45.9, period=10.77, d_freq=1763329940.72, score=1.073, chirp=-76.392, fft_len=2k
Pulse: peak=7.535296, time=46.17, period=21, d_freq=1763323584.46, score=1.044, chirp=46.654, fft_len=8k

Triplet: peak=10.01984, time=48.65, period=28.95, d_freq=1763323381.63, chirp=43.492, fft_len=512 


Intel:
Pulse: peak=0.1866636, time=45.82, period=0.06417, d_freq=1763327896.77, score=1.01, chirp=-20.16, fft_len=128 
Pulse: peak=0.5739036, time=45.82, period=0.4011, d_freq=1763332430.22, score=1.068, chirp=48.532, fft_len=256 
Pulse: peak=6.865847, time=45.84, period=15.58, d_freq=1763333460.1, score=1.038, chirp=-97.251, fft_len=512 
Pulse: peak=7.336316, time=45.84, period=15.58, d_freq=1763333451.55, score=1.109, chirp=-97.437, fft_len=512 
Pulse: peak=7.647188, time=45.99, period=22.91, d_freq=1763331222.83, score=1.028, chirp=51.262, fft_len=4k
Pulse: peak=4.426685, time=45.99, period=10.5, d_freq=1763326967.09, score=1.045, chirp=55.625, fft_len=4k
Pulse: peak=5.647213, time=45.9, period=14.58, d_freq=1763325916.18, score=1.034, chirp=-68.506, fft_len=2k
Pulse: peak=4.581229, time=45.9, period=10.77, d_freq=1763329940.72, score=1.054, chirp=-76.392, fft_len=2k
Pulse: peak=7.537786, time=46.17, period=21, d_freq=1763323584.46, score=1.044, chirp=46.654, fft_len=8k

Pulse: peak=1.256464, time=45.84, period=1.668, d_freq=1763332922.82, score=1, chirp=54.879, fft_len=512 


Outside of the last pulse/triplet and a minor discrepancy in the peaks, the results are matching, with identical time/period/d_freq/chirp/fft_len.
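
To make that comparison less dependent on eyeballing, here is a minimal Python sketch that pairs up signals from two result listings and reports how far the peaks differ. It uses only three lines from each listing above. The matching key (time, period, d_freq, chirp, fft_len) and the relative-difference measure are my own assumptions for illustration, not a description of how the project's validator works.

```python
# Subset of the two listings above (three signals each), copied as posted.
AMD = """\
Pulse: peak=0.1869025, time=45.82, period=0.06417, d_freq=1763327896.77, score=1.011, chirp=-20.16, fft_len=128
Pulse: peak=7.482055, time=45.84, period=15.58, d_freq=1763333451.55, score=1.131, chirp=-97.437, fft_len=512
Triplet: peak=10.01984, time=48.65, period=28.95, d_freq=1763323381.63, chirp=43.492, fft_len=512
"""

INTEL = """\
Pulse: peak=0.1866636, time=45.82, period=0.06417, d_freq=1763327896.77, score=1.01, chirp=-20.16, fft_len=128
Pulse: peak=7.336316, time=45.84, period=15.58, d_freq=1763333451.55, score=1.109, chirp=-97.437, fft_len=512
Pulse: peak=1.256464, time=45.84, period=1.668, d_freq=1763332922.82, score=1, chirp=54.879, fft_len=512
"""

# Assumption: two signals are "the same detection" if these fields match exactly.
KEY_FIELDS = ("time", "period", "d_freq", "chirp", "fft_len")


def parse(text):
    """Map (kind, time, period, d_freq, chirp, fft_len) -> peak for each line."""
    signals = {}
    for line in text.strip().splitlines():
        kind, _, rest = line.partition(":")
        fields = dict(item.strip().split("=") for item in rest.split(","))
        key = (kind,) + tuple(fields[name] for name in KEY_FIELDS)
        signals[key] = float(fields["peak"])
    return signals


def compare(a, b):
    """Return relative peak differences for shared signals, plus the unmatched keys."""
    shared = {k: abs(a[k] - b[k]) / max(a[k], b[k]) for k in a.keys() & b.keys()}
    return shared, sorted(a.keys() - b.keys()), sorted(b.keys() - a.keys())


shared, only_amd, only_intel = compare(parse(AMD), parse(INTEL))
for key, rel in sorted(shared.items()):
    print(f"{key[0]} at time={key[1]}: peaks differ by {rel:.2%}")
print("only in AMD result:", only_amd)
print("only in Intel result:", only_intel)
```

Running it on the full listings should show the same pattern: every shared pulse differs only in peak (and score), and each result has exactly one signal the other lacks.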
ID: 1788408
BetelgeuseFive Project Donor
Volunteer tester
Joined: 6 Jul 99
Posts: 158
Credit: 17,117,787
RAC: 19
Netherlands
Message 1788419 - Posted: 18 May 2016, 12:43:27 UTC - in response to Message 1788382.  

I recently added two new Skylake 6700K-based computers with the Intel GPU enabled and available for Seti@Home to use, but I'm seeing a lot of "Validation Inconclusive" results on both for work units using the Intel OpenCL application (specifically, SETI@home v8 v8.00 (opencl_intel_gpu_sah) windows_intelx86).

https://setiathome.berkeley.edu/results.php?hostid=8004096
https://setiathome.berkeley.edu/results.php?hostid=8004128

CPU-generated units seem fine, and most of the GPU results get updated to "Valid", there's just one "Invalid" at this point in time.

Any idea what might be causing the discrepancy and large number of inconclusive results?


Same problem for me (i5-6500). There seem to be problems with the OpenCL support in the Intel HD 530 driver. This has also been reported on other projects (e.g. Einstein@home). The only project I have found so far that works without problems is Collatz Conjecture, but this is probably because it does not use floating point.

Tom
ID: 1788419
BetelgeuseFive Project Donor
Volunteer tester
Joined: 6 Jul 99
Posts: 158
Credit: 17,117,787
RAC: 19
Netherlands
Message 1788421 - Posted: 18 May 2016, 12:47:46 UTC

Ah, nearly forgot. You should also check this thread:

http://setiathome.berkeley.edu/forum_thread.php?id=78613

Tom
ID: 1788421
Saicere
Joined: 9 Jul 99
Posts: 7
Credit: 6,717,907
RAC: 0
Netherlands
Message 1788452 - Posted: 18 May 2016, 15:31:41 UTC - in response to Message 1788421.  
Last modified: 18 May 2016, 15:33:35 UTC

Ah, nearly forgot. You should also check this thread:

http://setiathome.berkeley.edu/forum_thread.php?id=78613


Thanks for the tip. I looked at the thread in question, but I'm not seeing any kind of runtime error or anything else that indicates that the work unit failed in any way; it just seems to be generating results that are slightly different from some other computers.

Without knowing the logic behind the result validator/comparator, I can only guess that the results are different enough to get a "Validation Inconclusive" state but close enough to still be validated after a third result arrives. The GPU results for these two computers now stand at 20 valid, 1 invalid and 25 inconclusive. Of the valid ones, quite a few were initially inconclusive but then ended up validated for all three tasks, which seems weird. If the first two task results had discrepancies, shouldn't at least one of them be invalid?

I guess it's possible that Intel pulled another FDIV bug with the Skylake HD 530 GPU, but there could be something weird going on with the Seti@Home result validator as well.

Also, driver versions:

OpenCL: Intel GPU 0: Intel(R) HD Graphics 530 (driver version 20.19.15.4377, device version OpenCL 2.0, 13041MB, 13041MB available, 221 GFLOPS peak)
OpenCL CPU: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz (OpenCL driver vendor: Intel(R) Corporation, driver version 5.2.0.10094, device version OpenCL 2.0 (Build 10094))
ID: 1788452
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1788514 - Posted: 18 May 2016, 19:06:10 UTC - in response to Message 1788452.  

there could be something weird going on with the Seti@Home result validator as well.

Consider a simple example.
Let's say the validator considers two results different if they differ by more than 2.
Let result A be 1 and result B be 3.1.
Evidently they are different as far as the validator is concerned.
Then result C comes in, and it's 1.5.

Which result should the validator discard then?
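
The example can be checked in a few lines of Python. This is only a sketch of the toy numbers above: the threshold of 2 and the names A, B and C are hypothetical, not the real validator's values. It shows that "agrees within a tolerance" is not transitive: C agrees with both A and B even though A and B disagree with each other.

```python
from itertools import combinations

TOLERANCE = 2.0  # hypothetical threshold taken from the toy example


def agree(x, y, tol=TOLERANCE):
    """Two results agree if they differ by no more than the tolerance."""
    return abs(x - y) <= tol


results = {"A": 1.0, "B": 3.1, "C": 1.5}

# Compare every pair of results once.
pairs = {(p, q): agree(results[p], results[q]) for p, q in combinations(results, 2)}
for (p, q), ok in pairs.items():
    print(p, q, "agree" if ok else "differ")
```

So a pair that is inconclusive on its own can end with all three results accepted once a third one arrives that sits between them.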
ID: 1788514
Saicere
Joined: 9 Jul 99
Posts: 7
Credit: 6,717,907
RAC: 0
Netherlands
Message 1788674 - Posted: 19 May 2016, 11:57:12 UTC - in response to Message 1788514.  

there could be something weird going on with the Seti@Home result validator as well.

Consider simple example.
Let say validator consider results as different if difference more than 2.
Let result A be 1, result B be 3.1
Apparently they are different for validator.
Then result C comes and it's 1.5

What validator should discard then?


I'm not saying that the validator doesn't have to make choices like this, but when WU results involving the Intel HD 530 produce "Validation Inconclusive" results 50% of the time, there might be something off with the thresholds for flagging it as such. The other possible conclusion is that the HD 530 GPU and/or OpenCL drivers are just bad at math.
ID: 1788674
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1788687 - Posted: 19 May 2016, 13:16:33 UTC - in response to Message 1788674.  

The other possible conclusion is that the HD 530 GPU and/or OpenCL drivers are just bad at math.

Most probably. We had issues with Intel's OpenCL drivers before too.
ID: 1788687
Earthcore
Joined: 9 Feb 13
Posts: 19
Credit: 2,333,725
RAC: 0
Sweden
Message 1795361 - Posted: 11 Jun 2016, 13:41:05 UTC
Last modified: 11 Jun 2016, 13:47:18 UTC

I have the same problem with an i5-6200U Skylake and Intel HD Graphics 520 (OpenCL 2.0).

Could Intel have a bug in their GPU's floating-point processing?
ID: 1795361
Earthcore
Joined: 9 Feb 13
Posts: 19
Credit: 2,333,725
RAC: 0
Sweden
Message 1808768 - Posted: 12 Aug 2016, 20:43:13 UTC
Last modified: 12 Aug 2016, 21:07:57 UTC

After upgrading the Intel HD Graphics 520 driver today to version 21.20.16.4494 (dated 2016-07-25), all work units end up with a Computation Error ("Beräkningsfel" in Swedish).

SETI@home 8.12 SETI@home v8 (opencl_intel_gpu_sah) 06mr09af.1650.18882.5.32.83_2 00:00:01 (00:00:01) 84,94 100,000 - 2016-10-04 11:14:45 0,00581C + 1 Intel GPU Beräkningsfel (5,)

What's wrong?
ID: 1808768
betreger Project Donor
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1808769 - Posted: 12 Aug 2016, 20:46:04 UTC - in response to Message 1808768.  

What's wrong?

IMHO the Intel GPU is what's wrong. Many, including myself, have found them not to be worth the effort.
ID: 1808769
Earthcore
Joined: 9 Feb 13
Posts: 19
Credit: 2,333,725
RAC: 0
Sweden
Message 1808772 - Posted: 12 Aug 2016, 20:56:37 UTC - in response to Message 1808769.  

OK, but maybe someone in the project should talk with Intel to fix this problem.
ID: 1808772
rob smith Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22190
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1808777 - Posted: 12 Aug 2016, 21:17:46 UTC

The problem is that the Intel iGPU family, while very common, is very difficult to get working properly. The developers are all volunteers, and their skills have been developed on the two "mainstream" GPU families (AMD & nVidia), not on the iGPU.
Now, if the project could find an iGPU developer interested in providing their services for free I am sure they would be welcomed with open arms.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1808777
petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1808787 - Posted: 12 Aug 2016, 22:08:55 UTC - in response to Message 1808777.  

The problem is that the Intel iGPU family, while very common, are very difficult to get working properly. The developers are all volunteers, and have skills that have been developed on the two "mainstream" GPU families (AMD & nVidia), and not the iGPU.
Now, if the project could find an iGPU developer interested in providing their services for free I am sure they would be welcomed with open arms.



I'd hug them with open arms.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1808787
Michael H.W. Weber
Joined: 13 Aug 05
Posts: 3
Credit: 3,250,984
RAC: 0
Germany
Message 1808874 - Posted: 13 Aug 2016, 14:06:01 UTC

Well, it seems, my Broadwell Intel IGP (HD 5500) has the same issues.

Question 1: Wouldn't it be an idea to validate ATI vs. ATI, Intel vs. Intel and NVIDIA vs. NVIDIA GPU tasks? For many projects differences in rounding when using different CPUs result in slightly different results such that a bit-wise result comparison runs into issues when cross-system validation is applied.

Question 2: So far, I haven't taken a closer look at the details of my tasks, but is it correct that, from what was reported above, one could conclude that in a second round of validation, formerly non-validated (inconclusive) tasks were validated DESPITE THE FACT THAT THE RESULT CONTENT IS NOT IDENTICAL when inspected with the naked eye? If that is true, then I would suspect serious problems with all of these IGP-generated results. Please clarify (I do not know anything about what the validation process is based on).

Michael.
ID: 1808874
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1808883 - Posted: 13 Aug 2016, 14:22:40 UTC - in response to Message 1808874.  

1) SETI doesn't use a bitwise validator: there is a (tight) tolerance allowed on the numeric values.

2) There are two levels of validation: 'strongly similar' and 'weakly similar'. For full validation to take place, at least two of the task results must be 'strongly similar', and one of those will be chosen as the canonical scientific result. If, after validation has taken place, there are other 'weakly similar' results, they are awarded credit as a reward for a 'good try', but the weakly similar results are never used as the scientific outcome.

This schema is actually more robust than always comparing 'like with like'. There is still a small possibility that two of the 'weaker' hosts will validate against each other, and their slightly inaccurate data will get adopted as science, but the risk is seen as being low enough to be acceptable within the aims of this project.
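
For anyone who prefers code to prose, the two-level scheme described above might be sketched roughly like this in Python. Everything concrete here is an assumption made for illustration: the tolerance REL_TOL, the WEAK_FRACTION cut-off, the way signals are keyed, and the function names are hypothetical and are not taken from the real validator source.

```python
from itertools import combinations

REL_TOL = 0.01       # assumed numeric tolerance, not the project's real value
WEAK_FRACTION = 0.5  # assumed share of matching signals for "weakly similar"


def matched(a, b):
    """Count signals present in both results whose peaks agree within REL_TOL."""
    return sum(
        1
        for key in a.keys() & b.keys()
        if abs(a[key] - b[key]) <= REL_TOL * max(abs(a[key]), abs(b[key]))
    )


def strongly_similar(a, b):
    """Same set of signals, and every peak agrees within tolerance."""
    return a.keys() == b.keys() and matched(a, b) == len(a)


def weakly_similar(a, b):
    """Enough signals agree, even if some are missing or extra."""
    return matched(a, b) >= WEAK_FRACTION * max(len(a), len(b))


def validate(results):
    """Return (canonical_name, verdicts), or (None, {}) while still inconclusive."""
    for x, y in combinations(results, 2):
        if strongly_similar(results[x], results[y]):
            verdicts = {}
            for name, res in results.items():
                if strongly_similar(res, results[x]):
                    verdicts[name] = "valid"
                elif weakly_similar(res, results[x]):
                    verdicts[name] = "valid (credit only)"
                else:
                    verdicts[name] = "invalid"
            return x, verdicts
    return None, {}


# Made-up results: each maps a signal label to its peak value.
cpu = {"p1": 1.00, "p2": 2.00, "p3": 3.00}
amd = {"p1": 1.001, "p2": 2.001, "p3": 3.001}
intel = {"p1": 1.001, "p2": 2.001, "p3": 3.001, "p4": 0.50}

first_round = validate({"intel": intel, "cpu": cpu})
canonical, verdicts = validate({"intel": intel, "cpu": cpu, "amd": amd})
print(first_round)
print(canonical, verdicts)
```

With these made-up numbers the first two results are inconclusive, and once the third arrives the two strongly similar results validate while the odd one out still earns credit without becoming the canonical result.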
ID: 1808883
rob smith Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22190
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1808910 - Posted: 13 Aug 2016, 17:27:13 UTC

Also, we can normally only see the stderr report, which is a summary of the result file (the file that is sent soon after a task has completed). The result file has a lot more information about what has been found than the stderr report has.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1808910
Michael H.W. Weber
Joined: 13 Aug 05
Posts: 3
Credit: 3,250,984
RAC: 0
Germany
Message 1809088 - Posted: 14 Aug 2016, 15:49:48 UTC

OK.

What is your conclusion: Should one make use of the IGPs in Intel CPUs in this project or rather not?

Michael.
ID: 1809088
TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1809129 - Posted: 14 Aug 2016, 20:12:10 UTC - in response to Message 1809088.  

OK.

What is your conclusion: Should one make use of the IGPs in Intel CPUs in this project or rather not?

Michael.

The HD 530 on my parents' HP Pavilion, Skylake Desktop system SUCKS!!!!! I will say that it performs SLIGHTLY better on v8 MB Tasks than it did on v7 work. It FAILED MISERABLY on AP Units.

The card has HIGH Inconclusive rates; and even WITH Lunatics 0.44 installed, it is SLOWER at processing work than MANY STOCK CPU crunchers. If I had to pay for an Intel GPU; I WOULDN'T DO IT!!! It's unfortunately what came with my parents' system.

If you want a REAL GPU, get EVGA NVIDIA or AMD/ATI GPUs.

My two main crunchers are using EVGA. In Andromeda, (MAC OS X El Capitan), I'm using TWO GTX-750TI SC cards, and in Exeter, (Win XP Pro x64), I'm using a GTX-760. These two machines run FAR, FAR better than the Skylake system for GPU crunching.


TL
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1809129
rob smith Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22190
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1809140 - Posted: 14 Aug 2016, 20:42:23 UTC

Like TimeLord, my experience with the iGPU is "less than favourable". I know some have made them work fairly well, but I failed: tasks took at least as long as on the CPU (an i7) and caused no end of lags and crashes. So I gave up on it.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1809140
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.