Tasks ending in errors

_heinz
Volunteer tester

Joined: 25 Feb 05
Posts: 744
Credit: 5,539,270
RAC: 0
France
Message 1809035 - Posted: 14 Aug 2016, 8:58:01 UTC - in response to Message 1808676.  

Thanks Gant(SSSF),
as you said:
With the CPU core reserved, it should run 3 (or even 4 WUs) at a time without things being laggy (even 0.4 or 0.3 may be enough).
~~~~~~~~~~~~
It runs 4 WUs just fine.
ID: 1809035 · Report as offensive
Al

Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1809064 - Posted: 14 Aug 2016, 13:01:13 UTC - in response to Message 1809035.  

Hey heinz, quick note to say congratz for pulling the old boy back out of storage and getting it fired back up & into useful, productive work!

ID: 1809064 · Report as offensive
_heinz
Volunteer tester

Joined: 25 Feb 05
Posts: 744
Credit: 5,539,270
RAC: 0
France
Message 1809146 - Posted: 14 Aug 2016, 21:27:30 UTC
Last modified: 14 Aug 2016, 21:59:16 UTC

BOINC version 7.6.22 (x64)
V8-Xeon runs with a count of 0.2 per GPU:
device 0 runs 5 WUs
device 1 runs 5 WUs
device 2 still runs only 4 WUs
Altogether 14 WUs are running.
The question is: why does device 2 still run 4 and not 5 WUs?
As you can see here --> v8-xeon_runs_14_wus
Have a look here --> v8-xeon_0.20_gpu
What is missing?
Here is the app version section:

<app>
  <name>setiathome_v8</name>
</app>
<file_info>
  <name>Lunatics_x41zi_win32_cuda50.exe</name>
  <executable/>
</file_info>
<file_info>
  <name>cudart32_50_35.dll</name>
  <executable/>
</file_info>
<file_info>
  <name>cufft32_50_35.dll</name>
  <executable/>
</file_info>
<file_info>
  <name>mbcuda.cfg</name>
</file_info>
<app_version>
  <app_name>setiathome_v8</app_name>
  <version_num>800</version_num>
  <platform>windows_intelx86</platform>
  <api_version>6.2.18</api_version>
  <plan_class>cuda50</plan_class>
  <avg_ncpus>0.040000</avg_ncpus>
  <max_ncpus>1.000000</max_ncpus>
  <coproc>
    <type>CUDA</type>
    <count>0.2</count>
  </coproc>
  <file_ref>
    <file_name>Lunatics_x41zi_win32_cuda50.exe</file_name>
    <main_program/>
  </file_ref>
  <file_ref>
    <file_name>cudart32_50_35.dll</file_name>
  </file_ref>
  <file_ref>
    <file_name>cufft32_50_35.dll</file_name>
  </file_ref>
  <file_ref>
    <file_name>mbcuda.cfg</file_name>
  </file_ref>
</app_version>
D5400XS V8-Xeon
ID: 1809146 · Report as offensive
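
A rough sanity check of the numbers in the post above: with `<count>0.2</count>` each task claims a fifth of a GPU, so one would expect five tasks per device and 15 across the three devices, one more than the 14 observed. The sketch below reads the count from a trimmed copy of the posted fragment. The helper name `tasks_per_gpu` is made up for illustration, and the model (the client packs tasks onto a device until the fractions add up to 1) is an assumption, not something taken from the BOINC source.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <app_version> fragment from the post above.
APP_VERSION = """
<app_version>
  <app_name>setiathome_v8</app_name>
  <plan_class>cuda50</plan_class>
  <coproc>
    <type>CUDA</type>
    <count>0.2</count>
  </coproc>
</app_version>
"""

def tasks_per_gpu(count: float) -> int:
    """Tasks that fit on one GPU if each claims `count` of it (assumed model)."""
    # The small epsilon guards against float rounding in the division.
    return int(1.0 / count + 1e-9)

count = float(ET.fromstring(APP_VERSION).findtext("coproc/count"))
per_gpu = tasks_per_gpu(count)
num_gpus = 3  # devices 0, 1 and 2 in the post
expected_total = per_gpu * num_gpus
print(per_gpu, expected_total)  # 5 15
```

If the count alone decided things, the missing fifth task on device 2 would not be explained, which fits the suggestion that some other client limit is involved.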
jason_gee
Volunteer developer
Volunteer tester

Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1809157 - Posted: 14 Aug 2016, 22:07:49 UTC - in response to Message 1809146.  

Hi Heinz,
Could it be the BOINC client's memory or disk limits?
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1809157 · Report as offensive
_heinz
Volunteer tester

Joined: 25 Feb 05
Posts: 744
Credit: 5,539,270
RAC: 0
France
Message 1809164 - Posted: 14 Aug 2016, 22:33:04 UTC - in response to Message 1809157.  

Hi Heinz,
Could it be the BOINC client's memory or disk limits?

I gave it 16 CPUs, but that does not help.
~~~~~~~~~~~~~~~~~~~~~~~~
15.08.2016 00:25:27 | | Processor: 16 GenuineIntel Intel(R) Xeon(R) CPU E5405 @ 2.00GHz [Family 6 Model 23 Stepping 6]
15.08.2016 00:25:27 | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 syscall nx lm vmx tm2 dca pbe
15.08.2016 00:25:27 | | OS: Microsoft Windows 10: Professional x64 Edition, (10.00.10240.00)
15.08.2016 00:25:27 | | Memory: 16.00 GB physical, 18.37 GB virtual
15.08.2016 00:25:27 | | Disk: 931.50 GB total, 808.91 GB free

Maybe a BOINC limitation?
Perhaps I should try an older BOINC version.
ID: 1809164 · Report as offensive
_heinz
Volunteer tester

Joined: 25 Feb 05
Posts: 744
Credit: 5,539,270
RAC: 0
France
Message 1809166 - Posted: 14 Aug 2016, 22:56:11 UTC - in response to Message 1809164.  

Surprise: with BOINC version 7.4.42 (x64) it runs 5 WUs per GPU smoothly on the Titans.
Looks like an error in the newer BOINC version.

v8-xeon_runs_15_wus
D5400XS V8-Xeon
ID: 1809166 · Report as offensive
rob smith
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22216
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1809216 - Posted: 15 Aug 2016, 5:28:57 UTC
Last modified: 15 Aug 2016, 5:43:47 UTC

...while the Titans may be running 5 concurrent tasks, have you had a look at the run times? The few I looked at are long, in the range of 90 minutes to more than 150 minutes.

(With three concurrent on a GTX980 I was seeing times between 40 and 50 minutes)
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1809216 · Report as offensive
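
The point about run times can be put in numbers: what matters is completed tasks per hour, not how many run at once. The sketch below uses only the figures quoted in the post above. The function name `tasks_per_hour` is illustrative, the ">150" upper bound is treated as exactly 150, and the two cards are different hardware running different mixes of tasks, so this is a rough comparison rather than a benchmark.

```python
def tasks_per_hour(concurrent: int, minutes_per_task: float) -> float:
    """Completed tasks per hour when `concurrent` tasks each take `minutes_per_task`."""
    return concurrent * 60.0 / minutes_per_task

# Run times quoted in the post above.
titan_fast = tasks_per_hour(5, 90)    # 5 at a time, shortest run time seen
titan_slow = tasks_per_hour(5, 150)   # 5 at a time, longest run time seen
gtx980_fast = tasks_per_hour(3, 40)   # 3 at a time, shortest
gtx980_slow = tasks_per_hour(3, 50)   # 3 at a time, longest

print(f"Titan, 5 concurrent:  {titan_slow:.2f} to {titan_fast:.2f} tasks/hour")
print(f"GTX980, 3 concurrent: {gtx980_slow:.2f} to {gtx980_fast:.2f} tasks/hour")
```

On these figures, five at a time on the Titan finishes fewer tasks per hour than three at a time on the GTX980, so more concurrency is not automatically more output.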
_heinz
Volunteer tester

Joined: 25 Feb 05
Posts: 744
Credit: 5,539,270
RAC: 0
France
Message 1809252 - Posted: 15 Aug 2016, 9:04:40 UTC - in response to Message 1809216.  

OK, I looked it up.
I use V8-Xeon for work too, so I have separate settings for that, as shown:
When the computer is in use, use at most 50%
When the computer is not in use, use at most 90%
Swap: use at most 75%
~~~~~~~~~~~~~~~~~~~
This way I can work on the machine while crunching.
I get a lot of the VLAR guppi tasks too.
Sure, with 5 per GPU the run time increases.
My goal is to test whether it is possible to run 4, 5 or more WUs per GPU smoothly on V8-Xeon.
Results at --> hostid=6944847
D5400XS V8-Xeon
ID: 1809252 · Report as offensive
_heinz
Volunteer tester

Joined: 25 Feb 05
Posts: 744
Credit: 5,539,270
RAC: 0
France
Message 1809277 - Posted: 15 Aug 2016, 11:40:05 UTC

I am letting the cache drain now to test other settings.
Tasks checkpoint at most every 300 seconds
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This prevents the high bus traffic when running 15 WUs in parallel on the GPUs.
If I ran V8-Xeon under full load, I could run 16 CPU WUs in addition.
I also tried this two years ago, when I got my 2 billion credits.
D5400XS V8-Xeon
ID: 1809277 · Report as offensive
_heinz
Volunteer tester

Joined: 25 Feb 05
Posts: 744
Credit: 5,539,270
RAC: 0
France
Message 1809303 - Posted: 15 Aug 2016, 13:09:09 UTC
Last modified: 15 Aug 2016, 13:33:51 UTC

I started fresh and set 32 CPUs for full load.
It now runs 32 CPU WUs in parallel, so far OK.
This stresses my FB-DIMMs heavily; temperatures on the FB-DIMMs are ~90 °C.
Normally V8-Xeon also runs work on its GPUs in parallel, but so far no GPU has started a GPU WU.
I am looking at all the settings again.
Edit:
I looked and found nothing. The only difference from my previous settings for work is that I have allowed CPU work -->
Resource share: 100
Use CPU: yes
Use ATI GPU: no
Use NVIDIA GPU: yes
ID: 1809303 · Report as offensive
_heinz
Volunteer tester

Joined: 25 Feb 05
Posts: 744
Credit: 5,539,270
RAC: 0
France
Message 1809371 - Posted: 15 Aug 2016, 15:56:02 UTC - in response to Message 1809303.  

Meanwhile I restarted with 8 CPUs to bring down the temperature of the FB-DIMMs.
I will not cancel the CPU work...
So we must wait until all 32 CPU WUs are finished.
I set count to 0.5:
<count>0.5</count>

But no success, BOINC did not start the GPU WUs.
Once all the CPU work is done, I will disable CPU usage in the settings, as it was before.
Then we will see if it starts again...
ID: 1809371 · Report as offensive
_heinz
Volunteer tester

Joined: 25 Feb 05
Posts: 744
Credit: 5,539,270
RAC: 0
France
Message 1809410 - Posted: 15 Aug 2016, 17:39:25 UTC - in response to Message 1809371.  

No chance, we must wait until all 32 CPU WUs are done.
They will be finished tomorrow.
ID: 1809410 · Report as offensive
The_Matrix
Volunteer tester

Joined: 17 Nov 03
Posts: 414
Credit: 5,827,850
RAC: 0
Germany
Message 1809413 - Posted: 15 Aug 2016, 17:45:27 UTC - in response to Message 1809371.  
Last modified: 15 Aug 2016, 17:49:44 UTC

but no success, BOINC did not start the GPU-wu's


Are you NOT using an app_config.xml file in the SETI project folder?

<app_config>
  <app>
    <name>setiathome_v8</name>
    <max_concurrent>20</max_concurrent>

    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>.25</cpu_usage>
    </gpu_versions>
  </app>
</app_config>


This is what mine looks like. For max_concurrent I would recommend 50; that is the maximum number of workunits that can be processed at the same time. Regards

P.S.

Set gpu_usage and cpu_usage at your own discretion.
ID: 1809413 · Report as offensive
_heinz
Volunteer tester

Joined: 25 Feb 05
Posts: 744
Credit: 5,539,270
RAC: 0
France
Message 1809435 - Posted: 15 Aug 2016, 18:49:38 UTC - in response to Message 1809413.  

OK,
I did not have an app_config.xml yet.
I always had everything in cc_config.xml.
But it worked.
1 WU on each GPU.
It looks as if
<gpu_usage>1.0</gpu_usage>
is what does that,
even though
<count>0.2</count>
is specified in cc_config.

Best regards, _heinz
ID: 1809435 · Report as offensive
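
The observation in the post above (one WU per GPU despite a count of 0.2) fits the idea that a `gpu_usage` value in app_config.xml takes precedence over the count configured elsewhere. A minimal sketch of that precedence rule follows. Both function names are hypothetical, and the rule itself is inferred from what was observed in this thread rather than from the client source.

```python
from typing import Optional

def effective_gpu_fraction(base_count: float,
                           app_config_gpu_usage: Optional[float] = None) -> float:
    """GPU fraction per task, letting the app_config.xml value win when present."""
    if app_config_gpu_usage is not None:
        return app_config_gpu_usage
    return base_count

def tasks_per_gpu(fraction: float) -> int:
    """Tasks that fit on one GPU if each claims `fraction` of it (assumed model)."""
    return int(1.0 / fraction + 1e-9)

# Count 0.2 on its own, then the same count with gpu_usage 1.0 on top.
without_app_config = tasks_per_gpu(effective_gpu_fraction(0.2))
with_app_config = tasks_per_gpu(effective_gpu_fraction(0.2, 1.0))
print(without_app_config, with_app_config)  # 5 1
```

Under that assumption, getting back to several tasks per GPU means lowering `gpu_usage` itself, not the count.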
The_Matrix
Volunteer tester

Joined: 17 Nov 03
Posts: 414
Credit: 5,827,850
RAC: 0
Germany
Message 1809449 - Posted: 15 Aug 2016, 19:55:58 UTC
Last modified: 15 Aug 2016, 20:02:27 UTC

Yes, that is how it works with <gpu_usage>.

Download the freeware GPU-Z and look at the GPU load in %.

If it is only at 50%, you can set gpu_usage to 0.5; at 33%, set it to 0.33, and so on.

Others also recommend leaving a few CPU cores free by setting processor usage to 75%. You might think that gives you less performance and wastes power, but that is not the case.

Besides CPU load, OpenCL somehow uses some other resources as well, I do not know which right now, but that is what the freed-up CPU capacity is needed for.

The result is a more even load on the GPU, although the GPU then also draws MORE power.
ID: 1809449 · Report as offensive
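
The rule of thumb in the post above (50% load gives 0.5, 33% gives 0.33) amounts to dividing the load seen with a single task by 100. A small sketch, with a made-up function name and an arbitrary choice of two decimal places; it is only a starting value to try, since real load varies from task to task.

```python
def suggested_gpu_usage(utilization_percent: float) -> float:
    """Map the GPU load seen with one task running to a gpu_usage starting value."""
    if not 0 < utilization_percent <= 100:
        raise ValueError("utilization must be greater than 0 and at most 100")
    return round(utilization_percent / 100.0, 2)

# Load readings as one might take them from a monitoring tool such as GPU-Z.
for load in (100, 50, 33):
    print(load, suggested_gpu_usage(load))
```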
_heinz
Volunteer tester

Joined: 25 Feb 05
Posts: 744
Credit: 5,539,270
RAC: 0
France
Message 1809480 - Posted: 15 Aug 2016, 23:56:42 UTC

With app_config.xml I got it running.
3 WUs per GPU and 32 CPU WUs run in parallel.
When the CPU WUs are done, I will let the machine run without CPU work.
The load on the FB-DIMMs is enormous, with temperatures up to 98 °C.
The night is cool, so it can run under full load.
V8-Xeon now runs 41 WUs in parallel.
D5400XS V8-Xeon
ID: 1809480 · Report as offensive
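
For anyone wanting to reproduce the 3-per-GPU setup, here is a sketch that builds an app_config.xml string for a given number of tasks per GPU and checks the total of 41. The function name `build_app_config` is made up, the `cpu_usage` of 0.25 is borrowed from the example file posted earlier in the thread (the value actually used was not reported), and three GPUs are assumed from the device 0 to 2 listing.

```python
import xml.etree.ElementTree as ET

def build_app_config(app_name: str, tasks_per_gpu: int, cpu_usage: float) -> str:
    """Build an app_config.xml string asking for `tasks_per_gpu` tasks per GPU."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    # Each task claims 1/N of a GPU, so N tasks fit on one device.
    ET.SubElement(gpu, "gpu_usage").text = f"{1.0 / tasks_per_gpu:.2f}"
    ET.SubElement(gpu, "cpu_usage").text = f"{cpu_usage:.2f}"
    return ET.tostring(root, encoding="unicode")

xml_text = build_app_config("setiathome_v8", tasks_per_gpu=3, cpu_usage=0.25)
print(xml_text)

num_gpus, cpu_tasks = 3, 32  # three devices and 32 CPU tasks, as in the thread
total = num_gpus * 3 + cpu_tasks
print(total)  # 41
```

The output is a single unindented line; the client should not care about whitespace, but the file is easier to read if it is indented by hand afterwards.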
_heinz
Volunteer tester

Joined: 25 Feb 05
Posts: 744
Credit: 5,539,270
RAC: 0
France
Message 1809575 - Posted: 16 Aug 2016, 8:45:23 UTC

All CPU WUs are finished now, so we can go back to my normal mode, still crunching with the GPUs on V8-Xeon.
It was a good experiment to run V8-Xeon under full load again.
But this is more sensible in winter, when it is cool enough to keep the temperatures within normal limits.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
It now runs a moderate 3 WUs per GPU, so I can also do my other work on the machine without any issues.

Thanks to all for your support.
:-)
D5400XS V8-Xeon
ID: 1809575 · Report as offensive
_heinz
Volunteer tester

Joined: 25 Feb 05
Posts: 744
Credit: 5,539,270
RAC: 0
France
Message 1809901 - Posted: 17 Aug 2016, 12:54:35 UTC

During the weekly outage I ran PrimeGrid PPS Sieve.
Now when I want work for SETI I get this message:
17.08.2016 14:48:12 | SETI@home | Not requesting tasks: don't need (CPU: ; NVIDIA GPU: not highest priority project)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
How can I handle this without stopping PrimeGrid?
I want to run SETI and PrimeGrid together on the same machine.
What do I have to do to set the priority thingy right?
D5400XS V8-Xeon
ID: 1809901 · Report as offensive
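
One setting that bears on the priority question above is each project's resource share (100 for SETI@home in an earlier post). As a rough sketch, the long-run split between projects is proportional to their shares. The PrimeGrid values below are hypothetical, the function name is made up, and the client's actual work-fetch priority also depends on how much work each project has recently done, which this does not model.

```python
from typing import Dict

def share_fractions(resource_shares: Dict[str, float]) -> Dict[str, float]:
    """Fraction of computing each project is entitled to, proportional to its share."""
    total = sum(resource_shares.values())
    return {name: share / total for name, share in resource_shares.items()}

# SETI@home at 100 as posted; the PrimeGrid value is a hypothetical example.
fractions = share_fractions({"SETI@home": 100, "PrimeGrid": 100})
print(fractions)

# A second hypothetical case: tripling one project's share.
skewed = share_fractions({"SETI@home": 300, "PrimeGrid": 100})
print(skewed)
```

After a week of PrimeGrid only, the client may hold back SETI requests for a while until the balance evens out, so the message may go away by itself.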
_heinz
Volunteer tester

Joined: 25 Feb 05
Posts: 744
Credit: 5,539,270
RAC: 0
France
Message 1812565 - Posted: 25 Aug 2016, 19:16:56 UTC

A short review of the last week, with pictures:
Look, FBDIMM2 is at 98 °C
v8-xeon_32cpu_wu_parallel
~~~~~~~~~~~~~~~~~~~~~~~~~
For 9 days V8-Xeon has been running PrimeGrid on the GPUs
v8-xeon_tophost_pg_nr_9
v8-xeon_topuser_pg_nr_36
v8-xeon_tpp_by_work24h_pg_nr_7
~~~~~~~~~~~~~~~~~~~~~~~~~
V8-Xeon has now been running 24 hours a day for 9 days, absolutely stable, with continuous high output that puts it in the top host list.
And don't forget, it is summer here, with hot days of up to 35 °C.
D5400XS V8-Xeon
ID: 1812565 · Report as offensive
The_Matrix
Volunteer tester

Joined: 17 Nov 03
Posts: 414
Credit: 5,827,850
RAC: 0
Germany
Message 1812822 - Posted: 26 Aug 2016, 20:31:48 UTC
Last modified: 26 Aug 2016, 20:34:21 UTC

.....
<gpu_versions>
<max_concurrent>20</max_concurrent>
....
</gpu_versions>

is the right place for max_concurrent; I was wrong.
ID: 1812822 · Report as offensive
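
A note of caution on the placement: the BOINC client configuration documentation, as far as it can be recalled here, lists `<max_concurrent>` as a direct child of `<app>`, next to `<name>`, with `<gpu_versions>` holding only `gpu_usage` and `cpu_usage`. That would make the first version posted in this thread the documented one, so it is worth checking before moving the tag. The sketch below only reports where the tag sits in a given file. The function name is made up, and the second sample keeps just the lines shown in the post, since the rest was elided there.

```python
import xml.etree.ElementTree as ET

def max_concurrent_location(app_config_xml: str) -> str:
    """Report whether <max_concurrent> sits under <app>, under <gpu_versions>, or is absent."""
    root = ET.fromstring(app_config_xml)
    if root.find("app/max_concurrent") is not None:
        return "app"
    if root.find("app/gpu_versions/max_concurrent") is not None:
        return "gpu_versions"
    return "absent"

# Layout of the first app_config.xml posted in this thread.
first_version = """<app_config><app><name>setiathome_v8</name>
<max_concurrent>20</max_concurrent>
<gpu_versions><gpu_usage>1.0</gpu_usage><cpu_usage>.25</cpu_usage></gpu_versions>
</app></app_config>"""

# Layout from the post above; elided lines are left out, not guessed.
revised_version = """<app_config><app><name>setiathome_v8</name>
<gpu_versions><max_concurrent>20</max_concurrent></gpu_versions>
</app></app_config>"""

print(max_concurrent_location(first_version))    # app
print(max_concurrent_location(revised_version))  # gpu_versions
```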


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.