NVIDIA driver crashing

David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1746720 - Posted: 4 Dec 2015, 3:07:18 UTC - in response to Message 1746713.  
Last modified: 4 Dec 2015, 3:08:46 UTC


---------------

As a separate but related matter, if I currently have no APs from Main, can I install Lunatics without finishing off the MB tasks in progress?

The app_config in my Main folder is just this:

<app_config>
   <app>
      <name>setiathome_v7</name>
      <gpu_versions>
         <gpu_usage>0.5</gpu_usage>
         <cpu_usage>.04</cpu_usage>
      </gpu_versions>
   </app>
</app_config>


Do I just need to duplicate the body of that with astropulse instead of setiathome? EVERY occurrence of astropulse in my current app_info.xml has the count set to .5.

Also, I seem to remember having just a little bit of customization going (there was a thread a while back where I tried to customize it more and it promptly started crashing). Is that controlled from app_info?

David,

Here is my app_config.xml file originally created by Joe Segur:


<app_config>
   <app>
      <name>astropulse_v7</name>
      <max_concurrent>2</max_concurrent>
      <gpu_versions>
         <gpu_usage>.5</gpu_usage>
         <cpu_usage>.5</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>setiathome_v7</name>
      <max_concurrent>2</max_concurrent>
      <gpu_versions>
         <gpu_usage>0.50</gpu_usage>
         <cpu_usage>0.04</cpu_usage>
      </gpu_versions>
   </app>
</app_config>

-------------------------------------------------------

With my app_config.xml (used on both Prometheus, with the GTX-750 Ti SC, and Exeter, with the GTX-760), both of my machines will crunch two units at a time: either 1 MB and 1 AP, or 2 AP, or 2 MB...


TL

Doesn't that affect what gets done on your CPU? That's the other thing I just remembered. Somewhere, I have a setting that limits the CPU to 7 tasks at a time to keep 1 core free for the GPU. I suppose that's a BOINC thing, though, not project specific...?
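
For reference, a limit like this is usually a BOINC-wide computing preference rather than part of any project's app_config.xml. A minimal sketch, assuming the limit was set as a percentage of processors on an 8-core machine (7 of 8 is 87.5%). The thread does not say where the setting actually lives, so the file and value here are illustrative only:

```xml
<!-- global_prefs_override.xml in the BOINC data directory (assumed location of the setting) -->
<global_preferences>
   <!-- use at most 87.5% of the processors: 7 of 8 cores (assumed value) -->
   <max_ncpus_pct>87.5</max_ncpus_pct>
</global_preferences>
```

The same preference can normally be changed from the BOINC Manager's computing preferences, so editing the file by hand is only one option.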
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1746724 - Posted: 4 Dec 2015, 3:12:15 UTC - in response to Message 1746720.  

I don't use the CPU to crunch, so no, there is no effect on my CPU; only 1 core of my CPU feeds the GPU on each system.


TL
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos

Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1746726 - Posted: 4 Dec 2015, 3:32:27 UTC - in response to Message 1746724.  
Last modified: 4 Dec 2015, 3:46:33 UTC

David,


Regarding the Beta site:

A couple of things. What CUDA version are you currently running on SETI?

I only ask because I don't know what the 440 usually runs.

The app_config will be different in CUDA for the 440 and the 630.

As for the OpenCL tasks, I would just abort them. I saw the discussion about what might be causing the problem. I would just not crunch them for now.

________


As for Main, which computer is this app_config going into?


<app_config>
   <app>
      <name>astropulse_v7</name>
      <gpu_versions>
         <gpu_usage>0.5</gpu_usage>
         <cpu_usage>0.20</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>setiathome_v7</name>
      <gpu_versions>
         <gpu_usage>0.5</gpu_usage>
         <cpu_usage>0.20</cpu_usage>
      </gpu_versions>
   </app>
</app_config>

Without running any CPU tasks, this should allow 2 work units at a time on the GPU in both machines.

If you want to run CPU tasks on that 8-core machine, let me know and I will modify this to add a <project_max_concurrent> so the CPU can crunch as well.

I would not recommend any CPU tasks on your dual core.
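
A minimal sketch of what such a modified file might look like, assuming the cap is meant to cover 7 CPU tasks plus 2 GPU tasks. The value 9 is an illustrative assumption and was not given in the thread. `<project_max_concurrent>` is a standard app_config.xml element that limits how many of the project's tasks run at once:

```xml
<app_config>
   <app>
      <name>astropulse_v7</name>
      <gpu_versions>
         <gpu_usage>0.5</gpu_usage>
         <cpu_usage>0.20</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>setiathome_v7</name>
      <gpu_versions>
         <gpu_usage>0.5</gpu_usage>
         <cpu_usage>0.20</cpu_usage>
      </gpu_versions>
   </app>
   <!-- assumed value: 7 CPU tasks + 2 GPU tasks for this project -->
   <project_max_concurrent>9</project_max_concurrent>
</app_config>
```

The cap applies to the whole project, CPU and GPU tasks together, so the number would need adjusting if the mix of work changes.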

David S
Message 1746898 - Posted: 4 Dec 2015, 21:20:33 UTC - in response to Message 1746726.  

David,


Regarding the Beta site:

A couple of things. What CUDA version are you currently running on SETI?

I only ask because I don't know what the 440 usually runs.

The app_config will be different in CUDA for the 440 and the 630.

As for the OpenCL tasks, I would just abort them. I saw the discussion about what might be causing the problem. I would just not crunch them for now.

This entire discussion has been about my i7 with the 440. I do not run Beta on the 630.

The 440 runs cuda42 on Main. I don't remember anymore if that was what the server concluded was best before I started doing Lunatics, or if I did a bit of informal observation and picked that myself. On Beta, I just let it do what it wants, and it appears the server has settled on 50 being the best. <shrug>

________


As for Main, which computer is this app_config going into?


<app_config>
   <app>
      <name>astropulse_v7</name>
      <gpu_versions>
         <gpu_usage>0.5</gpu_usage>
         <cpu_usage>0.20</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>setiathome_v7</name>
      <gpu_versions>
         <gpu_usage>0.5</gpu_usage>
         <cpu_usage>0.20</cpu_usage>
      </gpu_versions>
   </app>
</app_config>

Without running any CPU tasks, this should allow 2 work units at a time on the GPU in both machines.

If you want to run CPU tasks on that 8-core machine, let me know and I will modify this to add a <project_max_concurrent> so the CPU can crunch as well.

I would not recommend any CPU tasks on your dual core.

Like I said above, this thread is entirely about the i7/440. (The box with the 630 does not crunch on the CPU because it has another function I consider more important.) I currently have the i7 crunching on 7 cores to keep 1 free for the GPU. This works fine for me and has for quite a while.

I downloaded the new Lunatics. I just want to know what I need to adjust while/after I install it to stay at the status quo, that being 7 cores of CPU and 2 at a time of anything from Main or Einstein on the GPU. (Actually, once I get the Beta problem straightened out, I wouldn't mind letting it do 2 at a time also, but first things first.)
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

Zalster
Message 1746902 - Posted: 4 Dec 2015, 21:44:25 UTC - in response to Message 1746898.  

You might need to increase the CPU ratio (cpu_usage) to 0.5 from the 0.2 I have listed. That way, with 2 GPU tasks running, it will keep 1 core for the GPU (2 x 0.5 = 1) and allow 7 on the CPU.

Otherwise it might try to run 8 CPU tasks.

If you have any problems, let us know.
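
The arithmetic behind that suggestion can be sketched in the config itself. This assumes BOINC adds up the cpu_usage of the running GPU tasks when deciding how many CPU tasks to start, which is how the suggestion above reads. The comments are illustrative, and actual scheduling may differ:

```xml
<app_config>
   <app>
      <name>setiathome_v7</name>
      <gpu_versions>
         <!-- 0.5 GPU per task: 2 tasks share one GPU -->
         <gpu_usage>0.5</gpu_usage>
         <!-- 0.5 CPU per task: 2 x 0.5 = 1 core budgeted, leaving 7 of 8 for CPU tasks -->
         <cpu_usage>0.5</cpu_usage>
      </gpu_versions>
   </app>
</app_config>
```

With the 0.20 value, two GPU tasks would budget only 0.4 of a core, which is why the client might still try to start 8 CPU tasks unless a separate CPU limit is already in place.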

David S
Message 1746926 - Posted: 4 Dec 2015, 23:58:26 UTC - in response to Message 1746902.  

I'll give it a try.

Asking again, will tasks in progress have a problem with having the new Lunatics installed? There are currently 7 running on the CPU and 2 on the GPU, all MBs. I have no APs on board.

Thanks for all your help.
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

Zalster
Message 1746929 - Posted: 5 Dec 2015, 0:07:47 UTC - in response to Message 1746926.  

There shouldn't be any problem with the current work in progress.

David S
Message 1746984 - Posted: 5 Dec 2015, 3:30:42 UTC

Okay, it's been running with the new Lunatics for a few hours and everything seems fine so far. It's currently running a pair of MBs on the GPU, still saying 0.04 CPUs even with the app_config saying .5. Running 7 on the CPU too, just like it should.

The test will be if it downloads a new opencl from Beta. Just have to wait and see.
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

TimeLord04
Message 1747003 - Posted: 5 Dec 2015, 7:25:11 UTC - in response to Message 1746984.  

The units that were already downloaded and being worked on BEFORE Lunatics was installed will all still say .04 CPU. However, if you had BOINC reread the config files before resuming tasks, they are actually being crunched at .5. If you are seeing 2 GPU units crunching at the same time, then your new app_config.xml is working, and so is Lunatics. Once you receive new units in your cache, you will see the appropriate numbers (which I hope are .5 CPU and .5 GPU, to yield 2 units crunching at a time on the GPU).


TL
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos

David S
Message 1747334 - Posted: 6 Dec 2015, 23:46:42 UTC - in response to Message 1747003.  

It has caught up and is now crunching 2 at .2 CPUs and .5 GPUs, and 7 on the CPU. Just as I want.

On the Beta front, it finished the cudas it had and hasn't downloaded anything else. The Beta app_config reads:

<app_config>
   <app_version>
      <app_name>setiathome_v7</app_name>
      <plan_class>cuda42</plan_class>
      <avg_ncpus>1</avg_ncpus>
      <ngpus>1</ngpus>
   </app_version>
   <app_version>
      <app_name>setiathome_v7</app_name>
      <plan_class>cuda32</plan_class>
      <avg_ncpus>1</avg_ncpus>
      <ngpus>1</ngpus>
   </app_version>
   <app_version>
      <app_name>setiathome_v7</app_name>
      <plan_class>cuda50</plan_class>
      <avg_ncpus>1</avg_ncpus>
      <ngpus>1</ngpus>
   </app_version>
</app_config>

Is that causing it not to ask for CPU tasks?
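
For comparison, the `<app_version>` form sets resource use per plan class rather than per application. A hedged sketch of how one entry might look if Beta were also to run two tasks at a time, as mentioned earlier in the thread. The 0.5 values are assumptions for illustration, not settings anyone in the thread confirmed:

```xml
<app_config>
   <app_version>
      <app_name>setiathome_v7</app_name>
      <plan_class>cuda50</plan_class>
      <!-- assumed: half a CPU core budgeted per GPU task -->
      <avg_ncpus>0.5</avg_ncpus>
      <!-- assumed: half a GPU per task, so two tasks can share one GPU -->
      <ngpus>0.5</ngpus>
   </app_version>
</app_config>
```

Each plan class the host might receive (cuda32, cuda42, cuda50 in the file above) would need its own entry along the same lines.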
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

David S
Message 1748267 - Posted: 11 Dec 2015, 0:22:09 UTC

Status update:

For some reason, it was refusing to ask Beta for work even after I suspended Main and Einstein. I disabled the app_config and then went back to the site and discovered that I had somehow unchecked CPU and all three kinds of GPU. I checked them all and temporarily bumped up the resource share, and it started asking. That's when Beta started saying no tasks available. I posted about it there and Eric did something. Then I got some CPUs and some cuda42s (odd, since it'd had 50s before). Anyway, at some point, it also downloaded some opencls for both MB and AP. Last I looked, it was running an AP opencl_nvidia_100, about half done with no problem.

I'll let you know when it gets to an MB opencl.
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

David S
Message 1748287 - Posted: 11 Dec 2015, 3:00:21 UTC

Well, actually, while I wasn't paying attention, it's done nearly 100 tasks in every configuration it's capable of, including a lot of opencls and some v8s. Looks like Claggy was right and updating my Lunatics apps for Main fixed my Beta problem.
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.