GPU Problem

Brent Norman
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1655914 - Posted: 23 Mar 2015, 3:48:36 UTC

OK start up was 35C
40 after 10 sec
41 after 20
42 after 30
48 after 60
49 after 120
51 after 240
Doesn't seem to me like an excessive spike in temp.
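
For anyone who wants to check the ramp numerically, here is a minimal sketch in Python using the readings above. It assumes the 35C startup reading is at 0 seconds, and the names `readings` and `ramp_rates` are made up for illustration.

```python
# Readings from the post above: (seconds since startup, GPU temperature in C).
# Assumption: the 35C "start up" reading is treated as t = 0.
readings = [(0, 35), (10, 40), (20, 41), (30, 42), (60, 48), (120, 49), (240, 51)]


def ramp_rates(samples):
    """Return (start_s, end_s, degrees C per second) for each consecutive pair."""
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        rates.append((t0, t1, (c1 - c0) / (t1 - t0)))
    return rates


for start, end, rate in ramp_rates(readings):
    print(f"{start:>3}s to {end:>3}s: {rate:+.3f} C/s")

# Overall rise between the first and last reading.
total_rise = readings[-1][1] - readings[0][1]
print(f"Total rise: {total_rise} C over {readings[-1][0]} s")
```

The steepest step is the first 10 seconds, and the rate falls off quickly after the first minute, which fits the impression that there is no excessive spike.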

After pulling the card apart and applying fresh paste, startup looked much the same. Then I put the cover back on and it settled at a consistent 64C. Geesh, I was at 58-59C before, so I guess I've got to try that again. Still, from what I hear, that is not too hot.

I removed my 50C limit and will see if I start to kick out errors again. My guess is yes, it will.

Keith, yeah, the 240 is not a dynamite card. I did see in my searches that some don't even have a fan on them. But oh well, it came with the computer.
Brent Norman
Message 1655920 - Posted: 23 Mar 2015, 4:45:19 UTC

BTW I do really appreciate the expert advice of the volunteer/development team.

You guys are great!

And yeah, my problems are most likely not over, so I will be asking more :D
Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1656076 - Posted: 23 Mar 2015, 17:24:38 UTC - in response to Message 1655920.  

Brent, that is typical behavior for a fresh coat of TIM (thermal interface material). It normally needs some time to "bed" in. The temperature should drop after a couple of days.

Keith
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
Brent Norman
Message 1656135 - Posted: 23 Mar 2015, 21:05:34 UTC

Unrelated, well, related in a thermal sense it is. This computer couldn't be designed worse for heat dissipation if they had actually tried.

While looking at my GPU and heat, I noticed the CPU is located so close to the card, and you guessed it ... it lines up exactly to blow heat onto the backside of the GPU chip. They couldn't have tried to get it any closer. I wish I had a spare slot there to put in a heat barrier shield. No wonder I see the GPU temp jump so much if I work the CPU hard. And CPU temps change with MB/AP; AP runs hotter for my CPU.

I seriously thought of firing up the TIG welder to put a deflector on the heat sink so it doesn't blow on the GPU, then thought a different heatsink/fan would be a whole lot easier :) I'm sure I could get a directional one, or water cooled, even better.

My case fan is sucked up so tight to the power supply intake that at best, it is just disturbing the airflow into the PS. It is definitely going to move.

Keith, yeah, I do see a little drop in temps since yesterday's thermal redo: 1 or 2C.
Brent Norman
Message 1656138 - Posted: 23 Mar 2015, 21:21:55 UTC

Well I'm still scratching my head and trying TO figure out WHY I was running 100% errors on my GPU.

Heat didn't seem to be the problem: no invalids when just letting the GPU go at its max ... whatever it wants to gobble.

I am completely confuzzed at this point of where all my errors came from!

I started this venture by changing 1 thing at a time to find this problem. Now that I know things work OK, I'm going to keep stepping back to see what broke it in the first place. Being technically oriented ... yea I just have to know the WHY!

My rating is screwed now anyways :)
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13746
Credit: 208,696,464
RAC: 304
Australia
Message 1656290 - Posted: 24 Mar 2015, 6:48:42 UTC - in response to Message 1656135.  

"While looking at my GPU and heat, I noticed the CPU is located so close to the card, and you guessed it ... it lines up exactly to blow heat onto the backside of the GPU chip. They couldn't have tried to get it any closer. I wish I had a spare slot there to put in a heat barrier shield. No wonder I see the GPU temp jump so much if I work the CPU hard."

Not because of the heat blowing on the card, but because of the overall increase in temperatures within the case. Just removing the side panel from my systems dropped the CPU temperatures by 5°C.
The temperature of the air coming off the CPU heatsink is, in most instances, considerably lower than the temperature of the back of the video card, so the air (as hot as it is) coming from the CPU actually helps cool down the video card.
Grant
Darwin NT
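
The point above can be illustrated with Newton's law of cooling, q = h*A*(T_surface - T_air): air still removes heat as long as it is cooler than the surface it passes over, just less of it as the air gets warmer. This is only a rough sketch. The temperatures and the `h_times_area` value are hypothetical placeholders, not measurements from this thread.

```python
def convective_heat_flow(surface_c, air_c, h_times_area=1.0):
    """Newton's law of cooling: watts leaving the surface, q = h*A*(T_surface - T_air).

    h_times_area is a placeholder in W/K; the real value depends on airflow
    and geometry and is not known for this case.
    """
    return h_times_area * (surface_c - air_c)


# Hypothetical temperatures, chosen only to illustrate the sign of the result.
card_back_c = 70.0    # back of the video card
cpu_exhaust_c = 45.0  # air coming off the CPU heatsink
case_air_c = 35.0     # general air inside the case

from_cpu_exhaust = convective_heat_flow(card_back_c, cpu_exhaust_c)
from_case_air = convective_heat_flow(card_back_c, case_air_c)
print(from_cpu_exhaust, from_case_air)
```

Both results are positive, so in this made-up example the CPU exhaust still cools the card, only less than cooler case air would. A higher overall case temperature shrinks both numbers.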
BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1656385 - Posted: 24 Mar 2015, 12:01:19 UTC - in response to Message 1656138.  

"Well I'm still scratching my head and trying TO figure out WHY I was running 100% errors on my GPU"

http://setiathome.berkeley.edu/forum_thread.php?id=76946&postid=1655235#1655235
"... when I was actually running at 925MHz"
 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
Brent Norman
Message 1658647 - Posted: 29 Mar 2015, 9:01:04 UTC - in response to Message 1654422.  

Josef did touch on something that I think my problem is ...

There's a perhaps related issue with that GPU, it is manufacturing false signals fairly often and so has more inconclusive and invalid results than reasonable. Some are extremely wrong, like infinite peak power for AP single pulses.

"OpenCL 1.2 AMD-APP (1268.1)" indicates Catalyst 13.9, released in September 2013 so about 1.5 years old. The GPU is about the same age, but perhaps newer drivers would help.


I have been testing and testing trying to figure out why settings that wouldn't work before are now working.

I have come to the conclusion that it is Catalyst.

I can change my clock speeds up/down and see what works. But when I bumped my voltage by 10%, I got errors every 20 sec of running. OK, fine: remove that (errors), slow the clock down (errors), go back to default (errors), try power +1, -1, 0 (errors), restart BOINC (errors), reboot ... tasks are running, and I will see what happens now.

It seems my version of Catalyst doesn't like it when you change power levels.

So in the beginning when I was chasing all these errors and trying to figure it out, I should have rebooted, which I did when I put heatsink paste on - BTW I do think that overhaul helped.

Still testing here to see which clock speeds work and which don't. It is just time consuming: I'm not putting out a lot of tasks, so I have to wait a day or 2 for wingmen before I can tell whether a change worked or not.

Then I guess I will have to update my driver/Catalyst. I just have the mindset of many: if it works, why change it? But I have now seen that part of it does NOT work.
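
Before and after a driver update, the platform string Josef quoted can be picked apart to confirm which build is actually installed. The following is a minimal sketch, assuming the string always has the shape shown in this thread. The names `parse_platform` and `KNOWN_CATALYST` are hypothetical, and the lookup table holds only the one pairing mentioned here (1268.1 with Catalyst 13.9), so it is not a complete or official list.

```python
import re

# Assumption: the platform string has the shape quoted in this thread,
# "OpenCL <major>.<minor> AMD-APP (<build>)". Other vendors format it differently.
PLATFORM_RE = re.compile(r"OpenCL (\d+)\.(\d+) AMD-APP \((\d+(?:\.\d+)*)\)")

# Only pairing stated in this thread; not a complete or official table.
KNOWN_CATALYST = {"1268.1": "13.9"}


def parse_platform(version_string):
    """Split an AMD OpenCL platform string into its parts, or return None."""
    match = PLATFORM_RE.search(version_string)
    if match is None:
        return None
    major, minor, build = match.groups()
    return {
        "opencl": (int(major), int(minor)),
        "app_build": build,
        # None when the build is not in the small table above.
        "catalyst": KNOWN_CATALYST.get(build),
    }


info = parse_platform("OpenCL 1.2 AMD-APP (1268.1)")
print(info)
```

If the build number changes after the update, the new driver is the one in use. A build missing from the table simply comes back with `catalyst` set to `None`.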

I love computers, I love computers, never know what surprise they may bring you :)
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.