the latest on release of AP_v7?
Author | Message |
---|---|
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
With AP7 we will get SIMD CPU builds released as stock as well, so the efficiency of AP7 crunching on stock should improve a lot. Nevertheless, the relative GPU crunching efficiency for AP remains considerably higher, even versus SIMD-optimized CPU builds. Actually it will even increase, because the gain from the new blanking approach is considerably bigger for GPU. All this leaves untouched, or even strengthens, the conclusions I made about effective use of computational resources here: http://lunatics.kwsn.net/2-windows/what-is-best-hardware-for-what-seti-application.0.html Use CPU and NV/iGPU hardware for MultiBeam processing, and use ATi for AstroPulse processing. It will not give the biggest RAC for the host, but it will make the host most useful to the SETI project. |
Sutaru Tsureku Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
AFAIK ... Currently APv6 is paid more per hour than SETIv7, because the stock APv6 CPU app is not optimized (it does not use CPU instruction sets), so its calculation time is very long. The stock CPU apps are the reference points for calculating the credits per project task. The »upcoming APv7« CPU apps use the CPU instruction sets (SSE, SSE2, SSE3, depending on the OS), so the project task will be calculated faster (and the reference point is also faster). Because of this, after some time there will be fewer credits per AP project task. Am I correct, or wrong? After some time the credits per hour for AP will be lower than for SETI. Am I correct, or wrong? Then everyone will want just SETI project tasks, and no one will crunch AP. ;-) Am I correct, or wrong? :o) |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14655 Credit: 200,643,578 RAC: 874 |
"AFAIK ..." It will be fascinating to wait and watch, and to see what happens and what people do. |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
"The »upcoming APv7« CPU apps use the CPU instruction sets (SSE, SSE2, SSE3, depending on the OS)." As well as faster reference points there will also be slower ones, as there are still Stock code base AP apps, and at least on Windows the Stock 7.00 app is a lot slower than the Stock 6.01 app. Claggy |
Sutaru Tsureku Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
I (or we) don't know which app Eric will choose as the reference point. For 'Mac OS X' (32-bit) only the app without CPU instruction sets is available. For other OSs, apps with CPU instruction sets are also available. Maybe the mix across all available SETI hosts (depending on which apps are used) will speed up the AP CPU reference point (I guess there are not many hosts around that will crunch with only the app without CPU instruction sets, because their CPUs are too old) ... -> fewer credits per AP task. Am I correct, or wrong? |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
You seem to have forgotten about AstroPulse v7 v7.00. Many have complained about it. Many have been aborted. On a Windows machine that used to complete the Stock AP App in around 20 hours, this App takes around 60 hours. I don't think I ever did complete one. I did complete a couple in Linux, which is much faster on the same machine; Run time: 1 days 6 hours 21 min 38 sec. I have a couple in Windows standing by; it took 20 hours to reach 30% complete. I've given up on them; they will be finished with AstroPulse v7 v7.03 (sse) or I will abort them. 60 hours on a machine that can complete the AstroPulse v7 v7.03 (sse) version in 9 hours is ridiculous. If AstroPulse v7 v7.00 is used as the base, look for credits to increase. It is clearly Much slower than the current AP Base CPU App. |
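The run times quoted in this post can be compared with a little arithmetic. The Python sketch below is only an illustration: the helper name `projected_hours` is hypothetical, and it assumes progress is linear in time, which the post does not confirm.

```python
def projected_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Project total run time from partial progress (assumes linear progress)."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


# Figures taken from the post above.
windows_v700 = projected_hours(20, 0.30)    # 20 hours to reach 30% on Windows
linux_v700 = 24 + 6 + 21 / 60 + 38 / 3600   # 1 day 6 hours 21 min 38 sec, in hours
sse_v703 = 9.0                              # v7.03 (sse) on the same machine

print(f"Windows v7.00, projected: {windows_v700:.1f} h")
print(f"Linux v7.00, measured:    {linux_v700:.1f} h")
print(f"Windows v7.00 vs v7.03 (sse): {windows_v700 / sse_v703:.1f}x slower")
```

Under the linear assumption, the projection lands in the same range as the "around 60 hours" estimate in the post.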
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
The credits for AP v7 work at Beta are roughly equal to the credits for AP v6 work here. They may be a little lower, but not by a huge amount. Jason's analysis indicates that the effective reference becomes whichever CPU app version produces APRs that most exceed the host's Whetstone benchmark. If so, the generic Windows CPU build being terribly slow won't matter long term, except for those who are running hardware which can't use the 32-bit SSE or 64-bit SSE2 version. Joe |
qbit Joined: 19 Sep 04 Posts: 630 Credit: 6,868,528 RAC: 0 |
Folks, how about the commandline for Nvidia CL? Shall we use the same one as for V6, or are there any differences? |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
I wouldn't mess with those just yet. The new builds seem pretty fast. The only improvement I saw with the command line was a decrease in CPU usage, but with a big increase in time to completion. |
qbit Joined: 19 Sep 04 Posts: 630 Credit: 6,868,528 RAC: 0 |
With my V6 commandline I don't see much improvement in speed. So you would recommend running it without a commandline? I guess I should try, but I'm afraid it could interfere with my vLHC CPU task again when CPU usage goes up. |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
NX, With v6 you should use the commandline, as it will help with the processing of the work unit. I was referring to the use of the commandline in v7 as it currently is. The new v7 apps appear to be much faster without it. Zalster |
qbit Joined: 19 Sep 04 Posts: 630 Credit: 6,868,528 RAC: 0 |
Zalster, I know, I was talking about V7 too; I have been running it since yesterday. |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
NX, If you want to maximize your GPU utilization, then running 2 APs at a time on your 750 is what you want. If you want to decrease CPU usage while maximizing your GPU, then use the command lines. If you just want to blaze through the APs without regard to CPU usage, then do 2 APs at a time on the GPU without the commandline, as they will be done quicker than with it. However, I find that it doesn't matter how fast you get the AP done, as I'm still waiting 1-2 days for my wingman to validate those results. So, to save CPU life and decrease heat, I use the commandline, thereby increasing the time to complete. I can't give you a yes or no. It depends on what your goals are. Hope this helps. Zalster |
qbit Joined: 19 Sep 04 Posts: 630 Credit: 6,868,528 RAC: 0 |
Thx Zalster, but what I really want to know is whether the commandline which I used on V6, -unroll 10 -ffa_block 8192 -ffa_block_fetch 4096 -tune 1 64 4 1 -use_sleep, is also ok for V7 or whether I should change anything. Anyway, atm I am running some tasks without a commandline to see if there is any difference in speed. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
NX, So, did you ever test 1 APv7 running at 100% against 2 APv7 running at 100%? I think you will find there isn't any advantage. With some lower end cards it is a disadvantage. There is nothing magical about running multiple instances; the whole concept was derived because the Multibeam tasks can Not be adjusted to run at 100%. The Only way you can increase the GPU load on a Multibeam task is to run Multiple instances. AstroPulses ARE NOT Multibeams. You CAN adjust a single AP instance to run at 100% using the CMDline settings, therefore running Multiple AP instances has No advantage, except with v6 Blanked tasks, where the GPU spends time waiting on the CPU to work the Blanking. AP v7 will Not have the problem with Blanking that APv6 has, so Multiple tasks with APv7 should have No Advantage at all. The reason you don't need CMDlines when running 2 instances is that running 2 instances will by itself raise the GPU load to 100%, so CMDlines are not needed to raise the load to 100% as they are with 1 task. Running the GPU at 100% is what matters. Adding tasks to a card already running at 100% will not make it run at 200%, or even 101%. What will be interesting is how many CPU cores will be needed with APv7, since Blanking is not an issue. I can run three ATI cards on two CPUs with APv7, but when I try 3 ATI cards with One CPU I see the GPU Load drop in SIV. This is similar to what I see with Unblanked APv6 tasks: 3 APs with 2 cores works...on my machines. "...should change anything." You'll have to try things while looking at programs such as GPU-Z and SIV to see how your system responds. Every system responds a little differently, so you'll have to test it yourself. |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
When I first started with those v7 AP I did test 1 vs 2 with/without commandlines. You are right, running 1 vs 2 didn't really make much of a difference. The time to complete was only 1-2 minutes different, within the +/- of average time when you doubled it. It was worse when you had the command line: time to complete went up by about 8-10 minutes for 2 APs. "AP v7 will Not have the problem with Blanking APv6 has, so, Multiple tasks with APv7 should have No Advantage at all. The reason you don't need CMDlines when running 2 instances is because running 2 instances by themselves will raise the GPU load to 100% so CMDlines are not needed to raise the load to 100% as they are with 1 task." That is what I saw as well. There were only 2 reasons for using the CMDlines that I found. 1. Decrease CPU usage and decrease heat on the CPU. 2....<CreditScrew> I have already had this conversation a few times. If you want me to talk about it I will, but we've been over this ground in several threads. Why haven't I tested that 750 again...Meltdown on #1 Cruncher.. MoBo and GPU #1 fried about 2 hours ago (wasn't the PSU like I first thought). That and Time Capsule bit the dust.. You can see my post over on Beta about that......Sorry, been busy. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
"When I first started with those v7 AP I did test 1 vs 2 with/without commandlines. You are right, running 1 vs 2 didn't really make much of a difference. The time to complete was only 1-2 minutes different, within the +/- of average time when you doubled it. It was worse when you had the command line: time to complete went up by about 8-10 minutes for 2 APs." Ouch. Well, my theory is that when you use the same CMDline setting for 2 instances as you do to raise a single task to 100%, you are overloading the system. It's just a theory, but it seems to agree with your experience. My sympathies for your loss. |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
It's ok, I just keep remembering this line, https://www.youtube.com/watch?v=wRxHYHPzs7s edit... good excuse to see what the GTX 980s can do... |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
"Zalster, I know, I was talking about V7 also, I already run it since yesterday." Well, the only place you're supposed to Run V7 is at Beta using the V7 Tasks. AP V7 Results will be different using V6 tasks and shouldn't Validate. You should Run V6 Apps with V6 tasks, not AstroPulse v7 Windows x86 rev 2690 |
Sutaru Tsureku Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
NX-01 wrote: "Thx Zalster, but what I really want to know is if the commandline which I used on V6" AFAIK from Raistmer, the 'rule of thumb' is to take the APv6 -ffa_block & -ffa_block_fetch values and divide them by 2 for APv7. Example for your APv7: -unroll 10 -ffa_block 4096 -ffa_block_fetch 2048 -tune 1 64 4 1 -use_sleep But, as always, if you would like to know and use the best/fastest params, you should make bench test runs. |
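The rule of thumb above (halve the -ffa_block and -ffa_block_fetch values when moving an APv6 command line to APv7) can be sketched in Python. The helper name `halve_ffa_params` is hypothetical and not part of any SETI@home tool; it only rewrites the text of the command line, and the result is a starting point for bench runs rather than a tuned setting.

```python
def halve_ffa_params(cmdline: str) -> str:
    """Return cmdline with the values of -ffa_block and -ffa_block_fetch halved."""
    targets = {"-ffa_block", "-ffa_block_fetch"}
    out = []
    halve_next = False  # True when the previous token was one of the targets
    for tok in cmdline.split():
        if halve_next:
            out.append(str(int(tok) // 2))
            halve_next = False
        else:
            out.append(tok)
            halve_next = tok in targets
    return " ".join(out)


# The APv6 command line quoted earlier in the thread.
v6 = "-unroll 10 -ffa_block 8192 -ffa_block_fetch 4096 -tune 1 64 4 1 -use_sleep"
print(halve_ffa_params(v6))
```

For the V6 line from the thread, this prints the same APv7 example given in the post; all other switches are passed through unchanged.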
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.