Version 3 of faster SETI cruncher for Linux
Author | Message |
---|---|
Gary Zhang Send message Joined: 19 Apr 04 Posts: 26 Credit: 32,583 RAC: 0 |
my-naparst-r3-* is built with icc-9.0.021, ipp-5.0.043, and gcc-3.4.1, and tested on an Intel Celeron M processor (1.40GHz, 1MB cache, 438.84MB memory, 486.30MB swap). Benchmarks: my-naparst-r3-no-prec <wu_cpu_time>4600.744603</wu_cpu_time>; my-naparst-r3-prec <wu_cpu_time>4902.263765</wu_cpu_time>; naparst-r3 <wu_cpu_time>4901.236921</wu_cpu_time>; my-naparst-r3-no-prec <wu_cpu_time>4603.051252</wu_cpu_time>; my-naparst-r3-prec <wu_cpu_time>4901.652858</wu_cpu_time>; naparst-r3 <wu_cpu_time>4901.964811</wu_cpu_time>. Result comparisons: ./rescmp/rescmp result_unit.sah my-naparst-r3-no-prec/result.sah: "Result: these are weakly similar." ./rescmp/rescmp result_unit.sah my-naparst-r3-prec/result.sah: "Result: these are strongly similar." ./rescmp/rescmp result_unit.sah naparst-r3/result.sah: "Result: these are strongly similar." |
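To put a number on the difference between the builds, the posted timings can be averaged per build and compared against the stock naparst-r3 run. The following is only a sketch: the helper names (`mean_cpu_times`, `percent_faster`) are made up for illustration, and the input is just the six `<wu_cpu_time>` lines from the post above.

```python
import re
from statistics import mean

# The six timings exactly as posted above.
LOG = """
my-naparst-r3-no-prec <wu_cpu_time>4600.744603</wu_cpu_time>
my-naparst-r3-prec <wu_cpu_time>4902.263765</wu_cpu_time>
naparst-r3 <wu_cpu_time>4901.236921</wu_cpu_time>
my-naparst-r3-no-prec <wu_cpu_time>4603.051252</wu_cpu_time>
my-naparst-r3-prec <wu_cpu_time>4901.652858</wu_cpu_time>
naparst-r3 <wu_cpu_time>4901.964811</wu_cpu_time>
"""


def mean_cpu_times(log):
    """Return {build name: mean wu_cpu_time in seconds} (hypothetical helper)."""
    runs = {}
    pattern = r"(\S+)\s+<wu_cpu_time>([\d.]+)</wu_cpu_time>"
    for name, seconds in re.findall(pattern, log):
        runs.setdefault(name, []).append(float(seconds))
    return {name: mean(times) for name, times in runs.items()}


def percent_faster(candidate, baseline):
    """Positive when candidate needs less CPU time than baseline."""
    return 100.0 * (baseline - candidate) / baseline


means = mean_cpu_times(LOG)
baseline = means["naparst-r3"]
for name, seconds in means.items():
    print(f"{name}: {seconds:.1f} s ({percent_faster(seconds, baseline):+.2f}% vs naparst-r3)")
```

On these numbers the no-prec build comes out about 6% faster than naparst-r3, while the prec build is essentially identical to it.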
Harold Naparst Send message Joined: 11 May 05 Posts: 236 Credit: 91,803 RAC: 0 |
Gary, So the conclusion is that the -no-prec-div flag causes a big speedup but weak similarity, as Tetsuji has claimed? Harold Naparst |
Gary Zhang Send message Joined: 19 Apr 04 Posts: 26 Credit: 32,583 RAC: 0 |
Harold, I added "-no-prec-div -no-prec-sqrt" for my-naparst-r3-no-prec, so maybe both of them cause the weak similarity. |
Harold Naparst Send message Joined: 11 May 05 Posts: 236 Credit: 91,803 RAC: 0 |
ACK!! I missed this message. Please send me this kind of stuff by email, too, in case I don't see it. Anyway, I've fixed it in the tagged source version. So if you already have a working directory based on svn://hnaparst.homelinux.com/seti_boinc/tags/naparst-r3.0, just run "cd seti_boinc" and then "svn update". I had been testing some ICC-specific code, and I forgot to comment out the variable declaration before committing the code to the repository. And I didn't run a test-compile on GCC. My fault. Harold Naparst |
Harold Naparst Send message Joined: 11 May 05 Posts: 236 Credit: 91,803 RAC: 0 |
My guess is that the -no-prec-div has a larger effect than -no-prec-sqrt. But I'm usually wrong, as we all know. But still, that's a pretty large performance effect for the prec flags. Good work finding that. Harold Naparst |
michael37 Send message Joined: 23 Jul 99 Posts: 311 Credit: 6,955,447 RAC: 0 |
At some point, Boinc developers specified that an application should use less than 64MB of RAM. I personally disagree, plus there is a Boinc bug that allows applications to use unlimited memory. Anyway, I would like to have two versions of the client -- the "boinc-compliant" low memory client for the 64MB RAM, and the "fastest reasonable" client using 256MB RAM. Why 256MB? I personally do not have computers with less than 512MB RAM. Memory is so cheap these days, especially older PC133 SDRAM and DDR266 SDRAM. Using half of the RAM for Seti is OK with me. |
Harold Naparst Send message Joined: 11 May 05 Posts: 236 Credit: 91,803 RAC: 0 |
If Hans is correct, which I suspect he is, we'll be able to produce a client that uses only 8MB cache and has no performance penalty. Hopefully soon. Harold Naparst |
Ned Slider Send message Joined: 12 Oct 01 Posts: 668 Credit: 4,375,315 RAC: 0 |
Well that would indeed be brilliant! Personally, my feeling is that a little more is tolerable, but there comes a point where it's no longer ideal. Say the core client uses approx 22MB, an 8MB cache takes it to 30MB. A 32MB cache (54MB total) would still be fine and a 64MB cache (86MB total) is probably heading towards the upper limit that would be acceptable to everyone. Obviously it's about finding a compromise between performance and memory usage, and if that can indeed be achieved with an 8MB cache then Hans and yourself definitely deserve a pint and a night off! Ned *** My Guide to Compiling Optimised BOINC and SETI Clients *** *** Download Optimised BOINC and SETI Clients for Linux Here *** |
ML1 Send message Joined: 25 Nov 01 Posts: 20326 Credit: 7,508,002 RAC: 20 |
... Do I guess right that I'd better trim that down to just: Well that little lot didn't work! This does though: <app_info> <app> <name>setiathome</name> </app> <file_info> <name>setiathome-4.07.athlon-xp-static-pc-linux-gnu</name> <executable/> </file_info> <app_version> <app_name>setiathome</app_name> <version_num>470</version_num> <file_ref> <file_name>setiathome-4.07.athlon-xp-static-pc-linux-gnu</file_name> <main_program/> </file_ref> </app_version> </app_info> Now crunching with 121MBytes after a few minutes and the memory increase is slowing. Happy crunchin', Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3) |
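Since a malformed app_info.xml is easy to produce by hand, a quick sanity check before restarting the client may save a round trip. The sketch below uses Python's standard `xml.etree.ElementTree`. The checks themselves (every `app_version` names a declared app, every `file_ref` names a declared file) are my own assumptions about what "consistent" means, not BOINC's actual validation rules, and `check_app_info` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# The working file from the post above, re-indented for readability.
APP_INFO = """<app_info>
  <app>
    <name>setiathome</name>
  </app>
  <file_info>
    <name>setiathome-4.07.athlon-xp-static-pc-linux-gnu</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome</app_name>
    <version_num>470</version_num>
    <file_ref>
      <file_name>setiathome-4.07.athlon-xp-static-pc-linux-gnu</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>"""


def check_app_info(xml_text):
    """Return a list of consistency problems (empty if none were found)."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    problems = []
    app_names = {e.text for e in root.findall("app/name")}
    files = {e.text for e in root.findall("file_info/name")}
    for version in root.findall("app_version"):
        app_name = version.findtext("app_name")
        if app_name not in app_names:
            problems.append(f"app_version names unknown app {app_name!r}")
        for ref in version.findall("file_ref/file_name"):
            if ref.text not in files:
                problems.append(f"file_ref points at undeclared file {ref.text!r}")
    return problems


print(check_app_info(APP_INFO) or "no problems found")
```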
Crunch3r Send message Joined: 15 Apr 99 Posts: 1546 Credit: 3,438,823 RAC: 0 |
Hello, I've updated all setiathome Linux clients for i686, P3, Athlon (K7) and Athlon XP, and all setiathome clients for EV5, EV6 and EV67 running Linux. All are based on Harold's 3.0 source. URL: I WANT VERY FAST LINUX CLIENTS :-) Join BOINC United now! |
Harold Naparst Send message Joined: 11 May 05 Posts: 236 Credit: 91,803 RAC: 0 |
Hello, You're going to kill me. I've released another version: R3.1. Get it at http://naparst.name This one uses only 16MB cache. It could have been only 8MB, but I couldn't bear the thought of not using any of my multi-cache code, so I left the cache size at two for old times' sake. I've finally realized that Hans was right all along about the chirp rate. It needs to be a double precision variable. Now everything is good. I don't know why I didn't listen the first time. So the program is just as fast, but it uses practically no memory. Onward and upward. Harold Naparst |
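A rough feel for why the chirp rate wants double precision: the phase of a linear chirp grows as pi * rate * t^2, so a tiny rounding error in the rate is multiplied by the square of the elapsed time. The sketch below is not taken from the SETI@home source. The chirp rate and duration are made-up illustrative values, and `to_float32` simply rounds a Python float (a double) to single precision via the standard `struct` module.

```python
import math
import struct


def to_float32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def chirp_phase(chirp_rate_hz_per_s, t_seconds):
    """Phase in radians contributed by a linear chirp: pi * rate * t^2."""
    return math.pi * chirp_rate_hz_per_s * t_seconds ** 2


# Hypothetical values, chosen only for illustration.
chirp_rate = 10.123456789  # Hz/s
duration = 100.0           # seconds

phase_double = chirp_phase(chirp_rate, duration)
phase_single = chirp_phase(to_float32(chirp_rate), duration)
phase_error = abs(phase_double - phase_single)

print(f"rate as stored in float32: {to_float32(chirp_rate)!r}")
print(f"phase error after {duration:.0f} s: {phase_error:.3e} rad")
```

The error is small in absolute terms, but it scales with the rate and with t^2, which is presumably why the single-precision version needed other workarounds.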
Crunch3r Send message Joined: 15 Apr 99 Posts: 1546 Credit: 3,438,823 RAC: 0 |
Hello, If I were in the US somewhere near you, I would throw my hands around your neck and start squeezing GRRRRR :-) It took me 6 hours to compile ... I'll do it again :-) It's 4:20 am here in Germany.. |
michael37 Send message Joined: 23 Jul 99 Posts: 311 Credit: 6,955,447 RAC: 0 |
Harold, what am I gonna do with those 2GB of RAM my servers got? Just kidding. GREAT JOB!!! |
Hans Dorn Send message Joined: 3 Apr 99 Posts: 2262 Credit: 26,448,570 RAC: 0 |
Hi Crunch3r! That's what daylight saving time is for :o) Regards Hans |
Mark Day Send message Joined: 19 Aug 02 Posts: 81 Credit: 502,830 RAC: 0 |
@Harold Nice work on 3.1 clients, 40meg RSS near end of WU @Crunch3r I picked up the K7+3dnow client, which works fine, but in the stderr.txt it puts non UTF-8 characters °°°°°°°° etc which messes with some stats gathering programs like boincstat that look at client_state.xml. Any chance of making these just **** like the other headers? M |
Ned Slider Send message Joined: 12 Oct 01 Posts: 668 Credit: 4,375,315 RAC: 0 |
Nice work Harold :) I haven't even tested 3.0 properly for you yet! - do you still want me to? I'll definitely try the 3.1 source as I still have a little old laptop struggling on with 256MB RAM. Ned |
Ned Slider Send message Joined: 12 Oct 01 Posts: 668 Credit: 4,375,315 RAC: 0 |
Harold, V3.1 compiles cleanly with GCC-4.0.1 using FFTW3 :) I've built both dynamic and static versions for Athlon XP and am benchtesting the dynamic version now for speed against V3.0 (should have the results in an hour) Regards, Ned |
Martin A. Boegelund Send message Joined: 4 Jul 00 Posts: 292 Credit: 387,485 RAC: 1 |
Man! I just started using 3.0... Looking forward to using 3.1, I hope it will cut my crunching times, since the big cache clients are making my system use lots of swap. KBoincspy is reporting a high estimated crunching time, 3+ hrs; it used to be around 2 hrs 15-30 mins with clients 2.6 - 2.8. I'll tell you more when I finish my current work units. "Are you suggesting coconuts migrate?" |
Ned Slider Send message Joined: 12 Oct 01 Posts: 668 Credit: 4,375,315 RAC: 0 |
Here are some benchmark times on an Athlon XP for V3.0 versus V3.1: V3.0 (large cache): Real 62:42, User 62:31, Sys 0:05. V3.1 (small cache): Real 61:30, User 61:20, Sys 0:05. Excellent result! The small cache reduces the total memory used from approx 360MB down to 44MB and actually gives a very small increase in performance (probably within experimental error though). These benchmarks were on dynamic builds on a system with 512MB RAM (200MHz FSB, 5-2-2-2). So Hans and Harold have succeeded in their brief to produce a client which uses less memory but retains the full speed improvement - truly amazing! Well done, Ned |
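For anyone wanting the V3.0 versus V3.1 difference as a percentage, here is a small sketch that converts the posted mm:ss times to seconds and compares them. The function names are made up for illustration. The only inputs are the figures from the post above.

```python
def to_seconds(mm_ss):
    """Convert a 'minutes:seconds' string such as '62:42' to whole seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


def percent_reduction(old, new):
    """How much smaller new is than old, as a percentage of old."""
    return 100.0 * (old - new) / old


# Figures as posted above.
TIMES = {
    "V3.0": {"real": "62:42", "user": "62:31"},
    "V3.1": {"real": "61:30", "user": "61:20"},
}
MEMORY_MB = {"V3.0": 360, "V3.1": 44}

for kind in ("real", "user"):
    old = to_seconds(TIMES["V3.0"][kind])
    new = to_seconds(TIMES["V3.1"][kind])
    print(f"{kind} time: {old} s -> {new} s ({percent_reduction(old, new):.2f}% less)")

mem_saving = percent_reduction(MEMORY_MB["V3.0"], MEMORY_MB["V3.1"])
print(f"memory: {mem_saving:.1f}% less")
```

The time difference works out to a little under 2%, in line with the "probably within experimental error" caveat, while the memory drop is close to 88%.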
Harold Naparst Send message Joined: 11 May 05 Posts: 236 Credit: 91,803 RAC: 0 |
More discomfort. As for the source code, there are really four ideas that lead to performance: 1) Using IPP's FFT routine, due to Tetsuji. 2) Caching trig calculations, due to Hans Dorn. 3) Caching results of invert_lcgf, due to Hans Dorn. 4) Using double precision for the chirp rate, due to Hans Dorn. My name is nowhere to be found in this list. Although Alex Kan and I made some contributions at one point, those contributions don't contribute to the speedup at the present moment. Anyway, glory to the aliens, not to us. Harold Naparst |
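As a rough illustration of idea 2 (caching trig calculations), the sketch below precomputes the cos/sin table for a chirp rate once and reuses it on later calls. It is not the actual implementation from the source tree. The function names, the `maxsize=2` cache and the sample values are all hypothetical.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=2)  # keep tables for at most two chirp rates (illustrative)
def chirp_table(chirp_rate, n_samples, sample_rate):
    """Precompute (cos, sin) of the chirp phase for every sample index."""
    table = []
    for i in range(n_samples):
        t = i / sample_rate
        phase = math.pi * chirp_rate * t * t
        table.append((math.cos(phase), math.sin(phase)))
    return tuple(table)


def dechirp(samples, chirp_rate, sample_rate):
    """Rotate complex samples by the negative chirp phase using the cached table."""
    table = chirp_table(chirp_rate, len(samples), sample_rate)
    return [s * complex(c, -sn) for s, (c, sn) in zip(samples, table)]


samples = [1 + 0j] * 8
first = dechirp(samples, 0.5, 4.0)   # computes the trig table
second = dechirp(samples, 0.5, 4.0)  # reuses it, no trig calls
print(chirp_table.cache_info())
```

The trade-off is the one discussed throughout this thread: each cached table costs memory proportional to the number of samples, so the cache size has to be bounded.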