Version 3 of faster SETI cruncher for Linux

Message boards : Number crunching : Version 3 of faster SETI cruncher for Linux
Harold Naparst
Volunteer tester

Joined: 11 May 05
Posts: 236
Credit: 91,803
RAC: 0
Sweden
Message 183014 - Posted: 27 Oct 2005, 23:06:24 UTC - in response to Message 183007.  


Another possibility using the linking methodology would be to use absolutely identical site structures (as we're already very similar anyway) and have links in the left-hand navigation panel for "AMD Clients", "Intel Clients", "Alpha Clients" etc. that point directly to each other's sites for those pages, rather than to locally mirrored pages. So, for example, if a visitor to my site clicked on the "Alpha Clients" navigation panel link, they'd jump straight to Crunch3r's site without even knowing it, and if they then clicked on the Intel navigation panel link they'd jump on to Harold's site. That way each person only needs to maintain the page for their own individual clients, and the rest of the site layout is just an exact copy on each server, with no mirroring needed.


Ahh. Hmm. Don't know what to say here. How about a change of topic?

What would you think would be an appropriate maximum amount of memory
for the caches to use? Although Hans might disagree, (and I might be wrong)
my explorations seem to show that it is important to have more than 8MB cache. The chances of finding cached data are much higher if the cache is larger. Would you think that a cache of 160 MB is OK? Or is that still too large?



Harold Naparst
ID: 183014
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 20323
Credit: 7,508,002
RAC: 20
United Kingdom
Message 183025 - Posted: 27 Oct 2005, 23:27:19 UTC - in response to Message 183014.  
Last modified: 27 Oct 2005, 23:29:55 UTC

... Ahh. Hmm. Don't know what to say here. How about a change of topic?

What would you think would be an appropriate maximum amount of memory
for the caches to use? Although Hans might disagree, (and I might be wrong)
my explorations seem to show that it is important to have more than 8MB cache. The chances of finding cached data are much higher if the cache is larger. Would you think that a cache of 160 MB is OK? Or is that still too large?

I've mentioned elsewhere that anything up to 256MB total resources for boinc is acceptable for this system (mine). Those with less memory are going to be squealing a little though so it comes down to a case of using the minimum sized cache that is essential for the speedup. Another 100MB for a 0.1% improvement can be forgone. Or is trig cache size vs speedup a linear relation?

Also, the idea of precalculating the trig tables looks good. Such a scheme would work well with the memory (look-ahead) prefetch on the CPU.
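To make the precalculation idea concrete, here is a minimal Python sketch. It is not how the SETI client does it: the table size of 1024 and the names `build_trig_table` and `sincos` are assumptions picked purely for illustration. The point is only that a table filled once and then read sequentially is the kind of access pattern a hardware prefetcher tends to handle well.

```python
import math

def build_trig_table(n):
    """Precompute (sin, cos) for n evenly spaced angles over one full turn."""
    step = 2.0 * math.pi / n
    return [(math.sin(k * step), math.cos(k * step)) for k in range(n)]

# Hypothetical table size; a real client would size this from its own needs.
TABLE = build_trig_table(1024)

def sincos(k):
    """Look up sin and cos of the angle 2*pi*k/len(TABLE) without recomputing them."""
    return TABLE[k % len(TABLE)]
```

The memory cost is fixed up front, which is the trade-off against a cache that grows with the working set.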

Regards,
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 183025
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 20323
Credit: 7,508,002
RAC: 20
United Kingdom
Message 183034 - Posted: 27 Oct 2005, 23:41:42 UTC - in response to Message 182897.  
Last modified: 27 Oct 2005, 23:45:46 UTC

... just built a static version for Athlon XP:

http://www.pperry.f2s.com/files/setiathome-ned-v3.0.tar.gz

It's experimental - I've NOT tested it - so use at your own risk. I'm not 100% sure it's totally static, but it runs on my boxes OK on the very quick 30 second test I did. ...

PS - Please could you let me know if it spawns 3 setiathome processes like Crunch3r's client does, or just the one - thanks.

Thanks.

Downloaded and unpacked with "tar xvzf". The naming is OK: it exactly replaces one of the existing versions there.

My app_info file is:

.boinc/projects/setiathome.berkeley.edu/app_info.xml
<app_info>
    <app>
        <name>setiathome</name>
    </app>
    <file_info>
        <name>setiathome_4.07.3b_athlon-xp-fftw3-static-pc-linux-gnu</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>setiathome</app_name>
        <version_num>407</version_num>
        <file_ref>
            <file_name>setiathome_4.07.3b_athlon-xp-fftw3-static-pc-linux-gnu</file_name>
            <main_program/>
        </file_ref>
    </app_version>
    <app>
        <name>setiathome</name>
    </app>
    <file_info>
        <name>setiathome_4.07.3-athlon-xp-fftw3-static-pc-linux-gnu</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>setiathome</app_name>
        <version_num>470</version_num>
        <file_ref>
            <file_name>setiathome_4.07.3b_athlon-xp-fftw3-static-pc-linux-gnu</file_name>
            <main_program/>
        </file_ref>
    </app_version>
    <app>
        <name>setiathome</name>
    </app>
    <file_info>
        <name>setiathome_4.02_i686-pc-linux-gnu</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>setiathome</app_name>
        <version_num>402</version_num>
        <file_ref>
            <file_name>setiathome_4.07.3b_athlon-xp-fftw3-static-pc-linux-gnu</file_name>
            <main_program/>
        </file_ref>
    </app_version>
</app_info>


Do I guess right that I'd better trim that down to just:

.boinc/projects/setiathome.berkeley.edu/app_info.xml
<app_info>
    <app>
        <name>setiathome</name>
    </app>
    <file_info>
        <name>setiathome-4.07.athlon-xp-static-pc-linux-gnu</name>
        <executable/>
    </file_info>
</app_info>


Or must there be at least one app_version in there?
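A quick sanity check along these lines might help before restarting BOINC. This is only a Python sketch: `check_app_info` is a made-up helper, not part of BOINC, and the rule it applies (at least one `<app_version>`, and every `<file_ref>` naming a declared `<file_info>`) is an assumption about what the anonymous-platform client expects, so verify it against the BOINC documentation.

```python
import xml.etree.ElementTree as ET

# The trimmed-down file from the post, copied as-is.
TRIMMED = """
<app_info>
    <app>
        <name>setiathome</name>
    </app>
    <file_info>
        <name>setiathome-4.07.athlon-xp-static-pc-linux-gnu</name>
        <executable/>
    </file_info>
</app_info>
"""

def check_app_info(xml_text):
    """Return a list of human-readable problems found in an app_info.xml string."""
    root = ET.fromstring(xml_text)
    problems = []

    # Names of all executables the file declares.
    declared = {(fi.findtext("name") or "").strip() for fi in root.findall("file_info")}

    versions = root.findall("app_version")
    if not versions:
        # Assumption: without this block nothing ties a version number to an executable.
        problems.append("no <app_version> block found")

    # Every file an app_version points at should have been declared above.
    referenced = set()
    for av in versions:
        for ref in av.findall("file_ref"):
            name = (ref.findtext("file_name") or "").strip()
            referenced.add(name)
            if name not in declared:
                problems.append(f"<file_ref> names undeclared file: {name}")

    # Declared but unused executables are harmless clutter, but worth flagging.
    for name in sorted(declared - referenced):
        problems.append(f"<file_info> never referenced: {name}")
    return problems

if __name__ == "__main__":
    for problem in check_app_info(TRIMMED):
        print(problem)
```

Run against the trimmed file, the check flags the missing `<app_version>` and the unreferenced executable. Run against the longer file, it would flag the two `<file_info>` entries that no `<file_ref>` points to.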

(And "[code]" doesn't seem to do much for formatting that lot!)

And I'm on "[SETI@home] Deferring communication with project" until high noon later today. BOINC is misreporting the memory on this machine, which ironically can be worked around by starting BOINC early during bootup. Or must I upgrade to BOINC v5.2 now?

I'll let you know how many processes show when it runs.

Regards,
Martin
ID: 183034
Harold Naparst
Volunteer tester

Joined: 11 May 05
Posts: 236
Credit: 91,803
RAC: 0
Sweden
Message 183091 - Posted: 28 Oct 2005, 1:47:49 UTC - in response to Message 183025.  


I've mentioned elsewhere that anything up to 256MB total resources for boinc is acceptable for this system (mine). Those with less memory are going to be squealing a little though so it comes down to a case of using the minimum sized cache that is essential for the speedup. Another 100MB for a 0.1% improvement can be forgone. Or is trig cache size vs speedup a linear relation?


Well, this is the big question I'm playing with now.
The key metric is the number of times you can call the routine without
recalculating the trig values. Suppose you have an expensive routine,
and you cache the results to avoid recalculation. If you can avoid
calculation 95% of the time, then you save 95% of the cpu time devoted
to this routine. Another way of looking at it is the total number of
times through the routine divided by the total number of times you have
to do the whole calculation. If this number is 20, for example,
then you avoid calculation 19 times out of 20, which is pretty good.

What I've found out so far is that with a cache size of 160MB, which is
half the size of naparst-r3.0, the above ratio is about 400.
But with a cache size of 8MB (which is what Hans had originally), the
ratio is about 2, so you end up recalculating about half the time.
Hence, you only save half the expense.

So there will be a happy medium at some point between 8MB and 160MB.
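The arithmetic above can be written down in a few lines of Python. The function name `fraction_saved` is just an illustrative choice, and the only inputs are the three ratios already quoted in this post (20, 2 and 400). Nothing here is measured.

```python
def fraction_saved(calls_per_recalc):
    """Fraction of full trig calculations avoided.

    calls_per_recalc is the total number of calls to the routine divided by
    the number of times the whole calculation actually had to be done.
    """
    if calls_per_recalc < 1:
        raise ValueError("ratio must be at least 1")
    return 1.0 - 1.0 / calls_per_recalc

for label, ratio in [("worked example", 20), ("8MB cache", 2), ("160MB cache", 400)]:
    print(f"{label}: ratio {ratio} -> {fraction_saved(ratio):.2%} of calculations avoided")
```

A ratio of 2 avoids 50% of the work, 20 avoids 95%, and 400 avoids 99.75%. So most of the gain arrives well before the ratio reaches 400, which is why a happy medium below 160MB seems plausible. The ratios at intermediate cache sizes still have to be measured.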

Harold Naparst
ID: 183091
Brian Silvers

Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 183110 - Posted: 28 Oct 2005, 2:18:39 UTC - in response to Message 182869.  

Needless to say, I'm not going to gut my Windows machine for Linux


You don't really have to do that... You can use VMWare...
One suggestion, make sure you have plenty of RAM. It is RAM intensive...
My guess is though that using VMWare would offset any gain from the
optimization, assuming that there would be one. A better use could be
learning about various OSes without having to actually install them.
They have just recently released a "Player" that is beta, but looks to
be interesting if you don't want to pay for the full product. You could
then load up pre-configured images and try anything from Mac to Suse to
Solaris...

http://www.vmware.com



ID: 183110
JohnB175
Volunteer tester

Joined: 15 Oct 03
Posts: 124
Credit: 321,769
RAC: 0
United States
Message 183134 - Posted: 28 Oct 2005, 3:46:36 UTC

ok here are my numbers against the reference workunit for the various clients:
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
harold 3.0
setiathome_SSE-naparst-r3.0

real 240m50.496s
user 238m38.940s
sys 0m50.760s
wu_cpu_time=15319.340000
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
harold 2.8
setiathome_SSE-naparst-r2.8

real 272m52.589s
user 262m51.860s
sys 2m46.550s
wu_cpu_time=16773.180000
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
crunch3r
setiathome-4.7.pentium3_sse_fftw3_V2.02s_cache-pc-linux-gnu

real 285m27.216s
user 277m20.010s
sys 3m11.880s
wu_cpu_time=17678.820000
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
tmr
setiathome-4.07-ipp.i686-p3-linux-gnu

real 355m50.345s
user 350m57.890s
sys 1m12.430s
wu_cpu_time=22130.310000
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
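For anyone who wants the relative figures, here is a small Python sketch that compares the wu_cpu_time values above. The numbers are copied straight from this post. The short client labels and the helper name `relative_to_fastest` are made up for the example, and the units are whatever the client reports (assumed to be seconds).

```python
# wu_cpu_time values copied from the results above (units assumed to be seconds).
CPU_TIME = {
    "naparst-r3.0": 15319.34,
    "naparst-r2.8": 16773.18,
    "crunch3r V2.02s": 17678.82,
    "tmr ipp": 22130.31,
}

def relative_to_fastest(times):
    """Map each client to its CPU time divided by the fastest client's CPU time."""
    best = min(times.values())
    return {name: t / best for name, t in times.items()}

# Print the clients from fastest to slowest with their slowdown factor.
for name, factor in sorted(relative_to_fastest(CPU_TIME).items(), key=lambda kv: kv[1]):
    print(f"{name:18s} x{factor:.3f}")
```

On this one reference workunit, r2.8 takes roughly 9.5% more CPU time than r3.0, Crunch3r's build about 15% more, and the tmr build about 44% more. One workunit on one machine is a small sample, so treat these as indicative only.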
ID: 183134
Ned Slider

Joined: 12 Oct 01
Posts: 668
Credit: 4,375,315
RAC: 0
United Kingdom
Message 183166 - Posted: 28 Oct 2005, 6:27:44 UTC - in response to Message 183014.  


What would you think would be an appropriate maximum amount of memory
for the caches to use? Although Hans might disagree, (and I might be wrong)
my explorations seem to show that it is important to have more than 8MB cache. The chances of finding cached data are much higher if the cache is larger. Would you think that a cache of 160 MB is OK? Or is that still too large?


I would say the key objective for memory reduction would be to allow the client to run on a system with 256MB ram without swapping, so a maximum of about 160MB (a figure already mentioned) seems about right, leaving 96MB for the system, although I guess it would be nice if this could be reduced further.

Using your ratio of hits/misses analogy above, I guess you need to do some analysis to find the sweet spot.

Ned

*** My Guide to Compiling Optimised BOINC and SETI Clients ***
*** Download Optimised BOINC and SETI Clients for Linux Here ***
ID: 183166
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 20323
Credit: 7,508,002
RAC: 20
United Kingdom
Message 183213 - Posted: 28 Oct 2005, 10:40:51 UTC - in response to Message 183034.  
Last modified: 28 Oct 2005, 10:41:32 UTC

... Do I guess right that I'd better trim that down to just:

.boinc/projects/setiathome.berkeley.edu/app_info.xml
<app_info>
    <app>
        <name>setiathome</name>
    </app>
    <file_info>
        <name>setiathome-4.07.athlon-xp-static-pc-linux-gnu</name>
        <executable/>
    </file_info>
</app_info>


Or must there be at least one app_version in there?

(And "[code]" doesn't seem to do much for formatting that lot!)

Any "app_info.xml" experts out there?

Is a "<app_version>" required also?


(Reboot soon to kludge kick-starting boinc :-/ )

Thanks,
Martin
ID: 183213
Profile spacemeat
Joined: 4 Oct 99
Posts: 239
Credit: 8,425,288
RAC: 0
United States
Message 183250 - Posted: 28 Oct 2005, 13:44:07 UTC - in response to Message 183166.  

@spacemeat --> Portage has icc-9 and gcc-4.0.2, but they don't have
IPP. Anyway, that shouldn't stop you. It is a very simple matter
to install them and compile with the appropriate flags. The instructions
are on my web site: http://naparst.name/sources.htm
Have fun, and let us know if GCC/FFTW is faster than ICC/IPP.


how could i forget to check experimental ebuilds...?

i'll give it a shot on my (somewhat) faster P3 and see how it compares. i might even give it a shot on K6 which is usually not compatible with i686. maybe even the sparc if i'm crazy...
ID: 183250
Harold Naparst
Volunteer tester

Joined: 11 May 05
Posts: 236
Credit: 91,803
RAC: 0
Sweden
Message 183258 - Posted: 28 Oct 2005, 14:00:28 UTC - in response to Message 183250.  
Last modified: 28 Oct 2005, 14:01:50 UTC

@spacemeat --> Portage has icc-9 and gcc-4.0.2, but they don't have
IPP. Anyway, that shouldn't stop you. It is a very simple matter
to install them and compile with the appropriate flags. The instructions
are on my web site: http://naparst.name/sources.htm
Have fun, and let us know if GCC/FFTW is faster than ICC/IPP.


how could i forget to check experimental ebuilds...?

i'll give it a shot on my (somewhat) faster P3 and see how it compares. i might even give it a shot on K6 which is usually not compatible with i686. maybe even the sparc if i'm crazy...


Warning: It is somewhat difficult to install IPP under Gentoo.
You will have to be resourceful. I recommend becoming familiar
with rpm2cpio and cpio. If you want things to go very smoothly,
install CentOS 3.5 on a designated development machine.

Harold Naparst
ID: 183258
Hans Dorn
Volunteer developer
Volunteer tester
Joined: 3 Apr 99
Posts: 2262
Credit: 26,448,570
RAC: 0
Germany
Message 183285 - Posted: 28 Oct 2005, 15:46:58 UTC - in response to Message 183258.  


Warning: It is somewhat difficult to install IPP under Gentoo.
You will have to be resourceful. I recommend becoming familiar
with rpm2cpio and cpio. If you want things to go very smoothly,
install CentOS 3.5 on a designated development machine.


The easiest way I found was to install "rpm" to get all the
required rpm tools, and then to move /usr/bin/rpm to /usr/bin/hidden_rpm.

By doing so, the installer finds rpm2cpio, but doesn't try to use "rpm" directly.

Regards Hans


ID: 183285
Profile Rush
Volunteer tester
Joined: 3 Apr 99
Posts: 3131
Credit: 302,569
RAC: 0
United Kingdom
Message 183302 - Posted: 28 Oct 2005, 16:43:39 UTC - in response to Message 182879.  

I just had a look at the Cygwin site,
but it looks like you have to compile everything from source, including libraries.
This makes it impossible to use Intel's compiler, if I'm not wrong.


Thanks, Hans, I didn't think so. I hadn't looked into Linux for a long time, so I had hoped things had changed.
ID: 183302
Profile Rush
Volunteer tester
Joined: 3 Apr 99
Posts: 3131
Credit: 302,569
RAC: 0
United Kingdom
Message 183303 - Posted: 28 Oct 2005, 16:46:48 UTC - in response to Message 183110.  

A better use could be learning about various OSes without having to actually install them. They have just recently released a "Player" that is beta, but looks to be interesting if you don't want to pay for the full product. You could then load up pre-configured images and try anything from Mac to Suse to Solaris...


Thanks as well, Brian. I'll check it out. 8^]
ID: 183303
Profile spacemeat
Joined: 4 Oct 99
Posts: 239
Credit: 8,425,288
RAC: 0
United States
Message 183312 - Posted: 28 Oct 2005, 17:24:27 UTC - in response to Message 183258.  


Warning: It is somewhat difficult to install IPP under Gentoo.
You will have to be resourceful. I recommend becoming familiar
with rpm2cpio and cpio. If you want things to go very smoothly,
install CentOS 3.5 on a designated development machine.


first thing i want to do is upgrade GCC and see what the difference is between i586/i686/SSE versions of caching FFTW3 seti clients compiled and linked on the same pc. that will take more than a weekend on the hardware i have available.
ID: 183312
Harold Naparst
Volunteer tester

Joined: 11 May 05
Posts: 236
Credit: 91,803
RAC: 0
Sweden
Message 183355 - Posted: 28 Oct 2005, 20:17:55 UTC - in response to Message 183312.  


first thing i want to do is upgrade GCC and see what the difference is between i586/i686/SSE versions of caching FFTW3 seti clients compiled and linked on the same pc. that will take more than a weekend on the hardware i have available.


Welcome to the team!! We need all the help we can get.
Harold Naparst
ID: 183355
Profile spacemeat
Joined: 4 Oct 99
Posts: 239
Credit: 8,425,288
RAC: 0
United States
Message 183368 - Posted: 28 Oct 2005, 21:09:42 UTC - in response to Message 183355.  


Welcome to the team!! We need all the help we can get.


thanks, but it's not quite my first visit. i had put a little effort into Sparc/Linux and Alpha/BSD clients a few months ago, but i'm really a hardware guy. much thanks to Ned, Paolo, et al. for their help back then, but i did not have much luck with the old source.

there is a chance a new gentoo bug can be made for these caching sources. the 'sse' flag in USE sets both --enable-float and --enable-sse in the fftw ebuild so a 'fftw' USE flag in a seti-cache ebuild would make easily buildable linked clients for gentoo users.
ID: 183368
Harold Naparst
Volunteer tester

Joined: 11 May 05
Posts: 236
Credit: 91,803
RAC: 0
Sweden
Message 183386 - Posted: 28 Oct 2005, 22:10:18 UTC - in response to Message 183368.  


there is a chance a new gentoo bug can be made for these caching sources. the 'sse' flag in USE sets both --enable-float and --enable-sse in the fftw ebuild so a 'fftw' USE flag in a seti-cache ebuild would make easily buildable linked clients for gentoo users.


Go for it. A seti-cache ebuild. I love it. That's going to be hard,
though, because it depends on boinc, ipp, icc, fftw, and acml.

Also, for amd64, I can't get fftw to use sse at all.
I've submitted a bugzilla report on Gentoo for this.
Harold Naparst
ID: 183386
Profile Crunch3r
Volunteer tester
Joined: 15 Apr 99
Posts: 1546
Credit: 3,438,823
RAC: 0
Germany
Message 183391 - Posted: 28 Oct 2005, 22:27:24 UTC - in response to Message 183386.  


Also, for amd64, I can't get fftw to use sse at all.
I've submitted a bugzilla report on Gentoo for this.


That's not a problem with gentoo. It's in the fftw source.
You have to modify sse.c by hand to get it working.
(I've already posted this in the Version 2.8 thread)



Join BOINC United now!
ID: 183391
Profile spacemeat
Joined: 4 Oct 99
Posts: 239
Credit: 8,425,288
RAC: 0
United States
Message 183449 - Posted: 29 Oct 2005, 1:49:18 UTC

harold, is your boinc source altered for ipp? i am seeing a call for mathimf.h when gcc gets to the benchmark code. the one thing that has frustrated me about seti is that the boinc source has to be built first. i am suddenly remembering what i went through before
ID: 183449
Harold Naparst
Volunteer tester

Joined: 11 May 05
Posts: 236
Credit: 91,803
RAC: 0
Sweden
Message 183478 - Posted: 29 Oct 2005, 3:28:37 UTC - in response to Message 183449.  
Last modified: 29 Oct 2005, 3:29:18 UTC

harold, is your boinc source altered for ipp? i am seeing a call for mathimf.h when gcc gets to the benchmark code.


Yes. My feeling is that the benchmark should do math the same way that the
SETI cruncher does math. If SETI uses libimf.a, then so should the boinc
benchmark, or else the benchmark is not reflective of the way that the
computation will be done. So I modified the benchmark to use IPP since
the SETI cruncher does too.

However, past this point you get into moral issues. For instance,
the dhrystone benchmark is specifically designed to be split into two files,
dhrystone.C and dhrystone2.C. It is strictly forbidden to combine the
two files, because most compilers have a hard time with optimization across
object files. ICC, however, has excellent interprocedural optimization. So,
if you use ICC, there is no benefit to combining source files.

This raises the moral question: Is it OK to combine the source files
in the benchmark if your cruncher uses ICC?

I'm not good at answering these moral questions, so I provided two
boinc clients on my website. One is "moral", and the other "immoral."

More than you wanted to know, I'm sure.

Harold Naparst
ID: 183478


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.