Version 3 of faster SETI cruncher for Linux

Message boards : Number crunching : Version 3 of faster SETI cruncher for Linux
Harold Naparst
Volunteer tester

Joined: 11 May 05
Posts: 236
Credit: 91,803
RAC: 0
Sweden
Message 182850 - Posted: 27 Oct 2005, 15:44:05 UTC

As a service to Hans Dorn,
I've released another version of the SETI cruncher
for Pentium 3 and 4. This is version 3.0,
and it is available at http://naparst.name

This version caches values from invert_lcgf and gets
about another 10% speedup. The program runs in 32
minutes on the reference work unit on my computer.

At this point, all of the speedup is due to Hans' two
caches. However, this version still includes my
multiple cache for trig functions, which consumes a
lot of memory and I'll try to remove soon if necessary.


Harold Naparst
ID: 182850
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 20289
Credit: 7,508,002
RAC: 20
United Kingdom
Message 182858 - Posted: 27 Oct 2005, 16:11:26 UTC - in response to Message 182850.  
Last modified: 27 Oct 2005, 16:12:05 UTC

... This is version 3.0,
and it is available at http://naparst.name

... about another 10% speedup. ... due to Hans' two
caches. However, this version still includes my
multiple cache for trig functions, which consumes a
lot of memory and I'll try to remove soon if necessary.

I like the "another 10% speedup" bit :-)

For the "consumes a lot of memory" bit, can that be kept to below 256MBytes say without adversely affecting the optimisations?

And could the cross-fertilisation permeate over to Ned for an AthlonXP version please? ;-)

Very Good Stuff,

Regards,
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 182858
pindakoe

Joined: 4 Jun 00
Posts: 60
Credit: 345,676
RAC: 0
Netherlands
Message 182865 - Posted: 27 Oct 2005, 16:30:51 UTC - in response to Message 182858.  


I like the "another 10% speedup" bit :-)

For the "consumes a lot of memory" bit, can that be kept to below 256MBytes say without adversely affecting the optimisations?

And could the cross-fertilisation permeate over to Ned for an AthlonXP version please? ;-)

Very Good Stuff,

Regards,
Martin


Hear, hear! Just installed 2.75 on my Athlon XP and that shaved off 15-25% versus Ned's 4.07.03b. I would be very interested to see that reduced further ...

Congrats to Hans & Harold.

ID: 182865
Profile Rush
Volunteer tester
Joined: 3 Apr 99
Posts: 3131
Credit: 302,569
RAC: 0
United Kingdom
Message 182869 - Posted: 27 Oct 2005, 16:38:48 UTC - in response to Message 182850.  

As a service to Hans Dorn, I've released another version of the SETI cruncher for Pentium 3 and 4. This is version 3.0, and it is available at http://naparst.name


Hey Folks,

Not knowing much about Linux, is it possible to run these on a Windows machine with some sort of emulator, in a DOS window, et cetera? I already use the two optimized versions from Maverick, but if these are better...?

Needless to say, I'm not going to gut my Windows machine for Linux, but I am interested in helping SETI.

Thanks! 8^]
ID: 182869
Ned Slider

Joined: 12 Oct 01
Posts: 668
Credit: 4,375,315
RAC: 0
United Kingdom
Message 182875 - Posted: 27 Oct 2005, 17:13:09 UTC - in response to Message 182858.  
Last modified: 27 Oct 2005, 17:14:37 UTC

... This is version 3.0,
and it is available at http://naparst.name

... about another 10% speedup. ... due to Hans' two
caches. However, this version still includes my
multiple cache for trig functions, which consumes a
lot of memory and I'll try to remove soon if necessary.

I like the "another 10% speedup" bit :-)

For the "consumes a lot of memory" bit, can that be kept to below 256MBytes say without adversely affecting the optimisations?

And could the cross-fertilisation permeate over to Ned for an AthlonXP version please? ;-)

Very Good Stuff,

Regards,
Martin


^^Yes. I have a dynamic build that I'm testing at the moment, and I know Crunch3r has also built this latest version. Crunch3r has had more success building static versions than I have, so I'll wait to see if he is going to release a static version. I'm also still investigating the weak similarity I seem to be getting from my clients.

Eventually I would like to host AMD versions on my site, but at the moment new versions are coming out so quickly that I'd never have time to keep things up to date. What would be really nice is if we could organise a community website to host these community-developed clients, with a community web manager who has time to keep things updated - sort of a one-stop shop for optimized clients for all platforms (then lazy old me could just e-mail my latest builds to the community web manager, who'd post the clients and update the website).

@Harold: So this is basically v2.85 plus Hans' latest caching (findpulse.cpp) patch in terms of the source (I'd provisionally numbered it as v2.9 for my build :D)?

Ned


*** My Guide to Compiling Optimised BOINC and SETI Clients ***
*** Download Optimised BOINC and SETI Clients for Linux Here ***
ID: 182875
Hans Dorn
Volunteer developer
Volunteer tester
Joined: 3 Apr 99
Posts: 2262
Credit: 26,448,570
RAC: 0
Germany
Message 182879 - Posted: 27 Oct 2005, 17:19:41 UTC - in response to Message 182869.  
Last modified: 27 Oct 2005, 17:20:41 UTC

As a service to Hans Dorn, I've released another version of the SETI cruncher for Pentium 3 and 4. This is version 3.0, and it is available at http://naparst.name


Hey Folks,

Not knowing much about Linux, is it possible to run these on a Windows machine with some sort of emulator, in a DOS window, et cetera? I already use the two optimized versions from Maverick, but if these are better...?

Needless to say, I'm not going to gut my Windows machine for Linux, but I am interested in helping SETI.

Thanks! 8^]


Hi,

I just had a look at the Cygwin site,
but it looks like you have to compile everything from source, including libraries.
This makes it impossible to use Intel's compiler, if I'm not wrong.

Regards Hans
ID: 182879
Ned Slider

Joined: 12 Oct 01
Posts: 668
Credit: 4,375,315
RAC: 0
United Kingdom
Message 182897 - Posted: 27 Oct 2005, 18:35:38 UTC - in response to Message 182858.  
Last modified: 27 Oct 2005, 18:37:58 UTC



And could the cross-fertilisation permeate over to Ned for an AthlonXP version please? ;-)

Very Good Stuff,

Regards,
Martin


Here you go Martin - I just built a static version for Athlon XP:

http://www.pperry.f2s.com/files/setiathome-ned-v3.0.tar.gz

It's experimental - I've NOT tested it - so use at your own risk. I'm not 100% sure it's totally static, but it runs on my boxes OK on the very quick 30 second test I did.

Ned


PS - Please could you let me know if it spawns 3 setiathome processes like Crunch3r's client does, or just the one - thanks.
*** My Guide to Compiling Optimised BOINC and SETI Clients ***
*** Download Optimised BOINC and SETI Clients for Linux Here ***
ID: 182897
Chris Bosshard

Joined: 5 Jun 99
Posts: 86
Credit: 3,474,583
RAC: 0
Switzerland
Message 182940 - Posted: 27 Oct 2005, 20:16:52 UTC

Thank you Ned. :-)

I will give your creation a try. So far I can only see one setiathome process.
Running on Fedora Core 4 BTW.


Chris Bosshard
Visit my homepage
astroinfo SETI page
ID: 182940
Profile spacemeat
Joined: 4 Oct 99
Posts: 239
Credit: 8,425,288
RAC: 0
United States
Message 182947 - Posted: 27 Oct 2005, 20:38:43 UTC - in response to Message 182875.  

i will give the p3 a shot later tonight. the previous version doubled my RAC to up over 100 for a 500MHz machine (swap vs RAM does not seem to be a big detriment even on an old dma33 drive). is there any version for non-SSE? i have a P2 that i believe is still using Ned's A client.

What would be really nice is if we could organise a community website to host these community-developed clients with a community web manager who has time to keep things updated - sort of a one stop shop for optimized clients for all platforms (then lazy old me could just e-mail my latest builds to the community web manager who'd post the clients and update the website).


isn't Speedy67 the owner of the marisan.nl site? the seti page there did an excellent job of hosting, organizing, defining, and explaining tetsuji's array of windows clients. at the very least, that format would be quite useful to implement for someone who was so inclined.
ID: 182947
Don Erway
Volunteer tester

Joined: 18 May 99
Posts: 305
Credit: 471,946
RAC: 0
United States
Message 182949 - Posted: 27 Oct 2005, 20:40:16 UTC

Assuming that seti calculates lots of ffts of the same size, you might do even better than a generalized cache for trig values.

You can precalculate 2 arrays of the same size as the fft length, and have all the right incremental values of sin and cos sitting there, ready to pick up by a simple array access, rather than a hash, or whatever you are doing for the "cache"...

This is pretty standard in many fft algorithms.
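
A minimal sketch of the table approach described above, in Python for brevity. The names make_trig_tables and dft_with_tables are made up for illustration, and a naive DFT stands in for a real FFT; none of this is taken from the SETI source.

```python
import math


def make_trig_tables(n):
    """Precompute cos and sin of 2*pi*k/n for k = 0..n-1 (done once per length)."""
    cos_t = [math.cos(2.0 * math.pi * k / n) for k in range(n)]
    sin_t = [math.sin(2.0 * math.pi * k / n) for k in range(n)]
    return cos_t, sin_t


def dft_with_tables(x, cos_t, sin_t):
    """Naive O(n^2) DFT that fetches every twiddle factor by plain array index."""
    n = len(x)
    out = []
    for k in range(n):
        re = 0.0
        im = 0.0
        for j in range(n):
            idx = (k * j) % n  # angle 2*pi*k*j/n reduced to a table slot
            xr, xi = x[j].real, x[j].imag
            # multiply x[j] by exp(-2*pi*i*k*j/n) = cos - i*sin
            re += xr * cos_t[idx] + xi * sin_t[idx]
            im += xi * cos_t[idx] - xr * sin_t[idx]
        out.append(complex(re, im))
    return out
```

The tables are built once for a given length and reused for every transform of that length, so the inner loop does no sin or cos calls and no hashing.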

Or is that what is happening anyway?

I hope TMR can pick this up, and generate some new windows clients!!!

Don

ID: 182949
Harold Naparst
Volunteer tester

Joined: 11 May 05
Posts: 236
Credit: 91,803
RAC: 0
Sweden
Message 182950 - Posted: 27 Oct 2005, 20:47:26 UTC - in response to Message 182940.  

@Ned,
I had provisionally called it r2.9 also, but this version
contains a fundamentally new idea that eliminated a major computation
bottleneck, so I deemed that Hans' idea was worth a major number.
Besides, we were getting close to 3.0 anyway.
The SSE2 version was compiled with only -xW -ipo -fp-model fast -O3,
and without a couple of the other flags I've used in the past that
could affect accuracy. So it might not be totally as fast as it could be.
I didn't really test it as much as I should have, but it does give
strong similarity to the reference wu, and it has been running for 24
hours on my AMD 64x2 with no problems.
As I said, I just wanted to put it out for you all. It is in
the source tree too as
svn co svn://hnaparst.homelinux.com/seti_boinc/tags/naparst-r3.0

As for keeping the memory below 256M: yes, I plan to do this soon.
It shouldn't be a problem. Hans says that the extra memory isn't necessary
anyway, and I just want to verify it. I realize that's an issue. It's
an issue for me, too. Several of my computers have only 256M total
RAM.
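
One generic way to cap the memory a cache uses is to bound the number of entries and evict the least recently used ones. A sketch using Python's functools.lru_cache follows; MAX_ENTRIES and sincos are hypothetical, and nothing here reflects how the actual client is or will be written.

```python
import math
from functools import lru_cache

# Assumed cap: choose it so that entries * bytes-per-entry fits the memory budget.
MAX_ENTRIES = 4096


@lru_cache(maxsize=MAX_ENTRIES)
def sincos(angle):
    """Hypothetical cached trig lookup; old entries are evicted once the cache is full."""
    return math.sin(angle), math.cos(angle)
```

With a bound in place, memory use stays flat no matter how many distinct angles are requested. The trade-off is that evicted values have to be recomputed if they come up again.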

For spacemeat...A non SSE version for your P2 is a distinct possibility.
I'd say it is doable, but I'm not sure how many people still have P2s.
Have you thought about upgrading your hardware?
Harold Naparst
ID: 182950
Profile Crunch3r
Volunteer tester
Joined: 15 Apr 99
Posts: 1546
Credit: 3,438,823
RAC: 0
Germany
Message 182953 - Posted: 27 Oct 2005, 20:58:59 UTC - in response to Message 182950.  

@Ned,
I had provisionally called it r2.9 also, but this version
contains a fundamentally new idea that eliminated a major computation
bottleneck, so I deemed that Hans' idea was worth a major number.
Besides, we were getting close to 3.0 anyway.
The SSE2 version was compiled with only -xW -ipo -fp-model fast -O3,
and without a couple of the other flags I've used in the past that
could affect accuracy. So it might not be totally as fast as it could be.
I didn't really test it as much as I should have, but it does give
strong similarity to the reference wu, and it has been running for 24
hours on my AMD 64x2 with no problems.
As I said, I just wanted to put it out for you all. It is in
the source tree too as
svn co svn://hnaparst.homelinux.com/seti_boinc/tags/naparst-r3.0

As for keeping the memory below 256M. Yes, I plan to do this..soon.
It shouldn't be a problem. Hans says that the extra memory isn't necessary,
anyway, and I just want to verify it. I realize that's an issue. It's
an issue for me, too. Several of the computers I have have only 256M total
RAM.

For spacemeat...A non SSE version for your P2 is a distinct possibility.
I'd say it is doable, but I'm not sure how many people still have P2s.
Have you thought about upgrading your hardware?


@Harold

Just a note

analyzeFuncs.cpp: In function 'int v_ChirpData(float*, float*, float, int, double)':
analyzeFuncs.cpp:938: error: 'align' was not declared in this scope
analyzeFuncs.cpp:938: error: '__declspec' was not declared in this scope
analyzeFuncs.cpp:938: error: expected `;' before 'float'


Join BOINC United now!
ID: 182953
Profile spacemeat
Joined: 4 Oct 99
Posts: 239
Credit: 8,425,288
RAC: 0
United States
Message 182964 - Posted: 27 Oct 2005, 21:17:38 UTC

For spacemeat...A non SSE version for your P2 is a distinct possibility.
I'd say it is doable, but I'm not sure how many people still have P2s.
Have you thought about upgrading your hardware?


upgrade? it's a limited use machine that only serves one non cpu-intensive purpose for me. and it was free. so i use the extra cycles for seti. ive got about 15 old boxes sitting around that i'd rather make useful than throw away and if i weren't so concerned with the electric bill, they would all be running seti too. i don't think i'm alone in the 'put old PCs to work' demographic. or sparc & alpha for that matter.

but i certainly don't expect non-SSE to be part of the development of new clients; unpopularity and duration of benchmark testing are a big hindrance. just really wondering if anyone has compiled one yet. if Portage had recent versions of IPP, ICC and/or GCC i might at least cross-compile on something faster but i do not have the time to mess around that much with experimental packages in gentoo.
ID: 182964
Profile Crunch3r
Volunteer tester
Joined: 15 Apr 99
Posts: 1546
Credit: 3,438,823
RAC: 0
Germany
Message 182965 - Posted: 27 Oct 2005, 21:22:08 UTC - in response to Message 182964.  

For spacemeat...A non SSE version for your P2 is a distinct possibility.
I'd say it is doable, but I'm not sure how many people still have P2s.
Have you thought about upgrading your hardware?


upgrade? it's a limited use machine that only serves one non cpu-intensive purpose for me. and it was free. so i use the extra cycles for seti. ive got about 15 old boxes sitting around that i'd rather make useful than throw away and if i weren't so concerned with the electric bill, they would all be running seti too. i don't think i'm alone in the 'put old PCs to work' demographic. or sparc & alpha for that matter.

but i certainly don't expect non-SSE to be part of the development of new clients; unpopularity and duration of benchmark testing are a big hindrance. just really wondering if anyone has compiled one yet. if Portage had recent versions of IPP, ICC and/or GCC i might at least cross-compile on something faster but i do not have the time to mess around that much with experimental packages in gentoo.


If you need a non-SSE client then follow the link in my SIG :-)


Join BOINC United now!
ID: 182965
Ned Slider

Joined: 12 Oct 01
Posts: 668
Credit: 4,375,315
RAC: 0
United Kingdom
Message 182974 - Posted: 27 Oct 2005, 21:38:56 UTC

Yes, we can certainly build clients for older systems (P2, i686, athlon etc), but with the current pace of developments I'm having a hard time keeping up on my Athlon XP :)

Once things settle down, and we're not getting new source improvements every day or so, then we can do some more builds.

Like Crunch3r said, in the meantime he has some builds posted on his site for older processors (Crunch3r - do you think you could indicate the revision of the clients posted?).

Ned

*** My Guide to Compiling Optimised BOINC and SETI Clients ***
*** Download Optimised BOINC and SETI Clients for Linux Here ***
ID: 182974
Profile Crunch3r
Volunteer tester
Joined: 15 Apr 99
Posts: 1546
Credit: 3,438,823
RAC: 0
Germany
Message 182977 - Posted: 27 Oct 2005, 21:45:55 UTC - in response to Message 182974.  
Last modified: 27 Oct 2005, 21:46:16 UTC

Yes, we can certainly build clients for older systems (P2, i686, athlon etc), but with the current pace of developments I'm having a hard time keeping up on my Athlon XP :)

Once things settle down, and we're not getting new source improvements every day or so, then we can do some more builds.

Like Crunch3r said, in the meantime he has some builds posted on his site for older processors (Crunch3r - do you think you could indicate the revision of the clients posted).

Ned


Hi Ned,

I think the i686 client is from naparst-r2.75.
If I find some time over the weekend I'll build some new ones based on the new sources.

At the moment I'm busy building some clients for TRU64 because some people have asked for them.

Join BOINC United now!
ID: 182977
Ned Slider

Joined: 12 Oct 01
Posts: 668
Credit: 4,375,315
RAC: 0
United Kingdom
Message 182980 - Posted: 27 Oct 2005, 21:53:39 UTC - in response to Message 182950.  

@Ned,
I had provisionally called it r2.9 also, but this version
contains a fundamentally new idea that eliminated a major computation
bottleneck, so I deemed that Hans' idea was worth a major number.
Besides, we were getting close to 3.0 anyway.
The SSE2 version was compiled with only -xW -ipo -fp-model fast -O3,
and without a couple of the other flags I've used in the past that
could affect accuracy. So it might not be totally as fast as it could be.
I didn't really test it as much as I should have, but it does give
strong similarity to the reference wu, and it has been running for 24
hours on my AMD 64x2 with no problems.
As I said, I just wanted to put it out for you all. It is in
the source tree too as
svn co svn://hnaparst.homelinux.com/seti_boinc/tags/naparst-r3.0



Good stuff :)

I really think we need to address version numbering in the wider context. Firstly, I'd like to see the major version number stay in line with Berkeley, ie version 4. Next, I would propose that the minor version number reflects the core seti source originally used (x.07 in this case?), and then append our own versioning onto that. Thus, for example, this current release could be 4.07.3.0. Individual optimizers could further append letters to reflect further optimizations achieved with different compiler flags etc, so one might do v4.07.3.0a and v4.07.3.0b etc.
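
The scheme proposed above can be sketched as a small helper. The function name and parameter names are invented for illustration; only the 4.07.3.0 and 4.07.3.0a/b examples come from the post itself.

```python
def community_version(major, core_minor, opt_version, variant=""):
    """Build a label such as 4.07.3.0a from its parts.

    major       -- Berkeley major version, e.g. 4
    core_minor  -- minor version of the core SETI source used, e.g. 7 for x.07
    opt_version -- community optimisation version as a string, e.g. "3.0"
    variant     -- optional single lowercase letter for a compiler-flag variant
    """
    if variant and not (len(variant) == 1 and "a" <= variant <= "z"):
        raise ValueError("variant must be a single lowercase letter")
    return f"{major}.{core_minor:02d}.{opt_version}{variant}"
```

For example, community_version(4, 7, "3.0") gives 4.07.3.0, and passing variant="a" gives 4.07.3.0a.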

And Harold, would you take responsibility for maintaining a community source tree as you seem to be doing such a good job with it atm? Then Harold could build for P4 (and P3 as this now appears faster with ICC/IPP??), and Crunch3r and myself could build for AMD and older platforms thus sharing the workload around.

^^ Just kicking a few ideas around for comment. LOL, but this is just so much more fun working together than struggling on alone :)

Ned


*** My Guide to Compiling Optimised BOINC and SETI Clients ***
*** Download Optimised BOINC and SETI Clients for Linux Here ***
ID: 182980
Ned Slider

Joined: 12 Oct 01
Posts: 668
Credit: 4,375,315
RAC: 0
United Kingdom
Message 182981 - Posted: 27 Oct 2005, 21:55:04 UTC - in response to Message 182977.  


Hi Ned,

I think the i686 client is from naparst-r2.75
If i find some time over the weekend i'll build some new based on the new sources.

At the moment i'm busy on building some clients for TRU64 because some have asked for them.


I can probably find time to build an i686 version on the weekend if you're pushed (as long as there are no more exciting source releases in the meantime!)

Ned

*** My Guide to Compiling Optimised BOINC and SETI Clients ***
*** Download Optimised BOINC and SETI Clients for Linux Here ***
ID: 182981
Harold Naparst
Volunteer tester

Joined: 11 May 05
Posts: 236
Credit: 91,803
RAC: 0
Sweden
Message 182983 - Posted: 27 Oct 2005, 21:58:16 UTC - in response to Message 182965.  


If you need a non-SSE client then follow the link in my SIG :-)


@Crunch3r --> I don't know what got into me, but I tried to find
your non-SSE client on your site. I couldn't find it, but I'm
sure it's there...somewhere.

Also, again for the record, although you list Tetsuji, Hans, Alex Kan,
and me as contributors to the source, I think that at the moment
literally all the speedup is directly attributable to Hans' two
caches: one for invert_lcgf() and one for v_ChirpData.

All the stuff that Alex and I did is really made irrelevant by the caches,
most probably.

Anyway, did you use ICC/IPP to compile it? What do you think would
be faster, ICC/IPP, or GCC/FFTW for P2?

Also, great design for your web site. We clearly need to get a P2
client for spacemeat, since he is a Gentooist and all that.

@spacemeat --> Portage has icc-9 and gcc-4.0.2, but it doesn't have
IPP. Anyway, that shouldn't stop you. It is a very simple matter
to install them and compile with the appropriate flags. The instructions
are on my web site: http://naparst.name/sources.htm
Have fun, and let us know if GCC/FFTW is faster than ICC/IPP.

@Ned --> as far as hosting the community client goes, I am happy to do
it if you have some kind of plan. I am not even using 2% of my bandwidth
and space isn't an issue. I'd just need a way to allow people to update
the site, like a wiki or something else. I'm sure it is not hard to do.

Or we can just have links to each other's sites, like we are doing now.
That might be the best solution anyway, since it allows people to post
what they believe is the fastest and have everyone else experiment on it.

What do you think?
Harold Naparst
ID: 182983
Ned Slider

Joined: 12 Oct 01
Posts: 668
Credit: 4,375,315
RAC: 0
United Kingdom
Message 183007 - Posted: 27 Oct 2005, 22:54:59 UTC - in response to Message 182983.  
Last modified: 27 Oct 2005, 22:55:56 UTC


Anyway, did you use ICC/IPP to compile it? What do you think would
be faster, ICC/IPP, or GCC/FFTW for P2?


I wouldn't bother with a specific P2 version - in the past I've seen very little difference between i686 and Pentium2 (at least with GCC), so I'd just go with -march=i686 for greater compatibility.


@Ned --> as far as hosting the community client goes, I am happy to do
it if you have some kind of plan. I am not even using 2% of my bandwidth
and space isn't an issue. I'd just need a way to allow people to update
the site..like a wiki or something else. I'm sure it is not hard to do.

Or we can just have links to each other's sites, like we are doing now.
That might be the best solution anyway, since it allows people to post
what they believe is the fastest and have everyone else experiment on it.

What do you think?


Yes, a webring type of thing :)

I was thinking mostly for end users benefit - avoid the confusion of multiple websites hosting multiple clients.

On my site I have plenty of space and unlimited bandwidth, plus the ability to set up web admin logins to allow others access without disclosing my main ISP username/password. If we wanted to go with a single site, such a setup would definitely be an advantage (although not necessarily mine).

Another possibility using the linking methodology would be to use absolutely identical site structures (as we're already very similar anyway) and have links in the left-hand navigation panel for "AMD Clients", "Intel Clients", "Alpha Clients" etc. that are direct links to each other's sites for those pages, rather than locally mirrored pages. So, for example, if a visitor to my site clicked on the "Alpha Clients" navigation panel link, they'd jump straight to Crunch3r's site without even knowing it, and similarly, if they then clicked on the Intel navigation panel link, they'd jump on to Harold's site. That way each person only needs to maintain the page for their own individual clients, and the rest of the site layout is just an exact copy on each server - no mirroring needed.

Ned

*** My Guide to Compiling Optimised BOINC and SETI Clients ***
*** Download Optimised BOINC and SETI Clients for Linux Here ***
ID: 183007