Optimized windows clients - plz help listing cpu times
Author | Message |
---|---|
Astro Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0 |
I'm sorry for hijacking your thread, Speedy; I didn't realize it would take up so many posts. Sorry. Thanks to all who have tried to help. I'll be waiting till I'm back in SC to try to fix this. Your help at that time will continue to be appreciated. tony |
Joined: 14 Jul 99 Posts: 335 Credit: 1,178,138 RAC: 0 |
That's interesting! I was always under the impression that maximum single-thread performance was only possible with HT disabled. I don't know whether a process that uses only one 'virtual' CPU can use the full L2 cache when the other 'virtual' CPU is inactive. Can anybody shed some light on this? Greetings, Speedy67 |
MiCrO Joined: 5 Apr 00 Posts: 48 Credit: 43,924,114 RAC: 7 |
Only programs capable of fully using the P4 (and compatible CPUs) run faster with HT disabled. For all other programs it should not make a big difference. Most will be a bit faster with HT enabled (at least if they use more than one thread). Greez MiCrO |
Jordi Valls Joined: 10 Jun 99 Posts: 6 Credit: 1,599,185 RAC: 0 |
Processor: AMD Athlon XP 2600+ Mobile (Barton core), 512 KB L2 cache. Stock: 1995 MHz (133 MHz FSB * 15), 1.45 V. Overclocked: 2120 MHz (185 * 11.5), 1.5 V. System: Barebone Shuttle SN41g2v3 (NForce2). Memory: 1024 MB Value Kingston PC3200, 185 MHz DDR, 2-3-3-6. Results: YAOSCW-K-r8.1: 8364; YAOSCW-K-r7: 8494. Note: Thread Master 90% to Seti. Greetings, AsDeCopes |
Urs Echternacht Joined: 15 May 99 Posts: 692 Credit: 135,197,781 RAC: 211 |
CPU-Info: Intel(R) Pentium(R) III CPU-S 1400MHz (Tualatin), Family 6, Model B, Stepping 4, Revision tB1. Core speed 1512 MHz, FSB 144 MHz. L1 32 KB, L2 512 KB. Instructions: MMX, SSE. RAM: 256 MB SD-RAM, 2-2-2-7-9, 144 MHz. Results: seti-p3: 13228; YAOSCW-K-r7: 12073; YAOSCW-K-r8.1: 11751. I have another one of these with an older stepping, and a dual PIII Coppermine, which I'll bench next. _\|/_ U r s |
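
To put the times posted so far side by side, here is a small Python sketch that computes how much CPU time each optimized client saves relative to a baseline. The figures are the ones reported in the two benchmark posts above; the unit is assumed to be seconds, and the helper names `speedup_pct` and `report` are made up for illustration.

```python
def speedup_pct(baseline, candidate):
    """Percent reduction in CPU time of `candidate` relative to `baseline`."""
    return (baseline - candidate) / baseline * 100.0


def report(times, baseline_name):
    """Map each non-baseline client to its percent time saving."""
    base = times[baseline_name]
    return {
        name: round(speedup_pct(base, t), 2)
        for name, t in times.items()
        if name != baseline_name
    }


# CPU times as posted in the thread (unit assumed to be seconds).
athlon_xp = {"YAOSCW-K-r7": 8494, "YAOSCW-K-r8.1": 8364}
p3_tualatin = {"seti-p3": 13228, "YAOSCW-K-r7": 12073, "YAOSCW-K-r8.1": 11751}

if __name__ == "__main__":
    print("Athlon XP vs r7:", report(athlon_xp, "YAOSCW-K-r7"))
    print("PIII-S vs seti-p3:", report(p3_tualatin, "seti-p3"))
```

On these numbers r8.1 saves roughly 1.5% over r7 on the Athlon XP, and roughly 11% over seti-p3 on the Tualatin. Both are single runs on single machines, so treat the percentages as indicative only.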
Metod, S56RKO Joined: 27 Sep 02 Posts: 309 Credit: 113,221,277 RAC: 9 |
My impression was that a typical modern CPU (Intel, AMD or any other) has more than one integer unit and more than one FP unit. Then there's a tiny part of the processor that dispatches instructions to the appropriate units. Even a single process can be executed in several units at a time. An example would be the execution of if-then-else code (both branches in parallel), deciding which results to keep once the condition (the if part) becomes known (speculative execution or something similar). Typically a single process can use only a few of these units at a time, and if the processor has more, they sit unused. But if you can make one processor look like two (HT), you can execute two processes at a time and make better use of all these integer and FP units. So, when HT is enabled and you run only one process, the process won't notice any lack of integer/FP units and will run at full speed. If you run two (or more) processes, any of them may suffer from a lack of free integer/FP units, but the processor as a whole will be utilized better. Metod ... |
Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
"So, when HT is enabled and you run only one process, the process won't notice any lack of integer/FP units and will run at full speed. If you run two (or more) processes, any of them may suffer from a lack of free integer/FP units, but the processor as a whole will be utilized better." True. With HT off, you under-utilize the available resources. With HT on, you use more resources, but the time to complete each unit goes up ... and the total throughput also rises ... so, more time to complete, but more work done per unit time. |
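
A quick bit of arithmetic makes the latency-versus-throughput point concrete. The timings below are purely hypothetical, picked only to illustrate the trade-off; they are not measurements from anyone in this thread, and `throughput_per_hour` is an illustrative helper.

```python
def throughput_per_hour(seconds_per_wu, concurrent):
    """Work units finished per hour when `concurrent` units run side by side,
    each taking `seconds_per_wu` of wall-clock time."""
    return concurrent * 3600.0 / seconds_per_wu


# Hypothetical timings: with HT on, each unit takes 50% longer,
# but two units run at once.
ht_off = throughput_per_hour(seconds_per_wu=6000, concurrent=1)
ht_on = throughput_per_hour(seconds_per_wu=9000, concurrent=2)

if __name__ == "__main__":
    print(f"HT off: {ht_off:.2f} WU/hour")
    print(f"HT on:  {ht_on:.2f} WU/hour")
    print(f"Throughput gain: {(ht_on / ht_off - 1) * 100:.0f}%")
```

Under these assumed numbers HT still comes out ahead on throughput even though every unit takes longer. In this simple model the break-even point is where the per-unit time doubles.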
B.U.M.S.P.A.S.S.A.T Joined: 10 Jun 02 Posts: 5 Credit: 669,560 RAC: 0 |
Well, you have to consider one other thing the BOINC developers seem not to have thought about. When my Northwood 3.0 with HT on crunches 2 units and I check the process affinity mask of each, I find that each process is allowed to run on both 'virtual' processors. This might slow things down, because when a client switches CPUs during execution the result is a complete L2 cache refill. That is a real slowdown ... at least on real multiprocessor boards. Maybe setting affinity should be built into the BOINC software that starts the clients. st0ff |
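
For anyone unfamiliar with affinity masks: the mask is a bit field in which bit i set means the process may run on logical CPU i. Here is a minimal Python sketch of reading and building such a mask. The helper names are hypothetical, and this only does the bit arithmetic; it does not change any process's affinity.

```python
from typing import Iterable, List


def cpus_in_mask(mask: int) -> List[int]:
    """Return the logical CPU indices whose bit is set in `mask`."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


def mask_for_cpus(cpus: Iterable[int]) -> int:
    """Build an affinity mask that allows exactly the given logical CPUs."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


if __name__ == "__main__":
    # A mask of 0b11 means "may run on both virtual CPUs",
    # which is the situation described in the post above.
    print(cpus_in_mask(0b11))
    # Pinning one client per virtual CPU would use masks 0b01 and 0b10.
    print(bin(mask_for_cpus([0])), bin(mask_for_cpus([1])))
```

On Windows, a mask like this is what the Win32 call `SetProcessAffinityMask` takes; on Linux, Python's `os.sched_setaffinity` takes the set of CPU indices instead.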
Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
Ouch! I never thought of that ... Sounds like a case of a bug report crying out to be made ... |
Metod, S56RKO Joined: 27 Sep 02 Posts: 309 Credit: 113,221,277 RAC: 9 |
"Well, you gotta think of one other thing the boinc developers seemed not to think about. When my Northwood 3.0 HT_on crunches 2 units and I check the process affinity mask of each, i have to find out that each process is supposed to run on both 'virtual' processors. It might be, that this slows things down, because when a client switches a cpu during execution the result will be a complete l2cache-refill. this is a real slowdown...at least on real multiprocessor boards. maybe this should be included in the boinc software which starts the clients." This is certainly true if you see SETI (or any BOINC project, FWIW) as the main task of your computer. I see it as a welcome task to keep the CPUs warm when the machine doesn't have anything more important to do. The developers see it the same way, and that's why all BOINC-related processes run at the lowest possible priority. I wouldn't like to see a low-priority process pinned with CPU affinity, so that it cannot be moved to another CPU when some normal- or high-priority task starts to run, and thus steals something like 10% of CPU power from that task. In short: I'm quite happy SETI doesn't get started with CPU affinity set. Metod ... |
Joined: 4 Jul 99 Posts: 1575 Credit: 4,152,111 RAC: 1 |
I personally would like to see BOINC set affinity too (for whenever I get a dual/HT machine). However, I do not think this will ever be a high-priority item. The main reason is that the good, reliable reports I have seen show only a moderate improvement with affinity set on dual-CPU machines, and on HT machines some of these reports show a decrease in performance. The best chance of it getting in is if someone works out a way for the CPU scheduler to operate independently on multiple CPUs, and setting affinity happens as a by-product of that solution. It also may not be possible with the CPU scheduler, especially if suspending to memory. For example, WU1 starts on CPU0 and is then suspended to memory. Later WU2 starts on CPU0, then WU1 restarts. Now you have 2 WUs running on CPU0 and none running on CPU1. BOINC WIKI BOINCing since 2002/12/8 |
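
The WU1/WU2 example can be replayed with a toy model. To be clear, this is not BOINC's scheduler; it is a hypothetical Python sketch that assumes each work unit is pinned, at first start, to the CPU with the fewest running units, and keeps that pin across suspend and resume.

```python
def load_per_cpu(pinned, running, n_cpus):
    """Count running work units on each CPU."""
    load = [0] * n_cpus
    for wu in running:
        load[pinned[wu]] += 1
    return load


def replay(events, n_cpus=2):
    """Replay (action, wu) events and return the final per-CPU load."""
    pinned = {}      # work unit -> CPU chosen at first start
    running = set()  # work units currently running
    for action, wu in events:
        if action == "start":
            load = load_per_cpu(pinned, running, n_cpus)
            # Pin to the least-loaded CPU (lowest index wins ties).
            pinned[wu] = min(range(n_cpus), key=lambda cpu: load[cpu])
            running.add(wu)
        elif action == "suspend":
            running.discard(wu)
        elif action == "resume":
            running.add(wu)  # the old pin is kept, even if that CPU is busy
        else:
            raise ValueError(f"unknown action: {action}")
    return load_per_cpu(pinned, running, n_cpus)


if __name__ == "__main__":
    scenario = [
        ("start", "WU1"),
        ("suspend", "WU1"),
        ("start", "WU2"),
        ("resume", "WU1"),
    ]
    print(replay(scenario))  # per-CPU count of running units
```

With this event order both units end up on CPU0 and CPU1 sits idle, which is the collision described in the post. Avoiding it would mean re-evaluating the pin on resume, and that gives up part of the cache benefit affinity was meant to provide.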
Don Erway Joined: 18 May 99 Posts: 305 Credit: 471,946 RAC: 0 |
I just got my new Athlon 64 3200 (Venice core) to complete the reference WU in 6178 seconds, using the full-up P4-sse3 client! I ran this at low priority while other CPU-intensive tasks were running, but the CPU time measurement should be independent of load. Or is it not? The result file has diffs from the reference result. Does this mean no go? I've also run the reference WU with the sse2-amd64 client. It took 6166 secs, so it is faster than the p4-sse3 version. Its result file agrees with the one produced by the p4-sse3 client, and not with the reference result file. But I already know the sse2-amd64 client is returning valid results, because credit has already been granted. What's the deal with the mismatched results? I'm not going to send in times until I get another stick of memory, to run at dual-channel DDR speed. The sse2 BOINC CC works great as well. Any chance of getting a non-Pentium-specific sse3 version?? :) Don |
W-K 666 Joined: 18 May 99 Posts: 19810 Credit: 40,757,560 RAC: 67 |
"I personally would like to see BOINC set affinity too (for whenever I get a dual/HT machine)." I found that setting affinity was patchy, as it usually lost its setting when switching units. But I did discover that there is a performance boost if you can get it to run two different projects at the same time, i.e. one SETI and one Einstein: there could be up to a 15% improvement over running two units for one project. The problem here is that without micromanagement it is almost impossible to get it to do this automatically. Andy |
Joined: 14 Jul 99 Posts: 335 Credit: 1,178,138 RAC: 0 |
The result files don't have to be 100% the same; they only have to agree within certain limits. How exactly this is calculated, I don't know, but as you said, your results have already been credited, so no worries there. More info on validation is in the Boinc Wiki, by Paul D. Buck. [edit: typo] Greetings, Speedy67 |
Don Erway Joined: 18 May 99 Posts: 305 Credit: 471,946 RAC: 0 |
"I just got my new athlon 64 3200, venice core, to complete the reference WU in 6178 seconds, using the full up P4-sse3 client!" I spoke too soon. I had the client versions swapped, and in fact the p4-sse3 version will NOT run on the Venice core, even though CPU-Z says it does have sse3. So a non-P4-specific sse3 client might be worth creating and trying ... But it sounds like there is not much difference between sse2 and sse3, on the Linux clients anyway. |
Ned Slider Joined: 12 Oct 01 Posts: 668 Credit: 4,375,315 RAC: 0 |
"I just got my new athlon 64 3200, venice core, to complete the reference WU in 6178 seconds, using the full up P4-sse3 client!" Don, I built optimized Linux clients for AMD64 with both SSE2 and SSE3, and there was no advantage in using SSE3. In fact, IIRC the SSE3-enabled client was very slightly slower. My AMD64 (SSE2) client for Linux is available on my site :) Ned *** My Guide to Compiling Optimised BOINC and SETI Clients *** *** Download Optimised BOINC and SETI Clients for Linux Here *** |
Don Erway Joined: 18 May 99 Posts: 305 Credit: 471,946 RAC: 0 |
Hi Ned. Yeah, I know. It was your reports that led me to say that it was unlikely to do any good... I guess cranking the cpu up to 2.45 GHz, or so ought to do it, and it is so easy to do, with the 939 chips, even with the stock AMD retail box HSF! Prime95 is happy as a clam. But hey, has anyone tried memtest86, on an athlon64 machine? Mine won't even start to run. Final results shortly... Don |
Joined: 1 Jul 03 Posts: 130 Credit: 48,466 RAC: 0 |
a reference unit? finally. good go! |
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.