Problem running BOINC on T2000 and Solaris 10!

Didn?
Joined: 8 Mar 06
Posts: 4
Credit: 357,584
RAC: 0
Germany
Message 260042 - Posted: 10 Mar 2006, 11:04:28 UTC

Hi, I can't run BOINC on a Sun T2000 with Solaris 10 01/06. I get the following message, and the CPU benchmarks time out:

ACTIVE_TASK_SET::check_app_exited(): pid 1490 not found

On my Sun 420R with 4 CPUs, BOINC runs without problems.

Thanks for any ideas... Didn?
ID: 260042
Jim-R.
Volunteer tester
Joined: 7 Feb 06
Posts: 1494
Credit: 194,148
RAC: 0
United States
Message 260045 - Posted: 10 Mar 2006, 11:26:45 UTC

The message you are receiving means that a task was told to exit but did not exit as expected, and when the client next tried to use it, it was no longer in the active task list. From your description, it sounds likely that something in the benchmark routine failed. If you have BOINC set up to start from a menu or a desktop icon, I suggest running it from a terminal to see whether any other errors are listed, such as missing libraries.
Jim

Some people plan their life out and look back at the wealth they've had.
Others live life day by day and look back at the wealth of experiences and enjoyment they've had.
ID: 260045
Didn?
Joined: 8 Mar 06
Posts: 4
Credit: 357,584
RAC: 0
Germany
Message 260116 - Posted: 10 Mar 2006, 15:49:27 UTC - in response to Message 260045.  

Hello Jim,
Thanks for your answer. I only work in the shell! I tried a client built specially for Sparc IV, and the messages are similar; after the benchmark, still nothing more happens :-( :

2006-03-10 14:01:08 [---] Starting BOINC client version 4.43 for sparc-sun-solaris2.10
2006-03-10 14:01:08 [---] Data directory: /export/home/dh
2006-03-10 14:01:08 [http://setiathome.berkeley.edu/] Computer ID: not assigned yet; location: ; project prefs: default
2006-03-10 14:01:08 [---] No general preferences found - using BOINC defaults
2006-03-10 14:01:08 [---] Remote control not allowed; using loopback address
2006-03-10 14:01:10 [---] Running CPU benchmarks
2006-03-10 14:06:08 [---] CPU benchmarks timed out, using default values
2006-03-10 14:06:08 [---] CPU benchmarks timed out, using default values
2006-03-10 14:06:08 [---] Resuming computation and network activity
2006-03-10 14:06:08 [---] ACTIVE_TASK_SET::check_app_exited(): pid 1858 not found
2006-03-10 14:06:08 [---] ACTIVE_TASK_SET::check_app_exited(): pid 1858 not found
2006-03-10 14:06:08 [---] Insufficient work; requesting more
2006-03-10 14:06:08 [---] Suspending computation and network activity - running CPU benchmarks
2006-03-10 14:06:10 [---] Running CPU benchmarks

Any ideas? Didn?
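To see how long each benchmark attempt ran before the client gave up, the log lines above can be parsed with a few lines of Python. This is only a sketch: the function name `benchmark_timeouts` and the regular expression are illustrative assumptions, not part of BOINC, and the log format is assumed to be exactly the one shown in this post.

```python
import re
from datetime import datetime

# Assumed line layout: "<date> <time> [<project>] <message>", as in the post above.
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(.*?)\] (.*)$")

SAMPLE_LOG = """\
2006-03-10 14:01:10 [---] Running CPU benchmarks
2006-03-10 14:06:08 [---] CPU benchmarks timed out, using default values
2006-03-10 14:06:08 [---] CPU benchmarks timed out, using default values
2006-03-10 14:06:08 [---] Suspending computation and network activity - running CPU benchmarks
2006-03-10 14:06:10 [---] Running CPU benchmarks
"""


def benchmark_timeouts(log_text):
    """Return the seconds between each benchmark start and its timeout."""
    durations = []
    started = None
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        message = match.group(3)
        if message == "Running CPU benchmarks":
            started = stamp
        elif message.startswith("CPU benchmarks timed out") and started is not None:
            durations.append((stamp - started).total_seconds())
            started = None  # ignore the duplicated timeout line
    return durations


print(benchmark_timeouts(SAMPLE_LOG))  # [298.0]
```

On the excerpt above, the first attempt runs from 14:01:10 to 14:06:08, i.e. just under five minutes, before the client falls back to default values and starts the benchmark again.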

ID: 260116
Jim-R.
Volunteer tester
Joined: 7 Feb 06
Posts: 1494
Credit: 194,148
RAC: 0
United States
Message 260130 - Posted: 10 Mar 2006, 16:11:11 UTC - in response to Message 260116.  
Last modified: 10 Mar 2006, 16:14:08 UTC

I suggest checking the file beginning with "stderr" in the BOINC directory to see whether there are any other error messages there. Were there no error messages on the terminal after you typed in the command? Another thing I noticed is that you do not seem to have attached this computer yet: an output line above says it does not have a computer ID yet. I see two Suns and a Windows computer in your account, but both of the Suns have computer IDs.
If this is a different computer from those, try attaching to the project on the command line when you run it, in case it needs a computer ID before it will run the benchmarks. If that doesn't help, it seems that some problem is killing the benchmarking routine. I run Linux, so I will be of limited help with the technical details of SunOS, but I will help where I can.
Jim

/edit/ Sorry, I reread your post and I see that it is a different computer: a T2000 rather than the Ultra 80 and UltraSparc (I don't remember the rest of it).

ID: 260130
Didn?
Joined: 8 Mar 06
Posts: 4
Credit: 357,584
RAC: 0
Germany
Message 260140 - Posted: 10 Mar 2006, 16:29:18 UTC - in response to Message 260130.  

Hi Jim,
I have no "stderr" file; all I have is:
# find .
.
./boinc
./boincmgr
./binstall.sh
./run_client
./lockfile
./projects
./projects/setiathome.berkeley.edu
./client_state.xml
./account_setiathome.berkeley.edu.xml
./client_state_prev.xml

I start the client with: ./boinc -attach_project http://setiathome.berkeley.edu KEY
There are no other error messages :-(

ID: 260140
Jim-R.
Volunteer tester
Joined: 7 Feb 06
Posts: 1494
Credit: 194,148
RAC: 0
United States
Message 260179 - Posted: 10 Mar 2006, 18:18:24 UTC - in response to Message 260140.  


OK. Were you running BOINC at the time the above list was generated? If not, delete the lockfile and try again. If that is a stale lockfile left behind after an unclean exit, that may be the reason. I noticed that in both of your former posts it gave the exact same process ID. That shouldn't happen unless it is picking it up from something left behind. I don't know where you would look for system errors on SunOS, but you might check them. You might try viewing the contents of the lockfile too. I checked mine and it gives some messages from Einstein, but I didn't see any for SETI.

I have rechecked all the information I can find on this error message and can still only find the wiki information that I posted earlier. Still no actual "fix". I hate to suggest it, but since you haven't even got it to attach yet, I suggest deleting everything and starting over. It is possible that one of the files is corrupt.

Either that, or keep bumping this post up (or create a new one) and try to get someone in here who is more familiar with SunOS. You could also try posting in the "Number crunching" forum and prefixing your post with @Eric. He is a lead developer for the SETI application and may be able to help. You may also want to post in one of the BOINC forums. I will keep trying to find something, but right now it doesn't look promising.
Jim
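
The "delete the lockfile only if BOINC is not running" check can be sketched in Python as follows. This is a rough illustration, not a BOINC tool: the function names are made up for this example, the process name `boinc` is taken from the directory listing earlier in the thread, and it assumes a `pgrep` command is available on the system.

```python
import os
import subprocess


def client_is_running(name="boinc"):
    """Return True if pgrep finds a process with exactly this name.

    Assumes a pgrep command is on the PATH; if it is missing, the call
    raises an error and nothing gets deleted.
    """
    result = subprocess.run(
        ["pgrep", "-x", name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def remove_stale_lockfile(data_dir, is_running=client_is_running):
    """Delete <data_dir>/lockfile only when no client is running.

    Returns True if a lockfile was removed, False otherwise.
    """
    lock_path = os.path.join(data_dir, "lockfile")
    if is_running():
        return False  # the lock may be legitimate, so leave it alone
    if os.path.exists(lock_path):
        os.remove(lock_path)
        return True
    return False
```

With the data directory from the log above, this would be called as `remove_stale_lockfile("/export/home/dh")` before starting the client again.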
ID: 260179
Dotsch
Volunteer tester
Joined: 9 Jun 99
Posts: 2422
Credit: 919,393
RAC: 0
Germany
Message 260191 - Posted: 10 Mar 2006, 18:37:43 UTC - in response to Message 260140.  

You cannot have a stderr.txt, and you cannot attach to the project, because the BOINC client hangs on the benchmark and cannot start the application.

Are you running BOINC on an NFS filesystem? If so, does it work on a local UFS filesystem?

There have been some posts here reporting that the benchmark sometimes hangs. The error was not clearly reproducible, but for one user it helped to reinstall or reattach to the project, and another one rebooted the system.

Please mail me (seti_boinc@dotsch.de) the output of the command "truss ./boinc_client > out 2>&1", but let it run for some time (about five minutes).

ID: 260191
SunMicrosystemsLLG
Joined: 4 Jul 05
Posts: 102
Credit: 1,360,617
RAC: 0
United Kingdom
Message 261630 - Posted: 13 Mar 2006, 23:06:39 UTC

Hi,

We also had some problems trying to get BOINC to benchmark on T2000s.

Since the UltraSPARC T1 CPU has up to 8 physical cores and 4 threads executing per core, 32 'virtual CPUs' are presented to Solaris.
This means the BOINC application tries to launch 32 instances of the benchmarking process.
Unfortunately, the UltraSPARC T1 has only 1 FPU shared between the 8 physical cores, so all 32 benchmarking processes are fighting over a single floating-point unit.


The only way around it that I got to work was to set up a zone on the system and add a processor set to the zone which only contained a single CPU - This way the BOINC app would only launch one benchmarking process and there would be no contention at the FPU.
Once the benchmarking process had finished, I could then add more CPUs to the processor set. Be careful not to add too many: I added CPUs 0, 4, 8 & 12 (virtual CPUs from different cores) so that each vCPU assigned to SETI had its own L1 cache.

I didn't get as far as assigning 1 vCPU from each core, so I'm not sure how performance degrades as you increase the vCPU count.

Hope this helps.
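
The CPU numbering described above can be checked with a little arithmetic. The Python sketch below assumes that virtual CPU IDs are numbered consecutively within each core (IDs 0 to 3 on the first core, 4 to 7 on the second, and so on), which is consistent with CPUs 0, 4, 8 and 12 sitting on different cores; the helper names are made up for this example.

```python
CORES = 8             # physical cores on the UltraSPARC T1, per the post
THREADS_PER_CORE = 4  # hardware threads per core, per the post


def core_of(vcpu_id, threads_per_core=THREADS_PER_CORE):
    """Map a virtual CPU ID to its core, assuming consecutive numbering."""
    return vcpu_id // threads_per_core


def one_vcpu_per_core(core_count, threads_per_core=THREADS_PER_CORE):
    """Return the first virtual CPU ID of each of the first core_count cores."""
    return [core * threads_per_core for core in range(core_count)]


print(CORES * THREADS_PER_CORE)  # 32 virtual CPUs presented to Solaris
print(one_vcpu_per_core(4))      # [0, 4, 8, 12]
```

Under the same assumption, `one_vcpu_per_core(8)` would list one virtual CPU from every core, which is the configuration the post did not get as far as testing.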

ID: 261630
Didn?
Joined: 8 Mar 06
Posts: 4
Credit: 357,584
RAC: 0
Germany
Message 261893 - Posted: 14 Mar 2006, 16:20:37 UTC - in response to Message 261630.  

Thanks to everyone for helping to solve this "problem" ;-)

Didn?

ID: 261893



 