SETI@home v8 beta to begin on Tuesday
Author | Message |
---|---|
Joined: 3 Jan 07 Posts: 1451 Credit: 3,272,268 RAC: 0 |
I've just replied to the email from Eric that Claggy is alluding to. From my observations through the day, I'd judge that the plain guppi_56520... WUs are the first run, and the guppi_8bit_56520... WUs are the second run. I haven't seen a third run. As posted in this thread, the problem seems to be with the <beam_width> parameter in the WU header. The first run seemed far too low (by a factor of a million), but as we saw they ran OK with the test app: presumably it also has a million-fold error and they cancel each other out. The second splitter run has <beam_width> values much closer to the ones we're familiar with from Arecibo, so that seems right: but the application can't handle them. I'm fairly sure a second correction, to the application this time, will be needed before the next run. |
Joined: 15 Mar 05 Posts: 1547 Credit: 27,183,456 RAC: 0 |
The problem was actually with the receiver_cfg.center_freq being reported in Hz rather than MHz. New work will be available when the splitter is rebuilt. |
Joined: 3 Jan 07 Posts: 1451 Credit: 3,272,268 RAC: 0 |
Quoting my own post from earlier today, and adding the matching data for a v7 Arecibo task (also received on Beta today): "This looks wrong. All the header data looks the same (by eye only, no guarantee), _except_:" Matching data for a v7 Arecibo task: <name>06ap11ag.23142.2112.3.16.31</name> ... <receiver_cfg> <s4_id>3</s4_id> <name>Arecibo 1.4GHz Array, Beam 0, Pol 0</name> <beam_width>0.0500000007</beam_width> <center_freq>1420</center_freq> I'll try modding a guppi task so that both the <beam_width> and the <center_freq> are in the same ballpark as Arecibo (matching units), and see how it runs. Edit: it's got through the 'Optimal function choices' and it's using the normal amount of memory. Time will tell, but it's looking good. (It hasn't found any signals yet, but it is generating a checkpoint file.) |
Joined: 1 May 07 Posts: 556 Credit: 6,470,846 RAC: 0 |
All v8 WUs have run without fault so far, except the 8bit ones. A few more to complete. A computer program will always do what you tell it to do, but rarely what you want it to do. |
Joined: 15 Mar 05 Posts: 1547 Credit: 27,183,456 RAC: 0 |
Edited.... The problem was in the generation of the table of Doppler drift rates to be examined. The frequency was a factor of a million too high (given in Hz rather than MHz). The client recomputes the beam width based on the ratio of the work_unit subband center frequency and the receiver center frequency, and got a number that was a million times too large. That number is used to calculate the size of the chirp step, which came out way too small, resulting in a huge table of Doppler drift rates. Jeff is off dealing with some flooding problems, but will fix it today or tomorrow and fire off a new GBT splitter. He'll also start a splitter on some Arecibo data to make sure it still works there. |
Joined: 18 Aug 05 Posts: 2423 Credit: 15,878,738 RAC: 0 |
How the out-of-memory crash looks on my PC: [screenshot] So the app tried to become "chatty" by showing a dialogue, but it can't be drawn completely and disappears on the first mouse click. I suspect it blocks the processing slot until user action, though, and that's not good at all. EDIT: It's even worse! The app's process is killed and BOINC shows "waiting for memory" for that particular task... but the whole CPU (in my case, 4 cores) is blocked from running tasks from this or other projects (!). Most "funny": GPU tasks continue to run (and their memory consumption for existing SETI apps is even higher than the normal memory consumption of CPU apps). That's completely absurd behavior that just wastes CPU computational resources. Not a SETI issue but a BOINC scheduler one, though. News about SETI opt app releases: https://twitter.com/Raistmer |
Joined: 21 Jun 05 Posts: 43 Credit: 155,681 RAC: 0 |
I guess the _8bit_ ones are now known to be faulty, but just in case ... guppi_8bit_56520_VOYAGER1_0012.16007.1.20.23.141.vlar fails on all hosts; on my XP x64 it produced a message on the screen that the application couldn't be started. P.S.: it is a headless cruncher, but next time I will try to catch the message and make a screenshot. |
Joined: 18 Aug 05 Posts: 2423 Credit: 15,878,738 RAC: 0 |
And more details on this issue. Look at the memory and CPU graphs at the time of the crash. [screenshot] The whole memory was consumed quite fast (initial saturation), then the crash happened (and the mentioned dialogue box appeared). It's quite interesting how slowly (!) the subsequent memory recovery works: the amount of memory in use decreases very gradually (swapping like hell). Next thing to note: the flat line in the middle at ~2.5 GB of consumed RAM. That's both a SETI and a BOINC problem. Memory consumption remained inappropriately high even after the app had already crashed (!), until some of my (user-required!!) actions while posting the pics closed that half-baked VC++ runtime dialog. Only then did memory usage drop. After that, the next attempt to run was made and memory usage became saturated again until the next crash. And all this happened with the following BOINC client settings: [screenshot] That is, BOINC completely failed its guarding duties and allowed a "rogue app" to bring my system to its knees with constant swapping and mouse freezes in between. I would say this BOINC area requires heavy reworking.... |
Joined: 3 Jan 07 Posts: 1451 Credit: 3,272,268 RAC: 0 |
"I would say this BOINC area requires heavy reworking...." Now would be a good time to talk to Rom Walton about this, and put forward your suggestions for better code. Overnight (10 hours ago) he committed some updates to do with Microsoft C runtime exception handling, like a97b15c20963ab1235b4768ea3b3e3e077a10574: "LIB: Explicitly declare a termination function for handling terminate()/unhandled()/abort() CRT calls." |
Joined: 18 Aug 05 Posts: 2423 Credit: 15,878,738 RAC: 0 |
"I would say this BOINC area requires heavy reworking...." I sent mail about this to the dev group. Now it's up to them to react properly :) |
Joined: 18 Aug 05 Posts: 2423 Credit: 15,878,738 RAC: 0 |
Since this issue is already understood and described, maybe it's worth issuing a task abort from the server to free those 1434 v8 tasks currently in progress? That would save the environment, save testers from frozen systems, and avoid cross-validating new tasks against results already known to be bad and deprecated. |
Joined: 30 Dec 13 Posts: 258 Credit: 12,340,341 RAC: 0 |
Just got a fresh batch of v8 "_8bit_" work units. They've been running for 54 minutes now without a problem. Will see how they do. |
Joined: 15 Mar 05 Posts: 1547 Credit: 27,183,456 RAC: 0 |
"Since this issue is already understood and described, maybe it's worth issuing a task abort from the server to free those 1434 v8 tasks currently in progress?" In theory, I already did that. In practice BOINC doesn't always work the way it does in theory. :( |
Joined: 30 Dec 13 Posts: 258 Credit: 12,340,341 RAC: 0 |
Is anyone noticing an increase in CPU temps when crunching these work units compared to other work units? It might just be mine, but I thought I should say something. |
Joined: 12 Nov 10 Posts: 1149 Credit: 32,460,657 RAC: 1 |
Just noticed I've had an "Error while computing" on a non-8bit work unit. Only two of us have completed the work unit so far; the other result is "Completed, waiting for validation": http://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=7529656 My results (on a Windows XP PC with BOINC 7.6.9 running as a service): <core_client_version>7.6.9</core_client_version> <![CDATA[ <message> The system cannot find the path specified. (0x3) - exit code 3 (0x3) </message> <stderr_txt> setiathome_v8 7.99 DevC++/MinGW/g++ 4.8.1 libboinc: 7.7.0 Results from fellow cruncher: <core_client_version>7.6.18</core_client_version> <![CDATA[ <stderr_txt> setiathome_v8 7.99 DevC++/MinGW/g++ 4.8.1 libboinc: 7.7.0 |
Joined: 1 May 07 Posts: 556 Credit: 6,470,846 RAC: 0 |
Received a batch of 8bit WUs; when they start processing depends on the host. Off to a model rail ex for the afternoon. EDIT: six failed, four running at 63%, 26%, 24% and 3%. |
Joined: 1 May 07 Posts: 556 Credit: 6,470,846 RAC: 0 |
Just curious, I've been watching the graphics. Two different WUs: power 500 and 1012; duration 96583 and 222298; score 1.03 and 1.05. |
Joined: 18 Aug 05 Posts: 2423 Credit: 15,878,738 RAC: 0 |
New WUs have been split. When can we expect the app binaries to be updated? |
Joined: 11 Dec 14 Posts: 96 Credit: 1,240,941 RAC: 0 |
I'm seeing a vastly underestimated run time for the new batch of tasks. The initial estimates on mine were also only about a quarter of what they probably should have been, and on one machine they're not readjusting as the tasks progress. That machine is still on BOINC 7.2.33. However, on another machine running BOINC 7.6.6, the remaining estimated times seemed to get recalculated by the time the progress reached about 30%, or perhaps even earlier than that. Now the remaining times look very realistic. |
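
The Hz-versus-MHz explanation given earlier in the thread can be sketched numerically. The Python below is only an illustration under stated assumptions, not the SETI@home client code: the thread says only that the beam width is rescaled by a ratio of the subband and receiver center frequencies, and that a too-large beam width yields a too-small chirp step. The function names, the inverse-proportional chirp step, the constant `k`, and the `max_drift` range are all hypothetical; the 0.05 beam width and 1420 MHz center frequency are taken from the Arecibo header quoted above.

```python
HZ_PER_MHZ = 1_000_000


def rescaled_beam_width(receiver_beam_width, receiver_center_freq, subband_center_freq):
    """Scale the receiver beam width by the ratio of the two center frequencies.

    Both frequencies must be in the same unit for the ratio to be meaningful.
    """
    return receiver_beam_width * receiver_center_freq / subband_center_freq


def chirp_step(beam_width, k=1.0):
    """Hypothetical model: the chirp step shrinks as the beam width grows."""
    return k / beam_width


def drift_table_size(max_drift, step):
    """Number of drift rates from -max_drift to +max_drift in increments of step."""
    return int(round(2 * max_drift / step)) + 1


def table_size_for(receiver_center_freq, subband_center_freq_mhz=1420.0,
                   receiver_beam_width=0.05, max_drift=100.0):
    """Drift-table size for a given receiver frequency (assumed illustrative values)."""
    width = rescaled_beam_width(receiver_beam_width, receiver_center_freq,
                                subband_center_freq_mhz)
    return drift_table_size(max_drift, chirp_step(width))


if __name__ == "__main__":
    good = table_size_for(1420.0)               # receiver frequency in MHz
    bad = table_size_for(1420.0 * HZ_PER_MHZ)   # same frequency, mistakenly in Hz
    print(f"table entries with MHz: {good}")
    print(f"table entries with Hz:  {bad}")
    print(f"growth factor: {bad / good:.3g}")
```

Under these assumed numbers the drift table grows by roughly the same factor as the unit error, which is consistent with the memory exhaustion reported in the thread. Expressing the receiver frequency in MHz before taking the ratio brings the table back to a small size.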