The Database

Author	Message
cRunchy Volunteer moderator Joined: 3 Apr 99 Posts: 3555 Credit: 1,920,030 RAC: 3	Message 2038754 - Posted: 18 Mar 2020, 19:28:41 UTC There are a number of threads around the 'hibernation' of SETI@Home, so I would rather not get into feelings here. I would, however, like to understand the idea of the end product of our WUs as a database and how the next step works. It has been suggested that, for example, the database cannot be divided into parts and processed or tested against algorithms (models), i.e. it cannot be sent out to us to process as units to work on. I would like to understand this better in layperson's terms. Can someone with knowledge help us understand 'the database', how it is structured with our WUs, and how models (e.g. 'that sounds like ET' and 'that is ET speaking to us') might be applied? Other interesting questions are welcome.

rob smith Volunteer moderator Volunteer tester Joined: 7 Mar 03 Posts: 22732 Credit: 416,307,556 RAC: 380	Message 2038760 - Posted: 18 Mar 2020, 20:04:57 UTC The work we did was to filter out the "dross", leaving potential signals. These potential signals were added to the big database with the intent of looking for repeating signals: those which came from the same place in the sky, at the same frequency and with the same characteristics over a period of time. Initially it was thought that this would be possible in near-real-time, but a few things conspired against that approach, including the size of the database, the rate at which data was being added, and the fact that the available hardware was a long way short of the task. Years went by and the database grew even bigger, I'd guess by about an order of magnitude, maybe more. A small sample of the database was selected for a trial using more modern techniques and a supercomputer. This approach is called "Nebula", led by David Anderson and supported by Eric Korpela. After about three years this is starting to show promise. A few issues with the way the data was stored in the database have come to light which are making the job harder than it could be, but these are surmountable. One thing that has become apparent is that the rate at which the database is growing would make it very difficult to do all the correlations required and not miss any "new" data. Among the correlations required are location (fairly obvious), frequency, red-shift (despite the fact we are looking at GHz frequencies, not light, this is still an appropriate name), and Doppler shift (slightly different to red-shift, but sort of related). Because there was no attempt to sort the data at the time it was added to the database, you more or less have to look at every other signal for each signal (or group of signals) in turn. Nebula's first task is to do that sort, then try the various correlation tools to see what ties together, on billions of signal combinations!
That's a very rough description of the why and what. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe?
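To illustrate why sorting first matters, here is a minimal Python sketch comparing the number of comparisons in an unsorted all-pairs search with the number left after grouping detections into coarse sky and frequency bins. The detection values, the tuple layout and the bin sizes (`cell_deg`, `freq_bin_hz`) are made up for illustration; this is not Nebula's actual method or schema.

```python
from collections import defaultdict

# Hypothetical detections as (ra_deg, dec_deg, freq_hz); values are invented.
detections = [
    (10.01, 20.02, 1_420_000_010.0),
    (150.50, -5.25, 1_419_500_000.0),
    (10.02, 20.01, 1_420_000_030.0),
    (150.52, -5.26, 1_421_000_000.0),
    (10.03, 20.03, 1_420_000_020.0),
]

def bucket_key(ra_deg, dec_deg, freq_hz, cell_deg=0.1, freq_bin_hz=100.0):
    """Coarse (sky cell, frequency bin) key; the bin sizes are arbitrary assumptions."""
    return (
        round(ra_deg / cell_deg),
        round(dec_deg / cell_deg),
        round(freq_hz / freq_bin_hz),
    )

# Unsorted data: every detection must be compared with every other one.
n = len(detections)
all_pairs = n * (n - 1) // 2

# Grouped data: only detections sharing a bucket need to be compared.
buckets = defaultdict(list)
for det in detections:
    buckets[bucket_key(*det)].append(det)

bucketed_pairs = sum(len(b) * (len(b) - 1) // 2 for b in buckets.values())

print(all_pairs, bucketed_pairs)
```

With these five invented detections the all-pairs count is 10, while grouping leaves 3 comparisons. The gap widens quadratically as the number of detections grows. A real search would also have to check neighbouring bins, since two matching signals can fall either side of a bin edge.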

William Rothamel Joined: 25 Oct 06 Posts: 3756 Credit: 1,999,735 RAC: 4	Message 2038818 - Posted: 19 Mar 2020, 1:03:31 UTC - in response to Message 2038760. Statistical analysis should have rejected the vast majority of the data. I still claim that with graphics chips screening the data coming in off the antenna this could have been accomplished in near real time.

Gary Charpentier Volunteer tester Joined: 25 Dec 00 Posts: 31204 Credit: 53,134,872 RAC: 32	Message 2038820 - Posted: 19 Mar 2020, 1:14:48 UTC - in response to Message 2038818. "Statistical analysis should have rejected the vast majority of the data. I still claim that with graphic chips screening the data coming in off of the antenna this could have been accomplished in near real time." GPUs didn't exist 21 years ago when collection started. SERENDIP is doing it in real time across a much broader frequency range, but it is nearly deaf compared to the deep look we give.

cRunchy Volunteer moderator Joined: 3 Apr 99 Posts: 3555 Credit: 1,920,030 RAC: 3	Message 2038843 - Posted: 19 Mar 2020, 2:39:14 UTC - in response to Message 2038820. GPUs of sorts did exist, as they are just dedicated CPUs or maths processors. TV cards with graphics processors were around. Analog processing, which we try so hard to emulate with digital and AI today, was certainly around. Honeywell (a name we hear little of today) is touting its breakthrough in quantum processing. Maybe a nod from Berkeley might get the SETI@Home DB processing posed as a good test project. I'm still unsure as to why we can't break up the database and ship out WUs to run algorithms against. Breakthrough offers out parts of its database and some help for developers to join in, so it must be possible somehow. I guess things, perspectives and technologies have changed over the years but (in theory) the data in the database has not. We just have to apply our new tools to these old chunks of data. (I assume the data in the database relates to chunks of the sky scanned and has not been fouled in some way.)

rob smith Volunteer moderator Volunteer tester Joined: 7 Mar 03 Posts: 22732 Credit: 416,307,556 RAC: 380	Message 2039072 - Posted: 20 Mar 2020, 6:57:14 UTC - in response to Message 2038818. Highly unlikely. Think how long it takes a modern GPU to process a few seconds' worth of data, which is actually only a narrow frequency sample from the live data stream from a single channel of a single multi-channel receiver. My best guess is that it is about 30 times too slow in the time domain and covers somewhat less than 1% of the frequency domain. NitPicker would certainly be a lot faster today than it was back when first tried, as the hardware available then was so much slower, and it may have just about worked on the filtered data stream back then, but what it lacked was the historic data, which took years to build up. Now it's just a question of sifting through and correlating data. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe?

William Rothamel Joined: 25 Oct 06 Posts: 3756 Credit: 1,999,735 RAC: 4	Message 2039111 - Posted: 20 Mar 2020, 12:22:01 UTC - in response to Message 2039072. So what's in this so-called database? I presume that these were work units that, on 3 computations of the same unit, repeatedly showed a strong signal that was above the noise. I don't know whether any other criteria were applied to decide to store it. Were they compared to a "Clutter Map" of known emissions at certain frequencies and locations? Were they cross-correlated with any type of square or sawtooth wave? Was there any search for simple modulation: on-off, frequency or analog? How is the Allen array handling their data?

rob smith Volunteer moderator Volunteer tester Joined: 7 Mar 03 Posts: 22732 Credit: 416,307,556 RAC: 380	Message 2039751 - Posted: 22 Mar 2020, 20:33:43 UTC "So called database" - a few years ago it was one of the largest non-commercial databases in the world, and bigger than those of most banks! The database holds millions, if not billions, of the results culled by us from the data we ploughed through over the last twenty years. Each entry comprises, among other things, the date, time, location, frequency, signal type and signal strength, but the entries are in a somewhat random order. What has to be done is to sort them by location and frequency, correcting for "red-shift" and a few more bits, to see if there are "pairs" of data that have been collected at different times. These times need to differ by months or years. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe?
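The sort-and-pair step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Detection` record, its field names, the bin sizes, the 90-day minimum gap and the sample values are all hypothetical rather than the real SETI@home schema, and the "red-shift" correction is left out entirely.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import groupby

@dataclass(frozen=True)
class Detection:
    """Hypothetical record holding the fields named in the post above."""
    observed_at: datetime   # date and time
    ra_deg: float           # sky location: right ascension
    dec_deg: float          # sky location: declination
    freq_hz: float          # frequency
    signal_type: str        # signal type
    power: float            # signal strength

def sort_key(d, cell_deg=0.1, freq_bin_hz=100.0):
    """Coarse location/frequency/type key; the bin sizes are arbitrary assumptions."""
    return (
        round(d.ra_deg / cell_deg),
        round(d.dec_deg / cell_deg),
        round(d.freq_hz / freq_bin_hz),
        d.signal_type,
    )

def repeat_pairs(detections, min_gap=timedelta(days=90)):
    """Sort by location and frequency, then pair up entries seen far apart in time."""
    ordered = sorted(detections, key=sort_key)
    pairs = []
    for _, group in groupby(ordered, key=sort_key):
        members = sorted(group, key=lambda d: d.observed_at)
        for i, first in enumerate(members):
            for second in members[i + 1:]:
                if second.observed_at - first.observed_at >= min_gap:
                    pairs.append((first, second))
    return pairs

# Invented sample data: three detections near one spot, one elsewhere.
sample = [
    Detection(datetime(2005, 3, 1), 10.01, 20.02, 1_420_000_010.0, "pulse", 24.0),
    Detection(datetime(2005, 3, 2), 10.01, 20.02, 1_420_000_012.0, "pulse", 22.5),
    Detection(datetime(2009, 7, 15), 10.02, 20.01, 1_420_000_015.0, "pulse", 23.1),
    Detection(datetime(2012, 1, 9), 200.0, -3.0, 1_419_000_000.0, "gaussian", 30.0),
]

pairs = repeat_pairs(sample)
for first, second in pairs:
    print(first.observed_at.date(), second.observed_at.date())
```

Here the two 2005 detections are only a day apart, so they do not count as a repeat of each other, but each of them pairs with the 2009 detection. Because the work is done per group after the sort, each group could in principle be handled independently, although whether the real database can be split that way is exactly the open question in this thread.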

rob smith Volunteer moderator Volunteer tester Joined: 7 Mar 03 Posts: 22732 Credit: 416,307,556 RAC: 380	Message 2039752 - Posted: 22 Mar 2020, 20:37:40 UTC The Allen telescope, like Arecibo, does a lot of different sorts of analysis apart from the bit we've been doing. SETI@Home has only ever worked with a very small part of the data collected; this is down to design decisions made during the very early development days. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe?

©2025 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.