Multiple computers setup?

Questions and Answers : Unix/Linux : Multiple computers setup?

ScottieD369

Send message
Joined: 8 Jun 15
Posts: 26
Credit: 264,597
RAC: 2
United States
Message 2031261 - Posted: 7 Feb 2020, 20:23:42 UTC
Last modified: 7 Feb 2020, 20:36:42 UTC

No, I didn't clone the memory card. I did a full step-by-step manual install, one by one, on all the Pis! Yes, tiresome.
The RPis are connected to a Cisco Catalyst 2960-C Series PoE switch, which is connected to my ASUS RT-AC5300 router.

PS Rob...... PLUTO IS A PLANET!!!!!!!!!!!
ID: 2031261 · Report as offensive     Reply Quote
rob smith · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer moderator
Volunteer tester

Send message
Joined: 7 Mar 03
Posts: 22241
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2031269 - Posted: 7 Feb 2020, 20:55:12 UTC

Are you using DHCP to assign the (internal) IP addresses, or have you fixed the addresses on the RPIs?
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2031269 · Report as offensive     Reply Quote
ScottieD369

Send message
Joined: 8 Jun 15
Posts: 26
Credit: 264,597
RAC: 2
United States
Message 2031273 - Posted: 7 Feb 2020, 21:02:45 UTC - in response to Message 2031269.  

I am using DHCP. I have thought about using static addresses.
ID: 2031273 · Report as offensive     Reply Quote
rob smith · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer moderator
Volunteer tester

Send message
Joined: 7 Mar 03
Posts: 22241
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2031288 - Posted: 7 Feb 2020, 21:41:54 UTC

A question for Richard - How does the server side of BOINC determine that a pair of new computers are actually not the same one re-trying?
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2031288 · Report as offensive     Reply Quote
ScottieD369

Send message
Joined: 8 Jun 15
Posts: 26
Credit: 264,597
RAC: 2
United States
Message 2031521 - Posted: 9 Feb 2020, 0:38:12 UTC

I guess we are leaving this to be unsolved?
HACK THE PLANET!!!!!!!!!!
ID: 2031521 · Report as offensive     Reply Quote
rob smith · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer moderator
Volunteer tester

Send message
Joined: 7 Mar 03
Posts: 22241
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2031546 - Posted: 9 Feb 2020, 8:29:51 UTC

Richard is somewhat preoccupied just now, sorting out some problems with applications not behaving properly, so it might be some time before he gets to this one.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2031546 · Report as offensive     Reply Quote
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14655
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2031548 - Posted: 9 Feb 2020, 8:53:49 UTC - in response to Message 2031546.  

Actually, I'm in UTC timezone so at the time you posted, I was simply asleep in bed. And on the way to bed, I realised what I'd done wrong (or failed to do) on the other project, so I may be able to finish that one off quickly today.

I have seen the decision-making code for detecting duplicate HostIDs, but it was some time ago and I've forgotten exactly where. Shouldn't take too long to find.
ID: 2031548 · Report as offensive     Reply Quote
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14655
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2031608 - Posted: 9 Feb 2020, 18:17:18 UTC - in response to Message 2031288.  

A question for Richard - How does the server side of BOINC determine that a pair of new computers are actually not the same one re-trying?
OK, here goes.

The routines which handle this are all in https://github.com/BOINC/boinc/blob/master/sched/handle_request.cpp

The normal starting point is at line 237. The problem of multiple computers sharing one HostID number is handled by the RPC sequence number: both the server and the client independently count the number of times the computer has contacted the server. If another computer has contacted the server using the same ID number, this computer won't know about it, and will be lagging behind. That should trigger the issuing of a new ID number for this machine, transmitted with the rest of the reply.
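
As a rough illustration of the sequence-number idea described above, here is a minimal Python sketch. It is not the BOINC code: the `Scheduler` and `HostRecord` names, the in-memory dictionary, and the simple "client count is behind the server count" rule are assumptions made for the example. The real decision-making lives in handle_request.cpp.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class HostRecord:
    """Server-side view of one host (hypothetical, for illustration)."""
    host_id: int
    rpc_seqno: int = 0


class Scheduler:
    """Toy scheduler that hands out HostIDs and tracks RPC counts."""

    def __init__(self):
        self.hosts = {}
        self._next_id = count(1)

    def _new_host(self, seqno):
        rec = HostRecord(next(self._next_id), seqno)
        self.hosts[rec.host_id] = rec
        return rec

    def handle_request(self, host_id, client_seqno):
        """Return the HostID the client should use from now on."""
        rec = self.hosts.get(host_id)
        # Unknown ID, or a client whose count lags behind the server's:
        # assume a different machine is sharing the ID and issue a new one.
        if rec is None or client_seqno < rec.rpc_seqno:
            rec = self._new_host(client_seqno)
        else:
            rec.rpc_seqno = client_seqno
        return rec.host_id


if __name__ == "__main__":
    s = Scheduler()
    a = s.handle_request(0, 1)   # machine A registers
    s.handle_request(a, 2)       # machine A keeps contacting the server
    s.handle_request(a, 3)
    b = s.handle_request(a, 1)   # machine B reuses A's ID but is behind
    print(a, b)
```

In this toy model, machine B is given its own ID as soon as its count is seen to be behind the count the server holds for the shared ID.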

The other routines before line 237 handle various ways of getting things back in sync if ID numbers go AWOL. To check if two computers are 'obviously different' (line 69), the things considered are:

* number of CPUs
* vendor
* model
* os_name
* os_version
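
The comparison in the list above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `HostInfo` class, its field names, and the sample values are made up for the example, and the actual check is the C++ routine in handle_request.cpp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostInfo:
    """The five attributes listed above (field names are hypothetical)."""
    ncpus: int
    vendor: str
    model: str
    os_name: str
    os_version: str


def obviously_different(a: HostInfo, b: HostInfo) -> bool:
    """True if any of the five listed attributes differs."""
    return (
        a.ncpus != b.ncpus
        or a.vendor != b.vendor
        or a.model != b.model
        or a.os_name != b.os_name
        or a.os_version != b.os_version
    )


if __name__ == "__main__":
    # Sample values are invented for the demonstration.
    pi_1 = HostInfo(4, "ARM", "Example Pi board", "Linux", "example-version")
    pi_2 = HostInfo(4, "ARM", "Example Pi board", "Linux", "example-version")
    desktop = HostInfo(8, "GenuineIntel", "Example desktop CPU", "Linux", "example-version")
    print(obviously_different(pi_1, pi_2))     # identical boards
    print(obviously_different(pi_1, desktop))  # different hardware
```

One consequence of a check like this is that several identical boards installed the same way would report the same five values, so they would not count as 'obviously different' on these fields alone.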

The other thing which might be important is the host.domain_name, the name by which a computer is known on your local network. I'm most familiar with Windows networks, which have a rule that every computer must have a different name. That's a legacy from the days before TCP/IP and addressing by IP. I don't know what the rule is for multiple Pis running Linux.
ID: 2031608 · Report as offensive     Reply Quote
rob smith · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer moderator
Volunteer tester

Send message
Joined: 7 Mar 03
Posts: 22241
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2031614 - Posted: 9 Feb 2020, 18:47:17 UTC

Computer names are much less important these days. There are two identifiers of importance: the MAC address and the IP address. The MAC address is supposed to be an absolute identifier, but these days it probably isn't; the IP address has to be unique within a network. In most of our domestic situations we use our router (WiFi or hard-wired) in DHCP mode; crudely, this assigns each attached device its IP address within the domestic network. Using NAT, the router translates between internal and external addresses so that messages get to the right destination. But things can get messy when the "lease" on an IP address held by a device expires and the device gets a new (internal) address. I think you can see where this is going: if the lease time is too short for a very slow device, I suppose it would be possible for two different devices to have the same address.
But this may be trapped by the server taking notice of how often a host has connected to it. RPis are very slow responders, and it may be possible for a pair (or more) of them to neatly drop into the cycle of the other, but I somehow think that would be an exceedingly rare event.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2031614 · Report as offensive     Reply Quote
Keith Myers Special Project $250 donor
Volunteer tester

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2031620 - Posted: 9 Feb 2020, 19:21:15 UTC

Linux doesn't consider two computers named nano and Nano to be the same computer. I don't know whether the BOINC code distinguishes capitalization or not. Interesting thread, and the look at the code module Richard linked provided some insight into the error message Buckeye4LF was getting on his new host. So I learned something new today.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2031620 · Report as offensive     Reply Quote
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14655
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2031643 - Posted: 9 Feb 2020, 20:24:04 UTC - in response to Message 2031614.  

the IP address has to be unique within a network. In most of our domestic situations we use our router (WiFi or hard wire) in DHCP mode, crudely this assigns each attached device its IP address within the domestic network.
Strictly speaking, the IP address belongs to the network interface. I have notebooks which I sometimes attach to the router by cable, and sometimes by WiFi. Different IP addresses are assigned to each connection.
ID: 2031643 · Report as offensive     Reply Quote
ScottieD369

Send message
Joined: 8 Jun 15
Posts: 26
Credit: 264,597
RAC: 2
United States
Message 2031675 - Posted: 9 Feb 2020, 21:57:28 UTC

I am still not considering this issue "solved"; it's just a bunch of guessing at this point?
HACK THE PLANET!!!!!!!!!!
ID: 2031675 · Report as offensive     Reply Quote
Ian&Steve C.

Send message
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2031681 - Posted: 9 Feb 2020, 22:19:12 UTC - in response to Message 2031675.  

I am still not considering this issue "solved"; it's just a bunch of guessing at this point?


I don't think you ever definitively described the exact setup of your Pis. you called it a "cluster" but is it actually running as a cluster? or are they simply mounted physically close to each other and sharing common cooling? If it's actually set up as a cluster and you have one instance of BOINC running on some middleman system distributing work to the Pis, then that might explain what's happening.

but if it's not and you have 7 different instances of BOINC running on 7 different systems, you could try some real cowboy stuff to force the servers to treat them as new hosts and issue a new hostID.

- close BOINC
- delete your client_state_prev.xml file (delete the backup)
- open your client_state.xml file
- edit the alphanumeric in the host_cpid field. pick one or more characters and just change them to something random. they look like hex to me, so stick with 0-9, a-f to be safe.

but beware, you are likely to ghost work this way. I do not recommend doing this; it is only at your own risk. maybe finish out your current batch of tasks by setting NNT (No New Tasks), then detach from the project before doing the above sequence.

this host_cpid is generated based on your hardware. if you force the value to be different, the servers will see it as a different system and issue a different hostID.
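
For anyone who would rather script the edit than do it by hand, here is a minimal Python sketch of the host_cpid change described above. It rests on assumptions: that the value sits between `<host_cpid>` and `</host_cpid>` tags in the text of client_state.xml and is made of hex characters, as it looks above. The function name `tweak_host_cpid` is made up, and the sketch only transforms text already loaded into a string. Reading and writing the real file (with BOINC closed, and at your own risk) is left out on purpose.

```python
import re
import secrets

HEX_DIGITS = "0123456789abcdef"

# Assumed layout: <host_cpid>HEXSTRING</host_cpid> somewhere in the file text.
_CPID_RE = re.compile(r"<host_cpid>([0-9a-fA-F]+)</host_cpid>")


def tweak_host_cpid(state_xml: str) -> str:
    """Return state_xml with exactly one hex character of host_cpid changed."""
    match = _CPID_RE.search(state_xml)
    if match is None:
        raise ValueError("no <host_cpid> element found")
    old = match.group(1)
    pos = secrets.randbelow(len(old))
    # Pick a replacement that is guaranteed to differ from the old character.
    candidates = [c for c in HEX_DIGITS if c != old[pos].lower()]
    new = old[:pos] + secrets.choice(candidates) + old[pos + 1:]
    return state_xml[:match.start(1)] + new + state_xml[match.end(1):]


if __name__ == "__main__":
    # The value below is a placeholder, not a real host_cpid.
    sample = "<host_cpid>0123456789abcdef0123456789abcdef</host_cpid>"
    print(tweak_host_cpid(sample))
```

Changing a single character keeps the length and character set intact, which matches the "stick with 0-9, a-f" advice.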
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2031681 · Report as offensive     Reply Quote
ScottieD369

Send message
Joined: 8 Jun 15
Posts: 26
Credit: 264,597
RAC: 2
United States
Message 2031682 - Posted: 9 Feb 2020, 22:22:20 UTC - in response to Message 2031681.  

I am pretty sure I have stated that I have 7 separate instances: 7 Pis, each running its own BOINC.
HACK THE PLANET!!!!!!!!!!
ID: 2031682 · Report as offensive     Reply Quote
Ian&Steve C.

Send message
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2031683 - Posted: 9 Feb 2020, 22:26:58 UTC - in response to Message 2031682.  

ok, so not a cluster then, got it.

you could give the sequence i mentioned a try. be mindful of the impacts.

I would NNT all your Pi's, finish the work, detach from the project. re-attach and then check the host_cpid values. if several of them have the same value, then change them as I described.
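
The "check the host_cpid values" step can be sketched in Python like this. It is an illustration only: it assumes you have already copied the text of each Pi's client_state.xml into a dictionary keyed by a name of your choosing, and that the value sits between `<host_cpid>` tags. The function names are made up for the example.

```python
import re
from collections import defaultdict


def extract_host_cpid(state_xml):
    """Return the host_cpid string from the file text, or None if absent."""
    match = re.search(r"<host_cpid>\s*([^<\s]+)\s*</host_cpid>", state_xml)
    return match.group(1) if match else None


def duplicate_cpids(states):
    """Map each host_cpid shared by two or more machines to their names."""
    groups = defaultdict(list)
    for name, state_xml in states.items():
        cpid = extract_host_cpid(state_xml)
        if cpid is not None:
            groups[cpid].append(name)
    return {cpid: sorted(names) for cpid, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    # Placeholder contents standing in for each Pi's client_state.xml text.
    states = {
        "pi1": "<host_cpid>aaaa1111</host_cpid>",
        "pi2": "<host_cpid>aaaa1111</host_cpid>",
        "pi3": "<host_cpid>bbbb2222</host_cpid>",
    }
    print(duplicate_cpids(states))
```

An empty result would suggest the Pis already have distinct host_cpid values, in which case editing them is probably not the fix.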
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2031683 · Report as offensive     Reply Quote
ScottieD369

Send message
Joined: 8 Jun 15
Posts: 26
Credit: 264,597
RAC: 2
United States
Message 2031941 - Posted: 11 Feb 2020, 5:28:45 UTC - in response to Message 2031683.  

I almost kind of hate to do anything right now; my numbers keep growing every day.
HACK THE PLANET!!!!!!!!!!
ID: 2031941 · Report as offensive     Reply Quote
ScottieD369

Send message
Joined: 8 Jun 15
Posts: 26
Credit: 264,597
RAC: 2
United States
Message 2032004 - Posted: 12 Feb 2020, 0:08:55 UTC
Last modified: 12 Feb 2020, 0:09:44 UTC

My first attempt is to assign static IPs to all the Pis.
Currently, from the router's view, the IPs come and go when they access the network. (I wonder if it's because of a switch setting?)
And yes, I do see all 1-7 addresses assigned.
Also, yes, they all have working internet access (the web browser works).
I guess I now wait a few days to see any change?
HACK THE PLANET!!!!!!!!!!
ID: 2032004 · Report as offensive     Reply Quote
rob smith · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer moderator
Volunteer tester

Send message
Joined: 7 Mar 03
Posts: 22241
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2032056 - Posted: 12 Feb 2020, 8:27:54 UTC

On most routers there is a "lease time" setting which lets you say how long an address is allocated to a device. It sounds as if your router has a very short lease time. I run mine at 24 hours, which tends to mean devices effectively have fixed addresses.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2032056 · Report as offensive     Reply Quote
ScottieD369

Send message
Joined: 8 Jun 15
Posts: 26
Credit: 264,597
RAC: 2
United States
Message 2032057 - Posted: 12 Feb 2020, 8:34:52 UTC - in response to Message 2032056.  

Well Rob, as I stated, the Pis are running on a PoE switch, so I think it is because of the switch? My other devices plugged into the router are my laptop and Chromecast, and neither of them does that. Neither does my phone, which connects wirelessly.
HACK THE PLANET!!!!!!!!!!
ID: 2032057 · Report as offensive     Reply Quote
rob smith · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer moderator
Volunteer tester

Send message
Joined: 7 Mar 03
Posts: 22241
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2032060 - Posted: 12 Feb 2020, 9:22:34 UTC

The PoE switch shouldn't affect the IP address, which should be assigned by the router. But it may be that the switch acts as a router when doing PoE.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2032060 · Report as offensive     Reply Quote
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.