[arm-allstar] Repeater key down lockup question
Doug Crompton
doug at crompton.com
Mon Jul 25 10:35:44 EST 2016
Tom,
It is a little hard to diagnose something like this remotely but here are a couple of ideas.
When a large group of nodes is connected together, any one of them can hang the system if it issues an RXKEY and then never issues an RXUNKEY. So it is very important to monitor the logs for these kinds of problems. I run a hub here and I have written scripts to monitor the logs. A snippet of the log looks like this:
07/25/2016 10:54:13 RXKEY 40865 WD9EQD 147.435 Simplex 88.5 Bill - Smithville, NJ
07/25/2016 10:54:16 RXUNK 40865 WD9EQD 147.435 Simplex 88.5 Bill - Smithville, NJ
07/25/2016 10:54:20 RXKEY 40879 WA3DSP 223.16/76 PL 131.8 Abington, PA
07/25/2016 10:54:24 RXUNK 40879 WA3DSP 223.16/76 PL 131.8 Abington, PA
07/25/2016 10:54:27 RXKEY 40865 WD9EQD 147.435 Simplex 88.5 Bill - Smithville, NJ
07/25/2016 10:54:32 RXUNK 40865 WD9EQD 147.435 Simplex 88.5 Bill - Smithville, NJ
07/25/2016 10:54:35 RXKEY 40879 WA3DSP 223.16/76 PL 131.8 Abington, PA
07/25/2016 10:54:47 RXUNK 40879 WA3DSP 223.16/76 PL 131.8 Abington, PA
As you can see, every RXKEY has an associated RXUNKEY (logged here as RXUNK). The only thing you can do when a hang originates from outside is to disconnect the offending node.
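The pairing check described above can be sketched in Python. This is only an illustration and not the hub's actual scripts: the function names, the 180-second threshold, and the assumption that every line follows the "date time EVENT node description" layout of the snippet are all hypothetical. The sample data is the snippet above with the final RXUNK line deliberately left out to simulate a hang.

```python
from datetime import datetime

def find_unmatched_keys(lines):
    """Return {node: (key_time, description)} for each RXKEY with no later RXUNK."""
    open_keys = {}
    for line in lines:
        parts = line.split(None, 4)
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        date_s, time_s, event, node = parts[:4]
        desc = parts[4].strip() if len(parts) > 4 else ""
        try:
            stamp = datetime.strptime(f"{date_s} {time_s}", "%m/%d/%Y %H:%M:%S")
        except ValueError:
            continue  # not a timestamped log line
        if event == "RXKEY":
            # Keep the earliest unanswered key-up time for this node.
            open_keys.setdefault(node, (stamp, desc))
        elif event == "RXUNK":
            open_keys.pop(node, None)
    return open_keys

def stuck_nodes(lines, now, max_seconds=180):
    """List nodes keyed for longer than max_seconds (an arbitrary example limit)."""
    return [node for node, (stamp, _) in find_unmatched_keys(lines).items()
            if (now - stamp).total_seconds() > max_seconds]

# Snippet from the post, minus the last RXUNK, to simulate a hung node.
SAMPLE = """
07/25/2016 10:54:13 RXKEY 40865 WD9EQD 147.435 Simplex 88.5 Bill - Smithville, NJ
07/25/2016 10:54:16 RXUNK 40865 WD9EQD 147.435 Simplex 88.5 Bill - Smithville, NJ
07/25/2016 10:54:20 RXKEY 40879 WA3DSP 223.16/76 PL 131.8 Abington, PA
07/25/2016 10:54:24 RXUNK 40879 WA3DSP 223.16/76 PL 131.8 Abington, PA
07/25/2016 10:54:27 RXKEY 40865 WD9EQD 147.435 Simplex 88.5 Bill - Smithville, NJ
07/25/2016 10:54:32 RXUNK 40865 WD9EQD 147.435 Simplex 88.5 Bill - Smithville, NJ
07/25/2016 10:54:35 RXKEY 40879 WA3DSP 223.16/76 PL 131.8 Abington, PA
"""

if __name__ == "__main__":
    for node, (stamp, desc) in find_unmatched_keys(SAMPLE.splitlines()).items():
        print(f"node {node} keyed at {stamp} with no unkey: {desc}")
```

A script like this could be run periodically against the log, passing the current time to stuck_nodes, so that a node that keys and never unkeys is flagged before it ties up the whole group.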
In your situation it is not clear whether something hung in your system or elsewhere. If this happens again, you could pull the USB cable that runs from the offending sound FOB to the Pi. This connection can be hot plugged, so pull it out and then plug it back in. It should restart with the green light blinking.
I guess another question is what FOB are you using? A DMK-URI, modified FOB, etc.?
Monitoring the Asterisk log on the client would also be helpful. Run 'asterisk -rvvv' in an ssh window to the Pi.
It is not beyond possibility that Echolink is causing the problem. People on Echolink often forget to unkey when they have a key set up to toggle TX. I always run Echolink on a separate Allstar node so it can easily be disconnected and isolated from the rest of the nodes. To do this, you would set up another Allstar node on your server in pseudo mode, not connected to any radio. Configure Echolink to use this Allstar node number. Then connect this node to any other node you want to offer Echolink to. If there is a problem on Echolink, just disconnect that node and it is isolated.
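As a rough sketch of that last step, the disconnect could be scripted. This assumes your rpt.conf keeps the stock function map in which *1 followed by a node number disconnects that link, and that the app_rpt "rpt fun" CLI command is available in your build; check both before relying on it. The node numbers 1998 and 1999 and the function names are hypothetical placeholders.

```python
import subprocess

def disconnect_command(local_node, echolink_node, disconnect_prefix="*1"):
    """Build the argv that asks Asterisk to drop the link to echolink_node.

    disconnect_prefix="*1" is an assumption based on the stock rpt.conf
    function map; change it if your configuration differs.
    """
    for n in (local_node, echolink_node):
        if not str(n).isdigit():
            raise ValueError(f"node number must be numeric: {n!r}")
    dtmf = f"{disconnect_prefix}{echolink_node}"
    return ["asterisk", "-rx", f"rpt fun {local_node} {dtmf}"]

def disconnect(local_node, echolink_node, dry_run=True):
    """Return the command; only execute it when dry_run is False."""
    argv = disconnect_command(local_node, echolink_node)
    if not dry_run:
        subprocess.run(argv, check=True)  # must run on the Allstar server itself
    return argv

if __name__ == "__main__":
    # Hypothetical numbers: 1998 = radio node, 1999 = pseudo node hosting Echolink.
    print(" ".join(disconnect(1998, 1999)))
```

Keeping dry_run as the default means the script only prints the command until you have confirmed it matches your own configuration.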
Also, RF and power spikes can cause things like this. Make sure you use good practices in that regard.
Finally, V1.0 is very old now, and while it is considered a stable release there have been many improvements in the current 1.02 image. We are also going to have another interim release shortly as we work up to our V2.0. If you continue to have problems you might try an update. Always keep your old image, though, in case you want to go back to it.
73 Doug
WA3DSP
http://www.crompton.com/hamradio
> To: arm-allstar at hamvoip.org
> Date: Mon, 25 Jul 2016 09:43:26 -0500
> Subject: [arm-allstar] Repeater key down lockup question
> From: arm-allstar at hamvoip.org
> CC: tomw at ecpi.com
>
> We had a lockup occur during an ARES net last night that has me scratching my
> head. We are running two arm-allstar repeaters here with a permanent link
> and using version 1.0 of the software. One has been up for almost 8 months
> and the other for two months with an additional 5 months in test mode.
> These are actively used repeaters logging typically 30 hours of transmit
> time a month and they have been very solid.
>
> So what I am about to report may be a fluke??? Last night's lockup was the
> second one we have experienced - both have been on the machine that also
> hosts Echolink and both at times with multiple Echolink users logged on. I
> point that out because mostly we have no Echolink connections and then typically
> only one while there were 5 logged in last night when the lockup occurred...
> Not sure if that has anything to do with it but seems like an interesting
> coincidence at least.
>
> When the lock up occurred, a station was keyed up and talking and then the
> audio disappeared - the station talking could still be heard on reverse.
> When that station unkeyed, we did not hear any courtesy beep or sound of any
> kind and the repeater remained keyed.
>
> I was able to log into the controller and initiated an astres.sh script but
> nothing happened. Fortunately, I have a remote outlet controller and
> cycling power on the controller cleared the condition and the repeater
> booted normally.
>
> Several questions as a result of this:
>
> (1) I have been just letting these systems run without any periodic restarts
> or reboots of the RP2 controllers. The last restart of Asterisk was 31
> days before this happened and no telling how long the RP2 had been up.
> Should I be periodically restarting things for good measure?
>
> (2) I'm not an experienced Linux user - mostly just using what I have learned
> on this project. In Windows, I know where to look at event logs and task
> manager to learn things about the state of things. Is there an equivalent
> set of things I should be checking on the Pi?
>
> (3) Have you seen this sort of thing happen?
>
> (4) Do you think there is a correlation with added loading from Echolink?
> Should I consider deploying an additional RP on a separate Allstar node to
> handle the Echolink connections?
>
> Tom N5TW