
I was recently expanding my analog voice empire and noticed my Cisco ATA191 was blinking like it was rebooting, and coming back. Looking at logs it was indeed warm rebooting and SIP re-registering every few minutes. I ruled out a duplicate IP address, and it never missed a ping. I was wondering if the VoIP provider was having issues, or if adding a new extension somehow broke things, so I rolled back changes I made. After a few hours of maddening searching I saw link flaps on my switch and thought ah ha it’s just a flaky cable. No, shortly after fixing the cable the ATA still kept rebooting. My other ATAs like the Cisco SPA122 and Grandstream weren’t having problems and stayed registered the whole time.

In the debug logs on the ATA I’d see something like this, as if some event was causing the unit to want to reboot. The “reboot reason 800000” and “reason 0x800000” looked interesting, but searching for them didn’t turn up any useful information. Had I had a support contract they could probably tell me quickly, but I’m on my own.

Jul 28 20:09:22 ata01 Network[468]: [netCtrl]: raMonitorMain(), send AUTO_CFG_CHANGE to WAN
Jul 28 20:09:22 ata01 [161241.038349] sysevt_comm_sendto: (54, rc)=>
Jul 28 20:09:22 ata01 System[468]: [rcDbg]==== event_process start-832 , module(wan_module, evid=0x702) ====
Jul 28 20:09:22 ata01 Network[468]: wan_event_process(1279)..recv AUTO_CFG_CHANGE...
Jul 28 20:09:23 ata01 Network[468]: [netCtrl]: runDhcpv6App(), infoOnly = 1, prefixLen = 64
Jul 28 20:09:23 ata01 System[468]: [rcDbg]==== event_process end-832 ====
...
Jul 28 20:09:25 ata01 vsock: nmlink_server_task(), message received: 14
Jul 28 20:09:25 ata01 vsock: nmlink_server_task(), voice app restart
Jul 28 20:09:25 ata01 vsock: system request reboot, type 1, reason 0x800000, graceful 0
Jul 28 20:09:25 ata01 vsock: [cc_pre_reboot_check]: NO CALL, send unregister here...
...
Jul 28 20:09:26 ata01 vsock: SIP_regTsEventProc(event: 28)
Jul 28 20:09:26 ata01 vsock: setRegState(), line{1} REG State(1->0) pCause=unREG
Jul 28 20:09:26 ata01 vsock: SIP_regTsEventProc(event: 32)
Jul 28 20:09:27 ata01 vsock: fpar2_update_flash() Finish SYS Saved, infoCnt=0 sysCnt=3
Jul 28 20:09:27 ata01 vsock: fpar2_update_flash() PAL-PARM Saved, pid=208, type=2, attr=0x0, name=SIP Reg Call ID State
Jul 28 20:09:28 ata01 vsock: fpar2_update_flash() Finish PAL Saved, palCnt=1 infoCnt=0 sysCnt=3
Jul 28 20:09:28 ata01 vsock: reboot_check(341), reboot reason 800000
Jul 28 20:09:29 ata01 vsock: hal_board_warm_reboot (145, tid=0xc47ff460) do system sync
Jul 28 20:09:29 ata01 vsock: SAFE_MON_main() ccTick:6266->6366, ccCnt=6146->0, monNum=2
Jul 28 20:09:30 ata01 vsock: hal_board_warm_reboot () fp-size=1048576 date=2024-07-28T20:09:28
Jul 28 20:09:30 ata01 vsock: hal_board_warm_reboot (163, tid=0xc47ff460) terminate VoIP service

The “AUTO_CFG_CHANGE” bit followed by a dump of dhcpv6c details made me think this was something related to SLAAC, DHCPv6, or router advertisements, especially since I was now watching the unit reboot every 5 minutes. I had just powered up a new, freshly wiped Cisco 2821 on my LAN with some basic IPv6 config and realized the thing must be sending out IPv6 RAs or maybe some sort of CallManager auto-provisioning thing the ATA was picking up on.

Logging into the Cisco, I configured ‘ipv6 nd ra suppress’ on the interface plugged into my LAN. It’s not connected to anything else and therefore has no connectivity to offer. Lo and behold, the auto reboots stopped!

It was only the Cisco ATA191 that had this problem. The Cisco SPA122 and Grandstream HT802 are on the same LAN and they had no problems at all. I have only one other router on this LAN, and it’s been speaking IPv6 for years just fine. The ATA is configured to get an IPv6 address from DHCPv6 along with a static list of DNS servers. (Now I remember the SPA122 is lame and doesn’t even support IPv6.) I can reproduce this: re-enabling RAs on the Cisco brings the problem right back, and the ATA191 starts warm rebooting again.

This feels like a bug; a device receiving two sets of RAs shouldn’t go janky like this. I dug into this for a while today doing some packet captures from the ATA’s switchport. I could see RAs from my normal router and the Cisco router on the wire, but nothing was really lining up time-wise with the log messages I was seeing. RAs might go by and then something like 60-120 seconds later it decides to reboot itself. I can’t even come up with any off-the-cuff theories; it’s not like the Cisco was advertising anything crazy.
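For anyone who wants to carry on with the timing side of this, here is a minimal Python sketch that pulls the “system request reboot” timestamps out of syslog lines in the format shown above and prints the gaps between them. Those gaps could then be compared against RA timestamps from a packet capture. The function names are my own, the year is an assumption (syslog lines don’t carry one; 2024 comes from the date= field in the log), and the second sample line is made up purely to show the output.

```python
import re
from datetime import datetime

# Matches the reboot request line from the ATA191 debug log shown earlier.
REBOOT_RE = re.compile(
    r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ vsock: system request reboot"
)

def reboot_times(lines, year=2024):
    """Return a datetime for every 'system request reboot' line."""
    times = []
    for line in lines:
        m = REBOOT_RE.match(line)
        if m:
            # Syslog timestamps have no year, so one is supplied (assumption).
            times.append(datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S"))
    return times

def intervals_seconds(times):
    """Seconds between consecutive reboot requests."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]

sample = [
    # Real line from the log above.
    "Jul 28 20:09:25 ata01 vsock: system request reboot, type 1, reason 0x800000, graceful 0",
    # Hypothetical second reboot, five minutes later, for illustration only.
    "Jul 28 20:14:25 ata01 vsock: system request reboot, type 1, reason 0x800000, graceful 0",
]
print(intervals_seconds(reboot_times(sample)))  # [300.0]
```

A steady interval would point at some timer on the ATA rather than at individual RAs arriving, but that is a guess and not something I verified.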

I stopped looking into this problem but maybe if somebody else stumbles upon this it’ll give them some insight to carry on and find a root cause. To be clear I am absolutely not advocating for disabling IPv6 here!

Cisco 2821 Noctua fan replacement

[photos: flickr – Cisco 2821 fan replacement]

I have both a Cisco 2921 and this Cisco 2821 to play with. The 2921 is considerably louder even at idle and not really suitable for my 24/7 homelab production. The 2821 is much quieter so I wanted to use it, but it still made enough white noise to notice. If it had brand new fans it may have been quiet enough, but these were of 2005 vintage and had some noise to them. The existing fans were Delta AFB0812SH-F00R, 4000 RPM, 80 mm, 12 VDC, with a 3-pin connector.

Instead of just buying a set of the same OEM fans, I tried a set of Noctua NF-A8 FLX fans. I didn’t think I needed to go all the way to their ultra low noise versions, just some with a lower RPM. At first they didn’t work, then I noticed the red cable on the Delta fans was on the outside, and in the middle on the Noctua fans. Using a paperclip to push out the terminals, I re-arranged them with the red on the left, black in the middle, and yellow on the right. The fans worked, at full RPM.

IOS was cranky about it, repeatedly logging %ENVMON-4-FAN_LOW_RPM and reporting “Low RPM” in show environment.

vintage-gw2#show environment

 Main Power Supply is AC

 Fan 1 Low RPM
 Fan 2 Low RPM
 Fan 3 Low RPM

 Fan Speed Setting: Normal

 System Temperature: 29 Celsius (normal)

 Environmental information last updated 00:00:10 ago

The datasheet for the Delta fans shows the white cable is a tach / frequency generator output, and the Noctua fan also has a tach output. At idle the Delta fan was running around 2520 RPM. When I measured the Noctua it was running at 1400 RPM, so this may be too low for what the router was expecting. I’ve seen on reddit that other people have encountered this same problem with other Cisco routers, trying various wiring hacks, with no satisfactory solutions. It may need a circuit to artificially output double the frequency so the Cisco thinks it’s running faster, or just short the thing to 12 V and be done with it. At least here at home I don’t think it’s going to overheat.
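To put rough numbers on the “double the frequency” idea, here is a small Python sketch of the arithmetic. It assumes both fans emit two tach pulses per revolution, which is the usual convention for PC-style fans but not something I confirmed for this Delta part, and the function names are just for illustration.

```python
# Assumption: two tach pulses per revolution for both fans.
PULSES_PER_REV = 2

def tach_hz(rpm, pulses_per_rev=PULSES_PER_REV):
    """Tach output frequency in Hz for a given fan speed."""
    return rpm * pulses_per_rev / 60.0

def apparent_rpm(actual_rpm, multiplier):
    """RPM the router would compute if the tach frequency were multiplied."""
    return actual_rpm * multiplier

delta_idle_rpm = 2520   # measured Delta speed at idle
noctua_rpm = 1400       # measured Noctua speed

print(tach_hz(delta_idle_rpm))          # 84.0
print(round(tach_hz(noctua_rpm), 1))    # 46.7
print(apparent_rpm(noctua_rpm, 2))      # 2800
```

A doubled Noctua signal would read as 2800 RPM, a bit above the Delta’s 2520 RPM idle reading. The actual low-RPM threshold in IOS is unknown to me, so this only suggests a doubler would land in the right neighborhood.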

7/29: Playing with my oscilloscope today, I see these kinds of waveforms on the tach/fan speed pins of the Delta and Noctua fans. Also, at boot the Cisco kicks the output voltage to 12 volts and then settles in around 7.2 volts at idle.

Delta fan tach pin output

Noctua speed output

Update 9/2:

If you get really fed up with the %ENVMON-4-FAN_LOW_RPM messages and want to yeet them into the void, hashtag YOLO, ignore all the consequences, you can set up a logging discriminator:

logging discriminator nolog msg-body drops Fan
logging buffered discriminator nolog 4096
logging console discriminator nolog
logging monitor discriminator nolog
