
TL;DR I hex edit a binary to make it work with my “newer” Courier modems

TL;DR 2: Less than 12 hours after posting, I found an updated version of USRSTAT2.EXE from 1997 that fixes my bug and makes my hacking irrelevant

Way back on the US Robotics BBS (USR BBS) they had a door program that would display information about your modem connection from their BBS’s perspective. I always kind of assumed this was some door that talked to their Total Control modem rack over the network and gleaned connection statistics via SNMP or something. I used to run Total Controls later on at my ISP and they had a vast amount of information you could collect over the network, and I actually had a page written in PHP that would display connection speeds to my callers.

Fast forward to the present with my new BBS, and it got me thinking about the old door on USR BBS. I really care about modem connections now, trying to squeeze as much performance out of VoIP as I can to demonstrate it can be done. I knew call diagnostic stats were available on my US Robotics Courier modems via AT I6, I4, I11, but how do you display that to a caller? Browsing various BBS door archives I found a few programs that said they’d display stats, but I assumed they only worked with things like Supra/Diamond modems or were written for specific BBS software. It also seemed kind of crazy to have a door program send +++ commands to take over the modem while a caller was on. So I didn’t put much thought into it and wrote it off as a forgotten wish.

Calling around to some new BBSs the other night I called up Another Millennium (949-59-31337, cute) and saw they had a USRstats page that looked like what I remembered from USR BBS! I looked around at what they were doing it with, it looks like they are running a version of USRSTATS for Maximus from 1995, which is its own compiled add-on.

USRSTATS – MEX on Another Millennium BBS

MODST

This got me back to looking at the doors that were out there like Modem Stats (MODST120.ZIP) and USRStats Generic (STGEN107.ZIP). Modem Stats was the first I got working. Despite being developed/tested apparently in a Remote Access + Diamond Supra modem environment, it worked just fine with Wildcat! and my USR Courier modems. Hooray!! It was a bit basic and just displayed raw ATI6 output to the caller, which I understand now was so that it provided generic output that would work with any modem. But it worked, finally!

Modem Stats (MODSTATS)

STGEN

Mentioned in the Modem Stats documentation was how it was inspired by STGEN, written by Joe Frankiewicz. So I went looking at STGEN and what it did. This looked more promising but when I got it working it would display part of a screenful of ANSI-formatted data and then abruptly end:

Broken USRSTAT2.EXE

Spending more time reading docs and playing with STGEN (and the modified STGEN-MC.EXE) I realized that STGEN handles comms with the BBS software, caller, and modem, but it saves raw AT-command output from the modem and feeds this to USRSTAT2.EXE, which actually parses the output and makes the pretty ANSI screens to display to the caller. If one were so inspired they could write their own USRSTAT2 replacement and generate whatever screens they wanted. The QuickBASIC source code to STGEN and STGEN-MC is included, so you can even modify those to your heart’s content.

(STGEN108 contains both the original v1.07 source code written by Joseph C. Frankiewicz, and a binary+source of a revision called STGEN-MC written by Michael Conley 12/14/95. I’m running the STGEN-MC version but may refer to it as STGEN)

But why was USRSTAT2.EXE only displaying part of a screen? The fact that it cut off after displaying “Preemphasis” made me think it had a problem parsing the modem output, and indeed when I ran USRSTAT2 by hand while connected it threw a message saying “USRSTATS trapped error.”

Now I’m stuck, I don’t know what it’s dying on, there’s no source code for USRSTAT2 included, and a bunch of Google and BBS archive searches don’t turn up anything. It seems I have the latest version of USRSTAT2 that exists. I went back and looked at the USRSTATS MEX version for Maximus and wondered if that could be compiled as a standalone binary I could use with Wildcat. I noticed that the MEX version had a bunch of .LOC files that were output captures for various USR Courier models that it had been tested against. I wonder what would happen if I fed one of those to my USRSTAT2.EXE?

SUCCESS!

USRSTAT2 happily generated a full connection report from the files in the MEX version, including the frequency response table:

Correct USRSTAT2 output


Now the question was why? I combed over the example VEVR1195.LOC file compared to the output STGEN-MC was grabbing from my modem. Command output was ordered differently, so using a text editor I moved blocks of output around, that didn’t help. Because USRSTAT2 was dying at “Preemphasis” I started looking at that line in the modem output. Ah hah! At least one line of modem output from my 1997-era Courier was slightly different than the 1995 Courier when STGEN/USRSTAT2 were written:

1995 Courier ATI11 output:
...
Preemphasis (-dB ) 8/8
...

1997 Courier ATI11 output:
...
Preemphasis Index 0/0
...

I manually changed that line in my modem output and boom, USRSTAT2 produced a full report! Now the question moved to “how do I fix this output on the fly?” STGEN-MC does a SHELL "USRSTAT2.EXE" directly, so there’s no way to modify the temp file before it’s fed to the report generator. I pondered re-compiling STGEN-MC to fix up the modem output on the fly, or having ChatGPT whip me up a C or Pascal shim to replace USRSTAT2.EXE, fix up the modem output, and call the real USRSTAT2.EXE.
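The shim idea mentioned above could be sketched roughly like this in Python (not something that would run on the DOS box itself, just an illustration of the rewrite step). It assumes the only difference that matters is the label on the Preemphasis line, and that USRSTAT2 does not care about exact column spacing; both are guesses, and the function names are made up for this sketch.

```python
import re

# Assumption: the newer label is followed by whitespace and a value like "0/0".
NEW_LABEL = re.compile(r"^(\s*)Preemphasis Index(\s+)(\S.*)$")


def normalize_preemphasis(line: str) -> str:
    """Rewrite a 1997-style Preemphasis line to use the 1995-style label."""
    match = NEW_LABEL.match(line)
    if match is None:
        return line  # any other line passes through untouched
    indent, _gap, value = match.groups()
    return f"{indent}Preemphasis (-dB ) {value}"


def normalize_capture(text: str) -> str:
    """Apply the rewrite to every line of a captured ATI11 dump."""
    return "\n".join(normalize_preemphasis(line) for line in text.split("\n"))


if __name__ == "__main__":
    print(normalize_capture("...\nPreemphasis Index 0/0\n..."))
```

A real shim would read the temp file STGEN-MC writes, run it through `normalize_capture`, write it back, and only then hand off to the real report generator.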

After sleeping on it I wondered if I could get away with just hex editing USRSTAT2.EXE and fudge the string it’s looking for? So that’s what I did. Using WinHex, I found the instances that contained ‘Preemphasis (dB)’ and replaced them with ‘Preemphasis Index’, making sure to make the new string fit in the same spot.
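For anyone who would rather script this than poke at it in a hex editor, here is a minimal sketch of the same kind of patch. It refuses to run unless the replacement is exactly as long as the original, since shifting anything in a compiled binary would break it. The byte strings in the example are placeholders padded to equal length; the exact bytes inside USRSTAT2.EXE are not reproduced here and would need to be checked in the real file.

```python
from typing import Tuple


def patch_same_length(data: bytes, old: bytes, new: bytes) -> Tuple[bytes, int]:
    """Replace every occurrence of old with new, keeping the file size identical.

    Returns the patched bytes and the number of occurrences replaced.
    """
    if len(old) != len(new):
        raise ValueError("replacement must be exactly as long as the original")
    count = data.count(old)
    return data.replace(old, new), count


if __name__ == "__main__":
    # Stand-in for the contents of a binary; not real USRSTAT2.EXE data.
    blob = b"\x00\x01Preemphasis (-dB )\x00\x02Preemphasis (-dB )\x00"
    patched, hits = patch_same_length(
        blob, b"Preemphasis (-dB )", b"Preemphasis Index "
    )
    print(hits, len(blob) == len(patched))
```

Reading the EXE with `open(path, "rb")` and writing the result to a new file (keeping the original as a backup) would complete it.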

 

USRSTAT2.EXE – Before

Fixing two strings

USRSTAT2.EXE – After hax

And it worked!!! It read in the output from my newer Courier modem and didn’t crash. I now have a functional STGEN-MC and USRSTAT2 door that produces pretty modem reports for callers. I know nothing about patching Windows binaries, so I don’t know how to distribute what’s about a 10 byte change.

I did notice at the very tail end of STGEN.DOC “Source code for the USRSTAT2.EXE module is NOT being released at this time, as development of that module WILL be continuing.” I have no idea if a new version has been released since then, if there is I haven’t found it.

 

As an aside, I didn’t know who Joseph Frankiewicz was until now. I was Googling his name to find out more about USRSTAT2 and found it in an old German US Robotics FAQ file, usrfaq.txt, where he talks about the USR BBS and USRSTAT and identifies as working at US Robotics. From forum posts, it turns out he was the sysop of the USR BBS and/or an engineer with a ton of modem knowledge who interacted a lot with sysops. It would seem he wrote the original USR BBS door, or at least the first PCBoard version of it, found as ST234B.ZIP “USR STATS V.234 BETA 5/22/94”. This zip file includes more documentation about his original USRSTAT.EXE program. I have no idea if he’s still developing software or still around.

Update 5/14/2024 7:00 PM

Less than 12 hours after I patched my USRSTAT2.EXE and typed up this post, I found USRST419.ZIP on sak.sk through random googling. This included USRSTAT2.EXE version 4.19 dated 2/28/97, whose change notes say it has fixes for “Total Control x2 modems, Courier x2 modems, newer Sportster modems, and fixed colorization of the Preemphasis fields.” This new version works right away with the STGEN-MC door and my Courier modems, making all my clever hacking completely obsolete. My ego is crushed a bit but I’m glad I found a newer version.

Taking a peek at the v4.19 binary it looks like it now can string compare three variants of the field I was having problems with, “Preemphasis (-dB)” “Preemphasis     (-dB)” and “Preemphasis Index”. I’m sure it has more string comparisons to handle other newer modems but I didn’t check them that closely.

USRSTAT2 v4.19 2/28/97

Google is very frustrating to search for anything named ‘USR’ because it desperately wants to add inflection even with quotes, and a few decades of indexing unix things with “/usr” sure isn’t helping.

(Rant warning) TL;DR I gripe at how complicated it gets and offer no solutions. I really do like what Let’s Encrypt offers. Just getting there figuring out what options work and don’t work is work. I don’t know how the muggles manage it.

TL;DR 2: HTTP-01 was out because of internal sites. DNS-01 was the only option, but I don’t use 3rd party DNS with APIs to handle automated challenge updates. Wound up installing a standalone ACME-DNS server for challenge responses.

I finally got annoyed enough at my TLS certificates that I started seriously trying to use Let’s Encrypt and ACME. I only have a couple of normal public-facing websites running on port 443 on the Internet, but internally I have a small army of Ubiquiti EdgeRouters, switches, wireless bridges, UniFi wireless controllers, Raspberry Pis, and other software with web servers on ports other than 80/443 that all need certificates. For years I’ve run my own private CA to issue certificates, but it has the same problem as commercial certificates: I still have to issue them and load them on all of my devices. Some browsers like Chrome now bitch at self-signed certificates in some cases too, so that’s not really a fix either.

One could say “but ha ha only suckers use web interfaces on routers”, which is true, but on the occasions I do use them I’m reminded of the stupid problem of a long expired certificate and have to jump through the browser warning hoops every time that yes I’m yolo’ing to this allegedly sketchy device. A lot of the Ubiquiti stuff is only manageable over WebUI. This goes double if I’m on a new device that doesn’t have my private CA root certificates installed, or Android which makes it really difficult to install a private root CA. Triple combo pain if you’ve accidentally configured HTTP Strict Transport Security with a long lifetime that covers sub-domains, and you try to reach something internal under that domain with a bad certificate, and the browser is like fuck you I won’t let you visit this site at all! This is where Chrome’s ‘thisisunsafe’ override really comes in handy! I just want stuff to work and go about my day man, and not leak passwords.

No to HTTP-01

I can’t just throw certbot or acme.sh everywhere and call the problem solved. First of all not all of my devices are exposed to the Internet to accept HTTP-01 challenges from random sources, not to mention the non port-443 services. (God bless them for having a mix of IPv4 and IPv6 probe sources to handle IPv6-only endpoints which helps.) For the whole existence of Let’s Encrypt, every year I thought about this problem I would look up how to run my own ACME server with my private CA, groan at the apparent effort of learning the whole ACME protocol, and leave it.

DNS-01 with caveats

That leaves me with DNS-01 challenges. The problem here is that I don’t use a cloudy/third-party DNS provider that has an API where I can automatically update TXT records for automatic certificate renewals. I run straight up BIND and further my authoritative servers each use independent replicated master files with no slaving. This means any kind of dynamic updates to my DNS would have to go to each DNS server. I do have a few dynamically-updated A/AAAA records in sub-zones and for years I’ve just been running nsupdate twice, one for each authoritative server, and this has worked fine. ACME clients I’ve seen don’t support nsupdate to multiple servers, so this would be a hack to carry around.
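To make the "run nsupdate twice" hack concrete, here is a small sketch that builds one nsupdate command script per authoritative server for a DNS-01 TXT record. The `server`, `zone`, `update delete`, `update add`, and `send` lines are standard nsupdate commands; the server names, zone, and helper names are hypothetical, and this is only an illustration of the idea, not what any ACME client does.

```python
from typing import Dict, List


def nsupdate_script(server: str, zone: str, name: str, txt: str, ttl: int = 60) -> str:
    """Build the stdin script nsupdate would need to replace one TXT record."""
    lines = [
        f"server {server}",
        f"zone {zone}",
        f"update delete {name} TXT",          # clear any stale challenge first
        f'update add {name} {ttl} TXT "{txt}"',
        "send",
    ]
    return "\n".join(lines) + "\n"


def scripts_for_all(servers: List[str], zone: str, name: str, txt: str) -> Dict[str, str]:
    """One script per independent master, since nothing replicates between them."""
    return {server: nsupdate_script(server, zone, name, txt) for server in servers}


if __name__ == "__main__":
    # Hypothetical server names and challenge token.
    for server, script in scripts_for_all(
        ["ns1.example.com", "ns2.example.com"],
        "example.com.",
        "_acme-challenge.gw1.example.com.",
        "token-goes-here",
    ).items():
        print(f"# would pipe into: nsupdate -k <keyfile>   ({server})")
        print(script)
```

Each script would be piped into a separate `nsupdate -k <keyfile>` run, which is exactly the per-server duplication that makes this awkward to bolt onto an ACME client.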

I’m not casually replacing BIND nor throwing it all on the cloud. This recently led me to thinking, ok fine, maybe it’s not so bad doing a master/slave setup for my dynamic zones, that way an ACME client would only have to update one server. This then led me to the question of what to do about zone key distribution. I’d have to copy the same master TSIG key around to all of my devices, or create a TSIG key per domain, sub-zone, or per device/A/AAAA record, which gets tedious and unpalatable.

acme-dns

I retreated and thought surely others have hit this problem too. This led me to the acme-dns server project. It’s a little standalone DNS server that does nothing but serve up TXT records and has a simple REST API. I set it up on my internal IPv6 network so all of my internal devices can reach the API, and expose the DNS server on port 53 to the Internet so DNS-01 challenges are queryable. In my master zone files I delegate a sub-domain via NS record to the acme-dns IP address, and then create CNAMEs that point at that sub-domain so all challenges go to the acme-dns server. This is where I praise Let’s Encrypt for having IPv6 probes, they can reach my acme-dns server without having to burn a public IPv4 address just for it.

It took me a while to figure out how to actually use the thing. Certbot requires yet another thing to be installed, an acme-dns hook program. Have I mentioned how complicated this whole ecosystem is? acme.sh already includes a hook. Rant 1: good lord that thing is one massive unit of a Bash script. Rant 2: I really do not like it when installation instructions are “here just curl | sh”. It doesn’t just download a single file, it downloads several directories of files and shoves stuff into your crontab. Who knows what else it did. Must find DEB/RPM packages of that sucker.

Ok so now for each and every hostname/FQDN you want a certificate for, you have to hit the /register endpoint of the acme-dns server first with a curl POST request. This generates a “username”, password, and a string for a sub-domain. This only exists in the acme-dns server database. If for example I had gw1.example.com, I would now add a record in my zone file that says “_acme-challenge.gw1.example.com.  IN CNAME asdf-asdf-asdf-asdf-asdf-asdf.acme-dns.example.com.” *.acme-dns.example.com is already delegated via NS record to the acme-dns instance. This is tedious and annoying to do for a bunch of hostnames but it’s for the greater good and fortunately only has to be done once. By default acme-dns uses SQLite (Postgres is the other option), so either way back that sucker up or you’ll have to re-generate every single one of your domain usernames when something dies.
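Since the CNAME line is purely mechanical once you have the registration response, it can be generated. The sketch below takes the JSON body returned by /register and prints the zone file record. The field names (`username`, `password`, `fulldomain`, `subdomain`) are what I understand the acme-dns response to contain, so check them against your own server's output; the sample values are made up.

```python
import json


def challenge_cname(hostname: str, registration_json: str) -> str:
    """Turn an acme-dns /register response into a zone-file CNAME record."""
    reg = json.loads(registration_json)
    # Assumption: "fulldomain" holds the full name under the delegated sub-domain.
    target = reg["fulldomain"].rstrip(".")
    owner = "_acme-challenge." + hostname.rstrip(".")
    return f"{owner}. IN CNAME {target}."


if __name__ == "__main__":
    # Made-up response body, shaped like what /register is expected to return.
    sample = json.dumps({
        "username": "example-user",
        "password": "example-pass",
        "fulldomain": "asdf-asdf-asdf.acme-dns.example.com",
        "subdomain": "asdf-asdf-asdf",
    })
    print(challenge_cname("gw1.example.com", sample))
```

Looping this over a list of hostnames at least takes the typing out of the one-time setup.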

Then for each and every hostname, take the username/password/subdomain, feed them into environment variables and then run acme.sh to issue the certificates. Witness the gigantic scripts in action!  Stuff going to the CA, TXT records being fed to acme-dns, stuff going to the DNS server, stuff coming back from the CA, more stuff going back and forth!  If you’re lucky you get a few certificate and key files left. If you’re unlucky, good luck troubleshooting which step in this whole process broke down.
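Here is a sketch of that per-hostname step: it builds the environment and the command line without running anything. The `ACMEDNS_*` variable names and the `dns_acmedns` hook name are what I believe the acme.sh hook expects (older versions reportedly used a different URL variable), so treat them as assumptions and check the hook's documentation. The base URL and credentials are placeholders.

```python
from typing import Dict, List, Tuple


def acme_sh_invocation(
    hostname: str, reg: Dict[str, str], base_url: str
) -> Tuple[Dict[str, str], List[str]]:
    """Build the env vars and argv for issuing one certificate via acme-dns."""
    env = {
        # Assumed variable names for the acme.sh acme-dns hook.
        "ACMEDNS_BASE_URL": base_url,
        "ACMEDNS_USERNAME": reg["username"],
        "ACMEDNS_PASSWORD": reg["password"],
        "ACMEDNS_SUBDOMAIN": reg["subdomain"],
    }
    argv = ["acme.sh", "--issue", "--dns", "dns_acmedns", "-d", hostname]
    return env, argv


if __name__ == "__main__":
    env, argv = acme_sh_invocation(
        "gw1.example.com",
        {"username": "example-user", "password": "example-pass", "subdomain": "asdf-asdf-asdf"},
        "https://acme-dns.example.com",  # placeholder for the internal API address
    )
    print(" ".join(f"{k}={v}" for k, v in env.items()), " ".join(argv))
```

Actually running it would be something like `subprocess.run(argv, env={**os.environ, **env}, check=True)`.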

Finally, certificates!

Now you have certificates, what to do with them! This is another whole bear of a problem to tackle because there’s an endless number of web servers and directories to insert certificates into. Again there’s a whole ecosystem of Certbot/acme.sh deployment hooks that try to handle your webserver. Also remember by default this is all happening within your home directory, so keys have to be copied to secure system directories owned by root too. This is where I’m at now. I have some devices like Ubiquiti EdgeSwitches that can’t run an ACME client directly, so I have to rig up things to scp over the certificates.

I hope all of this just magically works and auto-renews in 90 days, what a pain to set up!

TRS-80 video fixed!

Good news! A new video RAM chip fixed the lingering video artifacts I had. I looked up the IC model that was in it “HM6116LP-2” and found some on eBay. I had to wait a couple of weeks for them to arrive from China, but popped the first one in and it worked like a charm. Now I have four extra SRAM, what to do.

After scant Reddit and YouTube review browsing I ordered up a Rigol DHO814 oscilloscope to take on floppy drive repair. It’s a bit of a letdown that it seems like it’s just a glorified Android tablet with inputs, but maybe they all are these days.

Original chip with problems in center

I finally got around to working on the TRS-80 model 4 again, specifically troubleshooting the floppy drives. Last time I tried to boot it, the drive light would come on but it didn’t sound like it was doing anything. I suspected the belt on the bottom had deteriorated and/or lost tension and wasn’t spinning the disk.

I pulled the drives out and the spindle drive belts were intact, had tension, and moved freely. The stepper motors that moved the head were another story. Following this YouTube video on restoring TRS-80 floppy drives, first thing I did was swab down the rails and apply some lube. On drive 0, the stepper motor was completely seized up and wouldn’t move. At first I wondered if this was because there was no power and there was some sort of clutch holding it in place. Some YT videos suggested that the stepper motor and head assembly should move freely with very little effort. This was the case with drive 1, I could easily slide the head back and forth with my fingertip (video), especially after lubricating the slide rails.

For the record I identified I have a model 4 gate array system, 26-1069A. The two 5.25″ floppy drives are from Texas Peripherals, model 10-5355-001. The drive PCBs are marked “Tandy Corp” 1983, 10 5053-012. I actually have an original Tandy 5.25″ head cleaning disk somewhere, but I couldn’t find it when it came time to work on the drives. These disks are amazingly spendy now on eBay!

On drive 0 I worked the stepper motor coupling clamp back and forth with my fingers for a few minutes and eventually it worked loose, and the head was able to travel the full range of motion without any effort. On the Vintage Computer Federation forums there were several threads about TRS-80 stepper motors, in particular I followed this one which had some cassette BASIC code that would manually spin the floppy and move the head carriage back and forth between track 0 and 39. Drive 0 seemed to be functional, it would spin the drive motor and would seek the head the full distance over and over again (albeit a bit noisily). Drive 1 would spin the drive motor, but the stepper motor angrily chattered and jerked in place (video) like it wanted to move but was stuck. A common problem with these drives seems to be the coupling clamp coming loose and slipping when the stepper motor moves, but best I can see there’s no slip and the screw is snug.

I have a bunch of our old TRS-80 disks, I found a couple I think had TRS-DOS 1.3 on them and tried to boot. No go. Maybe my 40 year old floppy disks were shot? Following a couple (#1, #2) of Adrian’s Digital Basement videos, I wound up getting a 5.25″ 360k PC floppy drive off of eBay to hook up to the 486 to make some new TRS-80 boot disks. For good measure since I’m going through all this effort I bought some new 5.25″ DS/DD floppy disks from floppydisks.com. I downloaded Dave Dunfield’s ImageDisk and made a few different TRS-80 boot disks with TRSDOS 6.21, 1.3, and Floppy Doctor. Unfortunately after a week of waiting for stuff to show up none of them booted in my TRS-80 either.

So now I’m back at figuring out what to do next. It seems likely that wrestling with the stepper motor on drive 0 I could have knocked the drive out of alignment. There are several pages and forum posts that describe how to check and re-align them but I’m short an oscilloscope at the moment. I saw at least one person just advocating yolo’ing it by turning the alignment screw until it boots. Because I kinda want to learn how to use a scope, that might be my next sub-project.

FreHD

I’d like to keep both floppy drives in the system to keep it original, so I’m not keen on replacing one with a disk emulator. I did decide to opt for ordering a FreHD hard drive emulator because ultimately I’d like to be able to use this thing and not be at the whim of 40 year old floppy disks and disk drives. I learned there’s an EEPROM that can be installed in a model 4 along with some bodge wires so it’ll boot directly from the FreHD without need of loading a DOS from a floppy first. This might save me if I can’t get the floppy drives working.

Computerfacts!

By sheer dumb luck of googling for part numbers of IC chips I ran across the Sam’s Computerfacts for the TRS-80 Model 4 Gate Array on archive.org. I’ve never seen a Computerfacts book before, but it seems to be the Chilton/Haynes of old computer repair. Most of the manuals and schematics I’ve been looking at were for the original Model 4, so they didn’t quite match up. This is exciting because this manual matches my motherboard and TPI floppy drives exactly! I wish I had found this a long time ago, it provides a ton of troubleshooting workflows, including aligning the floppy drives and checking track 0 seeking.

Video artifacts

A secondary problem on the TRS-80 is video artifacts. I have several rows of ( ( ( ( ( ( ( ( ( that stay on the screen. It doesn’t appear to affect operation, altho it does jumble text when it scrolls up into the affected rows. Seems like it could be some bad RAM or video RAM. Some enterprising individuals at least in model III have gone as far as taking the hex values of the garbage characters, subtracting from what it should be and identifying exactly which RAM chips need replacing. In my model 4 there’s a bank of 8 chips of RAM, and only 1 chip of video RAM. The Computerfacts lists video RAM as a commonly replaced part in troubleshooting video issues, so I ordered up some 4016 chips on eBay from China. Hopefully they’re not fake and fix my issue, else it could be the regular RAM. I don’t think it’s the character generation ROM, which is a good thing because I don’t know how I’d get a replacement.
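The "compare what's on screen to what should be there" trick is easy to try on my own symptom. The sketch below uses XOR rather than subtraction (a swap on my part, since XOR shows exactly which data bits differ), and it assumes the affected cells are supposed to hold blank spaces and that the character codes are plain ASCII. Both are assumptions, not something I've confirmed on the hardware.

```python
from typing import List


def differing_bits(expected: int, observed: int) -> List[int]:
    """Return the data-bit positions (0 = least significant) that differ."""
    diff = expected ^ observed
    return [bit for bit in range(8) if diff & (1 << bit)]


def describe(expected_char: str, observed_char: str) -> str:
    """Summarize which bits look stuck, and in which direction."""
    expected, observed = ord(expected_char), ord(observed_char)
    parts = []
    for bit in differing_bits(expected, observed):
        state = "high" if observed & (1 << bit) else "low"
        parts.append(f"bit {bit} stuck {state}")
    return ", ".join(parts) or "no difference"


if __name__ == "__main__":
    # Assumption: the garbage "(" cells should have been blank spaces.
    print(hex(ord(" ")), hex(ord("(")), describe(" ", "("))
```

A space is 0x20 and "(" is 0x28, so under those assumptions a single data bit (bit 3) reads high when it shouldn't. On a system with one-bit-wide RAM chips that would point at one chip; with a single byte-wide video RAM chip it just points at that chip or its data line.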

Heater line repair

The pandemic, work from home, and layoffs have put a damper on my driving. I would easily put 15-20k miles a year on the truck. In February 2022 I hit 380,000 miles, and I just rolled 389,000 miles, so 9k in 2 years. I figured I’d have passed 400,000 by now. I also now have my dad’s Ram truck too which gets some of the miles.

Right before I left for Christmas I had a leaking heater hose connector, where the hose meets the heater core in the firewall. I thought ah hah, I’ll just slap another clamp on there. And it worked, I drove it for a while, got the engine hot, leak stopped, great. I’m not sure how long it had been leaking, occasionally I’d smell coolant, but it wasn’t until a cool day that I just happened to see some steam coming out when I was getting something out of the passenger side.

Two days later after running some errands, the day before I was supposed to fly out, it started leaking again and by the time I pulled into my driveway the hose completely broke off and dumped all the coolant in a giant cloud. What I hadn’t realized until I watched YouTube later was that there was a plastic quick disconnect fitting between the hose and the heater core, that’s what snapped off. Had I known that thing was there I would’ve just taken it directly to the mechanic the first time.

I was really worried that I might have finally done the truck in, that this may have caused it to overheat and warp something. Either way it was gonna have to sit a while till I got back. Last week I finally got it towed to a mechanic and fortunately it was a simple fix of replacing the hoses and fittings, no other damage was done. It’s been running fine since. I also had them replace the leaking valve cover gaskets, so between the two of them hopefully this fixes my coolant and oil consumption.

To date, here’s all the things I’ve replaced or fixed on the truck in its 20 year lifespan. I feel like I go through a lot of batteries. The transmission is the major thing I’ve had to fix, first a rebuild way back in 2010-something and then finally had to replace it entirely. AAA is great for towing, would recommend.

The clear coat on the roof has started to go, I’m debating if I want to spend the money to get it re-sprayed. I’d like to keep this truck as a backup as I’ve learned having two vehicles is pretty handy and want to trade the Ram in on something that’s 4wd.

– At least 4 sets of plugs and wires
– heater hoses, disconnect fittings
– valve cover gaskets
– 2 fuel pumps
– 2 water pumps
– 2 sets of O2 sensors
– At least 4 sets of tires
– 5 batteries
– 2 alternators
– 3 sets of brake pads
– 1 remanufactured transmission
– 1 transmission rebuild
– 1 differential rebuild
– 1 a/c compressor and system
– 1 starter
– 1 throttle body assembly
– 1 evap vent solenoid
– 1 set front wheel bearings
– 2 sets shocks
– 1 oil pressure sensor/sender
– 1 rear main seal
– 2 windshields
– 2 catalytic converters
– 1 upper control arm
– 1 blower motor resistor
– 2 serp belts

I guess sometimes you have to challenge your assumptions and just try things. For years and years at home and in Oklahoma I have been running Hurricane Electric’s Tunnelbroker tunnels to get IPv6 to my networks. At the time when I set them up, everything said the 6in4 tunnel (specifically IP protocol 41) wouldn’t work through NAT and that you would have to set up the local tunnel endpoint on the same router where you get the public IP address from your ISP. So that’s what I did, I used my own routers and configured my VDSL and cable modems to work in bridge mode so the real public IP address was directly on my router and my tunnel traffic wasn’t being NAT’d. This has worked great for years and years and I never had to think about the config again.

I would discover recently that this assumption is not quite right, 6in4 tunnels can indeed work across a NAT, even double NAT.

For various reasons I had two sites in Oklahoma, each with their own DSL connections to the same ISP, and interconnected with a wireless point to point bridge. At “Site 1” I got the ISP to configure their VDSL modem in bridge mode and I provided my own EdgeRouter. I configured PPPoE on my router and got the public IP address on my router. Here on this router I configured my Tunnelbroker tunnel, through it I funneled IPv6 subnets from both sites. Locally originated IPv4 traffic went out the local DSL connection.

Site 2 was a little more traditional, here I was using the ISP’s provided VDSL modem+router (Comtrend CT-5374) which acted as a NAT; public IP address on the WAN interface, 192.168.1.x/24 on the LAN side. Behind it was another one of my EdgeRouters that had an interface that connected back to site 1, and the local LANs attached to it. IPv4 things here going out to the Internet wound up being double NAT’d, first through my EdgeRouter, and then NAT’d again through the ISP router. Again locally originated IPv4 traffic went out the local DSL connection. Any v6 traffic was hauled over to Site 1 where the Tunnelbroker tunnel was.

Site 1 and Site 2

Local IPv4 traffic went out the local VDSL connection, IPv6 traffic went out the tunnel on router A:

IPv4 and IPv6 traffic routes

Sadness and a surprise

This all worked great for years until this summer the VDSL modem at Site 1 (in bridge mode) up and died. I had a spare modem but I couldn’t get it to work with the ISP. This broke my IPv6 connectivity because Site 1 no longer had any direct internet connectivity. This made me sad because I didn’t have another bridged modem and just (wrongly) assumed the tunnel wouldn’t work through the NAT at Site 2. I was gonna have to beg the ISP to change the config for me, or do some janky port forwarding, gnashing of teeth, resort to setting up an IPsec tunnel, etc etc etc.

One day I was changing default routes on Router A at Site 1 to send all traffic to Router B at Site 2, so I’d at least have working IPv4 Internet access at Site 1. A few minutes later I started getting recovery alerts that all of my IPv6 hosts were reachable again. Huh?

Initially I thought the modem at Site 1 started working again, but checking again it was in fact still dead and everything was now running through Site 2. I thought about it for a minute, why wouldn’t IP protocol 41 work through a NAT? And for that matter, through double NAT? We can translate ICMP (IP protocol 1) through it just fine, and from an IP perspective, it’s just another protocol number.
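To convince myself of the "it's just another protocol number" point, here is a small sketch that packs a bare 20-byte IPv4 header twice, once for ICMP and once for 6in4, and shows that only the protocol byte differs. The addresses are the ones from this setup used purely as examples, and the checksum is left at zero, so this is not something you'd put on the wire as-is.

```python
import socket
import struct

PROTO_ICMP = 1    # IP protocol number for ICMP
PROTO_6IN4 = 41   # IP protocol number for IPv6 encapsulated in IPv4


def ipv4_header(src: str, dst: str, proto: int, payload_len: int) -> bytes:
    """Pack a minimal IPv4 header (no options, checksum left as zero)."""
    version_ihl = (4 << 4) | 5          # IPv4, header length of 5 x 32-bit words
    total_len = 20 + payload_len
    return struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, 0, total_len,      # version/IHL, DSCP/ECN, total length
        0, 0,                           # identification, flags/fragment offset
        64, proto, 0,                   # TTL, protocol, checksum placeholder
        socket.inet_aton(src), socket.inet_aton(dst),
    )


if __name__ == "__main__":
    icmp = ipv4_header("192.168.1.4", "216.66.77.230", PROTO_ICMP, 40)
    sixin4 = ipv4_header("192.168.1.4", "216.66.77.230", PROTO_6IN4, 40)
    changed = [i for i in range(20) if icmp[i] != sixin4[i]]
    print(changed, icmp[9], sixin4[9])
```

The source and destination address fields a NAT has to rewrite sit in the same place either way. What protocol 41 lacks is ports, so a NAT can only track it by address.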

IPv6 after the breakage

 

I got to re-reading all of the forums discussing the setup of Tunnelbroker connections and noticed none of them really outright say “this won’t work through NAT”. Instead it was always strongly implied it might or might not depending on the vendor’s NAT implementation, “it’s best if you do this instead”, caveat emptor, you’re on your own, etc etc. So if your NAT was really just doing only UDP/TCP PATs, or had something obnoxious that specifically blocked IP protocol 41 (I hear AT&T used to do this for “security”), or didn’t have full mapping from the public address like with CGNAT, this wouldn’t work. But from a pure NAT standpoint on paper, it would work fine.

Normally I’d have packet captures to prove to myself how it works, but in this situation I got lazy because there’s a lot of v6 traffic going in and out and it’s annoying to sort it out right now.  I got to looking and there was no special IP protocol 41 NAT config on the ISP’s router (I cheated and have access on it) nor on my EdgeRouter. More annoyingly the ISP’s router doesn’t provide any way to dump the full NAT translation table to prove my situation, just a list of the PATs which doesn’t show non UDP/TCP things.

I would assume that from a first-power-on sequence, there would have to be one outbound v6 packet to get the two NAT translations set up. After that, both routers know how to forward packets to/from the Tunnelbroker. It’s worth noting that the other direction works too, Internet -> inside connections work fine too, I can SSH directly to my v6 hosts from the v6 Internet. In my particular setup I have a script that tries to ping a host on the Internet to check for v6 reachability before trying to send a command to the tunnel API to update the endpoint address, so this seems to fill the first packet checkbox.

For egress IPv6 traffic, the path looks something like this:

Packet leaves host -> hits v6 gateway on router A -> v6 packet goes to tunnel tun0 interface where it’s encapsulated as 6in4.  The 6in4 packet is destined for the Tunnelbroker server as an ordinary IPv4 packet sourced from 192.168.10.1. It gets routed over the wireless bridge to router B, NAT’d to 192.168.1.4, hits the ISP’s router, where it’s NAT’d once again to the public IP address, and forwarded out to the Internet.

The return reply comes in on the ISP’s router where the NAT table says it came from 192.168.1.4. It gets forwarded to the EdgeRouter which in turn resolves the NAT and sends it to 192.168.10.1. Once it’s back on router A, it goes through the tun0 interface, and out pops an IPv6 packet.
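The egress and return paths above can be modeled with a toy NAT that tracks portless protocols by (protocol, remote address) only. This is a simplified model to show why double NAT works on paper and why one outbound packet has to go first; it is not how EdgeOS or the ISP router actually implements connection tracking, and the public address is a made-up documentation address.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    proto: int  # IP protocol number, 41 for 6in4


class AddressOnlyNat:
    """Toy source NAT for protocols without ports."""

    def __init__(self, outside_addr: str) -> None:
        self.outside_addr = outside_addr
        # (protocol, remote address) -> inside address that talked to it
        self.table: Dict[Tuple[int, str], str] = {}

    def outbound(self, pkt: Packet) -> Packet:
        self.table[(pkt.proto, pkt.dst)] = pkt.src
        return replace(pkt, src=self.outside_addr)

    def inbound(self, pkt: Optional[Packet]) -> Optional[Packet]:
        if pkt is None or pkt.dst != self.outside_addr:
            return None
        inside = self.table.get((pkt.proto, pkt.src))
        if inside is None:
            return None  # no mapping yet, so the packet is dropped
        return replace(pkt, dst=inside)


if __name__ == "__main__":
    router_b = AddressOnlyNat("192.168.1.4")
    isp = AddressOnlyNat("203.0.113.10")  # hypothetical public address
    reply = Packet("216.66.77.230", "203.0.113.10", 41)

    print(router_b.inbound(isp.inbound(reply)))   # dropped before any egress
    isp.outbound(router_b.outbound(Packet("192.168.10.1", "216.66.77.230", 41)))
    print(router_b.inbound(isp.inbound(reply)))   # delivered to 192.168.10.1
```

Because the mapping has no port to disambiguate, only one inside host per NAT can talk protocol 41 to a given tunnel server at a time, which is fine for a single tunnel endpoint.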

NAT translation on router B, 216.66.77.230 is the Tunnelbroker server, “unknown” protocol:

$ show nat translations source address 192.168.10.2
Pre-NAT src      Pre-NAT dst      Post-NAT src     Post-NAT dst
192.168.10.2     216.66.77.230    192.168.1.4      216.66.77.230
  unknown: snat: 192.168.10.2 ==> 192.168.1.4 timeout: 599 use: 1

TIL.

This has some interesting implications had I figured it out sooner. One problem I’ve always dealt with on EdgeRouters is that the 6in4 encapsulation happens on the general purpose CPU and is not hardware accelerated. As such I was always bottlenecked on my Internet IPv6 throughput using a tunnel. My solution was to buy a faster EdgeRouter and then eventually come up with an IPv6 policy based routing setup where I could use native (read: faster) Comcast IPv6 for day to day stuff while keeping my servers on the HE IPv6. Had I known what I know now, I could’ve moved my HE tunnel to a Linux box to offload the work of processing 6in4 encapsulation and saved a bunch of (admittedly interesting) work arounds.

I need to get around to moving the HE tunnel from router A to router B in Oklahoma so I can finally deprecate Site 1, but that’s a project for a future time.

Update 28-Dec-2023:

While in Oklahoma I did some fiddling with the EdgeRouter and found some shortcomings. I had been running version 1.10.10 of their software and attempted to update to 2.0.9 on router B. When the router came back up, the tunnel was broken. Also doing things like moving the tunnel between router A and B necessitated a reboot of B, so there’s some sort of state I’m losing. (Keep in mind my external public IP address doesn’t ever change during all this because it’s on the ISP’s CPE router)

What was interesting after upgrading to 2.0.9 on router B was that when I ran tcpdump on the interface facing the ISP router, lots of IP protocol 41 packets were very clearly coming in from the Tunnelbroker server to my inside NAT address on router B. But, they weren’t being forwarded to router A which was configured with the tunnel. As far as I can tell they were just being blackholed. Likewise, running tcpdump on my tunnel router A, IP protocol 41 packets were being originated and destined for the Tunnelbroker server on the Internet, but also weren’t being passed on beyond router B. This seems to imply some sort of implicit NAT translation on router B was missing after the 2.0.9 upgrade to let them flow back and forth. I tried a number of static destination and even source NAT rules on router B which looked like they should shoehorn inbound traffic to the tunnel router A and the return traffic, but nothing seemed to do the trick.

I reverted back to 1.10.10 with the exact same original configuration (that had no special protocol 41 NATs). If I recall correctly, the tunnel wasn’t working at all, and then suddenly on its own 5-10 minutes later it spontaneously started working and pings to a v6 Internet host worked. I didn’t capture that bit in tcpdump, so I’m not sure what went by to knock it loose.

There were some times when I’d reboot that the tunnel wouldn’t work, I’d give up and reboot it again. Then it would work on the next boot. The last thing I did was to move the tunnel endpoint over to router B to consolidate and simplify things. I would’ve expected the tunnel to just start working again after the move. Inbound protocol 41 traffic was still coming in from the Internet, and I was generating response traffic, but it was still being blackholed again on router B. After a reboot, IPv6 traffic to/from the Internet started working again. I don’t know what the deal was, maybe a stale IP route cache or a stale translation?

I ran out of time to dig deeper to find a more respectable root cause. I’ll have to try the 2.0.9 upgrade again on the next trip to see if I can get it to stick, or find out if there really was a change that broke it.

TuxedoCat Lounge BBS

This year’s project has been running a vintage DOS-based bulletin board system like I used to run in 1995 before I started my ISP. I’m running Wildcat! 4.20 Multiline 10, the same as I did back then, except now under Windows 7 32-bit instead of Windows 95. It has real US Robotics Courier modems that you can dial into, along with supporting telnet connections. I have several drafts written up about the project, I just haven’t decided where to start. (Some info is linked at a top tab on my site)

I had my first ever Internet interactions with BBSes that offered e-mail gateways and had my first e-mail address in 1994. I wished to do the same thing back then, but it just wasn’t affordable for a teenager between the provider fees, long distance charges, and frankly I had no users.

Most BBSes did not have TCP/IP stacks until the late 90s, thus no SMTP. For message networks, dial-up modem store-and-forward was the name of the game, be it FidoNet, or calling up a UNIX system and using UUCP (UNIX to UNIX copy) to gateway mail+usenet to the Internet.

After setting up my Wildcat! board this year I pondered if it was possible to set up wcGate and actual, authentic dial-up UUCP in 2023. All the traditional UUCP mail+usenet news providers are long gone, even the TCP based ones, so I’d have to roll my own. Most documentation explaining how to set up a BBS to use a UUCP or Internet Service Provider spent far more time explaining how to secure an Internet domain name, describing how expensive an ordeal it was, or preaching netiquette, and virtually nothing about what happens under the hood as that was the provider’s job. I have no experience at all with UUCP and wanted to see if I could get it working.

wcGate

The elusive wcGate Sysop Guide

The accompanying message handling software for Wildcat! was an add-on called wcGate. This served as a message gateway from the BBS to a variety of systems such as Novell MHS (mail to Novell Netware users in your org), other BBSes, and Internet email and usenet newsgroups via UUCP and satellite connections (e.g. the now defunct Planet Connect, PageSat).

It was a whole side quest getting documentation for wcGate. It’s long since been discontinued and I hear when Mustang Software sold off Wildcat!, no manuals made it. I had my old sysop manual for Wildcat!, but not wcGate. I spent weeks searching Google, Bing, Pirate Bay, Reddit, FB groups, BBS libraries, BBS archives, eBay, university libraries, nothing. Out of pure desperation and curiosity I searched the Library of Congress and found they actually had a physical copy in their collection and I was seriously in contact with them to have it duplicated just to amuse myself (they balked until I could get copyright holder permission because they were technically selling me a copy). Eventually somebody on Facebook heard my plea and very graciously got me a copy of the manual, and I set off to figure out how to get it set up.

I threw together a Raspberry Pi, Taylor UUCP, an extra modem as my UUCP host and got wcGate talking to it in a couple of afternoons. Despite getting a 40 year old protocol working with a 30 year old DOS application, it was actually pretty straightforward.

I won’t go into the implementation details here, I’ll save that for another post. However I will go through the life of a message. I will soon upload Chef cookbooks to GitHub that I created for setting up DKIM, mgetty, and UUCP.

 

Topology

  • Modems are connected to Cisco analog telephone adapters (ATA191, SPA112), and subscribed to a VoIP provider to provide a real phone number for inbound calls
  • Each modem has its own telephone extension which allows for internal calls for free
  • “tuxedocat” is the UUCP site name of the BBS, and “wannnet” is the UUCP site name of my Raspberry Pi.
  • wcGate automatically maps an e-mail address of <firstname>.<lastname>@tuxedocatbbs.com to users on the BBS.
  • uucp.wann.net is the FQDN of the Raspberry Pi.
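As a rough illustration of the <firstname>.<lastname> address mapping mentioned in the list above, here is a small Python sketch. It is only a guess at the simple case; I don’t know how wcGate actually handles middle names, punctuation, or duplicate users, and both function names are made up.

```python
def bbs_name_to_address(full_name: str, domain: str = "tuxedocatbbs.com") -> str:
    """Turn a BBS user name like 'Bryan Wann' into firstname.lastname@domain."""
    parts = full_name.lower().split()
    if not parts:
        raise ValueError("empty user name")
    return ".".join(parts) + "@" + domain


def address_to_bbs_name(address: str) -> str:
    """Reverse mapping for inbound mail: 'bryan.wann@...' -> 'Bryan Wann'."""
    local_part, _, _domain = address.partition("@")
    return " ".join(word.capitalize() for word in local_part.split("."))


print(bbs_name_to_address("Bryan Wann"))
print(address_to_bbs_name("bryan.wann@tuxedocatbbs.com"))
```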

 

BBS message to Internet walkthrough

It’s worth emphasizing that this is a batch process, it’s not real time like SMTP. In the old days a BBS without a full time Internet connection might dial up a UUCP provider once or twice a day to send e-mail to the Internet, meaning it may take 24 hours from the time you sent it before the recipient got it. If they replied to it, it might take another 24 hours for the reply to make it back to the BBS!

Here at TuxedoCat Lounge the e-mail is processed every 2 hours.

New message

Creating a new message on the BBS

First we’ll join the Internet e-mail message conference on the TuxedoCat Lounge BBS, and create a new message. Wildcat! supports long To: fields for Internet e-mail, it used to be you had to kludge this by putting the e-mail address on the first line of the message.

Batch export of messages and prep for transport

Behind the scenes on another node or during a scheduled event, wcgate export is run. This exports messages from the Wildcat! message database to UUCP working files in a local spool directory, here in C:\WILDCAT\GATEWAY\WANNNET. A message-id and e-mail headers are generated for each message.

Running wcGate to export messages

Normally this is all run from a batch script via node event every few hours to export, call the UUCP provider, and import replies.

In the spool directory, for each outbound e-mail there are three files: .CMD, .DAT, and .XQT.

One good reference for these files can be found in the IBM z/OS manual.

UUCP working files created on BBS for test message

.DAT – data file, contains the actual e-mail contents, and message header.

.CMD – command file, contains two lines to perform two file transfer requests.

C:\wildcat\GATEWAY\WANNNET>type 35045W.CMD
S 35045W.DAT D.tuxed35045W root - 35045W.CMD 0666
S 35045W.XQT X.tuxed35045W root - 35045W.CMD 0666

The format is

‘S’ – send a file from local to remote, followed by the local filename, and then the destination filename. Next comes the sender name, in this case ‘root’, then an options field (here just ‘-’, meaning no options), then the local temporary file name (??), and local file permissions. Since this is a DOS machine, wcGate fabricates the root and file perms fields for us.
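To show how the fields in an ‘S’ line break down, here is a short Python sketch that splits one apart. The field names are my own labels, and I’m assuming the "-" is the options field from the classic UUCP send request layout (from, to, user, options, temp, mode); treat it as an illustration rather than a full parser for everything a command file can contain.

```python
from dataclasses import dataclass


@dataclass
class SendRequest:
    local_file: str   # file to send from the local spool
    remote_file: str  # name it should have on the remote side
    sender: str       # user the request is attributed to
    options: str      # "-" appears to mean "no options"
    temp_file: str    # local temporary file name
    mode: int         # file permissions, parsed as octal


def parse_send_line(line: str) -> SendRequest:
    """Parse a single 'S' (send) line from a UUCP command file."""
    parts = line.split()
    if len(parts) < 7 or parts[0] != "S":
        raise ValueError(f"not a send request: {line!r}")
    return SendRequest(
        local_file=parts[1],
        remote_file=parts[2],
        sender=parts[3],
        options=parts[4],
        temp_file=parts[5],
        mode=int(parts[6], 8),
    )


request = parse_send_line("S 35045W.DAT D.tuxed35045W root - 35045W.CMD 0666")
print(request)
```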

.XQT – execute file, contains a list of commands we want the remote site to execute for us.

C:\wildcat\GATEWAY\WANNNET>type 35045W.XQT
U bryan.wann tuxedocat          # <- remote user, remote site
R bryan.wann                    # <- ??
F D.tuxed35045W                 # <- filename to run
I D.tuxed35045W                 # <- file to use as STDIN to the app
C rmail bwann@wann.net

In this case we want the remote site (our UUCP gateway) to ultimately run rmail bwann@wann.net, that is, run the Unix rmail command with the argument bwann@wann.net and use the file D.tuxed35045W as STDIN which contains the message. (More on rmail later)
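Here is a small Python sketch that reads an execute file like the one above into a dictionary. The key names are my own, and the meanings I’ve assigned to the letters are my reading of the format: U is the requesting user and site, R is a requestor address, F is a file that must be present before the job runs, I is the file fed to STDIN, and C is the command line. The "# <-" annotations are mine, not part of the real file, so the sketch strips them off.

```python
def parse_xqt(text: str) -> dict:
    """Parse the lines of a UUCP execute (X.) file into a dictionary."""
    job = {
        "user": None,
        "system": None,
        "requestor": None,
        "required_files": [],
        "stdin": None,
        "command": [],
    }
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop the "# <-" annotations
        if not line:
            continue
        fields = line.split()
        key, args = fields[0], fields[1:]
        if key == "U" and len(args) >= 2:
            job["user"], job["system"] = args[0], args[1]
        elif key == "R" and args:
            job["requestor"] = args[0]
        elif key == "F" and args:
            job["required_files"].append(args[0])
        elif key == "I" and args:
            job["stdin"] = args[0]
        elif key == "C":
            job["command"] = args
    return job


sample = """\
U bryan.wann tuxedocat          # <- remote user, remote site
R bryan.wann                    # <- ??
F D.tuxed35045W                 # <- filename to run
I D.tuxed35045W                 # <- file to use as STDIN to the app
C rmail bwann@wann.net
"""
job = parse_xqt(sample)
print(job["command"], "with stdin from", job["stdin"])
```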

Note that all of these UUCP commands are baked into wcGate, we have no ability to fiddle with them.

FX UUCICO

Running uucico to contact the remote site ‘wannnet’

Once the message(s) have been exported to the spool directory, it’s time to dial up our UUCP host. This is done by a DOS-based program included with wcGate called FX UUCICO (unix to unix copy in copy out). Configuration files in C:\WILDCAT\GATEWAY such as DIALERS, SCRIPTS, SYSTEMS, FXUUCP.CFG tell UUCICO how to interact with the outside world.

DIALERS – contains a list of modem definitions, modem speeds, and expect-style chat script to send AT commands to the modem for initializing, dialing, and how to know when the call is connected.

SCRIPTS – More expect-style scripts for logging into the remote UNIX UUCP host, e.g. look for the login: and password: prompts, send passwords, and any other extra UNIX shell commands before starting the UUCP transfer.

SYSTEMS – A list of remote systems/hosts we’re allowed to connect to. This ties together which modem to use from DIALERS, which chat script from SCRIPTS to log in, and what username/password to use for logging in.

FXUUCP.CFG – contains configuration commands such as our local site name, which type of comm/serial driver to use, buffering, and how many times to retry the call.

C:\WILDCAT\GATEWAY\UUCICO.EXE is run, specifying which system to connect to. It creates a lockfile for the modem serial port, and sends AT-commands to dial the UUCP host. After dialing and logging in, the UUCP copying magic happens.
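The expect-style chat scripts in DIALERS and SCRIPTS boil down to "wait for this text, then send that text", repeated until you’re logged in. Here is a minimal Python sketch of that idea running against a fake modem line. This is not the FX UUCICO file syntax (I’m not reproducing that here); the FakeLine class, the password, and the canned responses are all made up for illustration. Only the extension number and the Utuxedocat login come from my setup.

```python
class FakeLine:
    """Stand-in for a serial port: answers each sent string with canned text."""

    def __init__(self, responses):
        self.responses = responses  # sent text -> what the far end replies
        self.pending = []
        self.sent = []

    def send(self, text):
        self.sent.append(text)
        answer = self.responses.get(text.strip())
        if answer:
            self.pending.append(answer)

    def receive(self):
        return self.pending.pop(0) if self.pending else ""


def run_chat(script, line, max_reads=50):
    """Walk a list of (expect, send) pairs; return True if every expect matched."""
    buffer = ""
    for expect, reply in script:
        reads = 0
        while expect and expect not in buffer:
            chunk = line.receive()
            reads += 1
            if not chunk or reads > max_reads:
                return False  # treat silence as a timeout
            buffer += chunk
        if expect:
            buffer = buffer.split(expect, 1)[1]
        if reply is not None:
            line.send(reply + "\r")
    return True


modem = FakeLine({
    "ATZ": "OK\r\n",
    "ATDT104": "CONNECT 28800\r\n",
    "": "login: ",
    "Utuxedocat": "Password: ",
    "secret": "Shere=uucp\r\n",  # hypothetical password
})
script = [
    ("", "ATZ"),              # nothing to wait for, just reset the modem
    ("OK", "ATDT104"),        # dial extension 104
    ("CONNECT", ""),          # send a bare carriage return to wake up getty
    ("login:", "Utuxedocat"),
    ("word:", "secret"),
    ("Shere", None),          # remote uucico is up, hand off to the protocol
]
ok = run_chat(script, modem)
print("logged in" if ok else "chat failed")
```

Matching on "word:" rather than "Password:" is an old chat script habit so the script works whether or not the prompt is capitalized.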

uucico dialing extension 104 on the modem

UUCP host/gateway/provider

Raspberry Pi and dial-up modem acting as UUCP provider

I use a Raspberry Pi running Debian 12, Taylor UUCP, Postfix, mgetty, and an old Cardinal 28.8k modem as my UUCP <-> SMTP gateway. This sits on my homelab LAN, and to avoid problems trying to run an Internet-facing SMTP server at home, mail is relayed to/from my public SMTP server at Linode. For the modem’s telephone connection it’s connected to one of the same Cisco ATAs as the BBS modems are connected to, so the call does not have to go out to the Internet nor PSTN.

I thought this was going to be a giant ordeal and I was going to miss some glaring chunk of the big picture, but since I was just taking inbound calls it wound up being pretty easy. Postfix provides an rmail binary (and in some packages it’s just a wrapper script around sendmail) for UUCP->SMTP, and the stock master.cf config already has an entry for calling UUX for SMTP->UUCP. Thank god for applications keeping legacy configurations around!

To give e-mails an extra boost to avoid being spam filtered, I set up DKIM to sign messages as soon as they’re received from the BBS, and added SPF rules to the DNS domain. This seems to help with Gmail, but not everywhere. Troubleshooting my SMTP relay and setting up DKIM took far more time than the rest of this project.

Mgetty interfaces with the modem, providing a plain ordinary login tty when the modem answers.

The BBS has its own Linux user account on the UUCP host, called Utuxedocat. There’s not a lot special about it. Its homedir is /var/spool/uucppublic and its shell is set to /usr/lib/uucp/uucico. The password for the uucp connection is also in /etc/uucp/passwd, because uucico can’t read /etc/shadow.

The username starts with the letter ‘U‘ to traditionally indicate it’s a system dialing in, and not a user. mgetty can also be configured to automatically start uucico if a login name matches U*.
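The U* convention is really just a glob match on the login name. As a toy Python sketch of the decision mgetty makes (this is not mgetty’s actual config syntax, and the routing table is made up for illustration; the two program paths are the ones from my system):

```python
from fnmatch import fnmatchcase

# (pattern, program) pairs checked in order; first match wins.
LOGIN_ROUTES = [
    ("U*", "/usr/lib/uucp/uucico"),  # UUCP systems dialing in
    ("*", "/usr/bin/login"),         # everybody else gets a normal login
]


def route_login(name: str) -> str:
    """Pick which program should handle a caller based on the login name."""
    for pattern, program in LOGIN_ROUTES:
        if fnmatchcase(name, pattern):
            return program
    raise LookupError(f"no route for login {name!r}")


print(route_login("Utuxedocat"))
print(route_login("bwann"))
```

The match is case sensitive on purpose, so an ordinary lowercase user name doesn’t get handed to uucico by accident.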

When the Utuxedocat account logs in, it automatically starts up uucico instead of a shell like bash. There’s a whole little handshake process taking place between uucico on the BBS machine and the one on the Linux machine. Things like transfer protocols, packet and window size are established. FX UUCICO actually can log this in great detail which is kind of interesting if you want to get into the weeds of how it works.

Update 14-May-2024:

After learning more about UUCP thanks to the Bear book, I learned there are a couple of ways to start up UUCICO. One way is how I’m doing it: mgetty answers the phone, gets the username, and then passes off authentication to either uucico or /usr/bin/login. In this case uucico is running as user ‘uucp’, it can’t access /etc/shadow, and thus you must use the Taylor /etc/uucp/passwd file to authenticate users. (Note: in this case /etc/uucp/passwd must be owned uucp:uucp, mode 640). I haven’t tested this but I’m pretty sure it means the unix password can be locked for the Uusername account as it’s never used. Another benefit of mgetty is that it also lets you start up PPP sessions to modem callers if you’re doing that sort of thing.

The other way is to set /usr/lib/uucp/uucico as the unix account shell so uucico and only uucico starts up right away. This would be useful for inbound TCP connections, and I think it would use the shadow password file for authentication.

Here is where mgetty has answered the phone, the username is passed to uucico, uucico authenticates the password, and then presents the “Shere” (system here) prompt.

 

Example of logging in and seeing the “Shere=uucp” session start message from the remote uucico program

Pending spool files are transferred, and uux is called to execute rmail (from the .XQT file) on the remote system to send mail messages to Postfix. (This also happens in reverse: pending incoming email from the Internet that is spooled up on the UUCP host is fed to a dummy ‘rmail‘ command on the BBS machine.)

UUCICO finishes and hangs up. Any email destined for the Internet is processed normally by Postfix and sent out via SMTP.

Here’s the log file from /var/log/uucp/Log with our Hello World:

uucico tuxedocatbbs - (2023-11-16 18:30:47.52 281914) Handshake successful (protocol 'g' sending packet/window 64/7 receiving 64/7)
uucico tuxedocatbbs root (2023-11-16 18:30:48.70 281914) Receiving D.tuxed35045W
uucico tuxedocatbbs root (2023-11-16 18:30:50.01 281914) Receiving X.tuxed35045W
uucico tuxedocatbbs - (2023-11-16 18:30:50.59 281914) Protocol 'g' packets: sent 6, resent 0, received 13
uucico tuxedocatbbs - (2023-11-16 18:30:51.18 281914) Call complete (6 seconds 465 bytes 77 bps)
uuxqt tuxedocatbbs bryan.wann (2023-11-16 18:30:53.20 282183) Executing X.tuxed35045W (rmail bwann[at]wann.net)

It has received the data and execute files that wcGate generated, and executes the request.
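If you want to pull numbers out of these log lines, a couple of regular expressions will do. This Python sketch is based only on the lines shown above, so the pattern may not cover every message Taylor UUCP can write. One thing it makes visible: 465 bytes over 6 seconds is 77 when divided as integers, which suggests the "bps" figure is bytes per second rather than bits.

```python
import re

LOG_LINE = re.compile(
    r"^(?P<program>\S+) (?P<system>\S+) (?P<user>\S+) "
    r"\((?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{2}) (?P<pid>\d+)\) "
    r"(?P<message>.*)$"
)
CALL_COMPLETE = re.compile(
    r"Call complete \((?P<seconds>\d+) seconds (?P<bytes>\d+) bytes (?P<bps>\d+) bps\)"
)


def parse_log_line(line: str) -> dict:
    """Split one UUCP log line into its fields, or raise ValueError."""
    match = LOG_LINE.match(line)
    if not match:
        raise ValueError(f"unrecognized log line: {line!r}")
    return match.groupdict()


line = (
    "uucico tuxedocatbbs - (2023-11-16 18:30:51.18 281914) "
    "Call complete (6 seconds 465 bytes 77 bps)"
)
entry = parse_log_line(line)
stats = CALL_COMPLETE.search(entry["message"])
seconds, nbytes, rate = (int(stats[k]) for k in ("seconds", "bytes", "bps"))
print(entry["program"], entry["system"], nbytes // seconds, rate)
```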

And here’s where Postfix picked up the message from rmail, signed with DKIM, and sent out via SMTP to the Internet:

Postfix log showing message sent to the Internet

Success!

Here is the email from the BBS in my inbox (DKIM header removed for brevity):


An e-mail reply

When I reply to my message from the BBS and send it using SMTP, Postfix will receive the message and then call out to UUX to write it to the local spool directory:

Postfix handling a message from SMTP and delivering to UUCP spool

Likewise there’s an entry in /var/log/uucp/Log showing the UUCP subsystem processed it and it’s sitting in the spool directory:

After another cycle of running UUCICO on the BBS machine:

Any email/replies destined for the BBS are now sitting in C:\WILDCAT\GATEWAY\WANNNET:

 

Batch import of new mail

wcGate is run again, this time with an import command. This reads emails from the local spool directory and inserts them into the Wildcat! message database.

Now the email shows up ready for the caller to read!


Will I be a purveyor of fine UUCP service for retro setups? Maybe. I’d like to learn more about UUCP workings, and maybe get private usenet working before then.

Update 2024-03-26: The Internet Archive has a copy of the 1996 book Using & Managing UUCP 2nd edition which is a great in-depth guide to UUCP!

 

Vandenberg Space Force Base (f/k/a Vandenberg Air Force Base) in southern California is where most west coast rocket and missile launch activity happens, dating all the way back to the 1950s. Lately people may know of it as where SpaceX launches payloads on Falcon 9 and ULA launches Atlas and Delta rockets. It’s also where the USAF test fires all sorts of missiles including intercontinental ballistic missiles, such as Minuteman III.

In the case of a Minuteman ICBM test, what they will do is pull a random active missile from a silo in Montana, Wyoming, or North Dakota, remove the nuclear warhead, and truck it down to Vandenberg. Here they will mount an instrumented dummy warhead on it, and store it one of the many silos right on the coast.

Usually about 4 times a year they’ll launch a test Minuteman III from Vandenberg toward the Kwajalein Atoll, way out in the Pacific Ocean on the other side of Hawaii. This is done to test readiness, maintenance, and try out new technologies. These tests are scheduled way in advance and are publicly announced so as not to escalate tensions somewhere. Generally a press release is sent out a couple of days ahead of a test, sometimes noting which launch facility will be used, and the launch window. If you don’t mind dropping everything at a moment’s notice to drive there and going to sit in a lawn chair for potentially 8 hours in the cold, you can watch one get launched.

Anyways as part of following SpaceX, ULA, and USAF launch activity there, I spent a while reading up on the various history and lore of the area. I’ve been down there several times to photograph launches, often with time to spend wandering around the area after a scrub. Apparently it was tradition after a successful launch for all the technicians and whatnot to head over to a small steak house called The Hitching Post in Casmalia for a beer and steak, which is just a short drive away from Vandenberg. I went there in 2018 and it’s mostly nondescript, except for all the mission badges mounted on the wall and stickers plastered on the mirrors in the bathroom and behind the bar.

A while back I came across this video which described in more detail what happened when a Minuteman missile was launched toward Kwaj. After launch, radar stations at Hawaii and Kwaj will track the incoming missile and warhead, and measure the accuracy of the impact of the warheads vs the intended target. Around the 6:42 mark of the video, they describe the tradition of the launch teams going to Hitching Post, tallying up the number of yards the warheads missed their targets, and then drinking that many yard-sized glasses of beer. Interestingly, the narrator says the person starting off the yard glass activities is Launch Director Randy Eady, and the credits say the video was written and directed by Randy Eady.

I don’t have another source to back this story, but it sure seems plausible, because what an excuse to drink beer.

 

If you’re running Qodem on MacOS with iTerm2, you may notice extended ASCII/code page 437 characters don’t render as lines, but just a bunch of letters like PPPPPPPPPPPPPPPPPPPP. Try unsetting the environment variable TERMINFO_DIRS (unset TERMINFO_DIRS) before starting Qodem. This seems to have fixed it for me.

I figured this out by noticing when I ssh’d from my Mac to a Linux system and ran Qodem, extended ASCII and ANSI escape sequences rendered just fine. However, when I ran Qodem locally from my Mac, all of the line drawing was messed up. I’m reasonably sure it was compiled right by the Homebrew community but couldn’t see any other reason for it to not render correctly. For giggles I did ssh localhost, ran qodem again on the same Mac and to my surprise everything rendered correctly!

I did a little diff’ing of env before and after unsetting things and finally settled on the TERMINFO_DIRS variable that was doing it. It gets set locally, but unset when ssh’ing to a remote system. I don’t know offhand what the difference is in the terminfo iTerm is using vs system terminfo, I just know it fixed my problem.
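The env diff’ing can be done by hand with env and diff, but here is a small Python sketch of the same idea, plus a helper for launching a program with one variable removed. The helper names and the example terminfo path are made up; the only real names here are TERM and TERMINFO_DIRS.

```python
import os


def env_diff(before: dict, after: dict) -> dict:
    """Return {name: (before_value, after_value)} for every variable that differs."""
    changes = {}
    for key in sorted(set(before) | set(after)):
        if before.get(key) != after.get(key):
            changes[key] = (before.get(key), after.get(key))
    return changes


def env_without(name: str, base=None) -> dict:
    """Copy an environment (default: the current one) with one variable removed."""
    env = dict(os.environ if base is None else base)
    env.pop(name, None)
    return env


# Example: a local iTerm2 session vs. what an ssh session sees.
local = {"TERM": "xterm-256color", "TERMINFO_DIRS": "/hypothetical/terminfo"}
remote = {"TERM": "xterm-256color"}
print(env_diff(local, remote))
```

To launch Qodem with the variable stripped, something like subprocess.run(["qodem"], env=env_without("TERMINFO_DIRS")) should be equivalent to running unset TERMINFO_DIRS first, assuming qodem is on your PATH.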

(For reference I’m running iTerm2 with the Cousine font, and Courier New for Non-ASCII Font, and UTF-8 character encoding. TERM is set to xterm-256color too.)

 

Before, running Qodem in iTerm2 on MacOS:

 

Yuck!

After running unset TERMINFO_DIRS before running Qodem:

 

Much better!!

 

Terminal.app

This doesn’t seem to be an issue in the standard Terminal.app, as it doesn’t define any sort of custom TERMINFO variables. However Terminal.app does seem to enforce a character and line spacing of “1” which gives ANSI graphics a bit of a screen door effect, which is mildly annoying to me.

Default Terminal.app settings

 

Wasted

[photos: flickr – Seagate ST-225 repair]

A follow-up to https://binaryfury.wann.net/2023/07/2023-drive-belt-saga/

I wound up buying another “parts only” Seagate ST-225 hard drive off of eBay to attempt to fix my ST-225. This is a 20 MB MFM drive out of the old family 286 computer. Long story short I’m pretty sure my repair worked but ultimately the drive wasn’t recognizable by the BIOS. I think track 0 (on the outermost of the platter) was too knackered up for it to be found. This could have been due to the move.  I don’t know if there’s any software I can use to get any sort of raw dumps since the drive doesn’t even get recognized.

As an aside when working on my 486, I noticed in the 5.25″ floppy drive there’s totally a stepper motor and a split band assembly in there. I’m wondering if I could’ve stolen one off an extra floppy drive, if it was long enough to resize to what I needed.

 

Donor drive before fiddling

Opening up the new donor drive both of its split drive bands were intact which was great. Upon further inspection I found one of the read/write heads had broken off and was rattling around in the drive, which explains why it was dead. A literal hard drive crash.

Crashed/missing drive head on donor drive

 

I very carefully disassembled the donor drive bands, taking lots of close up pictures beforehand so I would see how the various bits like washers and ends were orientated so I could do the same on the old drive.

After some very careful threading and tugging, I got the drive bands on my old drive. Tip: mount up the left band first to the two studs, then do the rightmost band. At the end of the band next to the screw hole is an extra hole, put a paperclip or something through this to let you keep everything under tension while trying to thread up the screw.

Bands assembled on old drive

After hooking my old drive back up to the 286 and booting it, I could see the stepper motor spinning and it sounded like it was seeking. Unfortunately POST paused and it coughed up the C: drive error.

After a couple of reboots I decided to throw in the towel. I just had to open up the drive to see if my fix worked. This probably destroyed the heads with dust or something. Turning on the system again, the heads did fully seek across the platters and back. It had a slight ticking sound like maybe it was trying to seek further than it could so I don’t know if maybe there’s an index on the stepper motor and I twisted it out of place when working on it.

Maybe someday I’ll get back to working on it…

 

Seeing if my fix worked
