Continuing…

Written on May 5, 2012 at 4:04 pm, by

So, the Comcast tech is here in the building, and he says that the WA market doesn’t use modems other than the one I’ve got for business accounts, so the fix I was hoping for was a bust.    He says there’s a lot of RF noise and he’s working on that, and he’ll swap out for a different modem of the same type.  I was told that the routing problem we’re seeing is a firmware issue, so I don’t hold out a lot of hope for this being a good solution.

So I’m on to Plan F: Abandon the affected IP addresses and ask for my money back.   I’m going to start the process of moving said services onto a different IP.   Since there isn’t a physical move involved, there should be no appreciable outages (other than when the modem decides to crap the bed again).

Puget Sound Atheism and vis.nu Networks are the only affected subsystems– basically, everything brought into the corporate substrate from The Great Convergence.  It’s all one set of servers now, but it still speaks on three addresses.

I’ll just move them onto the same IP as Tacoma Telematics, and all should be well.   This is also a temporary solution, as there’s likely another move coming up.

Tacoma Routing Issues: Problem Upstream

Written on May 4, 2012 at 1:53 pm, by

It’s a situation of everything happening at once.

The hypervisor is fixed, the router is running smoothly, and I’m even making progress on my terminal server.   I’ve upgraded the Hypervisor routing tools, and they’re working more efficiently than before.   But vis.nu and PSA would still stop talking to the world now and again, and flushing caches didn’t fix anything this time.    I started to suspect that it was something upstream from here.

I placed a laptop outside of the rack, and it can see PSA and vis.nu when the rest of the world can’t, no problem.  Next step is the CPE, so I rebooted that.   Lo and behold: everything came back up.   I’m on the horn with Comcast, just got handed to level 2, and I’m working on doing some long-term testing.
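For the curious, the laptop test boils down to running the same reachability probe from two vantage points (one on the local side of the CPE, one out on the internet) and comparing the answers. Here is a minimal sketch of that kind of probe in Python. The function name `is_reachable`, the default port 80, and the timeout are assumptions for illustration, not the tooling actually used here.

```python
import socket
import sys


def is_reachable(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


if __name__ == "__main__":
    # Hosts to probe are passed on the command line; nothing is hard-coded.
    for host in sys.argv[1:]:
        state = "reachable" if is_reachable(host) else "UNREACHABLE"
        print(f"{host}: {state}")
```

If the inside vantage point says "reachable" while the outside one says "UNREACHABLE" for the same host, the servers themselves are probably fine and the fault is somewhere between the two probes.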

At this point I’m kind of hoping the problem comes back, so they’ll replace the router.  But if I were a betting man, I wouldn’t put money on it.

Update 21:23: Finally got a solid answer out of Comcast after about 12 hours of fighting: a new firmware push introduced a bug where the CPE would drop the last two IP addresses on /28 networks.   They’re sending a tech between 8:00 and 10:00 tomorrow to install a modem with a different firmware, which should fix the problem.   For now, I’ve moved the Comcast router into my remote rebooter so that if/when this happens again tonight, I can reboot the modem without having to live at the office.
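To see which addresses a bug like that would hit, here is a small sketch using Python's standard `ipaddress` module. The block `203.0.113.0/28` is a documentation range standing in for the real one, and I'm assuming "the last two IP addresses" means the last two usable host addresses (not counting the broadcast address). The helper name `last_usable_hosts` is made up for this example.

```python
import ipaddress


def last_usable_hosts(cidr, count=2):
    """Return the last `count` usable host addresses in the given network."""
    network = ipaddress.ip_network(cidr, strict=True)
    # hosts() skips the network and broadcast addresses for a /28.
    hosts = list(network.hosts())
    return [str(address) for address in hosts[-count:]]


if __name__ == "__main__":
    # Placeholder documentation network, not the actual affected block.
    print(last_usable_hosts("203.0.113.0/28"))
    # prints ['203.0.113.13', '203.0.113.14']
```

A /28 has 14 usable host addresses, so anything parked on the top two of them would be the part that goes dark.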

New Hypervisor and Kernel Installed

Written on May 3, 2012 at 7:41 pm, by

The Tacoma production server is on the upgraded packages, including the new server.  The reason the kernel was being strange was my fault.   After working some things out, the new kernel is booting fine and the VMs are running.     I’m going to nap in the back room now (too bleary-eyed to drive) and then get back to it.

Router Issues, Hypervisor Update

Written on May 3, 2012 at 9:23 am, by

For some reason, ganesha is losing both the PSA and vis.nu IPs now and again.   A reboot fixes it… I’m hoping it’s part of our continuing issues with the present Xen hypervisor, but I’m not holding out a lot of hope.   I’m keeping a close eye on it.

Meanwhile, I’ve got helium (my incredibly loud development server that I stopped using when I moved out of the Washington Building and had to have the rack on the same floor as me) fired up with my fresh Xen package.  I’m load testing the Linux PVMs while installing a Windows HVM, and no crashing so far.   If that holds out for the next few hours, I’ll be taking hydrogen (my production server) down in order to upgrade to the new hypervisor.    Shouldn’t take long, and I can back out of the upgrade if something breaks.

Update 08:38: The new copy of Xen is installed, but it didn’t take with the same kernel I’d been using on my test hardware.   The older 2.6 kernel seems to work, so I’ve put that back on and loaded the virtual machines that way.   I’m still talking to Xen folks, and when I’ve got a solution I’ll give it a shot.  Probably not until tonight.

Hypervisor Acting up In Tacoma

Written on April 17, 2012 at 10:34 pm, by

The Xen Hypervisor is acting up in Tacoma, causing intermittent crashes there.   The watchdog timeout is catching the crash and rebooting the system, so we come back up, and I’m still working on it.

Also, we’ll be moving again, and ironically, back to Arizona, as I’m now in sole control of the colocation contract for that server, and connectivity there is stable.    But it’ll be with brand new hardware on Xen once again.

I’ll update this post as things change.

Update 6:46PM PDT: Okay, I’ve tracked down what is triggering the crash in the hypervisor.   I’m attempting to build new patches with a potential fix applied.   Unscheduled crashing should be over, but I will be using non-prime times to do some testing.

Short Outage at the Phoenix Server

Written on April 9, 2012 at 10:37 am, by

The Phoenix virtualization server stopped responding around 00:28 on Monday, April 9th.   I don’t have a reason, and the hardware self-checkout is only complaining about DRM (Direct Rendering Manager, not Digital Rights Management) checksum failing and three bad sectors on one of the RAID disks.  The DRM thing is Xen related I’m pretty sure, and the bad sectors are normal for a drive like that one.

So, no idea why it went down.

Effect: Podcast downloads for Ask an Atheist were down, and DNS for vis.nu was broken after our TTL of one hour expired.  Mail delivery for tonight will be delayed, but we’re well within tolerances for mail delivery.

The Great Convergence is Complete!

Written on January 26, 2012 at 8:46 am, by

I moved the PSA server into the cluster, and then figured I would stick with technology that works and research more advanced routing possibilities later.   There was a bit of an ARP twiddle, but after I got that moved, I’m happy to say that the move from 14 virtual machines to 6 is complete.

Now that I’ve simplified things, I can actually start on some improvements.     That’ll happen a little later, I need to catch up on stuff first.

Dev and Shell Integrated

Written on January 25, 2012 at 7:37 pm, by

Indra and tt-dev are now integrated, leaving only the firewall/routers to integrate.   I need to do a bit of research before I can integrate those systems.   I believe I will now start on the PSA network, so I can finally shut that system down.

So for the next little while, I’ll have three routers pointing at the same set of systems.    Fun!

Web Server Integrated

Written on January 25, 2012 at 10:43 am, by

The tactel and vis.nu web servers are now integrated, leaving just the dev/shell and firewalls left out of the cluster, as well as the PSA server, which, save for mail services, remains separate from the new cluster.

Integration of PSA begins once vis.nu and tactel are fully integrated.

Also, all webmail services are now through https://mail.tacomatelematics.com, including vis.nu mail services.   This way, all I have to do is focus on getting one webmail client up to spec.    This includes password, domain, and mailing list management.

You’ll note the https, and it is indeed encrypted, but with a self-signed certificate at this time.

I’m going to begin implementing the other portions now, and then move on to PSA.   Once that’s complete, I’m going to begin transitioning all active WordPress installations into a single multisite system, again to make our lives easier.

Mail Servers Integrated!

Written on January 21, 2012 at 5:22 pm, by

The server integration continues apace!

Since the new and old servers are actually on the same physical hardware, stuff gets moved and integrated pretty quickly.   Lists, accounts, web clients, and admin stuff are moved off of the old server and onto the new.   There are pointers in place to help you find them.

The current URLs are temporary as we continue to move new services.   As you’ll note, web clients and stuff all have Tacoma Telematics logos on them, as they’re now the same system.   Make no mistake!   It’s still vis.nu under the hood.