Hi Lisa,

First, they're internal only, so I'm not worried about unusual hacking on them. My wife and I are the only users on the network.

No changes whatsoever across my chroot. I validated that nothing got deleted, though I didn't run CRCs since I literally just duplicated the vmdk to another guest system. In doing so, I cloned the disk, redid the hostname, hosts, and networking to rehost the server, got bind fully functional, cloned again, and rebooted both instances. Both failed in exactly the same way. This also happened to the original clone, after setup, after a normal reboot of the guest.

No weird config files showing up. RNDC worked prior to my reboots; I was using it to force a zone xfer to test. There's no forwarding, only recursion from my internal subnets.

The file definitely exists, set 664 for the appropriate user/group. I gave bind a shell and su'd to it and perused all the directories to make sure I could read/write to the right portions of the fs.

As far as I know, bind should launch regardless of whether there's an RNDC key; it's mostly for external control of the daemon. I remember having it working at one point when the key was gone by mistake, and I couldn't rndc reload the zones.

It's really quite odd that I only have this issue with the slave, and not the master. I have the relevant slave directory where it creates the xfer'd zones writeable, and the relevant devs only, but the rest is entirely readable or owned by bind.

I spent some time messing with the apparmor profile, as they don't accommodate chroots by default, but I can't find anything else in strace that it's trying to reference and cannot. It's fairly clean until the point it pukes about what the binary prints to stdout anyway when run non-forked.

I really had enough short of throwing the monitor out the window today, so I'm going to pick up with it tomorrow and look at purging/reinstalling the binaries to validate that, and might just remove apparmor altogether.
I have a sneaking suspicion it's doing something it shouldn't. This was working prior to my rebuilding them as ibex servers vs. an old gutsy or even feisty, prior to apparmor inclusion. It's the only big thing I can see that might be screwing with it.

Thanks for the input. I'll look at it some more from your perspective and see what I can see in the morning.

-mb

On Wed, 2009-08-26 at 16:46 -0700, Lisa Kachold wrote:
> Hi Michael,
>
> I have seen a good many hacked bind servers and various known things
> happen to them:
>
> 1) something strange changes chroot?
> 2) configuration files mysteriously appear with ALT255 ascii characters
>    in front of localhost entries, etc.
> 3) rndc key permissions are opened so anyone can control the server,
>    when not completely firewalled.
> 4) when recursion and forwarding are misconfigured, cache poisoning is
>    rampant.
>
> In any case YOUR bind error is describing FIRST inability to find the
> /etc/bind/named.conf file. Does it exist?
>
> Following bind to socket() issues is due to the failure to load a
> perfectly acceptable named.conf file that calls rndc key, etc. I
> believe?
>
> But run a crc check against the binary, blow away the package and
> reinstall it.
>
> BE sure your configuration files (not using a db?) are intact...
>
> On Wed, Aug 26, 2009 at 2:23 PM, Michael Butash wrote:
> > I'm curious if anyone's seen anything nutty like this before...
> >
> > So I'm migrating my dns instances between boxes when I noticed my
> > secondary dns server isn't starting bind anymore. Primary still works
> > fine, no issues.
> > Debugging gets me this error:
> >
> > user@dns03:~$ sudo named -u bind -t /var/lib/bind -g
> > 26-Aug-2009 21:01:33.568 starting BIND 9.5.0-P2 -u bind -t /var/lib/bind -g
> > 26-Aug-2009 21:01:33.569 found 1 CPU, using 1 worker thread
> > 26-Aug-2009 21:01:33.575 loading configuration from '/etc/bind/named.conf'
> > 26-Aug-2009 21:01:33.575 none:0: open: /etc/bind/named.conf: file not found
> > 26-Aug-2009 21:01:33.587 net.c:80: unexpected error:
> > 26-Aug-2009 21:01:33.587 socket() failed: Permission denied
> > 26-Aug-2009 21:01:33.588 net.c:80: unexpected error:
> > 26-Aug-2009 21:01:33.588 socket() failed: Permission denied
> > 26-Aug-2009 21:01:33.588 loading configuration: file not found
> > 26-Aug-2009 21:01:33.589 exiting (due to fatal error)
> >
> > After futzing with this for several hours, I give up, clone the primary,
> > migrate the slave config files, and get it working again. Happy it's
> > working, I reboot it, migrate the instance again, and I get the same
> > damn errors. I can find _nothing_ related to an error like this
> > anywhere on google, and even strace-ing it shows me nothing other than
> > for some awful reason it now doesn't seem to think an ethernet interface
> > exists.
> >
> > Any ideas why a functional slave bind server would "lose" its
> > capability of binding to an ethernet interface after a reboot? As far
> > as I can tell, this is the only thing that seems to be holding it up.
> > This is the most frustrating and asinine thing I've seen ubuntu do in a
> > while, pretty much killing my entire day thus far...
> >
> > I've checked apparmor, permissions (all files readable fine by user),
> > named.conf allowing "any" bind interfaces, and again, it was working
> > perfectly before a reboot. This is entirely reproducible as well, as
> > apparently I just flipping did. Ugh.
> >
> > I do know about djbdns and rdns being "better", I'd just rather not have
> > to waste a few days when bind has and does always suit my needs just
> > fine.
> >
> > -mb
> >
> > ---------------------------------------------------
> > PLUG-discuss mailing list - PLUG-discuss@lists.plug.phoenix.az.us
> > To subscribe, unsubscribe, or to change your mail settings:
> > http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss