I woke up late this morning, didn't get to work until 10am.
Had a 10.30am meeting, about the new system I'm writing, where we're at, and what the plan is.
That went until about 11.45am.
After that, I had another meeting with a guy, about server configurations and an apache upgrade I'll have to do.
I have to make changes to the production intranet server, so I'll have to do it out of hours. Because it takes ages to get apache/php/pdflib all compiled, working properly, and then configured, I'm going to end up doing it on the weekend. Great, I want nothing more than to have to go to work on the weekend.
I went across the road to get a coffee, ordered, and realised I'd left my wallet on my desk, because I'd had it out of my pocket because my access card is in it, and I have to get through all the stupid security doors.
I went back across to the building, got my wallet, and went back to the coffee shop, paid, and then came back with my coffee.
On the way back into the building, I bumped into the guy from Orange who's taking over IT, he was at reception with a couple of other people.
He commented that I could have had a free coffee, I looked at the people he was with, it said "Mobile Coffee" on their shirts or something. I said maybe I'd have another one in a bit (the way this day's going so far).
I got back to my desk, and I emailed the guy I got the Sparc off, letting him know the problems I found.
It was now a bit after midday, I got an email from IT, saying there would be an email outage for 10 minutes.
At about 12.45, when it should have been back up, I tried to check my mail, the mail server wouldn't accept my logon details. Hmm.
My initial thought was that someone had installed a service pack or whatever for exchange server, and that the major function/ulterior motive for M$ putting out the update is that it stops Thunderbird from authenticating against exchange.
A new guy from IT came down, and said they were having a problem with the network, and phones, and they were working on it, then jokingly asked what I'd done.
I just told him I couldn't access my mail, and said I'd fix it for myself if it wasn't fixed in the next hour, and then he went away again.
Our phones and network connections (downstairs) were fine.
I browsed the web for a little bit, and then checked my mail again, still couldn't connect.
I went up to the IT area, and it was empty. The only guy there was the guy who's taken over, he said that most of them were in the server room, so I asked him to open the door for me (I still haven't got around to getting access to the server room on my card).
There was a stack of people in there.
I asked about the mail server, apparently it had run out of disk space, one of the guys had deleted a couple of big files that didn't belong there, and rebooted, but it still only had 300MB free.
I wasn't able to connect to it, even locally on the exchange server, I checked the eventviewer log, and there were a ton of errors about "OpenMsgStr" or something.
I restarted a couple of exchange services, and then it looked like it was working again, I could connect to the webmail interface, and it was back logging a million errors about people not having a master SID or some nonsense.
I went back downstairs, and tested it, I was able to connect to the mail server again.
I went back upstairs, and at this point I tried to find out why everyone else was in the server room, it seems that there's a truckload of broadcast network traffic going on, on all the data/voice switches upstairs, and no routing/connectivity to other subnets.
The router is set up for redundancy, but both cable runs/cards/links are stuffed.
I went down and got a knoppix cd, booted it up on a laptop that was in there, manually configured the ethernet interface, since I couldn't get any dhcp details, and then went about packet sniffing.
There was a stack of dhcp requests going on, and some routing stuff, as well as some dns requests from the stupid 169 autoconfiguration range.
I'll bet all the broadcast stuff is just the dhcp requests, then I remembered about the voip phones, of course that's what it is, since there's no connectivity to the voip server, the phones keep rebooting.
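As a rough sketch only (this is not what was run on the day), the tallying I was doing by eye in the sniffer could look something like this in Python. The packet tuples are made-up sample data, and `tally_broadcasts` is a hypothetical helper name; the only real library call is the standard `ipaddress` module's `is_link_local` check, which covers the 169.254 autoconfiguration range.

```python
import ipaddress
from collections import Counter


def is_autoconfig(addr: str) -> bool:
    """True if addr is in the 169.254.0.0/16 autoconfiguration range."""
    return ipaddress.ip_address(addr).is_link_local


def tally_broadcasts(packets):
    """Count packets by kind, and count how many came from 169.254.x.x.

    `packets` is an iterable of (source_ip, kind) tuples. This shape is
    an assumption for illustration, not a real capture format.
    """
    counts = Counter()
    for src, kind in packets:
        counts[kind] += 1
        if is_autoconfig(src):
            counts["from_autoconfig"] += 1
    return counts


# Hypothetical sample, not real capture data.
sample = [
    ("0.0.0.0", "dhcp"),
    ("0.0.0.0", "dhcp"),
    ("169.254.12.7", "dns"),
    ("10.0.0.1", "routing"),
]
counts = tally_broadcasts(sample)
```

If the DHCP count dwarfs everything else, that fits the theory that the flood is mostly devices (like the phones) asking for an address over and over.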
There was a new guy there, he was trying to work out how to connect to the core router, to check it, since the links had gone down between the ports on it, and the switches connected to all the gear upstairs in the building (phones, workstations etc).
They managed to phone someone, and find out the passwords for the core router, then we had to find a way to connect to it, they were trying to use a cisco console cable, somehow I doubt that's going to work on a nortel router, it didn't.
I found a couple of nortel cables in the rack, and then we were right. The new guy went about logging in, but wasn't sure how to interact with it, I had a look, it looked a bit like cisco config, and also a bit like ericsson config.
We worked out how to navigate around the CLI, got in, and found that if we switched the individual port states to test, and then back to enable, the links came back up.
There was still no routing going on though. I telnetted to the router from the mail server, and poked around a bit, but couldn't really find anything. I tried disabling routing and reenabling it, no difference.
I went back to dealing with the mail server.
The log directory had over 8000 5MB log files, over 40GB worth.
I googled, found a page detailing recovering from an exchange server data storage disk running out of space and went about following the instructions they recommended.
Apparently you run a util, and point it at the checkpoint log, it tells you the last log file you need to rebuild the database. I found out then that the log files are a transaction log, in case the database falls over, and can't be unmounted properly, when it comes back up, it checks the log files.
Anyway, I found the last log file I needed for that to work, it was from only a few hours ago, and the log directory still has logs from November last year in there.
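The idea of working out which logs can go is simple enough to sketch in Python. This is an illustration under assumptions, not the procedure from the page I followed: it assumes the logs are named with a fixed prefix plus a hexadecimal sequence number (the `E00` prefix in the pattern is an assumption), and that anything numbered lower than the log the checkpoint points at is no longer needed for recovery. `logs_safe_to_move` is a hypothetical helper name.

```python
import re

# Assumed naming pattern: prefix "E00", a hex sequence number, then ".log".
LOG_NAME = re.compile(r"^E00([0-9A-Fa-f]+)\.log$")


def log_sequence(name):
    """Return the log's sequence number, or None if the name doesn't match."""
    match = LOG_NAME.match(name)
    return int(match.group(1), 16) if match else None


def logs_safe_to_move(names, last_needed):
    """List the logs numbered lower than `last_needed`, oldest first.

    Files that don't match the pattern (checkpoint file, current log)
    are left alone.
    """
    needed = log_sequence(last_needed)
    if needed is None:
        raise ValueError("unrecognised log name: " + last_needed)
    older = [n for n in names
             if log_sequence(n) is not None and log_sequence(n) < needed]
    return sorted(older, key=log_sequence)


# Hypothetical directory listing.
listing = ["E0000010.log", "E000000F.log", "E0000011.log", "E00.log", "E00.chk"]
movable = logs_safe_to_move(listing, "E0000010.log")
```

Moving the files to another partition rather than deleting them keeps a way back if the assumption about which logs are needed turns out to be wrong.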
I ended up moving 3GB worth of log files to another partition (would have moved more, but the whole machine is pretty full), I then rebooted the server again, it came back up, and all looked to be working.
I asked the new guy if he was getting anywhere with the routing issue, no luck yet. I suggested deleting the whole subnet, and recreating it, didn't work.
This is getting beyond a joke, we're down to a couple of options:
disabling/reenabling the slots (hopefully restarting the cards),
rebooting the whole router (which will cause massive connectivity issues, since it will disconnect everything, including remote sites)
If neither of those works, then perhaps the unit is stuffed. Seems strange that both cards would go, in the same way.
It was now almost 4pm, I hadn't been out for lunch, neither had a couple of the other guys. I asked the new guy if he wanted a pie or something, since he hadn't had lunch either.
We came out of the server room to talk to another guy, who's staying up there in the pub at night. He just wanted to finish the stuff he was working on, so we waited a few minutes for him.
Because of the network issues, he was working on another guy's pc, which is attached to the server subnet (which still has connectivity). The guy whose desk it was, was just packing his stuff up to leave, since he couldn't do anything.
He had one of the free coffees on his desk, which I commented on, and he said it was the worst thing he'd ever tasted. He'd only had a couple of mouthfuls out of it, and threw it away, and then left.
The guy we were waiting on finished up his stuff, and then he grabbed all his stuff, and we left.
We went down and jumped in his car, while he moved it around to park behind the pub. He's got a cool old vw bug, the inside was a bit rough, but the body was ok, had been resprayed a while ago.
We went into the pub while he checked in for the week, this is his first day back at work since Christmas, and when he went to check in, they didn't have his reservation.
We had to wait for a while for the manager to turn up, to sort it out. He eventually turned up, and was able to give the guy a room for the next 2 nights, and then they'll have to sort something out.
He finished checking in, went and dumped his stuff in the room, and then we wandered across to the mall. It was now a bit after 4pm.
I went to the bakery, and bought a couple of pies, one of the other guys got a kebab, and the other guy didn't get anything.
We wandered out, found a seat, sat and ate and chatted for a while. We were there until 5pm, and then we went back to the building.
The router had been sorted out, and the guys we left in there working on it/waiting for a phone call had left.
I sat there, and watched one of the guys working on some database stuff, showing me the application he's trying to fix up.
I ate the other pie, rather than waste it.
I went back down, did a little bit of work, and then grabbed my stuff and took off, around 6pm.
I went to the supermarket on the way home, got food for dinner, and came home.
No sign of the other tivo I bought last week.
I watched a bit of tv, drank some of the Bundy triple filtered (neat, something I can't do with normal Bundy), and went to bed about midnight.
