Monday, April 19, 2004

I woke up early, at what I thought was about 5am (which I later worked out to be 7am), hearing everything going off. Oh great, a blackout. I wasn't really awake, just conscious enough to notice that.

I went back to sleep.

I was woken again several hours later, by the phone ringing (the land line). I couldn't be bothered getting out of bed to answer it.

When it stopped ringing, my mobile started ringing.

It was the guy I go to dinner with. He wanted to know how to remove NetSky.Q from a machine he was working on.

I have no idea, I run a firewall, and keep my virus scanner up to date, so virus removal is not something I need to worry about.

I suggested he look on Google for help.

I noticed the power was still off. I decided to sleep until it came back on, there's nothing else I can do.

I can't finish configuring the servers, I can't check my mail, I can't blog.

I woke up again several hours later, I didn't know what time it was, since all the clocks were off (and as a result my alarm had not gone off).

I checked my watch, it was about 11.40am.

Bugger.

I got up, and decided to check the fuse box.

I turned off all the power points that run almost everything in my place (since everything is on powerboards chained together).

I wandered around past my parents' house, I looked in the front window to see if they had power, my sister saw me, opened the door. I asked if they had power, she turned the light on, and it came on.

Great.

I wandered around to the fuse box, opened it, started checking the circuit breakers.

I discovered that the circuit breakers that provide power to my place were off.

I switched them back on, they stayed on.

I came back in, and started turning everything back on. Everything came up.

The servers rebuilt their RAID arrays again. I don't know why they do that all the time, I'm worried it will wear the disks out.

My dodgy flaky motherboard wouldn't boot. I fiddled around with it for a few minutes, no good. I unplugged the PSU and left it, it sometimes comes good after that.

I had noticed the other night, that all the capacitors looked really dodgy, like they were popped/blown. There was some brown crud on the board around them.

I noticed one of them had done this like 18 months ago, but thought nothing of it.

I decided I would try replacing the capacitors on the board. The ones that look weird are all the same: JPCON 2200uF 6.3V caps.

I counted that 7 were obviously stuffed (there are actually 8, an extra one I didn't see near the keyboard connector).

I have to get parts to make up a couple of serial cables people ordered recently too.

So much for moving the gear today, there's not time left now, it's like 12.30pm.

I went out to see the guy who I'm going into the colo with, to get the keys, and check some IP addresses.

He hadn't copied the keys yet, so he gave me the originals to go and get copied. I also got the IP and RADIUS details I needed. I'll have to change all the IPs on my gear now.

I went across the road to the electronics place.

I bought the parts for the serial cables, but they didn't have the DB9 backshells I need. Bugger, I'll have to get them in Gosford, I'm going there (or near there) to get the keys done anyway. I also bought the capacitors to fix my board. In 2200uF, 63V was the lowest voltage rating I could get.

I looked at the keys, one is a registered security key, it had the contact details of the place that copied it. They should be able to help me.

I called them, to find out where they were. I told them I had a security key to get copied.

The woman started asking me who I was, and where the building/unit was that the key was for, and who the real estate agent that looks after the building was, and a whole bunch of other details I didn't know.

She put me on hold for ages while she made a couple of telephone calls to chase up the registry for the security keys.

All I wanted to know was where they were, I didn't care about getting the authorisation to copy the keys sorted out while I was on the phone.

When she came back from putting me on hold for the 3rd time, I said I would come in there and sort it out.

I rode into Gosford, and I went to the locksmiths.

I gave the guy the keys, said who I was and the story so far.

He went away. He came back a minute later, and wanted the contact details of the other guy, whose keys they are.

I gave him the phone number. He went out the back again, and I could hear him on the phone.

I waited, and waited, and waited.

Some other customer came in, with a broken padlock. Some other guy working there came out and helped him.

I wandered around looking at the gun lockers, and safes they have in there.

Some woman came in wanting some ancient looking key cut, someone else came and helped her.

I sat on the lounge and waited, and waited, and waited.

About 25 minutes later, the guy came out, with copies of the 4 keys.

I was thinking it would be nice to get change out of $10 for the lot, but it's not likely, especially since the security key that caused all the issues is a dopey double pin "alligator" or something key, which is sort of like 2 keys together.

It was over $20 for the 4 keys.

The guy told me there were now 10 sets of these keys floating around (at least of the registered security key), and they only knew where 4 of them were.

I took the keys. I thought about going and looking at the place, but I have to get the keys back.

I went and got some lunch, I went down the road to McDonalds.

I looked in the paper for anything about all the stuff that went on last night with the sirens etc, but found nothing.

I went back through Gosford again, I was thinking about looking in the place again, but I couldn't find a parking spot, so I just kept going.

I went back out to see the guy again, and I dropped his keys back to him.

I said I was going to be moving the gear in tomorrow, and that I would give him a call, so he could come in and give me a network link etc.

I came back to Gosford, I looped all around looking for a parking spot.

I eventually found one on the other side of the road, after going right around the block, wasting about 5 minutes.

I went into the building, up the stairs, found the units.

I went to the second unit, that's just used for storage, I opened the door (good, the key works), and went in.

It was pitch black, I couldn't see a thing.

I stood there for a minute waiting for my eyes to adjust. They slowly did.

I felt around for a light switch, next to the door, behind the door, on the other wall, but I didn't find one.

There was a door on the wall behind the door I came in. I pushed it, and it opened, ah, a bit of light came in.

The next room (also the other unit) has all the gear in it. A few racks in a row, cables and stuff running around the place, pretty empty though.

I could now see in this room a bit.

There's nothing much in here, a sink, and a few bits and pieces on the floor.

There's another door on the wall opposite the door into the other unit.

I walked over (carefully, not really able to see what's on the floor), and opened the door. There's a big empty office, it was bright in here, it has all windows facing the street where I was parked.

I went back in, I left the door open for light.

I went into the comms room, had a look around. Nothing terribly exciting, it's just like any other: some racks, PCs, networking gear.

There's a couple of monitors sitting on the floor, and keyboards and spare cables etc.

I noticed on the wall a light switch, conveniently, on the opposite wall to the entry door.

I wondered if there was one in the same place in the first unit, where I came in.

I could just make out something on the wall in a similar place.

I went back in where I came from, closed the door to the comms room.

I went over, and turned on the light. That's a bit better, I could see now. Nothing more than I thought, just a few bits and pieces on the floor.

I closed the door to the front office, turned off the light, went out, and closed and locked the door behind me.

I came home after that.

I googled around for details about leaking electrolytic capacitors, and found a great deal of stuff about it, especially in the case of failing motherboards. Ah! So that's what's going on.

Apparently it goes back to some corporate espionage a few years ago, when some workers in Taiwan stole a formula for a water based electrolyte, but they didn't steal the whole thing. They sold it to a bunch of Taiwanese capacitor manufacturers, who made millions of capacitors based on it, before they realised the stolen formula (that they didn't know was stolen, apparently), was incomplete, resulting in premature failure of almost all of them.

Apparently a lot of these capacitors found their way onto motherboards, video cards, power supplies, and other devices.

Several places on the internet have set up shop replacing the capacitors on motherboards, restoring function/stability to the board.

I googled around more, and found that symptoms of failed caps include things like the machine BSODing for no reason, and looking like memory failures, or other unexplained errors. Operating systems failing to install, boot, issues with POSTing, everything.

These were all symptoms I had experienced, I remembered a few months ago (maybe 8 now) the machine became really unreliable, wouldn't run for more than a few minutes without locking up or BSODing.

I thought it was heat issues, and bought a massive fan and put it in the case, but it did nothing.

I tried reinstalling Windows a bunch of times, it wouldn't finish installing, or then it wouldn't boot.

I tried a different hard disk, made no difference.

I tried a Windows install from a different machine (on a different disk). That would boot, and run, but it wasn't very happy.

I fiddled around with it for days, even trying to run Windows 2000 instead of 98. That worked slightly better, but only for a few days, then it kept corrupting its own registry.

Linux was somewhat happy on the machine (more happy than Windows at least), but even it was locking up hard every now and then, something I almost never see Linux do.

I put it down to hardware issues, I pulled the whole machine apart in the end, swapped all the cards around, mucked around with the whole thing. It ran just sitting on the desk as a board with cards in it, and all drives hanging off it everywhere. It seemed somewhat reliable, and ran like that for a few weeks (I put back the original Windows install that it had been running forever) before I put the machine back together, and it worked until around Christmas, when it failed again. (I think I blogged this).

I had just arranged to get the board fixed/replaced by Abit (also the only company to have admitted using the faulty capacitors), when after refusing to boot or do anything for a couple of weeks, the machine started working again.

It did this again. Died, for a couple of days, and then after I bought the new board, it started working again.

I fiddled around on the servers a bit more, I changed the IPs I used, reconfigured the RAS, put in the details of the other guy's RADIUS server, so that his customers can dial into it, and be authenticated against his server.

I called my mate (who got punched the other night) and asked if I could borrow the van in the next couple of days, or perhaps the trailer, just to get the rack into Gosford, as I now had the keys.

He said he would see, the trailer was loaned to someone in Sydney, and he only had a couple of short jobs tomorrow, so he would call me in the afternoon.

I watched a bit of tv after that.

I went in and saw my parents, had dinner with them, told them what was going on, since I wanted to find out what Dad was doing tomorrow, so I could arrange to get the gear moved into Gosford.

He was going to go back to work today, but he decided he would take an extra day off, so he could help, and be there while the gear moved in.

Cool. I could have got my mate (that I go to dinner with) to help me, but he likes to play snooker on Tuesdays, and I'd rather not interrupt that if I can help it.

I sat and watched tv for a while. They must have changed the timing on the key updates (or my machine was behaving) because it rarely had issues/mpeg breakup. (Except for one time, when the machine locked up and I had to come reboot it, just after I had told the story about the dodgy caps to my parents).

I came back in a bit after midnight, IRCed for a while, I talked about the caps in there, someone pointed me at a site about capacitors, it says that capacitors have much reduced capacitance when run well below their rating (I got 63V rated to replace 6.3V rated), and I'd also read things saying that motherboards need "low ESR" caps, which "normal" electrolytics are not.

I guess I will need to chase up some caps closer to spec. I found a site on the internet where some guy sells kits of the caps you need to replace the ones on certain boards, maybe he has one for mine (Abit KT7 non RAID).

I went to bed about 3am, after watching more tv.