I’m not great with Linux. I know just enough to know that there are a lot of things I don’t know. On a Windows file system, I can look at the folders and tell you what is supposed to be there and what is not, but when it comes to Linux I get lost. That doesn’t mean I don’t like it or that I don’t like to play with it, but it does mean that I’m slightly out of my comfort zone when making changes on a Linux server. So when I crashed our web server this morning, you can imagine how quickly my normal unease moved to outright panic.
Having played that particular game before, not only did I have a backup of the important files but I also had a snapshot of the server itself. Unfortunately, my snapshot failed to open and the server is currently located Far Far Away™ or “In the Cloud” as the kids say these days. Obviously I got the website back online or you wouldn’t be reading this blog entry, but the experience was one of trepidation. However, it was actually my lack of knowledge that got the website back up quickly.
Because I’m not so great with Linux, there have been quite a few times where I have decided to start over. While toying with something I didn’t fully understand, I inevitably performed an action that I didn’t know how to undo. Thus it has been part of my standard operating procedure to take a mulligan by completely reloading the operating system (i.e., starting over). Because I break Linux a lot, I start over quite often. I’ve done it enough that I’m really good at starting over.
When I crashed our web server, that experience came in quite handy. I was able to completely reload the web server and the site was back up in about 30 minutes. Most people don’t want to hear that someone they pay to know about these things made a mistake so huge that he had to reload his server, but I’m going to let you in on one of the major secrets in IT: The best engineers, the very best technicians, break things all the time. They break things almost as often as they fix them. That’s a lot.
The reason? It’s not that IT people are malicious, although we have some of those as well. The reason is that that’s how IT people generally learn. The white-hot pressure of knowing that you’ve just broken something and now you have to figure out not only how to fix it but also how to make sure it never happens again, that palpable tension, either breaks you down with stress or forces you to rise to the occasion. Either way it sharpens your mind and drives your thoughts to a new level. It teaches you a lesson that you’ll never forget.
The best engineers I know are the ones who, in their spare time, broke their own servers. Some of them did it to servers in production. Some of them cracked under the stress. But all of them learned from the experience and were wiser for it. The very best of them were forged from that pressure and have never cracked since, because they remember that experience. They learn from their mistakes. They get back up.
Sometimes the lessons we learn in life are about what not to do. Sometimes the lessons are just reinforcing the other lessons we previously ignored. The important part is that we get out of our comfort zone and continue to learn. Learning doesn’t happen when you’re secure in your knowledge of how things are; it happens when you’re put in a position where you have to find a way out. Sometimes you have to crash your web server.