
Our host explains the reasons behind today's downtime

Posted by Kim on January 29, 2012, 10:52pm

I just received word from our host about the horrific downtime we experienced today.

Apparently, for the last several years they have used an auto-updater to apply critical security patches the moment they become available, in the belief that the faster they can plug security holes the safer their customers will be.

This time, a security update was released for their operating system that had some kind of fatal flaw in it, and instead of making systems safer, it crashed them utterly. Although they realized what was happening quickly, their human brains were too slow for the speed of an automatic installer, which proceeded to trash many systems before it could be shut down. After that, everything had to be reinstalled, tested and sent live, which took all day.

They've promised that they will strike a better balance between speed of security patches and careful testing in the future, and make certain that nothing ever again gets auto-applied, no matter how much they hate leaving security holes. Clearly, a day of testing is much preferable.

The community discussion so far tonight has been very helpful in starting to figure out what to do after this incident. You have had some really wonderful ideas, and your kind and rational words have helped to temper my extreme upset over the RPR being so poorly treated. Over the last year and a half I have poured my heart and soul into this place, and it's very upsetting when it vanishes for a day!

On the bright side, no data appears to have been lost, and all systems have been running quite smoothly for hours now. It looks like we may be able to put this behind us.

In the coming weeks I will be scouting for other hosting solutions and comparing prices, so that when the day comes that we are ready to make our move, we know exactly where to go for the best service and the best price.

Epic Memberships have been extended by a full week to help make up for today's downtime.

Comments

Kim

February 3, 2012
5:05pm

Thanks for those links nukleartoast. I will definitely add them to my list of hosts to investigate.

nukleartoast

February 3, 2012
4:59pm

https://phpfog.com/

-- or --

http://www.linode.com

I know you're trying to avoid naming names, but this isn't the first time the hosting company you're currently using has had these kinds of problems. Personally I moved away in September of 2011 after an "unscheduled" system upgrade completely hosed a $50/mo VPS without warning. The final straw in a year of random outages.

Bonebag

January 30, 2012
10:37pm

My life is nothing without RPR. You people have filled a void in my heart that was originally a barren wasteland of LOLcats, random memes, and /b/.

...What have you done to me?...

Minerva

January 30, 2012
2:29pm

Well... that explains it. Here I was thinking it was my phone.

Sherlock

January 30, 2012
1:20am

I must say, I've seen websites go down for weeks on end. Some even months. This is one of the swiftest turnaround times that I've borne witness to when a host detects that an automatic update harbors a series of fatal errors within. I would find myself being drawn more towards Sanne's suggestion of giving them another try to see how the website will hold up now that they are implementing a 'new' and safer habit when they perform security updates.

They seem to have their customers in mind more so than their wallets. Some places you can't get any form of customer service at all.

-Sherlock

Degu

January 30, 2012
12:31am

It was my first time finally using this website for my characters and after trying to write a biography it all went down! Haha. I had wondered what happened and I can imagine it was stressful beyond belief.

You've got something wonderful going on with this place though. Don't worry about it, from what I can tell I haven't met one person who doesn't swear by this website loyally :)

thoreau

January 30, 2012
12:01am

I'll stick with you guys no matter where you move to. :)

CrescentNomad

January 29, 2012
11:40pm

I love the RPR and everything about it. To have it down for a day was rather shocking but I am more relieved that it is back. Everyone is very friendly, the layouts are convenient and it is probably the best case of customer service I've ever seen on a single website. This is a grand community and I want to see it succeed to the highest level. So thank you for all that you do, your hard work and dedication to something as simple as roleplay. Many of us love it, and RPR brings out the best in it. Whatever you decide to do server-wise, I will certainly back you 100%! I do agree with Haelbane, sometimes technology doesn't like to work on a perfect scale. I can only hope they did indeed learn from this insane downtime and will work to improve for the future.

Sanne

January 29, 2012
11:33pm

*hugs* As outraged as I was, I have seen worse and none of us are angry with you!! Don't let this upset you too much; things like these can happen to any host, and they seem to have picked it up very nicely. How about we give them another chance, see how the site will hold up for the coming six months? :)

Dylan

January 29, 2012
11:27pm

It's good to know what happened! A day is inconvenient and you were more than generous with offering additional time when the site goes down. However, like Haelbane said, better a day than a week, and it is good they went ahead and got all their testing done at once!

Haelbane

January 29, 2012
11:12pm

It's a day, at least it wasn't a week. Things happen, technologies fail, at least they did their best to get it taken care of in a day, instead of longer.

I'm not the kind of person to get upset over one day of lost service. More than a day? Then I'd be working on changing my hosts. Stuff happens, this just proves that we shouldn't put all our faith in Tech.