September 2019 Outage Emails
From August 29 to October 11, 2019, RoboWiki and Old RoboWiki suffered a 44-day outage. RoboWiki then experienced a second, shorter outage in the third week of November.
During the downtime, there was an email discussion between RoboWiki's administrators about VPS hosting, server software, and how to move forward.
The highlights:
- David Alves pays for and manages the VPS that RoboWiki runs. PEZ pays for and administers RoboWiki's domain name.
- A subset of RoboWiki's administrators, about seven people, has SSH access to the server.
- RoboWiki runs on Ubuntu 12.04 (Precise Pangolin), MediaWiki 1.19.6, lighttpd, and MySQL.
- RoboWiki's server makes automatic backups regularly, but these are stored on the server and not offsite.
- RoboWiki's server regularly requires manual reboots from SSH. On August 29, when the server was rebooted, it failed to boot.
- MultiplyByZer0 and Flemming N. Larsen notified RoboWiki administrators about the outage. The administrators began investigating on September 24.
- Because the server could not be restarted from SSH or cPanel, David Alves had to file a support ticket with the VPS hosting company. The host's response time (17 days) was unsatisfactory, but they fixed the issue.
- RoboWiki administrators considered their options for moving to a new host. Companies were scrutinized, and the top choices were Hetzner Cloud and Miraheze.
- Since the Old RoboWiki is now effectively a static website, it can be placed on static web hosting, which is free. PEZ made a proof of concept with Netlify.
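The durations quoted in the highlights can be checked with simple date arithmetic. The sketch below is only an illustration: it assumes the 44-day figure counts both the first and last day of the outage, and that the 17-day response time is measured from September 24 (the first email in the chain) to October 11. Neither convention is stated in the emails, and the variable names are hypothetical.

```python
from datetime import date

# Dates taken from the timeline on this page.
outage_start = date(2019, 8, 29)       # server begins returning 500 errors
admins_notified = date(2019, 9, 24)    # first email in the chain
service_restored = date(2019, 10, 11)  # wiki reported running again

# Difference between the two dates (the end date itself is not counted).
elapsed_days = (service_restored - outage_start).days

# Assumption: the "44-day" figure counts both the start and end dates.
inclusive_days = elapsed_days + 1

# Assumption: the "17 days" figure runs from the first email to the fix.
days_since_notice = (service_restored - admins_notified).days

print(elapsed_days, inclusive_days, days_since_notice)
```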
August 29
The RoboWiki server begins returning "500 - Internal Server Error" for every URL.
A day later, it stops responding at all. All requests time out.
September 24-25
Date | Tuesday, September 24, 2019 21:42:10 UTC |
---|---|
Sender | MultiplyByZer0 |
Recipient | Rednaxela, Skilgannon, Voidious |
Subject | RoboWiki has been down for a month |
Hello, Apologies if you already know about this and are working on fixing it, but RoboWiki and the Old RoboWiki are both down, and have been that way for almost a month (since August 29). We are forced to read pages through the Internet Archive, and we cannot edit pages, discuss Robocode, or submit bots to the RoboRumble. Since you have shell access to the server, can you take a look at why the website is down? Thanks. MultiplyByZer0 |
Date | Tuesday, September 24, 2019 22:13:39 UTC |
---|---|
Sender | Rednaxela |
Recipient | David Alves, Skilgannon, Voidious, MultiplyByZer0 |
Subject | Re: RoboWiki has been down for a month |
Hmm, strange, the server isn't responding at all. Hi David, hope you're doing well. It seems the VPS isn't responding at all? Best Regards, Rednaxela/Alex |
Date | Wednesday, September 25, 2019 07:05:23 UTC |
---|---|
Sender | Skilgannon |
Recipient | Rednaxela, David Alves, Voidious, MultiplyByZer0 |
Subject | Re: RoboWiki has been down for a month |
Hey I've been trying to look into this with David, David has already contacted the hosting provider. From time to time before the wiki was crashing, I suspect security issues due to running on an old 12.04 installation. Each time it would require a manual reboot. When I rebooted the final time it didn't come back up. We have backups daily/weekly/monthly on the server, otherwise I have an older (~1 year? I'd need to check) backup of the database locally. Hopefully we can restore these onto a more modern base OS + mediawiki install. Best Julian |
October 7
Date | Monday, October 7, 2019 16:34:12 UTC |
---|---|
Sender | David Alves |
Recipient | Rednaxela, Skilgannon, Voidious, MultiplyByZer0 |
Subject | Re: RoboWiki has been down for a month |
Hey all, Sorry for the lack of updates here. I filed a ticket with the VPS company, then emailed them asking for a status on Oct 1st, then emailed them again just now. So far no useful response. I can see the container on their control panel, but I can't ping it via any of the container IP addresses (209.40.205.177, 67.223.226.21, 64.79.213.157). I have the option of rebooting the container via the web interface which I've already tried, and I can also reinstall the container but I believe that completely wipes the drive so I haven't done that so far. Not really sure what to do next. We should probably switch to something else like AWS but I think we need to get the data off the drive first, right?. David |
October 11-13
Date | Friday, October 11, 2019 15:14:38 UTC |
---|---|
Sender | David Alves |
Recipient | Rednaxela, Skilgannon, Voidious, Flemming N. Larsen, MultiplyByZer0 |
Subject | Re: RoboWiki has been down for a month |
+Adding Flemming to this email chain (he emailed me separately). They fixed the issue and I can now SSH to the server, so I'd love it if one of you guys could ssh in and get the web server running again. I'm pretty unhappy with how long it took for them to resolve that considering how much this VPS costs. We should probably move robowiki to some other hosting solution (AWS? some mediawiki hosting thing?). David |
Date | Friday, October 11, 2019 20:43:21 UTC |
---|---|
Sender | Skilgannon |
Recipient | Rednaxela, David Alves, Voidious, Flemming N. Larsen, MultiplyByZer0 |
Subject | Re: RoboWiki has been down for a month |
Success! After restarting mysql and lighttpd a few times it seems to be running again. I've also downloaded the backups, so even if it dies again we'll be OK. Agreed, let's try to find some different hosting, and we can take the opportunity to upgrade the OS to Ubuntu 18.04. Thanks for your help on this David! Best Julian |
Date | Friday, October 11, 2019 20:50:12 UTC |
---|---|
Sender | Voidious |
Recipient | Rednaxela, David Alves, Skilgannon, PEZ, Flemming N. Larsen, MultiplyByZer0 |
Subject | Re: RoboWiki has been down for a month |
+PEZ Amazing, that's great news! And I also agree, we can probably find much better (cheaper, reliable, hands-off) hosting for a wiki now than we did last time, over ten years ago... Julian, do you want to take the lead on that? Or should I? I don't have a ton of time to contribute, but I certainly do have some, and I care about the RoboWiki living on instead of dying out while people are still trying to use it... |
Date | Friday, October 11, 2019 20:58:31 UTC |
---|---|
Sender | Rednaxela |
Recipient | David Alves, Skilgannon, PEZ, Voidious, Flemming N. Larsen, MultiplyByZer0 |
Subject | Re: RoboWiki has been down for a month |
Nice, thanks David and Julian. Yeah, different hosting may be good. While I don't think I have the time to take the lead on this from my side I am interested in helping to extent I can. About other hosting options, I will say I tend to find AWS to be a bit overpriced for what it is, and I'm not sure about mediawiki-specific hosting but maybe there are okay options. If we want to stay with a VPS but just a different one I will say I've had good experiences with DigitalOcean, good price, easy to work with, and very reliable. |
Date | Friday, October 11, 2019 22:50:34 UTC |
---|---|
Sender | Flemming N. Larsen |
Recipient | Rednaxela, David Alves, Skilgannon, PEZ, Voidious, MultiplyByZer0 |
Subject | Re: RoboWiki has been down for a month |
Nice job David and Julian! I love you guys! :-D I thought it was lost for god this time, which would probably kill Robocode really fast without access to the fantastic documentation, RoboRumble etc. I simply can't thank you enough. I don't know anything about hosting the MediaWiki. But if there is anyway I can help you out with keep it up and running, and/or perhaps move it to another hosting provider, I will do what I can to help you out. Just tell me how I can help. I could also pay for the hosting etc. I am so happy that you got it up and running again, and I know that lots of Robocoders out there will be really happy to get the news. :-D Best, - Flemming |
Date | Sunday, October 13, 2019 02:19:49 UTC |
---|---|
Sender | MultiplyByZer0 |
Recipient | Rednaxela, David Alves, Skilgannon, PEZ, Voidious, Flemming N. Larsen |
Subject | Re: RoboWiki has been down for a month |
Thanks for the hard work, everyone. I definitely agree that RoboWiki should move off its current host; good hosting services have effectively zero downtime. I did some research, and my suggestions are:
Also, would it be fine to publish this email chain on RoboWiki? It contains some extremely useful information about the RoboWiki server setup. MultiplyByZer0 |
Date | Sunday, October 13, 2019 06:16:14 UTC |
---|---|
Sender | Rednaxela |
Recipient | David Alves, Skilgannon, PEZ, Voidious, Flemming N. Larsen, MultiplyByZer0 |
Subject | Re: RoboWiki has been down for a month |
I'm extremely strongly against using any ad-supported hosting, and am perfectly willing to pay to avoid that. Miraheze looks interesting, however they do have a "dormany policy" where they close or delete wikis after 60 days of inactivity, which I don't see as favorable for something like RoboWiki where we want it to remain accessible regardless of how activity ebbs and flows As far as VPSs, the two I have experience using personally are DigitalOcean and BuyVM, both of which I've been using for many years at this point. BuyVM is a little better value on paper, and I've found it to have pretty good reliability but not perfect. DigitalOcean on the other hand I've found to be flawless for uptime. I will note though that BuyVM's unmetered bandwidth is kind of nice, giving peace of mind one won't run out of transfer or have to pay overage. Looking at Amazon Lightsale, the pricing is in the same ballpark as DigitalOcean and BuyVM. I tend to think a 1GB instance is plenty, but here are some comparisons in general for both 1GB and 2GB sort of class.
EC2 is harder to compare as it's a very different pricing model, so far as I can tell more expensive for hosting these sorts of things, and is also much less predictable in price (big negative in my books), at least when one factors in storage+bandwidth which are charged for separately with EC2 While OVH looks like a good value on paper, looks like like there's a fair bit of negative reviews around them and they seem a littler sketchier. Hostinger does the cheeky thing of advertising favorable looking introductory prices but charges more to renew making for a pretty poor deal really. All in all, if going for a VPS, I'd be inclined to go with DigitalOcean or Amazon Lightsail. Mostly ruling out BuyVM as it doesn't have built-in automatic backups (though if we set up backups more manually maybe it would be fine) Between those DO and Lightsail, I will say Amazon Lightsail's bandwidth overage rate spooks me a lot more than DO's rate, despite Lightsail offering a larger base transfer, and offers more disk. As a note, I'm pretty sure both DigitalOcean and Amazon Lightsail do have ways to allow multiple folks with separate accounts to have full admin console access, if we want to distribute that. A lot of smaller VPS providers don't have that. Best Regards, Alex/Red |
Date | Sunday, October 13, 2019 08:22:31 UTC |
---|---|
Sender | Skilgannon |
Recipient | Rednaxela, David Alves, PEZ, Voidious, Flemming N. Larsen, MultiplyByZer0 |
Subject | Re: RoboWiki has been down for a month |
I'd prefer the VPS route. Especially because we want a read-only copy of the old wiki, I don't see any other option TBH. And just to throw another into the mix, Hetzner has an excellent reputation for uptime, and prices are good too. |
Date | Sunday, October 13, 2019 09:26:58 UTC |
---|---|
Sender | Rednaxela |
Recipient | David Alves, Skilgannon, PEZ, Voidious, Flemming N. Larsen, MultiplyByZer0 |
Subject | Re: RoboWiki has been down for a month |
That's true, the read-only old wiki backup does make going the VPS route fairly preferable. Hetzner does look pretty good... looks like equivalent to 3.27USD/mo for 2GB/20GB/20TB and backup pricing that's 20% of instance price (just like DigitalOcean), and equivalent to just 6.43USD/mo to double that ram and disk space. From a search looks like they have a relatively solid reputation too. Also looks like their traffic pricing for going over the included traffic limit is like 1/10th of DO's traffic overage pricing (and like 1/75th what it is with Amazon Lightsail), which is good if there's unexpected traffic. Would be higher ping to those of us in the Americas but I don't see that as a big deal. On paper I'm leaning toward Hetzner as the most preferred I've seen yet (though I do still also like DigitalOcean too based on my experiences with it) |
Date | Sunday, October 13, 2019 10:43:55 UTC |
---|---|
Sender | PEZ |
Recipient | Rednaxela, David Alves, Skilgannon, Voidious, Flemming N. Larsen, MultiplyByZer0 |
Subject | Re: RoboWiki has been down for a month |
Hi all! I now see that I only sent this to Flemming:
About old wiki asking for a VPS solution. Since it is static, we can crawl it and make a static site that we then just let Netlify serve. That hosting would cost zero dollars. Regards, /PEZ |
Date | Sunday, October 13, 2019 14:15:03 UTC |
---|---|
Sender | PEZ |
Recipient | Rednaxela, David Alves, Skilgannon, Voidious, Flemming N. Larsen, MultiplyByZer0 |
Subject | Re: RoboWiki has been down for a month |
Here's a POC of the static site option. I just threw wget at old.robowiki.net and gave the files to Netlify: https://old-robowiki.netlify.com Lots of the links do not work yet. (I ran wget in windows mode to make it rename files so that Netlify accepted them. Then only replaced the most obvious pattern (? -> @) in the files.) We would need to run wget much more surgically than I did here, because with all diff links and all it gets A LOT OF pages.I aborted the crawling after some 5K pages. Naturally we would run such a site off the real domain name. A bit of more work, not a lot, to get it good enough, but after that no maintenance at all for the old wiki. /PEZ |
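The link fix-up PEZ describes can be sketched in a few lines of Python. This is only an illustration, not PEZ's actual procedure: the function name `rewrite_local_links`, the regular expressions, and the sample page names are assumptions. It rewrites the first `?` in relative `href` and `src` values to `@`, matching the file names wget produces with `--restrict-file-names=windows`, and leaves absolute URLs and ordinary page text untouched.

```python
import re

# Matches href="..." or src="..." attributes with single or double quotes.
ATTR = re.compile(
    r'''(?P<attr>\b(?:href|src)=)(?P<q>["'])(?P<url>.*?)(?P=q)''',
    re.IGNORECASE,
)

# Matches URLs that start with a scheme (http:, mailto:, ...) or with "//".
ABSOLUTE = re.compile(r'^(?:[a-z][a-z0-9+.-]*:|//)', re.IGNORECASE)


def rewrite_local_links(html: str) -> str:
    """Replace the query separator '?' with '@' in relative links only."""

    def fix(match: re.Match) -> str:
        url = match.group("url")
        if ABSOLUTE.match(url):
            return match.group(0)  # external link: leave as-is
        quote = match.group("q")
        # Only the first '?' separates the path from the query string.
        return f'{match.group("attr")}{quote}{url.replace("?", "@", 1)}{quote}'

    return ATTR.sub(fix, html)


# Hypothetical sample markup; the page names are for illustration only.
sample = (
    '<p>Why? <a href="robowiki?Radar">Radar</a> '
    '<a href="http://robowiki.net/?RoboRumble">new wiki</a></p>'
)
print(rewrite_local_links(sample))
```

A regular expression is enough for a one-off pass over crawled pages like this; a real HTML parser would be the safer choice if the markup turned out to be irregular.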
Date | Sunday, October 13, 2019 18:55:03 UTC |
---|---|
Sender | Flemming N. Larsen |
Recipient | Rednaxela, David Alves, Skilgannon, PEZ, Voidious, MultiplyByZer0 |
Subject | Re: RoboWiki has been down for a month |
Running the old RoboWiki as a static site (read-only) at Netlify looks very promising. :-) Regarding new hosting, for the current RoboWiki, I recommend using one which deals with spam and infrastructure etc. so it is easy to maintain it. However, I don't know much about hosting like the rest of you guys. So I trust your expertise in this area. - Flemming |
November 17?
The RoboWiki server once again stops responding to HTTP requests.
November 21
Date | Thursday, November 21, 2019 08:34:16 UTC |
---|---|
Sender | PEZ |
Recipient | Rednaxela, David Alves, Skilgannon, Voidious, Flemming N. Larsen, MultiplyByZer0 |
Subject | Re: RoboWiki has been down for a month |
Seems the site is down again. I think Miraheze looks really good. So that for the live wiki and static for the old static is my suggestion. /PEZ |