Server issues and site downtime

Posted: Thu May 10, 2018 5:51 pm
by Arokhs Twin
There have been a few issues with the server crashing under heavy load, such as when backups are running. Today several files were seemingly deleted and/or corrupted, and permissions were reset to defaults.

I restored yesterday's backup and it seems OK, but there are performance issues: some things such as avatars don't load or take a long time to load, and sometimes the site takes a while to respond. Looking at the log files, I suspect the server has a faulty hard disk or a corrupt filesystem. In other words, the site may disappear for a while whilst it's fixed.


Re: Server issues and site downtime

Posted: Thu May 10, 2018 6:09 pm
by Blackbeard
Phew, I thought you went "fuck it"-mode and shut the site down. Good to see it back up again.

Re: Server issues and site downtime

Posted: Thu May 10, 2018 6:31 pm
by Arokhs Twin
BTW data should not be lost, as the database is stored on another server and that one is OK. It's the main hosting server that has the problem, so hopefully the only thing we will have is downtime. Even if the server needs a rebuild it will not take long, as I guess the host will switch out the hardware and put the virtual machine images back onto it.

It all runs under VMware so backing up and restoring an entire server isn't as hard as it sounds.

Re: Server issues and site downtime

Posted: Sat May 12, 2018 9:32 pm
by Arokhs Twin
Seems to be back to normal now. No idea what went wrong, but the server uses a bank of SSDs as a cache, and that kind of cache failed on my other website a while back with similar symptoms. The server hardware hosts multiple virtual servers, not just mine, so if the hardware fails it tends to get noticed pretty quickly.

The fix (on my other site) was to replace the hard drives with SSDs and drop the cache entirely, especially now that large SSDs have come down in price.