About the service downtime (crash) on Sept 22 and Sept 24

sourisdudesert
Administrator
Posts: 4630
Joined: 23 January 2010, 22:02

Post by sourisdudesert »

Dear all,

Board Game Arena experienced 2 crashes in the last 3 days, with downtimes of between 45 minutes and 1 hour.

On September 22nd around 6:00 UTC, and then on September 24th around 16:00 UTC, the service suddenly stopped working.

First of all, we'd like to say that we're deeply sorry about these crashes. We know that a lot of realtime games were in progress and that most of them were lost. We removed all the penalties linked to the crashes. Feel free to write to us at contact@boardgamearena.com if this is not the case for you.

We'd like to explain here what caused these crashes, how we handled them, and what we did to avoid further crashes in the future.

Over the last few weeks, we made a lot of changes to Board Game Arena. Some changes were very visible, like the new lobby. Others were optimizations to make the service faster.

It appears that some of these changes disrupted an important component of BGA: our database system, where all your data is stored.

On September 22nd, the behavior of our database suddenly changed and made the trophy page incredibly slow. This page is not used very much, but it was enough to kill the database and our service. We immediately took the trophy page offline, restored the service, then built an optimized version of the trophy page to bring it back online.

On September 24th, the same thing happened to our "games in progress" page. At the bottom of the page, there is a table that gives you the number of games in progress for every game. This table had been working like a charm for years, and suddenly its behavior changed and crashed BGA. We immediately unplugged this feature, restored the service, and then built an optimized version to bring it back online.

These crashes were VERY difficult to anticipate, as they concerned features that had been working fine for years and suddenly changed. It is no coincidence that this happened now: it happened because we changed a lot of things on BGA. However, it is not directly related to our latest addition, which makes it difficult to anticipate, to understand, and to handle.

We've learned a lot from these 2 crashes, and over the next few days we will try to find a way to limit the impact of such events in the future. There is no reason to fear another crash in the coming days, but as said before, this is really difficult to anticipate, so we cannot know for sure.

Sorry again for the downtime and the lost games. If it is any comfort, each crash makes BGA stronger and more reliable.

Thanks for playing here.
Lotus Blossom
Posts: 149
Joined: 12 November 2017, 01:45

Re: About the service downtime (crash) on Sept 22 and Sept 24

Post by Lotus Blossom »

Really appreciate the information, thank you ☺️
JollyBird
Posts: 191
Joined: 27 March 2014, 13:21

Re: About the service downtime (crash) on Sept 22 and Sept 24

Post by JollyBird »

Thanks for this comprehensive and clear explanation!
Waterd103
Posts: 40
Joined: 16 February 2014, 23:08

Re: About the service downtime (crash) on Sept 22 and Sept 24

Post by Waterd103 »

Wow, I don't get this amount of apologies and explanations when this happens in tournaments worth thousands of dollars on poker sites :lol: