Scalable Distributed System on Roblox - Real World Example

On one of the games I worked on, Rensselaer County, I was responsible for implementing the Feedback Hub, a forum-like feedback system where players could submit suggestions and vote on them.
(Unfortunately, we ended up not shipping the feature, so here I am writing about it, since I don’t think anyone has written an article about building something similar before.)
Why a Distributed System Is Difficult to Implement
Most of you who are familiar with Roblox will recognize right away that “distributed” is ridiculously difficult on Roblox.
For example:
- DataStore (the database for Roblox games) has no session locking and high latency: most read/write operations take more than 400 ms. It is basically a really slow KV store with no atomic transactions.
- MemoryStore (Roblox’s Redis) is the lower-latency alternative, but data can only be stored for up to 45 days.
- DataStore and MemoryStore data is shared across game servers, and there could be as many as 1,000 game servers running at once.
However, our suggestion system needs the following:
- Players can vote on suggestions from different game servers (distributed counting)
- Players can submit new suggestions and have them appear on all other game servers
- It should not use any external services beyond what Roblox provides natively (e.g., our own server to keep track of votes/suggestions)
- Everything is searchable! Not just by prefix, but actual search with fuzzy matching.
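To make that last requirement concrete, here's a minimal Python sketch of the difference between prefix search and fuzzy search. This is not the matcher we built: it uses Python's standard difflib as a stand-in, and the function names, the 0.6 threshold, and the sample titles are all made up for illustration.

```python
from difflib import SequenceMatcher


def prefix_search(query, titles):
    """Return the titles that start with the query (case-insensitive)."""
    q = query.lower()
    return [t for t in titles if t.lower().startswith(q)]


def fuzzy_search(query, titles, threshold=0.6):
    """Rank titles by the best similarity between the query and any single word."""
    q = query.lower()
    scored = []
    for title in titles:
        # Compare the query against each word so a typo in one word still matches.
        best = max(
            (SequenceMatcher(None, q, word).ratio() for word in title.lower().split()),
            default=0.0,
        )
        if best >= threshold:
            scored.append((best, title))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored]


# Hypothetical suggestion titles, for illustration only.
titles = ["Add more police vehicles", "Fix the bank door", "New radio codes"]

print(prefix_search("vehicel", titles))  # the misspelled query is not a prefix of any title
print(fuzzy_search("vehicel", titles))   # the typo still finds the vehicles suggestion
```

A prefix index can be served straight from sorted keys, but a fuzzy matcher like this needs every title available locally, which shapes how the data has to be stored.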
Our Solution
A tiered caching system! Looking back at the design, there are a few things we need to store:
- Positive Vote Count
- Negative Vote Count
- Name of the suggestion
- Description of the suggestion
- ID of the suggestion
- The submitter’s info (Username and UserId)
Right away, we can see that some of this data is updated frequently, while the rest is unlikely to change after submission. That leaves the positive vote count (PV) and the negative vote count (NV) as the only fields that need regular updates.
We only have two ways of storing data on Roblox: DataStore (slow, but can store a large amount of data for a long time) and MemoryStore (fast, but can only store a small amount of data per key, and the data expires after 45 days at most).
So let’s separate out the responsibilities of each store:
- MemoryStore:
  - Positive Vote Count
  - Negative Vote Count
- DataStore:
  - Basically everything else
However, you can quickly see how this would not scale:
- If a server reads multiple suggestions at once, it will quickly hit DataStore’s read budget.
- If players on a server vote on multiple suggestions, MemoryStore has to be updated frequently, potentially exhausting its read/write budget.
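To put rough numbers on it, here's a small Python sketch that counts the requests a single game server would issue. It assumes a naive layout that isn't spelled out above: one DataStore key per suggestion, plus two MemoryStore keys per suggestion (one per counter), with every vote costing one read and one write. The function and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RequestCost:
    datastore_reads: int
    memorystore_reads: int
    memorystore_writes: int


def naive_cost(suggestions_loaded: int, votes_cast: int) -> RequestCost:
    """Requests issued by one server when every suggestion has its own keys."""
    return RequestCost(
        # One DataStore read per suggestion for the static fields.
        datastore_reads=suggestions_loaded,
        # Two counters per suggestion, plus one read before each vote is written.
        memorystore_reads=2 * suggestions_loaded + votes_cast,
        # One write per vote.
        memorystore_writes=votes_cast,
    )


# Hypothetical load: one server showing 200 suggestions and recording 30 votes.
cost = naive_cost(suggestions_loaded=200, votes_cast=30)
print(cost)
```

Every number here grows linearly with the number of suggestions, and all of it is multiplied again by the number of game servers running.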
We fixed this with two changes:
- Consolidating all suggestion entries into a single DataStore key and applying aggressive compression to it:
This means it only takes one request to get every suggestion, which reduces our read usage. It also lets us build the fuzzy matcher for the search feature from that data.
(When a player submits a new suggestion, we have to read the current DataStore entry and write it back with the new suggestion appended, obtaining a copy of the latest data in the process.)
Each game server also periodically refreshes its suggestion entries from the DataStore, so that servers don’t drift too far out of sync.
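Here's a minimal sketch of that packed-key idea, written in Python rather than Luau just to show the shape of it. FakeDataStore is an in-memory stand-in and not the Roblox API. The key name, the field names, and the JSON + zlib + base64 combination are my assumptions for this example, not necessarily the compression we used.

```python
import base64
import json
import zlib


class FakeDataStore:
    """In-memory stand-in for a slow key-value store. NOT the Roblox API."""

    def __init__(self):
        self._data = {}
        self.requests = 0  # counts round trips, to show how few are needed

    def get(self, key):
        self.requests += 1
        return self._data.get(key)

    def update(self, key, transform):
        """Read the current value, apply transform, and write the result back."""
        self.requests += 1
        self._data[key] = transform(self._data.get(key))
        return self._data[key]


def pack(suggestions):
    """Serialize to compact JSON, compress, and encode as text for storage."""
    raw = json.dumps(suggestions, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(zlib.compress(raw, 9)).decode("ascii")


def unpack(blob):
    """Inverse of pack(). A missing key is treated as an empty list."""
    if blob is None:
        return []
    return json.loads(zlib.decompress(base64.b64decode(blob)))


def submit_suggestion(store, suggestion):
    """Append one suggestion and return the latest full list as a side benefit."""
    latest = []

    def transform(blob):
        latest[:] = unpack(blob) + [suggestion]
        return pack(latest)

    store.update("suggestions", transform)  # "suggestions" is a hypothetical key name
    return latest


def refresh(store):
    """Periodic refresh: one request returns every suggestion."""
    return unpack(store.get("suggestions"))


store = FakeDataStore()
submit_suggestion(store, {"id": 1, "name": "More vehicles", "pv": 0, "nv": 0})
local_copy = submit_suggestion(store, {"id": 2, "name": "Fix bank door", "pv": 0, "nv": 0})
print(len(local_copy), store.requests)
```

On Roblox, the read-modify-write in submit_suggestion would most likely map onto DataStore's UpdateAsync, which also takes a transform function. Keep in mind that a single key has a size limit, so this only works while the compressed blob stays under it.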
- Caching the positive and negative vote counts in the DataStore data.
Each game server runs an update cycle every hour that brings the vote counts stored in the DataStore in line with MemoryStore’s version, together with the votes that this game server recorded since its last cycle.
Whichever server has the highest count is considered to have the most up-to-date copy.
(Again, when a player votes on a suggestion, we read the current vote count in MemoryStore, add one, and write it back.)
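The vote path and the hourly cycle can be sketched like this, again in Python with an in-memory stand-in instead of the real MemoryStore. FakeMemoryStore, the "id:pv" / "id:nv" key format, and the function names are assumptions for illustration. The "highest count wins" rule is implemented as a plain max, which only makes sense if counts never decrease.

```python
import threading


class FakeMemoryStore:
    """In-memory stand-in for a fast shared store. NOT the Roblox API."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def update(self, key, transform):
        """Read, transform, and write back under a lock so increments are not lost."""
        with self._lock:
            self._data[key] = transform(self._data.get(key))
            return self._data[key]


def cast_vote(memory, suggestion_id, positive):
    """Read the current count, add one, and write it back."""
    field = "pv" if positive else "nv"
    return memory.update(f"{suggestion_id}:{field}", lambda current: (current or 0) + 1)


def merge_counts(cached, live):
    """Highest count wins; a missing value counts as zero."""
    return max(cached or 0, live or 0)


def sync_cycle(memory, cached_suggestions):
    """Hourly job: fold the live counters into the cached suggestion records."""
    for suggestion in cached_suggestions:
        for field in ("pv", "nv"):
            live = memory.get(f"{suggestion['id']}:{field}")
            suggestion[field] = merge_counts(suggestion.get(field), live)
    return cached_suggestions


memory = FakeMemoryStore()
for _ in range(3):
    cast_vote(memory, 1, positive=True)
cast_vote(memory, 1, positive=False)

# Cached records as they might come out of the DataStore blob (hypothetical values).
cached = [{"id": 1, "pv": 1, "nv": 0}, {"id": 2, "pv": 5, "nv": 2}]
sync_cycle(memory, cached)
print(cached)
```

In this sketch, a counter that is missing from the fast store simply leaves the cached count untouched, so an expired entry does not reset a suggestion to zero.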
Although this solves our read/write problems and lets us show a vote count to the player right away, it also means that different game servers can show different vote counts for the same suggestion.
However, we’re aiming for eventual consistency rather than a truly real-time system.
TLDR:
Just use an external server and save yourself all this trouble.