Updated my Mastodon instance to 3.0.
...I worked on the serving infrastructure of gstatic.com, which includes all of the thumbnails of YouTube and Image Search, as well as fonts and petabytes of other highly cacheable content.
Generic storage systems and generic web servers are a poor solution for serving small static files efficiently.
I'd be happy to help come up with a scalable design for Mastodon.
@benoit @angristan The second question is: does the datastore continue to perform well when the write qps increases? Some blob stores must do periodic garbage collection or replication which impacts latency.
If your datastore supports replication, can you create serving replicas in americas, europe and asia? Can you tie http frontends to those same zones? Can you redirect user traffic to the closest replica? Can you spill traffic to the other replicas when the closest one is overloaded?
Serving static files starts to resemble a realtime system, although the deadlines are soft. As long as your 99th-percentile latency is consistently low, your service will feel snappy to most users, most of the time.
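To make the p99 point concrete, here's a minimal sketch of a nearest-rank percentile over latency samples. The numbers are made up for illustration: a mostly-fast service with a slow tail still shows a high p99.

```python
def percentile(sorted_values, p):
    """Nearest-rank percentile of an already-sorted list."""
    k = max(0, min(len(sorted_values) - 1,
                   round(p / 100 * len(sorted_values)) - 1))
    return sorted_values[k]

# Hypothetical request latencies in milliseconds (illustrative only):
# 90% of requests are fast, 10% hit a slow path.
samples = sorted([12, 15, 14, 13, 250, 16, 14, 13, 15, 12] * 10)

p50 = percentile(samples, 50)  # the median looks great...
p99 = percentile(samples, 99)  # ...but the tail is what users feel
```

The median hides the slow tail entirely; the p99 is what determines whether the service feels snappy.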
In centralized platforms, the cost per user goes down as the number of users grows... classic scale economy.
I'm still not convinced that the current fediverse architecture can scale up to 1B users by just adding more instances, unless they find a way to share some of the serving infrastructure.
@angristan @benoit If each instance buys separate Amazon S3 buckets to store images and serves them via Cloudflare on separate domains... we're effectively as centralized as we could possibly get, while having low cache-hit rates and huge storage costs per instance. I couldn't possibly imagine a worse serving architecture for Mastodon 😦
@Gargron, what do you think? Is there a hope for instances to share some content serving infra in the future?
Similar to git, you could clone a repository from an untrusted source if someone you trust gives you the hash of the revision you wanted.
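The trust model above can be sketched in a few lines: only the hash needs to travel over a trusted channel, and the bytes can then come from anyone. This uses SHA-256 for illustration and is not git's actual object format (git hashes a header plus the content).

```python
import hashlib

def verify(content: bytes, trusted_hash: str) -> bool:
    """Accept content from an untrusted source only if it matches a
    hash obtained from someone you trust; any tampering changes the hash."""
    return hashlib.sha256(content).hexdigest() == trusted_hash

# Hypothetical media blob fetched from an untrusted mirror.
blob = b"media blob fetched from an untrusted mirror"
# The hash itself comes over a trusted channel (like a pinned git revision).
trusted = hashlib.sha256(blob).hexdigest()
```

This is the same property that would let fediverse instances serve each other's media from untrusted caches.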
@angristan @benoit Our intuition of how distributed systems work is often wrong. Amazon S3 is centralized from the administrative pov, but it's highly distributed and redundant in the ways that matter for scalability and high availability.
The fediverse is administratively distributed, but each instance is a SPOF from the point of view of the users on that instance.
Your instance suddenly went down and you didn't dump all your data the day before? Sorry, you lost all your toots and your network!
The single-user instance is an interesting case, because it gives us a baseline to characterize how storage costs vary with the number of users per instance.
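A toy cost model makes the baseline visible. The numbers here are made up purely for illustration: a fixed monthly overhead per instance (VM, database, ops) amortized across its users, plus a marginal per-user cost (storage, bandwidth).

```python
def cost_per_user(users, fixed_monthly=20.0, per_user=0.05):
    """Illustrative model with invented numbers: fixed instance overhead
    split across users, plus a marginal cost per user."""
    return fixed_monthly / users + per_user

# A single-user instance bears the entire fixed cost alone,
# while a 1000-user instance amortizes it almost away.
solo = cost_per_user(1)
shared = cost_per_user(1000)
```

Whatever the real constants are, the shape is the same: per-user cost falls roughly as 1/n until the marginal cost dominates, which is exactly the scale economy centralized platforms enjoy.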
The social network of the future: No ads, no corporate surveillance, ethical design, and decentralization! Own your data with Mastodon!