Hello and welcome to a (hopefully-)weekly update from microcosm!
About a million years ago I asked if anyone liked weekly updates, got a strong "yes", and then... kept just building instead of ever writing. Now that the building has moved forward, hopefully I can balance it with a bit more writing. I'm going to try to keep up with weekly updates on Fridays (oops) and regular longer posts as well.
This week
It's been officially six months since launching microcosm! I wrote about it here: https://updates.microcosm.blue/3lw5wc2nkw22t
I was trying to finish updating the main website at microcosm.blue in advance of that post, but that's still a work in progress as of now. Here's a little preview of some new elements:
I was aiming to have a more comms-heavy week, but the UFOs-API database had other ideas. UFOs keeps hourly database snapshots for about 24 hours, but the cleanup task only runs daily, so the size occupied on disk reaches a peak each day just before the cleanup. The database has grown far more than I had expected (foreshadowing), to over 300GiB on its 512GiB disk, and this week was when the snapshot cycle finally pushed it over the edge, leaving its consumer task stuck with a full disk for some time each day until the cleanup came around.
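To make the daily peak concrete, here is a rough simulation of hourly snapshots with a 24-hour retention window and a cleanup that only runs once a day. It is a sketch under simple assumptions, not the UFOs implementation: the function name and parameters are made up for illustration, and it counts snapshots rather than bytes (real snapshots may share files on disk, so actual usage need not scale linearly with the count).

```python
def snapshot_counts(hours, retention_h=24, cleanup_every_h=24):
    """Return the number of snapshots on disk at the end of each hour.

    Hypothetical model: one snapshot is taken every hour, and a cleanup
    task runs every `cleanup_every_h` hours, deleting snapshots that are
    at least `retention_h` hours old.
    """
    on_disk = []  # creation hour of every snapshot still on disk
    counts = []
    for t in range(hours):
        on_disk.append(t)  # the hourly snapshot
        if t % cleanup_every_h == 0:
            # cleanup: keep only snapshots younger than the retention window
            on_disk = [c for c in on_disk if t - c < retention_h]
        counts.append(len(on_disk))
    return counts


daily = snapshot_counts(96)                      # cleanup once a day
hourly = snapshot_counts(96, cleanup_every_h=1)  # cleanup every hour

print(max(daily[48:]), min(daily[48:]))  # peak and trough once warmed up
print(max(hourly))
```

In this toy model the daily cleanup lets expired snapshots pile up to nearly twice the retention window right before each cleanup, while an hourly cleanup never holds more than one window's worth.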
Details below, but ultimately UFOs should be in a much better state now and going forward, with 93% reduced disk usage (almost for free!) and, bonus, its somewhat-frequent lockups during compaction have been entirely eliminated thanks to an upstream storage engine fix!
It's easy to underestimate how much time gets sucked into operations tasks like this, just to keep services online. With four production microcosm API services plus four public firehoses, it's not a small amount. Thankfully the other services are usually pretty drama-free (knock on wood).
~
Other things this week: it seems like the number of people actually building things on Constellation is accelerating, which is exciting! And insecurity-triggering! Constellation hasn't changed much since initial launch, with documentation fixes and API gaps to fill long overdue. It gets held up as an example of indie infra, but there haven't really been very many apps pushing its limits for most of its early days. I believe its usefulness increases non-linearly when it can be used together with the other microcosm services, so I've been trying to run ahead with sketching out the larger multi-service vision while Constellation's usage lags. This week has felt like an inflection point in usage, so it's probably time to go back and fill some of those gaps.
Slingshot, the newest microcosm service, is seeing faster developer pickup! Its v0 public launch is a little limited in its core record-caching performance (big wins for relatively small effort to come), but its secondary identity-resolving endpoints are striking an even louder chord. Especially my favourite, identity.resolveMiniDoc, which a couple of apps are already using to streamline user login flows from handles!
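For anyone curious what calling an endpoint like this looks like, here is a minimal sketch that only builds the request URL, following the general atproto XRPC convention of `GET /xrpc/<nsid>?<params>`. Everything specific in it is an assumption for illustration: the host `slingshot.example`, the NSID `example.identity.resolveMiniDoc`, and the `identifier` parameter name are placeholders, so check Slingshot's own API docs for the real values.

```python
from urllib.parse import urlencode


def xrpc_query_url(base, nsid, **params):
    """Build an XRPC query URL of the form <base>/xrpc/<nsid>?<params>.

    `base` and `nsid` are supplied by the caller; nothing here is
    specific to Slingshot.
    """
    return f"{base.rstrip('/')}/xrpc/{nsid}?{urlencode(params)}"


# Placeholder host, NSID, and parameter name (assumptions, not the real API).
url = xrpc_query_url(
    "https://slingshot.example",
    "example.identity.resolveMiniDoc",
    identifier="alice.example",
)
print(url)
```

A login flow would send a GET to that URL with the handle the user typed and read the identity details it needs from the JSON response.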
Next week I'm hoping to get the microcosm.blue website refresh actually out the door, and get a draft of a spec for maybe the most important atproto tech for microcosm: link source (and record path) syntaxes.
See you next week and thanks for reading!
below are daily notes from the week, mostly kept for myself but hey have fun if you're curious
monday
investigated HTTP 502s for Constellation (normal)
published post: Six months of microcosm!
welcomed four new people to the discord
contributed adding a user-agent header for pattern
made some progress updating the microcosm.blue main site
tuesday
followed up the indigo rkey fix with a jetstream fix
tested and deployed the jetstream fix on both microcosm jetstream instances
maintenance
OS updates and fresh builds for jetstream hosts
OS updates and fresh builds for relay hosts
implemented "remove_weak" for UFOs-API secondary rank indexes -- hopefully a major performance win on multiple levels
discussed an account-wide backlinks api for constellation
wednesday
merged and deployed remove_weak for UFOs secondary ranks
got help debugging the stalls from marvin
added and deployed a bunch more metrics for ufos
...despite initial optimism, write stalls still occur
bit of progress on the website revamp (starting over counts as progress right?)
thursday
fixed a bug in atrium
more debugging of UFOs
worked on the code-first quickstart part of the microcosm website
proposed an atproto spec update to help sdk authors write more-compatible https-method handle resolvers
handle resolution failure reported for slingshot's resolveMiniDoc led to:
improving upstream https-method handling in atrium
contributing a small atproto spec update
started an attempt to major-compact the UFOs db (from backup on a different machine)
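For context on the https-method handle resolution mentioned in Thursday's notes: in atproto, a handle can be resolved by fetching `https://<handle>/.well-known/atproto-did` and reading the DID from the plain-text response body. The sketch below shows a lenient way a resolver might parse that body. It does not reproduce the proposed spec update or the atrium fix; the function names and the choice to trim surrounding whitespace are assumptions for illustration.

```python
from typing import Optional


def well_known_url(handle: str) -> str:
    """URL to fetch for the https handle-resolution method."""
    return f"https://{handle}/.well-known/atproto-did"


def parse_did_response(body: str) -> Optional[str]:
    """Extract a DID from a response body, or return None if it isn't one.

    Assumption: tolerate leading/trailing whitespace (e.g. a trailing
    newline), but reject anything that doesn't look like a bare DID.
    """
    did = body.strip()
    if did.startswith("did:") and not any(ch.isspace() for ch in did):
        return did
    return None


print(well_known_url("alice.example"))
print(parse_did_response("did:plc:abc123\n"))   # trailing newline tolerated
print(parse_did_response("<html>not a did</html>"))
```

A real resolver would also need to handle redirects, timeouts, and confirming the handle from the DID document, which this leaves out.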
friday
continued major-compact attempt (whyyyy am i rsyncing a network-mounted folder)
improved batch-insert timing metric for ufos (old system was verrrrry wild and meant for inspection by logging only)
^^ UFOs backup finally received!
profiled UFOs on dev machine with samply
shared with fjall dev, who fixed it 🎉
ran major compaction on UFOs DB, which worked except for the most important partition: records!!
found and fixed the bug: record deletions were being applied against the wrong partition 🤦‍♀️
wrote a lil brute-force records garbage-collector to clean up everything that wasn't actually deleted! and ran it! and then ran a major compaction!
overall DB size is now reduced 93% to just ~23GiB immediately after the compaction, from 311GiB before.
UFOs' data store is now running well at last 💆‍♀️