Last-last week my plans were overtaken by someone physically cutting the fibre internet to my house. This time it was the relays that got me!

Indie relays

I've been operating free public full-network indie atproto relays for a little over four months. #indiesky stunts tend to get a lot of attention, but one thing I've learned is that actual motivation to use independent infra can be a bit of a let-down. It can be a little demoralizing to spend significant time planning contingencies for demand—i knew exactly what i would do and how much it would cost if i needed to spin up more/bigger relays—only to wind up with, a lot of the time, just one consumer per relay. (…the microcosm jetstream instances)

However!

I don't think the excitement that indiesky stunts get just evaporates. Moving away from things that work takes time, effort, potentially costs money, and carries risk. The amount you need to learn right now, whether you're a user or a developer, is a lot! But I think you only need to look at the recent progress in PDS self-hosting to see that the interest is there. Better tooling, ironing out bugs, lowering risks, and making options available takes time, but with every step, more people move. (of course, mainstream adoption needs all this plus positive incentives. but we're still largely pre-early-adopter in my view)

So, what about relays.

The atproto network is in a bit of a tricky, but hopefully temporary, state right now. Bluesky has a production relay at bsky.network, which from my understanding is the relay used by:

  • Bluesky's own appview,

  • Bluesky's jetstream instances, and

  • the large majority of third-party firehose-listening apps (whether directly or downstream via jetstream).

But this relay is old! It's still running the old bigsky relay codebase! It's been seven months since sync1.1 was introduced, which is the thing that makes operating full-network relays affordable and scalable. Bluesky made a reference sync1.1 relay, and runs two public instances of it, but has yet to switch many (any?) of their firehose consumers over. So, effectively, nearly all of the ATmosphere is still powered by that one old bigsky relay. Meanwhile, every* indie relay is on the new codebase.

Why does any of this matter?

did a scan of all non-bsky (+non-bridgy) accounts (ie active self-hosters)

12% are in a bad state with sync1.1 relays

7% if you only consider the bsky sync1.1 relays

not great news for self-hosting, but definitely fixable

probably not worth trying to re-re-re-clarify but:

the issue seems real (despite some false-positives earlier):

bsky *relays* mark a number of accounts on third-party PDS instances as `deactivated` and drop their events, despite their upstream PDS host showing them `active`
Aug 22, 2025, 9:43 PM
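A rough sketch of the kind of check behind that claim: ask the account's PDS and a relay the same com.atproto.sync.getRepoStatus question and compare the answers. The function names here are mine, not from any library, and I'm assuming both hosts serve the endpoint over plain HTTPS and return at least an `active` boolean plus an optional `status` string.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional


def get_repo_status(host: str, did: str, timeout: float = 10.0) -> dict:
    """Call com.atproto.sync.getRepoStatus on a PDS or relay host."""
    query = urllib.parse.urlencode({"did": did})
    url = f"https://{host}/xrpc/com.atproto.sync.getRepoStatus?{query}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def status_mismatch(pds: dict, relay: dict) -> Optional[str]:
    """Describe a PDS/relay disagreement about an account, or return None."""
    pds_active = bool(pds.get("active"))
    relay_active = bool(relay.get("active"))
    if pds_active == relay_active:
        return None
    # When an account is not active, `status` (if present) says why.
    pds_label = "active" if pds_active else pds.get("status", "inactive")
    relay_label = "active" if relay_active else relay.get("status", "inactive")
    return f"PDS says {pds_label}, relay says {relay_label}"
```

Usage would look like `status_mismatch(get_repo_status("pds.example.com", did), get_repo_status("relay.example.com", did))`, with both hostnames as placeholders. This only works against relays that actually implement the query.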

I will be the first to overexplain that it mostly doesn't matter which relay your appview listens to, and how the "AT" in "ATProto" is for data getting Authenticated-ly Transferred. You don't need to trust the relay very much! Not zero trust, but a magically low upper bound of trust. You can just switch your appview to a different relay. You can just spin up your own relay and switch to it for about $20USD/mo (seriously!). Using the main one is convenient and nice on the network, but everything is designed around making that centralization non-essential: it's not locked in, it has credible exit.
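To make "just switch" concrete: from a firehose consumer's side, the relay is little more than a hostname in the com.atproto.sync.subscribeRepos websocket URL. A minimal sketch, where `firehose_url` is a hypothetical helper and `relay.example.com` a placeholder host:

```python
from typing import Optional
from urllib.parse import urlencode


def firehose_url(relay_host: str, cursor: Optional[int] = None) -> str:
    """Build the subscribeRepos websocket URL for a given relay host."""
    base = f"wss://{relay_host}/xrpc/com.atproto.sync.subscribeRepos"
    if cursor is None:
        # No cursor: start from the live tail of the stream.
        return base
    return f"{base}?{urlencode({'cursor': cursor})}"


# Switching relays is a one-string change for the consumer.
print(firehose_url("bsky.network"))
print(firehose_url("relay.example.com"))
```

One caveat as I understand it: cursors are sequence numbers assigned by each relay, so a cursor saved from one relay shouldn't be replayed against another. After a switch, a consumer would start from the live tail or backfill some other way.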

oh wow this actually panics the relay oops

echo: http: panic serving 127.0.0.1:59908: runtime error: invalid memory address or nil pointer dereference
goroutine 138031046 [running]:
net/http.(*conn).serve.func1()
        /usr/local/go/src/net/http/server.go:1947 +0xbe
panic({0x120d4a0?, 0x1ed6c50?})
        /usr/local/go/src/runtime/panic.go:792 +0x132
main.(*Service).handleComAtprotoSyncGetRepoStatus(0xc000176f00, {0x1615860, 0xc11b883c20}, {0xc118feda0d, 0x20})
        /dockerbuild/cmd/relay/handlers.go:201 +0x399
main.(*Service).HandleComAtprotoSyncGetRepoStatus(0xc000176f00, {0x1615860, 0xc11b883c20})
        /dockerbuild/cmd/relay/stubs.go:171 +0x22a
github.com/labstack/echo/v4.(*Echo).add.func1({0x1615860, 0xc11b883c20})
        /go/pkg/mod/github.com/labstack/echo/v4@v4.11.3/echo.go:582 +0x45
github.com/bluesky-social/indigo/util/svcutil.MetricsMiddleware.func1({0x1615860, 0xc11b883c20})
        /dockerbuild/util/svcutil/metrics_middleware.go:48 +0x1c5
github.com/labstack/echo/v4/middleware.LoggerWithConfig.func2.1({0x1615860, 0xc11b883c20})
        /go/pkg/mod/github.com/labstack/echo/v4@v4.11.3/middleware/logger.go:126 +0xd8
main.(*Service).startWithListener.func1.1({0x1615860, 0xc11b883c20})
        /dockerbuild/cmd/relay/service.go:98 +0x124
github.com/labstack/echo/v4/middleware.CORSWithConfig.func1.1({0x1615860, 0xc11b883c20})
        /go/pkg/mod/github.com/labstack/echo/v4@v4.11.3/middleware/cors.go:204 +0x45e
github.com/labstack/echo/v4.(*Echo).ServeHTTP(0xc000112fc8, {0x15f8240, 0xc11c101180}, 0xc11a6df2c0)
        /go/pkg/mod/github.com/labstack/echo/v4@v4.11.3/echo.go:669 +0x327
net/http.serverHandler.ServeHTTP({0xc0345932f0?}, {0x15f8240?, 0xc11c101180?}, 0x6?)
        /usr/local/go/src/net/http/server.go:3301 +0x8e
net/http.(*conn).serve(0xc11bf42ea0, {0x15fb6c0, 0xc00477a990})
        /usr/local/go/src/net/http/server.go:2102 +0x625
created by net/http.(*Server).Serve in goroutine 7687
        /usr/local/go/src/net/http/server.go:3454 +0x485
Aug 28, 2025, 5:21 PM

Or rather...... hmm.

If I'm really honest, a small part of the reason constellation still listens to a Bluesky-run jetstream instance is because, well, I guess I haven't reached a level of confidence in my own relays yet to make the switch, at least for the most critical microcosm APIs.

The tooling story for observing and comparing running relays is not great. There are (as I've been finding now that I'm looking) bugs in the reference sync1.1 relay to be ironed out. There are open problems that might even lead to protocol-level fixes. The risks of switching to an indie relay turn out to, today, include missing all data from hundreds of self-hosted accounts.

When Bluesky-run jetstreams started lagging and dropping clients this weekend, I wish I could have recommended folks try my existing full-indie jetstreams. Instead, I spun up a whole new one that subscribes to bsky.network, that one, old, not-fully-authenticating, centralized relay which is ultimately used by everyone. And I recommended folks use that jetstream instance because at least I knew it really would be a drop-in alternative.

~

I don't think Bluesky is disengaged here, but it looks like other efforts have been given higher priority for some months. They're a small team! But we can't just wait for Bluesky to commit to sync1.1 in production and fix everything. I'm working on it, landing some small mitigations and digging in to root causes. Critically, I've received quick responses to questions and fast pull request reviews from Bluesky devs. (thank u forever Bryan)

More indie relays are coming online. We finally got a #relay-operators channel in the atproto dev discord. There's bumps to get over and tooling to build, but we're on our way.

~

* extremely notable exception: Blacksky's rsky-relay, at atproto.africa! it's a reimplementation of the bluesky sync1.1 relay in rust.

It was hard to fit in but: one reason i'm finding problems in sync1.1 relays is that they expose more state to inspect. They implement more atproto http queries, including com.atproto.sync.getRepoStatus, which are unavailable from either bsky.network or atproto.africa, so the impact of similar problems on those two is harder to observe.

I think more accounts tend to work properly on bluesky's bsky.network relay primarily because account owners can self-debug, because everything downstream uses it. But not all its bugs necessarily have easy resolutions: see this post for a frustrating (and unresolved) self-hosting experience.

a final relay note: If you self-host a PDS, you can help your local indie relays stay in sync with your events by adding them to your PDS_CRAWLERS config! See here for an example that includes microcosm relays, and you can find the list of public indie relays I'm aware of at compare hoses. It's not required, but for now it helps. Expect some protocol evolution here.
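As far as I know, the reference PDS reads its crawler list from the `PDS_CRAWLERS` environment variable as a comma-separated list of relay URLs. A tiny sketch of assembling that line; `crawlers_line` is a hypothetical helper and `relay.example.com` is a placeholder rather than a real relay:

```python
def crawlers_line(relay_urls):
    """Return a PDS_CRAWLERS env line from relay URLs, de-duplicated in order."""
    seen = []
    for url in relay_urls:
        # Normalize whitespace and trailing slashes so duplicates are caught.
        url = url.strip().rstrip("/")
        if url and url not in seen:
            seen.append(url)
    return "PDS_CRAWLERS=" + ",".join(seen)


print(crawlers_line([
    "https://bsky.network",
    "https://relay.example.com/",  # placeholder for an indie relay
]))
```

Keeping bsky.network in the list alongside any indie relays means nothing changes for the main network.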


💃 jump in the microcosm discord if you want to talk more about relays! if you want to support independent atproto infrastructure, you can sponsor me on github or support me on ko-fi. thanks!


Building on microcosm

Conversations

Note: at the start of this week I wrote down here everything i'd recently added to my backlog, but the leaflet sync bug that ate some data from last week's post managed to take a chunk off this one too. Oh well! I will now pretend the backlog isn't growing quite so terrifyingly since i can't see it.

boring week notes below


Monday

  • mostly still travelling

  • wrapped up last week's notes (er, not quite)

[lost content]

Wednesday

  • finished last week's microcosm weekly, and then started to re-do all the finishing work after it was eaten 🥲

Thursday

Friday

  • scraped bluesky relays to discover any banned PDS hostnames

    • found four hostnames banned

    • cross-referenced the banned hosts from the relay's com.atproto.sync.listHosts endpoint: one was missing. this is likely because it was banned before it was actually added as a host in the relay's database

  • got a reproduction of the relay's bad account state on a new account (fresh PDS): https://bsky.app/profile/bad-example.com/post/3lxkrt3yh722a (tracking issue on github: github.com/bluesky-social/indigo/issues/1143 )

  • fixed broken mobile rendering for https://microcosm.blue. it still feels pretty incomplete but at least now you can read it on a phone.
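For the listHosts cross-reference mentioned under Friday, here is a sketch of paging through a relay's com.atproto.sync.listHosts and keeping hosts whose status is `banned`. This is not the actual script: the helper names are made up, and I'm assuming the response carries a `hosts` array (each entry with `hostname` and an optional `status`) plus an optional `cursor` for the next page.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterable, Iterator, List, Optional


def fetch_hosts_page(relay_host: str, cursor: Optional[str] = None) -> dict:
    """Fetch one page of com.atproto.sync.listHosts from a relay."""
    params = {"limit": 200}
    if cursor:
        params["cursor"] = cursor
    query = urllib.parse.urlencode(params)
    url = f"https://{relay_host}/xrpc/com.atproto.sync.listHosts?{query}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def iter_hosts(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every host entry, following cursors until the relay stops sending one."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("hosts", [])
        cursor = page.get("cursor")
        if not cursor:
            break


def banned_hostnames(hosts: Iterable[dict]) -> List[str]:
    """Return the sorted hostnames of entries marked banned."""
    return sorted(h["hostname"] for h in hosts if h.get("status") == "banned")
```

Against a real relay this would be called as `banned_hostnames(iter_hosts(lambda c: fetch_hosts_page("relay.example.com", c)))`, with the hostname as a placeholder. As noted above, a host banned before it was ever added to the relay's database won't appear in this listing at all, so this alone can undercount.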