Previous incidents
Large backlog in webhooks and automations
Resolved Mar 28 at 08:18pm EDT
We've processed the backlog.
Increased spam rate for custom domains
Resolved Mar 25 at 02:51pm EDT
We've confirmed and resent the majority of the affected emails, and are in touch with the authors whose emails we were not able to resend automatically on their behalf.
Emails were timing out
Resolved Mar 14 at 02:31pm EDT
Here is the not-so-fun thing about running an email service provider: you get malicious actors trying to use your infrastructure for, well, malice — phishing, spoofing, et cetera.
We have a lot of defenses in place for this, but we detected someone using a relatively novel approach: passing in problematic URLs that we weren't catching. Our various other systems did catch and apprehend this user before they were able to send any emails, but in our haste to push forward a solution ...
Delays in background processing
Resolved Jan 31 at 03:52pm EST
Post mortem
What broke?
workerscheduler, our process for running asynchronous jobs that are scheduled for some date in the future, was hard down for ~six hours. This meant, amongst other things:
- Outbound emails were down
- Cron was down
- Other stuff, but those two dwarf everything else
Why did it break?
At a very high level, our asynchronous worker schedulers work something like this (none of this is bespoke, it's standard RQ):
- To enqueue a job, serialize the method n...
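The description above is cut off, so the following is only a generic sketch of the pattern it begins to describe: a job is stored as a serialized reference to a function plus its arguments, held until its scheduled time, then handed to a worker. It uses only the Python standard library and is not RQ's actual implementation; `ToyScheduler`, `REGISTRY`, `register`, `promote_due`, and `send_email` are hypothetical names chosen for illustration.

```python
import heapq
import json

# Hypothetical registry mapping function names to callables (not part of RQ).
REGISTRY = {}


def register(func):
    """Make a function look-up-able by name so a worker can resolve it later."""
    REGISTRY[func.__name__] = func
    return func


class ToyScheduler:
    """In-memory stand-in for a scheduled-job queue (illustrative only)."""

    def __init__(self):
        self._scheduled = []  # min-heap of (run_at, sequence, payload)
        self._ready = []      # payloads whose scheduled time has passed
        self._seq = 0         # tie-breaker so equal timestamps stay ordered

    def enqueue_at(self, run_at, func, *args):
        # Store the function's name and arguments as JSON, not the function itself.
        payload = json.dumps({"func": func.__name__, "args": list(args)})
        heapq.heappush(self._scheduled, (run_at, self._seq, payload))
        self._seq += 1

    def promote_due(self, now):
        """Move every job scheduled at or before `now` onto the ready list."""
        moved = 0
        while self._scheduled and self._scheduled[0][0] <= now:
            _, _, payload = heapq.heappop(self._scheduled)
            self._ready.append(payload)
            moved += 1
        return moved

    def work(self):
        """Deserialize and run every ready job, returning their results."""
        results = []
        while self._ready:
            job = json.loads(self._ready.pop(0))
            results.append(REGISTRY[job["func"]](*job["args"]))
        return results


@register
def send_email(to):
    return f"sent to {to}"
```

In this toy model, jobs only reach a worker when `promote_due` runs. If that step stops, scheduled jobs stay parked in the heap, which is one way to picture how a scheduler outage could stall outbound email and cron at the same time.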
Unable to send from the author-facing app
Resolved Jan 27 at 01:52pm EST
We have identified the root cause of this issue, and a fix has been deployed.
Author-facing app failing to load
Resolved Jan 23 at 08:15pm EST
This issue has been identified, and the fix has been shipped.
Author-facing app failing to complete requests
Resolved Jan 17 at 09:26pm EST
As written above, we've since recovered from the incident.