Blinkist is a learning app that distills powerful ideas from nonfiction. The app contains Blinks: bite-sized, 15-minute audio and text explainers. In 2017, Blinkist's app was nominated by Apple as one of the best in the App Store. Blinkist has received the World Summit Award, granted by the United Nations, in the Learning & Education category, as well as a Google Material Design Award for Brand Expressiveness. With Blinkist, you can understand the key takeaways from nonfiction bestsellers in minutes, find your next read, and enjoy “Shortcasts,” which are like Blinks but for podcasts, produced with the original creators.
We recently spoke with Patrice Liang, Senior Software Engineer at Blinkist, to find out what she loves about Airbrake, and how Airbrake helps her triage incidents.
Patrice finds Airbrake’s Slack integration and visual dashboards especially helpful. “Slack is a key part of our workday and we use the Airbrake integrations as an alerting tool. We also use Airbrake to triage incidents. Recently we've been doubling down on clear ownership and being able to integrate specific Airbrake projects with specific Slack channels has helped in automatically routing concerns to the right team.” She added, “Visualization [by project] is perhaps our team's favorite feature of Airbrake.”
In one instance, Airbrake played a crucial role in the Blinkist Services team's incident triage, not only by detecting the problem but also by facilitating collaboration and communication. Through the Slack integration, one team member noticed an Airbrake error indicating that the team's internal “Admin” tool could not reach the content microservice, which signaled an issue with that microservice. The problem then appeared to resolve itself, only to resurface in an incident triggered by a spike in errors on an endpoint. In addition to the Admin tool being down, the team started to see degraded performance on multiple endpoints, including signup and login. Externally, some customers experienced increased latency and timeouts while signing up or logging in.
The Airbrake errors and the incident prompted more team members to join the investigation, which was largely conducted in a Slack thread on the Airbrake error. In the thread, the team decided to start a Zoom call to triage the incident. The team ultimately identified the cause: a code change combined with a simultaneous surge in traffic from a scheduled push notification to the Blinkist app had brought the internal Admin tool down. The code change affected how the content microservice serializes certain content item responses. The traffic surge came from the launch of Shortcasts in German, when Blinkist sent push notifications to inform users. The team temporarily increased the service's resources to mitigate the impact while the code change was reverted and a permanent fix was developed.
There were two key results for Blinkist developers using Airbrake. First, Airbrake was the first line of detection in this instance and several others. Because the side effects of this problem ranged from increased latency in signup and login to an internal tool outage, it would have been difficult to quickly narrow down the root cause without Airbrake. Airbrake not only identified the problematic service right away, it also allowed the team to start the investigation before a formal incident was triggered.
Second, in this time of remote work, teams are always seeking ways to communicate better. Blinkist developers found Airbrake’s Slack integration to be an efficient way to communicate and to bring the team together for incident triage.
Patrice Liang is a Senior Software Engineer at Blinkist. She is one of two backend engineers who work on supporting features and larger, more technical projects. In her spare time, she likes to run around the flat landscapes of Berlin, visit galleries and bodies of water, and write about the mundane with wonder. You can follow Patrice on Twitter (@tricepat) or connect with her via LinkedIn.