
Patching a memory leak

4 min read · Jun 18, 2024

Identifying and fixing a memory leak in a Python Flask web server using tracemalloc and code analysis

Problem

At Freshworks, we were alerted to recurring pod restarts on a Python Flask web server. When the first alert fired, I could not find any issue in the logs. The same alert reappeared for the same server the following month, and this time I was fairly sure something was wrong.

I checked the logs and the infra-monitoring dashboard for the last hour and found nothing unusual. To widen the scope, I then looked at the pods' memory usage over progressively longer windows: first one hour, then one day, then one week.

Figure 1: Memory usage of a server pod over three days, rising from ~600 MB to ~700 MB.

The server's memory usage was increasing steadily. It periodically crossed the maximum allocated memory size, which caused the pod to restart, and each restart could drop in-flight API requests. Because the server exposes many APIs, the next challenge was to pinpoint the exact location of the leak.
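To put a number on the trend in Figure 1, here is a minimal sketch that turns the two readings from the caption (~600 MB and ~700 MB, three days apart) into a growth rate and a rough time-to-limit. It assumes the growth is linear, and the 1024 MB limit is a hypothetical value chosen for illustration, not the pod's actual memory limit.

```python
def growth_rate_mb_per_day(start_mb: float, end_mb: float, days: float) -> float:
    """Average memory growth per day between two readings."""
    return (end_mb - start_mb) / days


def days_until_limit(current_mb: float, limit_mb: float, rate_mb_per_day: float) -> float:
    """Days until the limit is reached, assuming the growth stays linear."""
    if rate_mb_per_day <= 0:
        return float("inf")  # memory is flat or shrinking: the limit is never hit
    return max(0.0, (limit_mb - current_mb) / rate_mb_per_day)


# Readings taken from the Figure 1 caption.
rate = growth_rate_mb_per_day(600, 700, 3)

# Hypothetical limit, for illustration only.
HYPOTHETICAL_LIMIT_MB = 1024
remaining = days_until_limit(700, HYPOTHETICAL_LIMIT_MB, rate)

print(f"growth: {rate:.1f} MB/day, days until limit: {remaining:.1f}")
```

A steady positive rate over several days, rather than a sawtooth that returns to a baseline, is what suggests a leak instead of normal load variation.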

Leak Detection

This was a slow leak that accumulated memory over about a month. Leaks like this are generally hard to detect, so I tried to narrow down the circumstances under which it occurred.


Written by Soumendra's Blog
