An application with a memory leak will eventually terminate abnormally, after exhausting all available memory. Applications written in languages such as Java or C#, despite their automatic reclamation of memory, may still have memory leaks. Finding the cause of leaks in such settings is an immense challenge. This is in part because leaks can occur for a number of different reasons: forgetting to deregister as a listener, improperly implementing object equality, or sizing caches to be larger than available memory. They are also difficult to diagnose because of problems of scale: if you're leaking objects, there are bound to be a lot of objects to dig through! Finally, applications these days are composed of a myriad of frameworks that no one developer entirely understands. Even for experts, manually digging through the mountain of details is error prone and time consuming. We have developed a tool, called LeakBot, that automates this process. It uses graph algorithms to group the live objects of an application into equivalence classes, and pattern matching to look for trends in the constituency of each class. A leak shows up as an equivalence class whose number of matches increases monotonically.
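One of the causes above, forgetting to deregister a listener, is worth seeing concretely. The sketch below is a hypothetical minimal example (the `EventBus` and `Listener` names are illustrative, not from any framework discussed here): every handler that is registered but never removed stays strongly reachable from the bus, so neither it nor anything it references can ever be collected.

```java
import java.util.ArrayList;
import java.util.List;

// A classic Java leak pattern: listeners registered but never deregistered.
class EventBus {
    private final List<Listener> listeners = new ArrayList<>();

    void register(Listener l) { listeners.add(l); }

    // If callers forget to invoke this, every registered listener --
    // and everything it references -- stays reachable forever.
    void deregister(Listener l) { listeners.remove(l); }

    int listenerCount() { return listeners.size(); }
}

interface Listener { void onEvent(String event); }

class LeakDemo {
    public static void main(String[] args) {
        EventBus bus = new EventBus();
        for (int i = 0; i < 3; i++) {
            // Each handler is intended to be short-lived, but is never
            // removed, so the bus pins all of them in memory indefinitely.
            bus.register(event -> System.out.println("handled " + event));
        }
        // bus.deregister(...) is never called: the three handlers leak.
    }
}
```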
Figure 1. We group the live objects in an application into equivalence classes, based on how they evolve, and where they are situated in the object reference graph. For example, in today's complex applications, we find groups of data structures that evolve like a leak (growing without bound), or like a cache, or a pool.
Our group has always been driven by work with important IBM customers. These engagements were the origin of the ideas behind Leakbot. When analyzing memory leaks on these engagements, the developers typically provided us with snapshots of the Java heap. Most existing tools required us to digest a graphical view of the snapshots: a node for each object, and an edge for each reference between them. However, these graphs were very large, consisted of objects from unfamiliar frameworks, and were connected in complex ways. As the reliance on frameworks increased, we found that manually poring over the individual details of the data became more painful. Nobody had the time or expertise to dig through the mountain of details manually.
Unfortunately, simple summaries of the graph didn't help. Knowing the common data types only told us that there are lots of strings. But applications weren't just leaking strings, they were leaking whole data structures.
We next used graph algorithms to look for the big data structures. This capability was beyond most existing tools, but it alone was not enough. Sometimes data structures were just big, not bugs. Even then, knowing that a class loader owned most of the objects didn't help. The heads of the leaking data structures usually lay somewhere between the class loader roots and the strings.
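One way to make "big data structures" precise is retained size: the number of objects that would become unreachable from the GC roots if a given node were removed from the object reference graph. The brute-force sketch below (one traversal per candidate, integer object ids) is purely illustrative; LeakBot's actual algorithms are engineered to scale far beyond this.

```java
import java.util.*;

// Brute-force retained-size sketch for a small object reference graph.
// retained(n) = |reachable from roots| - |reachable from roots skipping n|.
class RetainedSize {

    // Objects reachable from the roots, pretending node `skip` is absent.
    // Pass a negative `skip` (ids are nonnegative) to skip nothing.
    static Set<Integer> reachable(Map<Integer, List<Integer>> refs,
                                  List<Integer> roots, int skip) {
        Set<Integer> seen = new HashSet<>();
        Deque<Integer> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            int n = work.pop();
            if (n == skip || !seen.add(n)) continue;
            work.addAll(refs.getOrDefault(n, List.of()));
        }
        return seen;
    }

    // Retained size of `node`: how many objects (including itself) become
    // unreachable if that single node is removed.
    static int retained(Map<Integer, List<Integer>> refs,
                        List<Integer> roots, int node) {
        int all = reachable(refs, roots, -1).size();
        return all - reachable(refs, roots, node).size();
    }
}
```

A node with a large retained size is the head of a big structure, but, as noted above, big is not the same as leaking, which is why the trend analysis described next is needed.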
Figure 2. We identify three attributes of each object to define the equivalence class to which an object belongs. We then match the canonical keys against a series of memory snapshots, and look for trends.
Therefore, we implemented analyses, based on scalable graph algorithms, to automate the discovery of those leaking data structures and the context in which they are situated. It was particularly important to distinguish based on context, because the leaking structures were often of a common type, from a framework the application used. For example, a single application these days uses XML documents for several purposes, but perhaps only one of those uses is problematic. Our analyses group objects into equivalence classes based on their type and context. It is very likely that the same code logic governs whether the objects in a class will eventually be collected, or will live forever. The analyses then look for how the equivalence classes evolve over time. Each equivalence class is described by a pattern: a location in the graph, and a type that is leaking. We look for trends in the membership of each class by matching the patterns against a series of memory snapshots, taken at intervals as the application runs.
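The grouping-and-trend idea can be sketched in a few lines. This is a toy reconstruction, not LeakBot's implementation: we assume each object has already been reduced to a key of (type, graph context), counts per key are tallied for each snapshot, and keys whose counts grow strictly across every snapshot are flagged as suspected leaks.

```java
import java.util.*;

// Toy sketch of equivalence-class trend analysis across heap snapshots.
class TrendAnalysis {

    // A canonical key: an object's data type plus the graph context
    // (e.g., the path of owning types from a root) where it is situated.
    record Key(String type, String context) {}

    // Count equivalence-class membership in one snapshot, given the
    // per-object keys extracted from that snapshot.
    static Map<Key, Integer> tally(List<Key> snapshot) {
        Map<Key, Integer> counts = new HashMap<>();
        for (Key k : snapshot) counts.merge(k, 1, Integer::sum);
        return counts;
    }

    // Flag keys whose counts strictly increase from each snapshot to the
    // next: a monotonically growing class is a leak suspect.
    static Set<Key> suspectedLeaks(List<Map<Key, Integer>> series) {
        Set<Key> suspects = new HashSet<>(series.get(0).keySet());
        for (int i = 1; i < series.size(); i++) {
            Map<Key, Integer> prev = series.get(i - 1);
            Map<Key, Integer> cur = series.get(i);
            suspects.removeIf(k ->
                cur.getOrDefault(k, 0) <= prev.getOrDefault(k, 0));
        }
        return suspects;
    }
}
```

Note how the context field does the work that a plain type histogram cannot: two groups of strings with different owners land in different equivalence classes, so a stable configuration cache never masks a growing session cache.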
This technology now ships as part of Rational Application Developer version 6.0, and as part of the Memory Dump Diagnostic for Java technology preview. We are currently working on enhancing the algorithms we use to group objects into equivalence classes, and on an infrastructure that allows the graph analyses to scale to graphs with hundreds of millions of nodes. We are also extending the work to new domains, such as diagnosing problems of excessive memory footprint. Our experience has shown that applications that make heavy use of frameworks very often require too much memory to accomplish their tasks. Analyzing this type of problem shares many of the difficulties of leak analysis. It is even more challenging, because there are no trends to analyze.
Nick Mitchell and Gary Sevitsky. "LeakBot: An Automated and Lightweight Tool for Diagnosing Memory Leaks in Large Java Applications." ECOOP 2003 (European Conference on Object-Oriented Programming), 2003. (2003 Pat Goldberg Memorial Best Paper Award.)