Tuesday, April 27, 2010

This blog -- taking a break

Prompted, in part, by my annoyance at Google dropping the FTP support this blog relies on, but mostly by a lack of time, I won't be doing any work on this blog for the foreseeable future.

Commenting on all posts will be disabled.

However, I do have plans to write new material on my other "non-technical" blog, especially about "people skills for geeks" (a topic on which I recently posted this video).

Thanks to everyone who's read and commented on this blog over the past 4 years.

-- John

Memory Leaks in Managed Code

(This is a repost of an old article I wrote in early 2006, before I started this blog)

When writing a Windows Forms application, it's useful to display the current memory usage in the "About" box. It comes in handy when troubleshooting. It usually proves that memory usage is not the problem, but sometimes it proves the opposite: the application is chewing up more and more RAM.

How can that be? How can a managed application leak memory?

Types of Managed Memory Leaks

There are two main types of "leak" in managed applications:

1. Unintended references keeping managed objects alive

You're not using an object any more, so you expect that the garbage collector will clean it up. But the garbage collector doesn't. Why? There is only one possible reason: something, somewhere, still has a reference to the object. Perhaps you put the object in some global (i.e. static) list, or perhaps it is referred to by some other object which you are still using.

So this isn't a "real" leak at all, it just looks like one. You've forgotten about a reference to your object, but the garbage collector hasn't. It sees the reference and keeps the object alive.

Of course, a reference only counts if it can be (recursively) traced back to a "root" object which is still in use - i.e. a static field or a local variable/parameter in a currently executing method. References from other objects which are, themselves, due for garbage collection will not keep an object alive. In fact, the garbage collector never even sees them. (It just starts from the roots and recursively visits reachable objects.)

Event handlers are a common cause of this problem. You create objects A and B then run some code like this:

A.SomeEvent += new EventHandler(B.SomeMethod);

Later, you finish using B, but you're still using A. Your on-going use of A keeps B alive, since A refers to B by way of the event handler. If you want B's lifetime to be shorter than A's, you have to unhook the event handler with a line like this:

A.SomeEvent -= new EventHandler(B.SomeMethod);  // note the '-='

Of course, event handlers aren't the only cause of this problem; any reference between objects will do. Consequently, these problems can be difficult to track down. You can't think what the remaining reference is, but there must be one, somewhere. To help track it down, try a tool like SciTech .NET Memory Profiler. Or, if you prefer free development tools, try Microsoft's SOS debugger extension. But be warned, it can be a bit complicated.
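
Here is a minimal sketch of the event-handler case. The Publisher and Subscriber classes (and the HandlerCount property) are hypothetical names I've made up for illustration; they play the roles of A and B above. The subscriber hooks the event in its constructor and unhooks it in Dispose, so the publisher stops referring to it as soon as you've finished with it.

```csharp
using System;

// Hypothetical long-lived object: plays the role of "A".
class Publisher
{
    public event EventHandler SomeEvent;

    // Number of handlers the event currently holds.
    // Each instance-method handler is also a reference to its target object.
    public int HandlerCount
    {
        get { return SomeEvent == null ? 0 : SomeEvent.GetInvocationList().Length; }
    }
}

// Hypothetical short-lived object: plays the role of "B".
class Subscriber : IDisposable
{
    private readonly Publisher _publisher;

    public Subscriber(Publisher publisher)
    {
        _publisher = publisher;
        _publisher.SomeEvent += OnSomeEvent;   // the publisher now refers to this object
    }

    private void OnSomeEvent(object sender, EventArgs e)
    {
        // React to the event here.
    }

    public void Dispose()
    {
        _publisher.SomeEvent -= OnSomeEvent;   // the publisher's reference is dropped
    }
}
```

After Dispose() the publisher's invocation list no longer mentions the subscriber, so (assuming nothing else refers to it) the garbage collector is free to reclaim it.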

2. Managed objects holding unmanaged resources


There are two variants of this problem:

(a) Badly-written managed objects, which don't clean up after themselves. Of course, you don't write any of these, so let's move straight on to (b)…

(You don't write bad objects because you know that a good managed object should always implement IDisposable if it uses unmanaged resources. A good managed programmer will use that interface to clean the object up when he or she has finished with it. If the programmer forgets, the good managed object will clean up the unmanaged resources anyway, in its finalizer, when the garbage collector collects the object.)

(b) Small managed objects, which use significant amounts of unmanaged memory. The .NET 1.1 garbage collector cannot see the unmanaged memory. It just sees the small managed part, so it thinks there's no need for a collection. (If it did do a collection, the well-written managed object would indeed free the unmanaged memory, but the garbage collector doesn't realise a collection is required.)

Bitmaps are a classic example of this problem, since they have a small managed component and a large unmanaged part. Under .NET 2.0, there is a new method, GC.AddMemoryPressure, which such objects can call to inform the garbage collector of their true size. It will solve this problem (as long as it's used correctly, which includes a matching call to GC.RemoveMemoryPressure when the unmanaged memory is freed).
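
To make (a) and (b) concrete, here is a rough sketch of a "good" managed object that owns unmanaged memory. NativeBuffer is a hypothetical class invented for this example, and Marshal.AllocHGlobal simply stands in for whatever unmanaged resource a real object would hold. It implements IDisposable, falls back to a finalizer if the programmer forgets to call Dispose, and uses GC.AddMemoryPressure / GC.RemoveMemoryPressure (the .NET 2.0 methods) to tell the garbage collector about its unmanaged part.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical example: a tiny managed object that owns a large unmanaged block.
sealed class NativeBuffer : IDisposable
{
    private readonly long _size;
    private IntPtr _memory;

    public NativeBuffer(int sizeInBytes)
    {
        if (sizeInBytes <= 0) throw new ArgumentOutOfRangeException("sizeInBytes");

        _memory = Marshal.AllocHGlobal(sizeInBytes);
        _size = sizeInBytes;
        GC.AddMemoryPressure(_size);      // tell the GC about the unmanaged part
    }

    public bool IsDisposed
    {
        get { return _memory == IntPtr.Zero; }
    }

    // Deterministic clean-up, called by the programmer or by a using statement.
    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);        // the finalizer has nothing left to do
    }

    // Safety net: runs if the object is collected without having been disposed.
    ~NativeBuffer()
    {
        Release();
    }

    // Safe to call more than once.
    private void Release()
    {
        if (_memory == IntPtr.Zero) return;

        Marshal.FreeHGlobal(_memory);
        _memory = IntPtr.Zero;
        GC.RemoveMemoryPressure(_size);   // must match the AddMemoryPressure call
    }
}
```

Typical use would be something like using (NativeBuffer buffer = new NativeBuffer(1024 * 1024)) { ... }, so that the unmanaged memory is released as soon as the block ends rather than whenever the finalizer happens to run.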

Other Types of Managed Memory Problems

Two other leak-like problems are possible (although relatively unlikely, in practice, I'd say). The first relates to "pinned" objects preventing the garbage collector from moving objects around. This is an internal .NET implementation issue, and has been significantly improved in .NET 2.0. The second relates to declaring lots of large objects and never letting them leave scope. In the unlikely event that you find yourself doing that - and suffering unacceptable memory usage - you should stop doing it :-)

This page itemizes all the kinds of memory leaks that I've listed here (although it fails to mention managed objects holding unmanaged resources).

What to Do
  • Always examine memory usage when you test your application. Don't assume everything will be OK simply because you're using managed code. Test it.

  • If you're writing a GUI application, consider displaying both managed and unmanaged memory usage in the About box, using GC.GetTotalMemory(true) and Process.GetCurrentProcess().PrivateMemorySize respectively. (See this page, and this one, for an explanation of exactly what PrivateMemorySize means.) Open the About box from time to time during your testing, to check that the figures look reasonable.

  • Call Dispose on IDisposable objects when you have finished with them. If an object implements IDisposable it is saying "please clean me up when you've finished with me". The object may hold unmanaged memory, or other resources such as files or GDI handles. Call Dispose() or get the using statement to do it for you.

  • Be careful with global (static) members. Don't overuse them. Why? If you have a lot of statics, you may find it harder to understand which of them are keeping objects alive. (This recommendation is for your benefit, not the garbage collector's.)

  • Get to know your garbage collector. Here's a good starting point and some more advanced stuff too.
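
As a sketch of the About box suggestion above, something like the following would do. MemoryReport and Describe are hypothetical names of my own. I've assumed .NET 2.0 or later and used PrivateMemorySize64, the 64-bit replacement for the now-obsolete PrivateMemorySize property. Bear in mind that the private bytes figure covers the whole process, so it includes the managed heap as well as unmanaged memory.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for an About box.
static class MemoryReport
{
    public static string Describe()
    {
        // Passing true lets the GC collect before the figure is read.
        long managedBytes = GC.GetTotalMemory(true);

        long privateBytes;
        using (Process process = Process.GetCurrentProcess())
        {
            privateBytes = process.PrivateMemorySize64;
        }

        return string.Format(
            "Managed heap: {0:N0} KB, private bytes: {1:N0} KB",
            managedBytes / 1024,
            privateBytes / 1024);
    }
}
```

Note that Process is itself IDisposable, which is why the sketch wraps it in a using statement.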

What NOT to Do

  • In general, do not call GC.Collect(). Leave it to the garbage collector to decide when a collection is required. Having said that, I like to force a collection when my About box opens, to ensure the figures are accurate. (Here's how to force a complete collection - but like I said, you should almost never do so.)

  • Don't null-out local variables. In general, there is no need to set local variables to null when you have finished using them. In Release builds (unlike Debug builds), the object referenced by a local variable becomes eligible for garbage collection immediately after the last line that uses the variable. If you add a line after that, to set the variable to null, the new line does no good. It will either (very slightly) extend the lifetime of the object or it will have no effect at all (if the JIT compiler realizes the line is useless and optimizes it away). So keep your code clean and readable by avoiding pointless assignments to null. (When should you set things to null? I can think of a couple of situations: (1) When a small object with a long lifetime refers to a large object with a short lifetime. Say you have two objects, A and B, and A contains a field that refers to B. If B is a very large object and you want its lifetime to be significantly shorter than that of A, then consider setting the field to null when you've finished with B. But don't go overboard - you probably won't need to do this very often. (2) Another situation, relating specifically to servers with threads waiting on external resources, is described here. Make sure you read the comments at the end.)

  • Don't get paranoid about garbage collector performance. It is easy to underestimate just how good the .NET garbage collector is. For instance, how many developers realise that a .NET app can create and destroy around 50 million small, short-lived objects per second? (Admittedly, small, short-lived objects are the garbage collector's best-case scenario - but they also happen to be the most common scenario in real-world apps.)
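
For the rare diagnostic case mentioned above (such as my About box), a forced full collection is commonly written along these lines. This is only a sketch of the usual idiom, not necessarily what the linked page recommends, and GcDiagnostics and CollectAndMeasure are hypothetical names.

```csharp
using System;

// Hypothetical diagnostics helper. Not something to call in normal application code.
static class GcDiagnostics
{
    public static long CollectAndMeasure()
    {
        GC.Collect();                    // collect all generations
        GC.WaitForPendingFinalizers();   // let finalizers release what they hold
        GC.Collect();                    // reclaim objects that were waiting on finalization

        return GC.GetTotalMemory(false); // managed bytes currently thought to be allocated
    }
}
```

The second GC.Collect() is there because an object with a finalizer survives the first collection; its memory can only be reclaimed once the finalizer has run.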

Happy coding!