What is a memory leak?
We can say that a memory leak occurs whenever memory stays allocated longer than necessary. This usually happens because the code (or, more precisely, the programmer who wrote it) loses track and forgets to release memory after it is no longer needed. Memory is wasted, and the accumulation of this waste can eventually prevent the application from running at all because of how much memory it is occupying.
There is actually an even bigger problem, which is releasing memory prematurely while it still needs to be accessed, but that is another subject.
With the most common kinds of garbage collector it is normal for objects not to be released right after they are no longer needed. Collection happens from time to time, according to the need for memory. There is some waste, but it is by design. When we use something like a garbage collector we have given up the greatest possible memory savings. What we cannot give up is releasing memory that is no longer needed at all.
When there is a GC, the definition of memory leak changes somewhat. Only an object that survives a collection even though it is no longer needed is in fact considered a leak. Before a collection happens, it is considered normal for the object to still be there even if it is not needed anymore.
If there is too much leakage, sooner or later you will get an OutOfMemoryException.
Applications that run for a short time can have leaks that are never noticed, some innocuous, others not so much.
To be clear: there are no leaks of memory allocated on the stack; that memory is managed automatically.
Why does the GC not collect something that is already unnecessary?
It is not because it is bad. It is because the code has a problem: it keeps at least one reference to an object that is in fact no longer needed, but the code is saying it is. This can happen for several reasons, which we will see below.
Do I have to restart the computer to release the leaked memory?
No. When the application ends, the operating system releases all memory linked to it, even if your application did not. It does this perfectly and completely every time; after all, it was the OS that gave memory to your application, so it knows everything that was handed out and can take it all back.
Managed memory
In fact, .NET works with a memory manager which, despite the name, cares more about how memory is allocated than about how it is released. This greatly simplifies application development, since properly managing memory by hand is a huge hassle, especially when we use more advanced language and library features.
.NET has a very smart GC that allows allocations at a cost close to stack allocation (which is absurdly fast) and with no release cost. But it has collection pauses. In addition, it keeps memory free of fragmentation, which helps preserve locality of reference and gives better performance.
The problem is that many programmers think that, because of this, we do not need to worry about memory use at all. We have several questions here at Sopt where users report memory leaks.
Unmanaged memory
Not all memory allocated in your application is managed by the garbage collector. You can use operating system libraries and services that allocate memory on their own; .NET has no direct control over that allocation, only the component that allocated it can free the memory, and usually it needs to be told that your application no longer needs the allocated content.
The mechanism .NET adopted to give that notice is the Disposable pattern. Every object that accesses resources outside managed memory should implement the Dispose() method of the IDisposable interface. In this method, all the work necessary to release the resources allocated outside managed memory is done.
Note that the managed memory of the object itself is not released there; that only happens when the GC triggers a collection.
To ensure that Dispose() is always called, it is important to put the call inside the finally of a try. Better yet, use the using statement, which creates a scope in which the object holds the unmanaged resource and guarantees that the method is called at the end of that scope.
If you create a disposable object without a using, and without a guaranteed manual call to the method, there will be a memory leak.
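As an illustration, here is a minimal sketch of the two equivalent forms (the file name dados.txt and its contents exist only for this example):

```csharp
using System;
using System.IO;

class Program
{
    static void Main()
    {
        File.WriteAllText("dados.txt", "hello"); // just so the example is runnable

        // The using statement guarantees Dispose() runs even if an
        // exception is thrown inside the block.
        using (var reader = new StreamReader("dados.txt"))
        {
            Console.WriteLine(reader.ReadLine()); // hello
        }

        // It expands to roughly this:
        StreamReader reader2 = new StreamReader("dados.txt");
        try
        {
            Console.WriteLine(reader2.ReadLine()); // hello
        }
        finally
        {
            reader2.Dispose(); // always called, releasing the file handle
        }
    }
}
```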
There are many questions on the subject:
Whew, good thing I always do that and I don’t take risks
Not quite. There are objects for which you have to implement Dispose() yourself. It is true that this is not so common; in most applications you only consume these classes, since access to the resource is almost always done through an API in C or another unmanaged language. But if you have to write a class like this, you will have to release the resource in the appropriate way for the component you are using. The method does not release anything magically. Properly releasing unmanaged memory is not always intuitive, though there are simple cases where the operating system itself does it for you when you merely signal that you are done.
As I said, this pattern is just a flag that something should be done to throw away what is no longer needed.
One of the most common mistakes I see people make is creating a class to manage a connection. Someone taught it wrong one day and everyone repeats the mistake. First, in general this class does not help much; second, it holds a resource that should be disposed. The Disposable pattern is viral: if you use a disposable object inside another object, that object must become disposable too.
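A minimal sketch of how the pattern propagates (ConnectionHolder is a hypothetical class; it wraps a MemoryStream only to stand in for a real connection):

```csharp
using System;
using System.IO;

// Because this class owns a disposable resource, it must itself
// implement IDisposable: the pattern is "viral".
class ConnectionHolder : IDisposable
{
    private readonly MemoryStream _resource = new MemoryStream();
    private bool _disposed;

    public void Dispose()
    {
        if (_disposed) return;     // Dispose() must be safe to call twice
        _resource.Dispose();       // pass the disposal on to the inner resource
        _disposed = true;
        GC.SuppressFinalize(this); // no finalization work is needed after disposal
    }
}
```

A caller would then wrap ConnectionHolder itself in a using, and so on up the chain of ownership.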
And if I forget to use the using, do I keep the leak until the end of the application?
Maybe not. All .NET objects have a Finalize() method that ends up being called by the GC when the object is collected, and if it is written correctly it will eventually kill the leak. But we can still say there was a leak, because the object stayed alive longer than it should have.
Nothing guarantees the finalizer will ever run. It is even somewhat common for it never to be called, because the object is never collected. The managed object barely registers in memory, so it does not put pressure on the GC, but the unmanaged resource it refers to usually occupies a lot of space and causes damage. Not to mention that the resource may need to be closed so it can be used elsewhere, when access to it is exclusive, and that never happens.
Another problem with leaving the resource to be released during collection is that the collection takes longer.
It is rare, but code can improperly suppress finalization. Less rare is one finalizer preventing others from running.
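A sketch of the common safety-net arrangement, where the finalizer only covers for a forgotten Dispose() (NativeWrapper and its fake handle are illustrative, not a real API):

```csharp
using System;

class NativeWrapper : IDisposable
{
    private IntPtr _handle = new IntPtr(42); // stand-in for a real native handle
    private bool _disposed;

    public void Dispose()
    {
        ReleaseHandle();
        GC.SuppressFinalize(this); // the safety net is no longer needed
    }

    // The finalizer runs (maybe) only when the GC collects the object,
    // much later than a direct call to Dispose() would.
    ~NativeWrapper() => ReleaseHandle();

    private void ReleaseHandle()
    {
        if (_disposed) return;
        _handle = IntPtr.Zero; // a real class would call the native free function here
        _disposed = true;
    }
}
```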
Referenced objects
Eventually some object, especially a large one, ends up being referenced by another object that lives longer than the first one needs to survive. Some people consider this a leak.
You may have put a reference to it in a static object, which will probably live for the whole application. Or it is referenced by an object that is passed around everywhere in the application without need. Or by an unmanaged object that is never released.
Some .NET components and third-party libraries have situations like this, either because they were badly made or because there was no way to do it properly. It is rare, but it exists, and you have to take extra care.
Another common mistake is the programmer assigning null to end a reference. Almost every time you do this there is something wrong with the code. Of course, there are cases where the semantics of the null object are valid.
Events
A typical case is the use of events. For those who do not know, an event is an implementation of the observer pattern. In it, the observed object keeps a reference to a delegate of another object that wishes to be notified. If the observed object survives longer than the observer needs to, it holds the observer alive unnecessarily.
var obj = new Classe();
// ...
obj2.Event += obj.objEventHandler;
// ...
obj = null; // should free obj for the GC to collect, but obj2 still holds a reference to it
This can be solved by removing the subscription when the observer no longer needs to stay alive. It is rare for the lifetimes of these objects to differ, but it is a possibility, so pay attention to it. Something like this should be done:
obj2.Event -= obj.objEventHandler;
I put it on GitHub for future reference.
Closures
If you create a closure, that is, a delegate that makes use of a variable created outside its scope, this variable will be allocated on the heap (the compiler creates a class to hold the captured variable or variables), and it will only lose the reference when the delegate ceases to exist. That may take longer than it should, or may never happen.
In a way it is the same problem as having a reference in any object, but many people do not realize the variable will be "promoted" to the heap and that the delegate may survive longer than you initially think.
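A minimal sketch of the promotion (MakeCounter is a hypothetical method):

```csharp
using System;

class Program
{
    public static Func<int> MakeCounter()
    {
        int count = 0; // captured by the lambda, so the compiler lifts it
                       // into a generated class on the heap
        return () => ++count;
    }

    static void Main()
    {
        Func<int> counter = MakeCounter();
        Console.WriteLine(counter()); // 1
        Console.WriteLine(counter()); // 2
        // The heap object holding "count" lives as long as "counter" does.
    }
}
```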
Concatenation of collections
If I have to concatenate many strings, each step in the concatenation generates a new object and the previous one is discarded. This is not exactly a classic memory leak, but it still produces a lot of garbage. In this case it is better to use a StringBuilder.
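A small sketch contrasting naive concatenation with StringBuilder:

```csharp
using System;
using System.Text;

class Program
{
    static void Main()
    {
        // Naive concatenation: each += allocates a brand new string
        // and turns the previous one into garbage.
        string s = "";
        for (int i = 0; i < 5; i++)
            s += i;

        // StringBuilder reuses an internal buffer and allocates far less.
        var sb = new StringBuilder();
        for (int i = 0; i < 5; i++)
            sb.Append(i);

        Console.WriteLine(s);             // 01234
        Console.WriteLine(sb.ToString()); // 01234
    }
}
```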
The same can occur with a List and other collections, but the problem is slightly different from what happens with string. These collections were made to receive new items: the collection does not allocate a new internal array on each addition, but it does allocate a new one logarithmically often, and those can be large objects to discard. The ideal is to avoid this by reserving, up front, at least something close to the space you expect to consume (that reference is about Java, but .NET is similar).
If we reserve too much space, we can consider that a bit of a leak too; after all, something allocated and unused is not good either, so it should be minimized.
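A sketch of reserving capacity up front (the sizes are arbitrary):

```csharp
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        // Without a capacity hint, the internal array is reallocated and
        // copied every time it fills up (4, 8, 16, 32, ... elements),
        // leaving the old arrays as garbage.
        var grown = new List<int>();
        for (int i = 0; i < 1000; i++) grown.Add(i);

        // Reserving the expected size avoids all the intermediate arrays.
        var reserved = new List<int>(1000);
        for (int i = 0; i < 1000; i++) reserved.Add(i);

        Console.WriteLine(grown.Count == reserved.Count); // True
    }
}
```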
The same occurs with any immutable object that undergoes many or large changes, or changes at certain moments.
These are not leaks per se, but they behave almost as if they were, and I think it is important to understand why this kind of use can cause more problems than some real leaks.
Allocating on the heap needlessly
This is also not a memory leak per se, but creating an object on the heap (perhaps through improper boxing) allocates something that did not even need to be there and therefore would not need to be collected. It is a temporary leak, but it can be bad if done in quantity, and it can put undue pressure on the GC.
Today there are several ways to avoid this; one of them is Span.
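A minimal sketch of slicing with Span to avoid a heap allocation (the numbers are arbitrary):

```csharp
using System;

class Program
{
    static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5 };

        // Span<int> is a view over the existing array: slicing it
        // allocates nothing new on the heap.
        Span<int> middle = numbers.AsSpan(1, 3); // elements 2, 3, 4

        int sum = 0;
        foreach (int n in middle) sum += n;
        Console.WriteLine(sum); // 9

        // Contrast: the classic approach copies into a brand new array
        // that the GC will later have to collect.
        int[] copy = new int[3];
        Array.Copy(numbers, 1, copy, 0, 3);
    }
}
```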
Cache
A cache should be accessed through a reference that dies very quickly, or through a weak reference, which allows the release of the referenced object. If that does not happen, the cache can hold an object improperly, occupying not only memory in general but also cache space that could hold something more useful.
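A minimal sketch of a weak reference, assuming the cached value is a simple byte array:

```csharp
using System;

class Program
{
    static void Main()
    {
        var data = new byte[1024];

        // A weak reference does not keep the target alive: the GC is
        // free to collect it once no strong reference remains.
        var weak = new WeakReference<byte[]>(data);

        if (weak.TryGetTarget(out byte[] cached))
            Console.WriteLine(cached.Length); // 1024, data is still strongly referenced

        data = null; // drop the strong reference; after a later collection,
                     // TryGetTarget may start returning false
    }
}
```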
Frameworks
WPF and WinForms can leak memory in data Binding, in the TextBox undo buffer, in EventHandlers and elsewhere.
This is not a complete list, I have only cited examples; several frameworks of various kinds leak memory depending on how you use them, so reading the documentation completely and carefully is necessary before coding. Working fast cannot be an excuse to skip that requirement.
Thread
If a thread is not shut down when it should be, or if it enters a deadlock, not only can the objects referenced inside it live longer than they should, but the memory of the thread itself stays alive (its stack alone can take 1 MB and remain entirely reserved, even if nothing is allocated in it).
A deadlock can happen for various reasons, including forgetting to call Monitor.Exit().
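A sketch of why the lock statement is safer than a bare Monitor.Enter (the shared object _sync is illustrative):

```csharp
using System;
using System.Threading;

class Program
{
    static readonly object _sync = new object();

    static void Main()
    {
        // Calling Monitor.Enter without a guaranteed Monitor.Exit risks
        // leaving the lock held forever if an exception is thrown.
        Monitor.Enter(_sync);
        try
        {
            Console.WriteLine("critical section");
        }
        finally
        {
            Monitor.Exit(_sync); // always released
        }

        // The lock statement compiles to the same Enter/try/finally/Exit,
        // so the Exit can never be forgotten.
        lock (_sync)
        {
            Console.WriteLine("critical section");
        }
    }
}
```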
Dynamically generated code
It is rare to do this, but if you generate a lot of code at runtime (and there are many ways to do it), it is likely that all the memory needed for its creation will never be released, even when you no longer need that code.
Fragmentation of the LOH
The Large Object Heap (see below) is not compacted like the other generations. If the new objects reusing the holes left behind are much smaller than what occupied them before, there will be huge waste. In practice it does not hurt that much, because when there are several smaller holes it is often possible to fit two or more new blocks in the place of an old one. This is a rare concern and not exactly a memory leak, but a waste that can be compromising in extreme cases.
Extreme care
I recently learned a "good practice" :D for applications that run with a generational garbage collector: objects should die young or survive forever.
These memory managers have a small generation 0 (something like 256 KB) to ensure it is collected very quickly (on the order of microseconds). Obviously it tends to fill relatively fast (not that fast: I have seen statistics showing that most applications have objects averaging 35 bytes, since large ones do not enter these generations), and when it fills, a collection is triggered that copies every object that still has references to generation 1. Ideally no object would be copied. Of course that is almost impossible, but we should try to get close.
The more objects are copied to Gen 1, the faster it fills up. It is also fairly small (around 2 MB). It is kept small so collection is fast (1 or 2 ms) and does not cause a noticeable pause. When it fills, everything that must survive is copied to generation 2. Again, the ideal is to copy as little as possible.
Gen 2 has no theoretical size limit, although of course the runtime tries to keep everything in RAM. Even virtual memory has its limit: in 32 bits it is 4 GB, and in 64 bits it is 16 exabytes (machines that manage to pass 1 TB are rare today, which is 7 orders of magnitude below the maximum allowed; or, if you prefer, roughly all the RAM available in all the computers in the world today).
It takes a long time, but if this last generation fills up, the pause is potentially long. It is true that there are techniques to minimize this, such as running part of the process concurrently, and full collections should be rarer anyway. So, ideally, objects rarely reach this generation.
If many objects are reaching Gen 2, it would be better to create an object pool, preferably large enough to fall into the LOH and never be copied or collected; for example, you can use Memory.
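One option in .NET is the shared ArrayPool; a minimal sketch (the buffer size is arbitrary):

```csharp
using System;
using System.Buffers;

class Program
{
    static void Main()
    {
        // ArrayPool rents reusable buffers instead of allocating a new
        // array per operation, keeping pressure off the GC.
        byte[] buffer = ArrayPool<byte>.Shared.Rent(100_000); // may return a larger array
        try
        {
            Console.WriteLine(buffer.Length >= 100_000); // True
        }
        finally
        {
            // Forgetting to return the buffer turns the pool itself
            // into a source of leaks.
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}
```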
Objects over 85,000 bytes are allocated in the LOH (Large Object Heap) and are only collected together with Gen 2. In fact the LOH is not compacted in the sense of removing memory fragments; collection just releases what is no longer being used. Because the objects are large, the holes are not the performance problem they would be with small objects. The holes go into a free list to be reused by new objects that fit there, more or less like memory managed by the operating system.
Of course not every kind of application needs all this concern; most never give the user the perception that there are pauses.
And most importantly, like any good practice, you have to use it where it makes sense and is useful, and only after understanding all the implications. Creating an object pool can itself cause memory leaks.
What does this have to do with memory leaks?
Not that this is exactly a classic memory leak, but this pattern of memory use can decrease the pressure on the garbage collector. Letting an object survive longer than necessary, even if it eventually gets collected, is still a temporary leak.
How to detect
Each problem requires a specific technique to detect. What helps most are memory profilers. I have already answered something about this and listed some profilers.
I found an interesting technique for testing.
Article on Code Project.
Tips from .NET Memory Profiler.
Question on the subject on SO.
Specific questions may be asked.
WPF, in practice, abuses the Observer pattern (INotifyPropertyChanged) for data binding, so it must suffer from this. Or am I wrong? – vinibrsl
I don't know the specific implementation well enough to say, but there is a good chance. Of course, maybe they thought about it and found a solution. – Maniero