Monday, October 18, 2004

How to improve garbage collection performance

.NET developers are freed from tedious manual memory management because the .NET Framework and the CLR handle it automatically. The CLR provides a mechanism called garbage collection, which manages your application's memory. In this session we will discuss how the garbage collector works and how it affects the performance of your applications.

When you create an object using the new operator, the object's memory is obtained from the managed heap. When the garbage collector decides that sufficient garbage has accumulated, it performs a collection to free some memory. This process is fully automatic, but there are a number of factors that you need to be aware of that can make the process more or less efficient.
Garbage Collection Algorithm
Each application has a set of roots. Roots identify storage locations that either refer to objects on the managed heap or are set to null. For example, all global and static object pointers are considered application roots. In addition, any local variable or parameter object pointers on a thread's stack are considered part of the application's roots. The list of roots is maintained by the JIT compiler and the CLR, and is made available to the garbage collector at collection time.
When the garbage collector starts running, it assumes that all objects in the heap are garbage. It then walks the roots and builds a graph of all objects reachable from them. If the GC attempts to add an object that is already present in the graph, it stops walking down that path. This serves two purposes. First, it helps performance significantly, since the collector doesn't walk through a set of objects more than once. Second, it prevents infinite loops should you have any circular linked lists of objects, so cycles are handled properly. Once all the roots have been checked, the garbage collector's graph contains the set of all objects that are somehow reachable from the application's roots. Any objects that are not in the graph are not accessible by the application and are considered garbage.
The garbage collector now walks through the heap linearly, looking for contiguous blocks of garbage objects (now considered free space). It then shifts the non-garbage objects down in memory, removing all of the gaps in the heap. Moving the objects in memory invalidates all pointers to them, so the garbage collector modifies the application's roots so that the pointers point to the objects' new locations. In addition, if any object contains a pointer to another object, the garbage collector is responsible for correcting these pointers as well.
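The root-walking step described above can be sketched in C#. This is only a toy model, not the CLR's implementation: the Node class and the MarkPhaseSketch helper are hypothetical names invented for illustration, and the real collector works on raw heap memory rather than on a list of objects.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a heap object (assumption: the real CLR
// does not expose its object graph like this).
public sealed class Node
{
    public readonly string Name;
    public readonly List<Node> References = new List<Node>();
    public Node(string name) { Name = name; }
}

public static class MarkPhaseSketch
{
    // Builds the set of all nodes reachable from the given roots.
    public static HashSet<Node> Mark(IEnumerable<Node> roots)
    {
        var reachable = new HashSet<Node>();
        var pending = new Stack<Node>(roots);
        while (pending.Count > 0)
        {
            Node current = pending.Pop();
            // Add returns false when the node is already in the graph,
            // so each object is walked once and cycles terminate.
            if (!reachable.Add(current)) continue;
            foreach (Node next in current.References) pending.Push(next);
        }
        return reachable;
    }

    // Anything on the (simulated) heap that was not marked is garbage.
    public static List<Node> FindGarbage(IEnumerable<Node> heap, IEnumerable<Node> roots)
    {
        HashSet<Node> reachable = Mark(roots);
        var garbage = new List<Node>();
        foreach (Node n in heap)
        {
            if (!reachable.Contains(n)) garbage.Add(n);
        }
        return garbage;
    }
}
```

Calling FindGarbage with a heap in which two nodes reference each other, but only one of them is a root, reports neither as garbage; an unrooted pair of nodes is reported in full, even if the two reference each other.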
One feature of the garbage collector that exists purely to improve performance is called generations. A generational garbage collector takes into account two facts that have been empirically observed in most programs in a variety of languages. First, newly created objects tend to have short lives. Second, the older an object is, the longer it is likely to survive.
Generational collectors group objects by age and collect younger objects more often than older objects. When initialized, the managed heap contains no objects. All new objects added to the heap are in generation 0 until the heap fills up, which invokes a garbage collection. As most objects are short-lived, only a small percentage of young objects are likely to survive their first collection. Once an object survives its first garbage collection, it is promoted to generation 1. Objects created after that collection are again in generation 0. The garbage collector is invoked next only when the generation 0 sub-heap fills up. All objects in generation 1 that survive are compacted and promoted to generation 2. All survivors in generation 0 are also compacted and promoted to generation 1. Generation 0 then contains no objects, and all objects created after the collection go into generation 0.
Dividing the heap into generations and collecting and compacting the younger generations more often improves the efficiency of the basic garbage collection algorithm: it reclaims a significant amount of space from the heap and is faster than examining the objects in all generations.
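Promotion can be observed with GC.GetGeneration. The sketch below is a demonstration only: it forces collections with GC.Collect purely to make promotion visible, and the GenerationDemo class is a made-up name. On a typical desktop CLR the object moves from generation 0 to 1 and then to 2, but the exact values depend on the runtime and its configuration, so none are promised here.

```csharp
using System;

public static class GenerationDemo
{
    // Returns the generation of one object at allocation time and
    // after each of two forced collections.
    public static int[] ObserveGenerations()
    {
        object survivor = new object();
        int atAllocation = GC.GetGeneration(survivor);

        GC.Collect(); // demonstration only; normally let the runtime decide
        int afterFirst = GC.GetGeneration(survivor);

        GC.Collect();
        int afterSecond = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor); // keep the object reachable up to this point
        return new[] { atAllocation, afterFirst, afterSecond };
    }
}
```

The highest generation the runtime supports is available as GC.MaxGeneration.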
Garbage Collection Class (System.GC)
Garbage collection by the CLR is nondeterministic, and the developer normally has no control over when it runs. Microsoft therefore provides the System.GC class, which you can use to force a garbage collection in your application. The main methods of the System.GC class are as follows.
System.GC.Collect: This method forces a garbage collection. You should generally avoid this and let the runtime determine the appropriate time to perform a collection. The main reason that you might be tempted to call this method is that you cannot see memory being freed that you expect to see freed. However, the main reason that this occurs is because you are inadvertently holding on to one or more objects that are no longer needed. In this case, forcing a collection does not help.
System.GC.WaitForPendingFinalizers: This suspends the current thread until the finalization thread has emptied the finalization queue. Generally, this method is called immediately after System.GC.Collect to ensure that the current thread waits until finalizers for all objects are called. However, because you should not call GC.Collect, you should not need to call GC.WaitForPendingFinalizers.
System.GC.KeepAlive: This is used to prevent an object from being prematurely collected by holding a reference to the object. A common scenario is when there are no references to an object in managed code but the object is still in use in unmanaged code.
System.GC.SuppressFinalize: This prevents the finalizer from being called for a specified object. Use this method when you implement the dispose pattern. If you have already released resources explicitly because the client called your object's Dispose method, Dispose should call SuppressFinalize because finalization is no longer required.
The garbage collector offers an additional, optional service called finalization. Use finalization for objects that need to perform cleanup processing during the collection process, just before the object's memory is reclaimed. Finalization is most often used to release unmanaged resources maintained by an object; any other use should be closely examined. Examples of unmanaged resources include file handles, database connections, and COM object references. An object's Finalize method is called before the object's managed memory is reclaimed. This allows you to release any unmanaged resources that are maintained by the object. If you implement Finalize, you cannot control when the method is called, because this is left to the garbage collector. The finalization process requires a minimum of two collection cycles to fully release the object's memory. The potential existence of finalizers complicates the job of garbage collection in .NET by adding some extra steps before an object can be freed.
Whenever a new object that has a Finalize method is allocated on the heap, a pointer to the object is placed in an internal data structure called the finalization queue. When an object is no longer reachable, the garbage collector considers it garbage. The garbage collector then scans the finalization queue looking for pointers to these objects. When a pointer is found, it is removed from the finalization queue and appended to another internal data structure called the freachable queue, which makes the object reachable again, so it is no longer treated as garbage. At this point, the garbage collector has finished identifying garbage. It compacts the reclaimable memory, and a special runtime thread empties the freachable queue, executing each object's Finalize method. The next time the garbage collector is invoked, it sees that the finalized objects are truly garbage, and the memory for those objects is simply freed.
Avoid implementing a Finalize method unless it is required. Finalize methods increase memory pressure because the memory and resources used by the object cannot be released until two garbage collections have taken place. And since you have no control over the order in which Finalize methods are executed, relying on them may lead to unpredictable results.
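Why a finalizable object needs two collections can be shown with a small simulation. Everything in it is an assumption made for illustration: FinalizationModel, its string "objects", and its queues are a toy model of the description above, not the CLR's actual data structures.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of the finalization queue and the freachable queue.
public sealed class FinalizationModel
{
    private readonly HashSet<string> heap = new HashSet<string>();
    private readonly List<string> finalizationQueue = new List<string>();
    private readonly Queue<string> freachableQueue = new Queue<string>();

    public void Allocate(string name, bool hasFinalizer)
    {
        heap.Add(name);
        // Objects with a Finalize method are registered at allocation time.
        if (hasFinalizer) finalizationQueue.Add(name);
    }

    // One simulated collection. Unreachable finalizable objects move to the
    // freachable queue and survive; other unreachable objects are freed.
    public List<string> Collect(ICollection<string> reachable)
    {
        var freed = new List<string>();
        foreach (string name in new List<string>(heap))
        {
            if (reachable.Contains(name) || freachableQueue.Contains(name)) continue;
            if (finalizationQueue.Remove(name))
            {
                freachableQueue.Enqueue(name); // kept alive until finalized
            }
            else
            {
                heap.Remove(name);
                freed.Add(name);
            }
        }
        return freed;
    }

    // Models the finalizer thread draining the freachable queue.
    public void RunFinalizers()
    {
        while (freachableQueue.Count > 0)
        {
            freachableQueue.Dequeue(); // the object's Finalize would run here
        }
    }
}
```

In this model, an unreachable object without a finalizer is freed by the first Collect call, while a finalizable one is freed only by a second Collect call made after RunFinalizers.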
The .NET Framework provides the IDisposable interface, with its Dispose method, for types that contain references to external resources that need to be explicitly freed by the calling code. You can avoid finalization by implementing the IDisposable interface and by allowing your class's consumers to call Dispose. The reason you want to avoid finalization is that it is performed asynchronously, so unmanaged resources might not be freed in a timely fashion. This is especially important for large and expensive unmanaged resources such as bitmaps or database connections. In these cases, the classic style of explicitly releasing your resources is preferred (using the IDisposable interface and providing a Dispose method). With this approach, resources are reclaimed as soon as the consumer calls Dispose, and the object need not be queued for finalization. Statistically, what you want to see is that almost all of your finalizable objects are being disposed and not finalized. The finalizer should only be your backup. You release unmanaged resources in the IDisposable.Dispose method, which can be called explicitly by your class's consumers or implicitly through the C# using statement. To prevent the garbage collector from requesting finalization, your Dispose implementation should call GC.SuppressFinalize. Common disposable resources include the following:
Database-related classes: SqlConnection, SqlDataReader, and SqlTransaction.
File-based classes: FileStream and BinaryWriter.
Stream-based classes: StreamReader, TextReader, TextWriter, and BinaryReader.
Network-based classes: Socket, UdpClient, and TcpClient.

How to Implement Dispose method for a class

● Create a class that implements IDisposable.
● Add a private member variable to track whether IDisposable.Dispose has already been called. Clients should be allowed to call the method multiple times without generating an exception. If another method on the class is called after a call to Dispose, you should throw an ObjectDisposedException.
● Implement a protected virtual void overload of the Dispose method that accepts a single bool parameter. This method contains common cleanup code that is called either when the client explicitly calls IDisposable.Dispose or when the finalizer runs. The bool parameter indicates whether the cleanup is being performed as a result of a client call to IDisposable.Dispose or as a result of finalization.
● Implement the IDisposable.Dispose method that accepts no parameters. This method is called by clients to explicitly force the release of resources. Check whether Dispose has been called before; if it has not been called, call Dispose(true) and then prevent finalization by calling GC.SuppressFinalize(this). Finalization is no longer needed because the client has explicitly forced a release of resources.
● Create a finalizer by using destructor syntax. In the finalizer, call Dispose(false).
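Example (in VB.NET)
Public Class MyDispose
    Implements IDisposable

    ' Tracks whether Dispose has already been called
    Private disposed As Boolean = False

    Public Overloads Sub Dispose() Implements IDisposable.Dispose
        Dispose(True)
        GC.SuppressFinalize(Me) ' No need to call the finalizer
    End Sub

    Protected Overridable Overloads Sub Dispose(ByVal disposing As Boolean)
        If disposed Then Return ' Repeated calls do nothing
        If disposing Then
            ' Free managed resources
        End If
        ' Free unmanaged resources
        disposed = True
    End Sub

    Protected Overrides Sub Finalize()
        Try
            Dispose(False)
        Finally
            MyBase.Finalize()
        End Try
    End Sub

End Class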
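For readers working in C#, the same steps might look roughly like the sketch below. ResourceHolder and DoWork are hypothetical names used only for illustration; the comments mark where real cleanup code would go.

```csharp
using System;

// Hypothetical class illustrating the dispose pattern steps listed above.
public class ResourceHolder : IDisposable
{
    // Tracks whether Dispose has already been called.
    private bool disposed;

    // Called by clients to release resources explicitly.
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // finalization is no longer required
    }

    // Shared cleanup: disposing is true for a client call, false for finalization.
    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return; // repeated calls must not throw
        if (disposing)
        {
            // Free managed resources (for example, call Dispose on members).
        }
        // Free unmanaged resources.
        disposed = true;
    }

    // Finalizer (destructor syntax): the backup path if Dispose was never called.
    ~ResourceHolder()
    {
        Dispose(false);
    }

    // Any other member should refuse to run on a disposed object.
    public void DoWork()
    {
        if (disposed) throw new ObjectDisposedException("ResourceHolder");
        // ... use the resource ...
    }
}
```

Calling Dispose twice on such an object is harmless, while calling DoWork after Dispose throws ObjectDisposedException.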
Garbage Collection Guidelines:
1. Avoid Calling GC.Collect: The default GC.Collect method causes a full collection of all generations. Full collections are expensive because every live object in the system must be visited to ensure complete collection, and exhaustively visiting all live objects can take a significant amount of time. The garbage collector's algorithm is tuned so that it does full collections only when it is likely to be worth the expense of doing so. As a result, do not call GC.Collect directly; let the garbage collector determine when it needs to run. The garbage collector is designed to be self-tuning, and it adjusts its operation to meet the needs of your application based on memory pressure. Programmatically forcing collection can hinder tuning and operation of the garbage collector. If you have a particular niche scenario where you have to call GC.Collect, consider the following:
● Call GC.WaitForPendingFinalizers after you call GC.Collect. This ensures that the current thread waits until finalizers for all objects are called.
● After the finalizers run, there are more dead objects (those that were just finalized) that need to be collected. One more call to GC.Collect collects the remaining dead objects.
2. Call Close or Dispose on Classes that Support It: If the managed class you use implements Close or Dispose, call one of these methods as soon as you are finished with the object. Do not simply let the resource fall out of scope. If an object implements Close or Dispose, it does so because it holds an expensive, shared, native resource that should be released as soon as possible.
3. Call System.Runtime.InteropServices.Marshal.ReleaseComObject if you are using COM components: In server scenarios where you create and destroy COM objects on a per-request basis, you may need to call System.Runtime.InteropServices.Marshal.ReleaseComObject. The Runtime Callable Wrapper (RCW) has a reference count that is incremented every time a COM interface pointer is mapped to it. The ReleaseComObject method decrements the reference count of the RCW. When the reference count reaches zero, the runtime releases all its references on the unmanaged COM object.
4. Set Unneeded Member Variables to Null before Making Long-Running Calls: Before you block on a long-running call, explicitly set any unneeded member variables to null so that the objects they refer to can be collected. This advice applies to any objects that are still statically or lexically reachable but are actually not needed. Do not set local variables to null (C#) or Nothing (Visual Basic .NET), because the JIT compiler can statically determine that the variable is no longer referenced, so there is no need to explicitly set it to null.
5. Prevent the Promotion of Short-Lived Objects: Objects that are allocated and then collected while still in generation 0 are referred to as short-lived objects. The following principles help ensure that your short-lived objects are not promoted:
● Do not reference short-lived objects from long-lived objects. A common example where this occurs is when you assign a local object to a class level object reference.
● Avoid implementing a Finalize method. The garbage collector must promote finalizable objects to older generations to facilitate finalization, which makes them long-lived objects.
● Avoid having finalizable objects refer to anything. This can cause the referenced object(s) to become long-lived.
6. Minimize Hidden Allocations: Memory allocation is extremely quick because it involves only a pointer relocation to create space for the new object. However, the memory has to be garbage collected at some point and that can hurt performance, so be aware of apparently simple lines of code that actually result in many allocations. For example, String.Split uses a delimiter to create an array of strings from a source string. In doing so, String.Split allocates a new string object for each string that it has split out, plus one object for the array. As a result, using String.Split in a heavy-duty context (such as a sorting routine) can be expensive. Similarly, when you build up a string piece by piece, use the StringBuilder class rather than repeated String concatenation.
7. Use the using Statement in C# and Try/Finally Blocks in Visual Basic .NET to Ensure Dispose Is Called: If you are using C#, use the using statement. At compile time it automatically generates a try/finally block that calls Dispose on the object allocated in the using statement. In Visual Basic .NET, call Dispose from the Finally block of a Try/Finally statement.
8. Do Not Implement Finalize Unless Required: Implementing a finalizer on classes that do not require it adds load to the finalizer thread and the garbage collector. Avoid implementing a finalizer or destructor unless finalization is required. Classes with finalizers require a minimum of two garbage collection cycles to be reclaimed. This prolongs the use of memory and can contribute to memory pressure. When the garbage collector encounters an unused object that requires finalization, it moves it to the “ready-to-be-finalized” list. Cleanup of the object’s memory is deferred until after the single specialized finalizer thread can execute the registered finalizer method on the object. After the finalizer runs, the object is removed from the queue and literally dies a second death. At that point, it is collected along with any other objects. If your class does not require finalization, do not implement a Finalize method. Use a finalizer only on objects that hold unmanaged resources across client calls. You should implement IDisposable if you implement a finalizer. In this way, the calling code has an explicit way to free resources by calling the Dispose method.
9. Call Dispose On Base Classes and On IDisposable Members: If your class inherits from a disposable class, then make sure that it calls the base class’s Dispose. Also, if you have any member variables that implement IDisposable, call Dispose on them, too.
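A few of the guidelines above can be sketched in C#. The GuidelineSketches class and its method names are hypothetical; the sketch only calls framework members named in this post (GC.Collect, GC.WaitForPendingFinalizers, StringBuilder) plus StringReader, which stands in for any disposable resource.

```csharp
using System;
using System.IO;
using System.Text;

public static class GuidelineSketches
{
    // Guideline 1: the collect / wait / collect sequence, for niche cases only.
    public static void CollectFully()
    {
        GC.Collect();                  // finalizable garbage is queued for finalization
        GC.WaitForPendingFinalizers(); // wait for the finalizer thread to drain its queue
        GC.Collect();                  // reclaim the objects that were just finalized
    }

    // Guideline 6: append into one builder instead of creating a new
    // string object for every concatenation in the loop.
    public static string JoinWithBuilder(string[] parts)
    {
        var builder = new StringBuilder();
        foreach (string part in parts)
        {
            builder.Append(part).Append(',');
        }
        return builder.ToString();
    }

    // Guideline 7: Dispose is called when the using block exits,
    // even if an exception is thrown inside it.
    public static string FirstLineWithUsing(string text)
    {
        using (var reader = new StringReader(text))
        {
            return reader.ReadLine();
        }
    }

    // Guideline 7: roughly what the compiler generates for the method above.
    public static string FirstLineWithTryFinally(string text)
    {
        StringReader reader = new StringReader(text);
        try
        {
            return reader.ReadLine();
        }
        finally
        {
            if (reader != null) ((IDisposable)reader).Dispose();
        }
    }
}
```

The two FirstLine methods behave the same way; the using form is simply shorter and harder to get wrong.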

Thursday, October 07, 2004

Native Image Generator (Ngen.exe)

Using the Native Image Generator (Ngen.exe) tool, an assembly can be converted into native code, known as a native image. A native image is a file containing compiled processor-specific machine code. Calls into the native image start faster because the need for JIT compilation has been eliminated. When you run the Ngen.exe .NET Framework command-line utility on an assembly, the native image is generated and installed in the native image cache, and subsequent calls to methods of that assembly are handled by the native image. For example, entering ngen C:\post.dll at the command prompt, where post.dll is a managed assembly, creates the native image of that assembly.

Note that the native image that Ngen.exe generates cannot be shared across application domains. Therefore, you cannot use Ngen.exe in application scenarios, such as ASP.NET, that require assemblies to be shared across application domains.

If Ngen.exe encounters any methods in an assembly that it cannot generate, it excludes them from the native image. When the runtime executes this assembly, it reverts to JIT compilation for the methods that were not included in the native image.

The following command generates a native image for drnotes.exe, located in the current directory. If a configuration file exists for the application, Ngen.exe will use it. The tool will not generate native images for any DLLs that drnotes.exe references.

ngen drnotes.exe

If drnotes.exe directly references two DLLs, drnotes1.dll and drnotes2.dll, you must supply Ngen.exe with the fully specified assembly names for these DLLs to generate native images for them. Run Ildasm.exe over drnotes.exe to determine the fully specified assembly names of the referenced DLLs. For the purpose of this example, the fully specified assembly names of drnotes1.dll and drnotes2.dll are "drnotes1, Version=, Culture=neutral, PublicKeyToken=0038abc9deabfle5" and "drnotes2, Version=, Culture=neutral, PublicKeyToken=0038abc9deabfle5". Using this information, the following command generates native images for drnotes.exe, drnotes1.dll, and drnotes2.dll.

ngen drnotes.exe "drnotes1, Version=, Culture=neutral, PublicKeyToken=0038abc9deabfle5", "drnotes2, Version=, Culture=neutral, PublicKeyToken=0038abc9deabfle5"

The following command generates a native image for myAssembly.exe, with the specified path.

ngen c:\myfiles\myAssembly.exe

The following command generates a native image for myLibrary.dll, with the specified path.

ngen c:\myfiles\myLibrary.dll

Ngen.exe looks in the native image cache to delete an assembly specified with a partial assembly name. The following command deletes all native images with the name myAssembly.

ngen /delete myAssembly

The following command deletes the native image myAssembly with the fully specified assembly name.

ngen /delete "myAssembly, Version=, Culture=neutral, PublicKeyToken=0038abc9deabfle5"

The following command displays all native images in the native image cache.

ngen /show

The following command displays all native images in the native image cache with the name myAssembly.

ngen /show myAssembly

The following command displays all native images in the native image cache with the name myAssembly and the version 1.0.

ngen /show "myAssembly, version="