Tuesday, May 03, 2005

Kit George, Program Manager on the Common Language Runtime (CLR) team in charge of the base library, stood up as our first presenter.  He jumped right into some demos to show some v2 improvements.  (All the demos are available on the Base Class Library (BCL) website.)

  • The first demo introduced us to some memory management (Garbage Collection) concerning unmanaged resources.
  • The second demo showed the improvement in performance between Try{Parse}Catch and TryParse.
  • The third app showed us how the BCL now supports the serial ports.  It also showed us color in console applications and generic collections.
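
The second demo's comparison can be sketched roughly as follows. This is not Kit's demo code; the class and method names are hypothetical, and the sketch only shows the two styles side by side: catching the exception that int.Parse throws on bad input versus letting int.TryParse report failure through its return value.

```csharp
using System;

static class ParseDemo
{
    // Exception-based parsing: bad input costs a thrown and caught exception.
    public static int ParseWithCatch(string text, int fallback)
    {
        try
        {
            return int.Parse(text);
        }
        catch (FormatException)
        {
            return fallback;
        }
        catch (OverflowException)
        {
            return fallback;
        }
    }

    // TryParse reports failure through its return value, so nothing is thrown.
    public static int ParseWithTryParse(string text, int fallback)
    {
        int value;
        return int.TryParse(text, out value) ? value : fallback;
    }

    static void Main()
    {
        Console.WriteLine(ParseWithCatch("not a number", -1));    // -1
        Console.WriteLine(ParseWithTryParse("not a number", -1)); // -1
        Console.WriteLine(ParseWithTryParse("42", -1));           // 42
    }
}
```

Both methods return the same results; the difference is that the TryParse version never pays for an exception on the failure path, which is where the performance gap comes from when bad input is common.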

Brad Abrams, a founding member of the .NET Framework team, was here to present the deep internals of the CLR and basically show that he knew everything about the CLR and compilers.  He strongly warned us that his presentation was not going to cover knowledge required to be a programmer and that everything he showed us was going to change as the CLR evolved.  After a rapid review of the process of creating an executable, we created a quick “Hello World” application and looked at the Intermediate Language (IL) code thanks to ILDASM.  Looking at the IL we were able to see which version of the CLR the exe was compiled with, the assembly hash information, and the entry point information.  We even saw that by modifying the IL entry point and recompiling via ILASM, the behavior of the program could be changed (not that there is ever any practical use for this information).
We learned that .NET assemblies are made up of the Manifest (external data) and the compilable IL plus its metadata and required resources.  IL is the actual language for execution.  It is independent of the CPU and platform, making it capable of running as both 32-bit and 64-bit applications without change.  IL is also a stack-based language, meaning that every operation runs on a stack, pushing and popping values.  IL bakes type safety into the CLR through a process called verification.  This helps prevent mixing incompatible types and bad array handling.
Brad covered the startup logic, showing how the CLR uses mscoree.dll to check version numbers and run the correct CLR version, allowing multiple .NET Frameworks to be on the same machine and code designed for each to run successfully.  We looked at what happens in memory when an object is allocated and its methods are called.  This included understanding the memory hex addresses (pre-stub dispatcher addresses and compiled method addresses), JIT compilations, and more.
We looked at the Just In Time (JIT) Compiler optimizations, including register allocation, loop unrolling, dead code elimination, constant and copy propagation, and processor-specific code generation.  A new JIT optimization in v2.0 is range check elimination: not repeating range checking on arrays when the array is reused in an identical or safer manner (resulting in faster JIT-compiled code).  Brad’s favorite runtime service (I’m guessing b/c this is the one he covered) is the Garbage Collector.  Brad wanted to point out that a better way to think of the GC was to consider it a stuff collector: it copies everything that still has references and anything left behind is zeroed out.  Brad then introduced Claudio, who would cover the GC in more detail.
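
To make the range check elimination point a little more concrete, here is a minimal sketch of the kind of loop it is usually described for. The class and method names are hypothetical, and whether the JIT actually drops the check in a given case is an assumption about the runtime, not something this code proves.

```csharp
using System;

static class RangeCheckDemo
{
    // The loop bound is data.Length itself, so every index is provably in range.
    // This is the loop shape the JIT can typically compile without a per-access
    // bounds check (an assumption about JIT behavior, not verified here).
    public static long SumWithLengthBound(int[] data)
    {
        long total = 0;
        for (int i = 0; i < data.Length; i++)
        {
            total += data[i];
        }
        return total;
    }

    // Same result, but the bound is a separate variable that the JIT may not be
    // able to relate to data.Length, so the check on data[i] may stay in place.
    public static long SumWithSeparateBound(int[] data, int count)
    {
        long total = 0;
        for (int i = 0; i < count; i++)
        {
            total += data[i];
        }
        return total;
    }

    static void Main()
    {
        int[] data = { 1, 2, 3, 4 };
        Console.WriteLine(SumWithLengthBound(data));      // 10
        Console.WriteLine(SumWithSeparateBound(data, 4)); // 10
    }
}
```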

Claudio Caldato presented how to write faster managed code.  When talking about application performance, the following points are important to keep in mind.  You need to set realistic goals and evaluate them continuously so they’re not too optimistic or conservative.  You need to measure frequently.  Know your platform to know where to make changes (memory, processor, etc).  Performance improvements are not a one-shot deal: make continuous improvements (automated tests help).  Finally, build a performance culture, where the developers understand that performance is part of their development process.
Claudio talked about garbage collection in action: keeping the objects that still have references and compacting memory to prevent fragmentation.  Garbage collection works in a generational cycle.  Gen 0 is collected every time.  Gen 1 is collected only when Gen 0 doesn’t have enough memory on the heap to create all the required variables.  The GC will pause threads, trace reachable objects from roots, compact live objects and recycle dead memory.  Each generation is its own heap, so there are three Gen heaps and a large object heap (separated b/c moving large objects is expensive).
To improve performance, make sure you understand the lifetime of your objects, and don’t let them live to Gen 2 if you can help it.  Use perfmon and the CLRProfiler to examine your allocation profile.  One common pitfall is keeping references to dead objects (be sure to null out object references).  Another is implicit boxing (converting value types to reference types), which happens without your knowing it when you do something like create an ArrayList of ints.  Don’t call GC.Collect if you can help it.  Because the finalizer ~C() is not deterministic (you don’t know when the memory is cleared) you should try to use the Dispose pattern: implement IDisposable and call GC.SuppressFinalize.  In C#, callers get this as simply as the using keyword.
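
A minimal sketch of the Dispose pattern described above, assuming a made-up LogBuffer class that stands in for any type holding a resource. It is an illustration of the usual shape, not code from Claudio's talk.

```csharp
using System;

// Hypothetical resource holder, used only for illustration.
class LogBuffer : IDisposable
{
    private bool disposed;

    public bool IsDisposed { get { return disposed; } }

    public void Dispose()
    {
        Dispose(true);
        // Cleanup already ran, so the finalizer no longer needs to.
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return;
        // Release unmanaged resources here; if disposing is true,
        // also dispose any managed objects this instance owns.
        disposed = true;
    }

    // Finalizer: a safety net that runs at some unspecified later time
    // if Dispose was never called.
    ~LogBuffer()
    {
        Dispose(false);
    }
}

static class DisposeDemo
{
    public static bool UseAndCheck()
    {
        LogBuffer buffer = new LogBuffer();
        using (buffer)
        {
            // Work with the buffer here.
        }
        // The using block called Dispose() on exit, even if an exception had been thrown.
        return buffer.IsDisposed;
    }

    static void Main()
    {
        Console.WriteLine(UseAndCheck()); // True
    }
}
```

The using block gives the deterministic cleanup; the finalizer is only there for callers who forget, and GC.SuppressFinalize spares the object the extra finalization pass once Dispose has already run.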
Claudio showed us what he was talking about with some demos, using the CLRProfiler (available online for free).  The demo also showed us explicitly how much repeated String concatenation costs vs the StringBuilder.Append method when modifying the string.  The difference was astonishingly huge.  The next demo showed the difference between the Finalize method and the using statement (which calls IDisposable.Dispose).  The point of this demo was to show how much memory could be saved by having deterministic cleanup.
We talked about Reflection a bit, noting that the new Token/Handle resolution APIs are very fast.  An example would be calling MemberInfo on an object (a costly call), retrieving a token for that member, and then calling the Token API for future references to that object.  COM Interop is efficient but frequent calls add up.  This is because marshaling can be expensive.  Primitive types and arrays of primitive types are cheap, but Unicode to ANSI string conversions are not.  To diagnose some of these problems, you can use some of the built-in CLR Interop counters in Perfmon.  You can mitigate Interop call costs by batching calls (chunky, not chatty) or moving the boundary (helpful if you control both the managed and unmanaged code).
Finally, some deployment considerations are to reduce the number of assemblies to speed load times, use the GAC to prevent repetitive signature verification, and use native code (NGEN) to reduce startup time and improve code shareability.  Also, keep in mind that XML is not always the answer: System.Xml.dll is 2 MB and might be overkill if you’re storing extremely simple data.
When working with Performance Counters, start by looking at how much time is being spent in GC.  20% is too much.  Next you’ll want to look at the promotion rate.  If Gen2/(Gen0+Gen1) > 1/20 then look at the objects’ lifecycles.  Next you’ll want to check the byte allocation to ensure you’re not above 30 MB/s under stress; anything higher indicates too many objects.  When using the CLRProfiler, examine the total allocation, the reallocated memory, the allocation graph, and the time line.
To improve startup time, reduce the number of DLLs loaded at startup (your own app DLLs and .NET platform DLLs) and NGEN your assemblies to avoid JIT costs.  Place strong-named assemblies in the GAC.  Finally, check your own application logic to see if your own computations are the bottleneck.
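
For anyone who wants to try the string demo from the CLRProfiler session at home, here is a rough sketch of the two approaches being compared. The class and method names are hypothetical and this is not Claudio's demo code; it only shows why one allocates so much more than the other.

```csharp
using System;
using System.Text;

static class StringBuildDemo
{
    // Each += allocates a brand-new string and copies the old contents into it,
    // so n appends produce n short-lived strings for the GC to clean up.
    public static string JoinWithConcat(int count)
    {
        string result = "";
        for (int i = 0; i < count; i++)
        {
            result += i.ToString();
        }
        return result;
    }

    // StringBuilder appends into a growable internal buffer and creates
    // the final string once, in ToString().
    public static string JoinWithBuilder(int count)
    {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            builder.Append(i);
        }
        return builder.ToString();
    }

    static void Main()
    {
        Console.WriteLine(JoinWithConcat(5));  // 01234
        Console.WriteLine(JoinWithBuilder(5)); // 01234
    }
}
```

Running both with a large count under the CLRProfiler should make the extra allocations of the first version visible in the allocation graph.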

These guys are seriously smart.  They’re so smart that just writing about them makes ME sound smart!  Well, they’re product managers so you have to keep that in mind, but even so… I think they might each have more than one brain.  Be sure to check out their blogs and books. 

— Matt Ranlett
5/3/2005 10:00:34 AM (Eastern Standard Time, UTC-05:00)