Girish Jain on .Net Framework

I will be writing about my favorite technology which is Microsoft .Net Framework and how to use it to improve developer productivity

Asynchronous Programming in .Net with QnA

April 9, 2015 03:33 by author Girish Jain

 

In this post, I am going to talk about asynchronous programming in the Microsoft .Net Framework in the form of questions and answers. It should help readers understand the scenarios where asynchronous code is useful and how it benefits an application. I would love to hear your comments and questions, so feel free to use the comments area.

Please Note:

1. This article is not a tutorial on how to write asynchronous code; you need to use your Google skills for that. I am going to talk about some of the basic concepts of the Task-based Asynchronous Pattern (TAP) and try to answer questions which come to a developer's mind on the journey to learning asynchronous programming. I had these questions myself, so I found answers to them and am hoping they will help fellow developers.

2. I am talking about asynchronous code execution in CLR and hence, concepts discussed in here need to be seen in that context. For example, when talking about threads, I am referring to managed threads in CLR.

Overview

I am going to give a brief overview of asynchronous programming in .Net and then move on to the questions and answers. Support for asynchronous execution has been there in .Net since the early versions of the framework. Remember the IAsyncResult interface? It resulted in overly complex code in the first place, and then it was a nightmare to maintain or change. Even the simplest functional scenarios needed too much code. No wonder developers dreaded it.

Fast forward to .Net 4, where Microsoft introduced the Task type, and to .Net 4.5 (C# 5), which added the async and await keywords; together they greatly simplify writing asynchronous code. These new constructs made it super easy for developers to write methods which can spawn a background operation and later join with its result. All the heavy lifting is shifted to the compiler, which generates the code that makes this possible. I must say, it has been one of the best feature additions to the .Net compilers, C# (and I believe VB.Net too), an engineering marvel.

The Task-based Asynchronous Pattern (TAP) is based on the concept of a task, represented by the Task type in the System.Threading.Tasks namespace. A task represents an asynchronous operation which you can wait on for completion, cancel, or attach a continuation to, which executes when the operation is complete. It provides an object-oriented approach to writing asynchronous code. This frees developers from worrying about the semantics of the language or execution environment for executing an asynchronous operation, so they can focus on the functional aspects of the application. The core idea here is to enable developers to execute methods on a separate thread seamlessly.
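To make the three operations on a task (wait, cancel, continue) a little more concrete, here is a minimal sketch. The names TapSketch, SumAsync, RunAsync, and WasCancelledAsync are hypothetical and exist only for illustration; only Task, Task.Run, ContinueWith, and CancellationTokenSource come from the framework.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TapSketch
{
    // Hypothetical long-running operation: sums 1..n and honors cancellation.
    public static Task<long> SumAsync(int n, CancellationToken token)
    {
        return Task.Run(() =>
        {
            long total = 0;
            for (int i = 1; i <= n; i++)
            {
                token.ThrowIfCancellationRequested();
                total += i;
            }
            return total;
        }, token);
    }

    // Starts the operation, attaches a continuation, and waits for it with await.
    public static async Task<string> RunAsync()
    {
        using (var cts = new CancellationTokenSource())
        {
            Task<long> sum = SumAsync(1000, cts.Token);

            // Continuation: runs only if the operation ran to completion.
            Task<string> report = sum.ContinueWith(
                t => "Sum = " + t.Result,
                TaskContinuationOptions.OnlyOnRanToCompletion);

            return await report;
        }
    }

    // Requests cancellation up front and reports whether the task was cancelled.
    public static async Task<bool> WasCancelledAsync()
    {
        using (var cts = new CancellationTokenSource())
        {
            cts.Cancel();
            try
            {
                await SumAsync(1000, cts.Token);
                return false;
            }
            catch (OperationCanceledException)
            {
                return true;
            }
        }
    }
}
```

From a console application you could call TapSketch.RunAsync().GetAwaiter().GetResult() to see the continuation's result.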

Let's drill deep into this TAP to understand more about it with the help of questions and answers.

Q1. What's most common use case scenario for asynchronous code?

I am not going to walk you through a full code sample for how to write asynchronous code. Google is your best friend for that. Let's try and understand this with the help of a real-life scenario:

To give you an analogy, let's consider the case of a restaurant. We all know how a restaurant works: the waiter takes an order and passes it to the chef to prepare, and continues with other tasks while the chef is preparing the food. This is an example of asynchronous execution of tasks in real life, as the waiter is free after passing the order to the chef and can cater to other customers, serve a completed order, and so on. Let's see how this relates to software applications. Your front-end application (waiter), which takes multiple requests (orders) from end users (customers), can schedule these requests on a background worker (chef) thread. Similar to the waiter, while the worker thread is executing a long running action, your front-end application is free to respond to further user requests. This would be the most common use case scenario for asynchronous code.

The first example that you come across when learning asynchronous code execution in .Net is usually a Windows UI application being blocked while executing a long running task as part of a button click event handler; asynchronous code then comes to the rescue and solves this problem. The Task-based asynchronous pattern helps to a great extent in the case of UI applications.

The biggest pain point solved by asynchronous code in Windows applications has to do with application responsiveness. Asynchronous code helps you prevent those scary freeze moments (imagine the waiter going to the kitchen to prepare your order). We've all had that experience many times with Windows XP. Long running tasks can be scheduled on a background thread so that the main UI thread can keep responding to user actions, thereby enabling optimum utilization of processor time and a responsive UI.

Q2. Is asynchronous code only for UI applications? How about windows or web services, there is no UI thread in there so is there any benefit of using asynchronous code with these background applications?

Asynchronous code is not only for Windows UI applications; it can help with background applications too. In background applications such as Windows services or web services, we don't have UI threads but we have I/O threads. Each call to a web service is serviced by an I/O thread in your IIS worker process. If the work performed by your web method does not trigger a long running task, you are good, but you may have some methods which take longer to complete. For such methods, using asynchronous code to start the long running task on a background worker thread (chef) helps free up your I/O thread (waiter) to perform other activities in parallel.

Hence, asynchronous code in background applications helps with optimum utilization of processor time and should be used with Windows services and web services too.

Q3. Still unsure with above answer for how asynchronous code would help in case of a web service. I/O thread would not really be free until worker thread finishes, so in essence, it won't go back to thread pool to serve more requests after invoking long running task but, wait for it to finish. What we are doing is taking work from I/O thread, scheduling it on a worker thread, and then making I/O thread wait for worker thread to finish. Where's net gain?

It is correct that if the I/O thread blocks on the result (for example, by calling Wait() or Result), it stays tied up until the worker thread finishes, and in that case there is little gain. But it does NOT need to block while the long running task is executing, and that's where you gain with asynchronous code. First, you can continue executing other code on the I/O thread in parallel while the long running task is in progress. Remember the restaurant analogy: the waiter does not need to wait for the chef to complete an order but can continue in parallel. Second, when the web method itself is async and uses await at the point where it needs the result, the thread is not held in a wait; it goes back to the thread pool to serve other requests, and the rest of the method resumes as a continuation once the task completes. (If the long running task is invoked as fire-and-forget, the web method finishes immediately and the thread goes back to the pool, but then you never observe the task's result or its failures.)
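The waiter-and-chef flow can be sketched roughly as below. This is only an illustration under assumptions: OrderService, PrepareOrderAsync, and HandleRequestAsync are hypothetical names, and Task.Delay stands in for the real long running work.

```csharp
using System;
using System.Threading.Tasks;

public static class OrderService
{
    // Hypothetical long-running step (the "chef"); Task.Delay simulates the work.
    public static async Task<string> PrepareOrderAsync(string dish)
    {
        await Task.Delay(200);
        return dish + " ready";
    }

    // Hypothetical request handler (the "waiter").
    public static async Task<string> HandleRequestAsync(string dish)
    {
        // Start the long-running step without waiting for it.
        Task<string> cooking = PrepareOrderAsync(dish);

        // Do other work while the order is being prepared.
        string bill = "bill for " + dish;

        // Ask for the result only when it is needed; await does not block the thread.
        string meal = await cooking;
        return meal + ", " + bill;
    }
}
```

The point of the sketch is the ordering: the task is started first, unrelated work happens next, and the await comes last.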

Q4. I don't have anything else to execute in parallel while worker thread is executing long running task. Should I still execute long running task asynchronously?

Would you create a restaurant with the assumption that you may not have a lot of customers coming in and hence, one person as waiter-cum-chef will do? I am sure not, so why create software like that? The waiter and the chef have different roles and responsibilities and hence, you should build these roles separately in your software too.

So yes, it would still make sense to execute the long running task asynchronously, at least from a design perspective. Don't put yourself into a corner. Also, it does not take a lot of effort to write asynchronous code in the first place and hence, Microsoft recommends that you trigger long running activity as an async task.

Q5. Functional requirements of my application require a given (long running) method to be invoked synchronously, will writing it as an async method help?

Yes, it will help. Don't create tight coupling between your implementation and business requirements, which tend to change frequently; when that happens, you will need to modify your implementation. It is best to keep your long running method async, because you can always call an asynchronous method synchronously. So, create the method performing the long running task as async and, depending upon your business requirements, decide whether to invoke it and wait for it to finish.

Q6. Can we invoke an async method synchronously?

Yes, you can invoke an asynchronous method synchronously too. Here's an example:

MyAsyncMethod().Wait();
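Here is a slightly fuller sketch of blocking on an async method. SyncOverAsync and FailingAsync are hypothetical names, and MyAsyncMethod is a stand-in body for the method named above. Two things are worth knowing: Wait() and Result wrap failures in an AggregateException, while GetAwaiter().GetResult() rethrows the original exception; and blocking like this on a thread which has a synchronization context (a UI thread, for example) can deadlock, which is why the sketch uses ConfigureAwait(false) inside the async methods.

```csharp
using System;
using System.Threading.Tasks;

public static class SyncOverAsync
{
    // Hypothetical async method standing in for MyAsyncMethod.
    public static async Task<int> MyAsyncMethod()
    {
        await Task.Delay(50).ConfigureAwait(false);
        return 42;
    }

    // Hypothetical async method which always fails.
    public static async Task FailingAsync()
    {
        await Task.Delay(10).ConfigureAwait(false);
        throw new InvalidOperationException("boom");
    }

    // Result blocks the calling thread until the task completes.
    public static int CallSynchronously()
    {
        return MyAsyncMethod().Result;
    }

    // Wait() surfaces failures wrapped in an AggregateException.
    public static string ExceptionTypeFromWait()
    {
        try { FailingAsync().Wait(); return "none"; }
        catch (Exception ex) { return ex.GetType().Name; }
    }

    // GetAwaiter().GetResult() rethrows the original exception.
    public static string ExceptionTypeFromGetResult()
    {
        try { FailingAsync().GetAwaiter().GetResult(); return "none"; }
        catch (Exception ex) { return ex.GetType().Name; }
    }
}
```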

Q7. Is it required to have multiple processors on system to take advantage of asynchronous code?

Not really, you can benefit from asynchronous code on a machine with single processor too.

Q8. How would asynchronous code execution help on a single processor machine? If there's just a single processor then it is going to execute any one thread at a time (main or worker), so how does offloading long running task to a separate thread help?

At a logical level, I am sure that restaurant analogy has made it clear to you for how delegating a long running task to a separate worker helps main worker continue in parallel. This question would arise in developer's mind more from technical perspective to understand how it would work in case of a single processor machine because end of the day, you have a single processor and it would execute one thread at a time, so how would creating multiple threads help. Let's try and understand that.

The processor is a shared resource from the perspective of a process. Each process utilizes the CPU in a time-sharing manner.

So each process gets a fraction of processor time. The processor executes the threads in a process, and a process can have multiple threads. Unlike processes, not every thread in a process gets executed: the processor executes only the “active” threads, the ones which are ready to run rather than blocked, and it continues each one from where it last left off as it goes around the processes.

So within a given process, threads need to efficiently utilize the processor time made available to the process. A greedy thread that wants to do it all will not help, and this is where your application can be smart: spawn another thread to execute a potentially long running task and give the processor multiple threads to execute. If the main thread in your UI application is executing a long running database query, that is not the best utilization of processor time because, most of the time, this thread is blocked waiting for a response from the database server. If you have multiple threads of execution, the processor time made available to the process can be utilized in an optimum manner, by executing other threads which are ready to execute instructions. Hence, even on a single processor machine, asynchronous code helps by letting threads make progress concurrently.
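A rough way to see that waiting does not need the processor is to start two simulated queries and measure the total time. This is only a sketch: OverlapSketch, QueryAsync, and RunBothAsync are hypothetical names, and Task.Delay stands in for time spent waiting on a database server.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class OverlapSketch
{
    // Hypothetical database call; Task.Delay simulates waiting on the server.
    public static async Task<int> QueryAsync(int rows, int waitMs)
    {
        await Task.Delay(waitMs);
        return rows;
    }

    // Starts both queries before waiting, so their wait times overlap.
    // Returns the combined row count and the elapsed milliseconds.
    public static async Task<Tuple<int, long>> RunBothAsync()
    {
        var watch = Stopwatch.StartNew();
        Task<int> first = QueryAsync(10, 300);
        Task<int> second = QueryAsync(20, 300);
        int[] results = await Task.WhenAll(first, second);
        watch.Stop();
        return Tuple.Create(results[0] + results[1], watch.ElapsedMilliseconds);
    }
}
```

Because neither wait occupies the processor, the elapsed time should be close to one wait rather than the sum of both, regardless of how many processors the machine has.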

Q9. If having more threads in a process helps, should I keep that as a goal and create as many threads as possible?

Will increasing the number of waiters in your restaurant, without increasing the chefs in the kitchen, help? Similarly, increasing threads in a process without an increase in processor capacity would not help. Threads are executed by processors and we are limited by the number of processors available. Having too few threads may lead to underutilization of the processor but, at the same time, creating too many threads would be overkill for the process and would rather degrade the performance of the application.

Q10. What happens when I invoke an asynchronous method? Does it immediately start executing on a separate worker thread of its own?

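That depends upon the availability of the chef, am sorry, the processor here, as threads are executed by the processor. Note that calling an async method does not by itself create a thread: the method starts running on the caller's thread and only returns to the caller at the first await on an operation which has not yet completed. When you start a new task (for example, with Task.Run), it is queued by the CLR to the thread pool and is executed when a pool thread becomes available to pick it up. The CLR manages queueing of tasks on the thread pool extremely well and hence, we don't have to worry about the low-level details of scheduling and managing the thread pool.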
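As a small illustration of where the code actually runs, the sketch below records thread information at two points. WhereDoesItRun, WorkAsync, and StartsOnCallerThread are hypothetical names. It relies on two framework behaviors: an async method runs on the caller's thread up to its first await of an incomplete operation, and Task.Run queues its delegate to the thread pool.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class WhereDoesItRun
{
    public static int ThreadBeforeAwait;
    public static bool BackgroundWasPoolThread;

    // Hypothetical async method; nothing here asks for a new thread by itself.
    public static async Task WorkAsync()
    {
        // Runs on the caller's thread, as part of the call itself.
        ThreadBeforeAwait = Thread.CurrentThread.ManagedThreadId;

        // Task.Run is what queues the delegate to the thread pool.
        BackgroundWasPoolThread = await Task
            .Run(() => Thread.CurrentThread.IsThreadPoolThread)
            .ConfigureAwait(false);
    }

    // Returns true when the first part of WorkAsync ran on the calling thread.
    public static bool StartsOnCallerThread()
    {
        int caller = Thread.CurrentThread.ManagedThreadId;
        Task work = WorkAsync();
        bool sameThread = ThreadBeforeAwait == caller;
        work.Wait();
        return sameThread;
    }
}
```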

Q11. How do I decide whether a given method needs to be created as async?

Microsoft suggests any method which could potentially take longer than 50ms to complete, is a candidate for being async method. Caller of such method should have an option to invoke it asynchronously.

Q12. Should I consider target hardware for my application or service before writing asynchronous code?

For most cases, no, because the CLR manages scheduling of worker threads for you, so you don't have to worry about factoring in underlying hardware capacity or availability. The CLR makes optimum use of processor capacity, so leave it to the CLR to take care of; it does a good job.

If you have more questions about asynchronous programming, please use comments area below and I will try to answer them. Thanks.

Happy Coding.

Vande Mataram!

(A salute to motherland)

P.S. In addition to blogging, I use Twitter to share tips, links, etc. My Twitter handle is: @girishjjain




CLR Journey from PInvoke to WinRT

June 21, 2013 11:20 by author Girish Jain

 

In this blog post, I am going to talk about Microsoft .Net Common Language Runtime (CLR) journey from its early days to latest Windows Runtime (WinRT) platform, which is released with Windows 8. I would suggest you read my previous blog post on WinMD files before reading further to gain better understanding of WinMD files introduced with Windows 8.

WinRT is the new platform released with Microsoft’s latest operating system, Windows 8, and .Net Framework 4.5. It is important to remember that WinRT APIs need at least Windows 8 as the operating system (or Windows Server 2012 in the server family of operating systems), and hence, WinRT is not available on earlier platforms.

Coming back to WinRT, did we need another set of runtime libraries and a new file extension, WinMD? I am sure you are having this basic question in mind for what good this new WinRT library is going to do and why on earth Microsoft added a new file type? I will try and answer it in this blog post.

 

Windows Runtime (WinMD at heart)

WinRT is the new platform which exposes underlying operating system APIs in object-oriented manner to developers (across languages) and enables them to make best use of underlying platform capabilities to build great immersive apps.

WinMD files are at the heart of new Windows Runtime (WinRT) concept. These new WinMD files only contain metadata for WinRT types. WinMD files themselves do not contain any code. Hence, when you open any WinMD file using ILDAsm tool you see that all class methods are marked as "runtime managed" which, as per ECMA-335 CLI specification, means the managed implementation for the method will be provided by CLR. Please read my earlier blog post for more details on WinMD files.

Underneath, CLR bridges the gap of converting managed calls to low-level operating system (native) API calls and vice versa i.e. invoking managed callback functions from native world. This is a great step forward now because CLR is taking care of lot of complexities related to making these native API calls from managed code and vice-versa. Native APIs are written as C/C++ libraries, and before WinRT was introduced, these were consumed using Platform Invoke feature of CLR, commonly referred to as PInvoke.

With the introduction of WinRT, there has been a paradigm shift for managed developers for how they consume platform capabilities which otherwise are not already wrapped up in BCL. Let’s understand this evolution starting from its early days to latest WinRT platform. This will help you understand the internal workings of CLR and role of WinMD files too. We have come a long way with introduction of WinRT, it’s time for a little flashback.

 

.Net is Born!

To best understand why we need Windows Runtime (WinRT) and WinMD files, we need to go back a little into history and understand the problems that existed before WinRT was born and which it now addresses. With the introduction of the CLR, Microsoft developed rich metadata to describe types and their members in libraries (defined in the ECMA-335 spec). This was a great step forward, as existing libraries (DLLs) had no way to describe themselves; we had to separately produce type library (TLB) files for components so that they could be consumed in other applications. For example, you could write a COM component in C++ and then consume it in a VB6 application.

With the advent of .NET, assemblies contained code as well as rich metadata, so an assembly could describe its own types, their members, and the external assemblies that it references. This was the key feature which, along with the CLR assembly binder, solved a lot of DLL referencing issues and made assemblies self-contained, deployable units. Hence, referencing and consuming these assemblies from other applications was a lot easier than in the old days, when referencing the correct version of a DLL, building code, and deploying binaries was itself a big challenge, commonly known as “DLL Hell”. The single biggest factor which helped solve the DLL Hell problem in the .Net world was the ability to carry all the rich metadata within the assembly (library) itself. Traditionally, components had very tight binary linking, whereas CLR assemblies have metadata-based linking.

 

PInvoke is born (Re-use existing investment)

To begin with, .Net developers did not have everything in the form of these assemblies, especially when it came to underlying low-level operating system APIs (such as APIs exposed by kernel32.dll), because these were written and available in the form of C/C++ libraries. Microsoft did wrap certain native Windows APIs with managed wrappers, such as those for logging entries to the event viewer: the EventLog* classes in the System.Diagnostics namespace are just managed wrappers over underlying native APIs. But Microsoft could not have done it for all APIs (simply because there would be no value in creating and maintaining tons of managed wrappers over native APIs).

Therefore, Microsoft provided a way for .Net developers to consume these libraries using PInvoke. The way PInvoke works is that you define the API method signature in code and mark it as external, i.e. the method implementation is available in another DLL (named using an attribute). Underneath, the CLR creates an interop marshalling stub (the PInvoke counterpart of the Runtime Callable Wrapper, RCW, which strictly speaking belongs to COM interop) which takes care of marshalling calls between .Net apps and native libraries. The CLR relies on the method signature which you re-create in code to work out the marshalling requirements. Methods marked as external, and type definitions re-created in source code, give the CLR the metadata it needs to generate the stub for marshalling and forwarding calls to libraries. In essence, metadata for native types/structures and methods was the key input for the CLR to work its magic and connect the two worlds.

 

PInvoke Limitations

PInvoke was great, but it had its own limitations because there was still a considerable amount of work for .Net developers to do: recreate in code the types needed for the API call, and define marshalling rules for these types, their memory layout, and so on. This was ugly, and making even a basic API call work took considerable effort. Lots of these underlying APIs work with pointers, whereas .Net developers don't directly work with pointers, so they struggled with making API calls using PInvoke (working with IntPtr did not come easy to VB.Net or C# developers, to say the least). Although they used a familiar programming language such as VB.Net or C#, the concepts were alien, such as pointers, memory layout, etc.
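To give a feel for the work involved, here is a minimal sketch which re-creates one native structure by hand. It assumes the Win32 SYSTEMTIME structure (eight 16-bit fields) and the GetSystemTime function in kernel32.dll; the NativeTime class and its member names are hypothetical. The native call is only attempted on Windows.

```csharp
using System;
using System.Runtime.InteropServices;

public static class NativeTime
{
    // Managed re-creation of the Win32 SYSTEMTIME structure: eight 16-bit fields,
    // laid out sequentially so the memory layout matches the native definition.
    [StructLayout(LayoutKind.Sequential)]
    public struct SystemTime
    {
        public ushort Year;
        public ushort Month;
        public ushort DayOfWeek;
        public ushort Day;
        public ushort Hour;
        public ushort Minute;
        public ushort Second;
        public ushort Milliseconds;
    }

    // Signature re-created by hand; the CLR passes the struct by reference.
    [DllImport("kernel32.dll")]
    private static extern void GetSystemTime(out SystemTime time);

    // Size in bytes that the marshaller uses for the native structure.
    public static int NativeSize()
    {
        return Marshal.SizeOf(typeof(SystemTime));
    }

    // Returns the current UTC year via the native call, or null when not on Windows.
    public static int? TryGetUtcYear()
    {
        if (Environment.OSVersion.Platform != PlatformID.Win32NT)
        {
            return null;
        }
        SystemTime now;
        GetSystemTime(out now);
        return now.Year;
    }
}
```

Every field, its type, and its order has to match the native header exactly; getting any of them wrong produces garbage values or crashes rather than a compile error, which is the pain described above.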

With PInvoke, the CLR uses the types and method signatures created in source code to derive the metadata it needs to create its interop stub, which is required for marshalling the call to the native library. A lot of the problems with this approach came from the fact that developers found it difficult to re-create method signatures and types as the C#/VB equivalents required by the CLR. In fact, there was a site (http://pinvoke.net) just for managed developers, to help with the signatures of these native methods and types so that they could be specified in their C#/VB counterparts, and this site proved to be a great asset for .Net developers trying to make native API calls using PInvoke. I am sure most developers who have made native API calls from CLR applications have used this site at some point in time.

For example:

[DllImport("kernel32.dll", CharSet=CharSet.Auto, SetLastError=true)]
internal static extern IntPtr GetProcAddress([In] IntPtr hModule, [In, MarshalAs(UnmanagedType.LPStr)] string lpProcName);

 

WinRT to Rescue (Partially)

.Net developers had to re-create metadata in source code, such as method signatures and type definitions, for the CLR to do its interop magic, and this was the biggest pain area for most developers. This is where WinRT makes life easy for developers, as Microsoft came up with the new WinMD files, which define the classes and methods of WinRT libraries in the rich metadata format supported and understood by the CLR. Now, developers don’t need to re-create method signatures in code but just add WinMD files as references to the project.

Now, with the following three things falling into place with Windows 8 (and Windows Server 2012),

1. underlying operating system providing runtime libraries (WinRT), exposing native APIs in object-oriented manner

2. rich metadata available for WinRT classes from WinMD files and

3. RCW magic of CLR already in place,

Microsoft added new WinRT Interop functionality (on top of the COM Interop infrastructure) to the CLR and made life easy for managed developers. Now we can consume WinRT APIs just as we would consume any other managed API (in a natural and familiar manner). With Windows Runtime you don’t have to deal with pointers and a lot of the complexities of native Windows APIs; they are dealt with for you by the CLR. You still write code in the managed language which you are familiar with.

So, WinMD files enable developers to consume native APIs from .Net apps in a natural and familiar manner (which they otherwise had to do painfully using PInvoke). This solved an existing pain area for developers, and then Microsoft went a step further and enabled developers to consume these native APIs from an array of programming languages which are traditionally not used for .Net app development, such as JavaScript, by introducing a new feature called "language projection".

So, to answer the question of WHY Microsoft introduced the new WinMD files: it was to carry rich metadata for operating system APIs so that developers can consume these platform APIs from managed code in a natural and familiar manner. This is the reason why Microsoft says that WinRT is NOT a new layer in itself BUT only exposes underlying operating system APIs to managed world developers.

 

Does WinRT replace PInvoke?

WinRT does not replace PInvoke, as WinRT does not cover every API supported by native Windows libraries; for certain APIs you would still need to use PInvoke. Rather, WinRT is an enhanced version of PInvoke minus its complexities. WinRT builds on top of the existing PInvoke and COM Interop infrastructure and on the existing concepts of RCW and CCW, which is great as it re-uses existing infrastructure while reducing its complexities.

 

WinRT only for .Net?

Windows Runtime libraries (e.g. Windows.UI.dll) are not implemented in managed code but in some low-level language (my guess is C++) and made available to managed code using Windows Runtime Interop (built on top of the COM Interop infrastructure) and .winmd files. The types which we see in WinMD files are defined and actually implemented by these WinRT libraries. But is WinRT only available to .Net developers? Not really. WinRT libraries can be accessed from outside the CLR as well. For example, when you use JavaScript to build Metro apps, your application is hosted not by the CLR but by the “Chakra” engine, which is used by Internet Explorer too, and internally it consumes these WinRT libraries. So that proves the point that WinRT is not just for managed developers.

In fact, you can create classic-style COM components using Windows Runtime C++ Template Library (WRL) which can consume WinRT APIs and the COM component can then be accessed from any COM enabled technology including classic desktop apps.

 

WinRT Duplicates APIs?

Why do Windows Runtime libraries duplicate certain sets of classes, especially those related to UI controls and the thread pool, when these are already available in the .Net Framework? This duplication is there for a reason. As we saw above, WinRT is not just for managed developers but is consumed from other application hosts as well, so it became necessary to make these basic sets of classes available through WinRT too. Managed developers already had some of these classes available to them, so it looks like duplication to them, but these classes exist in WinRT for a reason.

 

WinRT for Desktop Apps

Only a subset of WinRT classes/APIs is available to desktop apps. Also, it is important to remember that WinRT is only available from Windows 8 onwards, so the target platform for your desktop app must be at least Windows 8 to be able to access WinRT APIs.

Scott Hanselman has written an excellent blog post which shows the manual steps for accessing WinRT APIs from desktop apps.

 

Summary

Microsoft has introduced a new set of runtime libraries for the Windows platform, commonly referred to as the WinRT library, which expose native functionality in an object-oriented manner. These new libraries are based on the solid principles of COM, but Microsoft has shielded developers from the complexities of COM. To make it easy for developers to interact with this underlying platform library, Microsoft developed WinMD files and supporting interop functionality for different runtime environments, such as the CLR and the Chakra engine, so that developers can consume these libraries from different languages (JavaScript, VB.NET, C#, C++, etc.) in a natural and familiar manner.

 

In my next blog post, I will try to go further into this new magical world of WinRT and further explore its internal working.

Happy Coding.





LINQ Query - INNER JOIN with GROUP

October 3, 2012 22:39 by author Girish Jain

Recently, while working on a project, I came across a requirement to use LINQ to get data from two different sources, and at the same time I had to use an inner join and group the data. It took a while for me to figure out how best to write the query, as I could do either the inner join or the grouping but was not able to achieve both in the same query. Finally, I figured out an approach using method calls to get it done, so I thought of sharing the solution with you all on the blog in case it is helpful to you. For the sake of simplicity, I am presenting the same challenge with very simple and common real world entities: employees and departments.

Given two lists, one for departments and another for employees, we need to get the listing of all departments along with employee having least salary for that department. For example, find below listing of departments and employees:

Departments:

ID  Name
1   IT
2   Finance

Employees:

Name      DepartmentID  Address   Salary
Ram       2             Parel     2500
Shyam     2             Borivali  1500
Gita      2             Panvel    2000
Sita      1             Virar     1600
Kishan    1             Vikroli   1400
Kanhaiya  1             Surat     1300


Now we want the department listing along with the employee who has the minimum salary in each department, i.e. Kanhaiya (1300) for IT and Shyam (1500) for Finance.

I have initialized these two data sources in code as shown below:

private List<Department> _deptartments = new List<Department>()
				{
					new Department{ ID=1, Name="IT" },
					new Department{ ID=2, Name="Finance" }
				};

private List<Employee> _employees = new List<Employee>()
				{
					new Employee{ Name="Ram", Address="Parel", Salary=2500, DepartmentID=2 },
					new Employee{ Name="Shyam", Address="Borivali", Salary=1500, DepartmentID=2 },
					new Employee{ Name="Gita", Address="Panvel", Salary=2000, DepartmentID=2 },
					new Employee{ Name="Sita", Address="Virar", Salary=1600, DepartmentID=1 },
					new Employee{ Name="Kishan", Address="Vikroli", Salary=1400, DepartmentID=1 },
					new Employee{ Name="Kanhaiya", Address="Surat", Salary=1300, DepartmentID=1 }
				};

With LINQ, there are two approaches possible: the procedural approach (calling the LINQ extension methods directly) or using the new LINQ query keywords. Find below the procedural code to get the desired output using the GroupJoin method:

var list = _deptartments.GroupJoin(_employees,
    Dep => Dep.ID,
    Emp => Emp.DepartmentID,
    (dep, empList) =>
        new
        {
            DepartmentName = dep.Name,
            MinSalary = empList.Min(x => x.Salary),
            EmployeeName = (from e in empList
                            where e.Salary == empList.Min(x => x.Salary)
                            select e.Name).FirstOrDefault()
        }
    );
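One thing to keep in mind is that GroupJoin also returns departments which have no employees, and Min throws on an empty sequence of salaries. As a hedged alternative, here is a minimal sketch which does a true inner join followed by a group, so such departments simply produce no row. The MinSalarySketch class, its trimmed-down Dept and Emp types, and the extra HR department are hypothetical and only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MinSalarySketch
{
    // Trimmed-down stand-ins for the Department and Employee classes.
    public class Dept { public int ID; public string Name; }
    public class Emp { public string Name; public decimal Salary; public int DepartmentID; }

    // Inner join, then group by department, then pick the lowest-paid employee.
    public static List<string> LowestPaidPerDepartment(IEnumerable<Dept> depts, IEnumerable<Emp> emps)
    {
        return depts
            .Join(emps, d => d.ID, e => e.DepartmentID, (d, e) => new { d, e })
            .GroupBy(x => x.d)
            .Select(g =>
            {
                // Sort once per group instead of calling Min for every employee.
                Emp lowest = g.OrderBy(x => x.e.Salary).First().e;
                return g.Key.Name + ": " + lowest.Name + " (" + lowest.Salary + ")";
            })
            .ToList();
    }

    public static List<string> Demo()
    {
        var depts = new List<Dept>
        {
            new Dept { ID = 1, Name = "IT" },
            new Dept { ID = 2, Name = "Finance" },
            new Dept { ID = 3, Name = "HR" }   // hypothetical department with no employees
        };
        var emps = new List<Emp>
        {
            new Emp { Name = "Ram", Salary = 2500, DepartmentID = 2 },
            new Emp { Name = "Shyam", Salary = 1500, DepartmentID = 2 },
            new Emp { Name = "Gita", Salary = 2000, DepartmentID = 2 },
            new Emp { Name = "Sita", Salary = 1600, DepartmentID = 1 },
            new Emp { Name = "Kishan", Salary = 1400, DepartmentID = 1 },
            new Emp { Name = "Kanhaiya", Salary = 1300, DepartmentID = 1 }
        };
        return LowestPaidPerDepartment(depts, emps);
    }
}
```

Calling MinSalarySketch.Demo() returns one line for IT and one for Finance; HR is skipped because the inner join finds no matching employees.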


Thanks to my colleague Aditya Dhiwar who showed me how to achieve the same using LINQ keywords in C#. Find below code for same:

var list = from e in _employees
            group e by e.DepartmentID into dptgrp
            join d in _deptartments
            on dptgrp.Key equals d.ID
            let minsal = dptgrp.Min(x => x.Salary)
            let employee = dptgrp.Where(e => e.Salary == minsal)
            from e in employee 
            select new
            {
              EmployeeName = e.Name,
              DepartmentName = d.Name,
              MinSalary = minsal,
            };

Find below the complete source code:


using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
	public string Name { get; set; }
	public string Address { get; set; }
	public decimal Salary { get; set; } 
	public int DepartmentID { get; set; } 
}

public class Department
{
	public int ID  { get; set; }
	public string Name  { get; set; }
	public Employee Manager { get; set; }
}

public class LINQTest
{
	private List<Department> _deptartments = new List<Department>()
					{
						new Department{ ID=1, Name="IT" },
						new Department{ ID=2, Name="Finance" }
					};

	private List<Employee> _employees = new List<Employee>()
					{
						new Employee{ Name="Ram", Address="Parel", Salary=2500, DepartmentID=2 },
						new Employee{ Name="Shyam", Address="Borivali", Salary=1500, DepartmentID=2 },
						new Employee{ Name="Gita", Address="Panvel", Salary=2000, DepartmentID=2 },
						new Employee{ Name="Sita", Address="Virar", Salary=1600, DepartmentID=1 },
						new Employee{ Name="Kishan", Address="Vikroli", Salary=1400, DepartmentID=1 },
						new Employee{ Name="Kanhaiya", Address="Surat", Salary=1300, DepartmentID=1 }
					};

	public void GetEmployeesWithMinSalaryInDepartment()
	{
        // Get listing of all departments along with the employee who is
        // paid least salary for that department

        var list = _deptartments.GroupJoin(_employees,
	        Dep => Dep.ID,
	        Emp => Emp.DepartmentID,
	        (dep, empList) =>
		        new
		        {
			        DepartmentName = dep.Name,
			        MinSalary = empList.Min(x => x.Salary),
			        EmployeeName = (from e in empList 
                        where e.Salary == empList.Min(x => x.Salary) 
                        select e.Name).FirstOrDefault()
		        }
	        );

		foreach (var d in list)
		{
			Console.WriteLine(d.DepartmentName);
			Console.WriteLine(d.EmployeeName);
			Console.WriteLine(d.MinSalary);
		}
	}
}
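
One caveat with the GroupJoin code above: Min throws an InvalidOperationException on an empty sequence of decimal values, so a department without any employees would break it, and the minimum is computed more than once per group. Below is a minimal sketch of a variant that sorts each group once and tolerates empty departments. The types Dept, Emp, LowestPaidRow and the class MinSalarySketch are hypothetical names introduced only for this illustration; they are not part of the original sample.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down types for this sketch (not the classes from the post)
public class Dept { public int ID { get; set; } public string Name { get; set; } }
public class Emp { public string Name { get; set; } public decimal Salary { get; set; } public int DepartmentID { get; set; } }

public class LowestPaidRow
{
    public string DepartmentName { get; set; }
    public string EmployeeName { get; set; }   // null when the department has no employees
    public decimal? MinSalary { get; set; }    // null when the department has no employees
}

public static class MinSalarySketch
{
    public static List<LowestPaidRow> LowestPaidPerDepartment(
        IEnumerable<Dept> departments, IEnumerable<Emp> employees)
    {
        return departments.GroupJoin(
            employees,
            d => d.ID,
            e => e.DepartmentID,
            (d, members) =>
            {
                // Sort the group once; OrderBy is stable, so on a tie the
                // employee listed first wins, as with FirstOrDefault in the post
                Emp lowest = members.OrderBy(m => m.Salary).FirstOrDefault();
                return new LowestPaidRow
                {
                    DepartmentName = d.Name,
                    EmployeeName = lowest == null ? null : lowest.Name,
                    MinSalary = lowest == null ? (decimal?)null : lowest.Salary
                };
            }).ToList();
    }
}
```

Because GroupJoin behaves like a left outer join, a department with no employees still appears in the result, here with a null employee name and a null salary.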

Hope you find it useful. Happy Coding!!

Vande Mataram!

(A salute to motherland)

P.S. In addition to blogging, I use Twitter to share tips, links, etc. My Twitter handle is: @girishjjain




Assembly Identity and GAC

clock September 20, 2012 16:34 by author Girish Jain

Today, I am going to draw your attention to a small detail in the .Net Framework: the Global Assembly Cache (GAC). The GAC is the machine-wide store for shared .Net assemblies. Until recently, I was under the impression that assemblies in the GAC are identified by the following four characteristics:

1. Assembly Friendly Name

2. Assembly Version Number

3. Assembly Culture

4. Public Key Token

As far as I remember, I read this in a book, so either I read it wrong or the book is wrong. In any case, let's clear up the understanding.

My understanding was that the combination of these four attributes would always be unique in the GAC. I lived with this understanding for a long while, until I recently noticed a few assemblies in the GAC for which all four attributes were the same. Shock! First, because my understanding had been proved incorrect, and second, because it meant I would have to dig into the details to identify the attributes that uniquely identify assemblies in the GAC and correct my understanding.

Let’s start with the discovery. I observed that when you open the GAC folder (C:\Windows\assembly) in Windows Explorer, there is an extra column called Processor Architecture. I noticed that the processor architecture differed between the assemblies whose other four attributes were the same. This led me to conclude that the assemblies share the same friendly name, culture, version number, and public key token BUT target different processor architectures. I decided to test this by creating a basic assembly and installing it to the GAC three times, keeping all four attributes the same but targeting a different processor architecture each time: Any CPU, x86, and x64 (the options in Visual Studio).

So I created a simple DLL with one class and one method containing the stereotypical code (I am sure you will find it very familiar :-):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace GACLearning
{
    public class Class1
    {
        public static string HelloWorld()
        {
            return "Hello World";
        }
    }
}

I built the project successfully, and now it was time to generate the DLL for different target platforms. You can change the target processor architecture (platform) for your assembly in Visual Studio on the project property page, which provides three options: Any CPU, x86, and x64.

Since we need to install the assembly to the GAC, it needs to be strong-named. Hence, sign the assembly on the Signing tab of the project property page.

For detailed steps to generate the key file and to re-sign the assembly just before deployment using the sn.exe tool, refer to MSDN.

[Note: I turned off strong-name verification for the assembly using the -Vr option of the sn.exe tool to make it easier for me to install the assemblies to the GAC]

Next, I built the solution three times, each time selecting a different platform target on the project property page, and copied the generated DLL to a separate folder. Three DLLs were ready, with the same name, version, culture, and public key token but each targeting a different platform. I was ready for my test! I just dragged and dropped these assemblies into the GAC folder and they all got installed happily, all three living in the GAC at the same time. The shell extension does the installation under the hood; you can also install the assemblies manually with gacutil.exe. Refer to the screenshot below of all three assemblies in the GAC, where the only difference is the target platform:


Conclusion

There are five attributes which uniquely identify an assembly or form the identity of the assembly:

1. Assembly Friendly Name

2. Assembly Version Number

3. Assembly Culture

4. Public Key Token

5. AND Processor Architecture (Target Platform)
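
These five parts can also be inspected from code. The sketch below formats them from a System.Reflection.AssemblyName; the class name AssemblyIdentitySketch and the output format are my own choices for illustration. It assumes the .Net Framework, where AssemblyName.ProcessorArchitecture is populated when the name is read from a file; newer .NET versions mark that property as obsolete.

```csharp
using System;
using System.Reflection;

public static class AssemblyIdentitySketch
{
    // Formats the five identity parts discussed above from an AssemblyName
    public static string Describe(AssemblyName name)
    {
        byte[] token = name.GetPublicKeyToken();
        string tokenText = (token == null || token.Length == 0)
            ? "null"
            : BitConverter.ToString(token).Replace("-", "").ToLowerInvariant();

        // A neutral culture is reported as the invariant culture, whose name is empty
        string culture = (name.CultureInfo == null || name.CultureInfo.Name.Length == 0)
            ? "neutral"
            : name.CultureInfo.Name;

        return string.Format(
            "Name={0}, Version={1}, Culture={2}, PublicKeyToken={3}, ProcessorArchitecture={4}",
            name.Name, name.Version, culture, tokenText, name.ProcessorArchitecture);
    }
}
```

To try it on one of the three DLLs, pass the result of AssemblyName.GetAssemblyName(path) to Describe, where path points to wherever you copied that build.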

On a side note, if you open the GAC folder directly in a command prompt, you will see each assembly installed into the respective folder for its platform, as shown below:

GAC 32-bit folder (GAC_32):

GAC 64-bit folder (GAC_64):

GAC MSIL folder (GAC_MSIL):

Happy Coding!!

Vande Mataram!

(A salute to motherland)





Tracing with Code Injection Part II

clock April 27, 2012 06:41 by author Girish Jain

Download source code - Advanced Tracing with Code Injection Part II.zip (171 kb)

Overview

Welcome to part II of this blog post series, where I show you how to inject code into an assembly to trace method execution at runtime, along with the parameter values, all without writing any tracing code in your method’s body.

Before you read further, I suggest you read the first blog post to get a fair idea of the code injection approach developed there and of how tracing works. In the first post, I created an application that uses the Mono.Cecil library to inject code into any CLR assembly (DLL or EXE). The application injects code into all methods in the given assembly that are marked with a known attribute.

The objective of this blog post series is to develop an automated solution (using code injection) that logs certain key information from a running application, but only when tracing is turned on for the application (this is called runtime instrumentation). I want to achieve this without manually writing the tracing code. The CLR tracing framework is great because you can turn tracing on or off in the application’s configuration file and, when it is turned on, a trace switch lets you further control the type of information (such as verbose, information, warning, or error) that gets logged to the trace listener. So the key objective is to develop a code injection solution that injects tracing code into a given assembly, so that the method signature and its parameter values are logged each time the method executes.
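
To make the trace switch idea concrete, here is a minimal sketch using System.Diagnostics.TraceSwitch. The switch name "AppTraceSwitch", the class TraceSwitchSketch, and the messages are hypothetical. On the .Net Framework the level would normally come from the switches section of the application configuration file; it is set in code here only to keep the sketch self-contained.

```csharp
using System;
using System.Diagnostics;

public static class TraceSwitchSketch
{
    // "AppTraceSwitch" is a hypothetical switch name chosen for this sketch
    public static readonly TraceSwitch AppSwitch =
        new TraceSwitch("AppTraceSwitch", "Controls how much the application traces");

    public static void DoWork()
    {
        // Written only when the level is Info or Verbose
        Trace.WriteLineIf(AppSwitch.TraceInfo, "DoWork started");

        // Written when the level is Warning, Info, or Verbose
        Trace.WriteLineIf(AppSwitch.TraceWarning, "DoWork is running without input");
    }
}
```

With TraceSwitchSketch.AppSwitch.Level set to TraceLevel.Warning, only the second message reaches the trace listeners; with TraceLevel.Off, neither does.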

Advantages

1. The code injection approach frees developers from manually writing tracing code at the start of each method body.

2. Information logged at runtime is a great help when diagnosing a production issue, especially in a large distributed application, because it tells you the sequence in which methods executed. It is helpful when debugging during development as well.

Done so far

In the first post, I created an application that logs the method signature to the trace listener each time a method executes. Please note that tracing code is injected only into methods marked with a custom attribute. To get this functionality in your own application, all you have to do is follow these simple steps:

1. Add a reference to the AdvancedTracing library in your project (a light-weight library which defines the custom attribute you need to apply to your methods and a custom database trace listener)

2. Mark methods with the custom attribute [LogMethodExecutionEntryAttribute]

3. Add a post-build event to the project to invoke CodeInjection.exe (or use any other automated mechanism, such as a .bat file, to pass the project's output, DLL/EXE, to the CodeInjection.exe application to inject tracing code).

4. Add a trace listener to the application configuration file

Voila! All methods marked with the custom attribute will start logging their method signature to the trace destination each time they execute, as follows:

16/04/2012 11:20:03 : System.Int32 InjectedCalc::Add(System.Int32,System.Int32) invoked 

What Next?

Sounds good so far, but the method signature alone does not tell us much, so let’s see if we can get more. What if the method's parameter values were logged to the trace listener as well? The data passed to a method is of great help when diagnosing an issue. With this objective in mind, I started modifying the CodeInjection.exe application to inject code that logs the method's parameter data along with the method signature. So, for example, for a simple Add method such as:

[LogMethodExecutionEntry(true)] 
static public int Add(int i1, int i2) 
{ 
    return i1 + i2; 
} 

I want to see the values passed to its parameters being logged as well, as follows:

16/04/2012 11:23:10 : System.Int32 InjectedCalc::Add(System.Int32,System.Int32) invoked with data:10,20,

With this objective, I modified the original logic as follows (the sub-points under step 6 are the new changes):

1. Load the input assembly

2. Loop over all its modules

3. For each module, loop over all its types

4. For each type, loop over all its methods

5. For each method, check whether it is marked with LogMethodExecutionEntryAttribute custom attribute

6. If method is marked with the known custom attribute, then inject following IL instructions at the start of method body

a. If method has any parameters defined, then load an array of type object to stack, with array’s size being equal to number of method parameters using newarr instruction. Now loop over all the parameters and inject code as below:

i. Load (push) each argument value using ldarg instruction

ii. If the parameter is passed by reference, then use ldind instruction to de-reference the pointer and push actual value to stack

iii. If the parameter is value-type then box the value type instance using box instruction

iv. By now you will have either a boxed value for value type parameters or an object reference for reference types on the stack. Store the value on the stack into the object array created earlier, using the stelem instruction

b. Load (push) method signature string to the evaluation stack using ldstr "method signature"

c. Now you have a string instance and an object array at the top of your evaluation stack, in the same order. Use call instruction and call AdvancedTracing::Utility::WriteTraceLineWithData method which will pop the method signature string and object array from the evaluation stack and pass these two to WriteTraceLineWithData method as parameters. This leaves evaluation stack as it was when method execution began.

7. Lastly, save modified assembly back to disk
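
In C# terms, the injected IL from step 6 amounts to roughly the hand-written prologue below. This is only a sketch: UtilitySketch is a hypothetical stand-in for AdvancedTracing.Utility (the real implementation is in the attached source and may differ), its message format is modelled on the sample log line above, and the Increment method and its signature string are illustrative.

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Hypothetical stand-in for AdvancedTracing.Utility
public static class UtilitySketch
{
    public static string BuildMessage(string methodSignature, object[] data)
    {
        StringBuilder sb = new StringBuilder(methodSignature);
        sb.Append(" invoked with data:");
        foreach (object value in data)
        {
            sb.Append(value).Append(',');   // a null entry appends nothing before the comma
        }
        return sb.ToString();
    }

    public static void WriteTraceLineWithData(string methodSignature, object[] data)
    {
        Trace.WriteLine(DateTime.Now + " : " + BuildMessage(methodSignature, data));
    }
}

public static class InjectedCalc
{
    public static int Add(int i1, int i2)
    {
        // --- roughly what the injected IL amounts to ---
        object[] args = new object[2];   // ldc.i4 2, newarr object, stloc
        args[0] = i1;                    // ldarg, box int32, stelem.ref
        args[1] = i2;
        UtilitySketch.WriteTraceLineWithData(
            "System.Int32 InjectedCalc::Add(System.Int32,System.Int32)", args);

        // --- original method body ---
        return i1 + i2;
    }

    public static void Increment(ref int counter)
    {
        object[] args = new object[1];
        args[0] = counter;               // ldarg, ldind.i4 (dereference), box int32, stelem.ref
        UtilitySketch.WriteTraceLineWithData(
            "System.Void InjectedCalc::Increment(System.Int32&)", args);

        counter++;
    }
}
```

The Increment method shows why step 6.a.ii matters: for a by-reference parameter the argument slot holds an address, so the value has to be loaded through it before it can be boxed and stored in the array.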

The entire source code for the application is attached to this post; you can find it at the top. I am making the core code of the TraceInjection class available on this page as well.

Before you dive into the code, I want to remind you that I am a big fan of tracing code in the method body to explain the logical flow, as it helps a lot during the development/debugging cycle, especially for a large application and a big team of developers. Hence, I strongly recommend that everyone write plenty of Trace.WriteLine or similar calls from the Trace class in their code. The code injection approach shown in this post simply automates part of that work: it saves you from writing these calls manually across the entire code base and keeps them up to date and consistent, which is nearly impossible to do by hand. At the same time, I also believe that tracing is one of the best ways to comment your code. So use this application to your advantage and make the best use of the CLR tracing framework.

Lastly, I strongly recommend that you read the Advanced Tracing blog post, which creates a new database trace listener and explains the nuances of CLR tracing and its advantages.

Here’s the entire code for TraceInjection class with the updated logic for logging method parameter values as well:


using Mono.Cecil;
using Mono.Cecil.Cil;
using System.Configuration;
using System.IO;
using System.Diagnostics;
using System.Collections.Generic;
using System;
using System.Linq;

namespace CodeInjection
{
    public class TraceInjection
    {
        public bool InjectTracingLine(string assemblyPath, string outputDirectory)
        {
            bool logWithData = false;                           // Boolean value to store developer's preference, whether he wants to inject code which will dump parameter values as well.
            bool isAssemblyInjected = false;                    // Boolean flag to indicate whether we have injected code to the assembly under consideration
            bool pointerToValueTypeVariable = false;            // Boolean flag to indicate whether we have a ByRef parameter where the underlying/referenced type is a value type
            MetadataType paramMetaData;                         // Meta data type enum from Mono.Cecil

            TypeSpecification referencedTypeSpec = null;

            CustomAttribute customAttr;
            AssemblyDefinition asmDef;
            TypeReference typeObject;

            Trace.WriteLine(string.Format("InjectTracingLine called for assembly: {0} and outputDirectory: {1}", assemblyPath, outputDirectory));

            string fileName = Path.GetFileName(assemblyPath);
            string newPath = Path.Combine(outputDirectory, fileName);


            // Check if Output directory already exists, if not, create one
            // ------------------------------------------------------------
            if (!Directory.Exists(outputDirectory))
            {
                Directory.CreateDirectory(outputDirectory);
            }

            try
            {
                // We need reference to AdvancedTracing.Utility type and its 
                // WriteTraceLineWithData method
                // ------------------------------------------------------------
                ModuleDefinition advancedTacingModule = ModuleDefinition.ReadModule(Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "AdvancedTracing.dll"));
                TypeDefinition utilityType = advancedTacingModule.Types.First(t => t.Name == "Utility");
                MethodDefinition loggingMethod = utilityType.Methods.First(m => m.Name == "WriteTraceLine");
                MethodDefinition loggingMethodWithData = utilityType.Methods.First(m => m.Name == "WriteTraceLineWithData");

                // List of new tracing IL instructions which will be added to the method
                List<Instruction> objTracingInstructions = new List<Instruction>();

                // Load assembly
                // ------------------------------------------------------------
                asmDef = AssemblyDefinition.ReadAssembly(assemblyPath);
                
                foreach (ModuleDefinition modDef in asmDef.Modules)
                {
                    // Get System.Object type reference
                    typeObject = modDef.TypeSystem.Object;

                    foreach (TypeDefinition typDef in modDef.Types)
                    {
                        foreach (MethodDefinition metDef in typDef.Methods)
                        {
                            // Check if method has the required custom attribute set
                            // ------------------------------------------------------------
                            if (this.TryGetCustomAttribute(metDef, "AdvancedTracing.LogMethodExecutionEntryAttribute", out customAttr))
                            {
                                // Method has the desired attribute set, edit IL for method
                                Trace.WriteLine("Found method " + metDef.ToString());

                                // Now we gonna inject code so you can flag that assembly has 
                                // been code injected so that updated assembly will be written
                                // back to disk
                                // ------------------------------------------------------------
                                isAssemblyInjected = true;

                                // Check developer's intention whether he wants to just log 
                                // method execution OR method's parameter values as well
                                if (customAttr.HasConstructorArguments)
                                {
                                    // Developer has explicitly specified his intention
                                    if (customAttr.ConstructorArguments != null
                                        && customAttr.ConstructorArguments.Count > 0)
                                    {
                                        if (!bool.TryParse(customAttr.ConstructorArguments[0].Value.ToString(), out logWithData))
                                        {
                                            // We could not parse the constructor argument to a boolean value
                                            // so we will assume it to be false (which is the default behavior)
                                            logWithData = false;
                                        }
                                    }
                                }
                                else
                                {
                                    // Developer has NOT explicitly specified his intention
                                    // so we will assume it to be false i.e. don't log data
                                    logWithData = false;
                                }

                                // Get ILProcessor
                                ILProcessor ilProcessor = metDef.Body.GetILProcessor();

                                // Get required counts for the method
                                // ------------------------------------------------------------
                                int intMethodParamsCount = metDef.Parameters.Count;
                                int intArrayVarNumber = metDef.Body.Variables.Count;


                                // Clear the list so that we can reuse the existing list object
                                // ------------------------------------------------------------
                                objTracingInstructions.Clear();

                                
                                // Load method signature string
                                // ------------------------------------------------------------
                                objTracingInstructions.Add(ilProcessor.Create(
                                    OpCodes.Ldstr,
                                    metDef.ToString()
                                ));


                                // If method contains parameters, then emit code to log parameter 
                                // values as well
                                // ------------------------------------------------------------
                                if (intMethodParamsCount > 0 && logWithData)
                                {
                                    // Add metadata for a new variable of type object[] to method body
                                    // .locals init (object[] V_0)
                                    // ------------------------------------------------------------
                                    ArrayType objArrType = new ArrayType(typeObject);
                                    metDef.Body.Variables.Add(new VariableDefinition((TypeReference)objArrType));


                                    // Set InitLocals flag to true. At times, this is set to false
                                    // in case of static methods and currently Mono.Cecil does not have the
                                    // capability to detect the need for this flag and emit it automatically
                                    // ------------------------------------------------------------
                                    metDef.Body.InitLocals = true;

                                    // Create an array of type system.object with 
                                    // same number of elements as count of method parameters
                                    // ------------------------------------------------------------
                                    objTracingInstructions.Add(
                                        ilProcessor.Create(OpCodes.Ldc_I4, intMethodParamsCount)
                                    );

                                    objTracingInstructions.Add(
                                        ilProcessor.Create(OpCodes.Newarr, typeObject)
                                    );

                                    // This instruction will store the address of the newly created 
                                    // array in local variable
                                    // ------------------------------------------------------------
                                    objTracingInstructions.Add(
                                        ilProcessor.Create(OpCodes.Stloc, intArrayVarNumber)
                                    );


                                    // Loop over all the parameters of method and add their value to object[]
                                    // ------------------------------------------------------------
                                    for (int i = 0; i < intMethodParamsCount; i++)
                                    {
                                        paramMetaData = metDef.Parameters[i].ParameterType.MetadataType;
                                        if (paramMetaData == MetadataType.UIntPtr ||
                                            paramMetaData == MetadataType.FunctionPointer ||
                                            paramMetaData == MetadataType.IntPtr ||
                                            paramMetaData == MetadataType.Pointer)
                                        {
                                            // We don't want to log values of these parameters, so skip
                                            // this iteration (the array slot for this parameter stays null)
                                            continue;
                                        }

                                        objTracingInstructions.Add(ilProcessor.Create(OpCodes.Ldloc, intArrayVarNumber));
                                        objTracingInstructions.Add(ilProcessor.Create(OpCodes.Ldc_I4, i));

                                        // Instance methods have an implicit argument called "this"
                                        // and hence, we need to refer to actual arguments with +1 position
                                        // whereas, in case of static methods, "this" argument is not there
                                        // ------------------------------------------------------------
                                        if (metDef.IsStatic)
                                        {
                                            objTracingInstructions.Add(ilProcessor.Create(OpCodes.Ldarg, i));
                                        }
                                        else
                                        {
                                            objTracingInstructions.Add(ilProcessor.Create(OpCodes.Ldarg, i + 1));
                                        }

                                        // Reset boolean flag variable to false
                                        pointerToValueTypeVariable = false;

                                        // If a parameter is passed by reference then you need to use ldind
                                        // ------------------------------------------------------------
                                        TypeReference paramType = metDef.Parameters[i].ParameterType;
                                        if (paramType.IsByReference)
                                        {
                                            referencedTypeSpec = paramType as TypeSpecification;
                                            Trace.WriteLine(string.Format("Parameter Name:{0}, Type:{1}", metDef.Parameters[i].Name, metDef.Parameters[i].ParameterType.Name));

                                            if(referencedTypeSpec != null)
                                            {
                                                switch (referencedTypeSpec.ElementType.MetadataType)
                                                {
                                                    //Indirect load value of type int8 as int32 on the stack
                                                    case MetadataType.Boolean:
                                                    case MetadataType.SByte:
                                                        objTracingInstructions.Add(ilProcessor.Create(OpCodes.Ldind_I1));
                                                        pointerToValueTypeVariable = true;
                                                        break;

                                                    // Indirect load value of type int16 as int32 on the stack
                                                    case MetadataType.Int16:
                                                        objTracingInstructions.Add(ilProcessor.Create(OpCodes.Ldind_I2));
                                                        pointerToValueTypeVariable = true;
                                                        break;

                                                    // Indirect load value of type int32 as int32 on the stack
                                                    case MetadataType.Int32:
                                                        objTracingInstructions.Add(ilProcessor.Create(OpCodes.Ldind_I4));
                                                        pointerToValueTypeVariable = true;
                                                        break;

                                                    // Indirect load value of type int64 as int64 on the stack
                                                    // Indirect load value of type unsigned int64 as int64 on the stack (alias for ldind.i8)
                                                    case MetadataType.Int64:
                                                    case MetadataType.UInt64:
                                                        objTracingInstructions.Add(ilProcessor.Create(OpCodes.Ldind_I8));
                                                        pointerToValueTypeVariable = true;
                                                        break;

                                                    // Indirect load value of type unsigned int8 as int32 on the stack
                                                    case MetadataType.Byte:
                                                        objTracingInstructions.Add(ilProcessor.Create(OpCodes.Ldind_U1));
                                                        pointerToValueTypeVariable = true;
                                                        break;

                                                    // Indirect load value of type unsigned int16 as int32 on the stack
                                                    case MetadataType.UInt16:
                                                    case MetadataType.Char:
                                                        objTracingInstructions.Add(ilProcessor.Create(OpCodes.Ldind_U2));
                                                        pointerToValueTypeVariable = true;
                                                        break;

                                                    // Indirect load value of type unsigned int32 as int32 on the stack
                                                    case MetadataType.UInt32:
                                                        objTracingInstructions.Add(ilProcessor.Create(OpCodes.Ldind_U4));
                                                        pointerToValueTypeVariable = true;
                                                        break;

                                                    // Indirect load value of type float32 as F on the stack
                                                    case MetadataType.Single:
                                                        objTracingInstructions.Add(ilProcessor.Create(OpCodes.Ldind_R4));
                                                        pointerToValueTypeVariable = true;
                                                        break;

                                                    // Indirect load value of type float64 as F on the stack
                                                    case MetadataType.Double:
                                                        objTracingInstructions.Add(ilProcessor.Create(OpCodes.Ldind_R8));
                                                        pointerToValueTypeVariable = true;
                                                        break;

                                                    // Indirect load value of type native int as native int on the stack
                                                    case MetadataType.IntPtr:
                                                    case MetadataType.UIntPtr:
                                                        objTracingInstructions.Add(ilProcessor.Create(OpCodes.Ldind_I));
                                                        pointerToValueTypeVariable = true;
                                                        break;

                                                    default:
                                                        // Need to check if it is a value type instance, in which case
                                                        // we use ldobj instruction to copy the contents of value type
                                                        // instance to stack and then box it
                                                        if (referencedTypeSpec.ElementType.IsValueType)
                                                        {
                                                            objTracingInstructions.Add(ilProcessor.Create(OpCodes.Ldobj, referencedTypeSpec.ElementType));
                                                            pointerToValueTypeVariable = true;
                                                        }
                                                        else
                                                        {
                                                            // It is a reference type so just dereference the pointer
                                                            objTracingInstructions.Add(ilProcessor.Create(OpCodes.Ldind_Ref));
                                                            pointerToValueTypeVariable = false;
                                                        }
                                                        break;
                                                }
                                            }
                                            else
                                            {
                                                // We don't have complete details about the type of the referenced
                                                // parameter, so we will just ignore this parameter value
                                            }
                                        }

                                        // If it is a value type then you need to box the instance as we are going 
                                        // to add it to an array which is of type object (reference type)
                                        // ------------------------------------------------------------
                                        if (paramType.IsValueType || pointerToValueTypeVariable)
                                        {
                                            if (pointerToValueTypeVariable)
                                            {
                                                // Box the dereferenced parameter type
                                                objTracingInstructions.Add(ilProcessor.Create(OpCodes.Box, referencedTypeSpec.ElementType));
                                            }
                                            else
                                            {
                                                // Box the parameter type
                                                objTracingInstructions.Add(ilProcessor.Create(OpCodes.Box, paramType));
                                            }
                                        }

                                        // Store parameter in object[] array
                                        // ------------------------------------------------------------
                                        objTracingInstructions.Add(ilProcessor.Create(OpCodes.Stelem_Ref));
                                    }

                                    // Load the array variable (a reference to the object[]) onto the
                                    // evaluation stack, to pass it as a parameter
                                    // ------------------------------------------------------------
                                    objTracingInstructions.Add(ilProcessor.Create(OpCodes.Ldloc, intArrayVarNumber));


                                    // Call the method which would write tracing info with data
                                    // ------------------------------------------------------------
                                    objTracingInstructions.Add(ilProcessor.Create(
                                        OpCodes.Call,
                                        metDef.Module.Import(
                                            loggingMethodWithData.GetElementMethod()
                                        )
                                    ));
                                }
                                else
                                {
                                    // Call the method which would write tracing info minus data
                                    // ------------------------------------------------------------
                                    objTracingInstructions.Add(ilProcessor.Create(
                                        OpCodes.Call,
                                        metDef.Module.Import(
                                            loggingMethod.GetElementMethod()
                                        )
                                    ));
                                }


                                // Add the new MSIL to the existing body of method
                                // ------------------------------------------------------------
                                objTracingInstructions.AddRange(metDef.Body.Instructions);
                                metDef.Body.Instructions.Clear();

                                foreach (var IL in objTracingInstructions)
                                {
                                    metDef.Body.Instructions.Add(IL);
                                }
                            }
                        }
                    }
                }

                // Save modified assembly, if code injected
                // ------------------------------------------------------------
                if (isAssemblyInjected)
                {
                    Trace.WriteLine(string.Format("Saving injected assembly at: {0}", newPath));
                    asmDef.Write(newPath, new WriterParameters() { WriteSymbols = true });
                }
                else
                {
                    Trace.TraceInformation(string.Format("No code has been injected to assembly {0}", asmDef.Name.ToString()));
                }
            }
            catch
            {
                // Nothing to be done, just let the caller handle exception 
                // or do logging and so on
                throw;
            }

            return true;
        }

        public bool InjectTracingLine(string assemblyPath)
        {
            Trace.WriteLine("InjectTracingLine called minus outputDirectory, will default to application config file value");
            // New assembly path
            string outputDirectory = ConfigurationManager.AppSettings["OutputDirectory"].ToString();
            return this.InjectTracingLine(assemblyPath, outputDirectory);
        }

        public bool TryGetCustomAttribute(MethodDefinition type, string attributeType, out CustomAttribute result)
        {
            result = null;
            if (!type.HasCustomAttributes)
                return false;

            foreach (CustomAttribute attribute in type.CustomAttributes)
            {
                if (attribute.Constructor.DeclaringType.FullName != attributeType)
                    continue;

                result = attribute;
                return true;
            }

            return false;
        }
    }
}
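
For readers who find the IL hard to picture, the prologue injected by the listing above is roughly what you would get by writing the following C# by hand at the top of each traced method. This is only a minimal sketch, not the actual injected output: TraceLogger, Log, Demo and Transfer are hypothetical names used for illustration, and the real logging method's signature may differ.

```csharp
using System;

// Hypothetical stand-in for the logging method that the injector calls.
public static class TraceLogger
{
    public static void Log(object[] args)
    {
        Console.WriteLine("Entered with: " + string.Join(", ", args));
    }
}

public static class Demo
{
    public static void Transfer(int amount, ref decimal balance, string memo)
    {
        // Hand-written equivalent of the injected prologue
        object[] args = new object[3];
        args[0] = amount;   // value type: boxed before being stored (box)
        args[1] = balance;  // ref to a value type: copied out (ldobj), then boxed
        args[2] = memo;     // reference type: stored as is, no boxing needed
        TraceLogger.Log(args);

        // The original method body follows, unchanged
        balance -= amount;
    }

    public static void Main()
    {
        decimal balance = 100m;
        Transfer(25, ref balance, "rent");
    }
}
```

The three assignments correspond to the three cases the injector distinguishes: plain value types, by-reference parameters that point at a value type, and reference types.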

Happy Coding!!

Vande Mataram!

(A salute to motherland)

P.S. In addition to blogging, I use Twitter to share tips, links, etc. My Twitter handle is: @girishjjain




About the author

Girish Jain works on Microsoft .Net framework technologies and is a big fan of WPF, WCF, and LINQ. He is currently based in India with his wife and daughter. When not spending time with family, Girish enjoys creating small tools, utilities, and frameworks to improve developer productivity.
