Friday, November 14, 2008

New features in C#4.0 (3)

Optional and Named Parameters
Optional parameters are a long-standing request from the community that made it into C# 4.0. By itself, the feature is definitely useful, but in conjunction with the mission to make COM interop easier, there's even more value to it.
Optional Parameters
(1) The syntax
C# 4.0 can both declare and consume optional parameters. Here's a sample of a very simple method that declares a parameter as optional:
public static class OptionalDemoLib
{
    public static void SayHello(string s = "Hello World!")
    {
        Console.WriteLine(s);
    }
}
This means you can call SayHello either with one argument or without an argument, in which case the default value is used:
public static class OptionalDemo
{
    public static void Main()
    {
        OptionalDemoLib.SayHello();
        OptionalDemoLib.SayHello("Hello Bart!");
    }
}
Notice that all optional parameters need to come at the end of the parameter list, after any required parameters.
(2) The implementation
How does it work? Let's start by taking a look at the definition side. Here's the IL corresponding to the declaration of SayHello above:
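Roughly, this is what ILDasm shows for SayHello (a sketch from memory; the exact output varies by compiler and ILDasm version):
.method public hidebysig static void SayHello([opt] string s) cil managed
{
    .param [1] = "Hello World!"
    // ... body: ldarg.0, call void [mscorlib]System.Console::WriteLine(string), ret
}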

Two things are relevant here. First of all, the parameter is decorated with [opt]. Second, the method body contains a .param directive. It turns out both of those primitives have been supported in the CLI since the very beginning. Let's dive a little deeper using the CLI specification, partition II:

15.4 Defining methods
opt specifies that this parameter is intended to be optional from an end-user point of view. The value to be supplied is stored using the .param syntax (§15.4.1.4).

15.4.1 Method body
.param `[` Int32 `]` [ `=` FieldInit ] Store a constant FieldInit value for parameter Int32.

15.4.1.4 The .param directive
This directive stores in the metadata a constant value associated with method parameter number Int32, see §22.9. (...) Unlike CIL instructions, .param uses index 0 to specify the return value of the method, index 1 to specify the first parameter of the method, ...

22.9 Constant : 0x0B
The Constant table is used to store compile-time, constant values for fields, parameters, and properties. The Constant table has the following columns: - Type ... - Parent ... - Value (an index into the Blob heap) Note that Constant information does not directly influence runtime behavior, although it is visible via Reflection. Compilers inspect this information, at compile time, when importing metadata, but the value of the constant itself, if used, becomes embedded into the CIL stream the compiler emits. There are no CIL instructions to access the Constant table at runtime.
Note that the default parameter value must be a compile-time constant.
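To illustrate the compile-time constant restriction, a minimal sketch (both methods are hypothetical):
static void Trace(string message = "default", int level = 1) { }    // OK: compile-time constants
// static void Bad(DateTime stamp = DateTime.Now) { }               // error: DateTime.Now is not a compile-time constant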

Named Parameters
(1) The syntax

Assume the following simple subtraction method is defined:
static int Subtract(int a, int b)
{
    return a - b;
}
The typical way to call this method is obviously by specifying the parameters in order, like Subtract(5, 3). However, with named parameters it's possible to write the following:
public static void Main()
{
    Console.WriteLine(Subtract(b: 3, a: 5));
}
Ultimately this translates into a call to Subtract(5, 3). Typically, named parameters are used when the target method has several optional parameters but you only want to supply a few of them:
static void Bar(int a = 1, string b = null, bool c = false)
{
    // ...
}
Now assume you're only interested in the last parameter. Without the named parameter feature you'd have to write Bar(1, null, true), but now you can simply write:
Bar(c: true);
You might wonder why the syntax uses a colon (:) instead of an assignment equals character (=). The answer is straightforward: assignments have a value and can be used everywhere a value is expected:
bool c = false;
Bar(c = true);
This will assign true to the local variable c and feed that value in as the first argument of Bar. So colon is the way to go.
(2) The implementation
It should be clear that the implementation only affects the call site, not the callee. Here's what the Main method from above looks like in IL:
First of all, notice that the names of parameters don't appear in the call site in any way (they never have; that's not the way IL works). Ultimately we simply call Subtract with the arguments supplied in the right order. But how we get there is worth a closer look:
IL_0001: ldc.i4.3
IL_0002: stloc.0
IL_0003: ldc.i4.5
IL_0004: stloc.1
IL_0005: ldloc.1
IL_0006: ldloc.0

The things to notice here are the mirrored stloc (store to local variable) and ldloc (load from local variable) instructions; ldc.i4.<num> pushes <num> onto the stack as an Int32. At IL_0002 and IL_0004 the values are stored to locals 0 and 1, while at IL_0005 and IL_0006 they're read back in reverse order. What's happening here is that during the run-time processing of a function member invocation, the expressions of an argument list are evaluated in order, from left to right, so the compiler has to introduce these locals to feed the values to the right parameters.
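You can observe this left-to-right evaluation when the arguments have side effects; a small sketch building on the Subtract method above (the Trace helper is hypothetical):
static int Trace(string label, int value)
{
    Console.Write(label);   // executes in lexical (evaluation) order
    return value;
}

// Prints "ba" and returns 2: the arguments are evaluated left to right as written,
// then the compiler's hidden locals feed them to the parameters in declaration order.
int result = Subtract(b: Trace("b", 3), a: Trace("a", 5));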
New features in C#4.0 (2)

Co- and Contra-variance
Within the type system of a programming language, an operator from types to types is covariant if it preserves the ordering, ≤, of types, which orders types from more specific ones to more generic ones; it is contravariant if it reverses this ordering. If neither of these applies, the operator is invariant.
This distinction is important in considering argument and return types of methods in class hierarchies. In object-oriented languages such as C++, if class B is a subtype of class A, then all member functions of B must return the same or a narrower set of types than A; the return type is said to be covariant. On the other hand, the member functions of B must take the same or a broader set of arguments compared with the member functions of A; the argument type is said to be contravariant.
Let’s take the following code as an example.
string[] objs = new string[3];
Process(objs);

static void Process(object[] objs)
{
    objs[0] = new ArgumentException();
}
You can code this in C# and compile it successfully, because arrays in .NET are covariant. But apparently it's not safe: at run time the assignment throws an ArrayTypeMismatchException, because the array is really a string[].
But if you do this:
List<string> strList = new List<string>();
IEnumerable<object> test3 = strList;
You will get a compile error, because up to C# 3.0 generics are invariant. C# 4.0 provides safe co- and contravariance:
public interface IReader<out T>
{
    T Read();
}
Through the out keyword, you allow the type parameter T to be used only in output positions (such as return types) of the methods declared on the interface.
Contravariance is also supported, through the in keyword. When this is used, the type parameter can only be used in input positions (method arguments):
public interface IWriter<in T>
{
    void Write(T thing);
}
Section II.9.5 in the CLI specification covers this; the only difference from C# 4.0 is that + is used to denote covariance and - is used to denote contravariance.
Limitation
Variant type parameters can only be declared on interfaces and delegate types, due to a restriction in the CLR. Variance only applies when there is a reference conversion between the type arguments. For instance, IEnumerable<int> is not an IEnumerable<object>, because the conversion from int to object is a boxing conversion, not a reference conversion.
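A quick sketch of both directions (IEnumerable<out T> and Action<in T> are declared variant in .NET 4.0):
IEnumerable<string> strings = new List<string> { "a", "b" };
IEnumerable<object> objects = strings;    // covariance: string -> object is a reference conversion

Action<object> printAny = o => Console.WriteLine(o);
Action<string> printString = printAny;    // contravariance: a consumer of object can consume strings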

Thursday, November 13, 2008

New features in C#4.0 (1)

Dynamic Programming
The major theme for C# 4.0 is dynamic programming. Increasingly, objects are “dynamic” in the sense that their structure and behavior are not captured by a static type, or at least not one that the compiler knows about when compiling your program. Some examples include:
a. objects from dynamic programming languages, such as Python or Ruby
b. ordinary .NET types accessed through reflection
c. objects with changing structure, such as HTML DOM objects
In order to clarify this, I copy a description of dynamic programming languages from Wikipedia:

Dynamic programming language is a term used broadly in computer science to describe a class of high-level programming languages that execute at runtime many common behaviors that other languages might perform during compilation, if at all.

C# 4.0 features a new dynamic keyword that allows you to mix in a bit of late-bound code in the midst of your otherwise statically typed code in an extensible way. This helps clean up the string-based programming mess that is characteristic of late-bound code.
The dynamic keyword can be used when declaring a variable, method return type, or parameter. It is used to declare that the static type of the thing is dynamic. When a variable is marked as being dynamic, C# won’t bother to resolve calls made on that object at compile-time: instead it will delay all method resolution to run time, but will use exactly the same algorithm as it would have used, so that overload resolution works correctly. Not only method calls, but also field and property accesses, indexer and operator calls and even delegate invocations can be dispatched dynamically.
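A minimal sketch (GetCalculator, Add, and Memory are hypothetical members resolved at run time):
dynamic calc = GetCalculator();   // hypothetical factory; the static type of calc is dynamic
int sum = calc.Add(10, 20);       // method resolution deferred to run time
calc.Memory = sum;                // property access dispatched dynamically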
Runtime Lookup
At runtime a dynamic operation is dispatched according to the nature of its target object d:
COM objects
If d is a COM object, the operation is dispatched dynamically through COM IDispatch. This allows calling COM types that don’t have a Primary Interop Assembly (PIA), and relying on COM features that don’t have a counterpart in C#, such as indexed properties and default properties.
Dynamic objects
If d implements the interface IDynamicObject, d itself is asked to perform the operation. Thus, by implementing IDynamicObject, a type can completely redefine the meaning of dynamic operations. This is used intensively by dynamic languages such as IronPython and IronRuby to implement their own dynamic object models. It will also be used by APIs, e.g. by the HTML DOM, to allow direct access to the object’s properties using property syntax.
Plain objects
Otherwise d is a standard .NET object, and the operation will be dispatched using reflection on its type and a C# “runtime binder” which implements C#’s lookup and overload resolution semantics at runtime. This is essentially a part of the C# compiler running as a runtime component to “finish the work” on dynamic operations that was deferred by the static compiler.
Dynamic Language Runtime
An important component in the underlying implementation of dynamic lookup is the Dynamic Language Runtime (DLR), which is a new API in .NET 4.0. The DLR is an ongoing effort from Microsoft to provide a set of services that run on top of the CLR and unify language services for several different dynamic languages. These services include:
a. A dynamic type system, to be shared by all languages utilizing the DLR services
b. Dynamic method dispatch
c. Dynamic code generation
d. Hosting API
The core of the DLR includes expression trees, dynamic dispatch, and call site caching. Expression trees were introduced with LINQ; in C# 4.0 they grow up to support statements. Dynamic dispatch is about dispatching dynamic invocations to different binders, which allows us to communicate with different technologies. Call site caching is for efficiency: if a call site is used many times, the resolution is done only once and the result is cached.
By having several dynamic language implementations share a common underlying system, it should be easier to let these implementations interact with one another. For example, it should be possible to use libraries from any dynamic language in any other dynamic language. In addition, the hosting API allows interoperability with statically typed .NET languages like C#. And the DLR will be used to implement dynamic languages like Python and Ruby on the .NET Framework. The DLR services are currently used in the development versions of IronRuby, a .NET implementation of the Ruby language, and the upcoming IronPython 2.0.




IDynamicObject interface
The basis of dynamic resolution is the IDynamicObject interface, shared with the Dynamic Language Runtime. You can customize the dynamic process by implementing the IDynamicObject interface, which lets you participate in how method calls and property accesses are resolved. It has methods like Invoke, which the compiler will call to allow the object itself to participate in method resolution. As well as allowing easy interaction with other dynamic languages such as IronPython and IronRuby, this also has the benefit of making COM interop much more natural. Rather than having to resort to reflection, dynamic typing allows natural-looking code all the way.
public abstract class DynamicObject : IDynamicObject
{
    public virtual object GetMember(GetMemberBinder info);
    public virtual object SetMember(SetMemberBinder info, object value);
    public virtual object DeleteMember(DeleteMemberBinder info);

    public virtual object UnaryOperation(UnaryOperationBinder info);
    public virtual object BinaryOperation(BinaryOperationBinder info, object arg);
    public virtual object Convert(ConvertBinder info);

    public virtual object Invoke(InvokeBinder info, object[] args);
    public virtual object InvokeMember(InvokeMemberBinder info, object[] args);
    public virtual object CreateInstance(CreateInstanceBinder info, object[] args);

    public virtual object GetIndex(GetIndexBinder info, object[] indices);
    public virtual object SetIndex(SetIndexBinder info, object[] indices, object value);
    public virtual object DeleteIndex(DeleteIndexBinder info, object[] indices);

    MetaObject IDynamicObject.GetMetaObject();
}


var & dynamic
Please pay attention to the difference between var and dynamic. In C#3.0, there is a new keyword var. Local variables can be given an inferred "type" of var instead of an explicit type. The var keyword instructs the compiler to infer the type of the variable from the expression on the right side of the initialization statement. The inferred type may be a built-in type, an anonymous type, a user-defined type, or a type defined in the .NET Framework class library.
But if the static type is dynamic, you get the following behavior:
a. Member selection is deferred to run time;
b. The actual type is substituted for dynamic at run time;
c. The static result type of a dynamic operation is itself dynamic.
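A small sketch of the contrast:
var s = "hello";       // compile-time inference: s is statically typed as string
int n1 = s.Length;     // bound and type-checked at compile time

dynamic d = "hello";   // the static type of d is dynamic
int n2 = d.Length;     // bound at run time; the result is dynamic, converted to int at run time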

Tuesday, November 04, 2008

What WCF client does ... (1)

Recently, I have been curious about what the WCF client really does when it creates a proxy object (ClientBase) or uses the ChannelFactory directly.
If a service reference is added through Visual Studio, it automatically creates a service client for you, which implements the service contract and inherits from ClientBase<TChannel>; note that the generic type TChannel is not an IChannel, it is the service contract type. The following diagram is the class diagram for ClientBase.

It is just a snapshot; actually, there are many other members and methods. I only show you two important members: ChannelFactoryRef and EndpointTrait. When you create the client object, the ClientBase constructor creates an EndpointTrait object, which is used to create the ChannelFactory. After that, ClientBase initializes the ChannelFactoryRef object.
Indeed, there is a cache called factoryRefCache, of type ChannelFactoryRefCache, which is initialized in the static constructor of ClientBase. When ClientBase initializes the ChannelFactoryRef object, it tries to get the object from the cache first; if the state of the cached object isn't Opened, meaning the object cannot be used, it removes it from the cache and creates a new one; otherwise it uses the cached object and adds one more reference. So multiple clients for the same service may share the same ChannelFactory.
Let’s see what happens when it needs to create a new object. It uses the EndpointTrait object to create the ChannelFactory object, and passes it to the ChannelFactoryRef constructor. As you may have guessed, ChannelFactoryRef is just a reference-counting wrapper for the ChannelFactory. After that, ClientBase delegates all the work to the inner ChannelFactory object.
What is the job of the ChannelFactory? It creates the service endpoint first; then, it needs to configure this service endpoint. I use a diagram to show you the process.

From the diagram: it searches for a configuration file first; if there is such a file, it reads the configuration from it (LoadChannelBehaviors); if not, it uses some common configuration (LoadCommonClientBehaviors). In the LoadBehaviors() method, it gets all the IEndpointBehavior objects. Behaviors are the most useful extensibility point in WCF, and IEndpointBehavior lets you extend the client side. If you want to apply your behavior to the client, you need to add your IEndpointBehavior to the IEndpointBehavior collection of the ServiceEndpoint before ChannelFactory.Open() is called, as sketched below.
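A minimal sketch of that ordering (IMyService, MyEndpointBehavior, and the endpoint configuration name are hypothetical):
ChannelFactory<IMyService> factory = new ChannelFactory<IMyService>("myEndpointConfig");
factory.Endpoint.Behaviors.Add(new MyEndpointBehavior());  // must happen before Open()
IMyService proxy = factory.CreateChannel();                // CreateChannel ensures the factory is opened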
Now we need to use ChannelFactory.CreateChannel(); here is what happens.

It first ensures the communication state is Opened. If the state isn't Opened, it calls Open(); in this step it initializes the channel and applies some configuration (calling Opening()). Let's see what happens in detail.

The ComputeContractRequirements() method inspects the ContractDescription and collects information from it, such as the session mode and whether an operation is one-way. In the BuildProxyBehavior() method, it gets the service endpoint's contract description; then, based on the operation descriptions of the contract, it determines which operation type should be built. If the operation is initiated by the server, it builds a dispatch operation (BuildDispatchOperation), because the client acts as a receiver; otherwise, it builds a proxy operation (BuildProxyOperation), because the client acts as a sender. In both methods, it adds the operation to the corresponding collection of the ClientRuntime. The next step is to apply the client behaviors, which is why I say you should add your IEndpointBehavior to the collection before you open the factory. In the BuildOperations() method, the IOperationBehavior objects are applied according to their operation type: ApplyClientBehavior or ApplyDispatchBehavior. ComputeRequiredChannels() uses the results from ComputeContractRequirements() to determine which channel type it needs. Then it compares these requirements with the Binding; if something mismatches, it throws an exception. That's all that happens before it opens the channel.
In order to get the service client, it creates a ServiceChannel object; the CreateProxy() method creates the service client, and the ServiceChannelProxy inherits from RealProxy, which comes from .NET Remoting. The client uses this proxy to communicate with the service. The WCF client runtime actually intercepts every WCF service call by injecting a TransparentProxy and a ServiceChannelProxy between the WCF service call site and the underlying channel stack. The reason WCF implements the client runtime this way is that it needs to distinguish between normal method invocations and WCF service invocations on service proxy objects.
OK, the channel is prepared and we are ready to invoke the service; we can call the service through the client just as if we were invoking a local method. It is something like RPC or RMI, but there are many differences; I will explain what happens in this process in a future post.
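For completeness, a hypothetical generated client (MyServiceClient and Echo are placeholders for what svcutil would generate) is used like a local object:
MyServiceClient client = new MyServiceClient();  // derives from ClientBase<IMyService>
string answer = client.Echo("hello");            // intercepted by the proxy, sent through the channel stack
client.Close();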

Sunday, August 10, 2008

Internal of DependencyProperty (2)

Set the value of DP
The preceding sections showed how to register a DP and what .NET does for you. If we want to set the value of MyContent, we just need to do this:
this.TestCtrl1.MyContent = "Button Clicked!";
According to the definition of this property, you can see that it invokes the SetValue method internally.
What happens in SetValue…?
It takes three steps to do this, as shown in the following diagram.

[1.1] A DP can only be set by the thread that created it; this is ensured in VerifyAccess, which calls the Dispatcher.VerifyAccess method. The Dispatcher maintains a prioritized queue of work items for a specific thread.
When a Dispatcher is created on a thread, it becomes the only Dispatcher that can be associated with that thread, even if the Dispatcher is shut down. Dispatcher.VerifyAccess checks whether the current thread is the one that created it.
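A small sketch of this thread-affinity rule, reusing TestCtrl1 from the example above (a worker thread must marshal the set through the Dispatcher):
new Thread(() =>
{
    // this.TestCtrl1.MyContent = "boom";   // would throw: wrong thread
    this.Dispatcher.Invoke(DispatcherPriority.Normal, new Action(() =>
    {
        this.TestCtrl1.MyContent = "set on the UI thread";
    }));
}).Start();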

[1.2] It gets the PropertyMetadata with the help of GetMetadata, which was discussed previously.

[1.3] It is responsible for setting the value. If you have read WPF Unleashed, you may remember the following diagram.

It illustrates the five-step process that WPF runs each dependency property through in order to calculate its final value.
SetValueCommon
First, let’s take a look at the sequence diagram.

One thing to know is that all the values of dependency properties are stored in _effectiveValues, which is an instance field of DependencyObject. That means that before setting the value, it needs to search this field first in order to find the existing value for this property. This step is done by the LookupEntry method [1.1], which returns an EntryIndex: the location of the property in _effectiveValues. _effectiveValues is an array of EffectiveValueEntry, which contains the value of the property and also includes some methods to modify the value. These methods will be discussed later.

[1.2] The next step is to validate the value that you're going to set.

[1.3] Since it got the location in step 1.1, it can use the index to get the EffectiveValueEntry.

[1.4] It creates a new EffectiveValueEntry object, which can be used to store the new value.

[1.5] The UpdateEffectiveValue method finishes the five-step process.

After the final value is calculated, the value is stored in the _effectiveValues field.
Get the value of DP
You can retrieve the value through the GetValue method.

Tuesday, August 05, 2008

Internal of DependencyProperty (1)

How to create a new DependencyProperty?
Let’s start by creating a new DependencyProperty. This example is simple: a new UserControl is created, named TestCtrl. In this UserControl, I create a new DependencyProperty as in the following code.
public partial class TestCtrl : UserControl
{
    public static readonly DependencyProperty ContentProperty;

    static TestCtrl()
    {
        TestCtrl.ContentProperty = DependencyProperty.Register("MyContent",
            typeof(string), typeof(TestCtrl),
            new FrameworkPropertyMetadata(string.Empty, new PropertyChangedCallback(OnContentChanged)),
            new ValidateValueCallback(OnDefaultValueValidated));
    }

    public TestCtrl()
    {
        InitializeComponent();
    }

    public string MyContent
    {
        get { return (string)GetValue(TestCtrl.ContentProperty); }
        set { SetValue(TestCtrl.ContentProperty, value); }
    }

    private static void OnContentChanged(DependencyObject obj, DependencyPropertyChangedEventArgs e)
    {
        MessageBox.Show(string.Format("Old Value: {0}; New Value: {1}", e.OldValue, e.NewValue));
        TestCtrl current = obj as TestCtrl;
        if (current != null)
        {
            current.TestBlock.Text = (string)e.NewValue;
        }
    }

    private static bool OnDefaultValueValidated(object value)
    {
        MessageBox.Show("Validated Successfully");

        return true;
    }
}
It’s easy, isn’t it? Do you want to know what .NET does for you?
What does Register do…?
From this simple example, it’s easy to define a DependencyProperty, because .NET does a lot for you. Let’s see what it does.

When the Register method is invoked, it invokes three methods internally.
1. RegisterParameterValidation. This method is going to validate the input parameters and ensure that all the arguments are set properly.
2. RegisterCommon. This method creates DependencyProperty and stores it.
3. OverrideMetadata. This method merges the PropertyMetadata with the PropertyMetadata of the base type of your DependencyObject.
RegisterParameterValidation is simple, so it won’t be discussed here. In this section, I pay more attention to RegisterCommon and OverrideMetadata.
RegisterCommon
The following diagram is the sequence diagram of this method.





I am going to analyze this method step by step.
[1.3.1] It creates a FromNameKey object, which contains the name of the DependencyProperty and the owner type (the class where this DependencyProperty is defined). This FromNameKey object is used as a key to store this dependency property.

[1.3.2] Before any work, it first checks whether this property has already been defined. PropertyFromName is a static field defined in DependencyProperty, and its type is Hashtable; its keys are FromNameKey objects and its values are DependencyProperty objects.

[1.3.3] In the ValidateMetadataDefaultValue method, it validates the default value of the PropertyMetadata. In our example, this default value is String.Empty. After the validation, it invokes the ValidateValueCallback; in our example, that is the method named OnDefaultValueValidated.

[1.3.4] If the validation succeeds, it creates the DependencyProperty object.

[1.3.5] PropertyMetadata::Seal is invoked, which calls the OnApply method internally. OnApply is called when the metadata has been applied to a property, which indicates that the metadata is being sealed.

[1.3.6] The new dependency property is added to PropertyFromName.
OverrideMetadata

Once the DP (DependencyProperty) has been added to PropertyFromName, there is one more thing to do with the PropertyMetadata, which is done in OverrideMetadata.


The discussion follows the same approach.


[1.1] The SetupOverrideMetadata method uses the owner type and the PropertyMetadata passed to the DependencyProperty constructor to get the DependencyObjectType of the owner type and the PropertyMetadata of the base class of the current DependencyObject. DependencyObjectType represents a specific underlying system (CLR) Type of a DependencyObject; it is essentially a wrapper for the CLR Type that extends some of its capabilities.

[1.1.2] As said in step 1.1, SetupOverrideMetadata returns a PropertyMetadata object for the base class of the current DependencyObject. This work is done by the GetMetadata method. In step 1.1, the DependencyObjectType of the owner type was obtained; through it, the DependencyObjectType of the base type can be obtained, which GetMetadata uses to get the corresponding PropertyMetadata.
How do we get the PropertyMetadata for the base class? All the PropertyMetadata objects are stored in the _metadataMap field, which is defined as an instance field in DependencyProperty. Notice that this is an instance field, which means it contains the PropertyMetadata objects that belong to the current property. The PropertyMetadata can be found in this field using DependencyObjectType.Id as the key. How this PropertyMetadata of the base class is used will be discussed later.

[1.2] The ProcessOverrideMetadata method stores this object in the _metadataMap field and merges the PropertyMetadata with the PropertyMetadata of the base class, obtained in step 1.1.2.

[1.2.2] The InvokeMerge method merges the PropertyMetadata with the PropertyMetadata of the base class, obtained in step 1.1.2 (a usage sketch follows the list below). The following values are copied from the base metadata to the current metadata:
1. The default value;
2. PropertyChangedCallback invocation list;
3. CoerceValueCallback.
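As a usage sketch, this merge is what a derived class relies on when it overrides metadata (FancyCtrl and the new default value are hypothetical):
public class FancyCtrl : TestCtrl
{
    static FancyCtrl()
    {
        // Only the default value is replaced; the PropertyChangedCallback invocation
        // list and CoerceValueCallback are merged in from TestCtrl's metadata.
        TestCtrl.ContentProperty.OverrideMetadata(
            typeof(FancyCtrl),
            new FrameworkPropertyMetadata("fancy default"));
    }
}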

This is the whole process of registering a new DependencyProperty.

Next post, I am going to talk about the GetValue process in DependencyProperty.

Saturday, July 05, 2008

In order to solve some problems that happened in my application, I decided to dig into the AppDomain. I found some interesting information, so I am writing it down.

Keys of AppDomain

The CLR scopes all objects, values, and object references to a particular AppDomain. An object resides in exactly one AppDomain, as do values. Moreover, object references must refer to objects in the same AppDomain. Like objects, types reside in exactly one AppDomain. If two AppDomains need to use a type, one must initialize and allocate the type once per AppDomain. Additionally, one must load and initialize the type’s module and assembly once for each AppDomain the type is used in.

Besides, unloading an AppDomain is the only way to unload a module or assembly. Unloading an AppDomain is also the only way to reclaim the memory consumed by a type’s static fields.

I also found some interesting things about the Thread class in the .NET Framework. System.Threading.Thread represents a schedulable entity in an AppDomain. It is a soft thread; it is not recognized by the underlying OS. OS threads are referred to as hard threads. There is no one-to-one relationship between hard threads and CLR soft thread objects. A CLR soft thread object resides in exactly one AppDomain. In the current implementation of the CLR, a given hard thread can have at most one soft thread object affiliated with it for a given AppDomain, and the CLR maintains a per-AppDomain thread table to ensure that a given hard thread is affiliated with only one soft thread object per AppDomain.

We can use SetData and GetData to share information between AppDomains. The AppDomain.DoCallBack method allows you to specify a method on a type that will be executed in the foreign domain. This method must comply with the CrossAppDomainDelegate signature and must be static. Both are sketched below.
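A minimal sketch of both techniques (the domain name and data key are arbitrary):
AppDomain worker = AppDomain.CreateDomain("Worker");
worker.SetData("greeting", "hello from the default domain");
worker.DoCallBack(new CrossAppDomainDelegate(PrintGreeting));

static void PrintGreeting()
{
    // Runs inside the "Worker" domain
    Console.WriteLine(AppDomain.CurrentDomain.GetData("greeting"));
}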

Table 1 lists several useful AppDomain events.

Table 1 AppDomain Events

Event Name | EventArg Properties | Description
AssemblyLoad | Assembly LoadedAssembly | Assembly has just been successfully loaded
AssemblyResolve | string Name | Assembly reference cannot be resolved
TypeResolve | string Name | Type reference cannot be resolved
ResourceResolve | string Name | Resource reference cannot be resolved
DomainUnload | None | Domain is about to be unloaded
ProcessExit | None | Process is about to shut down
UnhandledException | bool IsTerminating, object ExceptionObject | Exception escaped thread-specific handlers

Table 2 lists the properties of an AppDomain that are used by the assembly resolver.

Table 2 AppDomain Environment Properties

AppDomainSetup Property | Get/SetData Name | Description
ApplicationBase | APPBASE | Base directory for probing
ApplicationName | APP_NAME | Symbolic name of application
ConfigurationFile | APP_CONFIG_FILE | Name of .config file
DynamicBase | DYNAMIC_BASE | Root of codegen directory
PrivateBinPath | PRIVATE_BINPATH | Semicolon-delimited list of subdirs
PrivateBinPathProbe | BINPATH_PROBE_ONLY | Suppress probing at APPBASE ("*" or null)
ShadowCopyFiles | FORCE_CACHE_INSTALL | Enable/disable shadow copy (Boolean)
ShadowCopyDirectories | SHADOW_COPY_DIRS | Directories to shadow-copy from
CachePath | CACHE_BASE | Directory to shadow-copy to
LoaderOptimization | LOADER_OPTIMIZATION | JIT-compile per-process or per-domain
DisallowPublisherPolicy | DISALLOW_APP | Suppress component-supplied version policy

AppDomain Property | Description
BaseDirectory | Alias to AppDomainSetup.ApplicationBase
RelativeSearchPath | Alias to AppDomainSetup.PrivateBinPath
DynamicDirectory | Directory for dynamic assemblies
FriendlyName | Name of AppDomain used in debugger

An AppDomain can also affect the way the JIT compiler works. When the CLR initializes an AppDomain, it accepts a loader optimization flag (System.LoaderOptimization) that controls how code is JIT-compiled for modules loaded by that AppDomain. As shown in Table 3, this flag has three possible values.

Table 3 LoaderOptimization Enumeration/Attribute

Value | Expected Domains in Process | Each Domain Expected to Run ... | Code for MSCORLIB | Code for Assemblies in GAC | Code for Assemblies not in GAC
SingleDomain | One | N/A | Per-process | Per-domain | Per-domain
MultiDomain | Many | Same Program | Per-process | Per-process | Per-process
MultiDomainHost | Many | Different Programs | Per-process | Per-process | Per-domain

Marshal and Unmarshal

As I have mentioned before, all objects, values, and object references are scoped to a particular AppDomain. When one needs to pass a reference or value to another AppDomain, one must first marshal it. .NET provides RemotingServices.Marshal to do this job. It takes an object reference of type System.Object and returns a serializable System.Runtime.Remoting.ObjRef object that can be passed in serialized form to other AppDomains. Upon receiving a serialized ObjRef, one can obtain a valid object reference using the RemotingServices.Unmarshal method. When calling AppDomain.SetData on a foreign AppDomain, the CLR calls RemotingServices.Marshal. Similarly, calling AppDomain.GetData on a foreign AppDomain returns a marshaled reference, which is converted via RemotingServices.Unmarshal just prior to the method's completion.

If a type derives from System.MarshalByRefObject, it is AppDomain-bound, which means the type will marshal by reference, and the CLR will give the receiver of the marshaled object (reference) a proxy that remotes all member access back to the object’s home AppDomain. Note that the proxies never remote static methods. If a type doesn’t derive from MarshalByRefObject but does support object serialization (System.Serializable), it is considered unbound to any AppDomain and will marshal by value: the CLR will give the receiver of the marshaled object (reference) a disconnected clone of the original object.
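A minimal sketch of the two marshaling behaviors (the class names are illustrative):
public class Counter : MarshalByRefObject   // AppDomain-bound: receivers get a proxy
{
    private int value;
    public void Increment() { value++; }    // always runs in the object's home domain
}

[Serializable]
public class Snapshot                       // unbound: receivers get a disconnected clone
{
    public int Value;
}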

Marshaling typically happens implicitly when a call is made to a cross-AppDomain proxy. The CLR marshals the input parameters to the method call into a serialized request message that it sends to the target AppDomain. When the target AppDomain receives the serialized request, it first deserializes the message and pushes the parameters onto a new stack frame. After the CLR dispatches the method to the target object, it marshals the output parameters and return value into a serialized response message that it sends back to the caller's AppDomain, where the CLR unmarshals them and places them back on the caller's stack. The following diagram shows the architecture.

Figure 1. Cross-Domain Method Calls

When one uses cross-AppDomain proxies, both AppDomains must have access to the same assemblies. Moreover, when the two AppDomains reside on different machines, both machines must have access to the shared types’ metadata.

CreateInstance

Now, let’s see what happens when we use the AppDomain.CreateInstance method to create a remote object in another AppDomain. This method only returns an object handle, which is similar to a marshaled object reference. Because it returns an object handle rather than a real object reference, it avoids requiring metadata in the caller’s AppDomain; in other words, the caller’s AppDomain does not need to load the assembly. An attempt to call CreateInstance on a target application domain that is not the current application domain will result in a successful load of the assembly in the target application domain. Since an Assembly is not a MarshalByRefObject, when this method attempts to return the Assembly for the loaded assembly to the current application domain, the common language runtime will try to load the assembly into the current application domain, and the load might fail. Actually, AppDomain.CreateInstance delegates the work to Activator.CreateInstance; internally, Activator loads the assembly first, then it tries to find the class’s public constructors; after that, it checks whether the type can be marshaled. Then it uses RuntimeType (an internal class) to create the object. Finally, it wraps this object in an ObjectHandle and returns that ObjectHandle back to the parent AppDomain.
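A small sketch (the assembly and type names are hypothetical):
AppDomain worker = AppDomain.CreateDomain("Worker");
ObjectHandle handle = worker.CreateInstance("MyPlugins", "MyPlugins.Plugin");
// Unwrap only where the type's metadata is available; unwrapping here
// forces the assembly to load into the current domain as well.
object plugin = handle.Unwrap();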

Unload

Another thing I am concerned about is how to unload an AppDomain. The AppDomain.Unload method is used to unload the whole AppDomain. In .NET Framework version 2.0 there is a thread dedicated to unloading application domains. This improves reliability, especially when the .NET Framework is hosted. When a thread calls Unload, the target domain is marked for unloading. The dedicated thread attempts to unload the domain, and all threads in the domain are aborted. If a thread does not abort, for example because it is executing unmanaged code or a finally block, then after a period of time a CannotUnloadAppDomainException is thrown in the thread that originally called Unload. If the thread that could not be aborted eventually ends, the target domain is not unloaded. Thus, in .NET Framework 2.0 a domain is not guaranteed to unload, because it might not be possible to terminate executing threads. I tried to use Reflector to dig into the .NET code and found that Unload needs to get the AppDomain ID first. It uses RemotingServices.IsTransparentProxy to see whether the AppDomain reference is a transparent proxy or a real object. If it is a proxy, it gets the AppDomain ID from the proxy by invoking the RemotingServices.GetServiceDomainIdForProxy method; otherwise, it reads the ID directly from AppDomain.Id. After that, it can use this ID to unload the AppDomain. But I can’t find out what really happens internally.
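A minimal unload sketch:
AppDomain worker = AppDomain.CreateDomain("Worker");
// ... load assemblies and run code in the worker domain ...
try
{
    AppDomain.Unload(worker);   // threads in the domain are aborted
}
catch (CannotUnloadAppDomainException)
{
    // a thread could not be aborted in time; the domain may still be alive
}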


Tuesday, July 01, 2008

The Principles of Object-Oriented Design
Most OO developers have heard about the principles of OOD: Open-Closed, Liskov Substitution, Dependency Inversion, Interface Segregation, and Single Responsibility. Last weekend, I spent two days reading and thinking about these principles.
This article contains my summary of these principles.
1. The Open-Closed Principle
I think the open-closed principle is the basis of the other principles. This principle is at the heart of many of the claims made for OOD.
As Martin said, modules that conform to the open-closed principle have the following two primary attributes:
1. Open For Extension. This means that the behavior of the module can be extended. The module behavior can be changed to new and different ways as the requirements of the application change, or to meet the needs of new applications.
2. Closed for Modification. The source code of such a module is inviolate. No one is allowed to make source code changes to it.
In this principle, abstraction is the key. Using abstraction, you can gain explicit closure. It’s possible to create abstractions that are fixed and yet represent an unbounded group of possible behaviors, represented by all the possible derived classes. If a module manipulates an abstraction, that module can be closed for modification, since it depends upon the abstraction, not the implementation details.
At the end of Martin’s article, he mentions that conformance to this principle isn’t achieved simply by using an object-oriented language; rather, it requires a dedication on the part of the designer to apply abstraction to those parts of the program that the designer feels are going to be subject to change.
Actually, the primary mechanisms behind the open-closed principle are abstraction and polymorphism.
2. The Liskov Substitution Principle
In object-oriented languages, one of the key mechanisms that supports abstraction and polymorphism is inheritance. By using inheritance, we can create derived classes that conform to abstract interfaces, which is the key to this principle. According to Martin’s article, the definition of this principle is:
Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it.
If a function doesn’t conform to this principle, it definitely violates the open-closed principle, because it must be modified whenever a new derivative of the base class is created. This principle provides guidance for the use of public inheritance.
His article gives an example of a square and a rectangle. The Square class inherits from the Rectangle class, which seems valid. But consider a method that takes a reference to a Rectangle object as a parameter:
void Func(Rectangle& r)
{
    r.SetWidth(5);
    r.SetHeight(4);
}
If we pass a reference to a Square object to this method, the problem shows up, since the width and height must be the same for a square. This leads us to a very important conclusion: a model, viewed in isolation, cannot be meaningfully validated. The validity of a model can only be expressed in terms of its clients. Thus, when considering whether a particular design is appropriate or not, one must not simply view the solution in isolation. One must view it in terms of the reasonable assumptions that will be made by the users of that design.
This principle makes clear that in OOD the ISA relationship pertains to behavior. Not intrinsic private behavior, but extrinsic public behavior; behavior that clients depend upon. It’s important when we decide whether or not one class should inherit from another class.
And when redefining a method in a derived class, its behavior and outputs must not violate any of the constraints established for the base class. Users of the base class must not be confused by the output of the derived class.
3. The Dependency Inversion Principle
Traditional software development methods, such as Structured Analysis and Design, tend to create software structures in which high-level modules depend upon low-level modules, and abstractions depend upon details. Indeed, one of the goals of these methods is to define the subprogram hierarchy that describes how the high-level modules make calls to the low-level modules. But when the low-level modules are changed, they force the high-level modules to change. And it’s hard to reuse the high-level modules, because they depend upon the low-level modules.
A design is bad if it exhibits any or all of the following three traits:
1. Rigidity. It’s hard to change because every change affects too many other parts of the system.
2. Fragility. When you make a change, unexpected parts of the system break.
3. Immobility. It’s hard to reuse in another application because it can’t be disentangled from the current application.
Conforming to the dependency inversion principle can solve these issues. Any module conforming to this principle meets the following requirements.
A. High level modules should not depend upon low level modules. Both should depend upon abstractions.
B. Abstractions should not depend upon details. Details should depend upon abstractions.
Under this principle, the high-level modules depend only upon the abstractions of the low-level modules; when the implementations of the low-level modules are changed, the high-level modules don’t need to change, since the abstractions of the low-level modules are fixed. And because the high-level modules are independent of the low-level modules, they can be reused quite simply. It’s impossible for a module that doesn’t conform to this principle to comply with the open-closed principle, since any change to a low-level module can force the module to change.
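A minimal C# sketch of the inverted dependency (the names are illustrative):
public interface IMessageSender          // the abstraction both levels depend on
{
    void Send(string message);
}

public class SmtpSender : IMessageSender // low-level detail
{
    public void Send(string message) { /* SMTP delivery */ }
}

public class Notifier                    // high-level module: knows only the abstraction
{
    private readonly IMessageSender sender;
    public Notifier(IMessageSender sender) { this.sender = sender; }
    public void NotifyAll(string message) { sender.Send(message); }
}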
4. The Interface Segregation Principle
The interface segregation principle deals with the disadvantages of “fat” interfaces. Classes that have “fat” interfaces are classes whose interfaces aren’t cohesive. In other words, the interfaces of the class can be broken up into groups of member functions. Each group serves a different set of clients. Thus some clients use one group of member functions, and other clients use the other groups.
This principle acknowledges that there are objects that require non-cohesive interfaces; however, it suggests that clients should not know about them as a single class. Instead, clients should know about abstract base classes that have cohesive interfaces. There is an example in his article: the TimedDoor class extends the Door class, which extends the TimerClient class, as in the following class diagram.






Each time a new interface is added to the base class, that interface must be implemented in the derived classes. Actually, there is an associated practice: adding these interfaces to the base class as nil virtual methods rather than pure virtual methods, specifically so that derived classes are not burdened with the need to implement them. This solution can lead to maintenance and reusability problems. When a change in one part of the program affects other, completely unrelated parts of the program, the cost and repercussions of changes become unpredictable, and the risk of fallout from the change increases dramatically.
ISP provides us a correct solution to this problem:
Clients should not be forced to depend upon interfaces that they don’t use.
When clients are forced to depend upon interfaces that they don’t use, those clients are subject to changes in those interfaces. This results in inadvertent coupling between all the clients. In order to avoid such coupling, we need to separate the interfaces. In Martin’s article, he provides two ways: separation through delegation and separation through multiple inheritance.
4.1 Separation through delegation
We can apply the Adapter pattern to the TimedDoor problem.

In this diagram, there is a new class named DoorTimerAdapter, whose responsibility is to delegate messages from the Timer to the TimedDoor. This solution prevents the coupling of Door clients to Timer, but it involves the creation of a new object. It’s better to use this solution when the Adapter needs to translate the message.
4.2 Separation through multiple inheritance
In this solution, TimedDoor extends both Door and TimerClient.

I prefer this solution. The structure is more meaningful: the TimerClient and Door classes take separate responsibilities, and the TimedDoor class combines these two responsibilities to complete the work, as sketched below.
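In C# terms the same idea looks roughly like this (Martin's example is C++; the member lists are assumptions):
public interface IDoor
{
    void Lock();
    void Unlock();
}

public interface ITimerClient
{
    void TimeOut(int timeOutId);
}

public class TimedDoor : IDoor, ITimerClient  // two cohesive interfaces, one class
{
    public void Lock() { }
    public void Unlock() { }
    public void TimeOut(int timeOutId) { /* close the door */ }
}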
5. The Single Responsibility Principle
Each responsibility is an axis of change. When the requirements change, that change will be manifest through a change in responsibility among the classes. If a class assumes more than one responsibility, that class will have more than one reason to change. How can two responsibilities be separated correctly? That depends on how the application is changing. If the application changes in ways that cause the two responsibilities to change at different times, then they should be separated; otherwise, if they always change at the same time, there is no need to separate them. The conclusion here: an axis of change is an axis of change only if the change actually occurs. It’s not wise to apply this principle, or any other principle for that matter, if there is no symptom.

Friday, June 27, 2008

.NET Reflection
1. Definition
In computer science, reflection is the process by which a computer program can observe and modify its own structure and behavior. In other words, the reflection-oriented programming paradigm adds that program instructions can be modified dynamically at runtime and invoked in their modified state. The program architecture itself can be decided at runtime based upon the data, services, and specific operations that are applicable at runtime.
The reflective programming paradigm introduces the concept of meta-information, which keeps knowledge of the program structure. Meta-information stores information such as the names of the contained methods, the name of the class, the names of parent classes, and/or what the compound statement is supposed to do. Without such information, many tasks would be very obscure or impossible to accomplish.
2. Reflection in .Net
Before diving into reflection, there are two concepts that need to be introduced first: metadata and types. Metadata is used to describe component contracts in the .NET Framework. Types are the building blocks of every CLR program, and the description of a CLR type resides in the metadata. Reflection is achieved based on the metadata.
2.1 Metadata
The CLR begins its life on earth with a fully specified format for describing component contracts. This format is referred to generically as metadata. CLR metadata is machine-readable, and its format is fully specified. Additionally, the CLR provides facilities that let programs read and write metadata without knowledge of the underlying file format. CLR metadata is cleanly and easily extensible via custom attributes, which are themselves strongly typed. CLR metadata also contains component dependency and version information, allowing the use of a new range of techniques to handle component versioning.
Metadata describes all classes and class members that are defined in the assembly, and the classes and class members that the current assembly will call from another assembly. The metadata for a method contains the complete description of the method, including the class (and the assembly that contains the class), the return type and all of the method parameters.
A compiler for the common language runtime (CLR) generates metadata during compilation and stores it (in a binary format) directly in assemblies and modules. The importance of metadata in .NET can hardly be overstated. Metadata allows us to write a component in C# and let another application use the metadata from Visual Basic .NET. The metadata description of a type allows the runtime to lay out an object in memory, to enforce security and type safety, and to ensure version compatibility. The CLR postpones decisions regarding in-memory representations until the type is first loaded at runtime. This makes assemblies in .NET fully self-describing, which allows developers to share components across languages and eliminates the need for header files.
2.2 Type at Runtime
As you know, CLR-based programs are built out of one or more molecules called assemblies. Furthermore, these assemblies are themselves built out of one or more atoms called modules. The atom of the module can be split into subatomic particles called types. Types are the building block of every CLR program. A CLR type is a named, reusable abstraction. The description of a CLR type resides in the metadata of a CLR module.
Every object in the CLR begins with a fixed-size object header, as in the following figure.

The object header has two fields. The first field of the object header is the sync block index. One uses this field to lazily associate additional resources (e.g., locks, COM objects) with the object. The second field of the object header is a handle to an opaque data structure that represents the object's type. This data structure contains a complete description of the type, including a pointer to the in-memory representation of the type’s metadata. Although the location of this handle is undocumented, there is explicit support for it via the System.RuntimeTypeHandle type. As a point of interest, in the current implementation of the CLR, an object reference always points to the type handle field of the object's header. The first user-defined field is always sizeof(void*) bytes beyond where the object reference points to.

Although the type handle and the data structure it references are largely opaque to programmers working with the CLR, most of the information that is stored in this data structure is made accessible to programmers via the System.Type. Here, we come close to the reflection. Reflection makes all aspects of a type’s definition available to programs, both at development time and at runtime.
2.3 System.Reflection
The following diagram shows the reflection object model.




3. Use of Reflection
In order to maximize the runtime flexibility, you can consider reflection and how it can improve your software. System.Reflection is a great framework because it is the .NET core base of several good practices that revolve around the Dynamic / Plug-In / Dependency Injection / Late-Binding kind of patterns.
Typical reflection-centric tasks fall into two categories:
1. Inspection. Inspection entails analyzing objects and types to gather structured information about their definition and behavior.
2. Manipulation. Manipulation uses the information gained through inspection to invoke code dynamically, create new instances of discovered types, or even restructure types and objects on the fly.
From a programmer’s perspective, reflection technology can sometimes blur the conventional distinction between objects and types. For instance, a typical reflection-centric task might be (a concrete sketch follows the list):
1. Start with a handle to an object O and use reflection to acquire a handle to its associated definition, a type T.
2. Inspect type T and acquire a handle to its method, M.
3. Invoke method M on another object, O1.
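The three steps above, as a concrete sketch:
object o = "hello";
Type t = o.GetType();                                    // 1. object -> its type
MethodInfo m = t.GetMethod("ToUpper", Type.EmptyTypes);  // 2. type -> a method handle
object result = m.Invoke("world", null);                 // 3. invoke on another instance
Console.WriteLine(result);                               // WORLD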

3.1 Example: Invoke method dynamically
The Services class provides several operations. The client can send an IDRequest or a NameRequest to the server, according to the service that it wants to use. The server also allows the client to send multiple sub-requests in one request, which is a composite request. The following code is the service definition.
public class Services
{
    public IDReply GetID(IDRequest request)
    {
        return new IDReply();
    }

    public NameReply GetName(NameRequest request)
    {
        return new NameReply();
    }

    public ICollection<IReply> ProcessCompositeRequest(ICollection<IRequest> requestList)
    {
        ICollection<IReply> replyList = new List<IReply>();
        Broker broker = new Broker();

        foreach (IRequest request in requestList)
        {
            IReply reply = broker.ProcessRequest(request);
            replyList.Add(reply);
        }

        return replyList;
    }
}
Broker.ProcessRequest() uses reflection to find the method corresponding to the request:
public IReply ProcessRequest(IRequest request)
{
    IReply reply = null;
    MethodInfo method = FindMethod(request.GetType());

    if (method != null)
    {
        object[] parameters = new object[1];
        parameters[0] = request;
        reply = method.Invoke(service, parameters) as IReply;
    }

    return reply;
}
If this example didn’t use reflection, the handling logic would need to be hard-coded, something like this:
public IReply ProcessRequest(IRequest request)
{
    if (request is IDRequest)
        return service.GetID((IDRequest)request);
    if (request is NameRequest)
        return service.GetName((NameRequest)request);
    return null;
}
This method would become larger and larger as we continue to add new operations to the class, and it would be very easy to miss some request-handling logic in this block, too.
In order to eliminate this boring and error-prone work, I decided to use reflection here.
I use the Broker class to dispatch the sub-requests. First, all the methods of the Services class can be retrieved using the Type.GetMethods() method, as in the following code.
private MethodInfo FindMethod(Type paramType)
{
    MethodInfo[] methodList = null;
    MethodInfo foundMethod = null;

    // Try to find a public method that matches by parameters
    methodList = servicesType.GetMethods(BindingFlags.Instance | BindingFlags.Public);
    foundMethod = FindMethodInList(methodList, paramType);

    return foundMethod;
}
Then, I have to find the method that accepts the request type; this is done in the FindMethodInList() method, as in the following code.
private MethodInfo FindMethodInList(MethodInfo[] methodList, Type paramType)
{
    MethodInfo foundMethod = null;

    foreach (MethodInfo method in methodList)
    {
        // Has exactly one parameter of the expected request type
        if (MethodAcceptsOneParameter(method, paramType))
        {
            foundMethod = method;
        }
    }

    return foundMethod;
}

private static bool MethodAcceptsOneParameter(MethodInfo method, Type paramType)
{
    ParameterInfo[] parameters = method.GetParameters();

    return 1 == parameters.GetLength(0)
        && parameters[0].ParameterType == paramType;
}
In this example, with the help of reflection, the redundant code is reduced and the method to be invoked can be determined at runtime.
3.2 Example: Implement custom attribute
Attributes can be used to achieve declarative programming. According to the definition from Wikipedia, a program is "declarative" if it is written in a purely functional programming language, logic programming language, or constraint programming language. In a declarative program you write (declare) a data structure that is processed by a standard algorithm (for that language) to produce the desired result.
Attributes enhance flexibility in software systems because they promote loose coupling of functionality, and custom attributes let users leverage that loose coupling for their own purposes. Once we have associated our attribute with various source code elements, we can query the metadata of these elements at run time by using the .NET Framework reflection classes, and specific behavior can be attached to those elements.
The NUnit framework defines several attributes, such as TestFixtureAttribute and TestAttribute. If TestFixtureAttribute is applied to a class, the class is a test class; if TestAttribute is applied to a method, the method is recognized as a test method. When NUnit loads an assembly, it can thereby find the test classes and test methods. In this example, I define two attributes:
[AttributeUsage(AttributeTargets.Method)]
public class TestMethodAttribute : Attribute
{
}

[AttributeUsage(AttributeTargets.Class)]
public class TestClassAttribute : Attribute
{
}
TestSuite is defined to load the assembly and find all the test classes, and the test methods in those classes, as in the following code.
public class TestSuite
{
    private readonly IList<TestFixtureBase> collections = new List<TestFixtureBase>();

    public TestSuite(string assemblyFile)
    {
        // LoadFrom rather than ReflectionOnlyLoadFrom: the types are instantiated
        // and their methods invoked, which a reflection-only context doesn't allow.
        Assembly currentAssembly = Assembly.LoadFrom(assemblyFile);
        foreach (Type type in currentAssembly.GetTypes())
        {
            if (IsTestClass(type))
            {
                collections.Add(Activator.CreateInstance(type) as TestFixtureBase);
            }
        }
    }

    // Run all the methods marked with [TestMethod] in all test fixtures
    public void RunTests()
    {
        foreach (TestFixtureBase testFixture in collections)
        {
            Type fixtureType = testFixture.GetType();
            foreach (MethodInfo method in fixtureType.GetMethods())
            {
                if (IsTestMethod(method))
                    method.Invoke(testFixture, null);
            }
        }
    }

    private static bool IsTestClass(Type type)
    {
        return typeof(TestFixtureBase).IsAssignableFrom(type)
            && (type.GetCustomAttributes(typeof(TestClassAttribute), false).Length > 0);
    }

    private static bool IsTestMethod(MethodInfo methodInfo)
    {
        return HasTestAttribute(methodInfo);
    }

    private static bool HasTestAttribute(MethodInfo methodInfo)
    {
        object[] testAttrs = methodInfo.GetCustomAttributes(typeof(TestMethodAttribute), true);
        return testAttrs.Length > 0;
    }
}
In the TestSuite constructor, Assembly.LoadFrom() is used to load the specified assembly (a reflection-only load would not allow instantiating the types). In this example, all the test classes need to extend the TestFixtureBase class, because this base class contains some common methods. The Type.IsAssignableFrom() method is used to determine whether a class extends TestFixtureBase. Then, it uses Type.GetCustomAttributes() to find out whether the test class carries the TestClassAttribute. If it finds a test class, it uses Activator.CreateInstance() to create an instance of the test class and adds it to a list.
In the RunTests() method, it iterates over the list and finds the test methods. At last, it uses MethodInfo.Invoke() to invoke the test methods.
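Hypothetical usage of this little framework (the assembly name is a placeholder):
TestSuite suite = new TestSuite("MyTests.dll");
suite.RunTests();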
4. Drawback
Actually, the drawbacks come from reflection's primary intention of doing late-binding work. As a result, it can't treat code as just raw data, and this leads to several limitations:
1. You cannot unload an assembly once it is loaded into an AppDomain by System.Reflection. But you can unload the whole AppDomain in order to unload the assembly.
2. At any time, browsing the code of an assembly loaded with reflection might trigger a Code Access Security (CAS) exception, because the data you’re playing with is still considered code.
3. It has poor performance (I suspect that CAS security checks play a major role in this performance issue).
4. It consumes a lot of memory (here also I suspect that it is because the CLR considers the data as code), and it is hard to release this memory once you have gone through all the code of an assembly.
5. You cannot load two different versions of an assembly inside one AppDomain.
You can also read the great article from Joel Pobar, "Dodge Common Performance Pitfalls to Craft Speedy Applications", if you want a deeper understanding of how reflection relies internally on caches that make memory grow, and for some benchmarks of the average performance of reflection in general. Since .NET 2.0, System.Reflection supports a kind of read-only mode, but most of the problems persist with this mode.