NET Progress: April 2006

Monday, April 24, 2006

Extension Interfaces

In previous posts, I outlined the versioning problem with extension methods, and proposed a solution. The solution associates each group of extension methods with an interface – an interface which lists all the methods which appear to be "added" to objects.

For example (my suggested addition, shown in bold in the original post, is the ": IFooable" on the static class declaration):

public class Thing
{
  //...
}

public interface IFooable
{
  void Foo();
  void Bar();
}

public static class FooExtender: IFooable
{
  public static void Foo(this Thing t)
  {
    //...
  }

  public static void Bar(this Thing t)
  {
    //...
  }
}

I proposed this solution to solve versioning problems, but it has a convenient side effect: "extension interfaces". Instead of just making it look like an object implements additional methods, "extension interfaces" would make it look like the object implements an additional interface.

Given the above code, we can call the extension methods Foo and Bar as if they exist on type Thing. So why can’t we cast Thing to IFooable, like this:

Thing t = new Thing();
IFooable f = t;           //implicit cast
//or
IFooable f = (IFooable)t; //explicit cast

All we’d need is a little compiler magic. The compiler magic would have to generate a (hidden) class like this:

class FooThingAdapter: IFooable
{
  private Thing target;

  public FooThingAdapter(Thing t)
  {
    target = t;
  }

  public void Foo()
  {
    FooExtender.Foo(target);
  }

  public void Bar()
  {
    FooExtender.Bar(target);
  }
}

Then, when we write this

IFooable f = (IFooable)t;

The compiler would compile it as if we had written:

IFooable f = new FooThingAdapter(t);

This approach is not without its problems. Jon Skeet highlighted the main one here: we actually end up with two objects, the Thing instance and an instance of FooThingAdapter. As Jon says, that seems (and probably is) fundamentally wrong.
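To make Jon's objection concrete, here is a rough sketch of the adapter written by hand in C# as it stands, with no compiler magic. The AsFooable helper is my own stand-in for the proposed cast (it is not part of any spec), and Foo returns a string here only so that there is something to print.

```csharp
using System;

public class Thing { }

public interface IFooable
{
    string Foo();
}

// An ordinary extension class. Writing ": IFooable" here is the proposal,
// not something the compiler accepts today.
public static class FooExtender
{
    public static string Foo(this Thing t) => "Foo on Thing";

    // Hypothetical helper that plays the role of the proposed cast.
    public static IFooable AsFooable(this Thing t) => new FooThingAdapter(t);
}

// Hand-written version of the class the compiler would generate.
public sealed class FooThingAdapter : IFooable
{
    private readonly Thing target;

    public FooThingAdapter(Thing t)
    {
        target = t;
    }

    public string Foo() => FooExtender.Foo(target);
}

public static class AdapterDemo
{
    public static void Main()
    {
        Thing t = new Thing();
        IFooable f = t.AsFooable();

        Console.WriteLine(f.Foo());                      // Foo on Thing
        Console.WriteLine(object.ReferenceEquals(f, t)); // False: two objects
        Console.WriteLine(f is Thing);                   // False: no way back to Thing
    }
}
```

The last two lines are the problem in miniature: the thing you get from the "cast" is not the object you started with, and you cannot cast it back.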

A possible solution to Jon's problem would be to leverage the existing support for TransparentProxy. With TransparentProxies we already have a situation where there are two objects pretending to be one, and there are special branches in the CLR’s logic to take care of it. So, solving the problem with extension interfaces is not out of the question. The difficulty would be in determining whether the existing CLR logic paths could be leveraged, unchanged, to support extension interfaces. Murphy’s Law suggests not, unfortunately.

Might Microsoft add support for extension interfaces? As much as I like the idea, I have to admit that, if it were up to me, I wouldn’t. I would just solve the versioning problem – which must be addressed – and consider introducing extension interfaces at a later date if there was enough demand for them.

If the versioning problem is solved with interfaces, adding extension interfaces later will be easy.

By the way, compared to extension methods, extension interfaces are a much better way to simulate mixins. Why? Because a mixin implies a contract for operation, as does an interface (but not an extension method).

In the code above we can act like Thing "is-a" IFooable, something we cannot do with extension methods alone.

(Update: see followup here)

Friday, April 07, 2006

Conflicting Extension Methods

Keith Farmer from the DLinq team replied to my previous post. He pointed out that the problems with extension methods are not only about conflicts with instance methods. You can also get conflicts between extension methods, when several namespaces define extension methods with the same name.

As Keith wrote:

FooCorp's extensions conflict with BarInc's. Yet you use each in non-conflicting manners elsewhere in the same code file. How to do specify which extensions should be applied where, in a way that doesn't always involve doing so at the call site?

It’s a good question. Jon Skeet made a relevant suggestion here, and I presume Microsoft have some ideas up their sleeve too.

Ian Griffiths points out that Microsoft already appear to be segregating extension methods into separate namespaces, and suggests that component vendors should follow suit. E.g. extension methods related to FooSoft.FooLib might be in FooSoft.FooLib.Extensions rather than the main FooSoft.FooLib library. That way, if you don’t want the extensions, you just don’t use FooSoft.FooLib.Extensions.

That separation of extension methods seems like a good thing. Why not get the compiler to enforce it? If extension methods are always in namespaces which contain only extension methods, then you simply don’t use them if you don’t want them.
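Here is a small sketch of the opt-in behaviour Ian describes, using today's syntax. The namespaces, the Describe method and the Caller classes are all made up for illustration. One consumer imports a single extensions namespace and calls the method normally; the other imports both, so the extension-style call would be ambiguous and it has to fall back on ordinary static calls.

```csharp
using System;

namespace FooSoft.FooLib.Extensions
{
    public static class StringExtensions
    {
        public static string Describe(this string s) => "FooSoft:" + s;
    }
}

namespace BarSoft.BarLib.Extensions
{
    public static class StringExtensions
    {
        public static string Describe(this string s) => "BarSoft:" + s;
    }
}

namespace App.UsesFooOnly
{
    // Only FooSoft's extensions are in scope, so there is no conflict.
    using FooSoft.FooLib.Extensions;

    public static class Caller
    {
        public static string Run() => "x".Describe();
    }
}

namespace App.UsesBoth
{
    using FooSoft.FooLib.Extensions;
    using BarSoft.BarLib.Extensions;

    public static class Caller
    {
        // "x".Describe() would not compile here: the call is ambiguous.
        // The only way out today is to name the static class at the call site.
        public static string RunFoo() =>
            FooSoft.FooLib.Extensions.StringExtensions.Describe("x");

        public static string RunBar() =>
            BarSoft.BarLib.Extensions.StringExtensions.Describe("x");
    }
}

namespace App
{
    public static class Program
    {
        public static void Main()
        {
            Console.WriteLine(UsesFooOnly.Caller.Run());  // FooSoft:x
            Console.WriteLine(UsesBoth.Caller.RunFoo());  // FooSoft:x
            Console.WriteLine(UsesBoth.Caller.RunBar());  // BarSoft:x
        }
    }
}
```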

Can the compiler enforce such separation of extension methods? Yes... almost :-)

In relatively rare cases you may see two (otherwise-unrelated) assemblies defining types in the same namespace. In that situation, the compiler could not enforce the rule, and you’d have to fall back on some other way to resolve conflicts. That’s not as bad as it sounds, since some other syntax (such as this) will still be required anyway to resolve conflicts between extension and instance methods – so why not apply it to the few remaining extension-extension conflicts too?

In all other situations (namespace in a single assembly or in several assemblies with compile-time dependencies on each other) the compiler could enforce the rule.

By the way, what if you just want to use the extension methods from one static type, not from a whole namespace? That’s where Jon’s suggestion comes in. He suggested something like this:

using static BarSoft.BarLib.StringExtender;

(where StringExtender is a static class, not a namespace). I wonder if that might be changed to

using MyAlias = BarSoft.BarLib.StringExtender;

which is already standard C# syntax and could achieve the same effect? (I.e. only extension methods of the aliased type become accessible, not those in the rest of the namespace.)

This post has been about ideas to make extension-extension conflicts less frequent, and more manageable. I don't think they completely answer Keith's question, in which conflicting extensions were both used in the same code file, but they might help...

I’ll post more on the extension-meets-interface idea soon...

Thursday, April 06, 2006

Extension Methods: The Solution

As outlined in my previous post, there are three versioning problems with extension methods:

The behaviour of your program can change unexpectedly

The compiler does not warn you about the changes

Even if the compiler did warn you, your ability to respond is limited

In this post, I’ll outline a suggested solution that fixes all three problems. This will be a lengthy post, so please bear with me (or skip to the conclusion below if you can’t wait ;-)

Introduction

I suggest that:

Extension methods should belong to interfaces. Think of it as defining an "extension contract", a defined set of methods which can be provided as extension methods.

For example, here’s an extension method called Foo, which belongs to interface IFooable:


public static class Extender: IFooable
{
  public static void Foo(this Thing t)
  {
    //...
  }
}

public interface IFooable
{
  void Foo();
}

That’s all standard C# 3, except for the ": IFooable" on the static class (the bit in bold in the original post), which is new. It means, "The static class Extender provides extension methods which implement IFooable".

(The definition of interface IFooable is perfectly ordinary C#, there's nothing new there.)

Details

Every extension method is now associated with a particular “contract for functionality” (the contract represented by the interface). By associating extension methods with defined contracts, we can solve all of the problems noted above.

When the compiler hits this line...

a.Foo()

...it identifies all the applicable methods called Foo [1]. Say it finds an instance method and an extension method, both called Foo.

If they both belong to the same interface, the compiler can assume that the instance version is the best choice. The class is, in effect, “overriding” an extension method with a specialized alternative. (In the interests of brevity, I won’t dwell on this point, but please leave a comment below if I haven’t made myself clear.)

If the methods do not belong to the same interface, the compiler raises an error [2]. The call is ambiguous because the two methods belong to different contracts for functionality.
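The two rules above are easy to state as code. What follows is only a toy model of the decision I'm proposing, not anything a real compiler does; the Candidate type, the Resolution enum and the interface names are all invented for the sketch. An empty contract string stands for an instance method that belongs to no interface.

```csharp
using System;

public enum Resolution { InstanceMethod, ExtensionMethod, Ambiguous, NotFound }

// A candidate method named Foo, tagged with the interface (contract) it belongs to.
public readonly struct Candidate
{
    public bool Exists { get; }
    public string Contract { get; } // "" means the method belongs to no interface

    private Candidate(bool exists, string contract)
    {
        Exists = exists;
        Contract = contract;
    }

    public static readonly Candidate None = new Candidate(false, "");

    public static Candidate Of(string contract) => new Candidate(true, contract);
}

public static class ProposedRule
{
    public static Resolution Resolve(Candidate instance, Candidate extension)
    {
        if (!instance.Exists && !extension.Exists) return Resolution.NotFound;
        if (!extension.Exists) return Resolution.InstanceMethod;
        if (!instance.Exists) return Resolution.ExtensionMethod;

        // Both exist: the instance method wins only if it shares the contract.
        bool sameContract = instance.Contract != ""
            && instance.Contract == extension.Contract;
        return sameContract ? Resolution.InstanceMethod : Resolution.Ambiguous;
    }
}

public static class RuleDemo
{
    public static void Main()
    {
        Candidate ext = Candidate.Of("MyLib.IFooable");

        // Same contract: the class is "overriding" the extension method.
        Console.WriteLine(ProposedRule.Resolve(Candidate.Of("MyLib.IFooable"), ext));

        // Instance method with no interface: different contracts, so an error.
        Console.WriteLine(ProposedRule.Resolve(Candidate.Of(""), ext));

        // No instance method at all: the extension method is used.
        Console.WriteLine(ProposedRule.Resolve(Candidate.None, ext));
    }
}
```

Comparing fully qualified names is a shortcut for the sketch; a real compiler would compare the interface types themselves.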

Example

Let’s see how this would work in practice. Consider my previous example of an extension method for strings, called Contains. If you wrote that method, you’d have to associate it with an interface. You’d write something like this:


public static class StringExtender: IContainable
{
  public static bool Contains
    (this string source,
    string value)
  {
    //...
  }
}

public interface IContainable
{
  bool Contains(string value);
}

Now, imagine that Microsoft release their own Contains method for strings, which is incompatible with yours. The compiler will look at both methods to see which interface they belong to. Microsoft’s Contains may belong to an interface defined by Microsoft [3] or, since it's an instance method, it may not belong to any interface at all. In any case, it does not belong to your interface. Therefore, the compiler knows that it’s not a substitute for your Contains.

After detecting that the methods belong to different contracts (interfaces), the compiler raises an error. Fortunately, using interfaces offers us a convenient way to tell the compiler which method we actually want, and thereby resolve the error. Simply include the name of the interface which defines the method you want to call. Or, for instance methods with no interface, use the name of the class. Like this:


string s = "abc";
string t = "a";

//ambiguous 
s.Contains(t);
 
//calls your extension method 
s.(IContainable)Contains(t);
 
//calls Microsoft's instance method 
s.(string)Contains(t);

This hypothetical syntax looks a lot like the existing operator for casting types, except it looks like it's casting methods – which it is, in a way.

To tie all this back to my previous post, the current C# spec allows you to force the call to go to the extension method by writing StringExtender.Contains(s, t). That's like s.(IContainable)Contains(t). The current C# spec does not support the opposite, s.(string)Contains(t), in any way.

Conclusion

There are versioning problems with extension methods. The problems can be solved by making extension methods belong to well-defined contracts for functionality, using standard C# interfaces to define those contracts.

Here is each of the problems I originally stated, and its solution:

Problem: The behaviour of your program can change unexpectedly
Solution: There are no unexpected breakages, because instance methods only replace extension methods which belong to the same contract (interface).

Problem: The compiler does not warn you about the changes
Solution: The compiler detects any ambiguity, and warns you.

Problem: Even if the compiler did warn you, your ability to respond is limited.
Solution: You can force the call in either direction: to the extension method or to the instance.

Stay tuned for my next post, in which I’ll outline some groovy side-effects of this proposal... (and also see my later followup here)

Footnotes

[1] Is there a performance cost, at compile time, in identifying all the possible methods? In the current C# 3 spec, the compiler doesn’t even bother looking for extension methods if an instance one exists. In my suggested approach, it always has to check for extension methods. I would guess that my solution has some compile-time cost, but that the cost would be acceptable on modern hardware.

[2] Or, as I suggested in my previous post, perhaps the compiler could just raise a warning instead, and then go ahead and choose the instance method. That’s not my preferred approach, but it would keep the behaviour in line with Microsoft’s current spec for C# 3 – i.e. the compiler always does what the current spec says, with the difference being it does raise a warning when it knows it's doing something dangerous.

[3] Even if Microsoft gave their interface the same name as yours, the compiler could still tell the difference, just like it does right now for same-named interfaces from different namespaces.

Monday, April 03, 2006

Extension Methods: The Problem

There’s a problem in Microsoft’s proposed implementation of extension methods. I’ll illustrate the problem by showing how, under the current design, Microsoft can break your code and there’s not much you can do about it.

But first, some background…

The compiler will always favour a “normal” method, with the same name, over an extension method. As others have noted, this leads to problems when the object and the extension method are versioned independently. (The classic example of independent versioning is when one company writes the class and another writes the extension method.) After a new version of the class is released, code that used to call an extension method will start calling an instance method instead, if the new version introduces an instance method with the same name.

Here’s an example: Imagine we had extension methods in .NET 1.1. You might have written your own Contains method for strings like this:


using System.Globalization;

public static class StringExtender
{
  // Case-insensitive substring test using the current culture.
  public static bool Contains(this string source, 
      string value)
  {
      CompareInfo compare = 
        CultureInfo.CurrentCulture.CompareInfo;
      return compare.IndexOf(
         source, value, 
         CompareOptions.IgnoreCase) >= 0;
  }
}

What happens when Microsoft add their own Contains method to the string class, as they did in .NET 2.0? You upgrade to 2.0 and suddenly your application starts malfunctioning. Why? Because, as noted above, all your existing code starts calling Microsoft’s Contains instead of yours. The catch is, their version is case sensitive and yours is not.

Microsoft changed the behaviour of your application, simply by making a "harmless" addition to the framework!

There are actually three problems here.

The behaviour of your application changed
You don’t know where it changed, because the compiler did not warn you about the lines that were affected
Even if the compiler did provide warnings, your ability to respond to those warnings would be limited.

The first problem is scary.

The second is troubling, because compiler warnings (or errors) are the logical solution.

The third problem deserves further explanation. Say the compiler did warn you. Say, for argument’s sake, that it didn’t give a fatal error, just a warning like this: “Method call is ambiguous. Instance method will be chosen instead of extension method.” That’s great, the compiler warned you. But how would you make the warning go away? How would you tell the compiler which method you want, so it won’t keep warning you every single time you compile?

You can force the compiler to choose an extension method, simply by re-coding each call to use the syntax of an ordinary static method instead of an extension method. But you can’t do the opposite. You can’t say “It’s OK compiler, go ahead and choose the instance method. That’s the one I want. Don’t warn me about it again.”
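To see both halves of that in one place, here is a small program built around the Contains example. StringExtender is the class from earlier in this post; the demo class and the sample strings are mine. The extension-style call silently binds to the instance method, and the only way to reach the extension method is to call it as a plain static method.

```csharp
using System;
using System.Globalization;

public static class StringExtender
{
    // Case-insensitive substring test using the current culture.
    public static bool Contains(this string source, string value)
    {
        CompareInfo compare = CultureInfo.CurrentCulture.CompareInfo;
        return compare.IndexOf(source, value, CompareOptions.IgnoreCase) >= 0;
    }
}

public static class ContainsDemo
{
    // Binds to the instance method string.Contains, which is case sensitive.
    public static bool ViaMemberSyntax(string s, string t) => s.Contains(t);

    // Forces the extension method by using ordinary static-call syntax.
    public static bool ViaStaticSyntax(string s, string t) =>
        StringExtender.Contains(s, t);

    public static void Main()
    {
        string s = "ABC";
        string t = "a";

        Console.WriteLine(ViaMemberSyntax(s, t)); // False
        Console.WriteLine(ViaStaticSyntax(s, t)); // True
    }
}
```

Same two strings, same method name, two different answers, and not a word from the compiler.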

I suspect the language designers at Microsoft have already identified these three problems. And I won’t be surprised if they fix them. But, while we wait, let’s guess what they might do. Stay tuned for my next post…

Sunday, April 02, 2006

Extension Methods: More than Sugar

One of the enhancements in C# 3.0 is extension methods. Extension methods let you write static functions that look like they belong to other objects - even objects that you didn't write.

Some bloggers have criticised extension methods, saying that they're just syntactic sugar. Yes, they are syntactic sugar; but they're not just syntactic sugar.

Consider this extension method:


    public static class Extender
    {
        public static void Foo(this Thing t)
        {
            //...
        }
    }

It tells the compiler to act as if the method Foo exists on class Thing:


        Thing t = new Thing();
        t.Foo();                 // A - what we write
        Extender.Foo(t);         // B - what the
                                 //compiler compiles

But, that's not all the extension method means. It really means this:

Compile t.Foo() as Extender.Foo(t) unless class Thing defines its own method Foo(), in which case you should call that instead.


        // When we write
        t.Foo();    

        // It compiles the same as this:
        Extender.Foo(t);

        // unless t.Foo() exists, 
        // in which case it compiles
        // to this, just as if the
        // extension method never existed:
        t.Foo();

That's what makes extension methods more than just syntactic sugar. They allow you to create a "standard" implementation of Foo, which is used by all objects except those that define their own specialised implementation instead.
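Here is a quick sketch of that "standard implementation" idea. The IThing interface and the SpecialThing class are additions of mine, there only so that two different classes can share one extension method; Foo returns a string so the result can be printed.

```csharp
using System;

public interface IThing { }

// Does not define Foo, so it gets the standard one.
public class Thing : IThing { }

// Defines its own specialised Foo.
public class SpecialThing : IThing
{
    public string Foo() => "specialised Foo";
}

public static class Extender
{
    // The "standard" implementation for anything that is an IThing.
    public static string Foo(this IThing t) => "standard Foo";
}

public static class DefaultDemo
{
    public static void Main()
    {
        Console.WriteLine(new Thing().Foo());        // standard Foo
        Console.WriteLine(new SpecialThing().Foo()); // specialised Foo

        // The choice is made at compile time from the declared type,
        // so the specialised version is not found through IThing.
        IThing hidden = new SpecialThing();
        Console.WriteLine(hidden.Foo());             // standard Foo
    }
}
```

Note the last call: this is not virtual dispatch. The compiler picks the method from the static type of the expression, so the "override" only applies where the compiler can see the concrete class.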

That's an interesting concept - but there's a problem in Microsoft's proposed implementation. I'll write about it in my next post...
