Monday, April 03, 2006

Extension Methods: The Problem

There’s a problem in Microsoft’s proposed implementation of extension methods. I’ll illustrate the problem by showing how, under the current design, Microsoft can break your code and there’s not much you can do about it.

But first, some background…

The compiler will always favour a “normal” instance method over an extension method with the same name. As others have noted, this leads to problems when the class and the extension method are versioned independently. (The classic example of independent versioning is when one company writes the class and another writes the extension method.) If a new version of the class introduces an instance method with the same name, code that used to call the extension method will start calling the instance method instead.

Here’s an example: Imagine we had extension methods in .NET 1.1. You might have written your own Contains method for strings like this:

using System.Globalization;

public static class StringExtender
{
    // Case-insensitive "contains", using the current culture's comparison rules.
    public static bool Contains(this string source, string value)
    {
        CompareInfo compare = CultureInfo.CurrentCulture.CompareInfo;
        return compare.IndexOf(source, value, CompareOptions.IgnoreCase) >= 0;
    }
}

What happens when Microsoft add their own Contains method to the string class, as they did in .NET 2.0? You upgrade to 2.0 and suddenly your application starts malfunctioning. Why? Because, as noted above, all your existing code starts calling Microsoft’s Contains instead of yours. The catch is, their version is case sensitive and yours is not.


Microsoft changed the behaviour of your application, simply by making a "harmless" addition to the framework!
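The binding rule behind this can be seen in a small sketch. It assumes a compiler that supports extension methods (C# 3.0 or later); WidgetV1 and WidgetV2 are hypothetical stand-ins for two releases of the same library class, and Describe is an invented method name used only for illustration.

```csharp
using System;

// Hypothetical "version 1" of a library class: no Describe method of its own.
public class WidgetV1
{
    public string Name = "widget";
}

// Hypothetical "version 2": the vendor has added an instance method called Describe.
public class WidgetV2
{
    public string Name = "widget";

    public string Describe() { return "instance: " + Name; }
}

// Our own extension methods, written against both versions.
public static class WidgetExtensions
{
    public static string Describe(this WidgetV1 w) { return "extension: " + w.Name; }
    public static string Describe(this WidgetV2 w) { return "extension: " + w.Name; }
}

public static class Program
{
    public static void Main()
    {
        // No instance method exists, so the call binds to the extension method.
        Console.WriteLine(new WidgetV1().Describe()); // extension: widget

        // An applicable instance method exists, so it wins and no warning is issued.
        Console.WriteLine(new WidgetV2().Describe()); // instance: widget
    }
}
```

The two call sites are textually identical. Only the shape of the class decides which method runs.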

There are actually three problems here.
  1. The behaviour of your application changed.
  2. You don’t know where it changed, because the compiler did not warn you about the lines that were affected.
  3. Even if the compiler did provide warnings, your ability to respond to those warnings would be limited.

The first problem is scary.

The second is troubling, because compiler warnings (or errors) are the logical solution.

The third problem deserves further explanation. Say the compiler did warn you. Say, for argument’s sake, that it didn’t give a fatal error, just a warning like this: “Method call is ambiguous. Instance method will be chosen instead of extension method.” That’s great, the compiler warned you. But how would you make the warning go away? How would you tell the compiler which method you want, so it won’t keep warning you every single time you compile?

You can force the compiler to choose an extension method, simply by re-coding each call to use the syntax of an ordinary static method instead of an extension method. But you can’t do the opposite. You can’t say “It’s OK compiler, go ahead and choose the instance method. That’s the one I want. Don’t warn me about it again.”
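Here is a minimal sketch of that asymmetry. It reuses the StringExtender class from the example above and assumes a runtime where string already has its own case-sensitive Contains (.NET 2.0 or later). The Program class and the sample text are illustrative only.

```csharp
using System;
using System.Globalization;

public static class StringExtender
{
    // Case-insensitive "contains", as in the earlier example.
    public static bool Contains(this string source, string value)
    {
        CompareInfo compare = CultureInfo.CurrentCulture.CompareInfo;
        return compare.IndexOf(source, value, CompareOptions.IgnoreCase) >= 0;
    }
}

public static class Program
{
    public static void Main()
    {
        string text = "Hello World";

        // Extension-method syntax: the compiler picks string's own instance method.
        bool viaInstance = text.Contains("WORLD");

        // Ordinary static-method syntax: this can only mean our method.
        bool viaStatic = StringExtender.Contains(text, "WORLD");

        Console.WriteLine(viaInstance); // False
        Console.WriteLine(viaStatic);   // True
    }
}
```

The static form pins the call to the extension method. There is no matching syntax that pins a call to the instance method, so a warning on the first call could never be acknowledged.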

I suspect the language designers at Microsoft have already identified these three problems. And I won’t be surprised if they fix them. But, while we wait, let’s guess what they might do. Stay tuned for my next post.

8 Comments:

Anonymous damien morton said...

Extension methods are an absolutely essential part of Linq, and an absolutely essential part of bringing function-style programming to the language.

Any sufficiently powerful programming language will have potential problems.

You've identified one of them, and you're absolutely right - there is a name collision problem, and not much that can be done from a language design perspective to fix it.

In some ways it's analogous to operator overloading and implicit type conversions, and as it is for those powerful mechanisms, the solution is to use them sparingly.

Sun Apr 09, 03:31:00 AM PDT  
Blogger John Rusk said...

>Extension methods are an absolutely essential part of Linq, and an absolutely essential part of bringing function-style programming to the language.

I've never seen anyone spell out exactly why they are essential to those things.

In terms of Linq, I believe the only reason they are essential is that they allow a general-purpose static method (the extension method) to be replaced by a special-purpose version (the instance method) for those classes that need one. That behaviour is clearly very important to Linq, but the exact syntax and mechanism used to obtain it is not. Why? Because Linq users generally don't see extension methods. Linq users write Query Expressions, which the compiler then translates into method calls.

E.g. consider this query expression

from c in customers
where c.City == "London"
select c

The compiler translates it into a bunch of method calls. It is important in that translation process that the compiler should favour specialised instance methods over generic static (extension) ones; but there is nothing about that translation process that actually requires extension methods - at least, not as a general purpose language feature accessible to programmers for purposes unrelated to Linq.
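As a rough sketch of that translation, the following compares the query expression with the method-call form it becomes in C# 3.0 and later. The Customer class and the sample data are assumptions made up for illustration. For an in-memory list, Where resolves to the System.Linq extension method; a type with its own instance Where would be bound to that instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical customer type, invented for this example.
public class Customer
{
    public string Name;
    public string City;
}

public static class Program
{
    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { Name = "Ann", City = "London" },
            new Customer { Name = "Bob", City = "Paris" },
        };

        // The query expression as written by the Linq user.
        var query = from c in customers
                    where c.City == "London"
                    select c;

        // The method-call form. The trailing "select c" adds nothing after a
        // where clause, so the compiler drops it rather than emitting a Select call.
        var calls = customers.Where(c => c.City == "London");

        foreach (var c in query) Console.WriteLine(c.Name); // Ann
        Console.WriteLine(query.SequenceEqual(calls));      // True
    }
}
```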

My point is that if MS _are_ going to take something which, in some ways, is an implementation detail of Linq, and turn it into a stand-alone language feature, then they need to make it a bit safer to use.

I'd be interested to hear why they're essential to functional-style programming.

>and not much that can be done from a language design perspective to fix it.

I beg to differ, having just spelled out a solution in my follow-up post :-)

>In some ways it's analogous to operator overloading and implicit type conversions

I still feel that extension methods have a greater potential to create unforeseen changes in program behaviour. The solution I propose drastically reduces that risk. Yes, it makes extension methods slightly more complex, but given that everyone agrees they are an advanced feature, maybe a certain level of intimidating complexity is a good thing - to deter those who probably shouldn't be using them anyway.

Mon Apr 10, 01:58:00 AM PDT  
Anonymous damien morton said...

I haven't seen too many problems with operator overloading or implicit casting. In fact, I haven't seen anyone use those mechanisms at all in real code. That said, when you need them, they are there.

Perhaps I overstated the case about extension methods being absolutely essential to introducing functional programming.

I will say that extension methods make functional programming easier within the confines of an object-oriented language. Given the noun.verb() grammar of object-oriented programming, it can be awkward to switch to the class.verb(noun) grammar required for applying static methods. Extension methods unify the two approaches nicely, making the static methods that apply to a class just as discoverable as the class's own methods (I'm talking about IntelliSense here). This naturally encourages a more functional style of programming.

You are right that Linq doesn't actually need extension methods. You are wrong that users will only use the Linq SQL-like syntax, and for those users that don't, extension methods make a lot of sense. Keep in mind that the Select/Where/etc methods are only one possible suite of extension methods for one particular application. One could imagine there being a suite of operations that act on IList<T> objects, and implementing a vector algebra. Users of such a suite wouldn't use the SQL-like syntax, but would benefit from the dotted extension-method syntax. I don't see extension methods as purely a Linq implementation detail.

I agree that extension methods can be dangerous, and putting in place some hurdles on their creation or use wouldn't be a bad thing, but I would hate to see them emasculated for the sake of safety alone.

Mon Apr 10, 09:56:00 AM PDT  
Blogger John Rusk said...

Yes, that's a good point about extension methods taking away the jarring transition to class.verb(noun) for static methods.

> You are wrong that users will only use the Linq SQL-like syntax

True. (It was late and I was typing in a hurry - at least, that's my excuse ;-)

>One could imagine there being a suite of operations that act on IList of T objects, and implementing a vector algebra.

That's a good example. What I'm concerned about is what happens if Microsoft decide to offer some of the same operations in a later release of .NET, as instance methods. Suddenly, my code starts calling their methods and there's nothing I can do about it. I don't even know it's happening (at first) since the compiler can't warn me.

I wouldn't want to see Extension methods emasculated either. What are the limitations that you see arising from my interface-based solution?

Mon Apr 10, 06:23:00 PM PDT  
Anonymous damien morton said...

Hi John,

I have a blog entry on a technique which leverages extension method capabilities that would be prohibited by your extension interface idea.

http://blog.lab49.com/?p=237

In it I describe a technique that relies on a class containing extension methods to create a relationship between two classes. There are no interfaces involved, and forcing interfaces into the picture would just complicate it.

Tue Apr 11, 02:48:00 AM PDT  
Blogger John Rusk said...

True, the interface solution would be more complicated. But, for an advanced feature, I'd rather have safe-and-complex than simple-and-dangerous :-)

For the record, the interface solution would look like this (additions to your code are in bold).

class Trader { /* add stuff to do with traders */ }
class Instrument { /* add stuff to do with instruments */ }

static class TraderInstruments : Relationship<ManyToMany, Trader, Instrument>, IInstruments, ITraders
{
    public static ICollection<Instrument> Instruments(this Trader trader) { return GetFirst(trader); }

    public static ICollection<Trader> Traders(this Instrument instrument) { return GetSecond(instrument); }

    public interface IInstruments
    {
        ICollection<Instrument> Instruments();
    }

    public interface ITraders
    {
        ICollection<Trader> Traders();
    }
}

//Interfaces are defined inside static class in this case, but that's not essential - I'm just using the class as part of their "namespace".

//Need two interfaces, not one, because the "this" param is of different type of each of your extn methods

Tue Apr 11, 05:36:00 PM PDT  
Blogger John Rusk said...

Oops, that last sentence should read:

"because the "this" param is of different type on each of your extn methods"

Tue Apr 11, 06:24:00 PM PDT  
Blogger John Rusk said...

PS for a slightly better reply (i.e. one where I made more sense ;-) see this post: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=346074&SiteID=1

In short, the static class implements as many interfaces as required.

Wed Apr 12, 03:19:00 AM PDT  
