[Tuples]

Introduction

Sunday, March 06, 2005

With the introduction of generics in Whidbey, we are relieved of writing our own type safe collection classes:

IList<Foo> foos = new List<Foo>();

What's less known is that, with some ground work, we can often skip the task of defining the element type for these type-safe collections as well:

ITuple<string, int> foo = new Tuple<string, int>("UeberFoo", 5);

In other words, we can create complex type-safe data structures in an ad-hoc fashion:

IList<ITuple<string, int>> foos = new List<ITuple<string, int>>();
foos.Add(new Tuple<string, int>("UeberFoo", 5));

Note that compatibility among type parameters does not imply compatibity of so constructed types: there is no covariance among generic containers. While Tuple derives from ITuple, the constructed list type is "of ITuple", and you can't assign a "list of Tuple" to the foos variable, even though you can add a Tuple object to the list. Don't worry.

Resources

The idea is inspired by a new part of the C++ standard library, which is also called tuples. Information is available on the web.

I have written a demo project which you can download.

Type Overview

Let's make sense of the types.

Basic Interfaces

Everyone here implements ITuple, which does nothing more than informing about the number of elements in the tuple.

Then, there are the generic ITuple overloads, each introducing one more element, up to five elements. These inherit form each other, so the constructed types are compatible. These interfaces are implemented by the default tuple classes (see below), as well as any interested class (we'll see that, too).

Then, there are some helper interfaces which aren't so interesting from a generics point of view, but some are nevertheless implemented by the default tuple classes, because they are useful for dynamic processing. These are IRandomAccessTuple (allows accessing tuple elements by numeric index), IArrayAccessTuple (allows accessing all elements at once), and INamedAccessTuple (allows accessing elements by name). All of these operations involve boxing, however.

Default Implementations

These implement the five generic ITuple overloads, as well as the IRandomAccessTuple and IArrayAccessTuple helper interfaces. Access by name is not implemented here.

Note that the implemenation of the helper interfaces could use less code, but the price would be even more boxing. But that's not our focus here.

With the default implementations, you can create ad-hoc objects having up to five read/write data members, all strongly typed. There is even a bit of OO-style polymorphism available:

ITuple<int, string> tuple = new Tuple<int, string, bool>(1, "One", true);
tuple.Value1 = 5;
tuple.Value2 = "Five";
Console.WriteLine(tuple.ElementCount);
Console.WriteLine(tuple);

// inspect the element types
foreach(Type tp in tuple.GetType().GetGenericArguments()){
    Console.WriteLine(tp);
}

Infrastructure methods, like ToString, Equals and GetHashCode are also implemented, but remember that these three methods use all data members, which might not be suitable in every case.

Sample Implementations

That would be the Person class. It implements the three-element ITuple overload (ITuple<T1, T2, T3>), as well as all helper interfaces.

public class Person : ITuple<string, string, DateTime>,
                      IRandomAccessTuple,
                      INamedAccessTuple,
                      IArrayAccessTuple {

    // ...

    #region "ITuple<T1, T2>"
    int ITuple.ElementCount {
        get{return 2;}
    }
    string ITuple<string>.Value1{
        get{return m_FirstName;}
        set{m_FirstName = value;}
    }
    string ITuple<string, string>.Value2{
        get{return m_LastName;}
        set{m_LastName = value;}
    }
    DateTime ITuple<string, string, DateTime>.Value3{
        get{return m_Born;}
        set{m_Born = value;}
    }
    #endregion

} // class Person 

Lest you thing I'm trying to show you C# property syntax, the interesting thing is how to explicitly implement a generic interface in a closed constructed type.

A Person object can be used with any code that uses the tuple interfaces, especially the utilities that follow.

Utilities

The utilities deal with processing tuples in collections. For now, I have implemented filtering and sorting.

Filtering works with C# iterators. For every generic tuple overload, there are two parameter-overloaded methods: one takes the element types to be used as filter criteria ("and"-combined; the default value of each element type indicates "ignore", which is just too bad for value types):

public static IEnumerable<ITuple<T1, T2>> SelectTuples<T1, T2>(
    IEnumerable<ITuple<T1, T2>> source, T1 match1, T2 match2
)
{
    ITuple<T1, T2> match = new Tuple<T1, T2>(match1, match2);
    return TupleUtil.SelectTuples<T1, T2>(
      source, match, delegate(ITuple<T1, T2> a, ITuple<T1, T2> b)
      {
        return (a.Value1 == null || a.Value1.Equals(default(T1)) || a.Value1.Equals(b.Value1))
            && (a.Value2 == null || a.Value2.Equals(default(T2)) || a.Value2.Equals(b.Value2));
      }
    );
}

As you can see, the first version just delegates to the second one, which takes a tuple object initialized to the criteria, and a custom callback:

public static IEnumerable<ITuple<T1, T2>> SelectTuples<T1, T2>(
    IEnumerable<ITuple<T1, T2>> source, ITuple<T1, T2> match, TupleMatcher<T1, T2> matchCallback
)
{
    foreach(ITuple<T1, T2> t in source){
        if(matchCallback(match, t)){
            yield return t;
        }
    }
}

The callback implementation in the first version has some problems. We need to test for null in case the element types are reference types. We also test for "default(T)" in case the element type are value types. However, in order to avoid boxing, we do that by calling the (instance) Equals method, which is hopefully defined on the value type, instead of the static object.Equals(object, object) method. Again, since we don't know whether we have a value type, the test against null is necessary.

Another problem is how to signal that a criterion should be ignored - using System.Nullable results in compiler warnings when used with reference types. Therefore, the default value of value types is treated like null, which means ignore the parameter.

Here's how to use the filters:

List<ITuple<string, string, DateTime>> people = new List<ITuple<string, string, DateTime>>();
list.Add(new Person("John", "Doe", new DateTime(1950, 1, 1)));
list.Add(new Person("Stella", "Blue", new DateTime(1980, 10, 7)));

// first version
foreach(Person anyPerson in TupleUtil.SelectTuples<string, string, DateTime>(
    people, null, "Doe", DateTime.MinValue)
)
{
    Console.WriteLine(anyPerson);
}

// second version
ITuple<string, string, DateTime> match = new Tuple<string, string, DateTime>(
                                                   null, "Doe", DateTime.MinValue);
foreach(Person anyPerson in TupleUtil.SelectTuples<string, string, DateTime>(
    people, match, delegate(<string, string, DateTime> a, ITuple<string, string, DateTime> b){
        // first parameter is the match tuple
        return b.Value2.StartsWith(a.Value2);
    })
)
{
    Console.WriteLine(anyPerson);
}

Support for sorting consists of five generic overloads of a comparer class. The tuple comparers can be dynamically configured (sort order, sort direction). Here the two-element overload:

public class TupleComparer <T1, T2> : TupleComparerBase, IComparer<ITuple<T1, T2>>
        where T1 : IComparable<T1> where T2 : IComparable<T2>
{

    public TupleComparer() : base(1, 2) {
    }

    public TupleComparer(int rank1, int rank2) : base(rank1, rank2) {
    }

    int IComparer<ITuple<T1, T2>>.Compare(ITuple<T1, T2> a, ITuple<T1, T2> b) {
        int ret = 0;
        foreach(int rank in this.RankOrder){
            int absRank = Math.Abs(rank);
            switch(absRank){
                case 1:
                    ret = a.Value1.CompareTo(b.Value1);
                    break;
                case 2:
                    ret = a.Value2.CompareTo(b.Value2); 
                    break;
            }
            if(rank < 0){
                ret *= -1;
            }
            if(ret != 0){
                return ret;
            }
        }
        return 0;        
    }

    // ...
}

Here's a usage example:

people.Sort(new TupleComparer<string, string, DateTime>());
people.Sort(new TupleComparer<string, string, DateTime>(2, -1, 3));

Discussion

It takes a bit of work to set up the generic overloads, but this is a one-time task. Besides, five elements should be enough for most purposes.

Notice that the implementation of IArrayAccessTuple.SetElements in the default implemenation classes checks whether the number of elements in the array is within range, and fails gracefully. This is so because the caller - polymorphically - doesn't need to know how many elements the tuple actually has.

[more to be supplied]