abxlhw
Last Updated: February 25, 2016
·
5.352K
· keboo

C# combination (subset) extension method

I needed a way to get all of the subsets of a specified size within a given list. Lets start with a brief example. Given the array [1, 2, 3, 4], and a desired subset size of 2, I wanted a method that would output a collection containing:[1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]. Or given a subset size of 3 the expected output should be [1, 2, 3], [1, 2, 4], [1, 3, 4], [2, 3, 4].

For python coders you will notice that the functionality I am duplicating can be found in the itertools.combinations method.

For anyone who is unfamiliar with deferred execution, I highly recommend Jon Skeet’s blog posts on Edulinq. In part 2, his rewrite of the Where LINQ extension method gives a good example of why to split the extension method into two parts, immediate execution for validation, and deferred execution.

For my method I chose to implement the same pattern.

public static IEnumerable<T[]> Combinations<T>(this IList<T> argList, int argSetSize)
{
    if (argList == null) throw new ArgumentNullException("argList");
    if (argSetSize <= 0) throw new ArgumentException("argSetSize Must be greater than 0", "argSetSize");
    return combinationsImpl(argList, 0, argSetSize - 1);
}

This method performs validation on the arguments and calls the combinationsImpl method. The second parameter indicates the starting position in the list, and the third parameter indicates the number of times the combinationsImpl method is allowed to iterate (this will make sense later).

Now for the real meat and potatoes.

private static IEnumerable<T[]> combinationsImpl<T>(IList<T> argList, int argStart, int argIteration, List<int> argIndicies = null)
{
    argIndicies = argIndicies ?? new List<int>();
    for (int i = argStart; i < argList.Count; i++)
    {
        argIndicies.Add(i);
        if (argIteration > 0)
        {
            foreach (var array in combinationsImpl(argList, i + 1, argIteration - 1, argIndicies))
            {
                yield return array;
            }
        }
        else
        {
            var array = new T[argIndicies.Count];
            for (int j = 0; j < argIndicies.Count; j++)
            {
                array[j] = argList[argIndicies[j]];
            }

            yield return array;
        }
        argIndicies.RemoveAt(argIndicies.Count - 1);
    }
}

The algorithm here is not as complicated as it looks. Starting with the first items in the list we recursively add items to our array until the array reaches the desired length. Once the desired length is reached the array is returned, we remove the last index that was used to build the array (essentially backing up and skipping over) and continue as long as we can.

For example consider the array [1, 2, 3] finding all combinations of length 2.

We start by collecting the first item into our result set [1]. We then proceed to the next item and collect it. Our result set is now [1, 2]. Since the result set is of length 2 we return it, remove the last item (making the result set [1]) and continue going through the list. We then collect 3 making the result set [1, 3], and return it. Since we have reached the end of the list, we increment the starting index, and start the result set over making it [2]. Moving to the next item we collect 3 making the result set [2, 3].

This same pattern continues until the starting index moves so close to to the end of the list that there will not be any more combinations.

Source: http://dotnetgeek.tumblr.com/post/5205119053/c-combination-subset-extension-method

Say Thanks
Respond