Lists: Filter, Map and Reduce

There are 3 very handy list functions which make dealing with lists a breeze: Map, Filter and Reduce. I have come to the belief that not everyone understands their power, so I will attempt to explain it.

Filter

The filter function  (fairly aptly named) aims to filter out (remove) items from the list which are not wanted. Typically, the Filter will take a list to be filtered and a condition to be met. Filter will then iterate through the list, and check each item in the list against the condition and form a new list from the items which pass the condition. A practical example would be: you have a list of names, and from that list you want the names that start with “S”, so ["Snowy", "Shelly", "Muffin", "Oberon"] would filter to ["Snowy", "Shelly"].
public bool FilterFunction(T item)
{
return item.StartsWith("S")
}
public List Filter(List list, FilterFunction condition)
{
List filteredList = new List()
foreach (T item in list)
{
if (condition(item))
{
filteredList.Add(item)
}
}
return filteredList
}

 

Map

The map function takes a list and a function and applies the function to every item in the list. A practical example would be: a repository returns a list of Cars, but you need to have a list of the number of seats in the car, so [GWiz, SmartCar, CleverCityCar, PeopleMover] would map to [1, 2, 5, 9]. 

 

public R MapFunction(T item)
{
return item.NumberOfSeats()
}
public List Map(List list, MapFunction mapFunction)
{
List mappedList = new List()
foreach (T item in list)
{
mappedList.Add(mapFunction(item))
}
return mappedList
}

 

 

Reduce

Reduce is the hardest list function to understand (and explain). It is sometimes referred to as accumulate, fold, compress or inject. In effect, it takes the list and folds/compresses/reduces it down to a single value. Reduce takes a list and a function. It iterates over the list; it applies the function to the first item and a ‘starting value’. The result of the first function will be used as the ‘starting value’ in the function of the next iteration. (Some implementations do not require a starting value, but start iterating over the second item in the list, using the first item as the starting value). A practical example would be: finding the total number of seats available in a set of cars, so using the example before of [GWiz, SmartCar, CleverCityCar, PeopleMover], the result would be 17 (1 + 2 + 5 + 9).  The common example you will see regarding this function is finding the total of all the numbers in a list, so [1, 2, 5, 9] would yield 17.
public R ReduceFunction(T item, R accumalator)
{
return item.NumberOfSeats + accumalator
}
public R Reduce(List list, ReduceFunction reduceFunction, R startingValue)
{
R accumalator = startingValue
foreach (T item in list)
{
accumalator = reduceFunction(item, accumalator)
}
return accumalator
}

 

 

Summary

The filter function transforms a set (list) of items into a subset of the items. 
The map function transforms a list of items in one type to a list of items in another type.
The reduce function transforms a list of items in one type to an object in another type.
Both Ruby and Python make excellent use of these functions, so if you would like to further your understanding, I would suggest using these.
I am lead to believe that C# has these functions, but only for array implementations (and definitely not on IList). I don’t believe Java has these at all, but you could always create reusable methods which provides this behaviour. 
Next time you start to write a foreach loop, have a look to see if what you are really doing is a filter, a map or a fold. You may find you save yourself a few lines of code.
This entry was posted in Design. Bookmark the permalink.

3 Responses to Lists: Filter, Map and Reduce

  1. James Curran says:

    I don't believe C# has them, but they are simple enough to write:

    public IEnumerable< T > Filter(this IEnumerable< T > list, Action< T, bool> condition)
    {
    foreach (T item in list)
    {
    if (condition(item))
    {
    yield return item;
    }
    }
    }

    public IEnumerable< R > Map(this IEnumerable< T > list, Action< T, R> mapFunction)
    {
    foreach (T item in list)
    {
    yield return mapFunction(item))
    }
    }

    public R Reduce(this IEnumerable< T > list, Action< T, R, R > reduceFunction, R startingValue)
    {
    R accumalator = startingValue
    foreach (T item in list)
    {
    accumalator = reduceFunction(item, accumalator)
    }
    return accumalator
    }

    Then adding substitutes for your filter methods:

    public bool WithS(string item)
    {
    return item.StartsWith("S")
    }

    public int ByLength(string item)
    {
    return item.Length;
    }

    public double SumATenth(int item, double accumulator)
    {
    return (item / 10.0) + accumulator
    }

    We could them write it like this:

    List< string > myList = ……
    double lengthDiv10 = myList.Filter(WithS).Map(ByLength).Reduce(SumATenth);

    unless you prefer lambdas:

    List< string > myList = ……
    double lengthDiv10 = myList .Filter(item=>item.StartsWith("S"))
    .Map(item=>item.Length)
    .Reduce((item,acc)=>(item/10.0)+acc), 0.0);

    One advantage of this design, is that, despite appearances, it will actually only loop through the list once. (all three foreachs will be run essentially in parallel, with each "feeding" the next, one item at a time.)

  2. Sarah Taraporewalla says:

    @james – These are also implemented in my current codebase, however I am currently working with C# 2.5 so without nice things like lambdas and yields it is a harder to read. I have to pass delegates around all the time, I can’t just call yield, and the compiler needs me to declare the type of the operation in quite a lot of situations etc.

  3. James Curran says:

    Lambda weren’t added to recently, but yield should be available to you. (It was added in C#–Vs2005)

Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>