Stackoverflow-overflow: Reduce
My most popular answer on Stack Overflow right now is a response to a question about filtering while performing a map()
operation. Every time I get another upvote, I look at it, and feel the need to improve it.
Now that I’ve decided to give it a home on my blog, I’ve felt myself drawn in to a complete rewrite to be a version that I think would have helped me at the beginning of my programming career–something with more examples and fewer words than the original, and something more incremental.
Question
I would like to filter an array of items by using the map() function.
The problem is that filtered out items still uses space in the array and I would like to completely wipe them out.
Any idea?
If you want to perform map()
and filter()
in a single loop, the way to do that would be reduce()
. Array.reduce()
can be a little unintuitive at first encounter, so here’s my best shot at a gentle introduction.
In a Nutshell
If you’re like me, you may have used the reduce
algorithm in your mind to help you make a choice. Too many restaurants to pick from?
[“Lucky Robot”, “Ramen Tatsuya”, “Salt Lick”, “East Side Pies”]
One way to handle that problem is to just take two of them, and decide: which one would you pick between these two?
“Lucky Robot” > “Ramen Tatsuya” ? “Lucky Robot” : “Ramen Tatsuya”
Ok, today you feel like sushi, so Lucky Robot wins round 1. What now?
“Lucky Robot” > “Salt Lick” ?
BBQ sounds too heavy today.
“Lucky Robot” > “East Side Pies” ?
Oh snap. You know, pizza does sound good.
Done. We’ve reduced our choices down to one. Now, we did a simple comparison–what if we wanted to compare based on cost, or based on whether anyone in our group is a vegan, or on a complex multi-factor analysis?
Reduce gives us a generalized algorithmic framework for comparing group members and acting on them as a group by comparing them one-by-one. We don’t compare every member with every member–that would be a lot more cumbersome–but we do compare every member by proxy.
In Nuts and Bolts
Just like map
filter
and so on, reduce
is a pure function that does not modify the array it acts on, but instead returns a new array.
Here’s our big picture perspective to guide this instructional: The point of reduce()
is that you turn an array of values into one final value. To do this, we’ll take all the elements, evaluate all of them, and make some programatic decisions based on how those individual values should relate to our final value.
How does that look in practice?
In its most basic and common form, reduce
takes one argument:
A (typically anonymous) reducer function. (We’ll refer to this as the reducer from here on out.)
So a call to reduce will usually look like the equivalent of this:
1 | let reducedOutput = sourceArray.reduce(reducer); |
We now need to understand how to write a reducer.
Remember: like forEach
map
filter
and so on, you don’t call the reducer–reduce
calls it.
Like the function passed in to those methods, the first thing to keep in mind is that your reducer will have every element of your source array fed into it, one at a time.
So now we can expand the last code block to this:
1 | reducer = (____, oneElementOfSourceArray) => { |
What’s that first argument? I’m glad you asked–bear with me, because we have to describe a cyclic relationship here that is at the heart of how reduce
works, and why it’s tricky to grasp.
What goes in that gap is the return value from the last reducer call, which will be a result-in-progress. This is our current draft of our eventual final value that we “reduce” the array down to–and if it’s the last time it is called, with the final value of the array, then it is the final value.
But where does it get that draft? You have to return it. The return value from the last reducer is fed in as an argument to the next reducer.
So now we know enough to write this much:
1 | reducer = (lastResultInProgress, oneElementOfSourceArray) => { |
So here, we see resultInProgress
in two places. The return value, newResultInProgress
, is fed from every reducer call in to the next reducer call as lastResultInProgress
.
Summary in Two Statements
We should now be in a place to understand the following:
- When we call reduce, it takes an array and a reducer, and turns those into a single value.
- The reducer takes in two values, and returns one. Those two values are (1) the next value in the array, and (2) the summation of all previous array values as-processed by the reducer.
Read that slowly and carefully. That means for an array [a,b,c,d]
, we’ll compare a
with b
, and the reducer will come up with a placeholder
value. Then it will compare the placeholder
and c
, and update that placeholder
. Then it will compare the placeholder
and d
. With that information, it will return its final value.
Now let’s look at a classic reduce example–turning an array of numbers into the sum of all those numbers.
First Full Example
1 | sourceArray = [1,2,3]; // sum will be 6. |
That version is intentionally pedantic, of course. Once you’re more comfortable with reduce, you might choose to write it this way:
1 | sum = [1,2,3].reduce((m, v) => m + v); |
In the classic use-scenario of reduce, the resultInProgress
starts off as the value at index 0 in the array, and the oneElementOfSourceArray
in the first reducer call is the value at index 1 in the array.
That means the reducer would be called twice in our [1,2,3]
example above. The first time, the arguments would be placeholder = reducer(sourceArray[0], sourceArray[1])
, and the second time we’d see the equivalent of placeholder = reducer(placeholder, sourceArray[2])
.
If we had a fourth value, the next call would be placeholder = reducer(placeholder, sourceArray[3])
.
Then we’d return placeholder
as our final reduced value.
If you’ve followed up to this point, then you understand reduce
. Congratulations!
What follows from here are some details and illustrations that lead us to how we can use reduce
in place of map+filter
, and help us get a better feel for reduce
overall.
Next Steps
Seeding our start values
In our reduce
examples up to now, the first time the reducer is called, it takes whatever is in index 0 as its starting resultInProgress
value. However, we can actually pass in something else to seed that value, resulting in the sourceArray
elements only ever being passed in as the second argument to the reducer. Let’s look at a slight change that is functionally identical to our previous example, but utilizes that seed functionality:
1 | sourceArray = [1,2,3]; // sum will be 6. |
And the short version:
1 | sum = [1,2,3].reduce((resultDraft, element) => resultDraft + element, 0); |
Now the first reducer call will have 0
and 1
passed in as arguments (i.e., seedResultInProgress
and sourceArray[0]
), instead of 1
and 2
(i.e., sourceArray[0]
sourceArray[1]
).
The result in this case is identical; reduce
is written nicely so that if our output will be of the same format as the elements of our input array (array of numbers -> number), no seed is needed. But that ability to seed opens the door to some other options we’ll get to soon.
Filter to one value
We’ve looked now at reduce making new values out of old values–now let’s look at reduce operating as a tool to filter an array down to one of its values. Instead of reducing to a sum, let’s reduce an array of numbers down to the single largest member of the array.
1 | [1,2,3,30,50,4,5,4].reduce((resultInProgress, elementOfSourceArray) => { |
(Notice we’re passing the reducer inline now, which is more typical, as opposed to declaring it in advance.)
1 | [7,4,1,99,57,2,1,100].reduce((memo, val) => val > memo ? val : memo); |
(Another learning moment: notice we’re calling the first argument memo
in the succinct code. That’s the canonical term for that argument, so we’ll start using that term from here on out in the succinct code.)
Filter to subset
Once we’ve grokked that, let’s look at reducing an array down to a smaller array. Let’s say we want to filter out all values that are less than 3.
For this one, because our output will be an array, but the elements of our array are numbers, we’ll need to seed an empty array to start:
1 | sourceArray = [1,2,3,4,5]; |
(Tip to understand the short version: [1,2].concat(3)
returns [1,2,3]
, a handy feature for functional-style programming.)
1 | [1,2,3,4,5].reduce((memo, value) => value >= 3 ? memo.concat(value) : memo, []); |
Putting it all together
Now that we know how to filter like this, it’s trivially easy to see how we could include the functionality of map
–we just operate on it before push()
ing it into the resultInProgress
/memo
array.
For our example, we’ll do the same filter, and if it passes our filter condition, we’ll multiply it by two.
Just the short version this time:
1 | reduced = [1,2,3,4,5].reduce((memo, value) => value >= 3 ? memo.concat(value * 2) : memo, []); |
That’s it, now you know how to filter
and map
in one go with reduce
. Being comfortable enough with reduce
to feel like you could come up with a solution like this on the fly means you have a powerful tool in your arsenal, and can lead to new ways of thinking about problems.
Alternative: Prioritizing Human Readability
But you know what? If I wanted to both filter
and map
, I would just write it this way:
1 | reduced = [1,2,3,4,5] |
Here my intentions are a lot more obvious. This code can be grokked in a glance, and the method names guide me in understanding what’s supposed to be happening.
Reduce is great, and is a very powerful tool for a certain class of problems. But if you’re going to use it outside of its basic use case, it becomes less obvious what you’re doing. Code readability is more important than doing everything in one loop.
What about performance?
It’s also algorithmically equivalent. We can loop over an array of five items once, and do two operations per item, or we can loop over an array of five items twice, and do one operation per item per loop. That’s just 5*1*2
vs. 5*2*1
.
One might be tempted to think that the overhead of 5 anonymous function calls vs 10 anonymous function calls would make a difference here. If Javascript were closer to the metal, that might be the case, but one must remember that JS code is just a suggestion to the interpreter. In fact, functions like filter()
can often be optimized in a way that “manual” filtering within a reduce function cannot be.
Out of curiousity, I decided to test this hypothesis in jsperf. The result?
In this particular case, it’s nearly three orders of magnitude (as in, 847x–getting close to 1000x) faster to not code the logic by hand within reduce
. Once again, this illustrates the principle that in higher level languages like JS, offloading logic to the built-ins is often like getting a hook into the underlying low-level language your code will be compiled into.
Now, again, unless you’re operating on massive arrays in the client, you should concern yourself with readability over speed. And if you _do_ happen to be working with massive arrays in the client? Consider whether it might be more beneficial to offload that work to the server. No? Then it’s time to think about speed.
Focus on making code that’s easier to read and easier to write. It just happens to be that usually, it’ll make your code faster, as well.
Homework
If you want to get a better feel for reduce through practice, here’s some homework:
- How would you write a reducer for flattening an array of arrays?
- How would you write your own
Array.reduce
?
Example answers are given below, if you get stuck.
Answers To Homework
- Flatten a simple nested array:
How To Flatten With Reduce 1
2
3
4
5
6sourceArray = [
[1,2,3],
[4,5,6],
[7,8,9]
];
flattenedSource = sourceArray.reduce((memo, val) => memo.concat(...val), []); - Write your own reduce function:
How To Code Reduce 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28demoArray = [1,2,3,4,5];
// we accept an anonymous function, and an optional 'initial memo' value.
demoArray.my_reduce = function(reducer, seedMemo) {
// if we did not pass in a second argument, then our first memo value
// will be whatever is in index zero. (Otherwise, it will
// be that second argument.)
const haveSeedMemo = arguments.length < 2;
// here we use that logic to set the memo value accordingly.
let memo = haveSeedMemo ? this[0] : seedMemo;
// here we use that same boolean to decide whether the first
// value we pass in as iteratee is either the first or second
// element
const initialIteratee = haveSeedMemo ? 1 : 0;
for (let i = initialIteratee; i < this.length; i++) {
// memo is either the argument passed in above, or the
// first item in the list. initialIteratee is either the
// first item in the list, or the second item in the list.
memo = reducer(memo, this[i]);
}
// after we've compressed the array into a single value,
// we return it.
return memo;
};
Short version:1
2
3
4
5
6
7
8
9
10arr._reduce = function(reducer, seedMemo) {
let haveSeed = (seedMemo !== undefined);
let memo = haveSeed ? seedMemo : this[0];
let i = haveSeed ? 0 : 1;
for (;i < this.length; i++) {
memo = reducer(memo, this[i]);
}
return memo;
};
(Remember not to use an arrow function here, so you can access this
!)
The full native implementation allows access to things like the index, for example, but I hope this helped you get an uncomplicated feel for the gist of it. Check out MDN for the full API.