underscores and precedence in scala

You are viewing an old revision of this post, from February 27, 2011 @ 20:39:26. See below for differences between this version and the current revision.

At work recently, I was writing some code which wanted to add all the elements of a collection of strings to a document writer. This seemed like a classic case for foreach, so I wrote something like this:

data.foreach(s => writer.addDocument(createDocument(s)))

(Warning: I’m typing this from memory, without trying it out in the Scala REPL, and I’m new enough to Scala that it’s entirely possible that I’m making silly syntax errors.) That was great, but Scala lets you use underscores in place of explicit argument names for arguments that you’re only going to use once, so I tried doing this instead:

data.foreach(writer.addDocument(createDocument(_)))

That, however, didn’t work. After looking at the error message and talking to some coworkers, it seemed like Scala was parsing it as the following:

data.foreach(writer.addDocument(s => createDocument(s)))

And there isn’t a version of addDocument that takes a function argument. Which is good: if there had been, it might have compiled but not done what I wanted, which would have been even more confusing! Still, I was frustrated: why can’t Scala just read my mind? But, honestly, the compiler’s choice was a perfectly reasonable way to parse that expression, and I certainly wouldn’t want Scala’s parsing to be dependent on the types that function calls accept. So I was willing to leave it at that.
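Here is a minimal sketch of how far the underscore reaches. `Document`, `Writer`, and the body of `createDocument` are hypothetical stand-ins, not the real code from work; the point is only that the placeholder expands around the innermost call that contains it.

```scala
object PlaceholderScope {
  // Hypothetical stand-ins for the real document type, factory, and writer.
  final case class Document(text: String)

  def createDocument(s: String): Document = Document(s)

  final class Writer {
    private val buffer = scala.collection.mutable.ListBuffer.empty[Document]
    def addDocument(d: Document): Unit = { buffer += d; () }
    def documents: List[Document] = buffer.toList
  }

  def run(): List[Document] = {
    val data = List("a", "b")
    val writer = new Writer

    // Explicit parameter: the lambda covers the whole argument to foreach.
    data.foreach(s => writer.addDocument(createDocument(s)))

    // On its own, createDocument(_) is just the function x => createDocument(x).
    val create: String => Document = createDocument(_)
    data.foreach(s => writer.addDocument(create(s)))

    // Does not compile: the underscore expands around createDocument(_) only,
    // so addDocument is handed a function where it expects a Document.
    // data.foreach(writer.addDocument(createDocument(_)))

    writer.documents
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Uncommenting the last `foreach` should reproduce a type error of the kind described above.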

Except that, as one of my coworkers then pointed out, there was another way of breaking it down: instead of writing it as a single foreach call, I could write it as map plus foreach, as follows:

data.map(createDocument).foreach(writer.addDocument _)

Which is much nicer! So, actually, Scala’s compile error was gently nudging me in the correct direction: yes, this sort of thing is potentially ambiguous to parse, but if you break down your function composition properly, then that ambiguity goes away. So it was nice of the compiler to help me write my code elegantly!

The funny thing was, I ran into a very similar situation an hour later: it involved two maps plus a foreach, but the exact same principle applied. That time, though, after writing it out as a chain of three collection functions, I didn’t really like the result: it was going too far into the details of how to use a library that I was integrating with, and I didn’t find the result particularly evocative. So I ended up going back to the single-foreach version, but this time I pulled the function that I was applying out into a member function, so I could give it a name that explained what was going on. It’s nice to have different tools in your toolkit, because ultimately you need to be guided by what makes your code the most expressive rather than falling in love with the most powerful tool.
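A rough sketch of that last approach, with everything invented for illustration: `Indexer`, `addAsDocument`, `Document`, and `Writer` are hypothetical names, and the real code involved a library that isn't shown here. The shape is a single foreach over a named member function.

```scala
object NamedHelper {
  // Hypothetical stand-ins for the real document type, factory, and writer.
  final case class Document(text: String)

  def createDocument(s: String): Document = Document(s)

  final class Writer {
    private val buffer = scala.collection.mutable.ListBuffer.empty[Document]
    def addDocument(d: Document): Unit = { buffer += d; () }
    def documents: List[Document] = buffer.toList
  }

  final class Indexer(writer: Writer) {
    // One pass over the data; the helper's name says what happens to each string.
    def indexAll(data: Seq[String]): Unit = data.foreach(addAsDocument)

    // The extracted member function: library details live here, not at the call site.
    private def addAsDocument(s: String): Unit =
      writer.addDocument(createDocument(s))
  }

  def run(): List[Document] = {
    val writer = new Writer
    new Indexer(writer).indexAll(List("first", "second"))
    writer.documents
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Because `addAsDocument` is passed where a function is expected, no underscore is needed at all, which sidesteps the precedence question entirely.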

Post Revisions:

March 1, 2011 @ 07:17:15 [Current Revision] by David Carlton
February 27, 2011 @ 20:39:26 by David Carlton

Changes:

--- February 27, 2011 @ 20:39:26
+++ Current Revision
-Deleted: data.map(createDocument).foreach(writer.addDocument _)
+Added: data.map(createDocument).foreach(writer.addDocument)

Published 2/27/2011 & Filed in Programming


6 Comments


Comment by Arnold

You can leave out that last underscore too.

data map createDocument foreach writer.addDocument

:-)

3/1/2011 @ 1:15 am
Comment by David Carlton

Thanks, fixed!

3/1/2011 @ 7:20 am
Comment by raichoo

You might as well do this

data map (createDocument _ andThen writer.addDocument)

This does only one iteration over the list. The underscore after createDocument lifts the method to a function object (eta-expansion) that supports the andThen method to do arrow-like chaining.

Regards,
raichoo

3/2/2011 @ 4:23 pm
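A minimal sketch of the andThen suggestion in the comment above, using foreach since only the side effect matters. `Document`, `Writer`, and the body of `createDocument` are hypothetical stand-ins; only `andThen` on function values is standard Scala.

```scala
object ComposeWithAndThen {
  // Hypothetical stand-ins for the real document type, factory, and writer.
  final case class Document(text: String)

  def createDocument(s: String): Document = Document(s)

  final class Writer {
    private val buffer = scala.collection.mutable.ListBuffer.empty[Document]
    def addDocument(d: Document): Unit = { buffer += d; () }
    def documents: List[Document] = buffer.toList
  }

  def run(): List[Document] = {
    val data = List("x", "y")
    val writer = new Writer

    // The expected type turns the method into a function value (eta-expansion);
    // writing createDocument _ is the explicit spelling of the same thing.
    val create: String => Document = createDocument

    // Compose creation and writing into one function: create first, then add.
    val createAndAdd: String => Unit = create.andThen(writer.addDocument)

    // A single pass over the data, with no intermediate collection of documents.
    data.foreach(createAndAdd)

    writer.documents
  }

  def main(args: Array[String]): Unit = println(run())
}
```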
Comment by David Carlton

Oh, thanks for the tip! I definitely need to learn about more of that sort of combinatorial function.

When, if ever, do you recommend doing the map + foreach instead of foreach + andThen? I definitely don’t have a good sense of style for that sort of thing yet.

3/2/2011 @ 9:58 pm
Comment by raichoo

Basically every time you want to omit an extra iteration ;) You might also use foreach instead of map in the above example (if addDocument is just causing a side effect and you don’t need the return value).

Right now you are mapping over a sequence, generating a new one, and then iterating one more time for addDocument. But in this case you can add the document right after it has been created, so you might as well just combine those two functions to get the job done in one pass.

Hope that helps.

3/3/2011 @ 2:33 am
Pingback from composing, decomposing, and recomposing methods | malvasia bianca

[…] I wrote that post on precedence, map, and function composition in Scala, I started to wonder: I’ve been thinking that I should experiment more with applying Compose […]

3/17/2011 @ 9:22 pm

malvasia bianca