My Ideas for Java Closures (original) (raw)
According to Neal Gafter, the story for closures it still wide open.
As someone who writes a lot of code using closure-like mechanisms ... in the form of lots of inline inner classes ... I have a few idea of what I want in a solution. I think I'm writing some powerful and elegant code today, but that elegance in function is undermined by some severe awkwardness in its expression as Java code.
It really comes down to conciseness. I can accomplish pretty much everything I need using inner classes and holder objects, such as AtomicInteger and friends. But it ends up being more code than I'd like.
What I want (to borrow Stu's term) is to emphasize the essence of my logic, and strip away the ceremony: the naming of the interface (it should be known from context), the types of parameters (just the names, please), the list of thrown exceptions, etc.
First of all, let's constrain the problem. Closures would be defined in terms of an interface, an interface that contains a single method. Attempting to use the concise syntax with an interface that contains multiple methods would simply be a compiler error.
Second, the closure block should have free read/write access to parameters and local variables in the enclosing method. This can easily be accomplished with syntactic sugar: the variables can be converted into references to holder objects, such as AtomicInteger, that are stored on the heap. Today, shared fields must be final, to indicate that it is safe to share references to the object between the main method and any inner classes.
The hard part about sharing local variables in this way isn't implementing the syntactic sugar, it's about the updating of the information provided to the debugger so that the debugger can undo the syntactic sugar changes, and make variables in the enclosing method, or in the closure block, appear to be local.
Lastly, syntax. I think Groovy has the right syntax here. The important part is for the compiler to actually help out rather than for it to complain from the side-lines. Today's Java compiler has all the type information, but just uses that to build fancy error messages about what you should have typed. It should be using that type information to avoid the necessity of all the extra typing (that is, keyboard entry, not the need for types in the Ruby/Groovy sense of the word).
When a closure is passed to another method, the parameter type determines all the compiler should need to know. Likewise, when a closure is assigned to a variable, the variable defines the closure interface. Thus:
List widgets = ... ; Collections.sort(widgets, { w1, w2 -> return w1.getWidgetId().compareTo(w2.getWidgetId()); });
The generic type of the list, Widget, informs the generic type of the Comparator. Thus w1 and w2 are fully typed, as instances of Widget. The return type of the closure is pegged as int (also from the Comparator interface).
Side note: I'd also like to see a lot of other streamlining of Java, such as a Groovy-style implicit
return
.
In other words, the above could be expanded by the compiler to the following in terms of compilation:
List widgets = ... ; Collections.sort(widgets, new Comparator() { public int compare(Widget w1, Widget w2) { return w1.getWidgetId().compareTo(w2.getWidgetId()); } });
See, the essence is still there, the comparison of the widget Ids. But its now occluded by all that ceremony about interface names, method names, return values, generic types, and so forth.
A caveat: there are edge cases where we'll need to identify the closure interface type. This occurs when a method to be passed a closure is overridden. The compiler should be able to reduce the candidates based on parameter count, checked exceptions thrown inside the closure, and other factors ... but it may be necessary to implement an alternative syntax. For example:
ExecutorService service = ...; Future<?> = service.submit(Runnable { performExpensiveOperation(); });
submit() is overridden to accept a Callable as well as a Runnable. Callable can return a value. Again, minimal syntax here: the interface name followed by the implementation of the closure method. This compares to the Groovy as
keyword.
Now, it's easy to get carried away and put in stuff like Python style co-routines (the yield
keyword), or want a syntax to allow the closure to force a return value from the enclosing block or method. Yep, let's add a few more alien concepts to the language, people love that. .
These are also not function objects, so you can't easily do magic things such as currying (currying is a way of pre-supplying some of the parameters to a function, such that a new function is created that takes fewer parameters). An "interface" that's curried is a whole new interface and that's OK by me.
I'm more more modest. I don't particularly want to add new features to the Java language, just new syntax for the compiler, to let it do the ugly plumbing. That's kind of my theory for Tapestry's relationship to Java and the Servlet API as well, and it works.
A few side notes:
Annotations could be used to help the compiler out. For example, if a common interface should be a closure, but has multiple methods, an annotation could be used to tell the compiler which method of the interface is applicable to use as a closure; any other methods (in the inner class) would be no-ops or (perhaps guided by additional annotations) failures.
Annotations on fields and parameters (or perhaps on the method) could help the compiler decide how to share visible variables: do we need to use a thread-safe approach (such as something based on AtomicReference) or simple, non-synchronized holder objects? The compiler could do some escape analysis as well to chose the best solution, but in many cases, it's on the coder's shoulders to make the right decision.
From what I can tell, my ideas are closest to the Concise Instance Creation Expressions proposal by Bob Lee, Doug Lea, and Josh Bloch. However, in my opinion, the compiler can do much more in terms of providing type information. I think CICE stumbles in that it allows for creation of classes, not just interfaces, and gets bogged down in defining closure types and parameter types ... again, things that (with the mandate of single-method interfaces), the compiler can do autonomously.
Likewise, CICE requires that variables visible to the inner block be marked "public". This to me is something that the compiler can analyze; it can identify any assignments to variables and promote such variables to stored-on-the-heap status. I don't see the need for final; the compiler has plenty of ability to determine if a parameter or local variable is ever updated within the body of a method (and the body of any inner classes or closures of that method).
My basic concept is: Less is More. Support fewer cases but do so more cleanly, more concisely, and more understandably. Simplify. Let the compiler do more work. Reduce the density and complexity of the Java code. Expose the essence.