Primitive streams and optional
Howard Lovatt howard.lovatt at gmail.com
Sat Nov 24 11:52:05 PST 2012
Doug's suggestion of ignoring null gets my vote. I used the same approach in my own parallel library and it worked well. It is also consistent with how some languages, such as Objective-C, treat null.
The approach taken by Objective-C is instructive because, like Java, it is a mixed language with both values and objects. There is a caveat in looking to Objective-C for inspiration, however: in Objective-C, (the equivalent of) null.method() does not throw an NPE; it is a no-op. Ignoring null is therefore ingrained throughout the language.
But as I said, ignoring null worked well for me. In the PSs below there is more detail about how I handled null.
-- Howard.
PS In my parallelisation library I split the data into a doubly linked list of segments. Each segment is the same size and is padded with null as necessary. All my ops (the equivalents of map, filter, reduce, etc.) therefore preserve the segment size, which makes parallelisation easy. If a segment is largely empty after processing, it may be merged with the segments on either side. If a segment is completely empty, or not required after a subList operation, it is dropped.
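The fixed-size, null-padded segments described in the PS could look roughly like the following. This is only a sketch under assumptions: it uses plain arrays rather than a doubly linked list, omits merging and dropping, and the names PaddedSegments, SEGMENT_SIZE, split and mapSegment are hypothetical, not the API of the library described.

```java
import java.util.Arrays;
import java.util.function.Function;

// Hypothetical sketch: segments are fixed-size arrays padded with null.
final class PaddedSegments {

    static final int SEGMENT_SIZE = 4; // assumed segment size, chosen for illustration

    // Splits data into equal-size segments; the last one is padded with null.
    static Integer[][] split(Integer[] data) {
        int count = (data.length + SEGMENT_SIZE - 1) / SEGMENT_SIZE;
        Integer[][] segments = new Integer[count][];
        for (int i = 0; i < count; i++) {
            int from = i * SEGMENT_SIZE;
            // copyOfRange pads with null when the range runs past the end of data
            segments[i] = Arrays.copyOfRange(data, from, from + SEGMENT_SIZE);
        }
        return segments;
    }

    // Maps one segment to a new segment of the same length.
    // Null slots stay null, and a null mapper result leaves the slot empty.
    static Integer[] mapSegment(Integer[] segment, Function<Integer, Integer> mapper) {
        Integer[] out = new Integer[segment.length];
        for (int i = 0; i < segment.length; i++) {
            if (segment[i] != null) {
                out[i] = mapper.apply(segment[i]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Integer[][] segments = split(new Integer[] {1, 2, 3, 4, 5, 6});
        for (Integer[] segment : segments) {
            Integer[] mapped = mapSegment(segment, x -> x % 2 == 0 ? x * 10 : null);
            System.out.println(Arrays.toString(mapped));
        }
        // prints [null, 20, null, 40] then [null, 60, null, null]
    }
}
```

Because every segment keeps its length, each call to mapSegment is independent of the others and could be handed to a separate worker.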
PPS Returning null from a Mapper is the equivalent of a filter operation. In fact Filter 'is-a' Map that returns null or the value.
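The "Filter is-a Map" idea can be illustrated with a small sequential sketch. The class and method names (NullSkippingMap, map, filter) are hypothetical and use java.util.function types for convenience; they are not the Mapper or Filter types of the library described.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch: a map operation that treats null as "nothing there".
final class NullSkippingMap {

    // Applies mapper to each non-null element and drops null results,
    // so a mapper that returns null behaves like a filter.
    static <T, R> List<R> map(List<T> source, Function<? super T, ? extends R> mapper) {
        List<R> out = new ArrayList<>();
        for (T element : source) {
            if (element == null) {
                continue; // null input is ignored
            }
            R result = mapper.apply(element);
            if (result != null) {
                out.add(result); // null output is dropped
            }
        }
        return out;
    }

    // Filter written as a map that returns either the value or null.
    static <T> List<T> filter(List<T> source, Predicate<? super T> keep) {
        return map(source, x -> keep.test(x) ? x : null);
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, null, 2, 3, 4);
        List<Integer> evens = filter(data, x -> x % 2 == 0);
        List<Integer> scaled = map(data, x -> x > 2 ? x * 10 : null);
        System.out.println(evens);  // prints [2, 4]
        System.out.println(scaled); // prints [30, 40]
    }
}
```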
PPPS I also allow throw BREAK and throw CONTINUE, where BREAK and CONTINUE are pre-made static fields and therefore don't incur a creation overhead. Throwing CONTINUE is equivalent to returning null from a Mapper. Throwing BREAK in a Mapper terminates all the parallel operations, pads the remainder of the segment with null, and discards the subsequent segments.
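One way to get pre-made, allocation-free BREAK and CONTINUE signals in Java is a shared exception instance created with stack-trace capture disabled. The sketch below is sequential only and makes no attempt to reproduce the parallel termination or segment padding described above; the names ControlFlowSignals and Signal are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: shared, pre-allocated exceptions used as control-flow signals.
final class ControlFlowSignals {

    // No stack trace is captured and suppression is disabled,
    // so a single instance can be thrown repeatedly without allocation.
    static final class Signal extends RuntimeException {
        Signal(String name) {
            super(name, null, false, false);
        }
    }

    static final Signal BREAK = new Signal("BREAK");
    static final Signal CONTINUE = new Signal("CONTINUE");

    static <T, R> List<R> map(List<T> source, Function<? super T, ? extends R> mapper) {
        List<R> out = new ArrayList<>();
        for (T element : source) {
            if (element == null) {
                continue; // null input is ignored
            }
            try {
                R result = mapper.apply(element);
                if (result != null) {
                    out.add(result);
                }
            } catch (Signal s) {
                if (s == BREAK) {
                    break; // stop; the remaining elements are discarded
                }
                // CONTINUE: skip this element, as if the mapper had returned null
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5);
        List<Integer> result = map(data, x -> {
            if (x == 2) throw CONTINUE;
            if (x == 4) throw BREAK;
            return x * 10;
        });
        System.out.println(result); // prints [10, 30]
    }
}
```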
On 25/11/2012, at 4:13 AM, Doug Lea <dl at cs.oswego.edu> wrote:
Just in case anyone is interested in re-deciding some basics:
In light of the continuing saga of unappealing API choices, here's one last push for adopting the j.u.c null policies in streams. Sorry that I can't think of a good way to present this without stepping back into prehistory!
Long ago (1950s), people noticed that there are two basic flavors of data: values and pointers. A value is just, um, a value. A pointer differs conceptually in that it might not point to anything. Hence the invention of null, as a special state of a pointer, that for economy is encoded as the special value zero if null, else a (possibly virtualized etc) memory address. (One disadvantage of this encoding is that it loses type information -- an early form of "erasure". A null pointer to an int looks the same as a null pointer to a double, etc.)
Only slightly less long ago (late 1960s), people noticed that pointer-like notions could be elevated to the idea of "references to objects" (in early forms, an object's pointer address was its identity). But still with the notion that a reference might not point anywhere. So, now we have four different concepts:
1. values
2. possibly null pointers to values
3. objects
4. possibly null references to objects
The possibly-null case naturally occurs with partial functions and methods, often related to lookup/search: get the thing at some uninitialized array position, or in a hash map without a binding, etc. Also for terminals in linked data structures. You need some way to say that there is no such thing there.
The FP (and ADT) folks had an arguably easier time of this, since they only encountered cases (1) and (2). Still, they had the notion of a compound value, which is like an object but has no defined identity. Any partial function that "should" return an X but need not can instead return an Optional. And the most common technique for implementing this notion is to "box" the value when present, else return null. The programmer is never exposed to this though. For example, using "==" on a boxed vs unboxed int does the same thing (comparing values, not the "invisible" pointers).
The pure OO (Smalltalk etc) folks also in principle had an easier time, since they conceptually dealt only with cases (3) and (4). Since everything is an object, everything worked uniformly. (Although many people now think in retrospect that "nullable" should have been a required part of any method return type spec, so that programmers know when nulls might legitimately vs accidentally appear. JSR308 might help with this though.) However, people don't appreciate it when "==" always compares pointers (among other issues) for integers, so special rules were made for these cases that are basically the inverse of the FP approach. That is, in FP, pointerness is hidden; in OO, pointerless-valueness is hidden. But less hidden: for example, Integers are objects with identity, monitors, etc (and so are unlike "Optional" if such a thing existed), and you can readily tell if you have an Integer vs an int. On the other hand, you can still use ints as (autoboxed) objects inside collections etc without needing to have a special implementation just for ints (at the price of now-famous space bloats).
Any language/library that embraces both of these notions together has to do something that is not identical to either pure FP or OO approaches. Some languages get a foothold by distinguishing object types from value types. Thus, nullness applies to objects, optionalness applies to values. So, Scala, Lime, etc have variants of:
1. value types: int, double etc
2. Optional: the result of partial functions on value types
3. object types (Object and subclasses)
4. refs: possibly null references to objects
We don't have this foothold. Arguably, because of this, we should not be creating such frameworks. But we are. So the choices are:
A. Pretend we have value types. Introduce Optional for use with any value-like things, along with some set of conventions about how they interact with objects and possibly null refs.
B. Don't pretend we have value types unless/until we have them. Use the standard OO conventions, in which boxing classes like Integer are used when you need to elevate a value to objecthood. And when you have one, you have a full-fledged object, not just an invisible pointer. And when you don't have one, you just have null.
Choice A is tempting because of its familiarity to programmers with an FP background. But doing so forces a never-ending set of bandaids (as we've seen lately) because none of the rules for interoperating with Object conventions make much sense.
Sticking with (B) is less tempting to some people, not only because they like to think of some of their classes in value-like ways, but also because streams (like java.util.concurrent) would need to relentlessly maintain the "null means nothing there" policy. So, emptyStream.reduce(f) must return null, null elements appearing in streams must be skipped, etc. But not only is this the most defensible policy to use in the absence of true value types, it is best suited to kludgelessly evolve to embrace value types if they are ever supported.
There is also a choice C: always throw exceptions for partial functions / nothing-there cases. The logic of this is fine, and it is completely reasonable when nothing-there-ness is accidental or exceptional. But the world voted against the painfulness and inefficiency of everyday programming under this encoding of nothing-there decades ago.
Summary: get rid of Optional. Use null consistently to mean nothing there (plus exceptions in exceptional cases). Use the standard boxed types for numerics. Until/unless there are value types, create intStream etc as a separate set of classes with merely analogous APIs. (And while we are at it, add LongKeyHashMap and a few others!)
Don't worry about people who used null as "meaningful" elements, map keys or map values. No one is forcing them to use streams.
-Doug
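The "null means nothing there" policy for reduce in choice (B) can be sketched in a few lines. This is an illustration only, not the streams API under discussion: NullPolicyReduce and its reduce method are hypothetical, and the sketch assumes the combining function itself never returns null.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.BinaryOperator;

// Hypothetical sketch of a reduce that follows the "null means nothing there" policy.
final class NullPolicyReduce {

    // Skips null elements and returns null when there is nothing to reduce.
    // Assumption: f never returns null for non-null arguments.
    static <T> T reduce(List<T> source, BinaryOperator<T> f) {
        T acc = null;
        for (T element : source) {
            if (element == null) {
                continue; // null elements are skipped
            }
            acc = (acc == null) ? element : f.apply(acc, element);
        }
        return acc; // null for an empty (or all-null) source
    }

    public static void main(String[] args) {
        Integer sum = reduce(Arrays.asList(1, null, 2, 3), Integer::sum);
        Integer nothing = reduce(Collections.<Integer>emptyList(), Integer::sum);
        System.out.println(sum);     // prints 6
        System.out.println(nothing); // prints null
    }
}
```

The caller distinguishes "no result" with an ordinary null check, which is the same convention java.util.concurrent collections rely on by refusing null elements.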
More information about the lambda-libs-spec-observers mailing list