Dot Product Thoughts (original) (raw)
Richard Warburton richard.warburton at gmail.com
Thu Apr 18 16:09:38 PDT 2013
- Previous message: Map.getOrDefault(Object,Supplier) override
- Next message: Dot Product Thoughts
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi,
Implementing a dot product between a pair of vectors brought up a few observations about the library:
- There's a Double::sum, but no Double::multiply, etc. I appreciate you have to stop somewhere, but is sum the place to stop? Might be worth adding other basic arithmetic operations.
- There appears to be a zip function now, but with no overloads for primitive streams. I managed to guess at Streams.zip as the location, so one data point, but good news on that front.
- If I have an double[] is there is a cleaner way of making a DoubleStream than adding every element to the builder? Perhaps I'm missing something? Is the expectation that people who care about unboxed primitives are using specialised collections libraries, eg GNU Trove, and these libraries will provide specialised .stream() methods returning DoubleStream, et al. ?
- A performance comparison with a trivial imperative example resulted in a ~16x slowdown moving to lambdas. I'm willing to take some performance hit for the nicer code, but 16x is a lot higher than I would have expected or hoped for. I'll try to have a look at some more 'real world' examples in future, but even then being this much slower on mathematical problems will cause some people trouble.
Imperative code (x and y are arrays)
final int N = x.length; double sum = 0; for (int i = 0; i < N; i++) { sum += x[i] * y[i]; } return sum;
Java 8 Lambdas (x and y are streams):
Streams.zip(x, y, (left, right) -> left * right) .reduce(0.0, Double::sum);
Notes:
- When I profiled the hotspot wasn't in boxing/unboxing but Iterator.forEachRemaining - pretty hot as well, ~ 95% of execution time. I suspect the conclusion is that the computational work of actually performing arithmetic is entirely dominated by the cost of iteration in the framework.
- The 16x figure comes from an average of 90 calls for 10 Million entry vectors. I've left out the timings from my warm up calls, so the timings were steady-state.
- I didn't do thorough GC analysis, but time spent in GC was entirely dominated by execution time in java code.
regards,
Richard Warburton
http://insightfullogic.com @RichardWarburto <http://twitter.com/richardwarburto>
- Previous message: Map.getOrDefault(Object,Supplier) override
- Next message: Dot Product Thoughts
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]