[OpenJDK 2D-Dev] AAShapePipe concurrency & memory waste (original) (raw)
Jim Graham james.graham at oracle.com
Thu Apr 11 21:42:16 UTC 2013
- Previous message: [OpenJDK 2D-Dev] AAShapePipe concurrency & memory waste
- Next message: [OpenJDK 2D-Dev] AAShapePipe concurrency & memory waste
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi Laurent,
Yes, these kinds of minor optimizations (i.e. optimizations that don't make a clear 2x type of savings) can be frustrating at times. It looks like there is potential for a decent return there if we can find the right change. Sometimes rearranging a couple of things that don't look like they are saving work can somehow trip the runtime into executing the code more efficiently.
I skimmed through your thoughts at the bottom. It occurred to me after I sent that idea out that sometimes we use int[] because we have to hand the values to native for return values and there is no easy way to return 4 values from a native method. An array is simplest because it can be loaded with answers via a single JNI call. 4 fields in a class would require 4xJNI.SetField calls. It might have better payoff if we can cache renderer state there as well which gets into subclassing. Also, doing this right may have to be done by someone here at Oracle because it may involve modifying the Ductus pipeline to match (it's been a while and I don't remember if we open sourced the code that interfaces Ductus to the RenderingEngine interfaces...?)
...jim
On 4/11/13 6:07 AM, Laurent Bourgès wrote:
Jim and Sergey,
1/ Here are few benchmarks (based on mapBench again) running several modified versions of AAShapePipe: http://jmmc.fr/~bourgesl/share/AAShapePipe/mapBench/ - ref: 1 threads and 20 loops per thread, time: 3742 ms 2 threads and 20 loops per thread, time: 4756 ms 4 threads and 20 loops per thread, time: 8528 ms 1 threads and 20 loops per thread, time: 56264 ms 2 threads and 20 loops per thread, time: 75566 ms 4 threads and 20 loops per thread, time: 141546 ms - int4: 1 threads and 20 loops per thread, time: 3653 ms 2 threads and 20 loops per thread, time: 4684 ms 4 threads and 20 loops per thread, time: 8291 ms 1 threads and 20 loops per thread, time: 55950 ms 2 threads and 20 loops per thread, time: 74796 ms 4 threads and 20 loops per thread, time: 139924 ms - byte[]: 1 threads and 20 loops per thread, time: 3795 ms 2 threads and 20 loops per thread, time: 4605 ms 4 threads and 20 loops per thread, time: 8246 ms 1 threads and 20 loops per thread, time: 54961 ms 2 threads and 20 loops per thread, time: 72768 ms 4 threads and 20 loops per thread, time: 139430 ms - int4 / byte[] / rectangle cached in TileState: 1 threads and 20 loops per thread, time: 3610 ms 2 threads and 20 loops per thread, time: 4481 ms 4 threads and 20 loops per thread, time: 8225 ms 1 threads and 20 loops per thread, time: 54651 ms 2 threads and 20 loops per thread, time: 74516 ms 4 threads and 20 loops per thread, time: 140153 ms So you may be right, results are varying depending on the optimizations (int4, byte or all) ! Maybe I should test different versions on optimized pisces renderer ... Here is an updated patch: http://jmmc.fr/~bourgesl/share/AAShapePipe/webrev-2/
2/ Thanks for your comments: actually a refactoring is possible to use a (shared) TileState instance replacing int[] bbox, rectangle bbox): - RenderingEngine.AATileGenerator getAATileGenerator(... int[] abox) it is very interesting here to propose an extensible tile state: maybe created by the renderer engine to cache other data ? - Rectangle and Rectangle2D are only used as the shape s and device rectangle given to CompositePipe.startSequence(): public Object startSequence(SunGraphics2D sg, Shape s, Rectangle dev, int[] abox); Changing this interface may become difficult: AlphaColorPipe.java: 41: public Object startSequence(SunGraphics2D sg, Shape s, Rectangle dev, OK, [s, dev, abox] unused AlphaPaintPipe.java 81: public Object startSequence(SunGraphics2D sg, Shape s, Rectangle devR, create a paint context: PaintContext paintContext = sg.paint.createContext(sg.getDeviceColorModel(), devR, s.getBounds2D(), sg.cloneTransform(), sg.getRenderingHints()); GeneralCompositePipe.java: 62: public Object startSequence(SunGraphics2D sg, Shape s, Rectangle devR, abox unused and create a paint context: PaintContext paintContext = sg.paint.createContext(model, devR, s.getBounds2D(), sg.cloneTransform(), hints); SpanClipRenderer.java 68: public Object startSequence(SunGraphics2D sg, Shape s, Rectangle devR, Forward to another composite pipe return new SCRcontext(ri, outpipe.startSequence(sg, s, devR, abox)); It could be possible to use TileState into PaintContext interface / fix implementations but it may become a tricky change (API change). What do you think ? Laurent 2013/4/11 Jim Graham <james.graham at oracle.com> I'm pretty familiar with all of this code and there aren't any places that save the tile array that I remember. The embedded code that Pisces was taken from had some caching of alpha arrays, but we didn't use or keep that when we converted it for use in the JDK...
It occurs to me that since you are collecting the various pieces of information into an object to store in the thread local storage, perhaps we should convert to a paradigm where an entire Tile Generation sequence uses that object "TileState?" as its main way to communicate info around the various stages. Thus, you don't really need an int[4] to store the 4 parameters, they could be stored directly in the TileState object. This would require more sweeping changes to the pipeline, but it might make the code a bit more readable (and make the hits to convert over more moot as they would be improving readability and give more focus to the relationships between all of the various bits of data). Then it simply becomes a matter of managing the lifetime and allocation of the TileState objects which is a minor update to the newly refactored code. ...jim On 4/10/13 3:59 PM, Sergey Bylokhov wrote: On 4/10/13 11:46 PM, Laurent Bourgčs wrote:
I see that some methods which take it as argument doesn't use them. And most of the time we pass AATileGenerator and abox[] to the same methods, so it could be merged?
For now I did not want to modify the AAShapePipe signatures: abox[] is filled by AATileGenerator implementations (ductus, pisces, others) in order to have the shape bounds and render only tiles covering this area. You still have to check all the places, where these objects are filled and used, and refactoring is a good start, no? Otherwise, how can you prove that these arrays are used as you would expect, These arrays could be stored like the cache or re-used for other purpose(if someone don't want to create new arrays). Probably it will be good to split all your changes / to a few CR. - cleanup - Some small changes which gave us most speedup - all other things. ??
Also I suggest to use jmh for java micrbenchmarks. http://openjdk.java.net/****projects/code-tools/jmh<http://openjdk.java.net/**projects/code-tools/jmh> <http:/**/openjdk.java.net/projects/**code-tools/jmh<http://openjdk.java.net/projects/code-tools/jmh>
So your test will be: http://cr.openjdk.java.net/~**serb/AAShapePipeBenchmark.java<http://cr.openjdk.java.net/%7Eserb/AAShapePipeBenchmark.java> **<http://cr.openjdk.java.net/%**7Eserb/AAShapePipeBenchmark.**java<http://cr.openjdk.java.net/%7Eserb/AAShapePipeBenchmark.java>
Thanks, I will try it asap Laurent -- Best regards, Sergey.
- Previous message: [OpenJDK 2D-Dev] AAShapePipe concurrency & memory waste
- Next message: [OpenJDK 2D-Dev] AAShapePipe concurrency & memory waste
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]