[OpenJDK 2D-Dev] sun.java2D.Pisces renderer Performance and Memory enhancements
Laurent Bourgès bourges.laurent at gmail.com
Fri May 10 06:50:17 UTC 2013
Jim,
FYI, I am working on optimizing the 2 hotspot methods identified with oprofile (see the specific emails):
- ScanLineIterator.next() ~ 35%
- Renderer.endRendering(...) ~ 20%
I think that the ScanLineIterator class is no longer useful and could be merged into Renderer directly. I am trying to optimize these 2 code paths (crossings, and crossings -> alpha), but it is quite difficult because I first have to understand what HotSpot does with them (down to the generated assembler code).
For now I want to keep Pisces in Java, as HotSpot is efficient enough and the algorithm can probably be reworked a bit. A few questions:
- Should the edges be sorted by YMAX ONCE, to avoid a complete traversal of the edges to count crossings for each Y value?

    if ((bucketcount & 0x1) != 0) {
        int newCount = 0;
        for (int i = 0, ecur; i < count; i++) {
            ecur = ptrs[i];
            if (_edgesInt[ecur + YMAX] > cury) {
                ptrs[newCount++] = ecur;
            }
        }
        count = newCount;
    }
- Why are the crossings multiplied by 2 and then divided by 2 (plus rounding issues)?

    for (int i = 0, ecur, j; i < count; i++) {
        ecur = ptrs[i];
        curx = _edges[ecur /* + CURX */];
        _edges[ecur /* + CURX */] = curx + _edges[ecur + SLOPE];

        cross = ((int) curx) << 1;
        if (_edgesInt[ecur + OR] != 0 /* > 0 */) {
            cross |= 1;
        }

  and later:

    int lowx = crossings[0] >> 1;
    int highx = crossings[numCrossings - 1] >> 1;
    [...]
    for (int i = 0; i < numCrossings; i++) {
        int curxo = crossings[i];
        int curx = curxo >> 1;
- Last x pixel processing: could you explain it to me?

    int pix_xmax = x1 >> SUBPIXEL_LG_POSITIONS_X;
    int tmp = (x0 & SUBPIXEL_MASK_X);
    alpha[pix_x] += SUBPIXEL_POSITIONS_X - tmp;
    alpha[pix_x + 1] += tmp;
    tmp = (x1 & SUBPIXEL_MASK_X);
    alpha[pix_xmax] -= SUBPIXEL_POSITIONS_X - tmp;
    alpha[pix_xmax + 1] -= tmp;
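
To make the x2 / divide-by-2 question concrete, here is a minimal sketch of the packing as I read it from the excerpt. The class and method names are hypothetical, and the idea that bit 0 exists so that a single int[] can be sorted by x while the OR flag travels with each value is only my assumption:

```java
import java.util.Arrays;

final class CrossingPacking {
    // Pack the truncated x coordinate into the upper bits and a flag into bit 0,
    // mirroring "cross = ((int) curx) << 1; ... cross |= 1;" from the excerpt.
    static int pack(float curx, boolean flag) {
        int cross = ((int) curx) << 1;
        if (flag) {
            cross |= 1;
        }
        return cross;
    }

    // Recover the x coordinate, mirroring "curxo >> 1".
    static int x(int cross) {
        return cross >> 1;
    }

    // Recover the flag stored in bit 0.
    static boolean flag(int cross) {
        return (cross & 1) != 0;
    }

    public static void main(String[] args) {
        int[] crossings = {
            pack(12.7f, true), pack(3.2f, false), pack(12.1f, false)
        };
        // Sorting the packed ints orders them by x first, then by the flag.
        Arrays.sort(crossings);
        for (int c : crossings) {
            System.out.println(x(c) + " " + flag(c));
        }
    }
}
```

In this sketch the shift pair itself loses nothing (barring int overflow): the only rounding comes from the (int) truncation of curx.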
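
For the last question, my current reading (which I would like confirmed) is that alpha[] holds per-pixel deltas rather than coverage values, so that a running sum over the row gives the number of covered subpixels in each pixel. The sketch below only illustrates that reading: the constant values, the definition of pix_x and the running-sum step are assumptions, since none of them appear in the excerpt.

```java
import java.util.Arrays;

final class AlphaDeltas {
    // Assumed values: the excerpt does not show how these constants are defined.
    static final int SUBPIXEL_LG_POSITIONS_X = 3;
    static final int SUBPIXEL_POSITIONS_X = 1 << SUBPIXEL_LG_POSITIONS_X;
    static final int SUBPIXEL_MASK_X = SUBPIXEL_POSITIONS_X - 1;

    // Record the span [x0, x1), given in subpixel units, as deltas in alpha.
    // alpha must have at least (x1 >> SUBPIXEL_LG_POSITIONS_X) + 2 entries.
    static void addSpan(int[] alpha, int x0, int x1) {
        int pix_x = x0 >> SUBPIXEL_LG_POSITIONS_X; // assumed definition
        int pix_xmax = x1 >> SUBPIXEL_LG_POSITIONS_X;
        int tmp = (x0 & SUBPIXEL_MASK_X);
        alpha[pix_x] += SUBPIXEL_POSITIONS_X - tmp;
        alpha[pix_x + 1] += tmp;
        tmp = (x1 & SUBPIXEL_MASK_X);
        alpha[pix_xmax] -= SUBPIXEL_POSITIONS_X - tmp;
        alpha[pix_xmax + 1] -= tmp;
    }

    // Running sum of the deltas: covered subpixels per pixel (assumed later step).
    static int[] coverage(int[] alpha) {
        int[] cov = new int[alpha.length];
        int sum = 0;
        for (int i = 0; i < alpha.length; i++) {
            sum += alpha[i];
            cov[i] = sum;
        }
        return cov;
    }

    public static void main(String[] args) {
        int[] alpha = new int[4];
        addSpan(alpha, 3, 21); // subpixels 3..20 with 8 subpixels per pixel
        System.out.println(Arrays.toString(coverage(alpha))); // prints [5, 8, 5, 0]
    }
}
```

With 8 subpixels per pixel, the span 3..20 covers 5 subpixels of pixel 0, all 8 of pixel 1 and 5 of pixel 2, which matches the running sum.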
Finally, it seems that with some HotSpot settings (-XX:CompileThreshold=1000 and -XX:+AggressiveOpts) the JIT compiles these hotspots better ...
2013/5/8 Jim Graham <james.graham at oracle.com>:
> This is amazing work, Laurent! I'll look over the code changes soon. Note that the "2 edge arrays" issue goes away if we can use native methods and C structs. It may be faster still in that case...
Thanks; probably the edgeBucket / edgeBucketCount arrays could be merged into a single one to improve cache affinity.
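
As a rough sketch of what I mean by merging: the two values of a row could be interleaved in one int[], so that reading a bucket and its count touches adjacent memory. The class and method names are hypothetical, and the slot contents (head edge pointer, plain count, 0 for an empty bucket) are simplifying assumptions, not the current Renderer layout:

```java
final class MergedEdgeBuckets {
    // Interleaved layout: [2 * row] = head edge pointer, [2 * row + 1] = count.
    private final int[] buckets;

    MergedEdgeBuckets(int numRows) {
        buckets = new int[2 * numRows];
    }

    int head(int row) {
        return buckets[row << 1];
    }

    int count(int row) {
        return buckets[(row << 1) + 1];
    }

    // Make edgePtr the new head of the row's bucket and bump its count.
    // Returns the previous head so the caller can link it after the new edge.
    int push(int row, int edgePtr) {
        int i = row << 1;
        int previousHead = buckets[i];
        buckets[i] = edgePtr;
        buckets[i + 1]++;
        return previousHead;
    }

    public static void main(String[] args) {
        MergedEdgeBuckets b = new MergedEdgeBuckets(4);
        b.push(2, 40);
        b.push(2, 56);
        System.out.println(b.head(2) + " " + b.count(2));
    }
}
```

Whether this actually helps would have to be measured; it only shows the layout.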
Let's stay in Java ... as HotSpot is so efficient (until proven otherwise).
FYI, I can write C/C++ code but I have never written JNI code. Could somebody help us port only these 2 hotspot methods?
PS: I will attend a conference next week (in Germany), so I will be less available to work on the code, but I will read my emails.
Laurent