Loading... (original) (raw)
Current implementation of AbstractStringBuilder.append does:
*/
public AbstractStringBuilder append(String str) {
...
int len = str.length();
ensureCapacityInternal(count + len);
...
}
private void ensureCapacityInternal(int minimumCapacity) {
if (minimumCapacity - value.length > 0)
expandCapacity(minimumCapacity);
}
void expandCapacity(int minimumCapacity) {
int newCapacity = value.length * 2 + 2;
if (newCapacity - minimumCapacity < 0)
newCapacity = minimumCapacity;
...
value = Arrays.copyOf(value, newCapacity);
}
It would appear we double the value array every time... But no! If you append, say, a large String right away into a clean StringBuilder, and it blows ((value.length * 2 + 2) - minimumCapacity < 0) check, we expand SB to hold the String exactly. Then, an upcoming append() will have to resize *again*, even if some trivial space is required.
E.g. these two tests differ wildly in performance:
@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(3)
public class ConcatOrderCopy {
@Param({"1", "64", "4096"})
int size;
private String s;
private long l;
@Setup
public void setup() {
s = "";
for (int c = 0; c < size; c++) {
s += "a";
}
l = Long.MAX_VALUE;
}
@Benchmark
public String string_long() {
return new StringBuilder().append(s).append(l).toString();
}
@Benchmark
public String long_string() {
return new StringBuilder().append(l).append(s).toString();
}
}
On 8u40, Linux x86_64, it yields:
Benchmark (size) Mode Cnt Score Error Units
ConcatOrderCopy.long_string 1 avgt 15 75.292 ± 4.153 ns/op
ConcatOrderCopy.long_string 64 avgt 15 117.067 ± 28.454 ns/op
ConcatOrderCopy.long_string 4096 avgt 15 1402.182 ± 19.783 ns/op
ConcatOrderCopy.string_long 1 avgt 15 70.958 ± 3.714 ns/op
ConcatOrderCopy.string_long 64 avgt 15 125.451 ± 51.675 ns/op
ConcatOrderCopy.string_long 4096 avgt 15 2640.751 ± 20.027 ns/op
On large enough data, concatenating String+long is *much* slower than concatenating long+String.
The answer seems to be to always allocate some gap at the end. Say, "minimumCapacity + 20"?