Why one can't submit an alternative thread pool to the streams API?

Luke Hutchison luke.hutch at gmail.com
Sat Oct 26 00:45:09 UTC 2019


On Sun, Oct 13, 2019 at 9:17 AM Brian Goetz <brian.goetz at oracle.com> wrote:

> Don’t try to create and manage your own pools. Don’t try to expose APIs that encourage your users to create and manage their own pools. Use parallel streams for data parallelism, which automatically use the common pool. The common pool is sized to the number of cores; if you create more threads, then they just compete for the cores, and you’re paying for extra context switching and you will get less throughput, plus (much) more memory consumption, more configuration, and more complexity.
>
> The “I know what is happening” that you are seeking is an illusion. Don’t be tempted by it. 99.9% of the time, you’re just going to make it worse.

This advice isn't optimal for every type of parallel workload. For example, on spinning disks with high seek latency, when reading and/or writing multiple large files in parallel, one file per stream item, performance will completely tank if the parallel stream runs on more than 2 or so threads, because the drive head has to keep seeking back and forth between the files.

Most of the time when a job is submitted to a custom pool, it is to control the level of parallelism. If you don't want people to use anything other than the common pool, then you should at least add a provision to the API to set the level of parallelism for a given stream, e.g. .parallel(2).
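
For reference, the usual way people bound a stream's parallelism today is to invoke the terminal operation from inside a dedicated ForkJoinPool. The sketch below assumes the widely observed (but, as far as I know, unspecified) behavior that a parallel stream runs its tasks in the ForkJoinPool its terminal operation is called from. The class name BoundedParallelStream and the helper mapWithParallelism are hypothetical, made up for this illustration, and the file names are placeholders.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.function.Function;
import java.util.stream.Collectors;

public class BoundedParallelStream {

    /**
     * Hypothetical helper: maps each item with fn using a parallel stream whose
     * terminal operation is invoked from a dedicated pool of the given parallelism.
     * Assumption: the stream's tasks stay in that pool. This is an implementation
     * detail of current JDKs, not a documented guarantee of the streams API.
     */
    static <T, R> List<R> mapWithParallelism(List<T> items, int parallelism, Function<T, R> fn)
            throws InterruptedException, ExecutionException {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            // submit(Callable) runs the whole pipeline on a worker thread of 'pool'.
            return pool.submit(
                    () -> items.parallelStream().map(fn).collect(Collectors.toList())
            ).get();
        } finally {
            pool.shutdown(); // the pool is created per call, so release its threads
        }
    }

    public static void main(String[] args) throws Exception {
        // Record which threads actually did the work.
        Set<String> workers = ConcurrentHashMap.newKeySet();
        List<Integer> lengths = mapWithParallelism(
                List.of("a.dat", "bb.dat", "ccc.dat", "dddd.dat"), // placeholder file names
                2,
                name -> {
                    workers.add(Thread.currentThread().getName());
                    return name.length(); // stand-in for reading or writing the file
                });
        System.out.println(lengths + " computed on " + workers.size() + " thread(s)");
    }
}
```

This is exactly the kind of create-and-manage-your-own-pool code the quoted advice argues against, which is why a per-stream setting along the lines of the suggested .parallel(2) would be the simpler option for this use case.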



More information about the jdk-dev mailing list