Avoid sliced lock contention in internal engine by jasontedor · Pull Request #18060 · elastic/elasticsearch

Today we use a sliced lock strategy to prevent concurrent updates to the same document. The number of sliced locks is computed as a linear function of the number of logical processors. Unfortunately, the probability of a collision on a sliced lock is subject to the birthday problem and grows faster than one might expect. In fact, the mathematics works out such that, for a fixed target probability of collision, the number of lock slices should grow like the square of the number of logical processors. This is less than ideal, and we can do better anyway.

This commit introduces a strategy for avoiding lock contention within the internal engine. Ideally, we would see lock contention only when there are concurrent updates to the same document. We can get close to this ideal by associating a lock with the ID of each document, with the association held in a concurrent hash map. The JDK ConcurrentHashMap does take locks internally, but it has several strategies for avoiding them, and when it does take them it holds them only very briefly. This implementation associates a reference count with the lock for each document ID and automatically removes the ID from the concurrent hash map when the reference count reaches zero. Lastly, it uses a pool of locks to minimize object allocations during indexing.
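As a back-of-the-envelope check of the birthday-problem scaling mentioned above (my own sketch, not part of the original PR), consider t concurrently indexing threads hashing document IDs onto n lock slices:

```latex
% Classic birthday-problem approximation for the probability that at least two
% of t threads land on the same of n lock slices:
P(\text{collision}) \approx 1 - e^{-t(t-1)/(2n)} \approx \frac{t(t-1)}{2n} \quad \text{for } t^2 \ll n
% Holding P at a fixed target while t (the number of logical processors) grows
% therefore requires n to grow on the order of t^2, not linearly in t.
```

For example, keeping the collision probability near 10% with 32 processors would take roughly 32 × 31 / (2 × 0.1) ≈ 5,000 slices, far more than a linear-in-processors scheme provides.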
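The per-document-ID locking described above can be sketched roughly as follows. This is a minimal illustration with hypothetical names (`DocumentLocks`, `CountedLock`), not the actual Elasticsearch implementation, and it omits the lock pooling that the commit mentions:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Minimal sketch: one lock per in-flight document ID, held in a ConcurrentHashMap
 * and reference counted so the entry is removed once no thread needs it.
 */
public final class DocumentLocks {

    // Lock plus a reference count; the count is only read and written inside
    // ConcurrentHashMap.compute(), which serializes updates for a given key.
    private static final class CountedLock extends ReentrantLock {
        int refCount = 1;
    }

    private final ConcurrentMap<String, CountedLock> locks = new ConcurrentHashMap<>();

    /** Acquire the lock for this document ID, creating the entry on first use. */
    public void acquire(String id) {
        CountedLock lock = locks.compute(id, (key, existing) -> {
            if (existing == null) {
                return new CountedLock();   // first thread interested in this ID
            }
            existing.refCount++;            // another thread already holds or awaits it
            return existing;
        });
        lock.lock();
    }

    /** Release the lock and drop the map entry when the last interested thread is done. */
    public void release(String id) {
        CountedLock lock = locks.get(id);   // safe: our refCount keeps the entry alive
        lock.unlock();
        locks.compute(id, (key, existing) ->
                --existing.refCount == 0 ? null : existing);  // returning null removes the entry
    }
}
```

A caller would bracket each index or update operation with `acquire(id)` and `release(id)` in a try/finally block. Because the reference count is only touched inside `compute`, the count and the lifetime of the map entry stay consistent even when several threads race on the same document ID, so two operations contend only when they really do target the same document.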