LUCENE-8962: merge small segments on commit by msokolov · Pull Request #1552 · apache/lucene-solr
I'll beast this PR on my 128-core AMD Ryzen box :)
I ran this overnight, consuming a lot of electricity and generating heat in my already-too-hot basement ;0
It finished all Lucene (core + modules) tests 889 times and hit a few interesting failures! I have not yet checked (1) whether they reproduce or (2) whether these failures also occur on clean master:
1st:
2> NOTE: reproduce with: ant test -Dtestcase=TestIndexFileDeleter -Dtests.method=testExcInDecRef -Dtests.seed=DC21EB3B9D4052A4 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=sw -Dtests.timezone=Africa/Cairo -Dtests.asserts=true -Dtests.file.encoding=UTF-8
2> Jun 05, 2020 4:16:02 AM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
2> WARNING: Uncaught exception in thread: Thread[Lucene Merge Thread #0,5,TGRP-TestIndexFileDeleter]
2> org.apache.lucene.index.MergePolicy$MergeException: java.lang.RuntimeException: segments must include at least one segment
2> at __randomizedtesting.SeedInfo.seed([DC21EB3B9D4052A4]:0)
2> at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:703)
2> at org.apache.lucene.index.TestIndexFileDeleter$2.handleMergeException(TestIndexFileDeleter.java:438)
2> at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:694)
2> Caused by: java.lang.RuntimeException: segments must include at least one segment
2> at org.apache.lucene.index.MergePolicy$OneMerge.<init>(MergePolicy.java:235)
2> at org.apache.lucene.index.IndexWriter$2.mergeFinished(IndexWriter.java:3182)
2> at org.apache.lucene.index.IndexWriter.closeMergeReaders(IndexWriter.java:4418)
2> at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4720)
2> at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4145)
2> at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5706)
2> at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:624)
2> at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:682)
2>
2> NOTE: leaving temporary files on disk at: /tmp/lucene_tests/lucene.index.TestIndexFileDeleter_DC21EB3B9D4052A4-001
2> NOTE: test params are: codec=Lucene86, sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@248b50a7), locale=sw, timezone=Africa/Cairo
2> NOTE: Linux 5.5.6-arch1-1 amd64/Oracle Corporation 11.0.6 (64-bit)/cpus=128,threads=1,free=165714056,total=268435456
2> NOTE: All tests run in this JVM: [TestTessellator, TestFilterLeafReader, TestNRTReaderCleanup, TestIndexFileDeleter]
2nd:
org.apache.lucene.search.TestPhraseWildcardQuery > testExplain FAILED
org.junit.ComparisonFailure: expected:<...tle:"t?e b* b*") in [1]) [AssertingSimilari...> but was:<...tle:"t?e b* b*") in [4]) [AssertingSimilari...>
at __randomizedtesting.SeedInfo.seed([698AF96B1C28F923:DF7EB48A3083036C]:0)
at org.junit.Assert.assertEquals(Assert.java:115)
at org.junit.Assert.assertEquals(Assert.java:144)
at org.apache.lucene.search.TestPhraseWildcardQuery.testExplain(TestPhraseWildcardQuery.java:233)
...
2> NOTE: reproduce with: ant test -Dtestcase=TestPhraseWildcardQuery -Dtests.method=testExplain -Dtests.seed=698AF96B1C28F923 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=ebu -Dtests.timezone=SystemV/CST6 -Dtests.asserts=true -Dtests.file.encoding=UTF-8
2> NOTE: test params are: codec=Asserting(Lucene86): {other=PostingsFormat(name=Direct), author=BlockTreeOrds(blocksize=128), category=PostingsFormat(name=Asserting), title=PostingsFormat(name=Direct)}, docValues:{}, maxPointsInLeafNode=576, maxMBSortInHeap=6.318571589527405, sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@37191dc0), locale=ebu, timezone=SystemV/CST6
2> NOTE: Linux 5.5.6-arch1-1 amd64/Oracle Corporation 11.0.6 (64-bit)/cpus=128,threads=1,free=176758768,total=268435456
2> NOTE: All tests run in this JVM: [TestPhraseWildcardQuery]
3rd:
org.apache.lucene.index.TestIndexWriter > test suite's output saved to /l/trunk/lucene/core/build/test-results/test/outputs/OUTPUT-org.apache.lucene.index.TestIndexWriter.txt, copied below:
2> xun 05, 2020 6:26:03 DE LA MAÑANA com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
2> WARNING: Uncaught exception in thread: Thread[Thread-50,5,TGRP-TestIndexWriter]
2> java.lang.AssertionError: java.nio.file.NoSuchFileException: _1y.fdt
2> at __randomizedtesting.SeedInfo.seed([C384705D23CB7F34]:0)
2> at org.apache.lucene.index.TestIndexWriter.lambda$testRandomOperations$48(TestIndexWriter.java:3886)
2> at java.base/java.lang.Thread.run(Thread.java:834)
2> Caused by: java.nio.file.NoSuchFileException: _1y.fdt
2> at org.apache.lucene.store.ByteBuffersDirectory.deleteFile(ByteBuffersDirectory.java:148)
2> at org.apache.lucene.store.MockDirectoryWrapper.deleteFile(MockDirectoryWrapper.java:607)
2> at org.apache.lucene.store.LockValidatingDirectoryWrapper.deleteFile(LockValidatingDirectoryWrapper.java:38)
2> at org.apache.lucene.index.IndexFileDeleter.deleteFile(IndexFileDeleter.java:696)
2> at org.apache.lucene.index.IndexFileDeleter.deleteFiles(IndexFileDeleter.java:690)
2> at org.apache.lucene.index.IndexFileDeleter.decRef(IndexFileDeleter.java:589)
2> at org.apache.lucene.index.IndexFileDeleter.deleteCommits(IndexFileDeleter.java:382)
2> at org.apache.lucene.index.IndexFileDeleter.checkpoint(IndexFileDeleter.java:527)
2> at org.apache.lucene.index.IndexWriter.finishCommit(IndexWriter.java:3601)
2> at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3557)
2> at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3507)
2> at org.apache.lucene.index.TestIndexWriter.lambda$testRandomOperations$48(TestIndexWriter.java:3879)
2> ... 1 more
2> NOTE: reproduce with: ant test -Dtestcase=TestIndexWriter -Dtests.method=testRandomOperations -Dtests.seed=C384705D23CB7F34 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=ast -Dtests.timezone=Libya -Dtests.asserts=true -Dtests.file.encoding=UTF-8
2> NOTE: test params are: codec=Asserting(Lucene86): {=PostingsFormat(name=LuceneVarGapFixedInterval), a=PostingsFormat(name=Direct), c=PostingsFormat(name=LuceneVarGapDocFreqInterval), string=PostingsFormat(name=Direct), foo=BlockTreeOrds(blocksize=128), body=BlockTreeOrds(blocksize=128), version=PostingsFormat(name=LuceneVarGapFixedInterval), content=PostingsFormat(name=Direct), str=PostingsFormat(name=Direct), tvtest=BlockTreeOrds(blocksize=128), field=BlockTreeOrds(blocksize=128), content4=PostingsFormat(name=LuceneVarGapDocFreqInterval), str3=BlockTreeOrds(blocksize=128), content1=BlockTreeOrds(blocksize=128), binary=PostingsFormat(name=Direct), str2=PostingsFormat(name=Direct), id=PostingsFormat(name=LuceneVarGapDocFreqInterval), myfield=BlockTreeOrds(blocksize=128)}, docValues:{numval=DocValuesFormat(name=Lucene80), my_deletes=DocValuesFormat(name=Lucene80), sortedsetdv=DocValuesFormat(name=Lucene80), sortednumericdv=DocValuesFormat(name=Asserting), soft_deletes=DocValuesFormat(name=Lucene80), num=DocValuesFormat(name=Asserting), binarydv=DocValuesFormat(name=Lucene80), dv=DocValuesFormat(name=Asserting), dv_field=DocValuesFormat(name=Lucene80), numericdv=DocValuesFormat(name=Lucene80), soft_delete=DocValuesFormat(name=Asserting), sorteddv=DocValuesFormat(name=Lucene80), id=DocValuesFormat(name=Lucene80), soft_deletes_1=DocValuesFormat(name=Lucene80)}, maxPointsInLeafNode=95, maxMBSortInHeap=5.8271489113157, sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@481a5d24), locale=ast, timezone=Libya
2> NOTE: Linux 5.5.6-arch1-1 amd64/Oracle Corporation 11.0.6 (64-bit)/cpus=128,threads=1,free=112682104,total=268435456
2> NOTE: All tests run in this JVM: [TestForUtil, TestIndexWriter]
4th:
2> NOTE: reproduce with: ant test -Dtestcase=TestIndexFileDeleter -Dtests.method=testExcInDecRef -Dtests.seed=DBA89BFE7C61C7AB -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=ln-CG -Dtests.timezone=America/Noronha -Dtests.asserts=true -Dtests.file.encoding=UTF-8
2> yun 05, 2020 7:27:19 NTƆ́NGƆ́ com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
2> WARNING: Uncaught exception in thread: Thread[Lucene Merge Thread #1,5,TGRP-TestIndexFileDeleter]
2> org.apache.lucene.index.MergePolicy$MergeException: java.lang.RuntimeException: segments must include at least one segment
2> at __randomizedtesting.SeedInfo.seed([DBA89BFE7C61C7AB]:0)
2> at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:703)
2> at org.apache.lucene.index.TestIndexFileDeleter$2.handleMergeException(TestIndexFileDeleter.java:438)
2> at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:694)
2> Caused by: java.lang.RuntimeException: segments must include at least one segment
2> at org.apache.lucene.index.MergePolicy$OneMerge.<init>(MergePolicy.java:235)
2> at org.apache.lucene.index.IndexWriter$2.mergeFinished(IndexWriter.java:3182)
2> at org.apache.lucene.index.IndexWriter.closeMergeReaders(IndexWriter.java:4418)
2> at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4720)
2> at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4145)
2> at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5706)
2> at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:624)
2> at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:682)
2>
2> NOTE: leaving temporary files on disk at: /tmp/lucene_tests/lucene.index.TestIndexFileDeleter_DBA89BFE7C61C7AB-001
2> NOTE: test params are: codec=Lucene86, sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@44ab18a0), locale=ln-CG, timezone=America/Noronha
2> NOTE: Linux 5.5.6-arch1-1 amd64/Oracle Corporation 11.0.6 (64-bit)/cpus=128,threads=1,free=150481312,total=268435456
2> NOTE: All tests run in this JVM: [TestTessellator, TestFilterLeafReader, TestNRTReaderCleanup, TestIndexFileDeleter]
5th:
> java.lang.RuntimeException: segments must include at least one segment
> at __randomizedtesting.SeedInfo.seed([AC9C0966B9BC03C8:9144A74A81525DB8]:0)
> at org.apache.lucene.index.MergePolicy$OneMerge.<init>(MergePolicy.java:235)
> at org.apache.lucene.index.IndexWriter$2.mergeFinished(IndexWriter.java:3182)
> at org.apache.lucene.index.IndexWriter.closeMergeReaders(IndexWriter.java:4418)
> at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4720)
> at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4145)
> at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5706)
> at org.apache.lucene.index.SerialMergeScheduler.merge(SerialMergeScheduler.java:40)
> at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:3331)
> at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3549)
> at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3507)
> at org.apache.lucene.index.TestIndexWriterExceptions2.testBasics(TestIndexWriterExceptions2.java:211)
> ...
2> NOTE: reproduce with: ant test -Dtestcase=TestIndexWriterExceptions2 -Dtests.method=testBasics -Dtests.seed=AC9C0966B9BC03C8 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=et-EE -Dtests.timezone=Australia/Lord_Howe -Dtests.asserts=true -Dtests.file.encoding=UTF-8
2> NOTE: test params are: codec=Asserting(Lucene86): {}, docValues:{}, maxPointsInLeafNode=1394, maxMBSortInHeap=6.947614894754386, sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@52aafa77), locale=et-EE, timezone=Australia/Lord_Howe
2> NOTE: Linux 5.5.6-arch1-1 amd64/Oracle Corporation 11.0.6 (64-bit)/cpus=128,threads=1,free=216811792,total=268435456
2> NOTE: All tests run in this JVM: [TestLucene50StoredFieldsFormatHighCompression, Test2BPostingsBytes, TestIndexWriterExceptions2]
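The 1st, 4th and 5th traces above all end in the same place: `MergePolicy$OneMerge.<init>` throwing "segments must include at least one segment", reached from `IndexWriter$2.mergeFinished`. Below is a minimal, self-contained sketch of that kind of constructor guard, to show what the message implies. It is not the Lucene source. `OneMergeSketch` and `tryBuild` are hypothetical stand-ins, segments are modeled as plain strings, and the idea that the guard fires on an empty segment list is an assumption inferred from the message text and the `<init>` frame.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class EmptyMergeGuardSketch {

  /** Hypothetical stand-in for MergePolicy.OneMerge; only a constructor guard is modeled. */
  static final class OneMergeSketch {
    final List<String> segments;

    OneMergeSketch(List<String> segments) {
      // Assumed guard: a merge over zero segments is rejected up front.
      if (segments.isEmpty()) {
        throw new RuntimeException("segments must include at least one segment");
      }
      this.segments = Collections.unmodifiableList(new ArrayList<>(segments));
    }
  }

  /** Returns the exception message if construction fails, or null if it succeeds. */
  static String tryBuild(List<String> segments) {
    try {
      new OneMergeSketch(segments);
      return null;
    } catch (RuntimeException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    // A non-empty list is accepted; an empty list trips the guard.
    System.out.println(tryBuild(List.of("_0", "_1")));
    System.out.println(tryBuild(List.of()));
  }
}
```

If the real guard behaves like this, then whatever `mergeFinished` passes to the constructor on these seeds would have been empty, which may be a useful first thing to check when trying the reproduce lines.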