LUCENE-8962: merge small segments on commit by msokolov · Pull Request #1552 · apache/lucene-solr

I'll beast this PR on my 128-core AMD Ryzen box :)

I ran this overnight, consuming a lot of electricity and generating heat in my already too hot basement ;0

It finished all Lucene (core + modules) tests 889 times, and hit a few interesting failures! I have not yet checked 1) whether they reproduce, or 2) whether these failures also occur on clean master:

1st:

  2> NOTE: reproduce with: ant test  -Dtestcase=TestIndexFileDeleter -Dtests.method=testExcInDecRef -Dtests.seed=DC21EB3B9D4052A4 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=sw -Dtests.timezone=Africa/Cairo -Dtests.asserts=true -Dtests.file.encoding=UTF-8
  2> Jun 05, 2020 4:16:02 AM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
  2> WARNING: Uncaught exception in thread: Thread[Lucene Merge Thread #0,5,TGRP-TestIndexFileDeleter]
  2> org.apache.lucene.index.MergePolicy$MergeException: java.lang.RuntimeException: segments must include at least one segment
  2>    at __randomizedtesting.SeedInfo.seed([DC21EB3B9D4052A4]:0)
  2>    at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:703)
  2>    at org.apache.lucene.index.TestIndexFileDeleter$2.handleMergeException(TestIndexFileDeleter.java:438)
  2>    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:694)
  2> Caused by: java.lang.RuntimeException: segments must include at least one segment
  2>    at org.apache.lucene.index.MergePolicy$OneMerge.<init>(MergePolicy.java:235)
  2>    at org.apache.lucene.index.IndexWriter$2.mergeFinished(IndexWriter.java:3182)
  2>    at org.apache.lucene.index.IndexWriter.closeMergeReaders(IndexWriter.java:4418)
  2>    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4720)
  2>    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4145)
  2>    at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5706)
  2>    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:624)
  2>    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:682)
  2>
  2> NOTE: leaving temporary files on disk at: /tmp/lucene_tests/lucene.index.TestIndexFileDeleter_DC21EB3B9D4052A4-001
  2> NOTE: test params are: codec=Lucene86, sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@248b50a7), locale=sw, timezone=Africa/Cairo
  2> NOTE: Linux 5.5.6-arch1-1 amd64/Oracle Corporation 11.0.6 (64-bit)/cpus=128,threads=1,free=165714056,total=268435456
  2> NOTE: All tests run in this JVM: [TestTessellator, TestFilterLeafReader, TestNRTReaderCleanup, TestIndexFileDeleter]

2nd:

org.apache.lucene.search.TestPhraseWildcardQuery > testExplain FAILED
    org.junit.ComparisonFailure: expected:<...tle:"t?e b* b*") in [1]) [AssertingSimilari...> but was:<...tle:"t?e b* b*") in [4]) [AssertingSimilari...>
        at __randomizedtesting.SeedInfo.seed([698AF96B1C28F923:DF7EB48A3083036C]:0)
        at org.junit.Assert.assertEquals(Assert.java:115)
        at org.junit.Assert.assertEquals(Assert.java:144)
        at org.apache.lucene.search.TestPhraseWildcardQuery.testExplain(TestPhraseWildcardQuery.java:233)
        ...
  2> NOTE: reproduce with: ant test  -Dtestcase=TestPhraseWildcardQuery -Dtests.method=testExplain -Dtests.seed=698AF96B1C28F923 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=ebu -Dtests.timezone=SystemV/CST6 -Dtests.asserts=true -Dtests.file.encoding=UTF-8
  2> NOTE: test params are: codec=Asserting(Lucene86): {other=PostingsFormat(name=Direct), author=BlockTreeOrds(blocksize=128), category=PostingsFormat(name=Asserting), title=PostingsFormat(name=Direct)}, docValues:{}, maxPointsInLeafNode=576, maxMBSortInHeap=6.318571589527405, sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@37191dc0), locale=ebu, timezone=SystemV/CST6
  2> NOTE: Linux 5.5.6-arch1-1 amd64/Oracle Corporation 11.0.6 (64-bit)/cpus=128,threads=1,free=176758768,total=268435456
  2> NOTE: All tests run in this JVM: [TestPhraseWildcardQuery]

3rd:

org.apache.lucene.index.TestIndexWriter > test suite's output saved to /l/trunk/lucene/core/build/test-results/test/outputs/OUTPUT-org.apache.lucene.index.TestIndexWriter.txt, copied below:
  2> xun 05, 2020 6:26:03 DE LA MAÑANA com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
  2> WARNING: Uncaught exception in thread: Thread[Thread-50,5,TGRP-TestIndexWriter]
  2> java.lang.AssertionError: java.nio.file.NoSuchFileException: _1y.fdt
  2>    at __randomizedtesting.SeedInfo.seed([C384705D23CB7F34]:0)
  2>    at org.apache.lucene.index.TestIndexWriter.lambda$testRandomOperations$48(TestIndexWriter.java:3886)
  2>    at java.base/java.lang.Thread.run(Thread.java:834)
  2> Caused by: java.nio.file.NoSuchFileException: _1y.fdt
  2>    at org.apache.lucene.store.ByteBuffersDirectory.deleteFile(ByteBuffersDirectory.java:148)
  2>    at org.apache.lucene.store.MockDirectoryWrapper.deleteFile(MockDirectoryWrapper.java:607)
  2>    at org.apache.lucene.store.LockValidatingDirectoryWrapper.deleteFile(LockValidatingDirectoryWrapper.java:38)
  2>    at org.apache.lucene.index.IndexFileDeleter.deleteFile(IndexFileDeleter.java:696)
  2>    at org.apache.lucene.index.IndexFileDeleter.deleteFiles(IndexFileDeleter.java:690)
  2>    at org.apache.lucene.index.IndexFileDeleter.decRef(IndexFileDeleter.java:589)
  2>    at org.apache.lucene.index.IndexFileDeleter.deleteCommits(IndexFileDeleter.java:382)
  2>    at org.apache.lucene.index.IndexFileDeleter.checkpoint(IndexFileDeleter.java:527)
  2>    at org.apache.lucene.index.IndexWriter.finishCommit(IndexWriter.java:3601)
  2>    at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3557)
  2>    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3507)
  2>    at org.apache.lucene.index.TestIndexWriter.lambda$testRandomOperations$48(TestIndexWriter.java:3879)
  2>    ... 1 more

  2> NOTE: reproduce with: ant test  -Dtestcase=TestIndexWriter -Dtests.method=testRandomOperations -Dtests.seed=C384705D23CB7F34 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=ast -Dtests.timezone=Libya -Dtests.asserts=true -Dtests.file.encoding=UTF-8
  2> NOTE: test params are: codec=Asserting(Lucene86): {=PostingsFormat(name=LuceneVarGapFixedInterval), a=PostingsFormat(name=Direct), c=PostingsFormat(name=LuceneVarGapDocFreqInterval), string=PostingsFormat(name=Direct), foo=BlockTreeOrds(blocksize=128), body=BlockTreeOrds(blocksize=128), version=PostingsFormat(name=LuceneVarGapFixedInterval), content=PostingsFormat(name=Direct), str=PostingsFormat(name=Direct), tvtest=BlockTreeOrds(blocksize=128), field=BlockTreeOrds(blocksize=128), content4=PostingsFormat(name=LuceneVarGapDocFreqInterval), str3=BlockTreeOrds(blocksize=128), content1=BlockTreeOrds(blocksize=128), binary=PostingsFormat(name=Direct), str2=PostingsFormat(name=Direct), id=PostingsFormat(name=LuceneVarGapDocFreqInterval), myfield=BlockTreeOrds(blocksize=128)}, docValues:{numval=DocValuesFormat(name=Lucene80), my_deletes=DocValuesFormat(name=Lucene80), sortedsetdv=DocValuesFormat(name=Lucene80), sortednumericdv=DocValuesFormat(name=Asserting), soft_deletes=DocValuesFormat(name=Lucene80), num=DocValuesFormat(name=Asserting), binarydv=DocValuesFormat(name=Lucene80), dv=DocValuesFormat(name=Asserting), dv_field=DocValuesFormat(name=Lucene80), numericdv=DocValuesFormat(name=Lucene80), soft_delete=DocValuesFormat(name=Asserting), sorteddv=DocValuesFormat(name=Lucene80), id=DocValuesFormat(name=Lucene80), soft_deletes_1=DocValuesFormat(name=Lucene80)}, maxPointsInLeafNode=95, maxMBSortInHeap=5.8271489113157, sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@481a5d24), locale=ast, timezone=Libya
  2> NOTE: Linux 5.5.6-arch1-1 amd64/Oracle Corporation 11.0.6 (64-bit)/cpus=128,threads=1,free=112682104,total=268435456
  2> NOTE: All tests run in this JVM: [TestForUtil, TestIndexWriter]

4th:

  2> NOTE: reproduce with: ant test  -Dtestcase=TestIndexFileDeleter -Dtests.method=testExcInDecRef -Dtests.seed=DBA89BFE7C61C7AB -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=ln-CG -Dtests.timezone=America/Noronha -Dtests.asserts=true -Dtests.file.encoding=UTF-8
  2> yun 05, 2020 7:27:19 NTƆ́NGƆ́ com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
  2> WARNING: Uncaught exception in thread: Thread[Lucene Merge Thread #1,5,TGRP-TestIndexFileDeleter]
  2> org.apache.lucene.index.MergePolicy$MergeException: java.lang.RuntimeException: segments must include at least one segment
  2>    at __randomizedtesting.SeedInfo.seed([DBA89BFE7C61C7AB]:0)
  2>    at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:703)
  2>    at org.apache.lucene.index.TestIndexFileDeleter$2.handleMergeException(TestIndexFileDeleter.java:438)
  2>    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:694)
  2> Caused by: java.lang.RuntimeException: segments must include at least one segment
  2>    at org.apache.lucene.index.MergePolicy$OneMerge.<init>(MergePolicy.java:235)
  2>    at org.apache.lucene.index.IndexWriter$2.mergeFinished(IndexWriter.java:3182)
  2>    at org.apache.lucene.index.IndexWriter.closeMergeReaders(IndexWriter.java:4418)
  2>    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4720)
  2>    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4145)
  2>    at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5706)
  2>    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:624)
  2>    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:682)
  2>
  2> NOTE: leaving temporary files on disk at: /tmp/lucene_tests/lucene.index.TestIndexFileDeleter_DBA89BFE7C61C7AB-001
  2> NOTE: test params are: codec=Lucene86, sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@44ab18a0), locale=ln-CG, timezone=America/Noronha
  2> NOTE: Linux 5.5.6-arch1-1 amd64/Oracle Corporation 11.0.6 (64-bit)/cpus=128,threads=1,free=150481312,total=268435456
  2> NOTE: All tests run in this JVM: [TestTessellator, TestFilterLeafReader, TestNRTReaderCleanup, TestIndexFileDeleter]

5th:

   >     java.lang.RuntimeException: segments must include at least one segment
   >         at __randomizedtesting.SeedInfo.seed([AC9C0966B9BC03C8:9144A74A81525DB8]:0)
   >         at org.apache.lucene.index.MergePolicy$OneMerge.<init>(MergePolicy.java:235)
   >         at org.apache.lucene.index.IndexWriter$2.mergeFinished(IndexWriter.java:3182)
   >         at org.apache.lucene.index.IndexWriter.closeMergeReaders(IndexWriter.java:4418)
   >         at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4720)
   >         at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4145)
   >         at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5706)
   >         at org.apache.lucene.index.SerialMergeScheduler.merge(SerialMergeScheduler.java:40)
   >         at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:3331)
   >         at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3549)
   >         at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3507)
   >         at org.apache.lucene.index.TestIndexWriterExceptions2.testBasics(TestIndexWriterExceptions2.java:211)
   >         ...
  2> NOTE: reproduce with: ant test  -Dtestcase=TestIndexWriterExceptions2 -Dtests.method=testBasics -Dtests.seed=AC9C0966B9BC03C8 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=et-EE -Dtests.timezone=Australia/Lord_Howe -Dtests.asserts=true -Dtests.file.encoding=UTF-8
  2> NOTE: test params are: codec=Asserting(Lucene86): {}, docValues:{}, maxPointsInLeafNode=1394, maxMBSortInHeap=6.947614894754386, sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@52aafa77), locale=et-EE, timezone=Australia/Lord_Howe
  2> NOTE: Linux 5.5.6-arch1-1 amd64/Oracle Corporation 11.0.6 (64-bit)/cpus=128,threads=1,free=216811792,total=268435456
  2> NOTE: All tests run in this JVM: [TestLucene50StoredFieldsFormatHighCompression, Test2BPostingsBytes, TestIndexWriterExceptions2]