On 7/1/20 11:13 AM, Michael Kruse wrote:
On Wed, Jul 1, 2020 at 9:33 AM Hal Finkel <hfinkel@anl.gov> wrote:

I definitely agree that we should not be trying to do this kind of checking using textual metadata-node matching in FileCheck. The alternative already available is to add an analysis pass with some kind of verifier output. This output, not the raw metadata itself, can be checked by FileCheck. We also need to check the verification code, but at least that's something we can keep in just one place. For parallel annotations, we already have such a thing (we can run opt -loops -analyze; e.g., in test/Analysis/LoopInfo/annotated-parallel-complex.ll). We also do this kind of thing for the cost model (by running with -cost-model -analyze). To what extent would making more-extensive use of this technique address the use cases you have in mind?

The CHECK lines in annotated-parallel-complex.ll are:

; CHECK: Parallel Loop at depth 1
; CHECK-NOT: Parallel
; CHECK:     Loop at depth 2
; CHECK:         Parallel Loop
; CHECK:             Parallel Loop

When adding this test, I had to change LoopInfo to emit the "Parallel" in front of "Loop". For readability, I would have preferred the parallel info as a "tag", such as `Loop (parallel) at depth 1`, but this would break other tests that check "Loop at depth 1". Later I noticed that there are regression tests that check "LoopFullUnrollPass on Loop at depth 3 containing: %l0.0.0", but it seems I got lucky in that none of these loops have parallel annotations.

"CHECK-NOT" is inherently fragile. It is too easy to make a change in LLVM that changes the text output and oversee that this test does not check what it was supposed to test. For a FileCheck-friendlier output, it could emit "NonParallel" and match this. However, this clutters the output for humans, will actually break the "LoopFullUnrollPass on Loop at depth 3 ..." and "CHECK: Parallel" matches this as well since FileCheck ignores word boundaries.

The CHECK lines test more than necessary. The first and third CHECK lines also check the "at depth" part to make sure they match the correct loop (and not, e.g., the next inner loop), although we are not interested in the loop depths themselves. Ironically, this is also the reason why there cannot be tags between "Loop" and "at depth".


We can have different printing modes. There can be a more-human-friendly mode and a more-FileCheck-friendly mode. Or modes customized for different kinds of tests. I agree, however, that this does not solve the fragility problems with CHECK-NOT.
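To make that concrete, a hypothetical sketch of such a mode switch could look like the following. LoopPrintMode and the free function are invented for illustration; only Loop::isAnnotatedParallel() and Loop::getLoopDepth() are existing API.

#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

// Invented for illustration; no such mode switch exists in LoopInfo today.
enum class LoopPrintMode { HumanFriendly, FileCheckFriendly };

static void printLoopHeader(const Loop &L, raw_ostream &OS,
                            LoopPrintMode Mode) {
  if (Mode == LoopPrintMode::FileCheckFriendly) {
    // Stable key=value tokens with clear word boundaries; easy to CHECK
    // and CHECK-NOT without matching unrelated output.
    OS << "loop: parallel=" << (L.isAnnotatedParallel() ? 1 : 0)
       << " depth=" << L.getLoopDepth() << "\n";
  } else {
    // Roughly the current human-oriented form discussed above.
    OS << (L.isAnnotatedParallel() ? "Parallel " : "") << "Loop at depth "
       << L.getLoopDepth() << "\n";
  }
}

A key=value form like "parallel=0" at least gives FileCheck word-bounded tokens, so a CHECK-NOT on "parallel=1" would not accidentally match unrelated lines.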



Not all of the loop metadata have passes that print them. For instance, there are loop properties such as llvm.loop.isvectorized. Reading those is usually done using utility functions such as getBooleanLoopAttribute(L, "llvm.loop.isvectorized"). A solution using FileCheck would be to add another pass that prints loop metadata. That pass would only be used for testing, making the release LLVM binaries larger than necessary, and it would still have the other problems.
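For illustration, this is roughly how such a property is consulted from C++. getBooleanLoopAttribute is the existing helper from llvm/Transforms/Utils/LoopUtils.h; the enclosing function is made up for this sketch.

#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Transforms/Utils/LoopUtils.h"

using namespace llvm;

// Hypothetical helper for a transform pass: skip loops that were already
// vectorized. Only getBooleanLoopAttribute() is existing LLVM API; it reads
// the loop ID metadata directly, so nothing is ever printed that FileCheck
// could match.
static bool isAlreadyVectorized(const Loop *L) {
  return getBooleanLoopAttribute(L, "llvm.loop.isvectorized");
}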

Processing the IR through a tool can make the output more FileCheck-friendly, but it doesn't make its problems disappear. IMHO it adds to the maintenance burden since it adds more textual interfaces.


That's the interesting question... it does add to the maintenance burden. However, having textual outputs is also quite convenient when debugging things (because I can change the input and see the output quickly, without needing to create and compile another program). Obviously, at some point, this becomes ridiculous. How much is too much? I don't know. But it's also not clear to me that we're at that point yet. We could add more textual analysis outputs and still have that be a net benefit in many places.

In cases where the requisite output would just be too specific, we do have unit tests. Should we just add more? Maybe we're too happy to add lit tests instead of unit tests for some of these cases.
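As a sketch of that direction, a unit test for the llvm.loop.isvectorized case could look roughly like this, modeled on the style of the existing LoopInfo unit tests; the test name and IR snippet are invented for illustration.

#include "llvm/Analysis/LoopInfo.h"
#include "llvm/AsmParser/Parser.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/Transforms/Utils/LoopUtils.h"
#include "gtest/gtest.h"

using namespace llvm;

TEST(LoopMetadataTest, IsVectorizedAttribute) {
  LLVMContext Ctx;
  SMDiagnostic Err;
  // Single self-loop carrying the llvm.loop.isvectorized property.
  std::unique_ptr<Module> M = parseAssemblyString(
      "define void @f() {\n"
      "entry:\n"
      "  br label %loop\n"
      "loop:\n"
      "  br label %loop, !llvm.loop !0\n"
      "}\n"
      "!0 = distinct !{!0, !1}\n"
      "!1 = !{!\"llvm.loop.isvectorized\", i32 1}\n",
      Err, Ctx);
  ASSERT_TRUE(M != nullptr);

  Function *F = M->getFunction("f");
  DominatorTree DT(*F);
  LoopInfo LI(DT);
  ASSERT_FALSE(LI.empty());

  // Query the property through the C++ API instead of matching printed text.
  const Loop *L = *LI.begin();
  EXPECT_TRUE(getBooleanLoopAttribute(L, "llvm.loop.isvectorized"));
}

A test like this checks the metadata semantics directly and is immune to changes in any analysis printer's textual output.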



Michael







--   
Hal Finkel  
Lead, Compiler Technology and Programming Languages  
Leadership Computing Facility  
Argonne National Laboratory