LLVM: llvm::LoopVectorizationCostModel Class Reference
LoopVectorizationCostModel - estimates the expected speedups due to vectorization.
| Public Types | |
|---|---|
| enum | InstWidening { CM_Unknown, CM_Widen, CM_Widen_Reverse, CM_Interleave, CM_GatherScatter, CM_Scalarize, CM_VectorCall, CM_IntrinsicCall } |
| Decision that was taken during cost calculation for memory instruction. | |
| Public Member Functions | |
|---|---|
| LoopVectorizationCostModel (ScalarEpilogueLowering SEL, Loop *L, PredicatedScalarEvolution &PSE, LoopInfo *LI, LoopVectorizationLegality *Legal, const TargetTransformInfo &TTI, const TargetLibraryInfo *TLI, DemandedBits *DB, AssumptionCache *AC, OptimizationRemarkEmitter *ORE, std::function< BlockFrequencyInfo &()> GetBFI, const Function *F, const LoopVectorizeHints *Hints, InterleavedAccessInfo &IAI, bool OptForSize) | |
| FixedScalableVFPair | computeMaxVF (ElementCount UserVF, unsigned UserIC) |
| bool | runtimeChecksRequired () |
| bool | selectUserVectorizationFactor (ElementCount UserVF) |
| Setup cost-based decisions for user vectorization factor. | |
| bool | useMaxBandwidth (TargetTransformInfo::RegisterKind RegKind) |
| bool | shouldConsiderRegPressureForVF (ElementCount VF) |
| std::pair< unsigned, unsigned > | getSmallestAndWidestTypes () |
| void | setCostBasedWideningDecision (ElementCount VF) |
| A memory access instruction may be vectorized in more than one way. | |
| void | setVectorizedCallDecision (ElementCount VF) |
| A call may be vectorized in different ways depending on whether we have vectorized variants available and whether the target supports masking. | |
| void | collectValuesToIgnore () |
| Collect values we want to ignore in the cost model. | |
| void | collectElementTypesForWidening () |
| Collect all element types in the loop for which widening is needed. | |
| void | collectInLoopReductions () |
| Split reductions into those that happen in the loop, and those that happen outside. | |
| bool | useOrderedReductions (const RecurrenceDescriptor &RdxDesc) const |
| Returns true if we should use strict in-order reductions for the given RdxDesc. | |
| const MapVector< Instruction *, uint64_t > & | getMinimalBitwidths () const |
| bool | isProfitableToScalarize (Instruction *I, ElementCount VF) const |
| bool | isUniformAfterVectorization (Instruction *I, ElementCount VF) const |
| Returns true if I is known to be uniform after vectorization. | |
| bool | isScalarAfterVectorization (Instruction *I, ElementCount VF) const |
| Returns true if I is known to be scalar after vectorization. | |
| bool | canTruncateToMinimalBitwidth (Instruction *I, ElementCount VF) const |
| void | setWideningDecision (Instruction *I, ElementCount VF, InstWidening W, InstructionCost Cost) |
| Save vectorization decision W and Cost taken by the cost model for instruction I and vector width VF. | |
| void | setWideningDecision (const InterleaveGroup< Instruction > *Grp, ElementCount VF, InstWidening W, InstructionCost Cost) |
| Save vectorization decision W and Cost taken by the cost model for interleaving group Grp and vector width VF. | |
| InstWidening | getWideningDecision (Instruction *I, ElementCount VF) const |
| Return the cost model decision for the given instruction I and vector width VF. | |
| InstructionCost | getWideningCost (Instruction *I, ElementCount VF) |
| Return the vectorization cost for the given instruction I and vector width VF. | |
| void | setCallWideningDecision (CallInst *CI, ElementCount VF, InstWidening Kind, Function *Variant, Intrinsic::ID IID, std::optional< unsigned > MaskPos, InstructionCost Cost) |
| CallWideningDecision | getCallWideningDecision (CallInst *CI, ElementCount VF) const |
| bool | isOptimizableIVTruncate (Instruction *I, ElementCount VF) |
| Returns true if instruction I is an optimizable truncate whose operand is an induction variable. | |
| void | collectInstsToScalarize (ElementCount VF) |
| Collects the instructions to scalarize for each predicated instruction in the loop. | |
| void | collectNonVectorizedAndSetWideningDecisions (ElementCount VF) |
| Collect values that will not be widened, including Uniforms, Scalars, and Instructions to Scalarize for the given VF. | |
| bool | isLegalMaskedStore (Type *DataType, Value *Ptr, Align Alignment, unsigned AddressSpace) const |
| Returns true if the target machine supports masked store operation for the given DataType and kind of access to Ptr. | |
| bool | isLegalMaskedLoad (Type *DataType, Value *Ptr, Align Alignment, unsigned AddressSpace) const |
| Returns true if the target machine supports masked load operation for the given DataType and kind of access to Ptr. | |
| bool | isLegalGatherOrScatter (Value *V, ElementCount VF) |
| Returns true if the target machine can represent V as a masked gather or scatter operation. | |
| bool | canVectorizeReductions (ElementCount VF) const |
| Returns true if the target machine supports all of the reduction variables found for the given VF. | |
| bool | isDivRemScalarWithPredication (InstructionCost ScalarCost, InstructionCost SafeDivisorCost) const |
| Given costs for both strategies, return true if the scalar predication lowering should be used for div/rem. | |
| bool | isScalarWithPredication (Instruction *I, ElementCount VF) |
| Returns true if I is an instruction which requires predication and for which our chosen predication strategy is scalarization (i.e. we don't have an alternate strategy such as masking available). | |
| bool | isPredicatedInst (Instruction *I) const |
| Returns true if I is an instruction that needs to be predicated at runtime. | |
| unsigned | getPredBlockCostDivisor (TargetTransformInfo::TargetCostKind CostKind, const BasicBlock *BB) |
| A helper function that returns how much we should divide the cost of a predicated block by. | |
| std::pair< InstructionCost, InstructionCost > | getDivRemSpeculationCost (Instruction *I, ElementCount VF) |
| Return the costs for our two available strategies for lowering a div/rem operation which requires speculating at least one lane. | |
| bool | memoryInstructionCanBeWidened (Instruction *I, ElementCount VF) |
| Returns true if I is a memory instruction with consecutive memory access that can be widened. | |
| bool | interleavedAccessCanBeWidened (Instruction *I, ElementCount VF) const |
| Returns true if I is a memory instruction in an interleaved-group of memory accesses that can be vectorized with wide vector loads/stores and shuffles. | |
| bool | isAccessInterleaved (Instruction *Instr) const |
| Check if Instr belongs to any interleaved access group. | |
| const InterleaveGroup< Instruction > * | getInterleavedAccessGroup (Instruction *Instr) const |
| Get the interleaved access group that Instr belongs to. | |
| bool | requiresScalarEpilogue (bool IsVectorizing) const |
| Returns true if we're required to use a scalar epilogue for at least the final iteration of the original loop. | |
| bool | isScalarEpilogueAllowed () const |
| Returns true if a scalar epilogue is allowed, i.e. it is not ruled out by optsize or a loop hint annotation. | |
| bool | preferPredicatedLoop () const |
| Returns true if tail-folding is preferred over a scalar epilogue. | |
| TailFoldingStyle | getTailFoldingStyle (bool IVUpdateMayOverflow=true) const |
| Returns the TailFoldingStyle that is best for the current loop. | |
| void | setTailFoldingStyles (bool IsScalableVF, unsigned UserIC) |
| Selects and saves TailFoldingStyle for 2 options - if IV update may overflow or not. | |
| bool | foldTailByMasking () const |
| Returns true if all loop blocks should be masked to fold tail loop. | |
| bool | useWideActiveLaneMask () const |
| Returns true if the use of wide lane masks is requested and the loop is using tail-folding with a lane mask for control flow. | |
| std::optional< unsigned > | getMaxSafeElements () const |
| Return maximum safe number of elements to be processed per vector iteration, which do not prevent store-load forwarding and are safe with regard to the memory dependencies. | |
| bool | blockNeedsPredicationForAnyReason (BasicBlock *BB) const |
| Returns true if the instructions in this block require predication for any reason. | |
| bool | foldTailWithEVL () const |
| Returns true if VP intrinsics with explicit vector length support should be generated in the tail folded loop. | |
| bool | isInLoopReduction (PHINode *Phi) const |
| Returns true if the Phi is part of an in-loop reduction. | |
| bool | usePredicatedReductionSelect () const |
| Returns true if the predicated reduction select should be used to set the incoming value for the reduction phi. | |
| InstructionCost | getVectorIntrinsicCost (CallInst *CI, ElementCount VF) const |
| Estimate cost of an intrinsic call instruction CI if it were vectorized with factor VF. | |
| InstructionCost | getVectorCallCost (CallInst *CI, ElementCount VF) const |
| Estimate cost of a call instruction CI if it were vectorized with factor VF. | |
| void | invalidateCostModelingDecisions () |
| Invalidates decisions already taken by the cost model. | |
| InstructionCost | expectedCost (ElementCount VF) |
| Returns the expected execution cost. | |
| bool | hasPredStores () const |
| bool | isEpilogueVectorizationProfitable (const ElementCount VF, const unsigned IC) const |
| Returns true if epilogue vectorization is considered profitable, and false otherwise. | |
| InstructionCost | getInstructionCost (Instruction *I, ElementCount VF) |
| Returns the execution time cost of an instruction for a given vector width. | |
| std::optional< InstructionCost > | getReductionPatternCost (Instruction *I, ElementCount VF, Type *VectorTy) const |
| Return the cost of instructions in an in-loop reduction pattern, if I is part of that pattern. | |
| bool | shouldConsiderInvariant (Value *Op) |
| Returns true if Op should be considered invariant and if it is trivially hoistable. | |
| std::optional< unsigned > | getVScaleForTuning () const |
| Return the value of vscale used for tuning the cost model. | |
| BlockFrequencyInfo & | getBFI () |
| Returns the BlockFrequencyInfo for the function if cached, otherwise fetches it via GetBFI. | |
| Public Attributes | |
|---|---|
| Loop * | TheLoop |
| The loop that we evaluate. | |
| PredicatedScalarEvolution & | PSE |
| Predicated scalar evolution analysis. | |
| LoopInfo * | LI |
| Loop Info analysis. | |
| LoopVectorizationLegality * | Legal |
| Vectorization legality. | |
| const TargetTransformInfo & | TTI |
| Vector target information. | |
| const TargetLibraryInfo * | TLI |
| Target Library Info. | |
| DemandedBits * | DB |
| Demanded bits analysis. | |
| AssumptionCache * | AC |
| Assumption cache. | |
| OptimizationRemarkEmitter * | ORE |
| Interface to emit optimization remarks. | |
| std::function< BlockFrequencyInfo &()> | GetBFI |
| A function to lazily fetch BlockFrequencyInfo. | |
| BlockFrequencyInfo * | BFI = nullptr |
| The BlockFrequencyInfo returned from GetBFI. | |
| const Function * | TheFunction |
| const LoopVectorizeHints * | Hints |
| Loop Vectorize Hint. | |
| InterleavedAccessInfo & | InterleaveInfo |
| The interleave access information contains groups of interleaved accesses with the same stride and close to each other. | |
| SmallPtrSet< const Value *, 16 > | ValuesToIgnore |
| Values to ignore in the cost model. | |
| SmallPtrSet< const Value *, 16 > | VecValuesToIgnore |
| Values to ignore in the cost model when VF > 1. | |
| SmallPtrSet< Type *, 16 > | ElementTypesInLoop |
| All element types found in the loop. | |
| TTI::TargetCostKind | CostKind |
| The kind of cost that we are calculating. | |
| bool | OptForSize |
| Whether this loop should be optimized for size based on function attribute or profile information. | |
| FixedScalableVFPair | MaxPermissibleVFWithoutMaxBW |
| The highest VF possible for this loop, without using MaxBandwidth. | |
| Friends | |
|---|---|
| class | LoopVectorizationPlanner |
LoopVectorizationCostModel - estimates the expected speedups due to vectorization.
In many cases vectorization is not profitable. This can happen because of a number of reasons. In this class we mainly attempt to predict the expected speedup/slowdowns due to the supported instruction set. We use the TargetTransformInfo to query the different backends for the cost of different operations.
Definition at line 867 of file LoopVectorize.cpp.
◆ InstWidening
Decision that was taken during cost calculation for memory instruction.
| Enumerator |
|---|
| CM_Unknown |
| CM_Widen |
| CM_Widen_Reverse |
| CM_Interleave |
| CM_GatherScatter |
| CM_Scalarize |
| CM_VectorCall |
| CM_IntrinsicCall |
Definition at line 1021 of file LoopVectorize.cpp.
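As a rough illustration of how decisions of this kind can be recorded per instruction and per vector width, the sketch below keys a small cache on an (instruction, VF) pair and stores the decision together with its cost. `DecisionCache`, `InstId`, and the plain `unsigned` VF and cost types are hypothetical stand-ins rather than LLVM's data structures, and returning CM_Unknown for a missing entry is an assumption of this sketch.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Mirrors the enumerators listed above; the numeric values are illustrative.
enum InstWidening {
  CM_Unknown, CM_Widen, CM_Widen_Reverse, CM_Interleave,
  CM_GatherScatter, CM_Scalarize, CM_VectorCall, CM_IntrinsicCall
};

using InstId = int; // hypothetical stand-in for Instruction *

class DecisionCache {
  // Key: (instruction, VF). Value: (decision, cost).
  std::map<std::pair<InstId, unsigned>, std::pair<InstWidening, unsigned>> Map;

public:
  // Record decision W and its cost for instruction I at vector width VF.
  void set(InstId I, unsigned VF, InstWidening W, unsigned Cost) {
    assert(VF > 1 && "this sketch only records decisions for vector VFs");
    Map[{I, VF}] = {W, Cost};
  }

  // Look up the decision; CM_Unknown means nothing was recorded (assumption).
  InstWidening decision(InstId I, unsigned VF) const {
    auto It = Map.find({I, VF});
    return It == Map.end() ? CM_Unknown : It->second.first;
  }

  // Look up the cost saved alongside the decision.
  unsigned cost(InstId I, unsigned VF) const {
    auto It = Map.find({I, VF});
    assert(It != Map.end() && "no decision recorded");
    return It->second.second;
  }
};
```

Keying on the pair rather than on the instruction alone lets the same instruction carry different decisions at different vector widths.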
| llvm::LoopVectorizationCostModel::LoopVectorizationCostModel ( ScalarEpilogueLowering SEL, Loop * L, PredicatedScalarEvolution & PSE, LoopInfo * LI, LoopVectorizationLegality * Legal, const TargetTransformInfo & TTI, const TargetLibraryInfo * TLI, DemandedBits * DB, AssumptionCache * AC, OptimizationRemarkEmitter * ORE, std::function< BlockFrequencyInfo &()> GetBFI, const Function * F, const LoopVectorizeHints * Hints, InterleavedAccessInfo & IAI, bool OptForSize ) | inline |
|---|
Definition at line 871 of file LoopVectorize.cpp.
References AC, CostKind, DB, F, ForceTargetSupportsScalableVectors, GetBFI, Hints, InterleaveInfo, Legal, LI, OptForSize, ORE, PSE, llvm::TargetTransformInfo::TCK_CodeSize, llvm::TargetTransformInfo::TCK_RecipThroughput, TheFunction, TheLoop, TLI, and TTI.
◆ blockNeedsPredicationForAnyReason()
| bool llvm::LoopVectorizationCostModel::blockNeedsPredicationForAnyReason ( BasicBlock * BB) const | inline |
|---|
◆ canTruncateToMinimalBitwidth()
◆ canVectorizeReductions()
| bool llvm::LoopVectorizationCostModel::canVectorizeReductions ( ElementCount VF) const | inline |
|---|
◆ collectElementTypesForWidening()
| void LoopVectorizationCostModel::collectElementTypesForWidening | ( | ) |
|---|
Collect all element types in the loop for which widening is needed.
Definition at line 4551 of file LoopVectorize.cpp.
References assert(), llvm::dyn_cast(), ElementTypesInLoop, llvm::RecurrenceDescriptor::getRecurrenceKind(), llvm::RecurrenceDescriptor::getRecurrenceType(), I, llvm::isa(), Legal, PreferInLoopReductions, T, TheLoop, TTI, useOrderedReductions(), and ValuesToIgnore.
Referenced by processLoopInVPlanNativePath().
◆ collectInLoopReductions()
| void LoopVectorizationCostModel::collectInLoopReductions | ( | ) |
|---|
Split reductions into those that happen in the loop, and those that happen outside.
In-loop reductions are collected into InLoopReductions.
Definition at line 6621 of file LoopVectorize.cpp.
References llvm::dbgs(), llvm::SmallVectorTemplateCommon< T, typename >::empty(), llvm::RecurrenceDescriptor::getRecurrenceKind(), llvm::RecurrenceDescriptor::getRecurrenceType(), llvm::RecurrenceDescriptor::getReductionOpChain(), llvm::RecurrenceDescriptor::hasUsesOutsideReductionChain(), I, llvm::RecurrenceDescriptor::isAnyOfRecurrenceKind(), llvm::RecurrenceDescriptor::isFindIVRecurrenceKind(), Legal, LLVM_DEBUG, PreferInLoopReductions, TheLoop, TTI, and useOrderedReductions().
◆ collectInstsToScalarize()
| void LoopVectorizationCostModel::collectInstsToScalarize | ( | ElementCount | VF | ) |
|---|
Collects the instructions to scalarize for each predicated instruction in the loop.
Definition at line 4946 of file LoopVectorize.cpp.
References assert(), blockNeedsPredicationForAnyReason(), CM_Scalarize, llvm::dyn_cast(), I, llvm::MapVector< KeyT, ValueT, MapType, VectorType >::insert(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), isScalarAfterVectorization(), isScalarWithPredication(), llvm::ElementCount::isVector(), llvm::predecessors(), and TheLoop.
Referenced by collectNonVectorizedAndSetWideningDecisions().
◆ collectNonVectorizedAndSetWideningDecisions()
| void llvm::LoopVectorizationCostModel::collectNonVectorizedAndSetWideningDecisions ( ElementCount VF) | inline |
|---|
◆ collectValuesToIgnore()
| void LoopVectorizationCostModel::collectValuesToIgnore | ( | ) |
|---|
Collect values we want to ignore in the cost model.
Definition at line 6483 of file LoopVectorize.cpp.
References AC, llvm::all_of(), llvm::any_of(), llvm::append_range(), llvm::LoopBlocksDFS::beginRPO(), llvm::cast(), llvm::CodeMetrics::collectEphemeralValues(), llvm::dyn_cast(), llvm::dyn_cast_or_null(), llvm::LoopBlocksDFS::endRPO(), llvm::InductionDescriptor::getCastInsts(), llvm::RecurrenceDescriptor::getCastInsts(), getInterleavedAccessGroup(), llvm::getLoadStorePointerOperand(), getParent(), llvm::BasicBlock::getSingleSuccessor(), I, llvm::isa(), isAccessInterleaved(), IsEmptyBlock(), Legal, LI, llvm::make_range(), llvm::LoopBlocksDFS::perform(), llvm::BasicBlock::phis(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), requiresScalarEpilogue(), llvm::reverse(), llvm::SmallVectorTemplateCommon< T, typename >::size(), TheLoop, TLI, ValuesToIgnore, VecValuesToIgnore, and llvm::wouldInstructionBeTriviallyDead().
◆ computeMaxVF()
Returns
An upper bound for the vectorization factors (both fixed and scalable). If the factors are 0, vectorization and interleaving should be avoided up front.
Definition at line 3543 of file LoopVectorize.cpp.
References llvm::ScalarEvolution::applyLoopGuards(), assert(), llvm::CM_ScalarEpilogueAllowed, llvm::CM_ScalarEpilogueNotAllowedLowTripLoop, llvm::CM_ScalarEpilogueNotAllowedOptSize, llvm::CM_ScalarEpilogueNotAllowedUsePredicate, llvm::CM_ScalarEpilogueNotNeededUsePredicate, llvm::DataWithEVL, llvm::dbgs(), llvm::FixedScalableVFPair::FixedVF, foldTailByMasking(), llvm::ScalarEvolution::getAddExpr(), llvm::ScalarEvolution::getBackedgeTakenCount(), llvm::ScalarEvolution::getConstant(), llvm::ElementCount::getFixed(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getFixedValue(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getKnownMinValue(), getMaxVScale(), llvm::ScalarEvolution::getMinusOne(), llvm::FixedScalableVFPair::getNone(), llvm::ScalarEvolution::getOne(), llvm::ElementCount::getScalable(), llvm::Type::getScalarSizeInBits(), getSmallBestKnownTC(), getSmallConstantTripCount(), getTailFoldingStyle(), llvm::SCEV::getType(), llvm::ScalarEvolution::getURemExpr(), llvm::CmpInst::ICMP_EQ, InterleaveInfo, llvm::isa(), llvm::ScalarEvolution::isKnownPredicate(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isNonZero(), llvm::isPowerOf2_32(), llvm::ElementCount::isScalar(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isZero(), llvm::SCEV::isZero(), Legal, LLVM_DEBUG, ORE, PSE, llvm::reportVectorizationFailure(), runtimeChecksRequired(), llvm::FixedScalableVFPair::ScalableVF, setTailFoldingStyles(), TheFunction, TheLoop, TTI, and useMaskedInterleavedAccesses().
◆ expectedCost()
Returns the expected execution cost.
The unit of the cost does not matter because we use the 'cost' units to compare different vector widths. The cost that is returned is not normalized by the factor width.
Definition at line 5123 of file LoopVectorize.cpp.
References addFullyUnrolledInstructionsToIgnore(), llvm::CallingConv::C, CM_Interleave, CostKind, llvm::SmallPtrSetImpl< PtrType >::count(), llvm::dbgs(), foldTailByMasking(), llvm::ForceTargetInstructionCost, getInstructionCost(), getInterleavedAccessGroup(), getPredBlockCostDivisor(), getSmallConstantTripCount(), getWideningDecision(), I, InstructionCost, llvm::ElementCount::isScalar(), llvm::ElementCount::isVector(), Legal, LLVM_DEBUG, PSE, TheLoop, ValuesToIgnore, and VecValuesToIgnore.
Referenced by selectUserVectorizationFactor().
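Because the returned cost is not normalized by the factor width, a comparison between widths has to take VF into account itself. The following is a minimal sketch of such a comparison, not LLVM's selection logic: `Candidate` and `pickBestVF` are hypothetical names, and the costs are assumed to be plain integers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Candidate {
  unsigned VF;        // vectorization factor (1 means scalar)
  std::uint64_t Cost; // un-normalized cost of one loop iteration at this VF
};

// Returns the VF with the lowest cost per original (scalar) iteration.
// CostA / VFA < CostB / VFB is evaluated as CostA * VFB < CostB * VFA
// so the comparison stays in integer arithmetic. Ties keep the earlier entry.
unsigned pickBestVF(const std::vector<Candidate> &Cands) {
  assert(!Cands.empty() && "need at least one candidate");
  Candidate Best = Cands.front();
  for (const Candidate &C : Cands)
    if (C.Cost * Best.VF < Best.Cost * C.VF)
      Best = C;
  return Best.VF;
}
```

With candidates (VF 1, cost 10), (VF 4, cost 24), and (VF 8, cost 64), the per-iteration costs are 10, 6, and 8, so the sketch selects VF 4 even though its raw cost is higher than the scalar one.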
◆ foldTailByMasking()
| bool llvm::LoopVectorizationCostModel::foldTailByMasking ( ) const | inline |
|---|
◆ foldTailWithEVL()
| bool llvm::LoopVectorizationCostModel::foldTailWithEVL ( ) const | inline |
|---|
◆ getBFI()
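The member table describes getBFI() as returning the cached BlockFrequencyInfo, or fetching it via GetBFI otherwise. A minimal sketch of that lazy-fetch pattern follows; `BlockFreqInfo` and `LazyBFI` are hypothetical stand-ins, and the sketch only assumes that the fetch callback returns a reference to an object that outlives the cache.

```cpp
#include <cassert>
#include <functional>
#include <utility>

struct BlockFreqInfo { int Id; }; // hypothetical stand-in for BlockFrequencyInfo

class LazyBFI {
  std::function<BlockFreqInfo &()> Fetch; // plays the role of GetBFI
  BlockFreqInfo *Cached = nullptr;        // plays the role of BFI

public:
  explicit LazyBFI(std::function<BlockFreqInfo &()> F) : Fetch(std::move(F)) {}

  // Invoke the callback on first use only; later calls reuse the pointer.
  BlockFreqInfo &get() {
    if (!Cached)
      Cached = &Fetch();
    return *Cached;
  }
};
```

Deferring the fetch this way means the analysis is never computed for loops whose cost queries do not need block frequencies.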
◆ getCallWideningDecision()
◆ getDivRemSpeculationCost()
Return the costs for our two available strategies for lowering a div/rem operation which requires speculating at least one lane.
First result is for scalarization (will be invalid for scalable vectors); second is for the safe-divisor strategy.
Definition at line 2914 of file LoopVectorize.cpp.
References assert(), llvm::CmpInst::BAD_ICMP_PREDICATE, CostKind, llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getFixedValue(), llvm::Type::getInt1Ty(), llvm::InstructionCost::getInvalid(), getPredBlockCostDivisor(), getScalarizationOverhead(), I, llvm::isSafeToSpeculativelyExecute(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), llvm::toVectorTy(), and TTI.
Referenced by getInstructionCost(), and isScalarWithPredication().
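The shape of the returned pair can be sketched as below, with `std::nullopt` modelling an invalid cost. Everything here is an assumption made for illustration: the function names, the inputs (a per-lane scalar cost, a vector divide cost, and a select cost for substituting a safe divisor), and the "strictly cheaper wins" rule in `preferScalarization` are not taken from LLVM.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// std::nullopt models an invalid cost.
using Cost = std::optional<unsigned>;

// Returns {scalarization cost, safe-divisor cost}. The input costs are
// made-up numbers, not values queried from a target.
std::pair<Cost, Cost> divRemCosts(unsigned VF, bool Scalable,
                                  unsigned PerLaneCost, unsigned VectorDivCost,
                                  unsigned SelectCost) {
  // Scalarization needs a known lane count, so it is invalid when scalable.
  Cost Scalarized = Scalable ? std::nullopt : Cost(VF * PerLaneCost);
  // Safe divisor: one select to patch the divisor, then one vector div/rem.
  Cost SafeDivisor = VectorDivCost + SelectCost;
  return {Scalarized, SafeDivisor};
}

// Hypothetical chooser: scalarize only if it is valid and strictly cheaper.
bool preferScalarization(const Cost &Scalarized, const Cost &SafeDivisor) {
  if (!Scalarized)
    return false;
  if (!SafeDivisor)
    return true;
  return *Scalarized < *SafeDivisor;
}
```

For a scalable VF the first element is always invalid in this sketch, so the safe-divisor strategy is the only candidate.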
◆ getInstructionCost()
Returns the execution time cost of an instruction for a given vector width.
Vector width of one means scalar.
Definition at line 6047 of file LoopVectorize.cpp.
References llvm::all_of(), assert(), llvm::CmpInst::BAD_ICMP_PREDICATE, canTruncateToMinimalBitwidth(), llvm::cast(), llvm::cast_if_present(), CM_GatherScatter, CM_Interleave, CM_IntrinsicCall, CM_Scalarize, CM_Unknown, CM_VectorCall, CM_Widen, CM_Widen_Reverse, CostKind, llvm::dyn_cast(), llvm::find_singleton(), foldTailWithEVL(), llvm::TargetTransformInfo::GatherScatter, llvm::IntegerType::get(), llvm::VectorType::get(), llvm::APInt::getAllOnes(), llvm::Type::getContext(), getDivRemSpeculationCost(), llvm::ElementCount::getFixed(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getFixedValue(), getInstructionCost(), llvm::Type::getInt1Ty(), llvm::InstructionCost::getInvalid(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getKnownMinValue(), llvm::getLoadStoreType(), llvm::TargetTransformInfo::getOperandInfo(), llvm::ilist_detail::node_parent_access< NodeTy, ParentTy >::getParent(), llvm::LoadInst::getPointerOperandType(), getReductionPatternCost(), llvm::Type::getScalarSizeInBits(), llvm::ScalarEvolution::getSCEV(), llvm::BranchInst::getSuccessor(), llvm::Value::getType(), getVectorCallCost(), llvm::Type::getVoidTy(), getWideningCost(), getWideningDecision(), I, llvm::CmpInst::ICMP_EQ, llvm::TargetTransformInfo::Interleave, llvm::isa(), llvm::RecurrenceDescriptor::isAnyOfRecurrenceKind(), llvm::BranchInst::isConditional(), isDivRemScalarWithPredication(), isInLoopReduction(), llvm::ScalarEvolution::isLoopInvariant(), isOptimizableIVTruncate(), isPredicatedInst(), isProfitableToScalarize(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), llvm::ElementCount::isScalar(), isScalarAfterVectorization(), isUniformAfterVectorization(), llvm::ElementCount::isVector(), llvm::Type::isVectorTy(), Legal, llvm_unreachable, llvm::llvm_unreachable_internal(), llvm::HistogramInfo::Load, llvm::PatternMatch::m_LogicalAnd(), llvm::PatternMatch::m_LogicalOr(), llvm::PatternMatch::m_Value(), llvm::CmpInst::makeCmpResultType(), 
llvm::TargetTransformInfo::Masked, llvm::SCEVPatternMatch::match(), llvm::TargetTransformInfo::None, llvm::TargetTransformInfo::Normal, llvm::TargetTransformInfo::OK_AnyValue, llvm::TargetTransformInfo::OK_UniformValue, PSE, llvm::TargetTransformInfo::Reversed, shouldConsiderInvariant(), llvm::TargetTransformInfo::SK_Splice, llvm::TargetTransformInfo::TCC_Free, TheLoop, TLI, llvm::toVectorizedTy(), llvm::toVectorTy(), and TTI.
Referenced by expectedCost(), and getInstructionCost().
◆ getInterleavedAccessGroup()
◆ getMaxSafeElements()
| std::optional< unsigned > llvm::LoopVectorizationCostModel::getMaxSafeElements ( ) const | inline |
|---|
Return maximum safe number of elements to be processed per vector iteration, which do not prevent store-load forwarding and are safe with regard to the memory dependencies.
Required for EVL-based VPlans to correctly calculate AVL (application vector length) as min(remaining AVL, MaxSafeElements). TODO: need to consider adjusting cost model to use this value as a vectorization factor for EVL-based vectorization.
Definition at line 1388 of file LoopVectorize.cpp.
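The min(remaining AVL, MaxSafeElements) computation can be sketched as a simple strip-mining loop. `stepSizes` and `HWLanes` are hypothetical names; the sketch assumes the hardware then processes min(AVL, HWLanes) elements per step, which is a simplification of how an explicit vector length is actually chosen.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <vector>

// Returns how many elements each vector iteration processes when the
// requested length is capped at MaxSafeElements (if a cap is present).
std::vector<unsigned> stepSizes(unsigned TripCount, unsigned HWLanes,
                                std::optional<unsigned> MaxSafeElements) {
  assert(HWLanes > 0 && "need at least one lane");
  assert((!MaxSafeElements || *MaxSafeElements > 0) && "cap must be positive");
  std::vector<unsigned> Steps;
  for (unsigned Remaining = TripCount; Remaining > 0;) {
    // AVL = min(remaining AVL, MaxSafeElements), as described above.
    unsigned AVL = MaxSafeElements ? std::min(Remaining, *MaxSafeElements)
                                   : Remaining;
    // Hypothetical hardware grants at most HWLanes elements per step.
    unsigned EVL = std::min(AVL, HWLanes);
    Steps.push_back(EVL);
    Remaining -= EVL;
  }
  return Steps;
}
```

With 10 iterations, 8 hardware lanes, and a cap of 3, the steps are 3, 3, 3, 1; without a cap they are 8, 2.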
◆ getMinimalBitwidths()
Returns
The smallest bitwidth each instruction can be represented with. The vector equivalents of these instructions should be truncated to this type.
Definition at line 955 of file LoopVectorize.cpp.
◆ getPredBlockCostDivisor()
A helper function that returns how much we should divide the cost of a predicated block by.
Typically this is the reciprocal of the block probability: if we return X, we are assuming the predicated block will execute once for every X iterations of the loop header, so the block should only contribute 1/X of its cost to the total cost calculation. When optimizing for code size the divisor is just 1, because code size costs don't depend on execution probabilities.
Note that if a block wasn't originally predicated but was predicated due to tail folding, the divisor will still be 1 because it will execute for every iteration of the loop header.
Definition at line 2896 of file LoopVectorize.cpp.
References assert(), BBFreq, CostKind, getBFI(), Legal, llvm::TargetTransformInfo::TCK_CodeSize, and TheLoop.
Referenced by expectedCost(), and getDivRemSpeculationCost().
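A minimal sketch of the divisor described above, assuming block and header frequencies are available as plain integer counts. `predBlockCostDivisor`, `scaledBlockCost`, and the `CostKind` enum here are hypothetical simplifications, not the LLVM signatures.

```cpp
#include <cassert>
#include <cstdint>

enum class CostKind { RecipThroughput, CodeSize };

// Returns X such that the block is assumed to run once per X iterations of
// the loop header. The frequencies are hypothetical profile counts.
std::uint64_t predBlockCostDivisor(CostKind Kind, std::uint64_t HeaderFreq,
                                   std::uint64_t BlockFreq) {
  if (Kind == CostKind::CodeSize)
    return 1; // size does not depend on how often the block executes
  assert(BlockFreq > 0 && BlockFreq <= HeaderFreq && "unexpected frequencies");
  return HeaderFreq / BlockFreq;
}

// The block contributes 1/X of its cost to the loop total.
std::uint64_t scaledBlockCost(std::uint64_t BlockCost, std::uint64_t Divisor) {
  return BlockCost / Divisor;
}
```

A block that runs on every header iteration, such as one predicated only because of tail folding, has equal frequencies and therefore a divisor of 1.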
◆ getReductionPatternCost()
Return the cost of instructions in an in-loop reduction pattern, if I is part of that pattern.
Definition at line 5402 of file LoopVectorize.cpp.
References llvm::cast(), CostKind, llvm::dyn_cast(), llvm::FMulAdd, llvm::VectorType::get(), llvm::RecurrenceDescriptor::getFastMathFlags(), llvm::Type::getIntegerBitWidth(), llvm::getMinMaxReductionIntrinsicOp(), llvm::Instruction::getOpcode(), llvm::RecurrenceDescriptor::getOpcode(), llvm::User::getOperand(), llvm::RecurrenceDescriptor::getRecurrenceKind(), llvm::RecurrenceDescriptor::getRecurrenceType(), llvm::Value::getType(), llvm::Value::hasOneUser(), I, llvm::isa(), llvm::RecurrenceDescriptor::isMinMaxRecurrenceKind(), llvm::ElementCount::isScalar(), llvm::InstructionCost::isValid(), Legal, llvm::PatternMatch::m_Instruction(), llvm::PatternMatch::m_Mul(), llvm::MIPatternMatch::m_OneUse(), llvm::PatternMatch::m_Value(), llvm::PatternMatch::m_ZExtOrSExt(), llvm::PatternMatch::match(), llvm::SCEVPatternMatch::match(), llvm::TargetTransformInfo::None, TheLoop, TTI, useOrderedReductions(), and llvm::Instruction::user_back().
Referenced by getInstructionCost(), getVectorCallCost(), and setVectorizedCallDecision().
◆ getSmallestAndWidestTypes()
| std::pair< unsigned, unsigned > LoopVectorizationCostModel::getSmallestAndWidestTypes | ( | ) |
|---|
◆ getTailFoldingStyle()
| TailFoldingStyle llvm::LoopVectorizationCostModel::getTailFoldingStyle ( bool IVUpdateMayOverflow = true) const | inline |
|---|
◆ getVectorCallCost()
Estimate cost of a call instruction CI if it were vectorized with factor VF.
Return the cost of the instruction, including scalarization overhead if it's needed.
Definition at line 2524 of file LoopVectorize.cpp.
References llvm::CallBase::args(), CostKind, llvm::CallBase::getCalledFunction(), getCallWideningDecision(), getReductionPatternCost(), llvm::Value::getType(), getVectorIntrinsicCost(), llvm::getVectorIntrinsicIDForCall(), IntrinsicCost, llvm::RecurrenceDescriptor::isFMulAddIntrinsic(), llvm::ElementCount::isScalar(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), TLI, and TTI.
Referenced by getInstructionCost().
◆ getVectorIntrinsicCost()
Estimate cost of an intrinsic call instruction CI if it were vectorized with factor VF.
Return the cost of the instruction, including scalarization overhead if it's needed.
Definition at line 2558 of file LoopVectorize.cpp.
References llvm::CallBase::args(), Arguments, assert(), CostKind, llvm::dyn_cast(), llvm::CallBase::getCalledFunction(), llvm::Function::getFunctionType(), llvm::InstructionCost::getInvalid(), llvm::Value::getType(), llvm::getVectorIntrinsicIDForCall(), maybeVectorizeType(), llvm::FunctionType::param_begin(), llvm::FunctionType::param_end(), TLI, and TTI.
Referenced by getVectorCallCost(), and setVectorizedCallDecision().
◆ getVScaleForTuning()
| std::optional< unsigned > llvm::LoopVectorizationCostModel::getVScaleForTuning ( ) const | inline |
|---|
◆ getWideningCost()
◆ getWideningDecision()
◆ hasPredStores()
| bool llvm::LoopVectorizationCostModel::hasPredStores ( ) const | inline |
|---|
◆ interleavedAccessCanBeWidened()
Returns true if I is a memory instruction in an interleaved-group of memory accesses that can be vectorized with wide vector loads/stores and shuffles.
Definition at line 2971 of file LoopVectorize.cpp.
References assert(), blockNeedsPredicationForAnyReason(), CM_Unknown, DL, getInterleavedAccessGroup(), llvm::getLoadStoreAddressSpace(), llvm::getLoadStoreAlignment(), llvm::getLoadStoreType(), getWideningDecision(), hasIrregularType(), I, llvm::isa(), isAccessInterleaved(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), isScalarEpilogueAllowed(), Legal, TTI, and useMaskedInterleavedAccesses().
Referenced by setCostBasedWideningDecision().
◆ invalidateCostModelingDecisions()
| void llvm::LoopVectorizationCostModel::invalidateCostModelingDecisions ( ) | inline |
|---|
Invalidates decisions already taken by the cost model.
Definition at line 1430 of file LoopVectorize.cpp.
◆ isAccessInterleaved()
| bool llvm::LoopVectorizationCostModel::isAccessInterleaved ( Instruction * Instr) const | inline |
|---|
◆ isDivRemScalarWithPredication()
◆ isEpilogueVectorizationProfitable()
Returns true if epilogue vectorization is considered profitable, and false otherwise.
VF is the vectorization factor chosen for the original loop. Multiplier is an additional scaling factor applied to VF before comparing to EpilogueVectorizationMinVF.
Definition at line 4362 of file LoopVectorize.cpp.
References EpilogueVectorizationMinVF, estimateElementCount(), and TTI.
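The comparison described above can be sketched as follows. This is a simplification under stated assumptions: `epilogueLooksProfitable` is a hypothetical name, the threshold is passed in as `MinVF` rather than read from EpilogueVectorizationMinVF, and the target query and element-count estimation that the References list hints at (TTI, estimateElementCount()) are omitted.

```cpp
#include <cassert>

// MinVF stands in for the EpilogueVectorizationMinVF threshold; the caller
// supplies its value here.
bool epilogueLooksProfitable(unsigned VF, unsigned Multiplier, unsigned MinVF) {
  assert(VF > 0 && Multiplier > 0 && "factors must be positive");
  // Scale VF by the multiplier, then compare against the threshold.
  return VF * Multiplier >= MinVF;
}
```

For example, with a threshold of 16, VF 4 scaled by 2 falls short, while VF 8 scaled by 2 reaches it.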
◆ isInLoopReduction()
| bool llvm::LoopVectorizationCostModel::isInLoopReduction ( PHINode * Phi) const | inline |
|---|
◆ isLegalGatherOrScatter()
◆ isLegalMaskedLoad()
| bool llvm::LoopVectorizationCostModel::isLegalMaskedLoad ( Type * DataType, Value * Ptr, Align Alignment, unsigned AddressSpace ) const | inline |
|---|
◆ isLegalMaskedStore()
| bool llvm::LoopVectorizationCostModel::isLegalMaskedStore ( Type * DataType, Value * Ptr, Align Alignment, unsigned AddressSpace ) const | inline |
|---|
◆ isOptimizableIVTruncate()
◆ isPredicatedInst()
Returns true if I is an instruction that needs to be predicated at runtime.
The result is independent of the predication mechanism. Superset of instructions that return true for isScalarWithPredication.
Definition at line 2844 of file LoopVectorize.cpp.
References assert(), llvm::cast(), foldTailByMasking(), llvm::getLoadStorePointerOperand(), I, llvm::isa(), llvm::isSafeToSpeculativelyExecute(), Legal, llvm_unreachable, and TheLoop.
Referenced by getInstructionCost(), isScalarWithPredication(), setCostBasedWideningDecision(), and shouldConsiderInvariant().
◆ isProfitableToScalarize()
◆ isScalarAfterVectorization()
◆ isScalarEpilogueAllowed()
| bool llvm::LoopVectorizationCostModel::isScalarEpilogueAllowed ( ) const | inline |
|---|
◆ isScalarWithPredication()
Returns true if I is an instruction which requires predication and for which our chosen predication strategy is scalarization (i.e. we don't have an alternate strategy such as masking available). VF is the vectorization factor that will be used to vectorize I.
Definition at line 2802 of file LoopVectorize.cpp.
References llvm::cast(), CM_Scalarize, llvm::VectorType::get(), getCallWideningDecision(), getDivRemSpeculationCost(), llvm::getLoadStoreAddressSpace(), llvm::getLoadStoreAlignment(), llvm::getLoadStorePointerOperand(), llvm::getLoadStoreType(), I, llvm::isa(), isDivRemScalarWithPredication(), isLegalMaskedLoad(), isLegalMaskedStore(), isPredicatedInst(), llvm::ElementCount::isScalar(), llvm::ElementCount::isVector(), and TTI.
Referenced by collectInstsToScalarize(), memoryInstructionCanBeWidened(), and setCostBasedWideningDecision().
◆ isUniformAfterVectorization()
◆ memoryInstructionCanBeWidened()
◆ preferPredicatedLoop()
| bool llvm::LoopVectorizationCostModel::preferPredicatedLoop ( ) const | inline |
|---|
◆ requiresScalarEpilogue()
| bool llvm::LoopVectorizationCostModel::requiresScalarEpilogue ( bool IsVectorizing) const | inline |
|---|
◆ runtimeChecksRequired()
| bool LoopVectorizationCostModel::runtimeChecksRequired ( ) |
|---|
◆ selectUserVectorizationFactor()
| bool llvm::LoopVectorizationCostModel::selectUserVectorizationFactor ( ElementCount UserVF) | inline |
|---|
◆ setCallWideningDecision()
◆ setCostBasedWideningDecision()
| void LoopVectorizationCostModel::setCostBasedWideningDecision ( ElementCount VF ) |
|---|
A memory access instruction may be vectorized in more than one way.
The form of the instruction after vectorization depends on cost. This function makes cost-based decisions for load/store instructions and collects them in a map. The decision map is used to build the lists of loop-uniform and loop-scalar instructions. The calculated cost is saved along with the widening decision to avoid redundant calculations.
Definition at line 5650 of file LoopVectorize.cpp.
References llvm::append_range(), assert(), llvm::cast(), CM_GatherScatter, CM_Interleave, CM_Scalarize, CM_Unknown, CM_Widen, CM_Widen_Reverse, llvm::SmallPtrSetImpl< PtrType >::contains(), llvm::dyn_cast(), llvm::dyn_cast_or_null(), llvm::SmallVectorTemplateCommon< T, typename >::empty(), foldTailByMasking(), llvm::ElementCount::getFixed(), getInterleavedAccessGroup(), llvm::InstructionCost::getInvalid(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getKnownMinValue(), llvm::getLoadStorePointerOperand(), llvm::getLoadStoreType(), getWideningDecision(), I, llvm::SmallPtrSetImpl< PtrType >::insert(), interleavedAccessCanBeWidened(), llvm::isa(), isAccessInterleaved(), isLegalGatherOrScatter(), isPredicatedInst(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), llvm::ElementCount::isScalar(), isScalarWithPredication(), Legal, LI, memoryInstructionCanBeWidened(), llvm::SmallVectorImpl< T >::pop_back_val(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), setWideningDecision(), TheLoop, and TTI.
Referenced by collectNonVectorizedAndSetWideningDecisions().
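The idea of picking a widening strategy by cost and caching it together with its cost can be sketched in plain C++. This is a minimal illustration under stated assumptions: `MemWidening`, `CandidateCosts` and `ToyDecisionCache` are hypothetical, instructions are identified by an integer id, and the "cheapest legal candidate wins, scalarization is the fallback" rule is a simplification that ignores the tie-breaking and legality checks of the real function.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <utility>

// Hypothetical enum loosely modelled on the InstWidening decisions.
enum class MemWidening { Unknown, Widen, Interleave, GatherScatter, Scalarize };

// Cost of each candidate strategy; std::nullopt means "not legal here".
struct CandidateCosts {
  std::optional<unsigned> Widen;
  std::optional<unsigned> Interleave;
  std::optional<unsigned> GatherScatter;
  unsigned Scalarize; // assumed to be always available as a fallback
};

class ToyDecisionCache {
  using Key = std::pair<int, unsigned>; // (instruction id, VF)
  std::map<Key, std::pair<MemWidening, unsigned>> Decisions;

public:
  // Pick the cheapest legal strategy and remember it with its cost.
  void decide(int InstId, unsigned VF, const CandidateCosts &C) {
    assert(VF > 1 && "the toy cache only records vector VFs");
    MemWidening Best = MemWidening::Scalarize;
    unsigned BestCost = C.Scalarize;
    auto Consider = [&](MemWidening W, const std::optional<unsigned> &Cost) {
      if (Cost && *Cost < BestCost) {
        Best = W;
        BestCost = *Cost;
      }
    };
    Consider(MemWidening::Widen, C.Widen);
    Consider(MemWidening::Interleave, C.Interleave);
    Consider(MemWidening::GatherScatter, C.GatherScatter);
    Decisions[Key{InstId, VF}] = std::make_pair(Best, BestCost);
  }

  // Returns Unknown when no decision was recorded for this (id, VF) pair.
  MemWidening decision(int InstId, unsigned VF) const {
    auto It = Decisions.find(Key{InstId, VF});
    return It == Decisions.end() ? MemWidening::Unknown : It->second.first;
  }

  // Returns the cached cost, so later queries need not recompute it.
  unsigned cost(int InstId, unsigned VF) const {
    auto It = Decisions.find(Key{InstId, VF});
    assert(It != Decisions.end() && "no decision recorded");
    return It->second.second;
  }
};
```

Keying the cache on both the instruction and the VF matters because the same load or store can receive different decisions at different vectorization factors.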
◆ setTailFoldingStyles()
| void llvm::LoopVectorizationCostModel::setTailFoldingStyles ( bool IsScalableVF, unsigned UserIC ) | inline |
|---|
Selects and saves a TailFoldingStyle for each of two cases: the IV update may overflow, or it may not.
Parameters
| Parameter | Description |
|---|---|
| IsScalableVF | True if scalable vectorization factors are enabled. |
| UserIC | User-specified interleave count. |
Definition at line 1322 of file LoopVectorize.cpp.
References assert(), llvm::CM_ScalarEpilogueAllowed, llvm::CM_ScalarEpilogueNotNeededUsePredicate, llvm::DataWithEVL, llvm::DataWithoutLaneMask, llvm::dbgs(), llvm::EnableVPlanNativePath, ForceTailFoldingStyle, Legal, LLVM_DEBUG, llvm::None, and TTI.
Referenced by computeMaxVF().
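Saving one style per overflow case can be sketched as a pair that is filled once and queried later. The sketch below is standalone and hypothetical: `ToyStyle` borrows three enumerator names from the references above, while `ToyTailFolding`, `setStyles` and `style` are invented for illustration, and the real selection logic (target queries, forced styles, scalable VFs) is deliberately left out.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical enum; the enumerator names mirror ones listed in the references.
enum class ToyStyle { None, DataWithoutLaneMask, DataWithEVL };

class ToyTailFolding {
  // first:  style to use when the IV update may overflow
  // second: style to use when the IV update cannot overflow
  std::optional<std::pair<ToyStyle, ToyStyle>> Chosen;

public:
  // Select and save both styles in one step.
  void setStyles(bool CanFoldTail, ToyStyle IfOverflow, ToyStyle IfNoOverflow) {
    assert(!Chosen && "styles are selected only once in this sketch");
    if (!CanFoldTail) {
      Chosen.emplace(ToyStyle::None, ToyStyle::None);
      return;
    }
    Chosen.emplace(IfOverflow, IfNoOverflow);
  }

  // Query the saved style for the case the caller is in.
  ToyStyle style(bool IVUpdateMayOverflow) const {
    if (!Chosen)
      return ToyStyle::None; // nothing selected yet
    return IVUpdateMayOverflow ? Chosen->first : Chosen->second;
  }
};
```

Storing both answers up front lets later queries stay cheap and consistent, whichever overflow case they ask about.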
◆ setVectorizedCallDecision()
| void LoopVectorizationCostModel::setVectorizedCallDecision ( ElementCount VF ) |
|---|
A call may be vectorized in different ways depending on whether we have vectorized variants available and whether the target supports masking.
This function analyzes all calls in the function at the supplied VF, makes a decision based on the costs of available options, and stores that decision in a map for use in planning and plan execution.
Definition at line 5876 of file LoopVectorize.cpp.
References llvm::CallBase::args(), assert(), CM_IntrinsicCall, CM_Scalarize, CM_VectorCall, CostKind, llvm::dyn_cast(), llvm::CallBase::getArgOperand(), llvm::CallBase::getCalledFunction(), llvm::Module::getFunction(), llvm::InstructionCost::getInvalid(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getKnownMinValue(), llvm::VFDatabase::getMappings(), llvm::Instruction::getModule(), llvm::VFInfo::getParamIndexForOptionalMask(), getReductionPatternCost(), getScalarizationOverhead(), llvm::ScalarEvolution::getSCEV(), llvm::Value::getType(), getVectorIntrinsicCost(), llvm::getVectorIntrinsicIDForCall(), llvm::GlobalPredicate, I, IntrinsicCost, llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isFixed(), llvm::RecurrenceDescriptor::isFMulAddIntrinsic(), llvm::CallBase::isNoBuiltin(), llvm::ElementCount::isScalar(), isUniformAfterVectorization(), isValid(), llvm::ElementCount::isVector(), Legal, llvm::SCEVPatternMatch::m_SCEV(), llvm::SCEVPatternMatch::m_scev_AffineAddRec(), llvm::SCEVPatternMatch::m_scev_SpecificSInt(), llvm::SCEVPatternMatch::m_SpecificLoop(), llvm::SCEVPatternMatch::match(), llvm::Intrinsic::not_intrinsic, llvm::OMP_Linear, llvm::OMP_Uniform, PSE, llvm::SmallVectorTemplateBase< T, bool >::push_back(), setCallWideningDecision(), TheLoop, TLI, llvm::toVectorizedTy(), TTI, and llvm::Vector.
Referenced by collectNonVectorizedAndSetWideningDecisions().
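Choosing between scalarizing a call, calling a vector variant, and using a vector intrinsic can be sketched as a cost comparison. Everything in the block is hypothetical (`CallDecision`, `ToyVariant`, `CallChoice`, `chooseCallStrategy`). It assumes that an unmasked variant is unusable when a mask is required, that a masked variant is usable either way, and that an intrinsic wins ties. These are simplifications for illustration, not a statement of the exact rules in LoopVectorize.cpp.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical decision kinds, loosely modelled on CM_Scalarize,
// CM_VectorCall and CM_IntrinsicCall.
enum class CallDecision { Scalarize, VectorCall, IntrinsicCall };

// A hypothetical vector variant of the called function.
struct ToyVariant {
  unsigned VF;   // vectorization factor the variant was built for
  bool IsMasked; // variant takes a mask parameter
  unsigned Cost; // cost of calling this variant
};

struct CallChoice {
  CallDecision Kind;
  unsigned Cost;
};

CallChoice chooseCallStrategy(unsigned VF, bool MaskRequired,
                              unsigned ScalarCost,
                              std::optional<unsigned> IntrinsicCost,
                              const std::vector<ToyVariant> &Variants) {
  assert(VF > 1 && "the sketch only handles vector VFs");
  // Scalarization is the fallback that is always available.
  CallChoice Best{CallDecision::Scalarize, ScalarCost};
  for (const ToyVariant &V : Variants) {
    if (V.VF != VF)
      continue; // variant was built for a different VF
    if (MaskRequired && !V.IsMasked)
      continue; // assumption: unmasked variants cannot be predicated
    if (V.Cost < Best.Cost)
      Best = {CallDecision::VectorCall, V.Cost};
  }
  // Assumption: an intrinsic is preferred when it is no more expensive.
  if (IntrinsicCost && *IntrinsicCost <= Best.Cost)
    Best = {CallDecision::IntrinsicCall, *IntrinsicCost};
  return Best;
}
```

The returned `CallChoice` carries both the decision and its cost, which is the pair a caller would store per call and VF for later planning.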
◆ setWideningDecision() [1/2]
Save vectorization decision W and Cost taken by the cost model for interleaving group Grp and vector width VF.
Broadcast this decision to all instructions inside the group. When interleaving, the cost is assigned to only one instruction, the insert position. In other cases, the appropriate fraction of the total cost is added to each instruction. This ensures accurate costs are used even if the insert-position instruction is not used.
Definition at line 1042 of file LoopVectorize.cpp.
References assert(), CM_Interleave, llvm::InterleaveGroup< InstTy >::getFactor(), llvm::InterleaveGroup< InstTy >::getInsertPos(), llvm::InterleaveGroup< InstTy >::getMember(), llvm::InterleaveGroup< InstTy >::getNumMembers(), I, and llvm::ElementCount::isVector().
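The cost distribution described above can be sketched without LLVM types. In this hypothetical model, `GroupWidening`, `ToyGroup`, `ToyDecisionMap` and `recordGroupDecision` are invented names, instructions are integer ids, and gaps in the group are `std::nullopt`. The even split uses integer division, which is an assumption of the sketch.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical decision kinds for the members of an interleave group.
enum class GroupWidening { Widen, Interleave, GatherScatter, Scalarize };

struct ToyGroup {
  std::vector<std::optional<int>> Members; // size is the factor; nullopt is a gap
  int InsertPos;                           // id of the insert-position member
};

using ToyDecisionMap = std::map<int, std::pair<GroupWidening, unsigned>>;

// Broadcast decision W to every member of the group and distribute Cost.
void recordGroupDecision(ToyDecisionMap &Map, const ToyGroup &Grp,
                         GroupWidening W, unsigned Cost) {
  unsigned NumMembers = 0;
  for (const std::optional<int> &M : Grp.Members)
    if (M)
      ++NumMembers;
  assert(NumMembers > 0 && "a group needs at least one member");

  // Interleaving: the whole cost sits on the insert position, others get 0.
  unsigned InsertPosCost = Cost;
  unsigned OtherCost = 0;
  // Any other decision: every member carries an equal share of the cost.
  if (W != GroupWidening::Interleave)
    InsertPosCost = OtherCost = Cost / NumMembers;

  for (const std::optional<int> &M : Grp.Members) {
    if (!M)
      continue; // skip gaps in the group
    unsigned MemberCost = (*M == Grp.InsertPos) ? InsertPosCost : OtherCost;
    Map[*M] = std::make_pair(W, MemberCost);
  }
}
```

For a group with three members and total cost 12, interleaving records 12 on the insert position and 0 elsewhere, while any other decision records 4 on each member.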
◆ setWideningDecision() [2/2]
◆ shouldConsiderInvariant()
| bool LoopVectorizationCostModel::shouldConsiderInvariant ( Value * Op ) |
|---|
◆ shouldConsiderRegPressureForVF()
| bool LoopVectorizationCostModel::shouldConsiderRegPressureForVF ( ElementCount VF ) |
|---|
◆ useMaxBandwidth()
◆ useOrderedReductions()
◆ usePredicatedReductionSelect()
| bool llvm::LoopVectorizationCostModel::usePredicatedReductionSelect ( ) const | inline |
|---|
◆ useWideActiveLaneMask()
| bool llvm::LoopVectorizationCostModel::useWideActiveLaneMask ( ) const | inline |
|---|
◆ LoopVectorizationPlanner
◆ AC
◆ BFI
◆ CostKind
The kind of cost that we are calculating.
Definition at line 1758 of file LoopVectorize.cpp.
Referenced by expectedCost(), getDivRemSpeculationCost(), getInstructionCost(), getPredBlockCostDivisor(), getReductionPatternCost(), getVectorCallCost(), getVectorIntrinsicCost(), LoopVectorizationCostModel(), llvm::LoopVectorizePass::processLoop(), processLoopInVPlanNativePath(), and setVectorizedCallDecision().
◆ DB
◆ ElementTypesInLoop
SmallPtrSet<Type *, 16> llvm::LoopVectorizationCostModel::ElementTypesInLoop
◆ GetBFI
◆ Hints
◆ InterleaveInfo
◆ Legal
Vectorization legality.
Definition at line 1708 of file LoopVectorize.cpp.
Referenced by blockNeedsPredicationForAnyReason(), canVectorizeReductions(), collectElementTypesForWidening(), collectInLoopReductions(), collectValuesToIgnore(), computeMaxVF(), expectedCost(), getInstructionCost(), getPredBlockCostDivisor(), getReductionPatternCost(), getSmallestAndWidestTypes(), interleavedAccessCanBeWidened(), isLegalMaskedLoad(), isLegalMaskedStore(), isOptimizableIVTruncate(), isPredicatedInst(), LoopVectorizationCostModel(), memoryInstructionCanBeWidened(), requiresScalarEpilogue(), runtimeChecksRequired(), setCostBasedWideningDecision(), setTailFoldingStyles(), setVectorizedCallDecision(), shouldConsiderInvariant(), and useMaxBandwidth().
◆ LI
LoopInfo* llvm::LoopVectorizationCostModel::LI
◆ MaxPermissibleVFWithoutMaxBW
◆ OptForSize
bool llvm::LoopVectorizationCostModel::OptForSize
◆ ORE
◆ PSE
◆ TheFunction
◆ TheLoop
Loop* llvm::LoopVectorizationCostModel::TheLoop
The loop that we evaluate.
Definition at line 1699 of file LoopVectorize.cpp.
Referenced by collectElementTypesForWidening(), collectInLoopReductions(), collectInstsToScalarize(), collectValuesToIgnore(), computeMaxVF(), expectedCost(), getInstructionCost(), getPredBlockCostDivisor(), getReductionPatternCost(), getWideningDecision(), isPredicatedInst(), isProfitableToScalarize(), isScalarAfterVectorization(), isUniformAfterVectorization(), LoopVectorizationCostModel(), requiresScalarEpilogue(), runtimeChecksRequired(), setCostBasedWideningDecision(), setVectorizedCallDecision(), and shouldConsiderInvariant().
◆ TLI
◆ TTI
Vector target information.
Definition at line 1711 of file LoopVectorize.cpp.
Referenced by collectElementTypesForWidening(), collectInLoopReductions(), computeMaxVF(), getDivRemSpeculationCost(), getInstructionCost(), getReductionPatternCost(), getVectorCallCost(), getVectorIntrinsicCost(), interleavedAccessCanBeWidened(), isEpilogueVectorizationProfitable(), isLegalGatherOrScatter(), isLegalMaskedLoad(), isLegalMaskedStore(), isOptimizableIVTruncate(), isScalarWithPredication(), LoopVectorizationCostModel(), llvm::LoopVectorizePass::processLoop(), setCostBasedWideningDecision(), setTailFoldingStyles(), setVectorizedCallDecision(), shouldConsiderRegPressureForVF(), useMaxBandwidth(), and usePredicatedReductionSelect().
◆ ValuesToIgnore
◆ VecValuesToIgnore
The documentation for this class was generated from the following file:
- lib/Transforms/Vectorize/LoopVectorize.cpp