LLVM: slpvectorizer::BoUpSLP Class Reference (original) (raw)

Bottom Up SLP Vectorizer. More...

Classes
struct EdgeInfo
This structure holds any data we need about the edges being traversed during buildTreeRec(). More...
class LookAheadHeuristics
A helper class used for scoring candidates for two consecutive lanes. More...
class ShuffleCostEstimator
Merges shuffle masks and emits final shuffle instruction, if required. More...
class ShuffleInstructionBuilder
Merges shuffle masks and emits final shuffle instruction, if required. More...
class VLOperands
A helper data structure to hold the operands of a vector of instructions. More...
Public Types
enum class LoadsState { Gather, Vectorize, ScatterVectorize, StridedVectorize, CompressVectorize }
Tracks the state we can represent the loads in the given sequence. More...
using ValueList = SmallVector<Value *, 8>
using InstrList = SmallVector<Instruction *, 16>
using ValueSet = SmallPtrSet<Value *, 16>
using StoreList = SmallVector<StoreInst *, 8>
using ExtraValueToDebugLocsMap = SmallDenseSet<Value *, 4>
using OrdersType = SmallVector<unsigned, 4>
Public Member Functions
BoUpSLP (Function *Func, ScalarEvolution *Se, TargetTransformInfo *Tti, TargetLibraryInfo *TLi, AAResults *Aa, LoopInfo *Li, DominatorTree *Dt, AssumptionCache *AC, DemandedBits *DB, const DataLayout *DL, OptimizationRemarkEmitter *ORE)
Value * vectorizeTree ()
Vectorize the tree that starts with the elements in VL.
Value * vectorizeTree (const ExtraValueToDebugLocsMap &ExternallyUsedValues, Instruction *ReductionRoot=nullptr, ArrayRef< std::tuple< Value *, unsigned, bool > > VectorValuesAndScales={})
Vectorize the tree but with the list of externally used values ExternallyUsedValues.
InstructionCost getSpillCost ()
InstructionCost getTreeCost (ArrayRef< Value * > VectorizedVals={}, InstructionCost ReductionCost=TTI::TCC_Free)
void buildTree (ArrayRef< Value * > Roots, const SmallDenseSet< Value * > &UserIgnoreLst)
Construct a vectorizable tree that starts at Roots, ignoring users for the purpose of scheduling and extraction in the UserIgnoreLst.
void buildTree (ArrayRef< Value * > Roots)
Construct a vectorizable tree that starts at Roots.
ArrayRef< Value * > getRootNodeScalars () const
Return the scalars of the root node.
std::optional< std::pair< Type *, bool > > getRootNodeTypeWithNoCast () const
Returns the type/is-signed info for the root node in the graph without casting.
bool isSignedMinBitwidthRootNode () const
Checks if the root graph node can be emitted with narrower bitwidth at codegen and returns it signedness, if so.
FixedVectorType * getReductionType () const
Returns reduction type after minbitdth analysis.
void buildExternalUses (const ExtraValueToDebugLocsMap &ExternallyUsedValues={})
Builds external uses of the vectorized scalars, i.e.
void transformNodes ()
Transforms graph nodes to target specific representations, if profitable.
void deleteTree ()
Clear the internal data structures that are created by 'buildTree'.
unsigned getTreeSize () const
unsigned getCanonicalGraphSize () const
Returns the base graph size, before any transformations.
void optimizeGatherSequence ()
Perform LICM and CSE on the newly generated gather sequences.
std::optional< OrdersType > findReusedOrderedScalars (const TreeEntry &TE, bool TopToBottom, bool IgnoreReorder)
Checks if the specified gather tree entry TE can be represented as a shuffled vector entry + (possibly) permutation with other gathers.
std::optional< OrdersType > findPartiallyOrderedLoads (const TreeEntry &TE)
Sort loads into increasing pointers offsets to allow greater clustering.
std::optional< OrdersType > getReorderingData (const TreeEntry &TE, bool TopToBottom, bool IgnoreReorder)
Gets reordering data for the given tree entry.
bool isProfitableToReorder () const
Checks if it is profitable to reorder the current tree.
void reorderTopToBottom ()
Reorders the current graph to the most profitable order starting from the root node to the leaf nodes.
void reorderBottomToTop (bool IgnoreReorder=false)
Reorders the current graph to the most profitable order starting from leaves to the root.
unsigned getVectorElementSize (Value *V)
void computeMinimumValueSizes ()
Compute the minimum type sizes required to represent the entries in a vectorizable tree.
unsigned getMaxVecRegSize () const
unsigned getMinVecRegSize () const
unsigned getMinVF (unsigned Sz) const
unsigned getMaximumVF (unsigned ElemWidth, unsigned Opcode) const
unsigned canMapToVector (Type *T) const
Check if homogeneous aggregate is isomorphic to some VectorType.
bool isTreeTinyAndNotFullyVectorizable (bool ForReduction=false) const
bool isTreeNotExtendable () const
Checks if the graph and all its subgraphs cannot be better vectorized.
bool isLoadCombineReductionCandidate (RecurKind RdxKind) const
Assume that a legal-sized 'or'-reduction of shifted/zexted loaded values can be load combined in the backend.
bool isLoadCombineCandidate (ArrayRef< Value * > Stores) const
Assume that a vector of stores of bitwise-or/shifted/zexted loaded values can be load combined in the backend.
bool isStridedLoad (ArrayRef< Value * > PointerOps, Type *ScalarTy, Align Alignment, const int64_t Diff, const size_t Sz) const
Checks if strided loads can be generated out of VL loads with pointers PointerOps:
bool analyzeConstantStrideCandidate (const ArrayRef< Value * > PointerOps, Type *ElemTy, Align Alignment, const SmallVectorImpl< unsigned > &SortedIndices, const int64_t Diff, Value *Ptr0, Value *PtrN, StridedPtrInfo &SPtrInfo) const
Return true if an array of scalar loads can be replaced with a strided load (with constant stride).
bool analyzeRtStrideCandidate (ArrayRef< Value * > PointerOps, Type *ScalarTy, Align CommonAlignment, SmallVectorImpl< unsigned > &SortedIndices, StridedPtrInfo &SPtrInfo) const
Return true if an array of scalar loads can be replaced with a strided load (with run-time stride).
LoadsState canVectorizeLoads (ArrayRef< Value * > VL, const Value *VL0, SmallVectorImpl< unsigned > &Order, SmallVectorImpl< Value * > &PointerOps, StridedPtrInfo &SPtrInfo, unsigned *BestVF=nullptr, bool TryRecursiveCheck=true) const
Checks if the given array of loads can be represented as a vectorized, scatter or just simple gather.
template<typename T>
void registerNonVectorizableLoads (ArrayRef< T * > VL)
Registers non-vectorizable sequence of loads.
template<typename T>
bool areKnownNonVectorizableLoads (ArrayRef< T * > VL) const
Checks if the given loads sequence is known as not vectorizable.
OptimizationRemarkEmitter * getORE ()
std::optional< int > findBestRootPair (ArrayRef< std::pair< Value *, Value * > > Candidates, int Limit=LookAheadHeuristics::ScoreFail) const
Evaluate each pair in Candidates and return index into Candidates for a pair which have highest score deemed to have best chance to form root of profitable tree to vectorize.
bool isDeleted (Instruction *I) const
Checks if the instruction is marked for deletion.
void eraseInstruction (Instruction *I)
Removes an instruction from its block and eventually deletes it.
template<typename T>
void removeInstructionsAndOperands (ArrayRef< T * > DeadVals, ArrayRef< std::tuple< Value *, unsigned, bool > > VectorValuesAndScales)
Remove instructions from the parent function and clear the operands of DeadVals instructions, marking for deletion trivially dead operands.
bool isAnalyzedReductionRoot (Instruction *I) const
Checks if the instruction was already analyzed for being possible reduction root.
void analyzedReductionRoot (Instruction *I)
Register given instruction as already analyzed for being possible reduction root.
bool areAnalyzedReductionVals (ArrayRef< Value * > VL) const
Checks if the provided list of reduced values was checked already for vectorization.
void analyzedReductionVals (ArrayRef< Value * > VL)
Adds the list of reduced values to list of already checked values for the vectorization.
void clearReductionData ()
Clear the list of the analyzed reduction root instructions.
bool isAnyGathered (const SmallDenseSet< Value * > &Vals) const
Checks if the given value is gathered in one of the nodes.
bool isGathered (const Value *V) const
Checks if the given value is gathered in one of the nodes.
bool isNotScheduled (const Value *V) const
Checks if the specified value was not schedule.
bool isVectorized (const Value *V) const
Check if the value is vectorized in the tree.
~BoUpSLP ()
template<typename BVTy, typename ResTy, typename... Args>
ResTy processBuildVector (const TreeEntry *E, Type *ScalarTy, Args &...Params)
Static Public Member Functions
static bool isIdentityOrder (ArrayRef< unsigned > Order)
Does this non-empty order represent an identity order?
Friends
struct DenseMapInfo< EdgeInfo >
struct GraphTraits< BoUpSLP * >
struct DOTGraphTraits< BoUpSLP * >
raw_ostream & operator<< (raw_ostream &OS, const BoUpSLP::ScheduleEntity &SE)
raw_ostream & operator<< (raw_ostream &OS, const BoUpSLP::ScheduleData &SD)
raw_ostream & operator<< (raw_ostream &OS, const BoUpSLP::ScheduleBundle &Bundle)
raw_ostream & operator<< (raw_ostream &OS, const BoUpSLP::ScheduleCopyableData &SD)

Bottom Up SLP Vectorizer.

Definition at line 1924 of file SLPVectorizer.cpp.

ExtraValueToDebugLocsMap

InstrList

OrdersType

StoreList

ValueList

ValueSet

LoadsState

Tracks the state we can represent the loads in the given sequence.

Enumerator
Gather
Vectorize
ScatterVectorize
StridedVectorize
CompressVectorize

Definition at line 1948 of file SLPVectorizer.cpp.

slpvectorizer::BoUpSLP::BoUpSLP ( Function * Func, ScalarEvolution * Se, TargetTransformInfo * Tti, TargetLibraryInfo * TLi, AAResults * Aa, LoopInfo * Li, DominatorTree * Dt, AssumptionCache * AC, DemandedBits * DB, const DataLayout * DL, OptimizationRemarkEmitter * ORE ) inline

~BoUpSLP()

analyzeConstantStrideCandidate()

Return true if an array of scalar loads can be replaced with a strided load (with constant stride).

It is possible that the load gets "widened". Suppose that originally each load loads k bytes and PointerOps can be arranged as follows (s is constant): b + 0 * s + 0 b + 0 * s + 1 b + 0 * s + 2 ... b + 0 * s + (w - 1)

b + 1 * s + 0 b + 1 * s + 1 b + 1 * s + 2 ... b + 1 * s + (w - 1) ...

b + (n - 1) * s + 0 b + (n - 1) * s + 1 b + (n - 1) * s + 2 ... b + (n - 1) * s + (w - 1)

In this case we will generate a strided load of type <n x (k * w)>.

Parameters

PointerOps list of pointer arguments of loads.
ElemTy original scalar type of loads.
Alignment alignment of the first load.
SortedIndices is the order of PointerOps as returned by sortPtrAccesses
Diff Pointer difference between the lowest and the highes pointer in PointerOps as returned by getPointersDiff.
Ptr0 first pointer in PointersOps.
PtrN last pointer in PointersOps.
SPtrInfo If the function return true, it also sets all the fields of SPtrInfo necessary to generate the strided load later.

Definition at line 7012 of file SLPVectorizer.cpp.

References llvm::SmallVectorTemplateCommon< T, typename >::empty(), llvm::find_if(), llvm::Type::getIntNTy(), llvm::getPointersDiff(), llvm::Value::getType(), getWidenedType(), I, isStridedLoad(), llvm::seq(), and llvm::ArrayRef< T >::size().

Referenced by canVectorizeLoads().

analyzedReductionRoot()

void slpvectorizer::BoUpSLP::analyzedReductionRoot ( Instruction * I) inline

analyzedReductionVals()

void slpvectorizer::BoUpSLP::analyzedReductionVals ( ArrayRef< Value * > VL) inline

analyzeRtStrideCandidate()

Return true if an array of scalar loads can be replaced with a strided load (with run-time stride).

Parameters

PointerOps list of pointer arguments of loads.
ScalarTy type of loads.
CommonAlignment common alignement of loads as computed by computeCommonAlignment<LoadInst>.
SortedIndicies is a list of indicies computed by this function such that the sequence PointerOps[SortedIndices[0]], / PointerOps[SortedIndicies[1]], ..., PointerOps[SortedIndices[n]] is ordered by the coefficient of the stride. For example, if PointerOps is base + stride, base, base + 2 * stride the SortedIndices will be [1, 0, 2]. We follow the convention that if SortedIndices has to be 0, 1, 2, 3, ... we return empty vector for SortedIndicies.
SPtrInfo If the function return true, it also sets all the fields of SPtrInfo necessary to generate the strided load later.

Definition at line 7106 of file SLPVectorizer.cpp.

References calculateRtStride(), getWidenedType(), and llvm::ArrayRef< T >::size().

Referenced by canVectorizeLoads().

areAnalyzedReductionVals()

bool slpvectorizer::BoUpSLP::areAnalyzedReductionVals ( ArrayRef< Value * > VL) const inline

areKnownNonVectorizableLoads()

template<typename T>

bool slpvectorizer::BoUpSLP::areKnownNonVectorizableLoads ( ArrayRef< T * > VL) const inline

buildExternalUses()

void BoUpSLP::buildExternalUses ( const ExtraValueToDebugLocsMap & ExternallyUsedValues = {} )

Builds external uses of the vectorized scalars, i.e.

the list of vectorized scalars to be extracted, their lanes and their scalar users. ExternallyUsedValues contains additional list of external uses to handle vectorization of reductions.

Definition at line 8937 of file SLPVectorizer.cpp.

References llvm::all_of(), assert(), llvm::dbgs(), doesInTreeUserNeedToExtract(), llvm::dyn_cast(), llvm::ArrayRef< T >::empty(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::end(), llvm::detail::DenseSetImpl< ValueT, MapTy, ValueInfoT >::end(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::find(), llvm::detail::DenseSetImpl< ValueT, MapTy, ValueInfoT >::find(), llvm::isa(), isDeleted(), LLVM_DEBUG, llvm::none_of(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::try_emplace(), llvm::Value::users(), and UsesLimit.

buildTree() [1/2]

buildTree() [2/2]

canMapToVector()

Check if homogeneous aggregate is isomorphic to some VectorType.

Accepts homogeneous multidimensional aggregate of scalars/vectors like {[4 x i16], [4 x i16]}, { <2 x float>, <2 x float> }, {{{i16, i16}, {i16, i16}}, {{i16, i16}, {i16, i16}}} and so on.

Returns

number of elements in vector if isomorphism exists, 0 otherwise.

Definition at line 12130 of file SLPVectorizer.cpp.

References llvm::cast(), llvm::dyn_cast(), getWidenedType(), llvm::isa(), llvm::Type::isEmptyTy(), isValidElementType(), and N.

canVectorizeLoads()

Checks if the given array of loads can be represented as a vectorized, scatter or just simple gather.

Parameters

VL list of loads.
VL0 main load value.
Order returned order of load instructions.
PointerOps returned list of pointer operands.
BestVF return best vector factor, if recursive check found better vectorization sequences rather than masked gather.
TryRecursiveCheck used to check if long masked gather can be represented as a serie of loads/insert subvector, if profitable.

Definition at line 7124 of file SLPVectorizer.cpp.

References llvm::all_of(), analyzeConstantStrideCandidate(), analyzeRtStrideCandidate(), llvm::any_of(), areKnownNonVectorizableLoads(), arePointersCompatible(), llvm::ArrayRef(), llvm::SmallVectorTemplateCommon< T, typename >::back(), llvm::ArrayRef< T >::begin(), llvm::SmallVectorTemplateCommon< T, typename >::begin(), llvm::CallingConv::C, canVectorizeLoads(), llvm::cast(), llvm::SmallVectorImpl< T >::clear(), llvm::APInt::clearAllBits(), CompressVectorize, computeCommonAlignment(), CostKind, llvm::count_if(), doesNotNeedToBeScheduled(), llvm::dyn_cast(), llvm::SmallVectorTemplateCommon< T, typename >::empty(), llvm::ArrayRef< T >::end(), llvm::enumerate(), llvm::ArrayRef< T >::front(), llvm::SmallVectorTemplateCommon< T, typename >::front(), Gather, GEP, llvm::APInt::getAllOnes(), getFloorFullVectorNumberOfElements(), getGEPCosts(), getMinVF(), llvm::APInt::getOneBitSet(), getParent(), llvm::getPointerOperand(), llvm::getPointersDiff(), getScalarizationOverhead(), getShuffleCost(), llvm::Value::getType(), llvm::getUnderlyingObject(), getWidenedType(), hasFullVectorsOrPowerOf2(), I, llvm::isa(), llvm::APInt::isAllOnes(), llvm::IsaPred, isMaskedLoadCompress(), isReverseOrder(), llvm::APInt::isZero(), P, llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::SmallVectorImpl< T >::resize(), ScatterVectorize, llvm::seq(), llvm::APInt::setAllBits(), llvm::APInt::setBits(), llvm::ArrayRef< T >::size(), llvm::SmallVectorTemplateCommon< T, typename >::size(), llvm::TargetTransformInfo::SK_Broadcast, llvm::TargetTransformInfo::SK_InsertSubvector, llvm::TargetTransformInfo::SK_PermuteSingleSrc, llvm::ArrayRef< T >::slice(), SLPCostThreshold, llvm::sortPtrAccesses(), StridedVectorize, llvm::TargetTransformInfo::TCK_RecipThroughput, and Vectorize.

Referenced by canVectorizeLoads(), and getReorderingData().

clearReductionData()

void slpvectorizer::BoUpSLP::clearReductionData ( ) inline

Clear the list of the analyzed reduction root instructions.

Definition at line 3641 of file SLPVectorizer.cpp.

computeMinimumValueSizes()

void BoUpSLP::computeMinimumValueSizes ( )

Compute the minimum type sizes required to represent the entries in a vectorizable tree.

Definition at line 22634 of file SLPVectorizer.cpp.

References llvm::all_of(), llvm::any_of(), assert(), llvm::bit_ceil(), llvm::SmallVectorImpl< T >::clear(), llvm::ComputeNumSignBits(), llvm::detail::DenseSetImpl< ValueT, MapTy, ValueInfoT >::contains(), llvm::dyn_cast(), llvm::IntegerType::get(), getNumberOfParts(), getNumElements(), getOpcode(), llvm::Type::getScalarType(), llvm::Value::getType(), getWidenedType(), I, llvm::isa(), llvm::isGather(), and llvm::SmallVectorTemplateBase< T, bool >::push_back().

deleteTree()

void slpvectorizer::BoUpSLP::deleteTree ( ) inline

eraseInstruction()

void slpvectorizer::BoUpSLP::eraseInstruction ( Instruction * I) inline

findBestRootPair()

findPartiallyOrderedLoads()

std::optional< BoUpSLP::OrdersType > BoUpSLP::findPartiallyOrderedLoads ( const TreeEntry & TE )

findReusedOrderedScalars()

std::optional< BoUpSLP::OrdersType > BoUpSLP::findReusedOrderedScalars ( const TreeEntry & TE,
bool TopToBottom,
bool IgnoreReorder )

Checks if the specified gather tree entry TE can be represented as a shuffled vector entry + (possibly) permutation with other gathers.

It implements the checks only for possibly ordered scalars (Loads, ExtractElement, ExtractValue), which can be part of the graph.

Parameters

TopToBottom If true, used for the whole tree rotation, false - for sub-tree rotations.
IgnoreReorder true, if the order of the root node might be ignored.

Definition at line 6307 of file SLPVectorizer.cpp.

References llvm::SmallBitVector::all(), llvm::all_of(), llvm::SmallBitVector::any(), llvm::any_of(), assert(), llvm::SmallVectorTemplateCommon< T, typename >::begin(), llvm::cast(), llvm::count(), llvm::dyn_cast(), llvm::SmallVectorTemplateCommon< T, typename >::empty(), llvm::SmallVectorTemplateCommon< T, typename >::end(), llvm::enumerate(), llvm::fill(), llvm::find(), llvm::SmallVectorTemplateCommon< T, typename >::front(), getNumberOfParts(), getNumElems(), getPartNumElems(), getWidenedType(), I, llvm::isa(), isConstant(), isValidElementType(), P, llvm::PoisonMaskElem, llvm::seq(), llvm::SmallBitVector::set(), llvm::SmallVectorTemplateCommon< T, typename >::size(), llvm::TargetTransformInfo::SK_PermuteSingleSrc, and llvm::SmallBitVector::test().

Referenced by getReorderingData().

getCanonicalGraphSize()

unsigned slpvectorizer::BoUpSLP::getCanonicalGraphSize ( ) const inline

getMaximumVF()

getMaxVecRegSize()

unsigned slpvectorizer::BoUpSLP::getMaxVecRegSize ( ) const inline

getMinVecRegSize()

unsigned slpvectorizer::BoUpSLP::getMinVecRegSize ( ) const inline

getMinVF()

getORE()

getReductionType()

FixedVectorType * slpvectorizer::BoUpSLP::getReductionType ( ) const inline

getReorderingData()

std::optional< BoUpSLP::OrdersType > BoUpSLP::getReorderingData ( const TreeEntry & TE,
bool TopToBottom,
bool IgnoreReorder )

Gets reordering data for the given tree entry.

If the entry is vectorized

Definition at line 7627 of file SLPVectorizer.cpp.

References addMask(), llvm::all_of(), allConstant(), allSameType(), llvm::any_of(), llvm::ArrayRef(), assert(), llvm::SmallVectorTemplateCommon< T, typename >::begin(), canVectorizeLoads(), llvm::cast(), llvm::Instruction::comesBefore(), CompressVectorize, llvm::count_if(), llvm::Data, llvm::divideCeil(), llvm::dyn_cast(), llvm::SmallVectorTemplateCommon< T, typename >::empty(), llvm::SmallVectorTemplateCommon< T, typename >::end(), llvm::enumerate(), llvm::find(), llvm::find_if_not(), findPartiallyOrderedLoads(), findReusedOrderedScalars(), fixupOrderingIndices(), llvm::SmallVectorTemplateCommon< T, typename >::front(), llvm::PoisonValue::get(), getElementIndex(), getExtractIndex(), getNumberOfParts(), llvm::Value::getNumUses(), getParent(), llvm::ilist_detail::node_parent_access< NodeTy, ParentTy >::getParent(), getShuffleCost(), getWidenedType(), I, II, inversePermutation(), llvm::isa(), isAlternateInstruction(), llvm::IsaPred, llvm::Instruction::isBinaryOp(), isConstant(), isIdentityOrder(), llvm::ShuffleVectorInst::isOneUseSingleSourceMask(), isReverseOrder(), isSplat(), llvm::PoisonMaskElem, reorderOrder(), llvm::seq(), llvm::SmallBitVector::set(), llvm::SmallVectorTemplateCommon< T, typename >::size(), llvm::TargetTransformInfo::SK_PermuteSingleSrc, llvm::stable_sort(), StridedVectorize, llvm::TargetTransformInfo::TCK_RecipThroughput, llvm::SmallBitVector::test(), llvm::transform(), llvm::Value::use_empty(), llvm::Value::user_begin(), Vectorize, VectorizeNonPowerOf2, and llvm::zip().

Referenced by reorderBottomToTop(), and reorderTopToBottom().

getRootNodeScalars()

ArrayRef< Value * > slpvectorizer::BoUpSLP::getRootNodeScalars ( ) const inline

getRootNodeTypeWithNoCast()

std::optional< std::pair< Type *, bool > > slpvectorizer::BoUpSLP::getRootNodeTypeWithNoCast ( ) const inline

getSpillCost()

Returns

the cost incurred by unwanted spills and fills, caused by holding live values over call sites.

Definition at line 15805 of file SLPVectorizer.cpp.

References allConstant(), llvm::SmallVectorImpl< T >::append(), assert(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::at(), BlockSize, llvm::cast(), Cleanup, llvm::Instruction::comesBefore(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::contains(), llvm::SmallPtrSetImpl< PtrType >::contains(), llvm::dyn_cast(), llvm::SmallVectorTemplateCommon< T, typename >::empty(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::end(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::find(), llvm::First, llvm::IntegerType::get(), llvm::IntrinsicCostAttributes::getArgTypes(), llvm::Type::getContext(), llvm::BasicBlock::getFirstNonPHIOrDbgOrAlloca(), llvm::BasicBlock::getParent(), llvm::ilist_detail::node_parent_access< NodeTy, ParentTy >::getParent(), llvm::BasicBlock::getTerminator(), getWidenedType(), I, II, llvm::InnerAnalysisManagerProxy< AnalysisManagerT, IRUnitT, ExtraArgTs... >::Key, llvm::detail::DenseSetImpl< ValueT, MapTy, ValueInfoT >::insert(), llvm::SmallPtrSetImpl< PtrType >::insert(), llvm::isa(), isVectorized(), llvm::Type::isVectorTy(), llvm::Last, llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::lookup(), llvm::make_scope_exit(), llvm::SmallVectorImpl< T >::pop_back_val(), llvm::pred_begin(), llvm::pred_end(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), ScheduleRegionSizeBudget, llvm::BasicBlock::size(), llvm::TargetTransformInfo::TCK_RecipThroughput, and llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::try_emplace().

Referenced by getTreeCost().

getTreeCost()

Returns

the vectorization cost of the subtree that starts at VL. A negative number means that this is profitable.

Definition at line 16198 of file SLPVectorizer.cpp.

References llvm::all_of(), llvm::any_of(), areTwoInsertFromSameBuildVector(), assert(), llvm::sampleprof::Base, llvm::SmallVectorTemplateCommon< T, typename >::begin(), llvm::CallingConv::C, llvm::cast(), llvm::detail::DenseSetImpl< ValueT, MapTy, ValueInfoT >::contains(), CostKind, llvm::count_if(), llvm::Data, llvm::dbgs(), llvm::dump(), llvm::dyn_cast(), llvm::dyn_cast_if_present(), llvm::dyn_cast_or_null(), llvm::SmallVectorImpl< T >::emplace_back(), llvm::ArrayRef< T >::empty(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::end(), llvm::SmallVectorTemplateCommon< T, typename >::end(), llvm::enumerate(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::find(), llvm::find_if(), llvm::SmallVectorTemplateCommon< T, typename >::front(), llvm::IntegerType::get(), llvm::Type::getContext(), getElementIndex(), getExtractWithExtendCost(), llvm::User::getOperand(), llvm::Type::getScalarType(), getShuffleCost(), getSpillCost(), llvm::InsertElementInst::getType(), getVectorInstrCost(), getWidenedType(), llvm::APInt::getZero(), llvm::has_single_bit(), I, II, llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::insert(), llvm::detail::DenseSetImpl< ValueT, MapTy, ValueInfoT >::insert(), llvm::SmallPtrSetImpl< PtrType >::insert(), llvm::is_contained(), llvm::isa(), llvm::isa_and_nonnull(), llvm::isa_and_present(), isFirstInsertElement(), llvm::ShuffleVectorInst::isIdentityMask(), llvm::isKnownNonNegative(), isVectorized(), LLVM_DEBUG, llvm::MutableArrayRef(), llvm::TargetTransformInfo::None, llvm::none_of(), P, performExtractsShuffleAction(), llvm::PoisonMaskElem, llvm::SmallVectorTemplateBase< T, bool >::push_back(), shortBundleName(), llvm::SmallVectorTemplateCommon< T, typename >::size(), llvm::TargetTransformInfo::SK_PermuteSingleSrc, llvm::TargetTransformInfo::SK_PermuteTwoSrc, SLPCostThreshold, SLPReVec, llvm::TargetTransformInfo::TCC_Basic, llvm::TargetTransformInfo::TCC_Free, llvm::TargetTransformInfo::TCK_RecipThroughput, UsesLimit, Vector, llvm::ViewGraph(), and ViewSLPTree.

getTreeSize()

unsigned slpvectorizer::BoUpSLP::getTreeSize ( ) const inline

getVectorElementSize()

Returns

The vector element size in bits to use when vectorizing the expression tree ending at V. If V is a store, the size is the width of the stored value. Otherwise, the size is the width of the largest loaded value reaching V. This method is used by the vectorizer to calculate vectorization factors.

Definition at line 22108 of file SLPVectorizer.cpp.

References llvm::dyn_cast(), llvm::SmallVectorImpl< T >::emplace_back(), llvm::SmallVectorTemplateCommon< T, typename >::empty(), llvm::Type::getInt1Ty(), llvm::Value::getType(), getVectorElementSize(), I, llvm::SmallPtrSetImpl< PtrType >::insert(), llvm::isa(), llvm::SmallVectorImpl< T >::pop_back_val(), and RecursionMaxDepth.

Referenced by getVectorElementSize().

isAnalyzedReductionRoot()

bool slpvectorizer::BoUpSLP::isAnalyzedReductionRoot ( Instruction * I) const inline

Checks if the instruction was already analyzed for being possible reduction root.

Definition at line 3622 of file SLPVectorizer.cpp.

References I.

isAnyGathered()

isDeleted()

isGathered()

bool slpvectorizer::BoUpSLP::isGathered ( const Value * V) const inline

Checks if the given value is gathered in one of the nodes.

Definition at line 3651 of file SLPVectorizer.cpp.

isIdentityOrder()

isLoadCombineCandidate()

isLoadCombineReductionCandidate()

bool BoUpSLP::isLoadCombineReductionCandidate ( RecurKind RdxKind ) const

Assume that a legal-sized 'or'-reduction of shifted/zexted loaded values can be load combined in the backend.

Load combining may not be allowed in the IR optimizer, so we do not want to alter the pattern. For example, partially transforming a scalar bswap() pattern into vector code is effectively impossible for the backend to undo. TODO: If load combining is allowed in the IR optimizer, this analysis may not be necessary.

Definition at line 15572 of file SLPVectorizer.cpp.

References isLoadCombineCandidateImpl(), and llvm::Or.

isNotScheduled()

bool slpvectorizer::BoUpSLP::isNotScheduled ( const Value * V) const inline

isProfitableToReorder()

bool BoUpSLP::isProfitableToReorder ( ) const

isSignedMinBitwidthRootNode()

bool slpvectorizer::BoUpSLP::isSignedMinBitwidthRootNode ( ) const inline

Checks if the root graph node can be emitted with narrower bitwidth at codegen and returns it signedness, if so.

Definition at line 2046 of file SLPVectorizer.cpp.

isStridedLoad()

Checks if strided loads can be generated out of VL loads with pointers PointerOps:

  1. Target with strided load support is detected.
  2. The number of loads is greater than MinProfitableStridedLoads, or the potential stride <= MaxProfitableLoadStride and the potential stride is power-of-2 (to avoid perf regressions for the very small number of loads) and max distance > number of loads, or potential stride is -1.
  3. The loads are ordered, or number of unordered loads <= MaxProfitableUnorderedLoads, or loads are in reversed order. (this check is to avoid extra costs for very expensive shuffles).
  4. Any pointer operand is an instruction with the users outside of the current graph (for masked gathers extra extractelement instructions might be required).

Definition at line 6981 of file SLPVectorizer.cpp.

References llvm::any_of(), getWidenedType(), llvm::has_single_bit(), llvm::isa(), MaxProfitableLoadStride, and MinProfitableStridedLoads.

Referenced by analyzeConstantStrideCandidate().

isTreeNotExtendable()

bool BoUpSLP::isTreeNotExtendable ( ) const

Checks if the graph and all its subgraphs cannot be better vectorized.

It may happen, if all gather nodes are loads and they cannot be "clusterized". In this case even subgraphs cannot be vectorized more effectively than the base graph.

Definition at line 15771 of file SLPVectorizer.cpp.

References llvm::all_of(), allConstant(), allSameBlock(), llvm::ArrayRef(), llvm::count_if(), getCanonicalGraphSize(), getSameOpcode(), getTreeSize(), llvm::isa(), llvm::IsaPred, isSplat(), and llvm::seq().

isTreeTinyAndNotFullyVectorizable()

bool BoUpSLP::isTreeTinyAndNotFullyVectorizable ( bool ForReduction = false ) const

Returns

True if the VectorizableTree is both tiny and not fully vectorizable. We do not vectorize such trees.

Definition at line 15595 of file SLPVectorizer.cpp.

References llvm::all_of(), allConstant(), allSameBlock(), llvm::any_of(), llvm::ArrayRef(), assert(), llvm::count_if(), llvm::SmallVectorTemplateCommon< T, typename >::empty(), llvm::APInt::getAllOnes(), getWidenedType(), llvm::isa(), llvm::IsaPred, isSplat(), MinTreeSize, llvm::none_of(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::DebugCounter::shouldExecute(), llvm::SmallVectorTemplateCommon< T, typename >::size(), SLPCostThreshold, and llvm::TargetTransformInfo::TCK_RecipThroughput.

isVectorized()

bool slpvectorizer::BoUpSLP::isVectorized ( const Value * V) const inline

optimizeGatherSequence()

void BoUpSLP::optimizeGatherSequence ( )

Perform LICM and CSE on the newly generated gather sequences.

Definition at line 20972 of file SLPVectorizer.cpp.

References A(), llvm::any_of(), assert(), llvm::SmallVectorImpl< T >::assign(), B(), llvm::SmallVectorTemplateCommon< T, typename >::begin(), llvm::SmallPtrSetImplBase::clear(), llvm::dbgs(), llvm::dyn_cast(), llvm::SmallVectorTemplateCommon< T, typename >::empty(), llvm::SmallVectorTemplateCommon< T, typename >::end(), eraseInstruction(), llvm::ilist_node_impl< OptionsT >::getIterator(), getNumberOfParts(), llvm::BasicBlock::getTerminator(), getWidenedType(), I, llvm::is_contained(), llvm::isa(), isDeleted(), LLVM_DEBUG, llvm::make_early_inc_range(), N, llvm::PoisonMaskElem, llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::SmallVectorImpl< T >::reserve(), llvm::ArrayRef< T >::size(), llvm::SmallVectorTemplateCommon< T, typename >::size(), and llvm::sort().

processBuildVector()

template<typename BVTy, typename ResTy, typename... Args>

ResTy slpvectorizer::BoUpSLP::processBuildVector ( const TreeEntry * E,
Type * ScalarTy,
Args &... Params )

Definition at line 18652 of file SLPVectorizer.cpp.

References llvm::all_of(), llvm::any_of(), llvm::SmallVectorImpl< T >::append(), llvm::ArrayRef(), assert(), llvm::SmallVectorImpl< T >::assign(), llvm::SmallVectorTemplateCommon< T, typename >::back(), llvm::SmallVectorTemplateCommon< T, typename >::begin(), llvm::cast(), llvm::SmallVectorImpl< T >::clear(), llvm::copy(), llvm::count_if(), llvm::dbgs(), llvm::dyn_cast(), llvm::ArrayRef< T >::empty(), llvm::SmallVectorTemplateCommon< T, typename >::empty(), llvm::SmallVectorTemplateCommon< T, typename >::end(), llvm::enumerate(), llvm::find_if(), llvm::find_if_not(), llvm::for_each(), llvm::SmallVectorTemplateCommon< T, typename >::front(), llvm::PoisonValue::get(), llvm::UndefValue::get(), getNumberOfParts(), getNumElems(), getPartNumElems(), llvm::Value::getType(), getWidenedType(), I, if(), inversePermutation(), llvm::is_contained(), llvm::isa(), llvm::IsaPred, isConstant(), llvm::ShuffleVectorInst::isExtractSubvectorMask(), llvm::isGuaranteedNotToBePoison(), llvm::ShuffleVectorInst::isIdentityMask(), isSplat(), isVectorized(), LLVM_DEBUG, llvm::MutableArrayRef(), llvm::none_of(), P, llvm::PoisonMaskElem, llvm::SmallVectorTemplateBase< T, bool >::push_back(), reorderScalars(), llvm::seq(), shortBundleName(), llvm::SmallVectorTemplateCommon< T, typename >::size(), llvm::TargetTransformInfo::SK_PermuteSingleSrc, llvm::TargetTransformInfo::SK_PermuteTwoSrc, llvm::ArrayRef< T >::slice(), std::swap(), llvm::transform(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::try_emplace(), and llvm::zip().

registerNonVectorizableLoads()

template<typename T>

void slpvectorizer::BoUpSLP::registerNonVectorizableLoads ( ArrayRef< T * > VL) inline

removeInstructionsAndOperands()

Remove instructions from the parent function and clear the operands of DeadVals instructions, marking for deletion trivially dead operands.

Definition at line 3537 of file SLPVectorizer.cpp.

References llvm::all_of(), assert(), llvm::cast(), llvm::cast_or_null(), llvm::dyn_cast(), llvm::dyn_cast_if_present(), llvm::ArrayRef< T >::empty(), llvm::SmallVectorTemplateCommon< T, typename >::empty(), eraseInstruction(), I, llvm::detail::DenseSetImpl< ValueT, MapTy, ValueInfoT >::insert(), llvm::isInstructionTriviallyDead(), llvm::none_of(), llvm::SmallVectorImpl< T >::pop_back_val(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::salvageDebugInfo(), llvm::Value::use_empty(), and llvm::wouldInstructionBeTriviallyDead().

reorderBottomToTop()

void BoUpSLP::reorderBottomToTop ( bool IgnoreReorder = false )

Reorders the current graph to the most profitable order starting from leaves to the root.

It allows to rotate small subgraphs and reduce the number of reshuffles if the leaf nodes use the same order. In this case we can merge the orders and just shuffle user node instead of shuffling its operands. Plus, even the leaf nodes have different orders, it allows to sink reordering in the graph closer to the root node and merge it later during analysis.

Definition at line 8519 of file SLPVectorizer.cpp.

References AbstractManglingParser< Derived, Alloc >::NumOps, AbstractManglingParser< Derived, Alloc >::Ops, llvm::all_of(), llvm::any_of(), assert(), llvm::SmallVectorTemplateCommon< T, typename >::begin(), llvm::SmallPtrSetImplBase::clear(), combineOrders(), llvm::detail::DenseSetImpl< ValueT, MapTy, ValueInfoT >::contains(), llvm::detail::DenseSetImpl< ValueT, MapTy, ValueInfoT >::count(), llvm::count_if(), llvm::Data, llvm::dyn_cast(), llvm::ArrayRef< T >::empty(), llvm::SmallVectorTemplateCommon< T, typename >::empty(), llvm::SmallPtrSetImpl< PtrType >::erase(), llvm::find_if_not(), fixupOrderingIndices(), Gather, getReorderingData(), llvm::getVectorIntrinsicIDForCall(), I, llvm::detail::DenseSetImpl< ValueT, MapTy, ValueInfoT >::insert(), llvm::SetVector< T, Vector, Set, N >::insert(), llvm::SmallPtrSetImpl< PtrType >::insert(), llvm::SmallPtrSetImpl< PtrType >::insert_range(), inversePermutation(), llvm::isa(), isConstant(), isIdentityOrder(), llvm::isVectorIntrinsicWithScalarOpAtArg(), llvm::make_second_range(), llvm::Intrinsic::not_intrinsic, P, llvm::PoisonMaskElem, llvm::SmallVectorTemplateBase< T, bool >::push_back(), reorderOrder(), reorderReuses(), reorderScalars(), llvm::seq(), llvm::ArrayRef< T >::size(), llvm::SmallVectorTemplateCommon< T, typename >::size(), llvm::SetVector< T, Vector, Set, N >::takeVector(), llvm::transform(), and Users.

reorderTopToBottom()

void BoUpSLP::reorderTopToBottom ( )

Reorders the current graph to the most profitable order starting from the root node to the leaf nodes.

The best order is chosen only from the nodes of the same size (vectorization factor). Smaller nodes are considered parts of subgraph with smaller VF and they are reordered independently. We can make it because we still need to extend smaller nodes to the wider VF and we can merge reordering shuffles with the widening shuffles.

Definition at line 8177 of file SLPVectorizer.cpp.

References addMask(), assert(), llvm::SmallVectorTemplateCommon< T, typename >::begin(), Cleanup, combineOrders(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::count(), llvm::ArrayRef< T >::empty(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::empty(), llvm::SmallVectorTemplateCommon< T, typename >::empty(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::end(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::erase(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::find(), fixupOrderingIndices(), llvm::for_each(), getAltInstrMask(), getOpcode(), getReorderingData(), getWidenedType(), I, llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::insert(), inversePermutation(), llvm::isa(), isIdentityOrder(), llvm::make_scope_exit(), llvm::PoisonMaskElem, RecursionMaxDepth, reorderOrder(), reorderScalars(), llvm::ArrayRef< T >::size(), SLPReVec, llvm::transform(), and llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::try_emplace().

transformNodes()

void BoUpSLP::transformNodes ( )

Transforms graph nodes to target specific representations, if profitable.

Definition at line 13046 of file SLPVectorizer.cpp.

References llvm::all_of(), llvm::any_of(), CostKind, llvm::count_if(), llvm::dyn_cast(), llvm::SmallVectorImpl< T >::emplace_back(), llvm::ArrayRef< T >::empty(), findBestRootPair(), I, llvm::is_contained(), llvm::isa(), P, slpvectorizer::BoUpSLP::LookAheadHeuristics::ScoreSplatLoads, llvm::seq(), and llvm::TargetTransformInfo::TCK_RecipThroughput.

vectorizeTree() [1/2]

Value * BoUpSLP::vectorizeTree ( )

vectorizeTree() [2/2]

Vectorize the tree but with the list of externally used values ExternallyUsedValues.

Values in this MapVector can be replaced but the generated extractvalue instructions.

Definition at line 20302 of file SLPVectorizer.cpp.

References llvm::all_of(), llvm::any_of(), areTwoInsertFromSameBuildVector(), assert(), llvm::SmallVectorTemplateCommon< T, typename >::begin(), llvm::cast(), llvm::detail::DenseSetImpl< ValueT, MapTy, ValueInfoT >::contains(), llvm::detail::DenseSetImpl< ValueT, MapTy, ValueInfoT >::count(), createExtractVector(), llvm::Data, llvm::dbgs(), llvm::dyn_cast(), llvm::SmallVectorImpl< T >::emplace_back(), llvm::ArrayRef< T >::empty(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::end(), llvm::SmallVectorTemplateCommon< T, typename >::end(), eraseInstruction(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::find(), llvm::find_if(), llvm::get(), getElementIndex(), llvm::ilist_node_impl< OptionsT >::getIterator(), llvm::User::getOperand(), llvm::ilist_detail::node_parent_access< NodeTy, ParentTy >::getParent(), llvm::InsertElementInst::getType(), llvm::Value::getType(), getWidenedType(), I, II, llvm::InnerAnalysisManagerProxy< AnalysisManagerT, IRUnitT, ExtraArgTs... >::Key, llvm::detail::DenseSetImpl< ValueT, MapTy, ValueInfoT >::insert(), llvm::is_contained(), llvm::isa(), llvm::Type::isIntOrIntVectorTy(), llvm::isKnownNonNegative(), isVectorized(), IV, LLVM_DEBUG, llvm::mayHaveNonDefUseDependency(), llvm::Instruction::moveAfter(), PHI, llvm::PoisonMaskElem, llvm::User::replaceUsesOfWith(), llvm::seq(), llvm::SmallVectorTemplateCommon< T, typename >::size(), SLPReVec, llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::try_emplace(), llvm::User::User(), llvm::Value::users(), and vectorizeTree().

DenseMapInfo< EdgeInfo >

DOTGraphTraits< BoUpSLP * >

GraphTraits< BoUpSLP * >

operator<< [1/4]

operator<< [2/4]

operator<< [3/4]

operator<< [4/4]


The documentation for this class was generated from the following file: