Fennel: LcsClusterNodeWriter Class Reference

Constructs a cluster page, managing the amount of space currently in use on the page and determining the offsets where different elements are to be stored.

#include <[LcsClusterNodeWriter.h](LcsClusterNodeWriter%5F8h-source.html)>

Public Member Functions
LcsClusterNodeWriter (BTreeDescriptor const &treeDescriptorInit, SegmentAccessor const &accessorInit, TupleDescriptor const &colTupleDescInit, SharedTraceTarget pTraceTargetInit, std::string nameInit)
~LcsClusterNodeWriter ()
bool getLastClusterPageForWrite (PLcsClusterNode &pBlock, LcsRid &firstRid)
Gets the last cluster page.
PLcsClusterNode allocateClusterPage (LcsRid firstRid)
Allocates a new cluster page.
void init (uint nColumns, PBuffer indexBlock, PBuffer *pBlock, uint szBlock)
Initializes object with parameters relevant to the cluster page that will be written.
void close ()
void openNew (LcsRid startRID)
Prepares a cluster page as a new one.
bool openAppend (uint *nValOffsets, uint16_t *lastValOffsets, RecordNum &nrows)
Prepares an existing cluster page for appending new data, and determines whether the page is already full and cannot accommodate any more data.
void describeLastBatch (uint column, uint &dRow, uint &recSize)
Returns parameters describing the last batch for a given column.
uint16_t getNextVal (uint column, uint16_t thisVal)
Returns the offset of the next value in a batch.
void rollBackLastBatch (uint column, PBuffer pVal)
Rolls back the last 8 values (or fewer) from a batch.
bool noCompressMode (uint column) const
Returns true if the batch is not being forced to compress mode.
PBuffer getOffsetPtr (uint column, uint16_t offset)
Translates an offset for a column to the pointer to the actual value.
bool addValue (uint column, bool bFirstTimeInBatch)
Adds a value to the page, in the case where the value already exists in the column.
bool addValue (uint column, PBuffer pVal, uint16_t *oVal)
Adds a new value to the page.
void undoValue (uint column, PBuffer pVal, bool bFirstInBatch)
Undoes the last value added to the current batch for a column.
void putCompressedBatch (uint column, PBuffer pRows, PBuffer pBuf)
Writes a compressed mode batch into the temporary cluster page for a column.
void putFixedVarBatch (uint column, uint16_t *pRows, PBuffer pBuf)
Writes a fixed or variable mode batch into a temporary cluster page for a column.
void pickCompressionMode (uint column, uint fixedSize, uint nRows, uint16_t **pValOffset, LcsBatchMode &compressionMode)
Determines which compression mode to use for a batch.
bool isEndOfBlock ()
Returns true if there is no space left in the cluster page.
void endBlock ()
Done with the current cluster page.
uint getNumClusterCols ()
Returns number of columns in cluster.
void setNumClusterCols (uint nCols)
Sets number of columns in cluster.
void unlockClusterPage ()
Unlocks cluster page.
virtual void initTraceSource (SharedTraceTarget pTraceTarget, std::string name)
For use when initialization has to be deferred until after construction.
void trace (TraceLevel level, std::string message) const
Records a trace message.
bool isTracing () const
**Returns:** true iff tracing is enabled for this source
bool isTracingLevel (TraceLevel level) const
Determines whether a particular level is being traced.
TraceTarget & getTraceTarget () const
**Returns:** the TraceTarget for this source
SharedTraceTarget getSharedTraceTarget () const
**Returns:** the SharedTraceTarget for this source
std::string getTraceSourceName () const
Gets the name of this source.
void setTraceSourceName (std::string const &n)
Sets the name of this source.
TraceLevel getMinimumTraceLevel () const
void disableTracing ()
Protected Member Functions
LcsRid readRid ()
Returns RID from btree tuple.
PageId readClusterPageId ()
Returns cluster pageid from btree tuple.
void setHdrOffsets (PConstLcsClusterNode pHdr)
Sets pointers to offset arrays in cluster page header.
Protected Attributes
TupleData bTreeTupleData
Tuple data representing the btree key corresponding to the cluster page.
SegmentAccessor segmentAccessor
Accessor for segment storing both btree and cluster pages.
ClusterPageLock clusterLock
Buffer lock for the actual cluster node pages.
PageId clusterPageId
Current cluster pageid.
LcsRid bTreeRid
Current rid in btree used to access current cluster page.
uint nClusterCols
Number of columns in cluster.
uint16_t * lastVal
Offsets to the last value stored on the page for each column in cluster.
uint16_t * firstVal
Offsets to the first value stored on the page for each column in cluster.
uint * nVal
Number of distinct values in the page for each column in cluster.
uint16_t * delta
For each column in the cluster, offset used to get the real offset within the page.
Private Member Functions
PBuffer valueSource (uint16_t lastValOffset, PBuffer pValBank, uint16_t oValBank, PBuffer pBlock, uint16_t f)
Associates an offset with an address, determining whether a value is stored in the temporary block or the temporary value bank.
RecordNum moveFromIndexToTemp ()
Moves all cluster data from cluster page to temporary storage.
void moveFromTempToIndex ()
Moves all cluster data from temporary storage to the actual cluster page.
void allocArrays ()
Allocates temporary arrays used during cluster writes.
uint32_t round8Boundary (uint32_t val)
Rounds a 32-bit value to a boundary of 8.
uint32_t roundIf8Boundary (uint32_t val)
Rounds a 32-bit value to a boundary of 8 if it is > 8.
Private Attributes
SharedBTreeWriter bTreeWriter
Writes btree corresponding to cluster.
SegmentAccessor scratchAccessor
Accessor for scratch segments.
ClusterPageLock bufferLock
Lock on scratch page.
PLcsClusterNode pHdr
Cluster page header.
uint hdrSize
Size of the cluster page header.
PBuffer pIndexBlock
Cluster page to be written.
PBuffer * pBlock
Array of pointers to temporary blocks, one block for each cluster column.
uint szBlock
Size of the cluster page.
int minSzLeft
Minimum size left on the page.
boost::scoped_array< LcsBatchDir > batchDirs
Batch directories for the batches currently being constructed, one per cluster column.
boost::scoped_array< PBuffer > pValBank
Temporary storage for values, used for fixed mode batches; one per cluster column.
boost::scoped_array< uint16_t > oValBank
First offset in the value bank for each column in the cluster.
boost::scoped_array< uint16_t > valBankStart
Start of each cluster column in the value bank.
boost::scoped_array< uint16_t > batchOffset
Offsets to the batch directories on the temporary pages, one per cluster column.
boost::scoped_array< uint > batchCount
Count of the number of batches in the temporary pages, one per cluster column.
int szLeft
Number of bytes left on the page.
boost::scoped_array< uint > nBits
Number of bits required to store the value codes for each column in the cluster, for the batches currently being constructed.
boost::scoped_array< uint > nextWidthChange
Number of values that will cause the next nBit change for the column in the cluster.
bool arraysAllocated
Indicates whether temporary arrays have already been allocated.
boost::scoped_array< ForceMode > bForceMode
Set when the mode of a batch should be forced to a particular value.
boost::scoped_array< uint > forceModeCount
Number of times force mode has been used for each cluster column.
boost::scoped_array< uint > maxValueSize
Max value size encountered thus far for each cluster column.
boost::scoped_array< UnalignedAttributeAccessor > attrAccessors
Accessors for reading unaligned values.
SharedLcsClusterDump clusterDump
Cluster dump.
TupleDescriptor colTupleDesc
Tuple descriptor of the columns being loaded.

Detailed Description

Constructs a cluster page, managing the amount of space currently in use on the page and determining the offsets where different elements are to be stored.

Definition at line 46 of file LcsClusterNodeWriter.h.


Constructor & Destructor Documentation

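LcsClusterNodeWriter::LcsClusterNodeWriter ( BTreeDescriptor const & treeDescriptorInit, SegmentAccessor const & accessorInit, TupleDescriptor const & colTupleDescInit, SharedTraceTarget pTraceTargetInit, std::string nameInit )

Definition at line 29 of file LcsClusterNodeWriter.cpp.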

References SegPageLock::accessSegment(), arraysAllocated, batchCount, batchDirs, batchOffset, bForceMode, bTreeWriter, bufferLock, clusterDump, colTupleDesc, forceModeCount, hdrSize, maxValueSize, minSzLeft, nBits, LcsClusterAccessBase::nClusterCols, nextWidthChange, oValBank, pBlock, pHdr, pIndexBlock, pValBank, scratchAccessor, szBlock, szLeft, TRACE_FINE, and valBankStart.

LcsClusterNodeWriter::~LcsClusterNodeWriter ( )


Member Function Documentation

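PBuffer LcsClusterNodeWriter::valueSource ( uint16_t lastValOffset, PBuffer pValBank, uint16_t oValBank, PBuffer pBlock, uint16_t f )

Associates an offset with an address, determining whether a value is stored in the temporary block or the temporary value bank.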

Parameters:

lastValOffset offset of the last value for this particular column
pValBank buffer storing values in the value bank
oValBank offset of first value for column in the value bank
pBlock temporary block for column
f desired offset

Returns:

address corresponding to offset

Definition at line 198 of file LcsClusterNodeWriter.h.

Referenced by putFixedVarBatch().

    {
        if (f < lastValOffset) {
            return pValBank + f - oValBank;
        } else {
            return pBlock + f;
        }
    }
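
The branch above can be tried in isolation. The following is a minimal sketch, assuming `PBuffer` behaves like a plain byte pointer; the `PBuffer` typedef and the name `resolveValue` are stand-ins introduced here for illustration and are not part of Fennel.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: PBuffer is modeled as a plain byte pointer for this sketch.
typedef uint8_t *PBuffer;

// Hypothetical free-function version of the valueSource() branch: offsets
// below lastValOffset resolve into the value bank (relative to oValBank),
// all other offsets resolve into the temporary block.
inline PBuffer resolveValue(
    uint16_t lastValOffset, PBuffer pValBank, uint16_t oValBank,
    PBuffer pBlock, uint16_t f)
{
    if (f < lastValOffset) {
        // The sketch expects f >= oValBank whenever this branch is taken.
        assert(f >= oValBank);
        return pValBank + (f - oValBank);
    }
    return pBlock + f;
}
```

With `lastValOffset = 100` and `oValBank = 90`, offset 95 lands 5 bytes into the value bank, while offset 120 lands 120 bytes into the temporary block.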

RecordNum LcsClusterNodeWriter::moveFromIndexToTemp ( ) [private]

Moves all cluster data from cluster page to temporary storage.

Returns:

number of rows currently on page

Definition at line 961 of file LcsClusterNodeWriter.cpp.

References batchCount, batchOffset, bitVecPtr(), bitVecWidth(), calcWidth(), LcsClusterAccessBase::firstVal, hdrSize, LcsClusterAccessBase::lastVal, LCS_COMPRESSED, LCS_VARIABLE, myCopy(), LcsClusterNode::nBatch, LcsClusterAccessBase::nClusterCols, LcsBatchDir::nRow, LcsClusterAccessBase::nVal, LcsBatchDir::nVal, LcsClusterNode::oBatch, LcsBatchDir::oVal, pBlock, pHdr, pIndexBlock, LcsBatchDir::recSize, and szBlock.

Referenced by openAppend().

    {
        PLcsBatchDir pBatch;
        boost::scoped_array<uint16_t> batchDirOffset;
        uint16_t loc;
        uint column;
        uint batchCount = pHdr->nBatch / nClusterCols;
        uint b;

        batchDirOffset.reset(new uint16_t[pHdr->nBatch]);

        // Copy each column's values to the end of its temporary block and
        // repoint firstVal/lastVal at the new location.
        for (column = 0; column < nClusterCols; column++) {
            uint sz = firstVal[column] - lastVal[column];
            loc = (uint16_t) (szBlock - sz);
            myCopy(pBlock[column] + loc, pIndexBlock + lastVal[column], sz);

            lastVal[column] = loc;
            firstVal[column] = (uint16_t) szBlock;
        }

        // Copy each column's batches to its temporary block.
        pBatch = (PLcsBatchDir)(pIndexBlock + pHdr->oBatch);
        for (column = 0; column < nClusterCols; column++) {
            uint i;
            loc = hdrSize;

            // Batch directories on the page are interleaved by column.
            for (b = column, i = 0; i < batchCount; i++, b = b + nClusterCols) {
                uint16_t batchStart = loc;

                if (pBatch[b].mode == LCS_COMPRESSED) {
                    uint8_t *pBit;
                    WidthVec w;
                    PtrVec p;
                    uint iV;
                    uint sizeOffsets, nBytes;

                    // Copy the value offsets.
                    sizeOffsets = pBatch[b].nVal * sizeof(uint16_t);
                    myCopy(
                        pBlock[column] + loc, pIndexBlock + pBatch[b].oVal,
                        sizeOffsets);

                    loc = (uint16_t) (loc + sizeOffsets);

                    // Determine the bit vector widths.
                    iV = bitVecWidth(calcWidth(pBatch[b].nVal), w);

                    // The bit vectors follow the value offsets.
                    pBit = pIndexBlock + pBatch[b].oVal + sizeOffsets;

                    // Compute the size of the bit vectors, then copy them.
                    nBytes = bitVecPtr(pBatch[b].nRow, iV, w, p, pBit);

                    myCopy(pBlock[column] + loc, pBit, nBytes);

                    loc = (uint16_t) (loc + nBytes);
                } else if (pBatch[b].mode == LCS_VARIABLE) {
                    uint sizeOffsets;

                    sizeOffsets = pBatch[b].nRow * sizeof(uint16_t);

                    // Copy the value offsets.
                    myCopy(
                        pBlock[column] + loc, pIndexBlock + pBatch[b].oVal,
                        sizeOffsets);

                    loc = (uint16_t) (loc + sizeOffsets);
                } else {
                    // Fixed mode: copy the values themselves.
                    uint sizeFixed;

                    sizeFixed = pBatch[b].nRow * pBatch[b].recSize;
                    myCopy(
                        pBlock[column] + loc, pIndexBlock + pBatch[b].oVal,
                        sizeFixed);

                    loc = (uint16_t) (loc + sizeFixed);
                }

                // Remember where this batch now starts.
                batchDirOffset[b] = batchStart;
            }

            // Copy the column's batch directories after its batches.
            uint16_t dirLoc;
            b = column;
            dirLoc = loc;
            batchOffset[column] = dirLoc;

            for (i = 0; i < batchCount; i++) {
                PLcsBatchDir pTempBatch = (PLcsBatchDir)(pBlock[column] + dirLoc);
                myCopy(pTempBatch, &pBatch[b], sizeof(LcsBatchDir));

                pTempBatch->oVal = batchDirOffset[b];
                b = b + nClusterCols;
                dirLoc += sizeof(LcsBatchDir);
            }
        }

        // Sum the row counts of the first column's batches.
        pBatch = (PLcsBatchDir)(pIndexBlock + pHdr->oBatch);
        RecordNum nrows = 0;
        for (b = 0; b < pHdr->nBatch; b = b + nClusterCols) {
            nrows += pBatch[b].nRow;
        }

        batchDirOffset.reset();
        return nrows;
    }
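
The row count returned above relies on the batch directories being interleaved by column, so stepping through them with a stride of `nClusterCols` visits only the first column's batches. A minimal sketch of that stride, assuming a pared-down directory type (`BatchDirSketch` and `countRows` are hypothetical names, not Fennel types):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, pared-down batch directory; only the row count is modeled.
struct BatchDirSketch {
    uint16_t nRow;
};

// Sums the row counts of the first column's batches. Directories are assumed
// to be interleaved by column, so column 0 owns entries 0, nCols, 2*nCols, ...
inline uint64_t countRows(
    const std::vector<BatchDirSketch> &dirs, std::size_t nClusterCols)
{
    assert(nClusterCols > 0 && dirs.size() % nClusterCols == 0);
    uint64_t nrows = 0;
    for (std::size_t b = 0; b < dirs.size(); b += nClusterCols) {
        nrows += dirs[b].nRow;
    }
    return nrows;
}
```

For a two-column cluster with three batches of 8, 16, and 3 rows, the six interleaved directories yield 27 rows.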

void LcsClusterNodeWriter::moveFromTempToIndex ( ) [private]

Moves all cluster data from temporary storage to the actual cluster page.

Definition at line 1086 of file LcsClusterNodeWriter.cpp.

References batchCount, batchOffset, bitVecPtr(), bitVecWidth(), calcWidth(), clusterDump, LcsClusterAccessBase::clusterPageId, LcsClusterAccessBase::delta, LcsClusterAccessBase::firstVal, hdrSize, TraceSource::isTracingLevel(), LcsClusterAccessBase::lastVal, LCS_COMPRESSED, LCS_VARIABLE, myCopy(), LcsClusterNode::nBatch, LcsClusterAccessBase::nClusterCols, LcsBatchDir::nRow, LcsClusterAccessBase::nVal, LcsBatchDir::nVal, LcsClusterNode::oBatch, opaqueToInt(), LcsBatchDir::oVal, pBlock, pHdr, pIndexBlock, LcsBatchDir::recSize, szBlock, and TRACE_FINE.

    {
        PLcsBatchDir pBatch;
        uint sz, numBatches = batchCount[0];
        uint16_t offset, loc;
        uint column, b;

        // Copy each column's values from its temporary block to the cluster
        // page, packing the columns backward from the end of the page.
        for (offset = (uint16_t) szBlock, column = 0; column < nClusterCols;
            column++)
        {
            sz = szBlock - lastVal[column];
            myCopy(
                pIndexBlock + (offset - sz), pBlock[column] + lastVal[column], sz);

            // Offset used to get the real offset within the page.
            delta[column] = (uint16_t)(szBlock - offset);

            firstVal[column] = offset;
            offset = (uint16_t) (offset - sz);
            lastVal[column] = offset;
        }

        // Copy the batches to the page, interleaved by column.
        for (loc = hdrSize, b = 0; b < numBatches; b++) {
            for (column = 0; column < nClusterCols; column++) {
                uint16_t batchStart = loc;

                pBatch = (PLcsBatchDir)(pBlock[column] + batchOffset[column]);

                if (pBatch[b].mode == LCS_COMPRESSED) {
                    uint8_t *pBit;
                    WidthVec w;
                    PtrVec p;
                    uint iV;
                    uint sizeOffsets, nBytes;

                    sizeOffsets = pBatch[b].nVal * sizeof(uint16_t);

                    // Copy the value offsets.
                    myCopy(
                        pIndexBlock + loc, pBlock[column] + pBatch[b].oVal,
                        sizeOffsets);

                    loc = (uint16_t) (loc + sizeOffsets);

                    // Determine the bit vector widths.
                    iV = bitVecWidth(calcWidth(pBatch[b].nVal), w);

                    // The bit vectors follow the value offsets.
                    pBit = pBlock[column] + pBatch[b].oVal + sizeOffsets;

                    // Compute the size of the bit vectors, then copy them.
                    nBytes = bitVecPtr(pBatch[b].nRow, iV, w, p, pBit);

                    myCopy(pIndexBlock + loc, pBit, nBytes);

                    loc = (uint16_t)(loc + nBytes);

                } else if (pBatch[b].mode == LCS_VARIABLE) {
                    uint sizeOffsets;

                    sizeOffsets = pBatch[b].nRow * sizeof(uint16_t);

                    // Copy the value offsets.
                    myCopy(
                        pIndexBlock + loc, pBlock[column] + pBatch[b].oVal,
                        sizeOffsets);

                    loc = (uint16_t) (loc + sizeOffsets);
                } else {
                    // Fixed mode: copy the values themselves.
                    uint sizeFixed;

                    sizeFixed = pBatch[b].nRow * pBatch[b].recSize;
                    myCopy(
                        pIndexBlock + loc, pBlock[column] + pBatch[b].oVal,
                        sizeFixed);

                    loc = (uint16_t) (loc + sizeFixed);
                }

                // Record the batch's new location on the page.
                pBatch[b].oVal = batchStart;
            }
        }

        // Update the batch count and batch directory offset in the header.
        pHdr->nBatch = nClusterCols * numBatches;
        pHdr->oBatch = loc;

        // Copy the batch directories, interleaved by column.
        for (b = 0; b < numBatches; b++) {
            for (column = 0; column < nClusterCols; column++) {
                pBatch = (PLcsBatchDir)(pBlock[column] + batchOffset[column]);
                myCopy(pIndexBlock + loc, &pBatch[b], sizeof(LcsBatchDir));
                loc += sizeof(LcsBatchDir);
            }
        }

        if (isTracingLevel(TRACE_FINE)) {
            FENNEL_TRACE(
                TRACE_FINE, "Calling ClusterDump from moveFromTempToIndex");
            clusterDump->dump(opaqueToInt(clusterPageId), pHdr, szBlock);
        }
    }
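
The first loop of the listing packs the columns' value areas backward from the end of the page and derives `firstVal`, `lastVal`, and `delta` as it goes. The arithmetic alone can be sketched as follows; `ValueAreaSketch` and `packValueAreas` are hypothetical names, and the sketch assumes each temporary block holds its values in the range from its last-value offset up to `szBlock`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type holding the per-column offsets on the page.
struct ValueAreaSketch {
    std::vector<uint16_t> firstVal;
    std::vector<uint16_t> lastVal;
    std::vector<uint16_t> delta;
};

// tempLastVal[col] is the offset of the last value in column col's temporary
// block; that block's values are assumed to occupy [tempLastVal[col], szBlock).
inline ValueAreaSketch packValueAreas(
    const std::vector<uint16_t> &tempLastVal, uint16_t szBlock)
{
    ValueAreaSketch out;
    uint16_t offset = szBlock;
    for (std::size_t col = 0; col < tempLastVal.size(); col++) {
        assert(tempLastVal[col] <= szBlock);
        const uint16_t sz = static_cast<uint16_t>(szBlock - tempLastVal[col]);
        assert(sz <= offset);
        out.delta.push_back(static_cast<uint16_t>(szBlock - offset));
        out.firstVal.push_back(offset);
        offset = static_cast<uint16_t>(offset - sz);
        out.lastVal.push_back(offset);
    }
    return out;
}
```

For a 1000-byte page with two columns holding 100 and 50 bytes of values, column 0 occupies offsets 900 to 1000 with a delta of 0, and column 1 occupies 850 to 900 with a delta of 100.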

void LcsClusterNodeWriter::allocArrays ( ) [private]

Allocates temporary arrays used during cluster writes.

Definition at line 1205 of file LcsClusterNodeWriter.cpp.

References SegNodeLock< Node >::allocatePage(), arraysAllocated, attrAccessors, batchCount, batchDirs, batchOffset, bForceMode, bufferLock, colTupleDesc, forceModeCount, SegPageLock::getPage(), CachePage::getWritableData(), maxValueSize, nBits, LcsClusterAccessBase::nClusterCols, nextWidthChange, oValBank, pValBank, SegPageLock::unlock(), and valBankStart.

Referenced by init().

    {
        // Allocate the arrays only on the first call.
        if (!arraysAllocated) {
            arraysAllocated = true;

            batchDirs.reset(new LcsBatchDir[nClusterCols]);

            pValBank.reset(new PBuffer[nClusterCols]);

            attrAccessors.reset(new UnalignedAttributeAccessor[nClusterCols]);

            for (uint col = 0; col < nClusterCols; col++) {
                // One scratch page per column serves as its value bank.
                bufferLock.allocatePage();
                pValBank[col] = bufferLock.getPage().getWritableData();
                bufferLock.unlock();

                attrAccessors[col].compute(colTupleDesc[col]);
            }

            valBankStart.reset(new uint16_t[nClusterCols]);

            forceModeCount.reset(new uint[nClusterCols]);

            bForceMode.reset(new ForceMode[nClusterCols]);

            oValBank.reset(new uint16_t[nClusterCols]);

            batchOffset.reset(new uint16_t[nClusterCols]);

            batchCount.reset(new uint[nClusterCols]);

            nBits.reset(new uint[nClusterCols]);

            nextWidthChange.reset(new uint[nClusterCols]);

            maxValueSize.reset(new uint[nClusterCols]);
        }

        // Reset the per-column state on every call.
        memset(valBankStart.get(), 0, nClusterCols * sizeof(uint16_t));
        memset(forceModeCount.get(), 0, nClusterCols * sizeof(uint));
        memset(bForceMode.get(), 0, nClusterCols * sizeof(ForceMode));
        memset(oValBank.get(), 0, nClusterCols * sizeof(uint16_t));
        memset(batchOffset.get(), 0, nClusterCols * sizeof(uint16_t));
        memset(batchCount.get(), 0, nClusterCols * sizeof(uint));
        memset(nBits.get(), 0, nClusterCols * sizeof(uint));
        memset(nextWidthChange.get(), 0, nClusterCols * sizeof(uint));
        memset(maxValueSize.get(), 0, nClusterCols * sizeof(uint));
    }

uint32_t LcsClusterNodeWriter::round8Boundary ( uint32_t val ) [inline, private]

Rounds a 32-bit value to a boundary of 8.

uint32_t LcsClusterNodeWriter::roundIf8Boundary ( uint32_t val ) [inline, private]

Rounds a 32-bit value to a boundary of 8 if it is > 8.

Definition at line 243 of file LcsClusterNodeWriter.h.

    {
        if (val > 8) {
            return round8Boundary(val);
        }
        return val;
    }
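
The body of `round8Boundary()` is not shown on this page. The sketch below assumes it rounds down to a multiple of 8 by clearing the low three bits, which is the mask `putFixedVarBatch()` applies to `nRow` (`& 0xfffffff8`). The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: "rounding to a boundary of 8" means rounding DOWN to the
// nearest multiple of 8 by clearing the low three bits.
inline uint32_t roundDownTo8(uint32_t val)
{
    return val & 0xfffffff8u;
}

// Values of 8 or less are returned unchanged; larger values are rounded down.
inline uint32_t roundDownTo8IfAbove8(uint32_t val)
{
    return (val > 8) ? roundDownTo8(val) : val;
}
```

Under that assumption, 13 rounds to 8 and 23 rounds to 16, while the conditional variant leaves 7 and 8 untouched.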

bool LcsClusterNodeWriter::getLastClusterPageForWrite ( PLcsClusterNode & pBlock, LcsRid & firstRid )

Gets the last cluster page.

Parameters:

pBlock output param returning the cluster page
firstRid output param returning first rid stored on cluster page

Returns:

true if cluster is non-empty

Definition at line 101 of file LcsClusterNodeWriter.cpp.

References LcsClusterAccessBase::bTreeTupleData, bTreeWriter, clusterDump, LcsClusterAccessBase::clusterLock, LcsClusterAccessBase::clusterPageId, SegNodeLock< Node >::getNodeForWrite(), TraceSource::isTracingLevel(), SegPageLock::lockExclusive(), opaqueToInt(), pBlock, LcsClusterAccessBase::readClusterPageId(), szBlock, and TRACE_FINE.

PLcsClusterNode LcsClusterNodeWriter::allocateClusterPage ( LcsRid firstRid )

Allocates a new cluster page.

Parameters:

firstRid first rid to be stored on cluster page

Returns:

page allocated

Definition at line 132 of file LcsClusterNodeWriter.cpp.

References SegNodeLock< Node >::allocatePage(), LcsClusterAccessBase::bTreeRid, LcsClusterAccessBase::bTreeTupleData, bTreeWriter, LcsClusterAccessBase::clusterLock, LcsClusterAccessBase::clusterPageId, DUP_FAIL, SegPageLock::flushPage(), SegNodeLock< Node >::getNodeForWrite(), SegPageLock::getPageId(), SegPageLock::isLocked(), NULL_PAGE_ID, SegmentAccessor::pSegment, and LcsClusterAccessBase::segmentAccessor.

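void LcsClusterNodeWriter::init ( uint nColumns, PBuffer indexBlock, PBuffer * pBlock, uint szBlock )

Initializes object with parameters relevant to the cluster page that will be written.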

Parameters:

nColumns number of columns in the cluster
indexBlock pointer to the cluster page to be written
pBlock array of pointers to temporary pages to be used while writing this cluster page
szBlock size of cluster page, reflecting max amount of space available to write cluster data

Definition at line 160 of file LcsClusterNodeWriter.cpp.

References allocArrays(), getClusterSubHeaderSize(), hdrSize, LcsMaxLeftOver, minSzLeft, LcsClusterAccessBase::nClusterCols, pBlock, pHdr, pIndexBlock, LcsClusterAccessBase::setHdrOffsets(), and szBlock.

void LcsClusterNodeWriter::close ( )

Definition at line 78 of file LcsClusterNodeWriter.cpp.

References attrAccessors, batchCount, batchDirs, batchOffset, bForceMode, bTreeWriter, LcsClusterAccessBase::clusterLock, SegPageLock::flushPage(), forceModeCount, SegPageLock::isLocked(), maxValueSize, nBits, nextWidthChange, oValBank, pValBank, SegPageLock::unlock(), and valBankStart.

Referenced by ~LcsClusterNodeWriter().

void LcsClusterNodeWriter::openNew ( LcsRid startRID )

Prepares a cluster page as a new one.

Parameters:

startRID first RID on the page

Definition at line 182 of file LcsClusterNodeWriter.cpp.

References batchCount, batchDirs, batchOffset, LcsClusterAccessBase::delta, LcsClusterNode::firstRID, LcsClusterAccessBase::firstVal, hdrSize, LcsClusterAccessBase::lastVal, LCS_COMPRESSED, max(), LcsClusterNode::nBatch, nBits, LcsClusterAccessBase::nClusterCols, LcsClusterNode::nColumn, nextWidthChange, LcsClusterAccessBase::nVal, LcsClusterNode::oBatch, pHdr, szBlock, and szLeft.

bool LcsClusterNodeWriter::openAppend ( uint * nValOffsets, uint16_t * lastValOffsets, RecordNum & nrows )

Prepares an existing cluster page for appending new data, and determines whether the page is already full and cannot accommodate any more data.

Parameters:

nValOffsets pointer to output array reflecting the number of values currently in each column on this page
lastValOffsets pointer to output array reflecting the offset of the last value currently on the page for each cluster column
nrows returns number of rows currently on page

Returns:

true if the page is already full

Definition at line 219 of file LcsClusterNodeWriter.cpp.

References batchCount, batchDirs, LcsClusterAccessBase::lastVal, LCS_COMPRESSED, max(), moveFromIndexToTemp(), LcsClusterNode::nBatch, nBits, LcsClusterAccessBase::nClusterCols, nextWidthChange, LcsClusterAccessBase::nVal, LcsClusterNode::oBatch, oValBank, pHdr, and szLeft.

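void LcsClusterNodeWriter::describeLastBatch ( uint column, uint & dRow, uint & recSize )

Returns parameters describing the last batch for a given column.

uint16_t LcsClusterNodeWriter::getNextVal ( uint column, uint16_t thisVal )

Returns the offset of the next value in a batch.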

Parameters:

column column we want the value for
thisVal offset of the value currently positioned at

Definition at line 264 of file LcsClusterNodeWriter.cpp.

References attrAccessors, pBlock, and szBlock.

    {
        if (thisVal && thisVal != szBlock) {
            return
                (uint16_t) (thisVal +
                    attrAccessors[column].getStoredByteCount(
                        pBlock[column] + thisVal));
        } else {
            return 0;
        }
    }
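
A caller can walk a column's values by calling `getNextVal()` repeatedly, stopping when the offset is 0 or reaches the end of the block. The following sketch shows that walk under a made-up value encoding: one length byte followed by the data bytes. The real stored length comes from `UnalignedAttributeAccessor::getStoredByteCount()`, whose encoding is not described here, so `storedByteCount`, `nextValOffset`, and `countValues` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical encoding: one length byte followed by that many data bytes.
inline uint16_t storedByteCount(const uint8_t *pVal)
{
    return static_cast<uint16_t>(1 + pVal[0]);
}

// Same guard as getNextVal(): offset 0 and the end-of-block offset have no
// successor, so 0 is returned for both.
inline uint16_t nextValOffset(const std::vector<uint8_t> &block, uint16_t thisVal)
{
    assert(thisVal <= block.size());
    const uint16_t szBlock = static_cast<uint16_t>(block.size());
    if (thisVal && thisVal != szBlock) {
        return static_cast<uint16_t>(
            thisVal + storedByteCount(&block[thisVal]));
    }
    return 0;
}

// Counts the values stored from firstOffset up to the end of the block.
inline unsigned countValues(const std::vector<uint8_t> &block, uint16_t firstOffset)
{
    const uint16_t szBlock = static_cast<uint16_t>(block.size());
    unsigned n = 0;
    for (uint16_t off = firstOffset; off && off != szBlock;
         off = nextValOffset(block, off))
    {
        n++;
    }
    return n;
}
```

In a 16-byte block with two 3-byte values starting at offsets 10 and 13, the walk visits 10, then 13, and stops when the next offset equals the block size.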

void LcsClusterNodeWriter::rollBackLastBatch ( uint column, PBuffer pVal )

Rolls back the last 8 values (or fewer) from a batch.

Parameters:

column column to be rolled back
pVal buffer where the rolled back values will be copied; the buffer is assumed to be fixedRec * (nRows % 8) in size, as determined by the last call to describeLastBatch

Definition at line 276 of file LcsClusterNodeWriter.cpp.

References attrAccessors, batchCount, batchDirs, batchOffset, bitVecPtr(), bitVecWidth(), calcWidth(), LcsClusterAccessBase::lastVal, LCS_COMPRESSED, LCS_FIXED, LcsMaxRollBack, max(), nBits, nextWidthChange, LcsClusterAccessBase::nVal, pBlock, readBitVecs(), and szLeft.

    {
        uint i;
        PLcsBatchDir pBatch;
        uint16_t *pValOffsets;

        uint8_t *pBit;
        WidthVec w;
        PtrVec p;
        uint iV;

        uint16_t rows[LcsMaxRollBack];
        int origSzLeft;
        uint len;

        // Load the directory of the last batch.
        pBatch = (PLcsBatchDir)(pBlock[column] + batchOffset[column]);
        batchDirs[column] = pBatch[batchCount[column] -1];

        // Compute the space left before the rollback.
        origSzLeft = lastVal[column] - batchOffset[column] -
            (batchCount[column]+2)*sizeof(LcsBatchDir);

        if ((batchDirs[column].nRow > 8) || (batchDirs[column].nRow % 8) == 0) {
            return;
        }

        if (batchDirs[column].mode == LCS_COMPRESSED) {
            // Determine the bit vector widths.
            iV = bitVecWidth(calcWidth(batchDirs[column].nVal), w);

            // The bit vectors follow the value offsets.
            pBit = pBlock[column] + batchDirs[column].oVal +
                batchDirs[column].nVal * sizeof(uint16_t);

            // Locate each bit vector.
            bitVecPtr(batchDirs[column].nRow, iV, w, p, pBit);

            // Read the row codes.
            readBitVecs(rows, iV, w, p, 0, batchDirs[column].nRow);

            pValOffsets = (uint16_t *)(pBlock[column] + batchDirs[column].oVal);

            // Copy each row's value into the output buffer.
            for (i = 0; i < batchDirs[column].nRow;
                i++, pBuf += batchDirs[column].recSize)
            {
                len =
                    attrAccessors[column].getStoredByteCount(
                        pBlock[column] + pValOffsets[rows[i]]);
                memcpy(pBuf, pBlock[column] + pValOffsets[rows[i]], len);
            }

        } else if (batchDirs[column].mode == LCS_FIXED) {
            // Fixed mode: the batch area holds the values themselves.
            memcpy(
                pBuf,
                pBlock[column] + batchDirs[column].oVal,
                batchDirs[column].nRow * batchDirs[column].recSize);
        } else {
            // Variable mode: the batch area holds one value offset per row.
            pValOffsets = (uint16_t *)(pBlock[column] + batchDirs[column].oVal);

            for (i = 0; i < batchDirs[column].nRow;
                i++, pBuf += batchDirs[column].recSize)
            {
                len =
                    attrAccessors[column].getStoredByteCount(
                        pBlock[column] + pValOffsets[i]);
                memcpy(pBuf, pBlock[column] + pValOffsets[i], len);
            }
        }

        // Drop the last batch; the directories move to where it started.
        batchCount[column]--;
        batchOffset[column] = batchDirs[column].oVal;

        memmove(
            pBlock[column] + batchOffset[column],
            pBatch,
            batchCount[column] * sizeof(LcsBatchDir));

        // Adjust the space left by the space the rollback freed.
        int newSz;
        newSz = lastVal[column] - batchOffset[column] -
            (batchCount[column] + 2) * sizeof(LcsBatchDir);
        szLeft += (newSz - origSzLeft);
        szLeft = std::max(szLeft, 0);
        assert(szLeft >= 0);

        // Reset the bit width tracking.
        nBits[column] = 0;
        nextWidthChange[column] = 1;

        // Reset the batch directory for the next batch.
        batchDirs[column].mode = LCS_COMPRESSED;
        batchDirs[column].nVal = 0;
        batchDirs[column].nRow = 0;
        batchDirs[column].oVal = 0;
        batchDirs[column].recSize = 0;
    }

bool LcsClusterNodeWriter::noCompressMode ( uint column ) const [inline]

Returns true if the batch is not being forced to compress mode.

Parameters:

column column being described

Definition at line 361 of file LcsClusterNodeWriter.h.

References fixed, and variable.

PBuffer LcsClusterNodeWriter::getOffsetPtr ( uint column, uint16_t offset )

Translates an offset for a column to the pointer to the actual value.

Parameters:

column offset corresponds to this column
offset offset to be translated

Returns:

pointer to value

Definition at line 376 of file LcsClusterNodeWriter.h.

    {
        return pBlock[column] + offset;
    };

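bool LcsClusterNodeWriter::addValue ( uint column, bool bFirstTimeInBatch )

Adds a value to the page, in the case where the value already exists in the column.

bool LcsClusterNodeWriter::addValue ( uint column, PBuffer pVal, uint16_t * oVal )

Adds a new value to the page.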

In the case of compressed or variable mode, adds the value to the bottom of the page. In the case of fixed mode, adds the value to the "value bank".

Parameters:

column column corresponding to the value being added
pVal value to be added
oVal returns the offset where the value has been added

Returns:

true if there is enough room in the page for the value

Definition at line 422 of file LcsClusterNodeWriter.cpp.

References attrAccessors, batchDirs, bForceMode, calcWidth(), fixed, LcsClusterAccessBase::lastVal, LcsMaxSzLeftError, maxValueSize, nBits, LcsClusterAccessBase::nClusterCols, nextWidthChange, LcsClusterAccessBase::nVal, pBlock, pValBank, and szLeft.

void LcsClusterNodeWriter::undoValue ( uint column, PBuffer pVal, bool bFirstInBatch )

Undoes the last value added to the current batch for a column.

void LcsClusterNodeWriter::putCompressedBatch ( uint column, PBuffer pRows, PBuffer pBuf )

Writes a compressed mode batch into the temporary cluster page for a column.

Only a multiple of 8 rows is written, if this is not the last batch in the cluster.

Excess rows are written into a temporary buffer. If this is the last batch in the load, then it is ok to have < 8 rows, as the next load will roll it back to fill it up with more rows.

Note that it is assumed that the caller has already copied the key offsets for this batch into the cluster page. This call will only copy the bit vectors and batch directory corresponding to this batch.

Parameters:

column column corresponding to the batch
pRows array mapping rows to key offsets
pBuf temporary buffer where excess row values will be copied; assumed to be (nRow % 8)*fixedRec big

Definition at line 535 of file LcsClusterNodeWriter.cpp.

References attrAccessors, batchCount, batchDirs, batchOffset, bitVecPtr(), bitVecWidth(), LcsClusterAccessBase::lastVal, LCS_COMPRESSED, nBits, nByte, nextWidthChange, LcsClusterAccessBase::nVal, pBlock, round8Boundary(), and setBits().

00537 { 00538 uint i, j, b; 00539 uint iRow; 00540 uint nByte; 00541 uint8_t *pBit; 00542 uint16_t *pOffs; 00543 PLcsBatchDir pBatch; 00544 00545 WidthVec w;
00546 PtrVec p;
00547 uint iV;
00548 00549
00550
00551
00552
00553
00554
00555
00556 00557
00558
    // Copy the values of the rows past the last multiple-of-8 boundary
    // into pBuf, then drop those rows from this batch.
    if (batchDirs[column].nRow > 8) {
        uint len;
        pOffs = (uint16_t *)(pBlock[column] + batchDirs[column].oVal);
        for (i = round8Boundary((uint32_t) batchDirs[column].nRow);
            i < batchDirs[column].nRow; i++, pBuf += batchDirs[column].recSize)
        {
            iRow = ((uint16_t *) pRows)[i];
            len =
                attrAccessors[column].getStoredByteCount(
                    pBlock[column] + pOffs[iRow]);
            memcpy(pBuf, pBlock[column] + pOffs[iRow], len);
        }
        batchDirs[column].nRow =
            round8Boundary((uint32_t) batchDirs[column].nRow);
    }

    // Split the code width into the widths of the individual bit vectors.
    iV = bitVecWidth(nBits[column], w);

    // The bit vectors start right after the nVal 16-bit value offsets.
    pBit = pBlock[column] + batchDirs[column].oVal +
        batchDirs[column].nVal*sizeof(uint16_t);

    // Clear the bit-vector area, then fill one vector per width; b counts
    // the code bits already stored by earlier vectors.
    nByte = bitVecPtr(batchDirs[column].nRow, iV, w, p, pBit);
    memset(pBit, 0, nByte);

    for (j = 0, b = 0; j < iV ; j++) {
        switch (w[j]) {
        case 16:
            memcpy(p[j], pRows, batchDirs[column].nRow * sizeof(uint16_t));
            break;

        case 8:
            for (i = 0; i < batchDirs[column].nRow ; i++) {
                (p[j])[i] = (uint8_t)((uint16_t *) pRows)[i];
            }
            break;

        case 4:
            for (i = 0; i < batchDirs[column].nRow ; i++) {
                setBits(
                    p[j] + i / 2 ,
                    4,
                    (i % 2) * 4,
                    (uint16_t)(((uint16_t *) pRows)[i] >> b));
            }
            break;

        case 2:
            for (i = 0; i < batchDirs[column].nRow ; i++) {
                setBits(
                    p[j] + i / 4 ,
                    2,
                    (i % 4) * 2,
                    (uint16_t)(((uint16_t *) pRows)[i] >> b));
            }
            break;

        case 1:
            for (i = 0; i < batchDirs[column].nRow ; i++) {
                setBits(
                    p[j] + i / 8 ,
                    1,
                    (i % 8),
                    (uint16_t)(((uint16_t *)pRows)[i] >> b));
            }
            break;

        default:
            ;
        }
        b += w[j];
    }

    // Append the finished batch directory to the page.
    pBatch = (PLcsBatchDir)(pBlock[column] + batchOffset[column]);
    pBatch[batchCount[column]] = batchDirs[column];
    batchCount[column]++;

    // Reset the working batch directory for the next batch.
    batchDirs[column].mode = LCS_COMPRESSED;
    batchDirs[column].oLastValHighMark = lastVal[column];
    batchDirs[column].nValHighMark = nVal[column];
    batchDirs[column].nVal = 0;
    batchDirs[column].oVal = batchOffset[column];
    batchDirs[column].nRow = 0;

    nBits[column] = 0;
    nextWidthChange[column] = 1 ;
    }
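
The packing loop in the listing above stores each row's value code as a set of power-of-two-wide bit vectors. The following is a minimal, self-contained sketch of that idea, not the Fennel implementation. `splitWidths`, `setBitsAt`, and `packSlice` are hypothetical helpers, and two details are assumptions inferred from the `switch` on `w[j]`: that a code width is decomposed into vectors of 16, 8, 4, 2, and 1 bits (widest first), and that the third argument of `setBits` is a bit offset counted from the least significant bit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: split a code width (in bits) into power-of-two
// vector widths, widest first, e.g. 5 -> {4, 1}.
std::vector<unsigned> splitWidths(unsigned nBits)
{
    std::vector<unsigned> widths;
    for (unsigned w = 16; w >= 1; w /= 2) {
        if (nBits & w) {
            widths.push_back(w);
        }
    }
    return widths;
}

// Hypothetical helper: OR the low `width` bits of `value` into `*byte`,
// starting at bit position `shift` (counted from the least significant bit).
void setBitsAt(uint8_t *byte, unsigned width, unsigned shift, uint16_t value)
{
    uint8_t mask = static_cast<uint8_t>((1u << width) - 1);
    *byte |= static_cast<uint8_t>((value & mask) << shift);
}

// Pack one sub-byte-wide slice of every row code into a fresh bit vector.
// `b` is the number of low-order code bits already stored by earlier vectors.
std::vector<uint8_t> packSlice(
    const std::vector<uint16_t> &codes, unsigned width, unsigned b)
{
    assert(width == 1 || width == 2 || width == 4);
    unsigned perByte = 8 / width;
    std::vector<uint8_t> out((codes.size() + perByte - 1) / perByte, 0);
    for (std::size_t i = 0; i < codes.size(); i++) {
        setBitsAt(
            &out[i / perByte],
            width,
            static_cast<unsigned>(i % perByte) * width,
            static_cast<uint16_t>(codes[i] >> b));
    }
    return out;
}
```

Under these assumptions, a 3-bit code is stored as a 2-bit vector holding the low two bits of every row followed by a 1-bit vector holding the remaining bit, which would be obtained with `packSlice(codes, 2, 0)` and `packSlice(codes, 1, 2)`.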

Writes a fixed or variable mode batch into a temporary cluster page for a column.

Unless this is the last batch, only a multiple of 8 rows is written; the excess rows are copied into a temporary buffer.

If this is the last batch in the load, it may hold fewer than 8 rows, because the next load rolls it back and fills it up with more rows.

In the variable mode case, the key offsets are written to the batch area on the page. In the fixed mode case, the values themselves are written to the batch area. In both cases, the batch directory is also written out.
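
As a rough illustration of the row split described above, the sketch below computes how many rows go to the page and how many are held back, then copies the held-back fixed-size values into a temporary buffer. It is not the Fennel code: `BatchSplit`, `splitBatch`, and `copyLeftover` are hypothetical names, and values are modeled as a flat byte array with one `recSize`-byte slot per row.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical result type: rows written to the page and rows held back.
struct BatchSplit {
    uint32_t written;
    uint32_t leftover;
};

// Batches of more than 8 rows are truncated to a multiple of 8;
// batches of 8 rows or fewer are written whole.
BatchSplit splitBatch(uint32_t nRow)
{
    uint32_t written = (nRow > 8) ? (nRow & ~7u) : nRow;
    return BatchSplit{written, nRow - written};
}

// Hypothetical fixed-mode copy: the values of the held-back rows are copied
// into a temporary buffer, one recSize-byte slot per row.
std::vector<uint8_t> copyLeftover(
    const std::vector<uint8_t> &values, uint32_t nRow, uint32_t recSize)
{
    assert(values.size() >= static_cast<std::size_t>(nRow) * recSize);
    BatchSplit s = splitBatch(nRow);
    std::vector<uint8_t> buf(static_cast<std::size_t>(s.leftover) * recSize);
    if (!buf.empty()) {
        std::memcpy(
            buf.data(),
            values.data() + static_cast<std::size_t>(s.written) * recSize,
            buf.size());
    }
    return buf;
}
```

For a batch of 21 rows, `splitBatch` yields 16 written rows and 5 held back, which matches the `(nRow % 8)` sizing assumed for the temporary buffer.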

Parameters:

column column corresponding to the batch
pRows array of offsets to values
pBuf temporary buffer that receives the values of the excess rows; assumed to be (nRow % 8) * fixedRec bytes in size

Definition at line 653 of file LcsClusterNodeWriter.cpp.

References attrAccessors, batchCount, batchDirs, batchOffset, bForceMode, fixed, forceModeCount, LcsClusterAccessBase::lastVal, LCS_COMPRESSED, LCS_FIXED, LCS_VARIABLE, maxValueSize, nBits, nextWidthChange, none, LcsClusterAccessBase::nVal, oValBank, pBlock, pValBank, valBankStart, valueSource(), and variable.

    {
        uint i;
        uint batchRows;
        PBuffer pVal;
        PLcsBatchDir pBatch;
        PBuffer src;
        uint batchRecSize;
        uint16_t localLastVal;
        uint16_t localoValBank;
        PBuffer localpValBank, localpBlock;

        // Write whole groups of 8 rows only; a batch of 8 rows or fewer
        // is written as is.
        batchRows = (batchDirs[column].nRow > 8)
            ? batchDirs[column].nRow & 0xfffffff8 : batchDirs[column].nRow;

        pVal = pBlock[column] + batchDirs[column].oVal;
        if (batchDirs[column].mode == LCS_VARIABLE) {
            // Variable mode: store the 16-bit value offsets themselves.
            memcpy(pVal, pRows, batchRows * sizeof(uint16_t));
        } else {
            // Fixed mode: store the values, one recSize-wide slot per row.
            assert(batchDirs[column].mode == LCS_FIXED);

            batchRecSize = batchDirs[column].recSize;
            localLastVal = lastVal[column];
            localpValBank = pValBank[column] + valBankStart[column];
            localoValBank = oValBank[column];
            localpBlock = pBlock[column];

            for (i = 0; i < batchRows; i++) {
                src = valueSource(
                    localLastVal, localpValBank, localoValBank,
                    localpBlock, pRows[i]);
                uint len = attrAccessors[column].getStoredByteCount(src);
                memcpy(pVal, src, len);
                pVal += batchRecSize;
            }
        }

        // Stop forcing a mode once forceModeCount exceeds 20.
        if (bForceMode[column] != none) {
            if (forceModeCount[column] > 20) {
                bForceMode[column] = none;
                forceModeCount[column] = 0;
            }
        }

        batchRecSize = batchDirs[column].recSize;
        localLastVal = lastVal[column];
        localpValBank = pValBank[column] + valBankStart[column];
        localoValBank = oValBank[column];
        localpBlock = pBlock[column];

        // Copy the values of the excess rows into the temporary buffer.
        pVal = pBuf;
        for (i = batchRows; i < batchDirs[column].nRow; i++) {
            src = valueSource(
                localLastVal, localpValBank, localoValBank,
                localpBlock, pRows[i]);
            uint len = attrAccessors[column].getStoredByteCount(src);
            memcpy(pVal, src, len);
            pVal += batchRecSize;
        }

        if (pValBank[column]) {
            oValBank[column] = 0;
        }

        // Record the number of rows actually written and append the
        // batch directory to the page.
        batchDirs[column].nRow = batchRows;
        pBatch = (PLcsBatchDir)(pBlock[column] + batchOffset[column]);
        pBatch[batchCount[column]] = batchDirs[column];

        batchCount[column]++;

        // Start the next batch in the forced mode, if any; otherwise
        // default to compressed mode.
        switch (bForceMode[column]) {
        case none:
            batchDirs[column].mode = LCS_COMPRESSED;
            break;
        case fixed:
            batchDirs[column].mode = LCS_FIXED;
            break;
        case variable:
            batchDirs[column].mode = LCS_VARIABLE;
            break;
        default:
            assert(false);
        }
        batchDirs[column].oLastValHighMark = lastVal[column];
        batchDirs[column].nValHighMark = nVal[column];
        batchDirs[column].nVal = 0;
        batchDirs[column].oVal = batchOffset[column];
        batchDirs[column].nRow = 0;

        nBits[column] = 0;
        nextWidthChange[column] = 1 ;

        maxValueSize[column] = 0;
    }

Determines which compression mode to use for a batch.

Parameters:

column column for which compression mode is being determined
fixedSize size of record in the case of fixed size compression
nRows number of rows in the batch
pValOffset returns a pointer to the offset of the start of the batch
compressionMode returns the chosen compression mode

Definition at line 776 of file LcsClusterNodeWriter.cpp.

References batchCount, batchDirs, batchOffset, bForceMode, bitVecWidth(), fixed, forceModeCount, LcsClusterAccessBase::lastVal, LCS_FIXED, LCS_VARIABLE, LcsMaxLeftOver, LcsMaxSzLeftError, max(), min(), nBits, nByte, LcsClusterAccessBase::nVal, oValBank, pBlock, pValBank, sizeofBitVec(), szLeft, valBankStart, and variable.

bool LcsClusterNodeWriter::isEndOfBlock ( ) [inline]

void LcsClusterNodeWriter::endBlock ( ) [inline]

Done with the current cluster page.

Moves all data from temporary pages into the real cluster page.

Definition at line 505 of file LcsClusterNodeWriter.h.

LcsRid LcsClusterAccessBase::readRid ( ) [protected, inherited]

PageId LcsClusterAccessBase::readClusterPageId ( ) [protected, inherited]

uint LcsClusterAccessBase::getNumClusterCols ( ) [inline, inherited]

void LcsClusterAccessBase::setNumClusterCols ( uint nCols ) [inline, inherited]

void LcsClusterAccessBase::unlockClusterPage ( ) [inherited]

void TraceSource::initTraceSource ( SharedTraceTarget pTraceTarget, std::string name ) [virtual, inherited]

void TraceSource::trace ( TraceLevel level, std::string message ) const [inherited]

bool TraceSource::isTracing ( ) const [inline, inherited]

bool TraceSource::isTracingLevel ( TraceLevel level ) const [inline, inherited]

TraceTarget& TraceSource::getTraceTarget ( ) const [inline, inherited]

std::string TraceSource::getTraceSourceName ( ) const [inline, inherited]

void TraceSource::setTraceSourceName ( std::string const & n ) [inline, inherited]

Sets the name of this source.

Useful to construct dynamic names for fine-grained filtering.

Definition at line 136 of file TraceSource.h.

    {
        name = n;
    }
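
As an illustration of building dynamic names for filtering, the sketch below derives one trace source name per cluster column. `NamedSource` is a minimal stand-in for the naming part of `TraceSource` only, and both `columnTraceName` and the `base.colN` naming scheme are hypothetical; nothing here is taken from Fennel beyond the setter and getter names.

```cpp
#include <sstream>
#include <string>

// Minimal stand-in for the naming part of TraceSource; the real class
// has more members and is not reproduced here.
class NamedSource
{
    std::string name;

public:
    void setTraceSourceName(std::string const &n)
    {
        name = n;
    }

    std::string getTraceSourceName() const
    {
        return name;
    }
};

// Hypothetical naming scheme: append the column number to a base name so
// that a trace filter can target a single column.
std::string columnTraceName(std::string const &base, unsigned column)
{
    std::ostringstream oss;
    oss << base << ".col" << column;
    return oss.str();
}
```

A caller could then write `source.setTraceSourceName(columnTraceName("cluster", 3))` to give that source the name `cluster.col3`.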

TraceLevel TraceSource::getMinimumTraceLevel ( ) const [inline, inherited]

void TraceSource::disableTracing ( ) [inherited]


Member Data Documentation

Array of pointers to temporary blocks, one block per cluster column.

Definition at line 83 of file LcsClusterNodeWriter.h.

Referenced by addValue(), describeLastBatch(), getLastClusterPageForWrite(), getNextVal(), init(), LcsClusterNodeWriter(), moveFromIndexToTemp(), moveFromTempToIndex(), pickCompressionMode(), putCompressedBatch(), putFixedVarBatch(), and rollBackLastBatch().

Batch directories for the batches currently being constructed, one per cluster column.

Definition at line 99 of file LcsClusterNodeWriter.h.

Referenced by addValue(), allocArrays(), close(), LcsClusterNodeWriter(), openAppend(), openNew(), pickCompressionMode(), putCompressedBatch(), putFixedVarBatch(), rollBackLastBatch(), and undoValue().

Offsets to the batch directories on the temporary pages, one per cluster column.

Definition at line 122 of file LcsClusterNodeWriter.h.

Referenced by allocArrays(), close(), describeLastBatch(), LcsClusterNodeWriter(), moveFromIndexToTemp(), moveFromTempToIndex(), openNew(), pickCompressionMode(), putCompressedBatch(), putFixedVarBatch(), and rollBackLastBatch().

Count of the number of batches in the temporary pages, one per cluster column.

Definition at line 128 of file LcsClusterNodeWriter.h.

Referenced by allocArrays(), close(), describeLastBatch(), LcsClusterNodeWriter(), moveFromIndexToTemp(), moveFromTempToIndex(), openAppend(), openNew(), pickCompressionMode(), putCompressedBatch(), putFixedVarBatch(), and rollBackLastBatch().

Number of bits required to store the value codes for each column in the cluster, for the batches currently being constructed.

Definition at line 139 of file LcsClusterNodeWriter.h.

Referenced by addValue(), allocArrays(), close(), LcsClusterNodeWriter(), openAppend(), openNew(), pickCompressionMode(), putCompressedBatch(), putFixedVarBatch(), rollBackLastBatch(), and undoValue().

Number of columns in cluster.

Definition at line 68 of file LcsClusterAccessBase.h.

Referenced by addValue(), allocArrays(), LcsClusterDump::dump(), LcsClusterReader::getNumRows(), init(), LcsClusterReader::initColumnReaders(), LcsClusterNodeWriter(), moveFromIndexToTemp(), moveFromTempToIndex(), openAppend(), openNew(), LcsClusterReader::positionInBlock(), and LcsClusterAccessBase::setHdrOffsets().

Offsets to the last value stored on the page for each column in cluster.

Definition at line 74 of file LcsClusterAccessBase.h.

Referenced by addValue(), LcsClusterDump::dump(), moveFromIndexToTemp(), moveFromTempToIndex(), openAppend(), openNew(), pickCompressionMode(), putCompressedBatch(), putFixedVarBatch(), rollBackLastBatch(), LcsClusterAccessBase::setHdrOffsets(), and undoValue().

Number of distinct values in the page for each column in cluster.

Definition at line 87 of file LcsClusterAccessBase.h.

Referenced by addValue(), LcsClusterDump::dump(), moveFromIndexToTemp(), moveFromTempToIndex(), openAppend(), openNew(), pickCompressionMode(), putCompressedBatch(), putFixedVarBatch(), rollBackLastBatch(), LcsClusterAccessBase::setHdrOffsets(), and undoValue().


The documentation for this class was generated from the following files:


Generated on Mon Jun 22 04:00:37 2009 for Fennel by doxygen 1.5.1