Fennel: UnalignedAttributeAccessor Class Reference
UnalignedAttributeAccessor is similar to AttributeAccessor, except that it provides a by-value access model intended for storing individual data values with maximum compression (hence unaligned), as opposed to the tuple-valued by-reference model of AttributeAccessor.
#include <[UnalignedAttributeAccessor.h](UnalignedAttributeAccessor%5F8h-source.html)>
| Public Member Functions | |
|---|---|
| UnalignedAttributeAccessor () | |
| UnalignedAttributeAccessor (TupleAttributeDescriptor const &attrDescriptor) | |
| Creates an accessor for the given attribute descriptor. | |
| void | compute (TupleAttributeDescriptor const &attrDescriptor) |
| Precomputes access for a descriptor. | |
| void | storeValue (TupleDatum const &datum, PBuffer pDataWithLen) const |
| Stores a value by itself, including length information, encoding it into the buffer passed in. | |
| void | loadValue (TupleDatum &datum, PConstBuffer pDataWithLen) const |
| Loads a value from a buffer containing data encoded via storeValue. | |
| TupleStorageByteLength | getStoredByteCount (PConstBuffer pDataWithLen) const |
| Gets the length information corresponding to the data stored in a buffer. | |
| TupleStorageByteLength | getMaxByteCount () const |
| Get the maximum number of bytes required to store any value of the given attribute. | |
| Private Member Functions | |
| void | compressInt64 (TupleDatum const &datum, PBuffer pDest) const |
| Compresses and stores an 8-byte integer by stripping off leading zeros. | |
| void | uncompressInt64 (TupleDatum &datum, PConstBuffer pDataWithLen) const |
| Uncompresses and loads an 8-byte integer, expanding it back to its original 8-byte value. | |
| bool | isInitialized () const |
| Private Attributes | |
| uint | cbStorage |
| bool | omitLengthIndicator |
| bool | isCompressedInt64 |
| Static Private Attributes | |
| static const TupleStorageByteLength | ONE_BYTE_MAX_LENGTH = 127 |
| static const TupleStorageByteLength | TWO_BYTE_MAX_LENGTH = 32767 |
| static const uint8_t | ONE_BYTE_LENGTH_MASK = 0x7f |
| static const uint16_t | TWO_BYTE_LENGTH_MASK1 = 0x7f00 |
| static const uint16_t | TWO_BYTE_LENGTH_MASK2 = 0x00ff |
| static const uint8_t | TWO_BYTE_LENGTH_BIT = 0x80 |
Detailed Description
UnalignedAttributeAccessor is similar to AttributeAccessor, except that it provides a by-value access model intended for storing individual data values with maximum compression (hence unaligned), as opposed to the tuple-valued by-reference model of AttributeAccessor.
Note:
Two methods, storeValue and loadValue, store and load TupleDatum to and from a preallocated buffer. The storage format is different from the marshalled format for a tuple (see TupleAccessor), since there's only one TupleDatum involved and there is no need to store the offset needed for "constant seek time". The storage format depends on the type of the data stored and for variable-width values is prefixed with leading bytes containing the length of the data.
If the data is an 8-byte integer (other than null), the leading zeroes in the data are stripped, and the length of the remaining bytes is stored in the first byte, followed by the data.
If the data is fixed-width and non-nullable, only the data itself is stored. We do not need to store the length of the data in this case because it is fixed and can be determined from the type descriptor corresponding to the data.
In all other cases, a length is encoded in the leading bytes of the buffer, based on the number of bytes in the data. The byte format of the buffer after storeValue is:
One length byte encodes value length from 0(0x0000) to 127(0x007f)
0xxxxxxx
-------- -------- -------- -------- -------- ...
|length | data value bytes
Two length bytes encode value length from 128(0x0080) to 32767(0x7fff)
1xxxxxxx xxxxxxxx
-------- -------- -------- -------- -------- ...
| length | data value bytes
where length (1 or 2 bytes) comes from TupleDatum.cbData (a 4-byte type) and data value bytes are copied from TupleDatum.pData. When storing NULL values, the one-byte length value of 0x00 is used; empty strings are special-cased as the two-byte length value of 0x8000 (because NULL values are much more common than empty strings).
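The length-prefix rules above can be sketched as a small stand-alone function. This is only an illustration under stated assumptions: `storeVarWidth` is a hypothetical helper, not part of Fennel, and it uses plain `uint8_t` pointers in place of PBuffer and TupleDatum. The literal masks mirror the values of ONE_BYTE_MAX_LENGTH, TWO_BYTE_MAX_LENGTH, TWO_BYTE_LENGTH_MASK1, TWO_BYTE_LENGTH_MASK2 and TWO_BYTE_LENGTH_BIT listed for this class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not Fennel API): writes the length prefix followed by
// the value bytes and returns the total number of bytes written.
// A null pData stands for a NULL value.
std::size_t storeVarWidth(const uint8_t *pData, uint16_t cbData, uint8_t *pDest)
{
    if (!pData) {
        pDest[0] = 0x00;                      // NULL: one-byte length 0x00
        return 1;
    }
    assert(cbData <= 32767);                  // two-byte maximum length
    uint8_t *p = pDest;
    if (cbData != 0 && cbData <= 127) {
        // One length byte: 0xxxxxxx
        *p++ = static_cast<uint8_t>(cbData);
    } else {
        // Two length bytes: 1xxxxxxx xxxxxxxx.
        // An empty value (cbData == 0) lands here and becomes 0x80 0x00.
        *p++ = static_cast<uint8_t>(((cbData & 0x7f00) >> 8) | 0x80);
        *p++ = static_cast<uint8_t>(cbData & 0x00ff);
    }
    std::memcpy(p, pData, cbData);
    return static_cast<std::size_t>(p - pDest) + cbData;
}
```

Routing the zero-length case through the two-byte branch is what keeps the single byte 0x00 free to mean NULL.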
TODO jvs 22-Oct-2006: unify this up at the TupleAccessor level as a new TUPLE_FORMAT_UNALIGNED.
Definition at line 81 of file UnalignedAttributeAccessor.h.
Constructor & Destructor Documentation
| UnalignedAttributeAccessor::UnalignedAttributeAccessor | ( | TupleAttributeDescriptor const & | attrDescriptor | ) | [explicit] |
|---|---|---|---|---|---|
Creates an accessor for the given attribute descriptor.
Parameters:
| attrDescriptor | descriptor for values which will be accessed |
|---|
Definition at line 34 of file UnalignedAttributeAccessor.cpp.
References compute().
{
    compute(attrDescriptor);
}
Member Function Documentation
| void UnalignedAttributeAccessor::compressInt64 | ( | TupleDatum const & | datum, |
|---|---|---|---|
| PBuffer | pDest | ||
| ) | const [inline, private] |
Compresses and stores an 8-byte integer by stripping off leading zeros.
The stored value includes a leading byte indicating the length of the data.
Parameters:
| [in] | datum | datum to compress |
|---|---|---|
| [in,out] | pDest | pointer to the buffer where the data will be stored |
Definition at line 62 of file UnalignedAttributeAccessor.cpp.
References TupleDatum::cbData, FixedBuffer, and TupleDatum::pData.
Referenced by storeValue().
{
    // Only 8-byte integers are compressed.
    assert(datum.cbData == 8);
    int64_t intVal = *reinterpret_cast<int64_t const *> (datum.pData);
    uint len;

    if (intVal >= 0) {
        // Non-negative value: fill tmpBuf from its end, least significant
        // byte first, until only leading zeros remain.
        FixedBuffer tmpBuf[8];
        PBuffer pTmpBuf = tmpBuf + 8;
        len = 0;
        do {
            *(--pTmpBuf) = intVal & 0xff;
            len++;
            intVal >>= 8;
        } while (intVal);

        // If the high bit of the leading byte is set, prepend a zero byte
        // so the value is not read back as negative.
        if (*pTmpBuf & 0x80) {
            *(--pTmpBuf) = 0;
            len++;
        }
        *pDest = static_cast<uint8_t>(len);
        memcpy(pDest + 1, pTmpBuf, len);
    } else {
        // Negative value: pick the smallest byte count whose
        // two's complement range still contains the value.
        if (intVal >= -(0x80)) {
            len = 1;
        } else if (intVal >= -(0x8000)) {
            len = 2;
        } else if (intVal >= -(0x800000)) {
            len = 3;
        } else if (intVal >= -(0x80000000LL)) {
            len = 4;
        } else if (intVal >= -(0x8000000000LL)) {
            len = 5;
        } else if (intVal >= -(0x800000000000LL)) {
            len = 6;
        } else if (intVal >= -(0x80000000000000LL)) {
            len = 7;
        } else {
            len = 8;
        }
        *pDest = static_cast<uint8_t>(len);
        // Write the low-order len bytes, most significant byte first.
        PBuffer pTmpBuf = pDest + 1 + len;
        while (len--) {
            *(--pTmpBuf) = intVal & 0xff;
            intVal >>= 8;
        }
    }
}
| void UnalignedAttributeAccessor::uncompressInt64 | ( | TupleDatum & | datum, |
|---|---|---|---|
| PConstBuffer | pDataWithLen | ||
| ) | const [inline, private] |
Uncompresses and loads an 8-byte integer, expanding it back to its original 8-byte value.
Parameters:
| [in,out] | datum | datum to receive decompression result; datum.pData must already point to an 8-byte buffer |
|---|---|---|
| [in] | pDataWithLen | data buffer to load from |
Definition at line 123 of file UnalignedAttributeAccessor.cpp.
References TupleDatum::cbData, and TupleDatum::pData.
Referenced by loadValue().
{
    // The first byte holds the number of data bytes that follow.
    uint len = *pDataWithLen;
    assert(len != 0);
    PConstBuffer pSrcBuf = pDataWithLen + 1;
    uint signByte = *(pSrcBuf++);

    // Sign-extend from the high bit of the first data byte.
    int64_t intVal =
        int64_t(signByte) | ((signByte & 0x80) ? 0xffffffffffffff00LL : 0);
    while (--len > 0) {
        intVal <<= 8;
        intVal |= *(pSrcBuf++);
    }
    datum.cbData = 8;

    // Copy the result into the buffer already referenced by datum.pData.
    memcpy(const_cast<PBuffer>(datum.pData), &intVal, 8);
}
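To make the compressed 8-byte integer layout concrete, here is a stand-alone round-trip sketch. It is an assumption-laden illustration rather than the Fennel implementation: `packInt64` and `unpackInt64` are hypothetical names, they work on plain `uint8_t` buffers instead of TupleDatum, and the length is found with a loop instead of the if-chain shown in the listings. The byte layout (one length byte, then the shortest two's complement form, most significant byte first) follows the description in this reference.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: stores v as [len][len data bytes] and returns the
// total number of bytes written (len + 1). pDest needs room for 9 bytes.
unsigned packInt64(int64_t v, uint8_t *pDest)
{
    // Find the smallest byte count whose signed range contains v.
    unsigned len = 1;
    while (len < 8) {
        int64_t limit = int64_t(1) << (8 * len - 1);
        if (v >= -limit && v < limit) {
            break;
        }
        ++len;
    }
    pDest[0] = static_cast<uint8_t>(len);
    uint64_t bits = static_cast<uint64_t>(v);
    for (unsigned i = 0; i < len; ++i) {
        // Most significant stored byte first.
        pDest[1 + i] = static_cast<uint8_t>(bits >> (8 * (len - 1 - i)));
    }
    return len + 1;
}

// Hypothetical helper: reverses packInt64 by sign-extending the stored bytes.
int64_t unpackInt64(const uint8_t *pSrc)
{
    unsigned len = pSrc[0];
    assert(len >= 1 && len <= 8);
    // Start from all ones if the first data byte has its high bit set.
    uint64_t bits = (pSrc[1] & 0x80) ? ~uint64_t(0) : 0;
    for (unsigned i = 0; i < len; ++i) {
        bits = (bits << 8) | pSrc[1 + i];
    }
    return static_cast<int64_t>(bits);
}
```

For example, 128 is stored as the length byte 0x02 followed by 0x00 0x80: the extra zero byte keeps the value from being read back as -128, which itself is stored as 0x01 0x80.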
| bool UnalignedAttributeAccessor::isInitialized | ( | | ) | const [private] |
|---|---|---|---|---|

| void UnalignedAttributeAccessor::compute | ( | TupleAttributeDescriptor const & | attrDescriptor | ) | |
|---|---|---|---|---|---|
Precomputes access for a descriptor.
Must be called before any other method (or invoked explicitly by non-default constructor).
Parameters:
| attrDescriptor | descriptor for values which will be accessed |
|---|
Definition at line 40 of file UnalignedAttributeAccessor.cpp.
References TupleAttributeDescriptor::cbStorage, cbStorage, StoredTypeDescriptor::getOrdinal(), isCompressedInt64, TupleAttributeDescriptor::isNullable, omitLengthIndicator, TupleAttributeDescriptor::pTypeDescriptor, STANDARD_TYPE_INT_64, STANDARD_TYPE_UINT_64, STANDARD_TYPE_UNICODE_VARCHAR, STANDARD_TYPE_VARBINARY, and STANDARD_TYPE_VARCHAR.
Referenced by LcsHash::init(), LcsRowScanExecStream::prepareResidualFilters(), and UnalignedAttributeAccessor().
| void UnalignedAttributeAccessor::storeValue | ( | TupleDatum const & | datum, |
|---|---|---|---|
| PBuffer | pDataWithLen | ||
| ) | const |
Stores a value by itself, including length information, encoding it into the buffer passed in.
The caller needs to allocate a buffer of sufficient size. To do this, use the getMaxByteCount() method.
Parameters:
| [in] | datum | value to be stored |
|---|---|---|
| [in,out] | pDataWithLen | data buffer to store to |
Definition at line 145 of file UnalignedAttributeAccessor.cpp.
References TupleDatum::cbData, compressInt64(), isCompressedInt64, isInitialized(), omitLengthIndicator, ONE_BYTE_MAX_LENGTH, TupleDatum::pData, TWO_BYTE_LENGTH_BIT, TWO_BYTE_LENGTH_MASK1, TWO_BYTE_LENGTH_MASK2, and TWO_BYTE_MAX_LENGTH.
Referenced by LcsHash::insert(), and LcsHash::undoInsert().
{
    assert(isInitialized());

    PBuffer tmpDataPtr = pDataWithLen;

    if (!datum.pData) {
        // NULL is stored as the one-byte length 0x00.
        *tmpDataPtr = 0x00;
    } else {
        // Lengths above TWO_BYTE_MAX_LENGTH cannot be encoded.
        assert(datum.cbData <= TWO_BYTE_MAX_LENGTH);

        if (isCompressedInt64) {
            // 8-byte integers use their own length-prefixed encoding.
            compressInt64(datum, tmpDataPtr);
        } else {
            // Fixed-width, non-nullable values carry no length prefix.
            if (!omitLengthIndicator) {
                if (datum.cbData && (datum.cbData <= ONE_BYTE_MAX_LENGTH)) {
                    // One length byte: 0xxxxxxx
                    *tmpDataPtr = static_cast<uint8_t>(datum.cbData);
                    tmpDataPtr++;
                } else {
                    // Two length bytes: 1xxxxxxx xxxxxxxx
                    // (a zero length is stored as 0x8000)
                    uint8_t higherByte =
                        (datum.cbData & TWO_BYTE_LENGTH_MASK1) >> 8 |
                        TWO_BYTE_LENGTH_BIT;
                    uint8_t lowerByte = datum.cbData & TWO_BYTE_LENGTH_MASK2;
                    *tmpDataPtr = higherByte;
                    tmpDataPtr++;
                    *tmpDataPtr = lowerByte;
                    tmpDataPtr++;
                }
            }

            // Copy the value bytes after the length prefix, if any.
            memcpy(tmpDataPtr, datum.pData, datum.cbData);
        }
    }
}
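| void UnalignedAttributeAccessor::loadValue | ( | TupleDatum & | datum, |
|---|---|---|---|
| PConstBuffer | pDataWithLen | ||
| ) | const |
Loads a value from a buffer containing data encoded via storeValue.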
Note:
See note on memCopyFrom method regarding why and how to preallocate the buffer.
Parameters:
| [in,out] | datum | datum to receive loaded value; its buffer must be preallocated |
|---|---|---|
| [in] | pDataWithLen | data buffer to load from |
Definition at line 193 of file UnalignedAttributeAccessor.cpp.
References TupleDatum::cbData, cbStorage, isCompressedInt64, isInitialized(), omitLengthIndicator, ONE_BYTE_LENGTH_MASK, TupleDatum::pData, TWO_BYTE_LENGTH_BIT, and uncompressInt64().
Referenced by LcsColumnReader::findVal(), LcsCompareColKeyUsingOffsetIndex::lessThan(), and LcsHash::search().
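The body of loadValue is not reproduced on this page. As a rough guide to what decoding involves for the length-prefixed case, the sketch below parses the prefix described in the Detailed Description. It rests on assumptions: `LoadedValue` and `loadVarWidth` are hypothetical names, the function returns a pointer into the source buffer rather than copying into a TupleDatum, and it ignores the compressed-integer and no-length-indicator cases. The masks 0x80 and 0x7f correspond to TWO_BYTE_LENGTH_BIT and ONE_BYTE_LENGTH_MASK.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type: pData == nullptr represents a NULL value.
struct LoadedValue {
    const uint8_t *pData;
    std::size_t cbData;
};

// Hypothetical helper: decodes the one- or two-byte length prefix and
// returns a view of the value bytes that follow it.
LoadedValue loadVarWidth(const uint8_t *pDataWithLen)
{
    uint8_t first = pDataWithLen[0];
    if (first == 0x00) {
        return {nullptr, 0};                  // one-byte length 0x00 is NULL
    }
    if (first & 0x80) {
        // Two-byte length: low 7 bits of the first byte are the high bits.
        std::size_t len =
            (static_cast<std::size_t>(first & 0x7f) << 8) | pDataWithLen[1];
        return {pDataWithLen + 2, len};       // 0x80 0x00 yields an empty value
    }
    return {pDataWithLen + 1, first};         // one-byte length, 1 to 127
}
```

A caller can tell NULL from an empty string by checking `pData` first: NULL yields a null pointer, while the 0x8000 prefix yields a non-null pointer with `cbData == 0`.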
Member Data Documentation
The documentation for this class was generated from the following files:
- /home/pub/open/dev/fennel/tuple/UnalignedAttributeAccessor.h
- /home/pub/open/dev/fennel/tuple/UnalignedAttributeAccessor.cpp
