packed_data - Documentation for Ruby 3.5

Packed Data

Quick Reference

These tables summarize the directives for packing and unpacking.

For Integers

Directive	Meaning
C	8-bit unsigned (unsigned char)
S	16-bit unsigned, native endian (uint16_t)
L	32-bit unsigned, native endian (uint32_t)
Q	64-bit unsigned, native endian (uint64_t)
J	pointer width unsigned, native endian (uintptr_t)

n	16-bit unsigned, network (big-endian) byte order
N	32-bit unsigned, network (big-endian) byte order
v	16-bit unsigned, VAX (little-endian) byte order
V	32-bit unsigned, VAX (little-endian) byte order

U	UTF-8 character
w	BER-compressed integer

For Floats

Directive	Meaning
D d	double-precision, native format
F f	single-precision, native format
E	double-precision, little-endian byte order
e	single-precision, little-endian byte order
G	double-precision, network (big-endian) byte order
g	single-precision, network (big-endian) byte order

For Strings

Directive	Meaning
A	arbitrary binary string (remove trailing nulls and ASCII spaces)
a	arbitrary binary string
Z	null-terminated string
B	bit string (MSB first)
b	bit string (LSB first)
H	hex string (high nibble first)
h	hex string (low nibble first)
u	UU-encoded string
M	quoted-printable, MIME encoding (see RFC 2045)
m	base64-encoded string (RFC 2045 by default; RFC 4648 if followed by 0)
P	pointer to a structure (fixed-length string)
p	pointer to a null-terminated string

Additional Directives for Packing

Directive	Meaning
@	moves to absolute position
X	back up a byte
x	null byte

Additional Directives for Unpacking

Directive	Meaning
@	skip to the offset given by the length argument
X	skip backward one byte
x	skip forward one byte

Packing and Unpacking

Certain Ruby core methods deal with packing and unpacking data:

Method Array#pack: Formats each element in array self into a binary string; returns that string.
Method String#unpack: Extracts data from string self, forming objects that become the elements of a new array; returns that array.
Method String#unpack1: Does the same, but unpacks and returns only the first extracted object.

Each of these methods accepts a string template, consisting of zero or more directive characters, each followed by zero or more modifier characters.

Examples (directive 'C' specifies ‘unsigned character’):

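[65].pack('C')      # => "A"  # One element, one directive.
[65, 66].pack('CC') # => "AB" # Two elements, two directives.
[65, 66].pack('C')  # => "A"  # Extra element is ignored.
[65].pack('')       # => ""   # No directives.
[65].pack('CC')               # Extra directive raises ArgumentError.

'A'.unpack('C')   # => [65]      # One character, one directive.
'AB'.unpack('CC') # => [65, 66]  # Two characters, two directives.
'AB'.unpack('C')  # => [65]      # Extra character is ignored.
'A'.unpack('CC')  # => [65, nil] # Extra directive generates nil.
'AB'.unpack('')   # => []        # No directives.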

The string template may contain any mixture of valid directives (directive 'c' specifies ‘signed character’):

[65, -1].pack('cC')  # => "A\xFF"
"A\xFF".unpack('cC') # => [65, 255]

The string template may contain whitespace (which is ignored) and comments, each of which begins with character '#' and continues up to and including the next following newline:

[0,1].pack(" C #foo \n C ")    # => "\x00\x01"
"\0\1".unpack(" C #foo \n C ") # => [0, 1]

Any directive may be followed by either of these modifiers:

'*' - The directive is to be applied as many times as needed:
[65, 66].pack('C*') # => "AB"
'AB'.unpack('C*')   # => [65, 66]
Integer count - The directive is to be applied count times:
[65, 66].pack('C2') # => "AB"
[65, 66].pack('C3') # Raises ArgumentError.
'AB'.unpack('C2')   # => [65, 66]
'AB'.unpack('C3')   # => [65, 66, nil]
Note: Directives in %w[A a Z m] use count differently; see String Directives.
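
The way a count and '*' combine within one template can be sketched with a made-up record. The record layout (a one-byte version, a two-byte big-endian length, then the payload bytes) and all variable names are assumptions chosen for illustration; they are not part of the Ruby documentation.

```ruby
# Hypothetical record layout (an assumption for this sketch):
#   1 byte version, 2 bytes big-endian payload length, then the payload bytes.
payload = [10, 20, 30]
packed  = [1, payload.size, *payload].pack('C n C*')

# 'C' and 'n' each consume one element; 'C*' consumes all remaining elements.
packed.bytes # => [1, 0, 3, 10, 20, 30]

# Unpacking with the same template reverses the operation.
version, length, *rest = packed.unpack('C n C*')

# A count larger than the number of remaining elements raises ArgumentError
# when packing, but yields nil placeholders when unpacking.
too_few = begin
  [65, 66].pack('C3')
rescue ArgumentError => e
  e.class
end
padded = 'AB'.unpack('C3') # => [65, 66, nil]
```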

If an element doesn’t fit the provided directive, only its least significant bits are encoded:

[257].pack("C").unpack("C") # => [1]
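
The truncation above is equivalent to masking the value to the width of the directive. A minimal sketch, with variable names chosen only for illustration:

```ruby
value  = 257                   # 0x101 needs 9 bits
packed = [value].pack('C')     # only the low 8 bits are stored
low8   = packed.unpack1('C')   # => 1, the same as value & 0xFF

# The same truncation applies to wider directives.
low16 = [0x12345].pack('n').unpack1('n') # => 0x2345 (the low 16 bits)

# Negative numbers are stored in two's-complement form.
wrapped = [-1].pack('C').unpack1('C')    # => 255
```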

Packing Method

Method Array#pack accepts an optional keyword argument buffer that specifies the target string (instead of a new string); the packed data is appended to it:

[65, 66].pack('C*', buffer: 'foo') # => "fooAB"
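
One possible use of buffer is to build a message from several pack calls without creating an intermediate string for each. The sketch below assumes a mutable string; the 'HDR:' prefix is an arbitrary placeholder.

```ruby
# A mutable string to append to; the unary plus guards against frozen literals.
buffer = +'HDR:'
result = [65, 66].pack('C*', buffer: buffer)

buffer                # => "HDR:AB"
result.equal?(buffer) # => true: the buffer itself is modified and returned

# A second call keeps appending to the same string.
[67].pack('C', buffer: buffer)
buffer                # => "HDR:ABC"
```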

The method can accept a block:

[65, 66].pack('C*') {|s| p s } # => "AB"

Unpacking Methods

Methods String#unpack and String#unpack1 each accept an optional keyword argument offset that specifies an offset into the string:

'ABC'.unpack('C*', offset: 1)  # => [66, 67]
'ABC'.unpack1('C*', offset: 1) # => 66

Both methods can accept a block:

ret = []
"ABCD".unpack("C*") {|c| ret << c }
ret # => [65, 66, 67, 68]

'AB'.unpack1('C*') {|ele| p ele } # => 65
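
The offset keyword counts bytes, not elements, which matters as soon as a directive is wider than one byte. A small sketch using 16-bit values (the data is made up for illustration):

```ruby
# Four 16-bit big-endian integers, 8 bytes in total.
data = [1, 2, 3, 4].pack('n*')

# offset is a byte offset, not an element index.
second = data.unpack1('n', offset: 2) # => 2
tail   = data.unpack('n*', offset: 4) # => [3, 4]

# With a block, String#unpack yields each object and returns nil.
seen = []
returned = data.unpack('n*') { |n| seen << n }
seen     # => [1, 2, 3, 4]
returned # => nil
```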

Integer Directives

Each integer directive specifies the packing or unpacking for one element in the input or output array.

8-Bit Integer Directives

'c' - 8-bit signed integer (like C signed char):
[0, 1, 255].pack('c*')    # => "\x00\x01\xFF"
s = [0, 1, -1].pack('c*') # => "\x00\x01\xFF"
s.unpack('c*')            # => [0, 1, -1]
'C' - 8-bit unsigned integer (like C unsigned char):
[0, 1, 255].pack('C*')    # => "\x00\x01\xFF"
s = [0, 1, -1].pack('C*') # => "\x00\x01\xFF"
s.unpack('C*')            # => [0, 1, 255]

16-Bit Integer Directives

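's' - 16-bit signed integer, native-endian (like C int16_t); results shown are for a little-endian platform:
[513, -514].pack('s*')      # => "\x01\x02\xFE\xFD"
s = [513, 65022].pack('s*') # => "\x01\x02\xFE\xFD"
s.unpack('s*')              # => [513, -514]
'S' - 16-bit unsigned integer, native-endian (like C uint16_t); results shown are for a little-endian platform:
[513, -514].pack('S*')      # => "\x01\x02\xFE\xFD"
s = [513, 65022].pack('S*') # => "\x01\x02\xFE\xFD"
s.unpack('S*')              # => [513, 65022]
'n' - 16-bit network integer, big-endian:
s = [0, 1, -1, 32767, -32768, 65535].pack('n*')
# => "\x00\x00\x00\x01\xFF\xFF\x7F\xFF\x80\x00\xFF\xFF"
s.unpack('n*')
# => [0, 1, 65535, 32767, 32768, 65535]
'v' - 16-bit VAX integer, little-endian:
s = [0, 1, -1, 32767, -32768, 65535].pack('v*')
# => "\x00\x00\x01\x00\xFF\xFF\xFF\x7F\x00\x80\xFF\xFF"
s.unpack('v*')
# => [0, 1, 65535, 32767, 32768, 65535]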
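
The difference between network and VAX byte order is easiest to see on a single value whose two bytes differ. A minimal sketch (the value 0x0102 is arbitrary):

```ruby
value = 0x0102

big    = [value].pack('n') # network order: most significant byte first
little = [value].pack('v') # VAX order: least significant byte first

big.bytes    # => [1, 2]
little.bytes # => [2, 1]

# Reading bytes with the wrong byte order silently gives a different number.
right = big.unpack1('n') # => 258 (0x0102)
wrong = big.unpack1('v') # => 513 (0x0201)

# 'n' and 'v' are unsigned: a negative input comes back as its
# two's-complement value.
minus_one = [-1].pack('n').unpack1('n') # => 65535
```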

32-Bit Integer Directives

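'l' - 32-bit signed integer, native-endian (like C int32_t); results shown are for a little-endian platform:
s = [67305985, -50462977].pack('l*')
# => "\x01\x02\x03\x04\xFF\xFE\xFD\xFC"
s.unpack('l*')
# => [67305985, -50462977]
'L' - 32-bit unsigned integer, native-endian (like C uint32_t); results shown are for a little-endian platform:
s = [67305985, 4244504319].pack('L*')
# => "\x01\x02\x03\x04\xFF\xFE\xFD\xFC"
s.unpack('L*')
# => [67305985, 4244504319]
'N' - 32-bit network integer, big-endian:
s = [0,1,-1].pack('N*')
# => "\x00\x00\x00\x00\x00\x00\x00\x01\xFF\xFF\xFF\xFF"
s.unpack('N*')
# => [0, 1, 4294967295]
'V' - 32-bit VAX integer, little-endian:
s = [0,1,-1].pack('V*')
# => "\x00\x00\x00\x00\x01\x00\x00\x00\xFF\xFF\xFF\xFF"
s.unpack('V*')
# => [0, 1, 4294967295]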

64-Bit Integer Directives

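'q' - 64-bit signed integer, native-endian (like C int64_t); results shown are for a little-endian platform:
s = [578437695752307201, -506097522914230529].pack('q*')
# => "\x01\x02\x03\x04\x05\x06\a\b\xFF\xFE\xFD\xFC\xFB\xFA\xF9\xF8"
s.unpack('q*')
# => [578437695752307201, -506097522914230529]
'Q' - 64-bit unsigned integer, native-endian (like C uint64_t); results shown are for a little-endian platform:
s = [578437695752307201, 17940646550795321087].pack('Q*')
# => "\x01\x02\x03\x04\x05\x06\a\b\xFF\xFE\xFD\xFC\xFB\xFA\xF9\xF8"
s.unpack('Q*')
# => [578437695752307201, 17940646550795321087]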

Platform-Dependent Integer Directives

'i' - Platform-dependent width signed integer, native-endian (like C int):
s = [67305985, -50462977].pack('i*')
s.unpack('i*')
'I' - Platform-dependent width unsigned integer, native-endian (like C unsigned int):
s = [67305985, -50462977].pack('I*')
s.unpack('I*')
'j' - Pointer-width signed integer, native-endian (like C intptr_t):
s = [67305985, -50462977].pack('j*')
s.unpack('j*')
'J' - Pointer-width unsigned integer, native-endian (like C uintptr_t):
s = [67305985, 4244504319].pack('J*')
s.unpack('J*')

Other Integer Directives

'U' - UTF-8 character:
s = [4194304].pack('U*') # => "\xF8\x90\x80\x80\x80"
s.unpack('U*')           # => [4194304]
'w' - BER-encoded integer (see BER encoding):
s = [1073741823].pack('w*') # => "\x83\xFF\xFF\xFF\x7F"
s.unpack('w*')              # => [1073741823]
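
To make the two encodings above more concrete, the sketch below packs a few small values and compares 'w' against a hand-written base-128 encoder. The helper ber_bytes is hypothetical and exists only for this comparison; it is not part of Ruby.

```ruby
# 'U' packs a code point as UTF-8; U+20AC (euro sign) needs three bytes.
euro = [0x20AC].pack('U')
euro.bytes        # => [226, 130, 172]
euro.unpack('U*') # => [8364]

# 'w' packs a non-negative integer in base 128, most significant group
# first, with the high bit set on every byte except the last.
[127].pack('w').bytes # => [127]
[128].pack('w').bytes # => [129, 0]
[300].pack('w').bytes # => [130, 44]

# Hand-rolled equivalent (hypothetical helper, for comparison only).
def ber_bytes(n)
  bytes = [n & 0x7F]                 # last byte: high bit clear
  n >>= 7
  while n > 0
    bytes.unshift((n & 0x7F) | 0x80) # earlier bytes: high bit set
    n >>= 7
  end
  bytes
end

ber_bytes(300) == [300].pack('w').bytes # => true
```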

Modifiers for Integer Directives

For the following directives, a '!' or '_' modifier may be appended to use the underlying platform’s native size.

'i', 'I' - C int, always native size.
's', 'S' - C short.
'l', 'L' - C long.
'q', 'Q' - C long long, if available.
'j', 'J' - C intptr_t, always native size.

Native-size modifiers are silently ignored for directives that are always native size.

An endian modifier may also be appended to the directives above:

'>' - Big-endian.
'<' - Little-endian.
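
The modifiers can be tried out on small values. In the sketch below the byte-order results are fixed, whereas the width produced by a native-size modifier depends on the platform, so only its byte size is inspected; variable names are illustrative.

```ruby
# Explicit byte order on directives that are otherwise native-endian.
[1].pack('S>').bytes # => [0, 1]
[1].pack('S<').bytes # => [1, 0]
[1].pack('l>').bytes # => [0, 0, 0, 1]
[1].pack('q<').bytes # => [1, 0, 0, 0, 0, 0, 0, 0]

# For unsigned 16- and 32-bit values these match the dedicated directives.
[0x0102].pack('S>') == [0x0102].pack('n') # => true
[0x0102].pack('L<') == [0x0102].pack('V') # => true

# Unlike 'n', a signed directive with an endian modifier keeps the sign.
signed = [-2].pack('s>').unpack1('s>')    # => -2

# Native-size modifier: 'l!' has the width of C long on this platform
# (commonly 4 or 8 bytes), while plain 'l' is always 4 bytes.
native_long = [1].pack('l!').bytesize
fixed_long  = [1].pack('l').bytesize      # => 4
```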

Float Directives

Each float directive specifies the packing or unpacking for one element in the input or output array.

Single-Precision Float Directives

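'F' or 'f' - Native format (result shown is for a little-endian platform):
s = [3.0].pack('F') # => "\x00\x00@@"
s.unpack('F')       # => [3.0]
'e' - Little-endian:
s = [3.0].pack('e') # => "\x00\x00@@"
s.unpack('e')       # => [3.0]
'g' - Big-endian:
s = [3.0].pack('g') # => "@@\x00\x00"
s.unpack('g')       # => [3.0]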

Double-Precision Float Directives

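'D' or 'd' - Native format (result shown is for a little-endian platform):
s = [3.0].pack('D') # => "\x00\x00\x00\x00\x00\x00\b@"
s.unpack('D')       # => [3.0]
'E' - Little-endian:
s = [3.0].pack('E') # => "\x00\x00\x00\x00\x00\x00\b@"
s.unpack('E')       # => [3.0]
'G' - Big-endian:
s = [3.0].pack('G') # => "@\b\x00\x00\x00\x00\x00\x00"
s.unpack('G')       # => [3.0]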

A float value may be infinity or not-a-number:

inf = 1.0/0.0
[inf].pack('f')
"\x00\x00\x80\x7F".unpack('f')

nan = inf/inf
[nan].pack('f')
"\x00\x00\xC0\x7F".unpack('f')

String Directives

Each string directive specifies the packing or unpacking for one byte in the input or output string.

Binary String Directives

'A' - Arbitrary binary string (space padded; count is width); nil is treated as the empty string:
['foo'].pack('A')
['foo'].pack('A*')
['foo'].pack('A2')
['foo'].pack('A4')
[nil].pack('A')
[nil].pack('A*')
[nil].pack('A2')
[nil].pack('A4')
"foo\0".unpack('A')
"foo\0".unpack('A4')
"foo\0bar".unpack('A10')
"foo ".unpack('A')
"foo ".unpack('A4')
"foo".unpack('A4')
russian = "\u{442 435 441 442}"
russian.size
russian.bytesize
[russian].pack('A')
[russian].pack('A*')
russian.unpack('A')
russian.unpack('A2')
russian.unpack('A4')
russian.unpack('A*')
'a' - Arbitrary binary string (null padded; count is width):
["foo"].pack('a')
["foo"].pack('a*')
["foo"].pack('a2')
["foo\0"].pack('a4')
[nil].pack('a')
[nil].pack('a*')
[nil].pack('a2')
[nil].pack('a4')
"foo\0".unpack('a')
"foo\0".unpack('a4')
"foo ".unpack('a4')
"foo".unpack('a4')
"foo\0bar".unpack('a4')
'Z' - Same as 'a', except that null is added or ignored with '*':
["foo"].pack('Z*')
[nil].pack('Z*')
"foo\0".unpack('Z*')
"foo".unpack('Z*')
"foo\0bar".unpack('Z*')

Bit String Directives

'B' - Bit string (most significant bit first):
['11111111' + '00000000'].pack('B*')
['10000000' + '01000000'].pack('B*')
['1'].pack('B0')
['1'].pack('B1')
['1'].pack('B2')
['1'].pack('B3')
['1'].pack('B4')
['1'].pack('B5')
['1'].pack('B6')
"\xff\x00".unpack("B*")
"\x01\x02".unpack("B*")
"".unpack("B0")
"\x80".unpack("B1")
"\x80".unpack("B2")
"\x80".unpack("B3")
'b' - Bit string (least significant bit first):
['11111111' + '00000000'].pack('b*')
['10000000' + '01000000'].pack('b*')
['1'].pack('b0')
['1'].pack('b1')
['1'].pack('b2')
['1'].pack('b3')
['1'].pack('b4')
['1'].pack('b5')
['1'].pack('b6')
"\xff\x00".unpack("b*")
"\x01\x02".unpack("b*")
"".unpack("b0")
"\x01".unpack("b1")
"\x01".unpack("b2")
"\x01".unpack("b3")

Hex String Directives

'H' - Hex string (high nibble first):
['10ef'].pack('H*')
['10ef'].pack('H0')
['10ef'].pack('H3')
['10ef'].pack('H5')
['fff'].pack('H3')
['fff'].pack('H4')
['fff'].pack('H5')
['fff'].pack('H6')
['fff'].pack('H7')
['fff'].pack('H8')
"\x10\xef".unpack('H*')
"\x10\xef".unpack('H0')
"\x10\xef".unpack('H1')
"\x10\xef".unpack('H2')
"\x10\xef".unpack('H3')
"\x10\xef".unpack('H4')
"\x10\xef".unpack('H5')
'h' - Hex string (low nibble first):
['10ef'].pack('h*')
['10ef'].pack('h0')
['10ef'].pack('h3')
['10ef'].pack('h5')
['fff'].pack('h3')
['fff'].pack('h4')
['fff'].pack('h5')
['fff'].pack('h6')
['fff'].pack('h7')
['fff'].pack('h8')
"\x01\xfe".unpack('h*')
"\x01\xfe".unpack('h0')
"\x01\xfe".unpack('h1')
"\x01\xfe".unpack('h2')
"\x01\xfe".unpack('h3')
"\x01\xfe".unpack('h4')
"\x01\xfe".unpack('h5')

Pointer String Directives

'P' - Pointer to a structure (fixed-length string):
s = ['abc'].pack('P')
s.unpack('P*')
".".unpack("P")
("\0" * 8).unpack("P")
[nil].pack("P")
'p' - Pointer to a null-terminated string:
s = ['abc'].pack('p')
s.unpack('p*')
".".unpack("p")
("\0" * 8).unpack("p")
[nil].pack("p")

Other String Directives

'M' - Quoted-printable, MIME encoding; text mode, but input must use LF line endings and output uses LF (see RFC 2045):
["a b c\td \ne"].pack('M')
["\0"].pack('M')
["a"*1023].pack('M') == ("a"*73+"=\n")*14+"a=\n"
("a"*73+"=\na=\n").unpack('M') == ["a"*74]
(("a"*73+"=\n")*14+"a=\n").unpack('M') == ["a"*1023]
"a b c\td =\n\ne=\n".unpack('M')
"=00=\n".unpack('M')
"pre=31=32=33after".unpack('M')
"pre=\nafter".unpack('M')
"pre=\r\nafter".unpack('M')
"pre=".unpack('M')
"pre=\r".unpack('M')
"pre=hoge".unpack('M')
"pre==31after".unpack('M')
"pre===31after".unpack('M')
'm' - Base64-encoded string; count specifies the number of input bytes between newlines, rounded down to the nearest multiple of 3; if count is zero, no newlines are added (see RFC 4648):
[""].pack('m')
["\0"].pack('m')
["\0\0"].pack('m')
["\0\0\0"].pack('m')
["\377"].pack('m')
["\377\377"].pack('m')
["\377\377\377"].pack('m')
"".unpack('m')
"AA==\n".unpack('m')
"AAA=\n".unpack('m')
"AAAA\n".unpack('m')
"/w==\n".unpack('m')
"//8=\n".unpack('m')
"////\n".unpack('m')
"A\n".unpack('m')
"AA\n".unpack('m')
"AA=\n".unpack('m')
"AAA\n".unpack('m')
[""].pack('m0')
["\0"].pack('m0')
["\0\0"].pack('m0')
["\0\0\0"].pack('m0')
["\377"].pack('m0')
["\377\377"].pack('m0')
["\377\377\377"].pack('m0')
"".unpack('m0')
"AA==".unpack('m0')
"AAA=".unpack('m0')
"AAAA".unpack('m0')
"/w==".unpack('m0')
"//8=".unpack('m0')
"////".unpack('m0')
'u' - UU-encoded string:
[""].pack("u")
["a"].pack("u")
["aaa"].pack("u")
"".unpack("u")
"#86)C\n".unpack("u")

Offset Directives

'@' - Move to the given absolute byte offset; for packing, null-fill or shrink the string if necessary:
[1, 2].pack("C@0C")     # => "\x02"
[1, 2].pack("C@1C")     # => "\x01\x02"
[1, 2].pack("C@5C")     # => "\x01\x00\x00\x00\x00\x02"
[*1..5].pack("CCCC@2C") # => "\x01\x02\x05"
For unpacking, the position cannot move outside the string:
"\x01\x00\x00\x02".unpack("C@3C") # => [1, 2]
"\x00".unpack("@1C")              # => [nil]
"\x00".unpack("@2C")              # Raises ArgumentError.
'X' - For packing, shrink the string by the given number of bytes:
[0, 1, 2].pack("CCXC")  # => "\x00\x02"
[0, 1, 2].pack("CCX2C") # => "\x02"
For unpacking, rewind the unpacking position by the given number of bytes:
"\x00\x02".unpack("CCXC") # => [0, 2, 2]
In either case, the position cannot move outside the string:
[0, 1, 2].pack("CCX3C")   # Raises ArgumentError.
"\x00\x02".unpack("CX3C") # Raises ArgumentError.
'x' - Skip forward by the given number of bytes; for packing, null-fill as necessary:
[].pack("x0") # => ""
[].pack("x")  # => "\x00"
[].pack("x8") # => "\x00\x00\x00\x00\x00\x00\x00\x00"
For unpacking, the position cannot move outside the string:
"\x00\x00\x02".unpack("CxC") # => [0, 2]
"\x00\x00\x02".unpack("x3C") # => [nil]
"\x00\x00\x02".unpack("x4C") # Raises ArgumentError.