Break This 8-Bit Block Mixing Cipher (original) (raw)

ACiphers By Ritter Page

Terry Ritter

An 8-bit-wide model block cipher enciphers toy "messages" of two hex characters each. The intent is not to protect information, but rather to support analysis of the design. The model presents cryptographic "strength" at a reduced level where it hopefully can be confronted and understood. This appears to be a fairly novel approach to the study of cryptographic strength, both in a particular cipher design, and in general.

The 8-bit model is one of the smallest incarnations of a _scalable_design which constructs both tiny models and real-size ciphers from components which differ only in size. Presumably, any possible weakness in the larger realizations also must be present in the tiny version, where it should be easier to find. The goal is an understanding of strength or weakness which scales up to real cipher sizes.

A previous article proposed a similar 4-bit block version which is vulnerable to many attacks simply because of its tiny size. While these attacks are real enough, they do not scale up, do not weaken real-size versions, and so do not provide insight into the general design. Size-based attacks are almost eliminated in this 8-bit version, and this greatly simplifies the discussion.


Contents


From: ritter@io.com (Terry Ritter) Newsgroups: sci.crypt Subject: Break This 8-Bit Block Cipher (long) Date: Sun, 19 Apr 1998 20:49:09 GMT Lines: 549 Message-ID: 353a6292.2214286@news.io.com

BREAK THIS 8-BIT BLOCK CIPHER!

Terry Ritter ritter@io.com Ritter Software Engineering http://www.io.com/~ritter/

1998-04-19

ABSTRACT

An 8-bit-wide model block cipher enciphers toy "messages" of two hex characters each. The intent is not to protect information, but rather to support analysis of the design. The model presents cryptographic "strength" at a reduced level where it hopefully can be confronted and understood. This appears to be a fairly novel approach to the study of cryptographic strength, both in a particular cipher design, and in general.

The 8-bit model is one of the smallest incarnations of a scalable design which constructs both tiny models and real-size ciphers from components which differ only in size. Presumably, any possible weakness in the larger realizations also must be present in the tiny version, where it should be easier to find. The goal is an understanding of strength or weakness which scales up to real cipher sizes.

A previous article proposed a similar 4-bit block version which is vulnerable to many attacks simply because of its tiny size. While these attacks are real enough, they do not scale up, do not weaken real-size versions, and so do not provide insight into the general design. Size-based attacks are almost eliminated in this 8-bit version, and this greatly simplifies the discussion.

THE PROBLEM

We have a block cipher composed of four shuffled substitution tables and a linear Balanced Block Mixer. (Converting a binary key value into shuffled tables is well-known technology; see, for example:

http://www.io.com/~ritter/KEYSHUF.HTM

.) We assume that we know every possible plaintext and ciphertext pair, plus the everything about the mixer. The goal is to develop the table permutations -- the internal key -- from the known information, using some sort of scalable algorithm. Since we know all possible messages and corresponding ciphertexts, we always have substantially more known information than the internal state we are trying to resolve.

Note that this reduced model cannot be a great cipher in any sense. It is known from previous work that a single mixing does not produce the nonlinearity distribution of the larger block (see:

http://www.io.com/~ritter/ARTS/MIXNONLI.HTM

). But it is also known that multiple mixings and wider tables do produce the right distribution. Here we investigate whether a single mixing layer, with associated table layers, will protect whatever other layers happen to be used.

The Balanced Block Mixer (BBM) basically consists of two Latin squares. Each square mixes two half-size blocks into a single half-size result; two of those thus give a full mixed block. The BBM can be computed or modeled as a table, and has been discussed extensively (see, for example:

http://www.io.com/~ritter/JAVASCRP/ACTIVBBM.HTM

). In the 8-bit model, the substitution tables are 4-bit invertible tables, each having exactly 16 entries of 4 bits each. In marked contrast to 2-bit tables, most 4-bit tables have at least some nonlinearity. (Nonlinearity is discussed in detail at:

http://www.io.com/~ritter/JAVASCRP/NONLMEAS.HTM

.) Since the only nonlinearity in the system is in the tables, when the selected tables have no nonlinearity, neither does the cipher. But at the real cipher size, finding a linear table is almost impossible. Which brings us to the issue of strength:

STRENGTH

The meaning of "strength" in this model is not the protection of information, since we assume that we know the ciphertext for every possible plaintext "message." The model emulates a Simple Substitution on one 8-bit character, and so is much like the puzzles we find in newspapers. But the same design scales up, stacks in added layers, and extends to wide blocks of dynamically selected size where we cannot even store all possible transformations. Only in those enhanced structures will the cipher protect data.

Nor is the meaning of "strength" a huge keyspace. The 8-bit version does have an apparent 177-bit keyspace, which should prevent "brute force" attacks on the table permutations. But this value probably does not represent the true strength of the structure, and the true strength is what we are trying to find.

In the strength analysis of the larger versions of these ciphers, I assume that if attackers could provably resolve even one table entry, they could similarly resolve the rest of the table. So I assign a strength value of just the bit-width for each table. Further, I assume that in any structure like this, one table layer does not provide strength. This means that I would assign a total strength of just 8 bits to the 8-bit model. In real-size designs, where we have 3 or 5 layers of tables, with a minimum width of 64 or 128 bits, even these minimalist assumptions imply a strength of at least 128 bits, which is "large enough."

The meaning of "strength" is also not the nonlinearity of the overall block transformation. While it is true that some 4-bit tables are linear (which makes the resulting cipher linear), finding a linear 8-bit table at random is harder than guessing a cipher key. So overall linearity is also not going to solve the problem, unless some form of it somehow scales up with the cipher.

At this point, one might reasonably ask "What is left?", and the answer is: "Anything that will reveal the key and also scale to real size implementations."

In particular, the mixing transformation is both linear and known, and is an obvious focal point for an attack. But eventually the "attack" should be some process which exposes one table entry then another -- or all at once -- independent of table size and block size. This is not because the problem is "rigged," but instead because the whole point is to understand any weakness inherent in the scalable design, instead of weaknesses of size which will not exist in a real version.

SCALABILITY

Scalability is a relatively new concept in cipher design, so it is easy to miss what "attack" means in this context. The usual way to discuss cipher strength would be to present a real-size version and say "Attack it!" But a serious attack is usually a complex, time-consuming process, so it is desirable to find a better way of revealing cipher weakness.

The point of a truly scalable design is to have a fixed definition which is scalable in the same way that we expect arithmetic to work on numbers both large and small. Obviously large values are not the same as small values, but the concepts should be the same.

This particular design consists of exactly two components: tables and a mixer. There is no particular problem in constructing or keying tables of any reasonable size (although one might think that 8-bit tables would be of sufficient size in practice).

The mixer is defined algebraically (albeit in mod-2 polynomials), and so also scales as desired. The only specific difference in mixer size is the use of some particular irreducible polynomial. But there always is some such polynomial, and we assume that the attacker will know what one we use. It does seem unlikely that a particular irreducible would be "weak" just as it seems unlikely that a particular prime would be "weak" in number theory ciphers. But if so, that would be good to know, and then we would use an irreducible which does not have that problem.

If there is a weakness in the larger versions, the very same weakness should be present in the tiny model, where it should be easier to find.

THE DESIGN

Here we have a system which has 4-bit tables and data paths, and an 8-bit block size. This is a scaled-down model which is not useful for hiding data, but is useful for analysis.

| INPUT BLOCK | InA | | InB | v v | -------- -------- | | TA | | TB | | -------- -------- | | | | +-------------------+ | | | | | | | +-------------------+ | A | | B A | | B | v v v v | -------- -------- | | L0 | BBM | L1 | | -------- -------- | X | | Y | v v | -------- -------- | | TX | | TY | | -------- -------- | | | | OutX v v OutY | OUTPUT BLOCK

In the figure, the tables are TA, TB, TX and TY, and the Balanced Block Mixer is shown as two Latin square combiners L0 and L1.

A MIXING OVERVIEW

The orthogonal Latin squares in the BBM are developed algorithmically as follows:

X = 3A + 2B (mod 2)(mod p), p = 10011 Y = 2A + 3B (mod 2)(mod p)

These equations can make it awkward to think about the system, so we can use a table instead. Here A selects a row, B selects a column, and the selected entry is XY:

4-Bit Els, Poly 10011 = 0x13

   0   1   2   3   4   5   6   7   8   9   a   b   c   d   e   f

0 00 23 46 65 8c af ca e9 3b 18 7d 5e b7 94 f1 d2 1 32 11 74 57 be 9d f8 db 09 2a 4f 6c 85 a6 c3 e0 2 64 47 22 01 e8 cb ae 8d 5f 7c 19 3a d3 f0 95 b6 3 56 75 10 33 da f9 9c bf 6d 4e 2b 08 e1 c2 a7 84 4 c8 eb 8e ad 44 67 02 21 f3 d0 b5 96 7f 5c 39 1a 5 fa d9 bc 9f 76 55 30 13 c1 e2 87 a4 4d 6e 0b 28 6 ac 8f ea c9 20 03 66 45 97 b4 d1 f2 1b 38 5d 7e 7 9e bd d8 fb 12 31 54 77 a5 86 e3 c0 29 0a 6f 4c 8 b3 90 f5 d6 3f 1c 79 5a 88 ab ce ed 04 27 42 61 9 81 a2 c7 e4 0d 2e 4b 68 ba 99 fc df 36 15 70 53 a d7 f4 91 b2 5b 78 1d 3e ec cf aa 89 60 43 26 05 b e5 c6 a3 80 69 4a 2f 0c de fd 98 bb 52 71 14 37 c 7b 58 3d 1e f7 d4 b1 92 40 63 06 25 cc ef 8a a9 d 49 6a 0f 2c c5 e6 83 a0 72 51 34 17 fe dd b8 9b e 1f 3c 59 7a 93 b0 d5 f6 24 07 62 41 a8 8b ee cd f 2d 0e 6b 48 a1 82 e7 c4 16 35 50 73 9a b9 dc ff

This mixing structure is clearly weak: it is in fact linear. But it is intended to be used with very nonlinear tables, and the resulting cipher is not linear. We can implement the mixing structure as two 2-dimension indexable arrays L0 and L1, which we fill by computation.

This means that the whole cipher can be expressed as just 2 input substitutions, 2 mixings, and 2 output substitutions:

// input substitutions A = TA[ InA ]; B = TB[ InB ];

// BBM mixing X = L0[A][B]; Y = L1[A][B];

// output substitutions OutX = TX[ X ]; OutY = TY[ Y ];

Or we could just compute the mixing values each time, instead of using tables:

int L0( int a, int b ) { // return 3a + 2b (mod 2)(mod p) int t; t = a^b; // a + b (mod 2) t <<= 1; // 2a + 2b (mod 2) if (t & 0x10) // if msb is 1, going out of range t ^= 0x13; // so subtract irreducible p (mod 2) return a^t; // 3a + 2b (mod 2)(mod p) }

And L1 would be the same except for "return b^t", thus calculating "2a + 3b (mod 2)(mod p)". This is probably how we would want to do the real-size designs, since a single "8-bit" Latin square has 65,536 byte entries, and we need two. (But if we did that we could make the tables nonlinear, and might use multiple different mixings.)

ATTACKS

Normally, we think of attacking a cipher to gain some of the information it protects. Perhaps we have some known plaintext (with associated ciphertext), and wish to somehow expose the key, which will of course expose the rest of the ciphertext. With the 8-bit model proposed here, there is no "rest of the ciphertext" to expose, but there is a key, which consists of the state of the 4 shuffled tables. Since we have all possible plaintexts and ciphertexts, we can mount any possible known-plaintext or defined-plaintext attack.

More elaborate attacks, such as Linear Cryptanalysis or Differential Cryptanalysis usually depend upon a knowledge of fixed nonlinear tables in a cipher, and there are no such tables here. While it would be much too casual to say that such techniques could not be applied, it is just not clear how they could.

COMMENTS

The model is a single linear mixing with unknown tables on both the input and output. This seems to be the ultimate simplification of both the Mixing and VSBC ciphers, and in fact may be a step too far. The real ciphers have at least three layers of tables and two mixings, so that "known plaintext" is not available across any single mixing.

This tiny model demonstrates the analytic advantage of true cipher scalability. One would think that a cipher with a 256-element codebook would be small enough to attack on even the smallest personal computer. And if an attack is not possible, the model may be small enough to know why not. This last possibility would be An Important Result. Presumably, such an outcome would for the first time make it possible to prove strength in a practical full-size cipher, something I never expected to see.

It seems to me that the guaranteed balance of the BBM protects the tables from being separated and exposed. If so, the simple linear BBM contributes to strength in an essential way, despite having absolutely no strength of its own.

Many articles on the larger systems and their general design principles can be found on my web pages.

ACKNOWLEDGMENT

My thanks to Randall Williams for his efforts on the 4-bit cipher, which identified a whole range of problems with the earlier presentation.

APPENDIX A: The 8-Bit Block Cipher in C

// for research analysis; not a useful cipher

#include <time.h> // randomize #include <stdlib.h> // random #include <stdio.h> // printf

// BBM by lookup int L0[16][16]; int L1[16][16];

// BBM by computation

int FL0( int a, int b ) { int t; t = (a^b) << 1; if (t & 0x10) t ^= 0x13; return a^t; }

int FL1( int a, int b ) { int t; t = (a^b) << 1; if (t & 0x10) t ^= 0x13; return b^t; }

// use computed BBM to fill global lookup

void InitLS( int lastel ) { int a, b; for( a=0; a<=lastel; a++ ) { for( b=0; b<=lastel; b++ ) { L0[a][b] = FL0( a, b ); L1[a][b] = FL1( a, b ); } } }

// keyed tables int TA[16]; int TB[16]; int TX[16]; int TY[16];

void InitTab( int tab[], int lastel ) { int i; for (i=0; i<=lastel; i++) { tab[i] = i; } }

int Rand0thruN( int mask, int N ) { // return integer 0..N, N < 256 // mask MUST be 2**n - 1 for some n int b; do { b = rand(); // 32-bit value b ^= b >> 16; b ^= b >> 8; b &= mask; // b is 0..mask } while (b > N); return b; }

void ShufTab( int tab[], int mask ) { // shuffle the table // mask MUST be 2**n - 1 for some n int nxtmsk, N, rand, by;

do { nxtmsk = mask >> 1;

  for (N = mask; N > nxtmsk; N--) {
     rand = Rand0thruN( mask, N );
     by = tab[N];
     tab[N] = tab[rand];
     tab[rand] = by;
     }

  mask = nxtmsk;
  } while (mask);

}

// global state ciphering

int InA, InB, A, B, X, Y, OutX, OutY;

void Mix8( void ) { A = TA[ InA ]; B = TB[ InB ]; X = L0[ A ][ B ]; Y = L1[ A ][ B ]; OutX = TX[ X ]; OutY = TY[ Y ]; }

// functional ciphering

int Mix8X( int ina, int inb ) { return TX[ L0[ TA[ina] ][ TB[inb] ] ]; }

int Mix8Y( int ina, int inb ) { return TY[ L1[ TA[ina] ][ TB[inb] ] ]; }

void main( void ) { int ina, inb, outx, outy;

randomize(); // start out new

// fill the 2 global Ls tables as the BBM InitLS( 15 );

// shuffle the 4 global keying tables InitTab( TA, 15 ); ShufTab( TA, 15 ); InitTab( TB, 15 ); ShufTab( TB, 15 ); InitTab( TX, 15 ); ShufTab( TX, 15 ); InitTab( TY, 15 ); ShufTab( TY, 15 );

// show every possible plaintext and ciphertext

// first the inb column labels printf( "\n " ); for( inb = 0; inb<=15; inb++ ) printf( " %x ", inb );

// then each ina row for( ina = 0; ina<=15; ina++ ) { printf( "\n %x ", ina ); // start row with ina label for( inb=0; inb<=15; inb++ ) { outx = Mix8X( ina, inb ); outy = Mix8Y( ina, inb ); printf( " %x%x ", outx, outy ); } } printf( "\n" ); // end last line

}

APPENDIX B: Sample Output

 0   1   2   3   4   5   6   7   8   9   a   b   c   d   e   f

0 06 67 da 54 45 e0 88 2d ae 3f c1 1c b3 9b 72 f9 1 a3 c5 28 79 37 be 1a dc 00 41 6f 8d e6 f2 5b 94 2 c2 ad 71 2e ec 49 9f 57 64 b8 0a f5 3b 83 d6 10 3 fc 8b 40 bf d2 7a 6e 33 98 24 19 c6 5d a7 e5 01 4 15 93 e9 38 76 d1 a4 bb 8f 5e f0 02 27 6d 4c ca 5 2a 74 a6 c7 99 8c e3 0e dd fb 52 b0 18 4f 61 35 6 9d 12 3e e1 2b 58 c0 46 fa d9 84 63 7c 05 b7 af 7 b9 48 85 f3 6a a2 d7 1f eb cd 3c 21 04 7e 90 56 8 4e b1 fd 82 0f c3 5c 9a 36 a5 e7 78 60 29 14 db 9 d8 59 03 65 f4 1d b6 a0 2c 92 7b ee 8a 31 cf 47 a e4 3a 17 96 c8 0b 25 81 b2 6c 4d df a9 50 fe 73 b 30 ef 9c 1b a1 66 7d f8 43 07 b5 5a ce d4 89 22 c 7f 20 cb ac 1e f7 32 69 55 86 d3 44 91 ba 08 ed d 51 de 62 0d 80 95 4b c4 77 13 26 39 ff e8 aa bc e 6b 0c 5f d0 bd 34 f1 75 c9 ea a8 97 42 16 23 8e f 87 f6 b4 4a 53 2f 09 e2 11 70 9e ab d5 cc 3d 68

 0   1   2   3   4   5   6   7   8   9   a   b   c   d   e   f

0 df 41 10 83 38 02 9c 67 59 bb e4 7d c6 2e fa a5 1 53 1d 46 9f f5 6e 89 0b dc 27 aa c1 70 b2 34 e8 2 b1 af e9 6d 77 9a 06 88 20 d5 1e 33 fc 54 c2 4b 3 47 d8 5a 7b 61 f9 ce 3f 12 a3 26 85 94 ec 00 bd 4 3e 64 05 a2 d6 13 e7 4c fb 79 91 ba 28 cf 5d 80 5 a8 b7 22 35 8f c0 f4 71 ea 4d 5c 6b 0e 16 99 d3 6 09 f0 31 2c 1a d7 b3 52 6f 9e 75 e6 ad 8b 48 c4 7 74 8e 9b 4a bc ed 18 a6 c5 30 0f d2 57 f1 23 69 8 6c 36 fd b9 44 5b 2f de 03 82 c8 a0 e1 97 15 7a 9 1b 55 d4 c7 0d 3c 72 f3 4e ef b0 98 8a a9 66 21 a e5 2b be f8 93 76 3a cd a4 11 d9 07 62 40 8c 5f b 86 7c c3 d0 ae 25 51 b4 9d 6a f7 49 1f 08 eb 32 c ca 92 87 14 29 a1 45 e0 78 f6 63 5e db 3d bf 0c d 2d e3 ac 01 cb 84 60 95 b6 58 42 ff 39 da 7e 17 e 90 c9 7f 56 e2 b8 dd 2a 81 04 3b 1c 43 65 a7 fe f f2 0a 68 ee 50 4f ab 19 37 cc 8d 24 b5 73 d1 96


Terry Ritter ritter@io.com http://www.io.com/~ritter/ The Crypto Glossary: http://www.io.com/~ritter/GLOSSARY.HTM

From: David Hopwood hopwood@zetnet.co.uk Newsgroups: sci.crypt Subject: Re: Break This 8-Bit Block Cipher (long) Date: Sat, 25 Apr 1998 10:59:13 +0100 Message-ID: 1998042510591374952@zetnet.co.uk References: <353a6292.2214286@news.io.com> Lines: 450

-----BEGIN PGP SIGNED MESSAGE-----

In message <353a6292.2214286@news.io.com> ritter@io.com (Terry Ritter) wrote:

THE PROBLEM

We have a block cipher composed of four shuffled substitution tables and a linear Balanced Block Mixer. [...] We assume that we know every possible plaintext and ciphertext pair, plus the everything about the mixer. The goal is to develop the table permutations -- the internal key -- from the known information, using some sort of scalable algorithm.

That is a goal, but not necessarily the goal. It would also be a weakness in the cipher if it were possible, given some subset of the plaintext/ciphertext pairs, to find other plaintext/ciphertext pairs that were not given (using a scalable algorithm).

This might not require finding the internal key, and in fact this 8-bit cipher construction is a good example of that; see below.

A MIXING OVERVIEW

The orthogonal Latin squares in the BBM are developed algorithmically as follows:

X = 3A + 2B (mod 2)(mod p), p = 10011 Y = 2A + 3B (mod 2)(mod p)

[table snipped]

This mixing structure is clearly weak: it is in fact linear. But it is intended to be used with very nonlinear tables, and the resulting cipher is not linear. We can implement the mixing structure as two 2-dimension indexable arrays L0 and L1, which we fill by computation.

This means that the whole cipher can be expressed as just 2 input substitutions, 2 mixings, and 2 output substitutions:

// input substitutions A = TA[ InA ]; B = TB[ InB ];

// BBM mixing X = L0[A][B]; Y = L1[A][B];

// output substitutions OutX = TX[ X ]; OutY = TY[ Y ];

I.e. OutX = TX[ 3 TA[InA] + 2 TB[InB] (mod 2)(mod p) ]

Suppose we vary InA while keeping InB fixed. Then

OutX = TX[ 3 TA[InA] + constant (mod 2)(mod p) ]

Since TA, multiplication by 3 (in the relevant ring), addition of a constant, and TX are all permutations, their composition is also a permutation. This means that when InB is held constant and InA is varied from 0 to 15 (i.e. a column in the output table), OutX will take on each value from 0 to 15 exactly once. The same applies to OutY in each column, and to OutX and OutY in each row.

Here is a sample output table; the fact that X and Y are permutations within each row and column stands out clearly once you know what to look for:

 0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F

0 E2 48 3C 70 D9 95 2B 1E 5D A4 BF 01 F6 67 8A C3 1 47 E1 B4 25 FA 50 7D AF 9B 1C 3E 68 D3 02 C9 86 2 84 CE 57 13 6D B6 AA 78 39 22 91 FF 05 DC EB 40 3 BD 35 4A D1 74 88 F7 66 C2 09 E3 A0 2E 1B 9C 5F 4 71 27 FE ED B3 6B 45 8C 00 CF D4 52 3A 98 16 A9 5 28 72 DF 4B 36 0D E0 C4 65 8E FC 97 B9 51 A3 1A 6 5A 93 8D 0E A7 4F 64 F0 EC DB C5 76 11 29 32 B8 7 F5 DD 73 37 4E A2 B1 59 18 96 2A 8B E4 C0 0F 6C 8 3B B0 E9 F8 2C C1 D2 03 87 6A 46 15 7F AD 54 9E 9 D0 FB 26 B2 EF 17 38 9A A1 53 79 CD 4C 85 6E 04 A 99 56 CB 6F 12 EE 0C D5 44 FD 80 23 A8 7A B7 31 B CC 8F 92 A6 0B 33 19 21 BA 77 58 DE 60 F4 4D E5 C 63 0A A5 94 81 7C 5E BB 2F 30 1D 49 C7 E6 D8 F2 D 06 69 10 5C C8 24 9F 3D 7E B5 AB EA 82 43 F1 D7 E AE 14 61 CA 55 F9 83 42 D6 E8 07 BC 9D 3F 20 7B F 1F AC 08 89 90 DA C6 E7 F3 41 62 34 5B BE 75 2D

This is a significant weakness because it leaks information about plaintext/ciphertext pairs that would not otherwise be known. For example, suppose that an attacker knows the ciphertext for input blocks 00 to 0E inclusive in the above table. He/she can infer that the ciphertext for 0F is C3, and can decrypt this ciphertext if it is encountered.

Alternatively, probabilistic information can be obtained. E.g. if the attacker only knows blocks 00 to 0D inclusive, and encounters one of {83, 8A, C3, CA} as a ciphertext, there is a 50% chance that the corresponding plaintext is 0E or 0F; much higher than for a random function.

Actually, there is also another attack that gives key information, but I thought I'd point out this pattern first, as a demonstration that not all attacks necessarily involve obtaining the key.

COMMENTS

The model is a single linear mixing with unknown tables on both the input and output. This seems to be the ultimate simplification of both the Mixing and VSBC ciphers, and in fact may be a step too far.

Yes, it is; the linear BBM is not sufficiently well hidden by the tables. Consider

OutX = TX[ 3 TA[InA] + 2 TB[InB] (mod 2)(mod p) ]

Suppose we have two plaintext blocks, (InA1 InB1) and (InA2 InB2), that map to the same value for OutX (e.g. 0B => 01 and 1D => 02 in the sample table given above).

Then we can write (taking all arithmetic to be implicitly (mod 2)(mod p)):

TX[ 3 TA[InA1] + 2 TB[InB1] ] = TX[ 3 TA[InA2] + 2 TB[InB2] ]

Since TX is a permutation, this implies

3 TA[InA1] + 2 TB[InB1] = 3 TA[InA2] + 2 TB[InB2] or 3 TA[InA1] + 2 TB[InB1] - 3 TA[InA2] - 2 TB[InB2] = 0

which is a linear equation with the table entries as variables. For the example of 0B => 01 and 1D => 02, we would get

3 TA[0x0] + 2 TB[0xB] - 3 TA[0x1] - 2 TB[0xD] = 0

Blocks that map to the same value for OutY can also be used; they will yield equations of the form

2 TA[InA1'] + 3 TB[InB1'] - 2 TA[InA1'] - 3 TB[InB2'] = 0

If we can find 32 suitable linearly independent equations (which simply requires having enough known or chosen text), we can solve for the 32 unknowns TA[0]..TA[15] and TB[0]..TB[15]. Once the TA and TB tables are known, it is trivial to find TX and TY.

I haven't implemented code to solve the system of equations (my linear algebra is a little rusty), but here is the rest as a Java program:

import java.util.Random; import java.util.Vector; import java.util.Enumeration;

/**

}

/**

}


David Hopwood hopwood@zetnet.co.uk PGP public key: http://www.users.zetnet.co.uk/hopwood/public.asc Key fingerprint = 71 8E A6 23 0E D3 4C E5 0F 69 8C D4 FA 66 15 01 Key type/length = RSA 2048-bit (always check this as well as the fingerprint)

-----BEGIN PGP SIGNATURE----- Version: 2.6.3i Charset: noconv

iQEVAwUBNUGy/zkCAxeYt5gVAQFPcQgAhMQ71kDKK0RqC5a7JJDD1wf6klWSIagZ GN8AmXs64Ir7pDZ3sBX9+4T88zV4RcRbxOy7QbZRk8QzfLLTFirMO63iU2FqDAxV xHVyvOH9nYQl65QHVhF/bzeuyWWBfTwMYl7fJeWNwXliaXK0cxLQeeJVffj7CNJ3 hsrq6g79Gq400vy50Tc9IRy2Aa2wC7v6HSDPxvdmk7Q5vLuGnCyXhEXxSjiTsKLl FMgPXD73uZYzRaHDY5FVjEJOtDU51iVKGv0VNI98ByroqfDMjRZuoAwhSyjF7B61 5NFhsfe5qddb/Oyv8I5ta1fNAsgBrpaD7Mw43l7xA2Tg/NsNeM05Ww== =VFWB -----END PGP SIGNATURE-----

From: ritter@io.com (Terry Ritter) Newsgroups: sci.crypt Subject: Re: Break This 8-Bit Block Cipher (long) Date: Tue, 28 Apr 1998 04:12:10 GMT Lines: 105 Message-ID: 354555fd.3948563@news.io.com References: <353a6292.2214286@news.io.com> <1998042510591374952@zetnet.co.uk>

Note: This is a quick response since I am being otherwise occupied. I also do want to actually work out the attack before speculating on the consequences.

On Sat, 25 Apr 1998 10:59:13 +0100, in <1998042510591374952@zetnet.co.uk> in sci.crypt David Hopwood hopwood@zetnet.co.uk wrote:

-----BEGIN PGP SIGNED MESSAGE-----

In message <353a6292.2214286@news.io.com> ritter@io.com (Terry Ritter) wrote:

THE PROBLEM

We have a block cipher composed of four shuffled substitution tables and a linear Balanced Block Mixer. [...] We assume that we know every possible plaintext and ciphertext pair, plus the everything about the mixer. The goal is to develop the table permutations -- the internal key -- from the known information, using some sort of scalable algorithm.

That is a goal, but not necessarily the goal. It would also be a weakness in the cipher if it were possible, given some subset of the plaintext/ciphertext pairs, to find other plaintext/ciphertext pairs that were not given (using a scalable algorithm).

I find it hard to take this complaint too literally, since this "weakness" is of course inherent in every possible block cipher under known-plaintext conditions: Simply by knowing one known-plaintext pair we have reduced the uncertainty for every other ciphertext (whose associated plaintext now cannot be the one we already know).

[...] when InB is held constant and InA is varied from 0 to 15 (i.e. a column in the output table), OutX will take on each value from 0 to 15 exactly once. The same applies to OutY in each column, and to OutX and OutY in each row.

Yes, of course. This is inherent in the Latin square nature of the mixing, and is neither secret nor surprising. The effect is demonstrated in the Active BBM page:

http://www.io.com/~ritter/JAVASCRP/ACTIVBBM.HTM

This is a significant weakness because it leaks information about plaintext/ciphertext pairs that would not otherwise be known.

Again, at least in the abstract, this "weakness" is inherent in any block cipher under known-plaintext conditions. Presumably the issue here is the small sub-block size, but even in this case, the attack seems strange.

For example, suppose that an attacker knows the ciphertext for input blocks 00 to 0E inclusive in the above table. He/she can infer that the ciphertext for 0F is C3, and can decrypt this ciphertext if it is encountered.

But The Opponents have just collected allbutone of the ciphertexts associated with changing just one plaintext input (with the others fixed). And all they get out of this is the one remaining ciphertext. So exactly what prevents them from getting the remaining ciphertext the same way they got all the others?

[...] Suppose we have two plaintext blocks, (InA1 InB1) and (InA2 InB2), that map to the same value for OutX (e.g. 0B => 01 and 1D => 02 in the sample table given above).

Then we can write (taking all arithmetic to be implicitly (mod 2)(mod p)):

TX[ 3 TA[InA1] + 2 TB[InB1] ] = TX[ 3 TA[InA2] + 2 TB[InB2] ]

Since TX is a permutation, this implies

3 TA[InA1] + 2 TB[InB1] = 3 TA[InA2] + 2 TB[InB2] or 3 TA[InA1] + 2 TB[InB1] - 3 TA[InA2] - 2 TB[InB2] = 0

which is a linear equation with the table entries as variables.

[...] If we can find 32 suitable linearly independent equations (which simply requires having enough known or chosen text), we can solve for the 32 unknowns TA[0]..TA[15] and TB[0]..TB[15]. Once the TA and TB tables are known, it is trivial to find TX and TY.

OK. This is the good stuff! This is what I asked for, and presumably the desired solution. But I expect to have to work it out in detail before I gain any deep insight into the problem.

My thanks to David for his help and interest.


Terry Ritter ritter@io.com http://www.io.com/~ritter/ The Crypto Glossary: http://www.io.com/~ritter/GLOSSARY.HTM

From: ritter@io.com (Terry Ritter) Newsgroups: sci.crypt Subject: Re: Break This 8-Bit Block Cipher (long) Date: Wed, 29 Apr 1998 15:51:54 GMT Lines: 148 Message-ID: 35474c6f.949708@news.io.com References: <353a6292.2214286@news.io.com> <1998042510591374952@zetnet.co.uk>

OK, I've looked at the attack, and I'm having trouble getting it to work.

On Sat, 25 Apr 1998 10:59:13 +0100, in <1998042510591374952@zetnet.co.uk> in sci.crypt David Hopwood hopwood@zetnet.co.uk wrote:

[...] the linear BBM is not sufficiently well hidden by the tables. Consider

OutX = TX[ 3 TA[InA] + 2 TB[InB] (mod 2)(mod p) ]

Suppose we have two plaintext blocks, (InA1 InB1) and (InA2 InB2), that map to the same value for OutX (e.g. 0B => 01 and 1D => 02 in the sample table given above).

Then we can write (taking all arithmetic to be implicitly (mod 2)(mod p)):

TX[ 3 TA[InA1] + 2 TB[InB1] ] = TX[ 3 TA[InA2] + 2 TB[InB2] ]

Yes.

Since TX is a permutation, this implies

3 TA[InA1] + 2 TB[InB1] = 3 TA[InA2] + 2 TB[InB2] or 3 TA[InA1] + 2 TB[InB1] - 3 TA[InA2] - 2 TB[InB2] = 0

Actually, subtraction and addition are the same thing mod 2, so we really have no choice but to "add" all terms.

which is a linear equation with the table entries as variables. For the example of 0B => 01 and 1D => 02, we would get

3 TA[0x0] + 2 TB[0xB] - 3 TA[0x1] - 2 TB[0xD] = 0

Yes.

Blocks that map to the same value for OutY can also be used; they will yield equations of the form

2 TA[InA1'] + 3 TB[InB1'] - 2 TA[InA1'] - 3 TB[InB2'] = 0

Sure.

If we can find 32 suitable linearly independent equations (which simply requires having enough known or chosen text), we can solve for the 32 unknowns TA[0]..TA[15] and TB[0]..TB[15].

OK, this seems to be the problem: The apparently reasonable idea that 32 such equations exist (or that even 2 equations exist with the same 2 unknowns) appears false. To see why in a manageable size, let's go back to the 4-bit mixing table from the earlier 4-bit block cipher article:

X = 3A + 2B (mod 2)(mod p), p = 111 Y = 2A + 3B (mod 2)(mod p)

Here A selects a row, B selects a column, and the selected entry is XY.

     0   1   2   3

0 00 23 31 12 1 32 11 03 20 2 13 30 22 01 3 21 02 10 33

Let's assume that we know TA[] and TB[] so we can see inside the computation and group things appropriately. Without loss of generality and for analytic convenience only, we assume that TA[] and TB[] are the identity substitution. Let's start with X, which is 3 TA[] + 2 TB[], and group the equations which produce the same X value:

3 TA[0] + 2 TB[0] = 0 3 TA[1] + 2 TB[2] = 0 3 TA[2] + 2 TB[3] = 0 3 TA[3] + 2 TB[1] = 0

3 TA[0] + 2 TB[3] = 1 3 TA[1] + 2 TB[1] = 1 3 TA[2] + 2 TB[0] = 1 3 TA[3] + 2 TB[2] = 1

3 TA[0] + 2 TB[1] = 2 3 TA[1] + 2 TB[3] = 2 3 TA[2] + 2 TB[2] = 2 3 TA[3] + 2 TB[0] = 2

3 TA[0] + 2 TB[2] = 3 3 TA[1] + 2 TB[0] = 3 3 TA[2] + 2 TB[1] = 3 3 TA[3] + 2 TB[3] = 3

This is every possible combination for the 2-bit input values InA and InB, grouped by the internal 2-bit X results for all 16 possible 4-bit "messages." So we have 16 equations, and can similarly do Y to get 16 more. But note that this is not 16 different views of 2 unknown variables similar to what we saw in Linear Algebra problems. Since each table element is a different unknown, we must collect all 4 of the equations in each group just to cover the unknowns once. And since this is mod 2, when we collect an even number of expressions which produce the same value, they sum to zero:

3TA[0]+2TB[0]+3TA[1]+2TB[2]+3TA[2]+2TB[3]+3TA[3]+2TB[1] = 0 3TA[0]+2TB[3]+3TA[1]+2TB[1]+3TA[2]+2TB[0]+3TA[3]+2TB[2] = 0 3TA[0]+2TB[1]+3TA[1]+2TB[3]+3TA[2]+2TB[2]+3TA[3]+2TB[0] = 0 3TA[0]+2TB[2]+3TA[1]+2TB[0]+3TA[2]+2TB[1]+3TA[3]+2TB[3] = 0

So now we apparently have 4 equations and 8 unknowns.

But if we look at these equations and re-arrange the terms, we find that we actually have just one equation, expressed 4 times. This equation is just the sum of all input possibilities in the X mixing function. And we can do the same thing with Y.

Now, in the 8-bit size, we have 16 unknowns in each 4-bit table, and 256 possible 8-bit "messages." So we start out with 16 groups of 16 equations, and then get 16 equations which each cover all 32 input variables exactly once. But all of these equations are the same, so in the end we have just 2 different equations (X and Y), and this is not going to solve the system.

Once the TA and TB tables are known, it is trivial to find TX and TY.

Yup.

I haven't implemented code to solve the system of equations (my linear algebra is a little rusty), [...]

So far, I don't see that it works, but perhaps someone has a better idea.


Terry Ritter ritter@io.com http://www.io.com/~ritter/ The Crypto Glossary: http://www.io.com/~ritter/GLOSSARY.HTM

From: David Hopwood hopwood@zetnet.co.uk Newsgroups: sci.crypt Subject: Re: Break This 8-Bit Block Cipher (long) Date: Fri, 15 May 1998 17:47:47 +0100 Message-ID: 1998051517474774952@zetnet.co.uk References: <353a6292.2214286@news.io.com> <1998042510591374952@zetnet.co.uk> <35474c6f.949708@news.io.com> Lines: 256

-----BEGIN PGP SIGNED MESSAGE-----

In message <35474c6f.949708@news.io.com> ritter@io.com (Terry Ritter) wrote:

OK, I've looked at the attack, and I'm having trouble getting it to work.

Yes, so did I when I looked at it more closely, but I've now fixed it.

There is not enough information to uniquely determine the table entries; however the key space is divided into classes of equivalent keys, and it is possible to find which equivalence class was used (which is all that a cryptanalyst needs). I've written a Java program to implement this, at

http://www.users.zetnet.co.uk/hopwood/crypto/ritter/kBit.zip

Try "java kBitDemo -known -showtables 8 19", for example (you will need to install the JDK from www.javasoft.com, if you don't already have it).

Instances with a block size of up to 20 bits can be solved in less than a minute on a PC (providing the -lowmemory option is used). The attack requires more known plaintext than I originally thought would be needed, though (the number of plaintexts required is a roughly constant fraction of the number of possible input blocks, although it may be possible to improve that).

On Sat, 25 Apr 1998 10:59:13 +0100, in <1998042510591374952@zetnet.co.uk> in sci.crypt David Hopwood hopwood@zetnet.co.uk wrote:

[...] the linear BBM is not sufficiently well hidden by the tables. Consider

OutX = TX[ 3 TA[InA] + 2 TB[InB] (mod 2)(mod p) ]

Suppose we have two plaintext blocks, (InA1 InB1) and (InA2 InB2), that map to the same value for OutX (e.g. 0B => 01 and 1D => 02 in the sample table given above).

Then we can write (taking all arithmetic to be implicitly (mod 2)(mod p)):

TX[ 3 TA[InA1] + 2 TB[InB1] ] = TX[ 3 TA[InA2] + 2 TB[InB2] ]

Yes.

It turns out to be easier to handle a single equation for each plaintext/ciphertext pair, with entries in the inverse tables for TX and TY (call them ITX and ITY) as additional unknowns:

3 TA[InA] + 2 TB[InB] + ITX[OutX] = 0 2 TA[InA] + 3 TB[InB] + ITY[OutY] = 0

The advantage of doing this is that the full set of these equations completely characterise the overall permutation, so a key is equivalent to the original key if-and-only-if it is a solution to the equations.

[...]

If we can find 32 suitable linearly independent equations (which simply requires having enough known or chosen text), we can solve for the 32 unknowns TA[0]..TA[15] and TB[0]..TB[15].

OK, this seems to be the problem: The apparently reasonable idea that 32 such equations exist (or that even 2 equations exist with the same 2 unknowns) appears false.

Yes, you're right. That's a consequence of the balanced mixing structure; if no variables are known initially, any linear combination of the equations will have at least three unknown terms.

However, it turns out that the equations have mm(m-1) solutions (where m is 2^(k/2), i.e. the size of a table, or equivalently the number of elements in the field). Since we only need one of those solutions, it's possible to give fixed values to some of the table entries (in practice, 3 of them), and then solve for the rest.

First, here's an argument for why there are at least (and I think always exactly) mm(m-1) solutions:

Let s, t and u be constant elements in the field, with s != 0 f(x) = sx + t g(x) = sx + u u(x) = (x + 3t + 2u)/s v(x) = (x + 2t + 3u)/s

'.' denotes functional composition.

For all a and b, TX.u[L0(f.TA[a], g.TB[b])] = TX.u[3s TA[a] + 3t + 2s TB[b] + 2u] = TX[((3s TA[a] + 2s TB[b] + 3t + 2u) + 3t + 2u)/s] = TX[3 TA[a] + 2 TB[b]] = TX[L0(TA[a], TB[b])] and TY.v[L1(f.TA[a], g.TB[b])] = TY.v[2s TA[a] + 2t + 3s TB[b] + 3u] = TY[((2s TA[a] + 3s TB[b] + 2t + 3u) + 2t + 3u)/s] = TY[2 TA[a] + 3 TB[b]] = TY[L1(TA[a], TB[b])]

I.e. if (TA, TB, TX, TY) is a solution, then so is (f.TA, g.TB, TX.u, TY.v) for each s != 0, t, and u. These represent equivalent keys (in the usual sense of representing the same mapping from plaintext to ciphertext).

There are m possibilities for each of t and u, and m-1 possibilities for s, so this proves that the number of solutions is either zero, or at least mm(m-1). We know that there is at least one solution (the original key), so there must be at least mm(m-1). [I can't immediately see how to show that there are no more than that, but it's not really critical.]

This means that without loss of generality, we can set

f.TA[i] = 0 f.TA[j] = 1 g.TB[k] = 0

for any indices i, j and k where i != j.

[There will always be an equivalent key that satisfies the above, since the equations

s TA[i] + u = 0 s TA[j] + u = 1 s TB[k] + t = 0

always have a solution.]

====== Let's try an example by hand with 4-bit blocks:

  0   1   2   3

0 00 23 31 12 1 32 11 03 20 2 13 30 22 01 3 21 02 10 33

The original table entries are:

TA[0] = 0, TB[0] = 0, TX[0] = 0, TY[0] = 0 TA[1] = 1, TB[1] = 1, TX[1] = 1, TY[1] = 1 TA[2] = 2, TB[2] = 2, TX[2] = 2, TY[2] = 2 TA[3] = 3, TB[3] = 3, TX[3] = 3, TY[3] = 3

We'll use the names TA', TB', ITX' and ITY' for the tables that will be found by cryptanalysis (i.e. f.TA, g.TB, (TX.u)^-1 and (TY.v)^-1)).

Arbitrarily pick i = 1, j = 3, k = 2. (If doing a known plaintext attack, we could choose these values to correspond to the terms of the first available equation, so that additional variables can be found as quickly as possible.)

Set TA'[1] = 0, TA'[3] = 1, TB'[2] = 0.

I can't work out GF(2^k) multipliction in my head, so here is a table for multiplication:

Now repeatedly pick equations for which all but one term is known:

3 TA'[1] + 2 TB'[2] + 1 ITX'[0] = 0 => ITX'[0] = 0

3 TA'[3] + 2 TB'[1] + 1 ITX'[0] = 0 => 2 TB'[1] = 3 => TB'[1] = 2

3 TA'[1] + 2 TB'[1] + 1 ITX'[1] = 0 => ITX'[1] = 3

2 TA'[3] + 3 TB'[2] + 1 ITY'[0] = 0 => ITY'[0] = 2

2 TA'[2] + 3 TB'[1] + 1 ITY'[0] = 0 => 2 TA'[2] = 3 TA'[2] = 2

3 TA'[2] + 2 TB'[3] + 1 ITX'[0] = 0 => 2 TB'[3] = 1 TB'[3] = 3

3 TA'[0] + 2 TB'[3] + 1 ITX'[1] = 0 => 3 TA'[0] = 2 => TA'[0] = 3 (check: TA entries sum to 0)

3 TA'[0] + 2 TB'[0] + 1 ITX'[0] = 0 => 2 TB'[0] = 2 TB'[0] = 1 (check: TB entries sum to 0)

3 TA'[0] + 2 TB'[1] + 1 ITX'[2] = 0 => ITX'[2] = 1

3 TA'[0] + 2 TB'[2] + 1 ITX'[3] = 0 => ITX'[3] = 2 (check: ITX entries sum to 0)

2 TA'[0] + 3 TB'[2] + 1 ITY'[1] = 0 => ITY'[1] = 1

2 TA'[0] + 3 TB'[3] + 1 ITY'[2] = 0 => ITY'[2] = 3

2 TA'[0] + 3 TB'[1] + 1 ITY'[3] = 0 => ITY'[3] = 0 (check: ITY entries sum to 0)

In summary:

TA'[0] = 3, TB'[0] = 1, ITX'[0] = 0, ITY'[0] = 2 TA'[1] = 0, TB'[1] = 2, ITX'[1] = 3, ITY'[1] = 1 TA'[2] = 2, TB'[2] = 0, ITX'[2] = 1, ITY'[2] = 3 TA'[3] = 1, TB'[3] = 3, ITX'[3] = 2, ITY'[3] = 0

Invert ITX' and ITY':

TX'[0] = 0, TY'[0] = 3 TY'[1] = 2, TY'[1] = 1 TY'[2] = 3, TY'[2] = 0 TY'[3] = 1, TY'[3] = 2

which gives the same output table as for the original key:

  0   1   2   3

0 00 23 31 12 1 32 11 03 20 2 13 30 22 01 3 21 02 10 33

Voila.


David Hopwood hopwood@zetnet.co.uk PGP public key: http://www.users.zetnet.co.uk/hopwood/public.asc Key fingerprint = 71 8E A6 23 0E D3 4C E5 0F 69 8C D4 FA 66 15 01 Key type/length = RSA 2048-bit (always check this as well as the fingerprint)

-----BEGIN PGP SIGNATURE----- Version: 2.6.3i Charset: noconv

iQEVAwUBNVs3mTkCAxeYt5gVAQEj7wgA0ExYH2VRhBVruCoVX7tJkIauTWzTkgiH /OyPrLdf3D2V8/m+gm5sGZI95jVe9aRDFuQ1/kfNJC7s8YcjDRIw68NE5VGUTRBc z7YAvJa290WnquiBlIcD92KINWD0f3Em3Z1x88ohMd9CJY5xXC3HZnTi+spQIVJ2 SUyrpj+a+KKsWb+/hKEEcEFq/sUul2BbuD0mq7h+SHJ3IiqAVZ8cSRTnYcdljQXt AAmeOHLkQQ/iXARaJucJAuqoTViIrJRVzR1JnSmhBMcr8Ax6BF0Z4QGTJ0qTvwtZ 8pSFQFBSaGZapHPNLDocrLFEhbSfkgLVIPZcnH+gxzetZojvAv1dhA== =5Ucy -----END PGP SIGNATURE-----

/hopwood@zetnet.co.uk

Terry Ritter, hiscurrent address, and histop page.

Last updated: 1998-05-29