Pierre Moulin - Academia.edu

Papers by Pierre Moulin

Universal Fingerprinting: Capacity and Random-Coding Exponents

arXiv (Cornell University), Jan 24, 2008

This paper studies fingerprinting (traitor tracing) games in which the number of colluders and the collusion channel are unknown. The fingerprints are embedded into host sequences representing signals to be protected and provide the receiver with the capability to trace back pirated copies to the colluders. The colluders and the fingerprint embedder are subject to signal fidelity constraints. Our problem setup unifies the signal-distortion and Boneh-Shaw formulations of fingerprinting. The fundamental tradeoffs between fingerprint codelength, number of users, number of colluders, fidelity constraints, and decoding reliability are then determined. Several bounds on fingerprinting capacity have been presented in recent literature. This paper derives exact capacity formulas and presents a new randomized fingerprinting scheme with the following properties: (1) the encoder and receiver assume a nominal coalition size but do not need to know the actual coalition size and the collusion channel; (2) a tunable parameter ∆ trades off false-positive and false-negative error exponents; (3) the receiver provides a reliability metric for its decision; and (4) the scheme is capacity-achieving when the false-positive exponent ∆ tends to zero and the nominal coalition size coincides with the actual coalition size. A fundamental component of the new scheme is the use of a "time-sharing" randomized sequence. The decoder is a maximum penalized mutual information decoder, where the significance of each candidate coalition is assessed relative to a threshold, and the penalty is proportional to the coalition size. A much simpler threshold decoder that satisfies properties (1)-(3) above but not (4) is also given.

Strong converse for Gel'fand-Pinsker channel

A strong converse for the Gel'fand-Pinsker channel is established in this paper. The method is then extended to a multiuser scenario. A strong converse is established for the multiple-access Gel'fand-Pinsker channel under the maximum error criterion, and the capacity region is determined.

The Method Of Types And Its Application To Information Hiding

Publication in the conference proceedings of EUSIPCO, Antalya, Turkey, 2005

Information-theoretic analysis of spherical fingerprinting

2009 Information Theory and Applications Workshop, 2009

Information-theoretic performance limits of digital fingerprinting systems subject to almost-sure squared-error distortion constraints on the fingerprint embedder and the colluders are derived in this paper. The rate of the fingerprinting code is R = (1/N) log M, where N is the codelength and M is the number of users. No assumption is made on the host signal statistics, but the collusion channel is also subject to a location-invariant condition. The receiver knows neither the collusion channel nor even the number of colluders. Capacity is the supremum of achievable rates and is shown to be equal to (1/(2K)) log(1 + D_f/(K D_c)), where K is the number of colluders, and D_f and D_c are the L2-distortion tolerance levels for the fingerprint embedder and the colluders, respectively. The worst collusion is shown to consist of uniform linear averaging of the coalition's marked copies followed by addition of independent spherical noise. Positive error exponents are achieved at all rates below capacity using random spherical fingerprinting codes and a new universal decoding criterion based on empirical Gaussian mutual information. It is also shown that minimum-distance decoding fails for this problem, and that a simple single-user decoder is almost as good as the universal decoder for large K. Geometric interpretations for all the results are given.
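As a quick numerical illustration of the capacity formula above, the sketch below (function name and parameter values are ours, not the paper's) evaluates C = (1/(2K)) log(1 + D_f/(K D_c)) for a few coalition sizes:

```python
import math

def spherical_fp_capacity(K, Df, Dc):
    """Capacity (nats per host sample) of the spherical fingerprinting
    game with K colluders, embedder distortion Df, colluder distortion Dc."""
    return math.log(1.0 + Df / (K * Dc)) / (2.0 * K)

# Capacity decays quickly in the coalition size K.
for K in (1, 2, 5, 10):
    print(K, round(spherical_fp_capacity(K, Df=1.0, Dc=1.0), 4))
```

The rapid decay with K is consistent with the abstract's observation that a simple single-user decoder is nearly optimal for large coalitions.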

Capacity and optimal collusion attack channels for Gaussian fingerprinting games

SPIE Proceedings, 2007

In this paper, we focus on the case of two colluders for simplicity of notation. Although cumbersome, it is conceptually straightforward to extend to the case where there are more than two colluders.

Maximin optimality of the arcsine fingerprinting distribution and the interleaving attack for large coalitions

2010 IEEE International Workshop on Information Forensics and Security, 2010

An information-theoretic model for image watermarking and data hiding

Proceedings 2000 International Conference on Image Processing (Cat. No.00CH37101)

On fingerprinting capacity games for arbitrary alphabets and their asymptotics

2012 IEEE International Symposium on Information Theory Proceedings, 2012

The fingerprinting capacity has recently been derived as the value of a two-person zero-sum game. In this work, we study the fingerprinting capacity games with k pirates in a new collusion model called the mixed digit model, which is inspired by the combined digit model of Škorić et al. For small k, the capacities along with optimal strategies for both players of the game are obtained explicitly. For large k, we extend our earlier asymptotic analysis for the binary alphabet with the marking assumption to q-ary alphabets with this general model and show that the capacity is asymptotic to A/(2k² ln q), where the constant A is specified as the maximin value of a functional game. Saddle-point solutions to the game are obtained using methods of variational calculus. For the special case of q-ary fingerprinting in the restricted digit model, we show that the interleaving attack is asymptotically optimal, a property that has motivated the design of optimized practical codes.

Capacity-achieving fingerprint decoding

2009 First IEEE International Workshop on Information Forensics and Security (WIFS), 2009

Capacity and Random-Coding Error Exponent for Public Fingerprinting Game

2006 IEEE International Symposium on Information Theory, 2006

Capacity and random-coding error exponent formulas are derived for a public fingerprinting (traitor tracing) game. The original media copy is available to the encoder, but not to the decoder. We derive the random-coding error exponent for a stacked binning scheme. The exponent is strictly positive at all rates below capacity. The converse part of the capacity proof is based on the Gel'fand-Pinsker technique.

Universal fingerprinting: Capacity and random-coding exponents

2008 IEEE International Symposium on Information Theory, 2008

This paper studies fingerprinting games in which the number of colluders and the collusion channel are unknown. The fingerprints are embedded into host sequences representing signals to be protected and provide the receiver with the capability to trace back pirated copies to the colluders. The colluders and the fingerprint embedder are subject to signal fidelity constraints. Our problem setup unifies the signal-distortion and Boneh-Shaw formulations of fingerprinting. The fundamental tradeoffs between fingerprint codelength, number of users, and fidelity constraints are then determined. Several bounds on fingerprinting capacity have been presented in recent literature. This paper derives exact capacity formulas and presents a new randomized fingerprinting scheme with the following properties: (1) the encoder and receiver do not need to know the coalition size and collusion channel; (2) a tunable parameter ∆ trades off false-positive and false-negative error exponents; (3) the receiver provides a reliability metric for its decision; and (4) the scheme is capacity-achieving when the false-positive exponent ∆ tends to zero and the coalition size is known to the encoder. A fundamental component of the new scheme is the use of a "time-sharing" randomized sequence. The decoder is a maximum penalized mutual information decoder, where the significance of each candidate coalition is assessed relative to a threshold, and the penalty is proportional to the coalition size. A much simpler threshold decoder that satisfies properties (1)-(3) above but not (4) is also given.

Optimal Gaussian fingerprint decoders

2009 IEEE International Conference on Acoustics, Speech and Signal Processing, 2009

This paper proposes codes that achieve the fundamental capacity limits of digital fingerprinting subject to mean-squared distortion constraints on the fingerprint embedder and the colluders. We first show that the traditional method of fingerprint decoding by thresholding correlation statistics falls short of this goal: reliable performance is impossible at code rates greater than some value C1 that is strictly less than capacity. To bridge the gap to capacity, a more powerful decoding method is needed. The Maximum Penalized Gaussian Mutual Information decoder presented here meets this requirement.
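For contrast with the paper's decoder, here is a schematic pure-Python sketch of the baseline correlation-threshold accusation rule the abstract refers to (function names, fingerprints, and the threshold are ours; the paper's point is that no such rule is reliable above a rate C1 below capacity):

```python
def correlate(u, v):
    """Inner product of two equal-length real sequences."""
    return sum(a * b for a, b in zip(u, v))

def correlation_accuse(fingerprints, forgery, tau):
    """Accuse every user whose correlation statistic with the forgery
    meets the threshold tau -- the classical decoder shown to be
    rate-limited strictly below capacity."""
    return [i for i, f in enumerate(fingerprints)
            if correlate(f, forgery) >= tau]

# Toy example: the forgery is closest to user 0's fingerprint.
fps = [[1.0, 1.0, -1.0], [-1.0, 1.0, 1.0]]
forgery = [0.9, 0.8, -0.7]
print(correlation_accuse(fps, forgery, tau=1.0))  # [0]
```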

On achievable error exponents for watermarking

Security, Steganography, and Watermarking of Multimedia Contents VII, 2005

Probability-of-error exponents have recently been derived for watermarking systems based on spread-spectrum and quantization-index modulation methods. This paper takes this work one step further and presents minmax error exponents for any embedding scheme and any attack (subject to distortion constraints) at all rates below capacity. The decoders used are universal: they do not know the attack used. Randomized codes outperform deterministic codes, except in the case of memoryless attacks where the same performance is obtained using either kind of code.

Steganalysis of block-structured stegotext

Security, Steganography, and Watermarking of Multimedia Contents VI, 2004

We study a detection-theoretic approach to steganalysis. The relative entropy between covertext and stegotext determines the steganalyzer's difficulty in discriminating them, which in turn defines the detectability of the stegosystem. We consider the case of Gaussian random covertexts and mean-squared embedding constraint. We derive a lower bound on the relative entropy between covertext and stegotext for block-based embedding functions. This lower bound can be approached arbitrarily closely using a spread-spectrum method and secret keys with large entropy. The lower bound can also be attained using a stochastic quantization index modulation (QIM) encoder, without need for secret keys. In general, perfect undetectability can be achieved for blockwise memoryless Gaussian covertexts. For general Gaussian covertexts with memory, the relative entropy increases approximately linearly with the number of blocks observed by the steganalyzer. The error probabilities of the best steganalysis methods decrease exponentially with the number of blocks.
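As a one-dimensional illustration of the relative-entropy quantity that drives detectability (our own scalar simplification, not the paper's block-based bound): for two zero-mean Gaussians, the divergence vanishes exactly when the stegotext variance matches the covertext variance.

```python
import math

def kl_zero_mean_gaussians(var_stego, var_cover):
    """Relative entropy D( N(0, var_stego) || N(0, var_cover) ) in nats."""
    r = var_stego / var_cover
    return 0.5 * (r - 1.0 - math.log(r))

# Matching the covertext statistics drives the divergence, and hence
# the steganalyzer's detection ability, to zero.
print(kl_zero_mean_gaussians(1.0, 1.0))   # 0.0
print(kl_zero_mean_gaussians(1.1, 1.0))   # small positive leakage
```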

Saddle-point solution of the fingerprinting capacity game under the marking assumption

2009 IEEE International Symposium on Information Theory, 2009

We study a fingerprinting game in which the collusion channel is unknown. The encoder embeds fingerprints into a host sequence and provides the decoder with the capability to trace back pirated copies to the colluders. Fingerprinting capacity has recently been derived as the limit value of a sequence of maxmin games with mutual information as the payoff function. However, these games generally do not admit saddle-point solutions and are very hard to solve numerically. Here under the so-called Boneh-Shaw marking assumption, we reformulate the capacity as the value of a single two-person zero-sum game, and show that it is achieved by a saddle-point solution. If the maximal coalition size is k and the fingerprint alphabet is binary, we derive equations that can numerically solve the capacity game for arbitrary k. We also provide tight upper and lower bounds on the capacity. Finally, we discuss the asymptotic behavior of the fingerprinting game for large k and practical implementation issues.

Achievable rates for queue-based timing stegocodes

2009 IEEE Information Theory Workshop, 2009

This work studies stegocodes (data-hiding codes) for covert communication over timing channels. In a timing channel, the information resides in the packet interdeparture times as opposed to the packets themselves. The encoding procedure should preserve the statistics of the packet interarrival process. Our steganographic codes are based on two novel ideas, namely, queue-based codes and Shannon's encoding functions for channels with causal side information at the transmitter. The embedding of information bits is done by a Geo/Geo/1 queue servicing the interarrival times. Shannon's theory of coding with causal side information at the transmitter is then used to study the queue-based stegocodes and their information-theoretic limits. Evaluating these limits for the Geo/Geo/1 queue is a formidable computational problem. An efficient computational approach is proposed to compute the maximum achievable rate of the queue-based stegocode and the optimal encoding functions. We also design simple practical codes, test them on a communication system while taking network noise into account, and compare their performance to the mappings under study.
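A minimal discrete-time Geo/Geo/1 simulation (our own sketch, not the paper's stegocode) illustrates the queueing element: a stable queue reshapes individual interdeparture times while the long-run departure rate still matches the Bernoulli arrival rate, which is what allows an embedder to hide bits in timing without disturbing long-run traffic statistics.

```python
import random

def geo_geo_1(p_arrival, p_service, n_slots, seed=0):
    """Discrete-time Geo/Geo/1 queue: Bernoulli(p_arrival) arrivals per
    slot, Bernoulli(p_service) service completions when the queue is
    nonempty. Returns the list of departure slot indices."""
    rng = random.Random(seed)
    queue = 0
    departures = []
    for t in range(n_slots):
        if rng.random() < p_arrival:
            queue += 1
        if queue > 0 and rng.random() < p_service:
            queue -= 1
            departures.append(t)
    return departures

# With p_arrival < p_service the queue is stable, so the empirical
# departure rate tracks the arrival rate over a long run.
deps = geo_geo_1(0.3, 0.6, 100_000)
print(len(deps) / 100_000)  # close to the arrival rate 0.3
```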

The Parallel-Gaussian Watermarking Game

IEEE Transactions on Information Theory, 2004

Perfectly Secure Steganography: Capacity, Error Exponents, and Code Constructions

IEEE Transactions on Information Theory, 2008

An analysis of steganographic systems subject to the following perfect undetectability condition is presented in this paper. Following embedding of the message into the covertext, the resulting stegotext is required to have exactly the same probability distribution as the covertext. Then no statistical test can reliably detect the presence of the hidden message. We refer to such steganographic schemes as perfectly secure. A few such schemes have been proposed in recent literature, but they have vanishing rate. We prove that communication performance can potentially be vastly improved; specifically, our basic setup assumes independently and identically distributed (i.i.d.) covertext, and we construct perfectly secure steganographic codes from public watermarking codes using binning methods and randomized permutations of the code. The permutation is a secret key shared between encoder and decoder. We derive (positive) capacity and random-coding exponents for perfectly-secure steganographic systems. The error exponents provide estimates of the code length required to achieve a target low error probability. In some applications, steganographic communication may be disrupted by an active warden, modelled here by a compound discrete memoryless channel. The transmitter and warden are subject to distortion constraints. We address the potential loss in communication performance due to the perfect-security requirement. This loss is the same as the loss obtained under a weaker order-1 steganographic requirement that would just require matching of first-order marginals of the covertext and stegotext distributions. Furthermore, no loss occurs if the covertext distribution is uniform and the distortion metric is cyclically symmetric; steganographic capacity is then achieved by randomized linear codes.
Our framework may also be useful for developing computationally secure steganographic systems that have near-optimal communication performance.

Information-theoretic analysis of information hiding

IEEE Transactions on Information Theory, 2003

An information-theoretic analysis of information hiding is presented in this paper, forming the theoretical basis for design of information-hiding systems. Information hiding is an emerging research area which encompasses applications such as copyright protection for digital media, watermarking, fingerprinting, steganography, and data embedding. In these applications, information is hidden within a host data set and is to be reliably communicated to a receiver. The host data set is intentionally corrupted, but in a covert way, designed to be imperceptible to a casual analysis. Next, an attacker may seek to destroy this hidden information, and for this purpose, introduce additional distortion to the data set. Side information (in the form of cryptographic keys and/or information about the host signal) may be available to the information hider and to the decoder. We formalize these notions and evaluate the hiding capacity, which upper-bounds the rates of reliable transmission and quantifies the fundamental tradeoff between three quantities: the achievable information-hiding rates and the allowed distortion levels for the information hider and the attacker. The hiding capacity is the value of a game between the information hider and the attacker. The optimal attack strategy is the solution of a particular rate-distortion problem, and the optimal hiding strategy is the solution to a channel-coding problem. The hiding capacity is derived by extending the Gel'fand-Pinsker theory of communication with side information at the encoder. The extensions include the presence of distortion constraints, side information at the decoder, and unknown communication channel. Explicit formulas for capacity are given in several cases, including Bernoulli and Gaussian problems, as well as the important special case of small distortions.
In some cases, including the last two above, the hiding capacity is the same whether or not the decoder knows the host data set. It is shown that many existing information-hiding systems in the literature operate far below capacity.

A Neyman–Pearson Approach to Universal Erasure and List Decoding

IEEE Transactions on Information Theory, 2009

When information is to be transmitted over an unknown, possibly unreliable channel, an erasure option at the decoder is desirable. Using constant-composition random codes, we propose a generalization of Csiszár and Körner's maximum mutual information (MMI) decoder with an erasure option for discrete memoryless channels. The new decoder is parameterized by a weighting function that is designed to optimize the fundamental tradeoff between undetected-error and erasure exponents for a compound class of channels. The class of weighting functions may be further enlarged to optimize a similar tradeoff for list decoders; in that case, undetected-error probability is replaced with the average number of incorrect messages in the list. Explicit solutions are identified. The optimal exponents admit simple expressions in terms of the sphere-packing exponent, at all rates below capacity. For small erasure exponents, these expressions coincide with those derived by Forney (1968) for symmetric channels, using maximum a posteriori decoding. Thus, for those channels at least, ignorance of the channel law is inconsequential. Conditions for optimality of the Csiszár-Körner rule and of the simpler empirical-mutual-information thresholding rule are identified. The error exponents are evaluated numerically for the binary symmetric channel.
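The empirical-mutual-information thresholding rule mentioned at the end can be sketched as follows (a binary-alphabet toy version with our own function names; the paper's decoder additionally involves a weighting function, which we omit): score each codeword by the empirical mutual information of its joint type with the received sequence, and erase when no score clears the threshold.

```python
import math
from collections import Counter

def empirical_mi(x, y):
    """Empirical mutual information (nats) of the joint type of x and y."""
    n = len(x)
    pxy = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    mi = 0.0
    for (a, b), c in pxy.items():
        p = c / n
        mi += p * math.log(p * n * n / (px[a] * py[b]))
    return mi

def threshold_decode(codebook, y, T):
    """Return the index of the codeword with the highest empirical MI,
    or None (an erasure) when no codeword reaches the threshold T."""
    scores = [empirical_mi(c, y) for c in codebook]
    best = max(range(len(codebook)), key=lambda i: scores[i])
    return best if scores[best] >= T else None
```

For example, a codeword identical to the received binary sequence scores ln 2 per symbol pair, while an independent-looking one scores 0 and is erased under any positive threshold.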

Research paper thumbnail of Universal Fingerprinting: Capacity and Random-Coding Exponents

arXiv (Cornell University), Jan 24, 2008

This paper studies fingerprinting (traitor tracing) games in which the number of colluders and th... more This paper studies fingerprinting (traitor tracing) games in which the number of colluders and the collusion channel are unknown. The fingerprints are embedded into host sequences representing signals to be protected and provide the receiver with the capability to trace back pirated copies to the colluders. The colluders and the fingerprint embedder are subject to signal fidelity constraints. Our problem setup unifies the signal-distortion and Boneh-Shaw formulations of fingerprinting. The fundamental tradeoffs between fingerprint codelength, number of users, number of colluders, fidelity constraints, and decoding reliability are then determined. Several bounds on fingerprinting capacity have been presented in recent literature. This paper derives exact capacity formulas and presents a new randomized fingerprinting scheme with the following properties: (1) the encoder and receiver assume a nominal coalition size but do not need to know the actual coalition size and the collusion channel; (2) a tunable parameter ∆ trades off false-positive and false-negative error exponents; (3) the receiver provides a reliability metric for its decision; and (4) the scheme is capacity-achieving when the false-positive exponent ∆ tends to zero and the nominal coalition size coincides with the actual coalition size. A fundamental component of the new scheme is the use of a "time-sharing" randomized sequence. The decoder is a maximum penalized mutual information decoder, where the significance of each candidate coalition is assessed relative to a threshold, and the penalty is proportional to the coalition size. A much simpler threshold decoder that satisfies properties (1)-(3) above but not (4) is also given.

Research paper thumbnail of Strong converse for Gel'fand-Pinsker channel

A strong converse for the Gel'fand-Pinsker channel is established in this paper. The method is th... more A strong converse for the Gel'fand-Pinsker channel is established in this paper. The method is then extended to a multiuser scenario. A strong converse is established for the multiple-access Gel'fand-Pinsker channel under the maximum error criterion, and the capacity region is determined.

Research paper thumbnail of The Method Of Types And Its Application To Information Hiding

Publication in the conference proceedings of EUSIPCO, Antalya, Turkey, 2005

Research paper thumbnail of Information-theoretic analysis of spherical fingerprinting

2009 Information Theory and Applications Workshop, 2009

Information-theoretic performance limits of digital fingerprinting systems subject to almost-sure... more Information-theoretic performance limits of digital fingerprinting systems subject to almost-sure squared-error distortion constraints on the fingerprint embedder and the colluders are derived in this paper. The rate of the fingerprinting code is R = 1 N log M where N is codelength and M is the number of users. No assumption is made on the host signal statistics, but the collusion channel is also subject to a location-invariant condition. The receiver knows neither the collusion channel nor even the number of colluders. Capacity is the supremum of achievable rates and is shown to be equal to 1 2K log(1 + D f KDc) where K is the number of colluders, and D f and Dc are the L 2-distortion tolerance levels for the fingerprint embedder and the colluders, respectively. The worst collusion is shown to consist of uniform linear averaging of the coalition's marked copies followed by addition of independent spherical noise. Positive error exponents are achieved at all rates below capacity using random spherical fingerprinting codes and a new universal decoding criterion based on empirical Gaussian mutual information. It is also shown that minimum-distance decoding fails for this problem, and that a simple single-user decoder is almost as good as the universal decoder for large K. Geometric interpretations for all the results are given.

Research paper thumbnail of Capacity and optimal collusion attack channels for Gaussian fingerprinting games

SPIE Proceedings, 2007

paper, we focus on the case of two colluders for the simplicity of notation. Although cumbersome,... more paper, we focus on the case of two colluders for the simplicity of notation. Although cumbersome, it is conceptually straightforward to extend to the case where there are more than two colluders.

Research paper thumbnail of Maximin optimality of the arcsine fingerprinting distribution and the interleaving attack for large coalitions

2010 IEEE International Workshop on Information Forensics and Security, 2010

Research paper thumbnail of An information-theoretic model for image watermarking and data hiding

Proceedings 2000 International Conference on Image Processing (Cat. No.00CH37101)

Research paper thumbnail of On fingerprinting capacity games for arbitrary alphabets and their asymptotics

2012 IEEE International Symposium on Information Theory Proceedings, 2012

The fingerprinting capacity has recently been derived as the value of a two-person zero-sum game.... more The fingerprinting capacity has recently been derived as the value of a two-person zero-sum game. In this work, we study the fingerprinting capacity games with k pirates in a new collusion model called the mixed digit model, which is inspired by the combined digit model ofŠkorić et al. For small k, the capacities along with optimal strategies for both players of the game are obtained explicitly. For large k, we extend our earlier asymptotic analysis for the binary alphabet with the marking assumption to q-ary alphabets with this general model and show that the capacity is asymptotic to A/(2k 2 ln q) where the constant A is specified as the maximin value of a functional game. Saddle-point solutions to the game are obtained using methods of variational calculus. For the special case of qary fingerprinting in the restricted digit model, we show that the interleaving attack is asymptotically optimal, a property that has motivated the design of optimized practical codes.

Research paper thumbnail of Capacity-achieving fingerprint decoding

2009 First IEEE International Workshop on Information Forensics and Security (WIFS), 2009

Research paper thumbnail of Capacity and Random-Coding Error Exponent for Public Fingerprinting Game

2006 IEEE International Symposium on Information Theory, 2006

Capacity and random-coding error exponent formulas are derived for a public fingerprinting (trait... more Capacity and random-coding error exponent formulas are derived for a public fingerprinting (traitor tracing) game. The original media copy is available to the encoder, but not to the decoder. We derive the random-coding error exponent for a stacked binning scheme. The exponent is strictly positive at all rates below capacity. The converse part of the capacity proof is based on the Gel'fand-Pinsker technique.

Research paper thumbnail of Universal fingerprinting: Capacity and random-coding exponents

2008 IEEE International Symposium on Information Theory, 2008

This paper studies fingerprinting games in which the number of colluders and the collusion channe... more This paper studies fingerprinting games in which the number of colluders and the collusion channel are unknown. The fingerprints are embedded into host sequences representing signals to be protected and provide the receiver with the capability to trace back pirated copies to the colluders. The colluders and the fingerprint embedder are subject to signal fidelity constraints. Our problem setup unifies the signal-distortion and Boneh-Shaw formulations of fingerprinting. The fundamental tradeoffs between fingerprint codelength, number of users, and fidelity constraints are then determined. Several bounds on fingerprinting capacity have been presented in recent literature. This paper derives exact capacity formulas and presents a new randomized fingerprinting scheme with the following properties: (1) the encoder and receiver do not need to know the coalition size and collusion channel; (2) a tunable parameter ∆ trades off false-positive and false-negative error exponents; (3) the receiver provides a reliability metric for its decision; and (4) the scheme is capacity-achieving when the false-positive exponent ∆ tends to zero and the coalition size is known to the encoder. A fundamental component of the new scheme is the use of a "time-sharing" randomized sequence. The decoder is a maximum penalized mutual information decoder, where the significance of each candidate coalition is assessed relative to a threshold, and the penalty is proportional to the coalition size. A much simpler threshold decoder that satisfies properties (1)-(3) above but not (4) is also given.

Research paper thumbnail of Optimal Gaussian fingeprint decoders

2009 IEEE International Conference on Acoustics, Speech and Signal Processing, 2009

This paper proposes codes that achieve the fundamental capacity limits of digital fingerprinting ... more This paper proposes codes that achieve the fundamental capacity limits of digital fingerprinting subject to mean-squared distortion constraints on the fingerprint embedder and the colluders. We first show that the traditional method of fingerprint decoding by thresholding correlation statistics falls short of this goal: reliable performance is impossible at code rates greater than some value C1 that is strictly less than capacity. To bridge the gap to capacity, a more powerful decoding method is needed. The Maximum Penalized Gaussian Mutual Information decoder presented here meets this requirement.

Research paper thumbnail of <title>On achievable error exponents for watermarking</title>

Security, Steganography, and Watermarking of Multimedia Contents VII, 2005

Probability-of-error exponents have recently been derived for watermarking systems based on sprea... more Probability-of-error exponents have recently been derived for watermarking systems based on spread-spectrum and quantization-index modulation methods. This paper takes this work one step further and presents minmax error exponents for any embedding scheme and any attack (subject to distortion constraints) at all rates below capacity. The decoders used are universal: they do not know the attack used. Randomized codes outperform deterministic codes, except in the case of memoryless attacks where the same performance is obtained using either kind of code.

Research paper thumbnail of <title>Steganalysis of block-structured stegotext</title>

Security, Steganography, and Watermarking of Multimedia Contents VI, 2004

We study a detection-theoretic approach to steganalysis. The relative entropy between covertext a... more We study a detection-theoretic approach to steganalysis. The relative entropy between covertext and stegotext determines the steganalyzer's difficulty in discriminating them, which in turn defines the detectability of the stegosystem. We consider the case of Gaussian random covertexts and mean-squared embedding constraint. We derive a lower bound on the relative entropy between covertext and stegotext for block-based embedding functions. This lower bound can be approached arbitrarily closely using a spread-spectrum method and secret keys with large entropy. The lower bound can also be attained using a stochastic quantization index modulation (QIM) encoder, without need for secret keys. In general, perfect undetectability can be achieved for blockwise memoryless Gaussian covertexts. For general Gaussian covertexts with memory, the relative entropy increases approximately linearly with the number of blocks observed by the steganalyzer. The error probabilities of the best steganalysis methods decrease exponentially with the number of blocks.

Research paper thumbnail of Saddle-point solution of the fingerprinting capacity game under the marking assumption

2009 IEEE International Symposium on Information Theory, 2009

We study a fingerprinting game in which the collusion channel is unknown. The encoder embeds fingerprints into a host sequence and provides the decoder with the capability to trace back pirated copies to the colluders. Fingerprinting capacity has recently been derived as the limit value of a sequence of maxmin games with mutual information as the payoff function. However, these games generally do not admit saddle-point solutions and are very hard to solve numerically. Here under the so-called Boneh-Shaw marking assumption, we reformulate the capacity as the value of a single two-person zero-sum game, and show that it is achieved by a saddle-point solution. If the maximal coalition size is k and the fingerprint alphabet is binary, we derive equations that can numerically solve the capacity game for arbitrary k. We also provide tight upper and lower bounds on the capacity. Finally, we discuss the asymptotic behavior of the fingerprinting game for large k and practical implementation issues.
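A minimal sketch of what "value of a two-person zero-sum game" means computationally: fictitious play brackets the saddle value of a matrix game from above and below. The payoff matrix here is the toy matching-pennies game, purely for illustration; it is not the paper's mutual-information payoff function.

```python
def game_value_bounds(payoff, iters=5000):
    """Bracket the saddle value of a two-person zero-sum matrix game
    (row player maximizes, column player minimizes) via fictitious play."""
    m, n = len(payoff), len(payoff[0])
    row_counts = [1] + [0] * (m - 1)
    col_counts = [1] + [0] * (n - 1)
    for _ in range(iters):
        # Each player best-responds to the opponent's empirical mixture.
        i = max(range(m), key=lambda i: sum(payoff[i][j] * col_counts[j]
                                            for j in range(n)))
        row_counts[i] += 1
        j = min(range(n), key=lambda j: sum(payoff[i][j] * row_counts[i]
                                            for i in range(m)))
        col_counts[j] += 1
    x = [c / sum(row_counts) for c in row_counts]
    y = [c / sum(col_counts) for c in col_counts]
    # Guaranteed payoff of each empirical mixture sandwiches the value.
    lower = min(sum(payoff[i][j] * x[i] for i in range(m)) for j in range(n))
    upper = max(sum(payoff[i][j] * y[j] for j in range(n)) for i in range(m))
    return lower, upper

# Matching pennies: saddle value 0, attained by the mixed strategy (1/2, 1/2).
lo, hi = game_value_bounds([[1, -1], [-1, 1]])
```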

Research paper thumbnail of Achievable rates for queue-based timing stegocodes

2009 IEEE Information Theory Workshop, 2009

This work studies stegocodes (data-hiding codes) for covert communication over timing channels. In a timing channel, the information resides in the packet interdeparture times as opposed to the packets themselves. The encoding procedure should preserve the statistics of the packet interarrival process. Our steganographic codes are based on two novel ideas, namely, queue-based codes and Shannon's encoding functions for channels with causal side information at the transmitter. The embedding of information bits is done by a Geo/Geo/1 queue servicing the interarrival times. Shannon's theory of coding with causal side information at the transmitter is then used to study the queue-based stegocodes and their information-theoretic limits. Evaluating these limits for the Geo/Geo/1 queue is a formidable computational problem. An efficient computational approach is proposed to compute the maximum achievable rate of the queue-based stegocode and the optimal encoding functions. We also design simple practical codes, test them on a communication system while taking network noise into account, and compare their performance to the mappings under study.
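To make the queueing model concrete, the sketch below simulates a discrete-time Geo/Geo/1 queue: Bernoulli arrivals, geometric service. In steady state the long-run departure rate equals the arrival rate, which is what allows an encoder to re-time packets without altering the long-run traffic rate. The parameters and the simulation itself are illustrative assumptions, not the paper's construction.

```python
import random

def simulate_geo_geo_1(p_arr=0.3, p_srv=0.6, slots=200_000, seed=1):
    """Discrete-time Geo/Geo/1 queue: in each slot, a packet arrives with
    probability p_arr, and the head-of-line packet (if any) departs with
    probability p_srv. Returns the empirical departure rate."""
    rng = random.Random(seed)
    queue = 0
    departures = 0
    for _ in range(slots):
        # Service happens before the slot's arrival, so a fresh arrival
        # cannot depart in the same slot.
        if queue > 0 and rng.random() < p_srv:
            queue -= 1
            departures += 1
        if rng.random() < p_arr:
            queue += 1
    return departures / slots

rate = simulate_geo_geo_1()
print(round(rate, 3))
```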

Research paper thumbnail of The Parallel-Gaussian Watermarking Game

IEEE Transactions on Information Theory, 2004

Research paper thumbnail of Perfectly Secure Steganography: Capacity, Error Exponents, and Code Constructions

IEEE Transactions on Information Theory, 2008

An analysis of steganographic systems subject to the following perfect undetectability condition is presented in this paper. Following embedding of the message into the covertext, the resulting stegotext is required to have exactly the same probability distribution as the covertext. Then no statistical test can reliably detect the presence of the hidden message. We refer to such steganographic schemes as perfectly secure. A few such schemes have been proposed in recent literature, but they have vanishing rate. We prove that communication performance can potentially be vastly improved; specifically, our basic setup assumes independently and identically distributed (i.i.d.) covertext, and we construct perfectly secure steganographic codes from public watermarking codes using binning methods and randomized permutations of the code. The permutation is a secret key shared between encoder and decoder. We derive (positive) capacity and random-coding exponents for perfectly-secure steganographic systems. The error exponents provide estimates of the code length required to achieve a target low error probability. In some applications, steganographic communication may be disrupted by an active warden, modelled here by a compound discrete memoryless channel. The transmitter and warden are subject to distortion constraints. We address the potential loss in communication performance due to the perfect-security requirement. This loss is the same as the loss obtained under a weaker order-1 steganographic requirement that would just require matching of first-order marginals of the covertext and stegotext distributions. Furthermore, no loss occurs if the covertext distribution is uniform and the distortion metric is cyclically symmetric; steganographic capacity is then achieved by randomized linear codes.
Our framework may also be useful for developing computationally secure steganographic systems that have near-optimal communication performance.
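A toy numerical check of the final claim above: when the covertext is uniform on Z_m, adding any independent value modulo m leaves the stegotext exactly uniform, so first-order statistics reveal nothing. The alphabet size, sample size, and (deliberately non-uniform) message distribution below are illustrative assumptions.

```python
import random
from collections import Counter

rng = random.Random(0)
m = 8          # hypothetical cyclic alphabet Z_m
n = 80_000     # number of symbols

cover = [rng.randrange(m) for _ in range(n)]           # uniform covertext
message = [rng.choice([0, 3, 5]) for _ in range(n)]    # arbitrary, non-uniform
stego = [(c + b) % m for c, b in zip(cover, message)]  # cyclic embedding

counts = Counter(stego)
# Each symbol should occur about n/m times: the stegotext is still uniform,
# so no first-order statistical test can detect the embedding.
print(max(abs(counts[s] - n / m) for s in range(m)))
```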

Research paper thumbnail of Information-theoretic analysis of information hiding

IEEE Transactions on Information Theory, 2003

An information-theoretic analysis of information hiding is presented in this paper, forming the theoretical basis for design of information-hiding systems. Information hiding is an emerging research area which encompasses applications such as copyright protection for digital media, watermarking, fingerprinting, steganography, and data embedding. In these applications, information is hidden within a host data set and is to be reliably communicated to a receiver. The host data set is intentionally corrupted, but in a covert way, designed to be imperceptible to a casual analysis. Next, an attacker may seek to destroy this hidden information, and for this purpose, introduce additional distortion to the data set. Side information (in the form of cryptographic keys and/or information about the host signal) may be available to the information hider and to the decoder. We formalize these notions and evaluate the hiding capacity, which upper-bounds the rates of reliable transmission and quantifies the fundamental tradeoff between three quantities: the achievable information-hiding rates and the allowed distortion levels for the information hider and the attacker. The hiding capacity is the value of a game between the information hider and the attacker. The optimal attack strategy is the solution of a particular rate-distortion problem, and the optimal hiding strategy is the solution to a channel-coding problem. The hiding capacity is derived by extending the Gel'fand-Pinsker theory of communication with side information at the encoder. The extensions include the presence of distortion constraints, side information at the decoder, and unknown communication channel. Explicit formulas for capacity are given in several cases, including Bernoulli and Gaussian problems, as well as the important special case of small distortions.
In some cases, including the last two above, the hiding capacity is the same whether or not the decoder knows the host data set. It is shown that many existing information-hiding systems in the literature operate far below capacity.
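For background, the classical Gel'fand-Pinsker capacity formula that the paper extends (a standard channel-with-state result; the paper adds distortion constraints, decoder side information, and an unknown channel) is:

```latex
C = \max_{P_{U,X\mid S}} \bigl[\, I(U;Y) - I(U;S) \,\bigr]
```

Here $S$ is the channel state known noncausally at the encoder (in this setting, the host data), $U$ is an auxiliary random variable, $X$ is the channel input (the marked signal), and $Y$ is the channel output; the maximum is over joint distributions of $(U,X)$ given $S$.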

Research paper thumbnail of A Neyman–Pearson Approach to Universal Erasure and List Decoding

IEEE Transactions on Information Theory, 2009

When information is to be transmitted over an unknown, possibly unreliable channel, an erasure option at the decoder is desirable. Using constant-composition random codes, we propose a generalization of Csiszár and Körner's maximum mutual information (MMI) decoder with an erasure option for discrete memoryless channels. The new decoder is parameterized by a weighting function that is designed to optimize the fundamental tradeoff between undetected-error and erasure exponents for a compound class of channels. The class of weighting functions may be further enlarged to optimize a similar tradeoff for list decoders; in that case, undetected-error probability is replaced by the average number of incorrect messages in the list. Explicit solutions are identified. The optimal exponents admit simple expressions in terms of the sphere-packing exponent, at all rates below capacity. For small erasure exponents, these expressions coincide with those derived by Forney (1968) for symmetric channels, using maximum a posteriori decoding. Thus, for those channels at least, ignorance of the channel law is inconsequential. Conditions for optimality of the Csiszár-Körner rule and of the simpler empirical-mutual-information thresholding rule are identified. The error exponents are evaluated numerically for the binary symmetric channel.
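A minimal sketch of the empirical-mutual-information thresholding rule mentioned above: score each codeword by its empirical mutual information with the received sequence, decode to the best scorer, and erase if the score falls below a threshold. The two-codeword codebook and the threshold are hypothetical; the paper's weighting-function generalization is not modeled here.

```python
import math
from collections import Counter

def empirical_mi(x, y):
    """Empirical mutual information (in bits) between equal-length sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def mmi_erasure_decode(codebook, y, threshold):
    """Return the index of the codeword with maximal empirical MI against y,
    or None (erasure) if even the best score misses the threshold."""
    scores = [empirical_mi(c, y) for c in codebook]
    best = max(range(len(codebook)), key=scores.__getitem__)
    return best if scores[best] >= threshold else None

# Toy example with a hypothetical 2-codeword binary codebook.
c0 = [0, 1] * 8
c1 = [0] * 8 + [1] * 8
```

With a noiseless observation of c0, the decoder returns index 0; a constant (uninformative) observation has zero empirical MI with every codeword and triggers an erasure.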