David Soukal - Profile on Academia.edu

Papers by David Soukal

Proceedings of SPIE, Mar 21, 2005

In this paper, we show that the communication channel known as writing in memory with defective cells [1][2] is a relevant information-theoretical model for a specific case of passive warden steganography when the sender embeds a secret message into a subset C of the cover object X without sharing the selection channel C with the recipient. The set C could be arbitrary, determined by the sender from the cover image using a deterministic, pseudo-random, or truly random process. We call this steganography "writing on wet paper" and realize it using a simple variable-rate random linear code that gives the sender convenient flexibility and control over the embedding process and is thus suitable for practical implementation. The importance of the wet paper scenario for covert communication is discussed within the context of adaptive steganography and perturbed quantization steganography [3]. Heuristic arguments supported by tests using blind steganalysis [4] indicate that wet paper steganography provides improved steganographic security and is less vulnerable to attacks than existing methods with shared selection channels.
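The random linear code mentioned in this abstract can be illustrated with a minimal sketch. It assumes the cover is reduced to a list of bits (for example, LSBs), that `dry` lists the indices the sender may change, and that the recipient already knows the message length k; the function names and the use of Python's `random.Random` as the key-driven generator are assumptions made for illustration, not the authors' implementation. The sender solves a linear system over GF(2) restricted to the dry positions, and the recipient only multiplies by the shared matrix.

```python
import random

def shared_matrix(key, k, n):
    """Pseudo-random k x n binary matrix that both parties derive from the key."""
    rng = random.Random(key)
    return [[rng.randrange(2) for _ in range(n)] for _ in range(k)]

def solve_gf2(A, b):
    """Return one solution v of A v = b over GF(2), or None if none exists."""
    rows = [row[:] + [bit] for row, bit in zip(A, b)]
    ncols = len(A[0])
    pivots, r = [], 0
    for c in range(ncols):
        if r == len(rows):
            break
        p = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if p is None:
            continue  # no pivot in this column: free variable
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                rows[i] = [x ^ y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    # A zero row with right-hand side 1 means the system is inconsistent.
    if any(row[-1] and not any(row[:-1]) for row in rows):
        return None
    v = [0] * ncols  # free variables are set to 0
    for i, c in enumerate(pivots):
        v[c] = rows[i][-1]
    return v

def embed(cover_bits, dry, message, key):
    """Change only positions in `dry` so that D * stego = message (mod 2)."""
    n, k = len(cover_bits), len(message)
    D = shared_matrix(key, k, n)
    # Difference between the desired message and the cover's current syndrome.
    target = [m ^ (sum(D[i][j] & cover_bits[j] for j in range(n)) & 1)
              for i, m in enumerate(message)]
    H = [[D[i][j] for j in dry] for i in range(k)]  # columns of dry cells only
    flips = solve_gf2(H, target)
    if flips is None:
        raise ValueError("system not solvable; use a shorter message")
    stego = cover_bits[:]
    for j, f in zip(dry, flips):
        stego[j] ^= f
    return stego

def extract(stego_bits, k, key):
    """The recipient needs only the key and k, not the selection channel."""
    n = len(stego_bits)
    D = shared_matrix(key, k, n)
    return [sum(D[i][j] & stego_bits[j] for j in range(n)) & 1 for i in range(k)]

if __name__ == "__main__":
    rng = random.Random(7)
    cover = [rng.randrange(2) for _ in range(64)]
    dry = sorted(rng.sample(range(64), 40))
    msg = [rng.randrange(2) for _ in range(8)]
    stego = embed(cover, dry, msg, key=2005)
    print(extract(stego, len(msg), key=2005) == msg)
```

Setting the free variables to zero keeps the sketch short but does not minimize the number of changes; it only demonstrates that extraction works without knowledge of the dry set.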

This paper is an extension of our work [1] on stego key search for JPEG images published at EI SPIE in 2004. We provide a more general theoretical description of the methodology, apply our approach to the spatial domain, and add a method that determines the stego key from multiple images. We show that in the spatial domain the stego key search can be made significantly more efficient by working with the noise component of the image obtained using a denoising filter. The technique is tested on the LSB embedding paradigm and on a special case of embedding by noise adding (the ±1 embedding). The stego key search can be performed for a wide class of steganographic techniques, even for secret message sizes well below those detectable using known methods. The proposed strategy may prove useful to forensic analysts and law enforcement.

Steganalysis in the wide sense consists of first identifying suspicious objects and then further analysis during which we try to identify the steganographic scheme used for embedding, recover the stego key, and finally extract the hidden message. In this paper, we present a methodology for identifying the stego key in key-dependent steganographic schemes. Previous approaches for stego key search were exhaustive searches looking for some recognizable structure (e.g., header) in the extracted bit-stream. However, if the message is encrypted, the search will become much more expensive because for each stego key, all possible encryption keys would have to be tested. In this paper, we show that for a very wide range of steganographic schemes, the complexity of the stego key search is determined only by the size of the stego key space and is independent of the encryption algorithm. The correct stego key can be determined through an exhaustive stego key search by quantifying statistical pr...

In this paper, we show that the communication channel known as writing in memory with defective cells [1][2] is a relevant information-theoretical model for a specific case of passive warden steganography when the sender embeds a secret message into a subset C of the cover object X without sharing C with the recipient. The set C, also called the selection channel, could be arbitrary, determined by the sender from the cover object using a deterministic, pseudo-random, or truly random process. We call this steganography “writing on wet paper” and realize it using a simple variable-rate random linear code that gives the sender convenient flexibility and control over the embedding process and is thus suitable for practical implementation. The importance of the wet paper scenario for covert communication is discussed within the context of adaptive steganography and perturbed quantization steganography [3]. Heuristic arguments supported by tests using blind steganalysis [4] indicate ...

In this paper, we use the recently proposed wet paper codes and introduce a new approach to passive-warden steganography called Perturbed Quantization. In Perturbed Quantization, the sender hides data while processing the cover object with an information-reducing operation that involves quantization, such as lossy compression, downsampling, or A/D conversion. The unquantized values of the processed cover object are used as side information to confine the embedding changes to those unquantized elements whose values are close to the middle of quantization intervals. This choice of the selection channel calls for wet paper codes, as they enable communication with a non-shared selection channel. A heuristic argument is presented indicating that the proposed method provides better steganographic security than current JPEG steganographic methods. This claim is further supported by blind steganalysis of a specific case of Perturbed Quantization for recompressed JPEG images.

Modern attackers increasingly exploit search engines as a vehicle to identify vulnerabilities and to gather information for launching new attacks. In this paper, we perform a large-scale quantitative analysis of bot queries received by the Bing search engine over month-long periods. Our analysis is based on an automated system, called SBotScope, that we developed to dissect large-scale bot queries. Specifically, we answer the questions “what are the bot queries searching for?” and “who is submitting these queries?” Our study shows that 33% of bot queries search for vulnerabilities, followed by 11% that harvest user account information. In one of our 16-day datasets, we uncover 8.2 million hosts from botnets and 13,364 hosts from data centers submitting bot queries. To the best of our knowledge, our work is the first large-scale effort toward systematically understanding bot query intentions and the scale of the malicious attacks associated with them.

On Steganographic Embedding Efficiency

Information Hiding, 2006

On Steganographic Embedding Efficiency. Jessica Fridrich, Petr Lisonek, and David Soukal ... q , Ext(Emb(x, m)) = m. In other words, (3) means that we can embed any message from M in any x and (4) states that we can do it by imposing at most R changes. ...

Advanced steganographic and steganalytic methods in the spatial domain

Writing on Wet Paper with LT Codes

Proceedings of the 2004 multimedia and security workshop on Multimedia and security - MM&Sec '04, 2004

In this paper, we introduce a new approach to passive-warden steganography in which the sender embeds the secret message into a certain subset of the cover object without having to share the selection channel with the recipient. An appropriate information-theoretical model for this communication is writing in memory with (a large number of) defective cells [1]. We describe a simple variable-rate random linear code for this channel (the "wet paper" code) and use it to develop a new steganographic methodology for digital media files: Perturbed Quantization. In Perturbed Quantization, the sender hides data while processing the cover object with an information-reducing operation, such as lossy compression, downsampling, or A/D conversion. The sender uses the cover object before processing as side information to confine the embedding changes to those elements of the processed cover object whose values are the most "uncertain". This informed-sender embedding and uninformed-recipient message extraction improves steganographic security because an attacker cannot easily determine the location of embedding changes from the processed stego object. A heuristic argument, supported by blind steganalysis [2], indicates that a specific case of Perturbed Quantization for JPEG images is significantly less detectable than current JPEG steganographic methods.

Security, Steganography, and Watermarking of Multimedia Contents VII, 2005

In this paper, we propose a new method for estimating the number of embedding changes for non-adaptive ±k embedding in images. By modeling the cover image and the stego noise as an additive mixture of random processes, the stego message is estimated from the stego image using a denoising filter in the wavelet domain. The stego message estimate is further analyzed using ML/MAP estimators to identify the pixels that were modified during embedding. For non-adaptive ±k embedding, the density of embedding changes is estimated from selected segments of the stego image. It is shown that for images with a low level of noise (e.g., for decompressed JPEG images) this approach can detect and estimate the number of embedding changes even for small values of k, such as k=2, and in some cases even for k=1.

Security, Steganography, and Watermarking of Multimedia Contents VII, 2005

In this paper, we propose a new method for estimating the number of embedding changes for non-adaptive ±K embedding in images. The method uses a high-pass FIR filter and then recovers an approximate message length using a Maximum Likelihood Estimator on those stego image segments where the filtered samples can be modeled using a stationary Generalized Gaussian random process. It is shown that for images with a low noise level, such as decompressed JPEG images, this method can accurately estimate the number of embedding changes even for K = 1 and for embedding rates as low as 0.2 bits per pixel. Although for raw, never compressed images the message length estimate is less accurate, when used as a scalar parameter for a classifier detecting the presence of ±K steganography, the proposed method gave us relatively reliable results for embedding rates as low as 0.5 bits per pixel.

Security, Steganography, and Watermarking of Multimedia Contents VIII, 2006

Wet paper codes were previously proposed as a tool for construction of steganographic schemes with arbitrary (non-shared) selection channels. In this paper, we propose a new approach to wet paper codes using random linear codes of small codimension that at the same time improves the embedding efficiency (number of random message bits embedded per embedding change). Practical algorithms are given and their performance is evaluated experimentally and compared to theoretically achievable bounds. An approximate formula for the embedding efficiency of the proposed scheme is derived. The proposed coding method can be modularly combined with most steganographic schemes to improve their security.

Detection of Copy-Move Forgery in Digital Images Using SIFT Algorithm

2008 IEEE Pacific-Asia Workshop on Computational Intelligence and Industrial Application, 2008

... (3) Divide 1 K ∅ and 2 K ∅ respectively like (1) and repeat (1) (2) until every keypoint in K ∅ is matched with each other. 3. Experimental Results ... 3, pp.371~377. [4] DG Lowe (2004). Distinctive Image Features from Scale-Invariant Keypoints. ...

Steganography is the art of covert (undetectable) communication in which secret data is embedded in innocuous-looking messages by slightly modifying them. The detectability of secret data is influenced, among other factors, by the placement of embedding changes within the message and by the embedding efficiency, that is, the number of bits embedded per embedding change. In this paper, we first show that codes for memory with defective cells enable steganographic communication without sharing the placement of embedding changes. This allows construction of a new class of steganographic schemes with improved security. We then describe an efficient coding method for memory with defective cells based on the LT process that is suitable for steganographic applications. In the second part of the paper, we explore a different approach based on random linear codes on small blocks with the goal of decreasing the number of embedding changes. The embedding efficiency of this approach is compared to theoretically achievable bounds.
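To make the notion of embedding efficiency concrete, here is a small sketch that uses binary Hamming matrix embedding, a standard textbook scheme swapped in for illustration; it is not the random-linear-code construction described in the abstract. It embeds p bits into a block of 2^p - 1 cover bits with at most one change, so the expected number of changes is 1 - 2^(-p) and the efficiency is p / (1 - 2^(-p)). All function names are hypothetical.

```python
import random

def syndrome(bits):
    """XOR of the 1-based indices of set bits (Hamming-code syndrome as an int)."""
    s = 0
    for idx, b in enumerate(bits, start=1):
        if b:
            s ^= idx
    return s

def embed_block(cover, msg):
    """Embed integer msg into a block of 2**p - 1 bits by flipping at most one bit."""
    n = len(cover)
    if n & (n + 1) != 0 or not 0 <= msg <= n:
        raise ValueError("block length must be 2**p - 1 and msg must fit in p bits")
    pos = syndrome(cover) ^ msg  # 0 means the block already carries msg
    stego = cover[:]
    if pos:
        stego[pos - 1] ^= 1
    return stego

def extract_block(stego):
    """The message is simply the syndrome of the stego block."""
    return syndrome(stego)

def theoretical_efficiency(p):
    """Bits embedded per change for uniformly random messages."""
    return p / (1 - 2 ** -p)

def empirical_efficiency(p, trials=10000, seed=0):
    """Monte Carlo estimate of the efficiency; returns bits per change."""
    rng = random.Random(seed)
    n = 2 ** p - 1
    changes = 0
    for _ in range(trials):
        cover = [rng.randrange(2) for _ in range(n)]
        msg = rng.randrange(2 ** p)
        stego = embed_block(cover, msg)
        changes += sum(a != b for a, b in zip(cover, stego))
    return p * trials / changes

if __name__ == "__main__":
    for p in (2, 3, 4):
        print(p, theoretical_efficiency(p), empirical_efficiency(p))
```

For p = 3 the formula gives 3 / (7/8) = 24/7, roughly 3.43 bits per change, compared with 2 bits per change for plain LSB replacement of random message bits.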

Proceedings of the 19th international conference on World wide web - WWW '10, 2010

In this paper, we propose a semi-supervised learning approach for distinguishing program-generated (bot) web search traffic from that of genuine human users. The work is motivated by the challenge that the enormous amount of search data poses to traditional approaches that rely on fully annotated training samples. We propose a semi-supervised framework that addresses the problem on multiple fronts. First, we use the CAPTCHA technique and simple heuristics to extract from the data logs a large set of training samples with initial labels, though directly using these training data is problematic because the data thus sampled are biased. To tackle this problem, we further develop a semi-supervised learning algorithm that takes advantage of the unlabeled data to improve classification performance. These two proposed algorithms can be seamlessly combined and are very cost-efficient for scaling the training process. In our experiment, the proposed approach showed significant (i.e., 2:1) improvement compared to the traditional supervised approach.

Writing on wet paper

Security, Steganography, and Watermarking of Multimedia Contents VII, 2005

Security and Watermarking of Multimedia Contents V, 2003

In this paper, we describe a new higher-order steganalytic method called Pairs Analysis for detection of secret messages embedded in digital images. Although the approach is in principle applicable to many different steganographic methods as well as image formats, it is ideally suited to 8-bit images, such as GIF images, where message bits are embedded in LSBs of indices to an ordered palette. The EzStego algorithm [4] with random message spread and optimized palette order is used as an embedding archetype on which we demonstrate Pairs Analysis and compare its performance with the chi-square attacks [5,7] and our previously proposed RS steganalysis [2]. Pairs Analysis enables more reliable and accurate message detection than previous methods. The method was tested on databases of GIF images of natural scenes, cartoons, and computer-generated images. The experiments indicate that the relative steganographic capacity of the EzStego algorithm with random message spread is less than 10% of the total image capacity (0.1 bits per pixel).

Security, Steganography, and Watermarking of Multimedia Contents VI, 2004
