Koichi Takagi - Academia.edu

Papers by Koichi Takagi

Visual Communications and Image Processing 2003, 2003

FGS is a video coding scheme that can control picture quality with high scalability, but its hierarchical structure makes it less efficient than single-layer video coding. This paper proposes applying three efficiency-improvement methods to the FGS encoder and decoder. The first is a bit-plane coding method that separates bits into two groups based on the distribution of significant bits. The second introduces motion compensation estimation into the enhancement layer. The third is a complement method that builds a high-quality reference picture from both the base layer and the enhancement layer. The proposed methods improve the coding efficiency in the enhancement layer of conventional FGS by 2.2% and its PSNR performance by 1.1 dB, and they reduce the propagation of drift error caused by low-rate transmission and channel congestion. The results show that the proposed approach is more efficient and better suited to video streaming on a heterogeneous network with channel deviation.

Lecture Notes in Computer Science, 2011

This paper presents a novel approach to tracking articulated human motion in monocular video. In a conventional tracking system based on particle filters, it is very challenging to track a complex human pose with many degrees of freedom. A typical solution is to track the pose in a low-dimensional latent space obtained by manifold learning techniques, e.g., the Gaussian process dynamical model (GPDM). In this paper, we extend the GPDM into a graph structure (called a GPDM graph) to better express the diverse dynamics of human motion: multiple latent spaces are constructed and dynamically connected to each other by an unsupervised learning method. The proposed model has both intra-transitions (within each latent space) and inter-transitions (among latent spaces). Moreover, the probability of an inter-transition is dynamic and depends on the current latent state. Using the proposed GPDM graph model, we can track human motion in monocular video, and in our experiments the average tracking error is lower than that of state-of-the-art methods.

2008 IEEE International Conference on Multimedia and Expo, 2008

This paper proposes an efficient data representation scheme to improve the performance of a data hiding method [1] in the MPEG compressed domain. Although [1] is reversible and keeps the quality of the modified video identical to that of the original (compressed) video, it suffers from a consistent file size increase caused by data embedding. To suppress this increase, reverse zero-run length (RZL) is proposed to encode the message efficiently. RZL utilizes the statistics of the macroblocks with respect to [1], and the distance between two excited macroblocks is used to encode a message segment. RZL simultaneously achieves high payload and high embedding efficiency, and is therefore able to suppress the file size increase caused by data embedding. We show theoretically that RZL outperforms matrix encoding in both payload and embedding efficiency for this particular data hiding method. Experiments are also carried out to verify the theoretically deduced results, and the observed results agree with the expected outcomes.

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012

In this paper, we propose a fast, accurate content-based video copy detection scheme based on a bag of global visual features, which is characterized by (1) utilizing an efficient DCT-sign-based feature for fast detection; (2) performing multiple assignment in the temporal domain, in addition to the feature and spatial domains, to ensure repeatability in segment-level matching; and (3) adopting inverse document frequency weighting and temporal burstiness-aware scoring to emphasize distinctive visual words. Despite running detection 95 times faster than real time, the proposed system achieves a false negative rate of 0.2% against queries altered by non-geometric transformations, without any false positives.
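The inverse document frequency weighting mentioned in point (3) can be illustrated with a minimal sketch. This is the generic textbook form of IDF, not the paper's scoring function: the names `idf_weights` and `match_score`, the integer visual-word IDs, and the plain sum over shared words are all assumptions made for illustration, and temporal burstiness is not modelled at all.

```python
import math
from collections import Counter

def idf_weights(segments):
    """Compute log(N / df) for every visual word (hypothetical helper).

    segments: list of lists of visual-word IDs, one list per reference segment.
    """
    n = len(segments)
    df = Counter()
    for words in segments:
        df.update(set(words))  # count each word once per segment
    return {word: math.log(n / count) for word, count in df.items()}

def match_score(query_words, reference_words, idf):
    """Sum the IDF weights of the visual words shared by two segments."""
    shared = set(query_words) & set(reference_words)
    return sum(idf.get(word, 0.0) for word in shared)

# Word 1 occurs in every segment, so it carries no weight.
segments = [[1, 2], [1, 3], [1, 4]]
idf = idf_weights(segments)
score = match_score([1, 2], [1, 2], idf)
```

A word that appears in every segment gets weight zero, so only the rarer, more distinctive words contribute to the match score.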

2008 3rd International Symposium on Communications, Control and Signal Processing, 2008

This paper proposes a scalable video scrambling method in the MPEG domain. The scrambling strength can be controlled to produce video with a distortion level ranging from insignificant to significant. The AC components in each coefficient block are first scalably transformed according to the desired level of distortion. Then, all pairs consisting of a zero-run length and a non-zero component level are locally shuffled within each coefficient block. Furthermore, all coefficient blocks are globally shuffled within the region to be scrambled. The scrambled video stream can be descrambled with the correct key to restore the original video. The basic performance of this method is verified through computer simulations.
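The local shuffling step can be pictured with a minimal sketch, assuming the run-level pairs of one coefficient block are available as a list of tuples and the key is an integer. The function names are hypothetical, and Python's `random.Random` stands in for whatever keyed generator the paper uses; it is not cryptographically secure and is only meant to show that the same key reproduces the permutation and therefore inverts it.

```python
import random

def keyed_permutation(n, key):
    """Derive a repeatable permutation of range(n) from an integer key."""
    rng = random.Random(key)  # illustration only, not a secure generator
    perm = list(range(n))
    rng.shuffle(perm)
    return perm

def scramble_pairs(pairs, key):
    """Reorder the (zero-run, level) pairs of one coefficient block."""
    perm = keyed_permutation(len(pairs), key)
    return [pairs[i] for i in perm]

def descramble_pairs(scrambled, key):
    """Invert scramble_pairs by regenerating the same permutation."""
    perm = keyed_permutation(len(scrambled), key)
    original = [None] * len(scrambled)
    for position, source in enumerate(perm):
        original[source] = scrambled[position]
    return original

# Hypothetical run-level pairs from one block.
pairs = [(0, 5), (2, -3), (1, 1), (4, 2)]
scrambled = scramble_pairs(pairs, key=1234)
restored = descramble_pairs(scrambled, key=1234)
```

Because the shuffle only reorders existing pairs, the set of symbols in the block is unchanged; the global shuffle of whole blocks within a region could reuse the same helper on a list of blocks.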

IEICE Transactions on Information and Systems, 2012

This paper presents a system that automatically generates dancing animation synchronized with a piece of music by re-using motion capture data. The dancing motion is synthesized according to the rhythm and intensity features of the music. For this purpose, we propose a novel meta motion graph structure that embeds the necessary features, including both rhythm and intensity, and is constructed on the motion capture database beforehand. We consider two scenarios, non-streaming music and streaming music, which require global search and local search respectively. In the former case, once a piece of music is input, an efficient dynamic programming algorithm can be employed to globally search for the best path in the meta motion graph, using an objective function that measures the quality of beat synchronization, intensity matching, and motion smoothness. In the latter case, the input music is stored in a buffer in streaming mode, and an efficient search method is applied to a certain amount of music data (called a segment) in the buffer with the same objective function, resulting in a segment-based search approach. For streaming applications, we define an additional property in the meta motion graph to deal with the unpredictable future music, which guarantees that there is some motion to match the unknown remaining music. A user study with a total of 60 subjects demonstrates that our system outperforms state-of-the-art techniques in both scenarios. Furthermore, our system greatly improves the synthesis speed (the maximum speedup is more than 500 times), which is essential for mobile applications. We have implemented our system on commercially available smartphones and confirmed that it works well on them.
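The global search in the non-streaming case can be sketched as a Viterbi-style dynamic program over a small graph. This is a generic illustration rather than the paper's algorithm: the function `best_path`, the `edges` dictionary, and the per-step `node_cost` callback are hypothetical, the cost is a stand-in for the paper's objective (beat synchronization, intensity matching, motion smoothness), and every node is assumed to have at least one successor that is itself a key of `edges`.

```python
def best_path(edges, node_cost, num_steps):
    """Find the minimum-cost path of num_steps nodes through a graph.

    edges: dict mapping each node to a list of successor nodes.
    node_cost: callable (step, node) -> float, lower is better.
    """
    cost = {node: node_cost(0, node) for node in edges}
    back_pointers = []
    for step in range(1, num_steps):
        new_cost, pointers = {}, {}
        for prev, accumulated in cost.items():
            for nxt in edges[prev]:
                candidate = accumulated + node_cost(step, nxt)
                if nxt not in new_cost or candidate < new_cost[nxt]:
                    new_cost[nxt] = candidate
                    pointers[nxt] = prev
        cost = new_cost
        back_pointers.append(pointers)
    # Trace back from the cheapest final node.
    end = min(cost, key=cost.get)
    path = [end]
    for pointers in reversed(back_pointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, cost[end]

# Toy example: two motion nodes and a three-step target sequence.
edges = {"a": ["a", "b"], "b": ["a", "b"]}
target = ["a", "b", "b"]
path, total = best_path(edges, lambda t, n: 0.0 if n == target[t] else 1.0, 3)
```

A segment-based variant for streaming input would run the same search on each buffered segment, starting from the node where the previous segment ended.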

The Journal of The Institute of Image Information and Television Engineers, 2009

2008 IEEE International Conference on Multimedia and Expo, 2008
