Hiroyuki Kasai | The University of Electro-Communications
Papers by Hiroyuki Kasai
arXiv preprint
In this paper, we propose novel gossip algorithms for the low-rank decentralized matrix completion problem. The proposed approach operates on the Riemannian Grassmann manifold, which allows local matrix completion by different agents while achieving asymptotic consensus on the global low-rank factors. The resulting approach is scalable and parallelizable. Our numerical experiments show the good performance of the proposed algorithms on various benchmarks.
arXiv preprint
Stochastic variance reduction algorithms have recently become popular for minimizing the average of a large, but finite, number of loss functions. In this paper, we propose a novel Riemannian extension of the Euclidean stochastic variance reduced gradient algorithm (R-SVRG) to a compact manifold search space. To this end, we show the developments on the Grassmann manifold. The key challenges of averaging, addition, and subtraction of multiple gradients are addressed with notions such as the logarithm mapping and parallel translation of vectors on the Grassmann manifold. We present a global convergence analysis of the proposed algorithm with decaying step-sizes and a local convergence rate analysis under a fixed step-size with some natural assumptions. The proposed algorithm is applied to a number of problems on the Grassmann manifold, such as principal components analysis, low-rank matrix completion, and the Karcher mean computation. In all these cases, the proposed algorithm outperforms the standard Riemannian stochastic gradient descent algorithm.
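For orientation, the Euclidean algorithm that R-SVRG extends can be sketched as follows. This is a minimal illustration of plain Euclidean SVRG on a least-squares problem, not the Riemannian method of the paper: it uses ordinary vector addition and subtraction where the paper uses the logarithm mapping and parallel translation. The function name, the test problem, and the step-size and epoch defaults are assumptions made for this sketch.

```python
import numpy as np

def svrg_least_squares(A, b, step=0.005, epochs=40, seed=0):
    """Euclidean SVRG for f(w) = (1/n) * sum_i 0.5 * (a_i . w - b_i)^2.

    Hypothetical helper for illustration; the defaults are not tuned values
    from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    for _ in range(epochs):
        # Outer loop: take a snapshot and compute the full gradient there.
        w_snap = w.copy()
        full_grad = A.T @ (A @ w_snap - b) / n
        for _ in range(n):
            # Inner loop: sample one loss function uniformly at random.
            i = rng.integers(n)
            grad_cur = A[i] * (A[i] @ w - b[i])         # gradient of f_i at w
            grad_snap = A[i] * (A[i] @ w_snap - b[i])   # gradient of f_i at snapshot
            # Variance-reduced direction: unbiased estimate of the full gradient at w.
            w = w - step * (grad_cur - grad_snap + full_grad)
    return w
```

The subtraction `grad_cur - grad_snap` and the addition of `full_grad` are the operations that have no direct meaning on a manifold, because the three vectors live in different tangent spaces. This is the difficulty the abstract refers to.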
2013 IEEE International Conference on Consumer Electronics (ICCE), 2013
Multi-vision systems and panoramic video systems are expected to produce a new multimedia paradigm. Our previously presented multi-vision system allows many users to view multiple videos by virtue of fast stream joining algorithms. However, image degradation occurs because these algorithms require restricted, constant coding parameters. We propose a QP control scheme that decreases this degradation without sacrificing the high speed of the joiner.
2011 Proceedings of 20th International Conference on Computer Communications and Networks (ICCCN), 2011
In new-generation content services, it can be essential to predict the potential demands of people, which they themselves have not recognized or cannot express precisely. Social graphs representing the relationships between people are used for predicting demand in current Internet-based services. However, these graphs cannot represent the relationships of two users residing in common communities or common places. We ...
This paper proposes a music classification method that considers melody transitions, for a browsing system that recommends tunes according to users' preferences. In particular, we apply the prime form and the interval-class vector to the Melodies Markov Model proposed in [1].
This paper proposes a scheme to collect and manage content and service data belonging to multiple separate WLANs so that mobile CE devices can use them. Content and service list sharing is proposed for seamless access to multiple WLANs by network switching. This paper shows the effectiveness of the proposed algorithm using simulations and describes future work.
Our new distributed mobile cache system, a sustainable distributed Geocast technology, enables temporary data caching in a designated local area. We released an open software development kit (SDK) for embedded systems. This paper explains the details of the developed open SDK, an implementation guide, and three applications using this SDK.
2008 IEEE 19th International Symposium on Personal, Indoor and Mobile Radio Communications, 2008
Lightweight, smooth, and quickly responsive access to content is essential for the practical use of mobile video services. We research and develop a new mobile video technology that uses pre-download, pre-fetching, and asynchronous network streaming schemes. This paper describes its mechanisms, especially an asynchronous streaming technique and an asynchronous thumbnail data pre-fetching technique. Two prototype implementations of a mobile video client are described in detail, and performance evaluations are given at the end of this paper.
2012 IEEE International Conference on Consumer Electronics (ICCE), 2012
We have previously proposed a multivision video provisioning scheme that enables many users to access any view area at any desired resolution interactively and simultaneously. This paper proposes adaptive view-area and quality scaling that preserves quick response to view-area movement. The scaling adaptively avoids redundant coding bits in conjunction with the movement of the viewing area. Simulation experiments and our implementation show the feasibility and effectiveness of our proposal.
2012 IEEE International Conference on Consumer Electronics (ICCE), 2012
As video technology advances rapidly, the need for simultaneous presentation of multiple videos is expected to increase in the future. This paper uses an orthogonal experiment to analyze the visual effects of physical video features on multiple video sequences. The results show that video position, human faces, and motion attract attention effectively.
2012 IEEE International Conference on Consumer Electronics (ICCE), 2012
Fast scanning and detection of the end of each CAVLC code are needed for bitstream operations such as transcoding and manipulation. However, they incur a large computational load because of the sequential and context-adaptive nature of CAVLC, which is why they dominate the entire decoding process. This paper proposes a fast skip scheme for CAVLC level codes. Simulation results at the end of this paper show that the proposal achieves a reduction of about 70% for CAVLC level code skipping and outperforms a conventional method.
2012 IEEE International Conference on Consumer Electronics (ICCE), 2012
Recently, many researchers have focused on high-definition and multivision video provisioning. [1] and [2] proposed free-view access to high-definition and panoramic video using multiple partitioned video streams. While they did not consider the high complexity of decoding and playing the video on the client, [3] presented an innovative stream joining method for a lower-complexity client as well as lower server load. It is, however, not practically feasible to serve videos simultaneously to a larger number of users. This paper presents encoding techniques for fast and simple stream joining that can achieve simultaneous multivision video provisioning.
2011 IEEE International Conference on Consumer Electronics (ICCE), 2011
This paper proposes on-demand soundscape generation and provisioning that lets a user experience the real world in a requested remote place. The generation is achieved by spatial audio mixing that considers real-world conditions such as geographical features or townscapes, as well as dynamic situations such as town events or weather. The proposed velocity vector-based clustering can reduce the cost of composing and decomposing clusters of many sound sources. Index Terms: soundscape; virtual reality; spatial audio; velocity vector-based clustering; HRTF. I. INTRODUCTION Innovative services such as Google Street View enable us to search for and view pictures of landscapes all over the world via the Internet. In addition to the visual information of a landscape, its surrounding sound makes our understanding and experience of that place much richer. The term "Soundscape", coined by composer R. Murray Schafer, describes an atmosphere or environment created by or with sound. Recently, some attempts to virtually generate a soundscape have been studied (1). However, they serve soundscapes only for specific or limited locations. In this paper, we target the implementation of a platform that dynamically generates the soundscape of any location, and of services with acoustic engineering and virtual reality techniques at the core. This is driven by rich real-world information, such as Geographic Information System (GIS) data, events, weather, and human behavior, from public Cloud and Web services. The platform dynamically retrieves sound sources representing the requested place and constructs its soundscape by spatial audio mixing of the sounds, considering their locations and the listener position. In particular, we propose a new server-side clustering method that reduces the cost of spatializing many sound sources so that many clients can be supported simultaneously.
2011 IEEE International Conference on Consumer Electronics (ICCE), 2011
This paper proposes a new video provisioning scheme that enables many users to access any view area at any desired resolution interactively. The basis is that multiple tile-based encoded streams are joined dynamically and delivered according to the user's view-area position. This paper details a tile-based video coding scheme and a fast tile stream joiner scheme. Simulation experiments and our practical implementation illustrate the feasibility and effectiveness of our proposed scheme. Keywords: Tile video stream, Tile stream joiner, H.264/AVC, Tile-based video coding. I. INTRODUCTION Recent technological advances in high-definition (HD) video, such as video cameras and display devices, broadband networks, and video compression schemes, have made HD video services spread much faster around the world. A wide variety of video sharing services are also spreading on the Web at an unexpectedly high rate. Therefore, a more efficient and interactive way to enjoy such a huge amount of video content is anticipated (1)-(3). We propose an innovative video access method that enables users to browse and view a large number of high-definition videos efficiently. This proposal simultaneously provides many users with a new means to access video content with their desired view areas and resolutions. It enables an efficient video browser I/F for a web-based video service, or a new EPG I/F for multi-channel cable television. Given this free and deep access to a single video, applications such as sports programs or camera surveillance can also be considered.
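The idea of choosing which tile streams to join from the user's view-area position can be illustrated with a small sketch. This is only a hypothetical geometric helper, assuming a regular grid of equally sized tiles and a rectangular view area in pixel coordinates. The function name and parameters are assumptions for illustration, and the sketch does not reproduce the paper's coding scheme or its stream joiner.

```python
def tiles_for_view(x, y, width, height, tile_w, tile_h, cols, rows):
    """Return (row, col) indices of the tiles that overlap a view area.

    (x, y) is the top-left pixel of the view area; the frame is assumed to be
    split into a cols x rows grid of tile_w x tile_h pixel tiles.
    """
    if width <= 0 or height <= 0:
        return []
    # Clamp the covered tile range to the grid boundaries.
    first_col = max(0, x // tile_w)
    last_col = min(cols - 1, (x + width - 1) // tile_w)
    first_row = max(0, y // tile_h)
    last_row = min(rows - 1, (y + height - 1) // tile_h)
    return [(r, c)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]


# A 200x100 view area at (100, 50) on a 4x4 grid of 128x128 tiles
# overlaps the first three columns of the first two rows.
selected = tiles_for_view(100, 50, 200, 100, 128, 128, 4, 4)
```

Only the streams of the selected tiles would then need to be joined and delivered, which is why the tile count, and not the full frame size, drives the per-user cost in a scheme of this kind.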
2012 International Symposium on Computer, Consumer and Control, 2012
This paper explores techniques for developing an efficient method of eye gesture tracking. The goal is to verify the possibility of building an effective eye gesture input method that relies on the built-in hardware of the smartphone platform. The achieved results will be used as a basis for researching gaze tracking technologies on a mobile device. Effectiveness is measured according to detection accuracy, robustness, and processing performance. The results show limited success in achieving the stated goals, especially with respect to processing performance, and provide a solid foundation for future work.
ACM SIGMOBILE Mobile Computing and Communications Review, 2013
In this paper, we propose a sound-enhanced landscape album system for smartphones. The system provides a landscape image and environmental sound for a user-designated place. It consists of a server and a mobile client terminal. Functions with high computational costs, such as generating environmental sound, are processed on the server side. Because the result generated by the server is sent to the client side through a network, a mobile client can provide the landscape image and environmental sound in real time.
ACOFT/AOS 2006 - 31st Australian Conference on Optical Fibre Technology and Meeting of the Australian Optical Society, 2007
A key network element for the gigabit Ethernet-optical switched access network, which achieves higher security and longer span than PON, is introduced. Its feasibility is confirmed when implemented by optical packet switches.
2013 IEEE International Conference on Consumer Electronics (ICCE), 2013
High functionality and interactivity of video services are strongly desired. The authors have previously proposed an encoding scheme and a fast joining scheme for interactive multivision systems, where joining is the combination of multiple streams into a single video stream. This paper describes the implementation of an actual HTTP-based streaming server, built by embedding the stream joiner module into a widely available HTTP server. The contribution of this paper is that the joining speed of the server surpasses the necessary speed attained with practical network equipment connected to the server.
2012 IEEE International Conference on Consumer Electronics (ICCE), 2012
This paper proposes a method to construct the scene composition of an event, a place, and a situation by learning the appearance patterns of objects in them. We recognize objects inside an image and extract representative patterns by analyzing the appearance combinations of objects in a large number of images. A preliminary experiment shows that the proposal can create knowledge of scene composition with no contradiction and gives unfamiliar insights into the scene.
2011 IEEE International Conference on Communications Workshops (ICC), 2011
The recent enhancement of mobile devices and wireless networks has enabled content services in mobile environments. Demand prediction is a traditional but powerful technique used for content services. However, it is hard to predict local demand in mobile environments because it depends not only on user preference and the popularity of common content but also on other factors: users request content related to their locations, and they are interested in the content uploaded by friends who visited the location before them. Thus, we need to consider a context with multiple factors affecting the demand. In addition, we need to consider the sequence of contexts. We call a sequence of contexts a social context. This makes it difficult to predict local demand from users. In this paper, we propose a novel demand prediction engine that extracts local demand depending on social context and estimates what content will be requested there. To extract the demand, we use a log database and a pattern-matching technique in our prediction engine. To validate our prediction engine, we apply a prefetching service using the engine to a mobile content service.