Antonio Pinheiro | Universidade da Beira Interior
Papers by Antonio Pinheiro
2019 Data Compression Conference (DCC), 2019
Pseudo-sequence based light field compression methods are a highly efficient solution to compress light field images. They use state-of-the-art video encoders such as HEVC to encode the image views. HEVC uses a Coding Tree Unit (CTU) structure that is flexible and highly efficient, but computationally demanding. Each CTU is examined at various depths and in various prediction and transform modes to find an optimal coding structure. Efficiently predicting the depth of the coding units can reduce complexity significantly. In this paper, a new depth decision method is introduced that exploits the minimum and maximum depths of previously encoded co-located coding units in spatially closer reference images. The minimum and maximum depths of these co-located CTUs are computed for each coding unit and are used to limit the depth of the current coding unit. Experimental results show up to 55% and 85% encoding time reduction with serial and parallel processing, respectively, at negligible quality degradation.
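The depth-limiting idea can be sketched in a few lines. The Python fragment below is a minimal illustration, not the authors' implementation; the co-located depth maps, the CTU indexing, and the encoder hook `encode_ctu` (assumed to return the depth it actually selects) are all assumptions made for the example.

```python
import numpy as np

def depth_range_for_ctu(ref_depth_maps, ctu_index):
    """Limit the depth search of the current CTU to the [min, max]
    depths observed at the co-located CTUs of the spatially closest,
    already-encoded reference views. `ref_depth_maps` is a list of
    2D arrays holding, per reference view, the coded depth (0..3)
    of every CTU."""
    colocated = [m[ctu_index] for m in ref_depth_maps]
    return min(colocated), max(colocated)

def encode_view(view, ref_depth_maps, encode_ctu):
    """Encode every CTU of `view`, restricting the rate-distortion
    search to the predicted depth interval instead of the full
    range 0..3; `view.ctu_grid_shape` is a hypothetical attribute
    giving the CTU grid dimensions."""
    depth_map = np.zeros(view.ctu_grid_shape, dtype=np.int32)
    for idx in np.ndindex(view.ctu_grid_shape):
        d_min, d_max = depth_range_for_ctu(ref_depth_maps, idx)
        depth_map[idx] = encode_ctu(view, idx, d_min, d_max)
    return depth_map
```

Restricting the search interval is what yields the reported time savings: CTUs whose neighbors agree on a depth skip most of the recursive partitioning evaluation.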
ArXiv, 2019
Light field imaging is characterized by capturing the brightness, color, and directional information of light rays in a scene. This leads to image representations with a huge amount of data that require efficient coding schemes. In this paper, lenslet images are rendered into sub-aperture images. These images are organized as a pseudo-sequence input for the HEVC video codec. To better exploit redundancy among neighboring sub-aperture images, and consequently decrease the distances between a sub-aperture image and its prediction references, the sub-aperture images are divided into four smaller groups that are scanned in a serpentine order. The most central sub-aperture image, which has the highest similarity to all the other images, is used as the initial reference image for each of the four regions. Furthermore, a structure is defined that selects spatially adjacent sub-aperture images with the highest similarity to the current image as prediction references. In this way, encoding...
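As a rough illustration of the scanning logic, the sketch below builds a serpentine coding order over four quadrants of a sub-aperture grid, starting from the central view. The grid size, the exact quadrant split, and the tie-breaking on the shared center row/column are assumptions; the paper's grouping may differ.

```python
def serpentine_order(rows, cols):
    """(row, col) coordinates of a rows x cols block, scanned
    left-to-right on even rows and right-to-left on odd rows."""
    order = []
    for r in range(rows):
        cs = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((r, c) for c in cs)
    return order

def pseudo_sequence(grid_rows=13, grid_cols=13):
    """Split the view grid into four quadrants around the central
    view and scan each quadrant serpentine-wise; the central view
    comes first, as the initial reference of every region."""
    center = (grid_rows // 2, grid_cols // 2)
    sequence = [center]
    quads = [(0, center[0] + 1, 0, center[1] + 1),
             (0, center[0] + 1, center[1], grid_cols),
             (center[0], grid_rows, 0, center[1] + 1),
             (center[0], grid_rows, center[1], grid_cols)]
    for r0, r1, c0, c1 in quads:
        for r, c in serpentine_order(r1 - r0, c1 - c0):
            pos = (r0 + r, c0 + c)
            if pos not in sequence:   # center row/col shared by quadrants
                sequence.append(pos)
    return sequence
```

The serpentine scan keeps consecutive images in the pseudo-sequence spatially adjacent, which is what shortens the prediction distances the abstract refers to.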
2019 Data Compression Conference (DCC), 2019
In light field compression, besides coding efficiency, providing random access to individual views is also a very significant factor. Highly efficient compression methods usually lack random access, while random access methods usually reduce compression efficiency. To address this trade-off, a light field image encoding method that favors random access is proposed in this paper. In the proposed scheme, the 15×15 view images are divided into 25 independent 3×3 groups of views called Macro View Images (MVIs). To encode each MVI, its central view image is used to compress its immediate neighboring view images in a hierarchical reference structure. To encode the central view of each MVI, the most central view image, along with the centers of at most three other MVIs, is used as reference for disparity estimation. In addition, the proposed method enables the use of parallel computation to improve encoding/decoding time complexity. To reduce the memory footprint when only a Region of Interest (ROI) is required, HEVC tile partitioning is used.
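The view-to-MVI mapping itself is simple index arithmetic. A minimal sketch, assuming views are addressed by (row, column) in the 15×15 grid:

```python
def mvi_of_view(r, c):
    """Map a view coordinate in the 15x15 grid to its 3x3 Macro
    View Image (MVI): returns the MVI grid coordinate (0..4, 0..4)
    and the coordinate of that MVI's central view."""
    mr, mc = r // 3, c // 3
    center = (mr * 3 + 1, mc * 3 + 1)
    return (mr, mc), center

# Random access to view (7, 11) only requires decoding its own MVI
# (plus the reference MVI centers), not all 225 views.
print(mvi_of_view(7, 11))   # ((2, 3), (7, 10))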
ArXiv, 2020
With the coming of age of virtual/augmented reality and interactive media, numerous definitions, frameworks, and models of immersion have emerged across fields ranging from computer graphics to literary works. Immersion is often used interchangeably with presence, as both concepts are closely related. However, there are noticeable interdisciplinary differences regarding definitions, scope, and constituents that need to be addressed before a coherent understanding of the concepts can be achieved. Such consensus is vital for setting the direction of future immersive media experiences (IMEx) and all related matters. The aim of this white paper is to provide a survey of definitions of immersion and presence, leading to a definition of immersive media experience (IMEx). The Quality of Experience (QoE) for immersive media is described by establishing a relationship between the concepts of QoE and IMEx, followed by application areas of immersive media experiences...
2018 Tenth International Conference on Quality of Multimedia Experience (QoMEX), 2018
Point clouds are one of the most promising technologies for 3D content representation. In this paper, we describe a study on the quality assessment of point clouds degraded by octree-based compression at different levels. The test contents were displayed using Screened Poisson surface reconstruction, without any textural information, and were rated by subjects in a passive way, using a 2D image sequence. Subjective evaluations were performed in five independent laboratories in different countries, with the inter-laboratory correlation analysis showing no statistical differences despite the different equipment employed. Benchmarking results reveal that state-of-the-art point cloud objective metrics are not able to accurately predict the expected visual quality of such test contents. Moreover, the subjective scores collected in this experiment were found to be poorly correlated with subjective scores obtained from another test involving visualization of raw point clouds. These results suggest the need for further investigation of adequate point cloud representations and objective quality assessment tools.
2020 IEEE International Conference on Image Processing (ICIP), 2020
This paper presents a quality evaluation of the point cloud codecs recently standardised by the MPEG committee. A subjective experiment was designed to evaluate the performance of these codecs in terms of bit rate versus perceived quality. Four laboratories experienced in such studies carried out the subjective evaluation. Although the exact setups of the different laboratories were not the same, the obtained MOS results exhibit a high correlation between them, confirming the reliability and repeatability of the proposed assessment protocol. The study also confirmed MPEG V-PCC as a superior compression solution for static point clouds when compared to MPEG G-PCC. Finally, a benchmark of the most popular point cloud metrics was performed based on the subjective results. The point2plane metric using the mean square error as a distance measure revealed the best correlation with subjective scores, closely followed by point2point, also using the mean square error. As both metrics produce highly correlated results, it can be concluded that they can be used for quality assessment of MPEG codecs.
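For reference, the two winning metrics reduce to nearest-neighbor error statistics between the clouds. A compact sketch, assuming reference normals are available for the point2plane variant; nothing here reproduces the exact MPEG evaluation software:

```python
import numpy as np
from scipy.spatial import cKDTree

def d1_d2_mse(ref, deg, ref_normals=None):
    """point2point (D1) and point2plane (D2) MSE from the degraded
    cloud `deg` to the reference cloud `ref`, both (N, 3) arrays.
    D2 projects each error vector onto the reference normal."""
    tree = cKDTree(ref)
    dist, idx = tree.query(deg)          # nearest reference point
    d1 = np.mean(dist ** 2)              # point2point MSE
    d2 = None
    if ref_normals is not None:
        err = deg - ref[idx]             # error vectors
        proj = np.sum(err * ref_normals[idx], axis=1)
        d2 = np.mean(proj ** 2)          # point2plane MSE
    return d1, d2
```

The standard metrics are made symmetric by taking the maximum of the error computed in both directions; the sketch shows a single direction only.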
IEEE Transactions on Multimedia, 2021
Recently, more interest in the different plenoptic formats, including digital holograms, has emerged. Aside from the other challenges that the steps of the holographic pipeline, from digital acquisition to display, have to face, visual quality assessment of compressed holograms is particularly demanding due to the distinct nature of this 3D image modality when compared to regular 2D imaging. There are few studies on holographic data quality assessment, particularly with respect to the perceptual effects of lossy compression. This work studies the quality evaluation of digital hologram reconstructions presented on regular 2D displays in the presence of compression distortions. As there is no established or generally agreed-on methodology with available implementations for compressing digital holograms on the hologram plane, a set of state-of-the-art codecs, namely HEVC, AV1, and JPEG 2000, was used to compress the digital holograms on the object plane. Both computer-generated and optically generated holograms were considered. Two subjective tests were conducted to evaluate the distortions caused by compression. The first was conducted on the reconstructed amplitude images of central views, while the second was conducted on pseudo-videos generated from the reconstructed amplitudes of different views. The subjective quality assessment was based on mean opinion scores. A selection of objective quality metrics was evaluated, and their correlations with mean opinion scores were computed; the VIFp metric showed the highest correlation.
Publication in the conference proceedings of EUSIPCO, Barcelona, Spain, 2011
2014 22nd European Signal Processing Conference (EUSIPCO), 2014
In this paper, a study on the perceived quality resulting from chromatic variations in 3D video is reported. The test videos were represented in the CIE 1976 (L*a*b*) color space, and their colors were initially subdivided into clusters based on similarity. Predefined chromatic errors were applied to these color clusters. The videos were shown to subjects, who were asked to rank their quality based on the naturalness of the colors. The Mean Opinion Scores were computed and the sensitivity to chromatic changes in 3D video was quantified. Moreover, attention maps were obtained, and a short study on the changes in visual saliency in the presence of these chromatic variations is also reported.
Multimedia Tools and Applications, 2020
The MPEG-DASH protocol has been rapidly adopted by most major network content providers and enables clients to make informed decisions in the context of HTTP streaming, based on network and device conditions, using the available media representations. A review of the literature on adaptive streaming over mobile networks shows that most emphasis has been on adapting video quality, whereas this work examines the trade-off between video and audio quality. In particular, subjective tests were undertaken for live music streaming over emulated mobile networks with MPEG-DASH. A group of audio/video sequences was designed to emulate varying bandwidth arising from network congestion, with a varying trade-off between audio and video bit rates. Absolute Category Rating was used to evaluate the relative impact of both audio and video quality on the overall Quality of Experience (QoE). One key finding from the statistical analysis of the Mean Opinion Score (MOS) results using Analysis of Variance is that reducing audio quality has a much lower impact on QoE than reducing video quality at similar total bandwidths. This paper also describes an objective model for audiovisual quality estimation that combines the outcomes of audio and video metrics into a joint parametric model. The correlation between predicted and subjective MOS was computed using several measures (Pearson and Spearman correlation coefficients and Root Mean Square Error). This research was co-funded by FEDER-PT2020, Portugal partnership agreement, under the project PTDC/EEI-PRO/2849/2014-POCI-01-0145-FEDER-016693, Fundação para a Ciência e a Tecnologia (FCT/MCTES) under the project UIDB/EEA/50008/2020, and the Instituto de Telecomunicações-Fundação para a Ciência e a Tecnologia (project UID/EEA/50008/2013) under internal project QoEVIS.
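The abstract does not give the model's functional form. A common parametric choice in the audiovisual quality literature combines the two modality scores with an interaction term, fitted by least squares; the sketch below is under that assumption, not a reproduction of the paper's model.

```python
import numpy as np

def fit_av_model(mos_a, mos_v, mos_av):
    """Fit MOS_av ~ b0 + b1*MOS_a + b2*MOS_v + b3*MOS_a*MOS_v by
    ordinary least squares. Inputs are 1D arrays of per-sequence
    audio-only, video-only, and audiovisual quality scores."""
    X = np.column_stack([np.ones_like(mos_a), mos_a, mos_v, mos_a * mos_v])
    coef, *_ = np.linalg.lstsq(X, mos_av, rcond=None)
    return coef

def predict_av(coef, mos_a, mos_v):
    """Predicted audiovisual MOS from the fitted coefficients."""
    return coef[0] + coef[1] * mos_a + coef[2] * mos_v + coef[3] * mos_a * mos_v
```

The multiplicative term is what lets such a model capture the finding above: video quality dominating the joint score while audio degradations matter less at equal total bandwidth.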
ACM SIGMultimedia Records, 2017
The ISO/IEC JTC 1/SC 29 area of work comprises the standardization of coded representations of audio, picture, multimedia, and hypermedia information, and of sets of compression and control functions for use with such information. SC 29 hosts two working groups responsible for developing international standards for the compression, decompression, processing, and coded representation of media content, in order to satisfy a wide variety of applications: WG1 targeting digital still pictures, also known as JPEG, and WG11 targeting moving pictures, audio, and their combination, also known as MPEG. The early SC 29 standards, namely JPEG, MPEG-1, and MPEG-2, received the Technology & Engineering Emmy Award in 1995-96. The standards columns within ACM SIGMM Records provide timely updates about the most recent developments within JPEG and MPEG. The JPEG column is edited by Antonio Pinheiro and the MPEG column is edited by Christian Timmerer. This article introduces the editors and highlights recent JPEG and MPEG achievements as well as future plans.
Multimedia Tools and Applications, 2019
Due to the high correlation among adjacent blocks, several algorithms use the motion information of spatially and temporally correlated neighbouring blocks to adapt their search patterns. In this paper, this information is used to define a dynamic search pattern. Each frame is divided into two sets, black and white blocks, like a chessboard pattern, and a different search pattern is defined for each set. The advantage of this definition is that the number of available spatially neighbouring blocks is increased for each current block, leading to a better prediction for each block. Simulation results show that the proposed algorithm is closer to the Full Search algorithm in terms of quality metrics such as PSNR than other state-of-the-art algorithms, while the average number of search points is lower.
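A minimal sketch of the chessboard split and the larger neighbor set it enables (the actual search-pattern adaptation in the paper is more elaborate; the encoding-order convention below is an assumption):

```python
def block_color(bx, by):
    """Chessboard label of block (bx, by); assume 'black' blocks
    are encoded in a first pass and 'white' blocks in a second."""
    return 'black' if (bx + by) % 2 == 0 else 'white'

def spatial_neighbors(bx, by, blocks_x, blocks_y):
    """For a white block, all four 4-connected neighbors are black
    and hence already encoded, roughly doubling the usable spatial
    motion-vector predictors compared with a raster scan."""
    cand = [(bx - 1, by), (bx + 1, by), (bx, by - 1), (bx, by + 1)]
    return [(x, y) for x, y in cand if 0 <= x < blocks_x and 0 <= y < blocks_y]
```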
2016 Eighth International Conference on Quality of Multimedia Experience (QoMEX), 2016
The rapid adoption of MPEG-DASH is testament to its core design principles, which enable the client to make informed decisions about media encoding representations based on network conditions, device type, and preferences. Typically, the focus has mostly been on the different video quality representations rather than audio. However, for device types with small screens, the difference in the bandwidth budgets allocated to the two streams may not be that large. This is especially the case if high-quality audio is used, and in this scenario we argue that increased focus should be given to the bit rate representations for audio. Arising from this, we have designed and implemented a subjective experiment to evaluate and analyse the possible effect of using different audio quality levels. In particular, we investigate the possibility of providing reduced audio quality so as to free up bandwidth for video under certain conditions. The experiment was implemented for live music concert scenarios transmitted over mobile networks, and we suggest that the results will be of significant interest to DASH content creators when considering the bandwidth trade-off between audio and video.
2014 IEEE 16th International Workshop on Multimedia Signal Processing (MMSP), 2014
The upcoming JPEG XT standard is under development for High Dynamic Range (HDR) image compression. It encodes a Low Dynamic Range (LDR) version of the HDR image, generated by a Tone-Mapping Operator (TMO), with conventional JPEG coding as a base layer, and encodes the extra HDR information in a residual layer. This paper studies the performance of the three profiles of JPEG XT (referred to as profiles A, B, and C) using a test set of six HDR images. Four TMO techniques were used for base layer image generation to assess the influence of the TMO on the performance of the JPEG XT profiles. The HDR images were then coded with different quality levels for the base and residual layers. The performance of each profile was evaluated using Signal to Noise Ratio (SNR), Feature SIMilarity Index (FSIM), Root Mean Square Error (RMSE), and CIEDE2000 color difference objective metrics. The evaluation results demonstrate that profiles A and B lead to similar saturation of quality at higher bit rates, while profile C exhibits no saturation. Profiles B and C also appear to be more dependent than profile A on the TMO used for the base layer.
EPJ Nonlinear Biomedical Physics, 2015
Several previous clinical and preclinical studies using computerized texture analysis of MR images have demonstrated much better clinical discrimination than visual image analysis by the radiologist. In muscular dystrophy, discriminating power has already been demonstrated with various methods of texture analysis of magnetic resonance images (MRI-TA). Unfortunately, a scale gap exists between the spatial resolutions of histological and MR images, making a direct correlation impossible. Furthermore, the effect of the various histological modifications on the grey level of each pixel is complex and cannot be easily analyzed. Consequently, clinicians will not accept the use of MRI-TA in routine practice if TA remains a "black box" without clinical correspondence at the tissue level. A goal of the multicenter European COST action MYO-MRI is therefore to optimize MRI-TA methods in muscular dystrophy and to elucidate the histological meaning of MRI textures.
2014 Sixth International Workshop on Quality of Multimedia Experience (QoMEX), 2014
High Dynamic Range (HDR) imaging is able to capture a wide range of luminance values, closer to what the human visual system can perceive. Many believe HDR is a technology that will revolutionize the TV and cinema industries, much as color television did. However, the complexity of HDR requires reinvention of the whole chain from capture to display. In this paper, HDR images compressed with the upcoming JPEG XT HDR image coding standard are used to investigate the correlation between thirteen well-known full-reference metrics and the perceived quality of HDR content. The metrics are benchmarked using ground-truth subjective scores collected during quality evaluations performed on a Dolby Pulsar HDR monitor. Results demonstrate that objective quality assessment of HDR image compression is challenging: most of the tested metrics, with the exceptions of HDR-VDP-2 and FSIM computed on the luma component, poorly predict human perception of visual quality.
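Benchmarking a metric against subjective ground truth reduces to correlating its scores with the MOS values. A minimal sketch (PLCC/SROCC only, without the nonlinear mapping step that full evaluations often add):

```python
from scipy.stats import pearsonr, spearmanr

def benchmark(metric_scores, mos):
    """Correlate objective scores with Mean Opinion Scores:
    PLCC measures linear agreement, SROCC monotonic agreement."""
    plcc, _ = pearsonr(metric_scores, mos)
    srocc, _ = spearmanr(metric_scores, mos)
    return plcc, srocc
```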
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2010
Edges are one of the most important visual features of an image. They are highly related to shape and can also be representative of image texture. Edge orientation histograms are usually very reliable descriptors for image analysis, search, and retrieval. In this work, edges detected with the Canny algorithm are described by their angular orientations. The resulting descriptor is resilient to image rotation and translation, as well as to noise. An example of automatic semantic image annotation using this description method is reported on a database of 738 images. K-Nearest Neighbors is used as the classifier, with the Manhattan distance for image similarity computation. The annotation obtained with this description method is compared with that provided by other well-known descriptors. These examples show that a reliable high-level automatic description based on semantic content can be extracted.
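A rough sketch of such a descriptor, using OpenCV's Canny detector and Sobel gradients for the orientations; the Canny thresholds, bin count, and normalization are assumptions, not the paper's exact settings.

```python
import cv2
import numpy as np

def edge_orientation_histogram(gray, bins=36):
    """Histogram of gradient orientations at Canny edge pixels of an
    8-bit grayscale image, normalized to unit sum so that images of
    different sizes are comparable."""
    edges = cv2.Canny(gray, 100, 200)
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1)
    angles = np.arctan2(gy, gx)[edges > 0]    # radians in [-pi, pi]
    hist, _ = np.histogram(angles, bins=bins, range=(-np.pi, np.pi))
    return hist / max(hist.sum(), 1)

def manhattan_distance(h1, h2):
    """Image similarity measure used for the K-NN classification."""
    return np.abs(h1 - h2).sum()
```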
MM 2012 - Proceedings of the 20th ACM International Conference on Multimedia, 2012
A study on the perceived quality of images displayed with color changes is presented. Under the D65 standard illuminant, colors are changed in the CIE 1976 (L*a*b*) color space by applying a predefined chromatic error ΔE*ab. The colors were initially divided into clusters with the K-Means algorithm. Each color cluster is shifted by the predefined chromatic error in a random direction in the a*b* chromatic coordinates. Applying ΔE*ab errors of 3, 6, 9, 12, and 15 units to the five hyperspectral images, a set of modified images was collected. These images were shown to individuals, who were asked to rank their quality based on naturalness. The Mean Opinion Scores were computed, allowing the sensitivity to color changes to be tested and quantified.
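The cluster-shifting procedure can be sketched directly: cluster the pixel colors, then displace each cluster by ΔE*ab units along a random direction in the (a*, b*) plane. Clustering on the chromatic coordinates only, the number of clusters, and the Lab conversion path are assumptions of this sketch.

```python
import numpy as np
from sklearn.cluster import KMeans
from skimage.color import rgb2lab, lab2rgb

def shift_color_clusters(rgb, delta_e=9.0, k=4, seed=0):
    """Cluster pixel colors in the (a*, b*) plane and shift every
    cluster by `delta_e` units in a random chromatic direction,
    leaving L* untouched (so the shift equals delta_e in ΔE*ab)."""
    rng = np.random.default_rng(seed)
    lab = rgb2lab(rgb).reshape(-1, 3)
    labels = KMeans(n_clusters=k, n_init=10,
                    random_state=seed).fit_predict(lab[:, 1:])
    for cluster in range(k):
        theta = rng.uniform(0, 2 * np.pi)     # random a*b* direction
        shift = delta_e * np.array([np.cos(theta), np.sin(theta)])
        lab[labels == cluster, 1:] += shift
    return lab2rgb(lab.reshape(rgb.shape))
```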
2013 5th International Workshop on Quality of Multimedia Experience, QoMEX 2013 - Proceedings, 2013
A study on the perceived quality of images displayed with color changes is presented. Initially, the colors of two hyperspectral images were represented in the CIE 1976 (L*a*b*) color space under the D65 standard illuminant. The colors of each image were divided into four clusters with the K-Means algorithm. A new set of images was created by changing only one of the four color clusters at a time. The color change results from applying a predefined chromatic error to the chromatic coordinates (a*, b*) of all pixels in the cluster. Errors of 6, 9, 12, and 15 ΔE*ab units were applied, using two different directions for each color cluster. These images were displayed to individuals, who were asked to rank their quality based on naturalness. The Mean Opinion Scores were computed, allowing the sensitivity to specific color changes to be tested and quantified.