Dimitrios Makris | Kingston University, London
Papers by Dimitrios Makris
arXiv (Cornell University), Feb 13, 2023
Taking advantage of an event-based camera, the issues of motion blur, low dynamic range and low time sampling of standard cameras can all be addressed. However, there is a lack of event-based datasets dedicated to the benchmarking of segmentation algorithms, especially those that provide depth information, which is critical for segmentation in occluded scenes. This paper proposes a new Event-based Segmentation Dataset (ESD), a high-quality 3D spatial and temporal dataset for object segmentation in an indoor cluttered environment. ESD comprises 145 sequences with 14,166 RGB frames that are manually annotated with instance masks. Overall, 21.88 million and 20.80 million events were collected from two event-based cameras in a stereographic configuration. To the best of our knowledge, this densely annotated 3D spatio-temporal event-based segmentation benchmark of tabletop objects is the first of its kind. By releasing ESD, we expect to provide the community with a challenging, high-quality segmentation benchmark.
Image fusion methods have gained a lot of attention over the past few years in the field of sensor fusion. An efficient image fusion approach can obtain complementary information from various multi-modality images. In addition, the fused image is more robust to imperfect conditions such as mis-registration and noise. The aim of this paper is to explore the performance of existing deep learning-based and traditional image fusion techniques on our real marine images. The performance of these techniques is evaluated with six common quality metrics. Image data was collected using a sensor system onboard a vessel in the Finnish archipelago. This system is used for developing autonomous vessels, and records data in a range of operational and climatic conditions. To the best of our knowledge, there is no comparative study of RGB and infrared image fusion algorithms evaluated in a marine environment. Experimental results indicate that deep learning-based fusion methods can significantly improve image fusion performance, considering both visual quality and objective assessment, compared with other methods.
Safety and security are critical issues in the maritime environment. Automatic and reliable object detection based on multi-sensor data fusion is one of the most efficient ways of addressing these issues in intelligent systems. In this paper, we propose an early fusion framework to achieve robust object detection. The framework first utilizes a fusion strategy to combine visible and infrared images and generate fused images. The resulting fused images are then processed by a simple dense convolutional neural network-based detector, RetinaNet, to predict multiple 2D box hypotheses and infrared confidences. To evaluate the proposed framework, we collected a real marine dataset using a sensor system onboard a vessel in the Finnish archipelago. This system is used for developing autonomous vessels, and records data in a range of operational, climatic and light conditions. The experimental results show that the proposed fusion framework is able to identify objects of interest surrounding the vessel substantially better than the baseline approaches.
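The abstract above describes early (pixel-level) fusion of visible and infrared frames before detection, but does not specify the fusion operator. The sketch below illustrates one plausible stand-in, weighted averaging of a registered RGB/IR pair; the function name, the weight and the assumption of pre-registered inputs are all illustrative, not the paper's method.

```python
import numpy as np

def early_fuse(rgb: np.ndarray, ir: np.ndarray, ir_weight: float = 0.4) -> np.ndarray:
    """Pixel-level early fusion of a co-registered RGB/IR pair.

    Assumes both images share height/width and are already aligned.
    Weighted averaging is a stand-in: the paper's actual fusion
    strategy is not specified in the abstract.
    """
    ir3 = np.repeat(ir[..., None], 3, axis=-1) if ir.ndim == 2 else ir  # broadcast IR to 3 channels
    fused = (1.0 - ir_weight) * rgb.astype(np.float32) + ir_weight * ir3.astype(np.float32)
    return fused.clip(0, 255).astype(np.uint8)

# The fused frame would then be passed to an off-the-shelf detector
# (e.g. a RetinaNet implementation) exactly like an ordinary RGB image.
```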
arXiv (Cornell University), May 5, 2023
In the context of robotic grasping, object segmentation encounters several difficulties when faced with dynamic conditions such as real-time operation, occlusion, low lighting, motion blur, and object size variability. In response to these challenges, we propose the Graph Mixer Neural Network, which includes a novel collaborative contextual mixing layer applied to 3D event graphs formed on asynchronous events. The proposed layer is designed to spread spatiotemporal correlation within an event graph across four nearest-neighbour levels in parallel. We evaluate the effectiveness of our proposed method on the Event-based Segmentation Dataset (ESD), which includes five unique image degradation challenges (occlusion, blur, brightness, trajectory and scale variance) as well as segmentation of known and unknown objects. The results show that our proposed approach outperforms state-of-the-art methods in terms of mean intersection over union and pixel accuracy. Code available at: https://github.com/sanket0707/GNN-Mixer.git
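The method operates on 3D event graphs built from asynchronous events. A minimal sketch of constructing such a spatiotemporal k-nearest-neighbour graph is given below; the (x, y, t) event layout, k=4 and the time-scaling factor are assumptions for illustration, not the paper's exact construction.

```python
import numpy as np
from scipy.spatial import cKDTree

def build_event_graph(events: np.ndarray, k: int = 4, time_scale: float = 1e4):
    """Build a k-nearest-neighbour graph over asynchronous events.

    `events` is an (N, 3) array of (x, y, t) rows; the time axis is
    rescaled so spatial and temporal distances are comparable.
    """
    pts = events.astype(np.float64).copy()
    pts[:, 2] *= time_scale                    # bring t to pixel-like units
    tree = cKDTree(pts)
    _, nbrs = tree.query(pts, k=k + 1)         # first neighbour is the point itself
    edges = [(i, int(j)) for i, row in enumerate(nbrs) for j in row[1:]]
    return np.array(edges)                     # (N*k, 2) directed edge list
```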
IET Computer Vision, Jan 3, 2018
arXiv (Cornell University), Mar 20, 2023
Object segmentation enhances robotic grasping by aiding object identification. Complex environments and dynamic conditions pose challenges such as occlusion, low light, motion blur and object size variance. To address these challenges, we propose a Bimodal SegNet that fuses two types of visual signal: event-based data and RGB frame data. The proposed Bimodal SegNet has two distinct encoders, one for the RGB signal input and another for the Event signal input, in addition to an Atrous Pyramidal Feature Amplification module. The encoders capture and fuse rich contextual information at different resolutions via a Cross-Domain Contextual Attention layer, while the decoder obtains sharp object boundaries. The proposed method is evaluated on five unique image degradation challenges (occlusion, blur, brightness, trajectory and scale variance) using the Event-based Segmentation Dataset (ESD). The results show a 4-6% segmentation accuracy improvement over state-of-the-art methods in terms of mean intersection over union and pixel accuracy.
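The core architectural idea is one encoder per modality with fused features feeding a single decoder. Below is a deliberately minimal skeleton of that two-encoder pattern; the layer sizes are placeholders and the paper's attention and atrous modules are not reproduced.

```python
import torch
import torch.nn as nn

class DualEncoderSegNet(nn.Module):
    """Skeleton of a two-encoder segmentation network: each modality
    gets its own encoder and the features are concatenated before
    decoding. A sketch of the bimodal idea only, not Bimodal SegNet."""
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.rgb_enc = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU())
        self.evt_enc = nn.Sequential(nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU(),
            nn.Conv2d(32, n_classes, 1),
        )

    def forward(self, rgb: torch.Tensor, events: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.rgb_enc(rgb), self.evt_enc(events)], dim=1)  # channel-wise fusion
        return self.decoder(fused)  # per-pixel class logits
```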
2018 IEEE Conference on Evolving and Adaptive Intelligent Systems (EAIS), 2018
The daltonization process refers to the color adaptation of images in order to improve the perception of color-blind viewers. This paper proposes a modified clustering approach, which is applied to the color adaptation of digitized art paintings and concerns a specific color vision deficiency called protanopia. To accomplish this task, the objective function of the fuzzy c-means is reformulated so as to include only the cluster centers, and is then minimized by differential evolution. Using a standard technique, the original image is transformed to simulate the effect of the protanopia deficiency. Then, the above-mentioned clustering approach is applied separately to the original and the protanopia-simulated images. By comparing the color clusters between these two cases, the colors in the original image are classified into two classes: (a) colors that must be corrected so that a protanope can easily distinguish them, and (b) colors that must remain intact. To this end, the colors belonging to the former class are adapted subject to the constraint that they must not be similar to the colors belonging to the latter class. Finally, the effectiveness of the proposed methodology is demonstrated through a number of experiments on color art paintings.
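The key trick named in the abstract, rewriting the fuzzy c-means objective so only the cluster centers remain as decision variables, is what makes a population-based optimizer applicable. A small sketch using the standard center-only reformulation and SciPy's differential evolution follows; the cluster count, bounds and toy data are illustrative, not the paper's settings.

```python
import numpy as np
from scipy.optimize import differential_evolution

def reformulated_fcm(flat_centers, data, n_clusters, m=2.0):
    """Reformulated fuzzy c-means objective over cluster centers only:
    R(V) = sum_i ( sum_j ||x_i - v_j||^(2/(1-m)) )^(1-m).
    Memberships are eliminated analytically, which is what lets a
    center-only optimizer such as differential evolution minimize it."""
    V = flat_centers.reshape(n_clusters, -1)
    d2 = ((data[:, None, :] - V[None, :, :]) ** 2).sum(-1) + 1e-12  # squared distances
    return float(((d2 ** (1.0 / (1.0 - m))).sum(axis=1) ** (1.0 - m)).sum())

# Toy usage on random RGB-like data.
rng = np.random.default_rng(0)
colors = rng.random((200, 3))
k = 4
result = differential_evolution(reformulated_fcm, [(0.0, 1.0)] * (k * 3),
                                args=(colors, k), seed=0, maxiter=50)
centers = result.x.reshape(k, 3)
```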
11th International Conference of Pattern Recognition Systems (ICPRS 2021), 2021
Challenges in anomaly detection include the implicit definition of anomaly, benchmarking against human intuition and the scarcity of anomalous examples. We introduce a novel approach designed to enforce the separation of normal and abnormal samples in an embedded space using a refined Triplet Loss Function, within the paradigm of Deep Networks. Training is based on randomly sampled triplets to manage datasets with a small proportion of anomalous data. Results for a range of proportions between normal and anomalous data are presented on the MNIST, CIFAR10 and Concrete Cracks datasets and compared against the current state of the art.
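For readers unfamiliar with triplet training, the sketch below shows the vanilla triplet margin loss and the kind of random triplet sampling the abstract describes. The paper's refinement of the loss is not specified in the abstract, so only the standard form is shown; the sampling helper is a hypothetical illustration.

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin: float = 1.0):
    """Standard triplet margin loss: pull anchor/positive embeddings
    together, push anchor/negative apart by at least `margin`."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return torch.clamp(d_pos - d_neg + margin, min=0.0).mean()

def sample_triplets(normal, anomalous, n: int, seed: int = 0):
    """Randomly draw (anchor, positive) from the normal set and a
    negative from the (scarce) anomalous set."""
    g = torch.Generator().manual_seed(seed)
    a = normal[torch.randint(len(normal), (n,), generator=g)]
    p = normal[torch.randint(len(normal), (n,), generator=g)]
    neg = anomalous[torch.randint(len(anomalous), (n,), generator=g)]
    return a, p, neg
```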
Sensors (Basel, Switzerland), Jan 24, 2018
In this paper, a novel approach to detect incipient slip based on the contact area between a transparent silicone medium and different objects, using a neuromorphic event-based vision sensor (DAVIS), is proposed. Event-based algorithms are developed to detect incipient slip, slip, stress distribution and object vibration. Thirty-seven experiments were performed on five objects with different sizes, shapes, materials and weights to compare the precision and response time of the proposed approach. The proposed approach is validated using a high-speed conventional camera (1000 FPS). The results indicate that the sensor can detect incipient slippage with an average latency of 44.1 ms in an unstructured environment for various objects. It is worth mentioning that the experiments were conducted in an uncontrolled experimental environment, introducing high noise levels that affected the results significantly. However, eleven of the experiments had a detection latency below 10 ms which shows t...
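To make the contact-area cue concrete, here is a toy proxy: count the distinct active pixels among events in successive time windows, so that a shrinking count flags a possible incipient slip. The event layout, window size and the area-drop heuristic are assumptions for illustration; the paper's event-based algorithms are more elaborate.

```python
import numpy as np

def contact_area_series(events: np.ndarray, win: float = 5e3) -> np.ndarray:
    """Per-window contact-area proxy: number of distinct active pixels
    among events whose timestamps fall in each window.
    `events` is an (N, 3) array of (x, y, t) rows."""
    t0, t1 = events[:, 2].min(), events[:, 2].max()
    edges = np.arange(t0, t1 + win, win)
    areas = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = events[(events[:, 2] >= lo) & (events[:, 2] < hi)]
        areas.append(len(np.unique(w[:, :2].astype(int), axis=0)))
    return np.array(areas)  # a drop between windows is one possible slip cue
```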
IEEE Access
The neuromorphic vision sensor is an attractive technology offering high dynamic range and low latency, which are crucial in robotic applications. However, the lack of event-based data in this field limits the sensors' performance in real-world environments. In this paper, we propose a novel augmentation technique for neuromorphic vision sensors to improve contact force measurements from events. The proposed method, 'Temporal Event Shifting', shifts a proportion of events across the time domain to augment the dataset. A new set of grasping experiments is performed to validate and analyze the effectiveness of the proposed augmentation method for contact force measurements. The results indicate that temporal event shifting is a highly effective augmentation method which improves the models' accuracy for contact force estimation by thirty percent without performing new experiments. Index terms: event-based augmentation, neuromorphic augmentation, vision-based tactile sensor.
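A minimal sketch of the shifting idea follows: a random proportion of events gets a temporal offset, producing a new but physically plausible event stream. The (x, y, t, polarity) layout, the uniform-jitter distribution and both defaults are assumptions for illustration; the abstract does not give the exact shifting rule.

```python
import numpy as np

def temporal_event_shift(events: np.ndarray, fraction: float = 0.2,
                         max_shift: float = 1e3, seed: int = 0) -> np.ndarray:
    """Shift a random proportion of events along the time axis.

    `events` is an (N, 4) array of (x, y, t, polarity) rows; `fraction`
    of them receive a random temporal offset.
    """
    rng = np.random.default_rng(seed)
    out = events.copy()
    idx = rng.choice(len(out), size=int(fraction * len(out)), replace=False)
    out[idx, 2] += rng.uniform(-max_shift, max_shift, size=len(idx))  # jitter timestamps
    return out[np.argsort(out[:, 2])]  # keep the stream temporally ordered
```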
Pattern Recognition Letters, Sep 1, 2010
We propose an advanced framework for the automatic configuration of spectral dimensionality reduction methods. This is achieved by introducing, first, the mutual information measure to assess the quality of discovered embedded spaces. Secondly, an unsupervised Radial Basis Function network is designed for mapping between spaces, where the learning process is derived from graph theory and based on the Markov cluster algorithm. Experiments on synthetic and real datasets demonstrate the effectiveness of the proposed methodology.
In this paper, our main contribution is a framework for the automatic configuration of any spectral dimensionality reduction method. This is achieved, first, by introducing the mutual information measure to assess the quality of discovered embedded spaces. Secondly, we overcome the lack of a mapping function in spectral dimensionality reduction approaches by proposing data projection between spaces based on a fully automatic and dynamically adjustable Radial Basis Function network. Finally, this automatic framework is evaluated in the context of 3D human pose estimation. We demonstrate that the mutual information measure outperforms all current space assessment metrics. Moreover, experiments show that the mapping associated with the induced embedded space displays good generalization properties. In particular, it improves accuracy by around 30% when refining 3D pose estimates of a walking sequence produced by an activity-independent method.
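To illustrate mutual-information-based embedding assessment, the sketch below scores an embedding by the mutual information between pairwise distances in the original and embedded spaces. This distance-based proxy is an assumption for illustration; the papers' exact mutual information formulation may differ.

```python
import numpy as np
from scipy.spatial.distance import pdist
from sklearn.feature_selection import mutual_info_regression

def embedding_quality(X_high: np.ndarray, X_low: np.ndarray) -> float:
    """Score an embedding by the mutual information between pairwise
    distances in the original and embedded spaces."""
    d_high = pdist(X_high)   # condensed pairwise-distance vectors
    d_low = pdist(X_low)
    mi = mutual_info_regression(d_low.reshape(-1, 1), d_high, random_state=0)
    return float(mi[0])      # higher = structure better preserved

# Usage: compare candidate embeddings (e.g. different spectral-method
# settings) and keep the configuration with the highest score.
```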
Springer eBooks, Dec 22, 2005
An important capability of an ambient intelligent environment is the capacity to detect, locate and identify objects of interest. In many cases interesting objects can move, and in order to provide meaningful interaction, capturing and tracking their motion creates a perceptively-enabled interface, capable of understanding and reacting to a wide range of actions and activities. CCTV systems fulfill an increasingly important role in the modern world, providing live video access to remote environments. Whilst the role of CCTV has been primarily focused on ...
This paper proposes a two-level 3D human pose tracking method for a specific action captured by several cameras. The generation of pose estimates relies on fitting a 3D articulated model to a Visual Hull generated from the input images. First, an initial pose estimate is constrained by a low-dimensional manifold learnt by Temporal Laplacian Eigenmaps. Then, an improved global pose is calculated by refining individual limb poses. The validation of our method uses a public standard dataset and demonstrates its accuracy and computational efficiency.
Lecture Notes in Computer Science, 2008
A novel probabilistic formulation for 2-D human pose recovery from monocular images is proposed. It relies on a bottom-up approach based on an iterative process between clustering and body model fitting. Body parts are segmented from the foreground by clustering a set of image cues. Clustering is driven by 2D human body model fitting to obtain an optimal segmentation, while the model is resized and its articulated configuration is updated according to the clustering result. This method requires neither a training stage nor any prior knowledge of poses and appearance, as the characteristics of body parts are already embedded in the integrated cues. Furthermore, a probabilistic confidence measure is proposed to evaluate the expected accuracy of recovered poses. Experimental results demonstrate the accuracy and robustness of this new algorithm by estimating 2-D human poses from walking sequences.
Ultrasound in Medicine and Biology, Jun 1, 2019
This study investigates the application and evaluation of existing indirect methods, namely point-based registration techniques, for the estimation and compensation of observed motion in the 2D image plane of Contrast-Enhanced Ultrasound (CEUS) cine-loops recorded for the characterization and diagnosis of focal liver lesions (FLL). The value of applying motion compensation in the challenging modality of CEUS is to assist the quantification of the perfusion dynamics of an FLL in relation to its parenchyma, allowing for a potentially accurate diagnostic suggestion. Towards this end, this study also proposes a novel quantitative multi-level framework for evaluating the quantification of FLLs, which to the best of our knowledge remains undefined, notwithstanding many relevant studies. Our results suggest the 'compact and real-time descriptor' as the optimal indirect motion compensation method in CEUS, following a quantitative evaluation of nineteen other indirect algorithms and configurations, while also considering the requirement for computational efficiency.
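As background on the point-based registration family the study evaluates, the sketch below shows one generic member: a least-squares 2D rigid transform (Kabsch/Procrustes) between matched landmarks. It is not the study's 'compact and real-time descriptor', and the landmark correspondences are assumed given.

```python
import numpy as np

def estimate_rigid_motion(src: np.ndarray, dst: np.ndarray):
    """Least-squares 2D rigid transform between matched point sets.

    `src`, `dst` are (N, 2) arrays of corresponding landmarks.
    Returns rotation R (2x2) and translation t (2,), with dst ~ src @ R.T + t,
    which can then be inverted to compensate frame-to-frame motion.
    """
    mu_s, mu_d = src.mean(0), dst.mean(0)
    H = (src - mu_s).T @ (dst - mu_d)        # cross-covariance of centred points
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t
```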
This paper presents a novel auto-calibration method from unconstrained human body motion. It relies on the underlying biomechanical constraints associated with human bipedal locomotion. By analysing the positions of key points during a sequence, our technique is able to detect frames where the human body adopts a particular posture which ensures the coplanarity of those key points and therefore allows a successful camera calibration. Our technique includes a 3D model adaptation phase which removes the requirement for a precise geometrical 3D description of those points. Our method is validated using a variety of human bipedal motions and camera configurations.
Pattern Analysis and Applications, Dec 1, 2004
This paper presents a framework for event detection and video content analysis for visual surveillance applications. The system is able to coordinate the tracking of objects between multiple camera views, which may be overlapping or non-overlapping. The key novelty of our approach is that we can automatically learn a semantic scene model for a surveillance region, and have defined data models to support the storage of tracking data with different layers of abstraction in a surveillance database. The surveillance database provides a mechanism to generate video content summaries of objects detected by the system across the entire surveillance region in terms of the semantic scene model. In addition, the surveillance database supports spatiotemporal queries, which can be applied to event detection and notification applications.
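To make the spatiotemporal-query idea concrete, here is a toy sketch against a flat (object, time, x, y) table; the paper's actual data models are layered and richer, so the schema and the example query window are purely illustrative.

```python
import sqlite3

# Minimal in-memory tracking table plus one spatiotemporal query.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE tracks (
    object_id INTEGER, t REAL, x REAL, y REAL, camera TEXT)""")

# "Which objects passed through region (x0..x1, y0..y1) during t0..t1?"
rows = conn.execute("""
    SELECT DISTINCT object_id FROM tracks
    WHERE t BETWEEN ? AND ? AND x BETWEEN ? AND ? AND y BETWEEN ? AND ?""",
    (0.0, 60.0, 100.0, 200.0, 100.0, 200.0)).fetchall()
```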