Mircea Paul Muresan | Technical University of Cluj-Napoca

Papers by Mircea Paul Muresan

Stereo and Mono Depth Estimation Fusion for an Improved and Fault Tolerant 3D Reconstruction

2021 IEEE 17th International Conference on Intelligent Computer Communication and Processing (ICCP)

Pose Based Pedestrian Street Cross Action Recognition in Infrared Images

2021 IEEE 17th International Conference on Intelligent Computer Communication and Processing (ICCP)

Compact Solution for Low Earth Orbit Surveillance

2021 IEEE 17th International Conference on Intelligent Computer Communication and Processing (ICCP)

Real-time object detection using a sparse 4-layer LIDAR

2017 13th IEEE International Conference on Intelligent Computer Communication and Processing (ICCP)

The robust detection of obstacles on a given road path by vehicles equipped with range measurement devices is a requirement for many research fields, including autonomous driving and advanced driver assistance systems. One particular sensor system used for measurement tasks, due to its known accuracy, is the LIDAR (Light Detection and Ranging). The commercial price and computational cost of such systems generally increase with the number of scanning layers. For this reason, this paper presents a novel six-step obstacle detection approach using a 4-layer LIDAR. In the proposed pipeline we tackle the problem of data correction and temporal point cloud fusion, and we present an original method for detecting obstacles using a combination of a polar histogram and an elevation grid. The results have been validated against objects provided by other range measurement sensors.
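The abstract names two data structures, a polar histogram and an elevation grid, without giving their details. The following is only a minimal sketch of what such structures commonly look like, not the paper's method: the function names, the bin count, the cell size and the choice of "nearest range per angular bin" and "maximum height per cell" are all assumptions made for illustration.

```python
import math


def polar_histogram(points, num_bins=8):
    """Nearest horizontal range per angular bin (assumed layout).

    points: iterable of (x, y, z) in a sensor-centred frame.
    Returns a list of length num_bins; empty bins hold math.inf.
    """
    nearest = [math.inf] * num_bins
    for x, y, _ in points:
        angle = math.atan2(y, x) % (2 * math.pi)
        # Clamp guards against angle rounding up to exactly 2*pi.
        b = min(int(angle / (2 * math.pi) * num_bins), num_bins - 1)
        nearest[b] = min(nearest[b], math.hypot(x, y))
    return nearest


def elevation_grid(points, cell=1.0):
    """Maximum z per square ground cell (assumed layout).

    Returns a dict mapping (col, row) cell indices to the highest z seen.
    """
    grid = {}
    for x, y, z in points:
        key = (math.floor(x / cell), math.floor(y / cell))
        grid[key] = max(grid.get(key, -math.inf), z)
    return grid


# Hypothetical points, chosen only to exercise both structures.
cloud = [(1.0, 0.0, 0.2), (1.4, 0.3, 0.9), (2.0, 0.1, 1.5),
         (-0.5, 3.0, 0.5), (-1.0, -1.0, 0.3)]
hist = polar_histogram(cloud, num_bins=4)
grid = elevation_grid(cloud, cell=1.0)
```

A detector could then, for example, flag cells whose height exceeds a threshold and use the histogram to find the closest return in each direction; how the paper actually combines the two is not stated in the abstract.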

Automatic Vision Inspection Solution for the Manufacturing Process of Automotive Components Through Plastic Injection Molding

2020 IEEE 16th International Conference on Intelligent Computer Communication and Processing (ICCP)

In the automotive industry, vehicle components can be obtained through the process of plastic injection molding. The components can be fixed in the vehicle by using metal bushings, which are placed on the injection mold before the beginning of the plastic injection process. The incorrect placement or absence of the bushings leads to a defective product. Object classification has been a long-tackled problem in Computer Science, and the breakthroughs in Artificial Intelligence allow us to solve this problem with a high degree of accuracy and precision. In the context of industrial inspection, ensuring high-quality products is a matter of utmost importance. The automated vision inspection process facilitates the creation of a high volume of products that conform to the quality standards in a short amount of time, and can save the manufacturing company money. In this paper we propose a solution that automatically detects the injection mold and classifies the positioning of the bushings, warning the operator in case the current positioning can lead to a defective product. Furthermore, the bushing detection is optimized so that the mold needs to be visible only in the first frame, thus reducing the running time of the whole pipeline. The results are validated by using a data set covering multiple possible working scenarios. Moreover, we compare the implemented classifier with other classifiers, highlighting the performance of the proposed solution with respect to running time and classification accuracy.

PartID – Individual Objects Tracking in Occupancy Grids Using Particle Identities

2020 IEEE 16th International Conference on Intelligent Computer Communication and Processing (ICCP)

Occupancy grid tracking algorithms see the world as made out of cells that can be either free or occupied, and can have speed probability densities attached to each cell. These algorithms estimate the overall state of the environment based on sensor data, but they are neither aware of nor concerned with the identity of individual objects. This paper proposes a new approach for individual object tracking using dynamic occupancy grids, which embeds the identity of the objects in the grid state. The particle-based dynamic occupancy grid is extended by attaching identity information to each particle, thus achieving individual object tracking at grid level without the need for explicit modeling of the object’s shape. The position and dynamics of the occupied cells of the world are tracked independently of their identity, by the mechanism of the particle-based occupancy grid. To achieve individual object tracking, this mechanism is extended with components for assigning and managing the identity of the particles. The designed system was tested on real-world sequences, and was able to successfully track obstacles found on the road without making assumptions about their nature, shape or size.

Teeth Detection and Dental Problem Classification in Panoramic X-Ray Images using Deep Learning and Image Processing Techniques

2020 IEEE 16th International Conference on Intelligent Computer Communication and Processing (ICCP)

Deep convolutional neural networks have gained a lot of popularity in medical research due to their impressive results in detection, prediction and classification. Analysis of panoramic dental radiographies helps specialists observe problems in poor visibility areas, inside the buccal cavity or in hard-to-reach areas. However, poor image quality or fatigue can cause the diagnosis to vary, which can ultimately hinder the treatment. In this paper we propose a novel approach to automatic teeth detection and dental problem classification using panoramic X-Ray images, which can aid the medical staff in making decisions regarding the correct diagnosis. For this endeavor, panoramic radiographies were collected from three dental clinics and annotated, highlighting 14 different dental issues that can appear. A CNN was trained using the annotated data for obtaining semantic segmentation information. Next, multiple image processing operations were performed for segmenting and refining the bounding boxes corresponding to the teeth detections. Finally, each tooth instance was labeled and the problem affecting it was identified using a histogram-based majority voting within the detected region of interest. The implemented solution was evaluated with respect to several metrics, such as intersection over union for the semantic segmentation and accuracy, precision, recall and F1-score for the generated bounding box detections. The results were compared qualitatively with the data obtained from other approaches, illustrating the superiority of the proposed solution.
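The labeling step and the evaluation metrics mentioned in the abstract can be illustrated with a small sketch. It is written under assumptions rather than taken from the paper: the label map is a plain list of rows of integer class ids, class 0 is treated as background, boxes are (x_min, y_min, x_max, y_max) with exclusive upper bounds, and all function names are hypothetical.

```python
from collections import Counter


def majority_label(label_map, box, background=0):
    """Most frequent non-background class inside a box, or None if there is none."""
    x_min, y_min, x_max, y_max = box
    votes = Counter(
        label_map[y][x]
        for y in range(y_min, y_max)
        for x in range(x_min, x_max)
        if label_map[y][x] != background
    )
    return votes.most_common(1)[0][0] if votes else None


def class_iou(pred, truth, cls):
    """Pixel-level intersection over union for one class."""
    inter = union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += (p == cls and t == cls)
            union += (p == cls or t == cls)
    return inter / union if union else 0.0


def precision_recall_f1(tp, fp, fn):
    """Detection metrics from true positive, false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical 3x4 segmentation output and ground truth.
pred = [[0, 0, 2, 2],
        [0, 3, 2, 2],
        [0, 3, 3, 0]]
truth = [[0, 0, 2, 2],
         [0, 2, 2, 2],
         [0, 3, 3, 0]]
tooth_label = majority_label(pred, (1, 0, 4, 3))
iou_class_2 = class_iou(pred, truth, 2)
```

Voting over the whole region rather than trusting a single pixel makes the per-tooth label robust to small segmentation errors, which is presumably why a histogram is used.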

Pedestrian Street-Cross Action Recognition in Monocular Far Infrared Sequences

Multi-Object Tracking of 3D Cuboids Using Aggregated Features

2019 IEEE 15th International Conference on Intelligent Computer Communication and Processing (ICCP)

The unknown correspondence between measurements and targets, referred to as data association, is one of the main challenges of multi-target tracking. Each new measurement received could be the continuation of some previously detected target, the first detection of a new target or a false alarm. Tracking 3D cuboids is particularly difficult due to the large amount of data, which can include erroneous or noisy sensor information that leads to false measurements, detections from an unknown number of objects which may not be consistent over frames, or varying object properties like dimension and orientation. In the self-driving car context, the target tracking module plays an important role because the ego vehicle has to make predictions regarding the position and velocity of the surrounding objects in the next time epoch, plan actions and make the correct decisions. To tackle the above-mentioned problems and other issues coming from the self-driving car processing pipeline, we propose three original contributions: 1) a novel affinity measurement function to associate measurements and targets using multiple types of features coming from LIDAR and camera, 2) a context-aware descriptor for 3D objects that improves the data association process, 3) a framework that includes a module for tracking dimensions and orientation of objects. The implemented solution runs in real time, and experiments performed on real-world urban scenarios show that the presented method is effective and robust even in a highly dynamic environment.
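The data association problem described above can be made concrete with a generic sketch. This is not the paper's affinity function: the two features (centre distance and a scalar size), their weights, the exponential distance scale, the gating threshold and the greedy matching strategy are all assumptions chosen for illustration, and sizes are assumed to be positive.

```python
import math


def affinity(track, meas, w_pos=0.7, w_size=0.3, pos_scale=2.0):
    """Weighted similarity in [0, 1] from centre distance and size ratio."""
    dist = math.dist(track["center"], meas["center"])
    pos_sim = math.exp(-dist / pos_scale)
    size_sim = (min(track["size"], meas["size"])
                / max(track["size"], meas["size"]))
    return w_pos * pos_sim + w_size * size_sim


def associate(tracks, measurements, min_affinity=0.5):
    """Greedy one-to-one matching, best affinity first.

    Returns (matches, unmatched): matches maps track index to measurement
    index; unmatched lists measurements that may be new targets or false alarms.
    """
    pairs = sorted(
        ((affinity(t, m), ti, mi)
         for ti, t in enumerate(tracks)
         for mi, m in enumerate(measurements)),
        reverse=True,
    )
    matches, used_tracks, used_meas = {}, set(), set()
    for score, ti, mi in pairs:
        if score < min_affinity:
            break  # remaining pairs are all below the gate
        if ti in used_tracks or mi in used_meas:
            continue
        matches[ti] = mi
        used_tracks.add(ti)
        used_meas.add(mi)
    unmatched = [mi for mi in range(len(measurements)) if mi not in used_meas]
    return matches, unmatched


# Hypothetical tracks and measurements.
tracks = [{"center": (0.0, 0.0, 0.0), "size": 4.0},
          {"center": (10.0, 0.0, 0.0), "size": 2.0}]
measurements = [{"center": (10.2, 0.0, 0.0), "size": 2.1},
                {"center": (0.1, 0.0, 0.0), "size": 4.0},
                {"center": (50.0, 50.0, 0.0), "size": 1.0}]
matches, unmatched = associate(tracks, measurements)
```

Greedy matching is used here only because it is short; an optimal assignment method such as the Hungarian algorithm is a common alternative when the affinity matrix is ambiguous.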

Dot Matrix OCR for Bottle Validity Inspection

2019 IEEE 15th International Conference on Intelligent Computer Communication and Processing (ICCP)

Identifying expiration dates on water bottles for fast industrial processing of a large number of products mainly affects water bottling factories, food warehouses and supermarkets. The impact of this problem is the distribution of expired bottles of water or their presence on the market. One of the key issues for automatic character readers from bottles is dotted text. Furthermore, the transparent and curved nature of the bottle, combined with the presence of water, increases the difficulty of the text extraction. An optical character recognition solution using a convolutional neural network is proposed to solve this issue. The proposed solution segments the input image to extract the bottle, detects the text region of interest, performs pre-processing operations and finally converts the characters extracted from the region of interest on the bottle into human-readable characters. The proposed solution has real-time performance and achieves high-quality results on the evaluation set.

A Multi Patch Warping Approach for Improved Stereo Block Matching

Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, 2017

Stereo cameras are a suitable solution for reconstructing the 3D information of the observed scenes and, because of their low price and ease of setup and operation, they can be used in a wide range of applications, from autonomous driving to advanced driver assistance systems or robotics. Due to the high quality of the results, energy-based reconstruction methods like semi-global matching have gained a lot of popularity in recent years. The disadvantages of semi-global matching are the large memory footprint and the high computational complexity. In contrast, window-based matching methods have a lower complexity and are leaner with respect to memory consumption. The downside of block matching methods is that they are more error prone, especially on surfaces which are not parallel to the image plane. In this paper we present a novel block matching scheme that improves the quality of local stereo correspondence algorithms. The first contribution of the paper is an original method for reliably reconstructing the environment on slanted surfaces. The second contribution is the creation of a set of local constraints that filter out possible outlier disparity values. The third and final contribution is a refinement technique which improves the resulting disparity map. The proposed stereo correspondence approach has been validated on the KITTI stereo dataset.

Patch warping and local constraints for improved block matching stereo correspondence

2016 IEEE 12th International Conference on Intelligent Computer Communication and Processing (ICCP), 2016

Depth estimation of the surrounding environment using a stereoscopic camera setup is an important and fundamental research topic in computer vision. Due to its running time and quality performance in real situations, the semi-global matching algorithm is often used. The biggest disadvantage of the semi-global approach is its large memory footprint. On the other hand, block matching stereo is leaner when it comes to memory consumption and is therefore commonly used in resource-constrained applications to obtain coarse depth information of the environment. The poor quality performance of such algorithms makes them impractical for many real-life applications. In this paper we focus on improving the quality of the classical block matching (BM) stereo method by proposing a novel approach which tackles the problem of stereo matching for slanted and fronto-parallel surfaces by using different types of binary masks on the matching window. Another improvement consists in the usage of different types of local constraints in the generation of the winning disparity for a specific position, such that possible outliers are eliminated from the start. The validation of our results has been done on the KITTI stereo benchmark dataset.
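For readers unfamiliar with the classical block matching baseline that the abstract sets out to improve, here is a minimal sketch. It shows only the plain winner-takes-all scheme with a sum of absolute differences (SAD) cost on a rectified pair, without the binary masks or local constraints the paper proposes. Images are assumed to be lists of rows of grey values, and the caller is assumed to keep the window inside the left image; the function names are hypothetical.

```python
def sad_cost(left, right, row, col, disparity, half):
    """Sum of absolute differences between a left window and the right
    window shifted by `disparity` pixels along the same row."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[row + dy][col + dx]
                         - right[row + dy][col + dx - disparity])
    return total


def best_disparity(left, right, row, col, max_disp, half=1):
    """Winner-takes-all disparity for one pixel of the left image."""
    # Restrict candidates so the shifted window stays inside the right image.
    limit = min(max_disp, col - half)
    costs = [sad_cost(left, right, row, col, d, half)
             for d in range(limit + 1)]
    return min(range(len(costs)), key=costs.__getitem__)


# Hypothetical rectified pair: the left view is the right view shifted by 2 px.
height, width, true_disp = 5, 12, 2
right_img = [[c * c + r for c in range(width)] for r in range(height)]
left_img = [[right_img[r][c - true_disp] if c >= true_disp else 0
             for c in range(width)] for r in range(height)]
estimate = best_disparity(left_img, right_img, row=2, col=6, max_disp=4)
```

Because the square window implicitly assumes a constant disparity across all its pixels, this baseline degrades on slanted surfaces, which is the weakness the abstract describes.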

Improving local stereo algorithms using binary shifted windows, fusion and smoothness constraint

2015 IEEE International Conference on Intelligent Computer Communication and Processing (ICCP), 2015

Stereo cameras are a viable solution for reconstructing 3D scenes and are well suited for advanced driver assistance systems, autonomous driving and robotics applications. Modern stereo reconstruction algorithms offer good results, but require a large amount of memory, and their real-time capabilities are limited on modern-day processors. On the other hand, local window aggregation algorithms have a small memory footprint, are very fast and can be ported to embedded devices, although they provide a lower number of 3D reconstructed points and are more error prone in the case of occluded and slanted surfaces. In this paper we propose a novel local block matching method which has increased quality and is suitable for real-time processing with hardware acceleration (satisfying running time). Our first contribution consists in the introduction of two new binary descriptors used for block matching. The second contribution lies in the shifting method implemented for the matching windows, in order to capture slanted surfaces, together with the fusion of the results obtained for fronto-parallel surfaces. Here we propose and compare two fusion methods: a naive approach and a gradient-based approach. The final contribution consists in a smoothness constraint applied to neighboring pixels. The results have been tested on images from the Middlebury benchmark and also on real traffic scenes.

Vision algorithms and embedded solution for pedestrian detection with far infrared camera

2014 IEEE 10th International Conference on Intelligent Computer Communication and Processing (ICCP), 2014

In the automotive industry, safety remains a major priority. This concern covers not just the driver but also the other traffic participants, such as pedestrians. This paper describes a pedestrian detection system where three different classification methods are used for detecting pedestrians with a far infrared camera. The three methods are tested and compared on a variable number of features in order to obtain a scalable solution. The authors propose a low-cost embedded implementation for the classification method that has proven to be best with respect to accuracy and training time, using HOG as the feature descriptor for the region of interest.

Multimodal sparse LIDAR object tracking in clutter

2018 IEEE 14th International Conference on Intelligent Computer Communication and Processing (ICCP)

One of the key components of the perception system in an autonomous vehicle or ADAS is the target tracking module. Using target tracking in a sea of clutter, self-driving cars are able to better understand the environment and make predictions about the surrounding objects. Cuboids obtained from a sparse LIDAR often exhibit a fluctuating behavior due to segmentation problems and errors accumulated from the motion correction module. Furthermore, targets in real-life scenarios do not move in a predictable manner, so it is very difficult for a classical motion model to describe the complex behavior of road objects in such cases. In this paper we propose a two-step data association scheme that efficiently and effectively finds correspondences between tracks and measurements. We then aim to generate better position estimates for objects with an ambiguous dynamic behavior by associating and combining the results from two different motion models. The proposed solution runs in real time and was validated using a high-precision GPS, and also by projecting the prediction results onto the corresponding intensity image and assessing whether the prediction falls on the correct item.

Robust Data Association Using Fusion of Data-Driven and Engineered Features for Real-Time Pedestrian Tracking in Thermal Images

Sensors

Object tracking is an essential problem in computer vision that has been extensively researched for decades. Tracking objects in thermal images is particularly difficult because of the lack of color information, low image resolution, or high similarity between objects of the same class. One of the main challenges in multi-object tracking, also referred to as the data association problem, is finding the correct correspondences between measurements and tracks and adapting to object appearance changes over time. We addressed this challenge of data association for thermal images by proposing three contributions. The first contribution consisted of the creation of a data-driven appearance score using five Siamese Networks, which operate on the image detection and on parts of it. Secondly, we engineered an original edge-based descriptor that improves the data association process. Lastly, we proposed a dataset consisting of pedestrian instances that were recorded in different scenarios an...

Stabilization and Validation of 3D Object Position Using Multimodal Sensor Fusion and Semantic Segmentation

Sensors 2020, 20(4)

The stabilization and validation process of the measured position of objects is an important step for high-level perception functions and for the correct processing of sensory data. The goal of this process is to detect and handle inconsistencies between different sensor measurements, which result from the perception system. The aggregation of the detections from different sensors consists in the combination of the sensorial data in one common reference frame for each identified object, leading to the creation of a super-sensor. The result of the data aggregation may end up with errors such as false detections, misplaced object cuboids or an incorrect number of objects in the scene. The stabilization and validation process is focused on mitigating these problems. The current paper proposes four contributions for solving the stabilization and validation task for autonomous vehicles, using the following sensors: trifocal camera, fisheye camera, long-range RADAR (Radio Detection and Ranging), and 4-layer and 16-layer LIDARs (Light Detection and Ranging). We propose two original data association methods used in the sensor fusion and tracking processes. The first data association algorithm is created for tracking LIDAR objects and combines multiple appearance and motion features in order to exploit the available information for road objects. The second novel data association algorithm is designed for trifocal camera objects and has the objective of finding measurement correspondences to sensor-fused objects such that the super-sensor data are enriched by adding the semantic class information. The implemented trifocal object association solution uses a novel polar association scheme combined with a decision tree to find the best hypothesis-measurement correlations.
Another contribution we propose for stabilizing the position and handling the unpredictable behavior of road objects, provided by multiple types of complementary sensors, is the use of a fusion approach based on the Unscented Kalman Filter and a single-layer perceptron. The last novel contribution is related to the validation of the 3D object position, which is solved using a fuzzy logic technique combined with a semantic segmentation image. The proposed algorithms have real-time performance, achieving a cumulative running time of 90 ms, and have been evaluated using ground truth data extracted from a high-precision GPS (global positioning system) with 2 cm accuracy, obtaining an average error of 0.8 m.

Research paper thumbnail of Multi-Object tracking of 3D cuboids using aggregated features

the unknown correspondences of measurements and targets, referred to as data association, is one ... more the unknown correspondences of measurements and targets, referred to as data association, is one of the main challenges of multi-target tracking. Each new measurement received could be the continuation of some previously detected target, the first detection of a new target or a false alarm. Tracking 3D cuboids, is particularly difficult due to the high amount of data, which can include erroneous or noisy information coming from sensors, that can lead to false measurements, detections from an unknown number of objects which may not be consistent over frames or varying object properties like dimension and orientation. In the self-driving car context, the target tracking module holds an important role due to the fact that the ego vehicle has to make predictions regarding the position and velocity of the surrounding objects in the next time epoch, plan for actions and make the correct decisions. To tackle the above mentioned problems and other issues coming from the self-driving car processing pipeline we propose three original contributions: 1) designing a novel affinity measurement function to associate measurements and targets using multiple types of features coming from LIDAR and camera, 2) a context aware descriptor for 3D objects that improves the data association process, 3) a framework that includes a module for tracking dimensions and orientation of objects. The implemented solution runs in real time and experiments that were performed on real world urban scenarios prove that the presented method is effective and robust even in a highly dynamic environment.

Research paper thumbnail of Dot Matrix OCR for Bottle Validity Inspection

Identifying expiration dates on water bottles for fast industrial processing of a large amount of... more Identifying expiration dates on water bottles for fast industrial processing of a large amount of products mainly affects water bottling factories, food warehouses and supermarkets. The impact of this problem is the distribution of expired bottles of water or their existence on the market. One of the key issues for automatic character readers from bottles is dotted text. Furthermore, the transparent and curved nature of the containing bottle in the presence of water increases the difficulty of the text extraction. An optical character recognition solution using a convolutional neural networks is proposed to solve this issue. The proposed solution segments the input image to extract the bottle, detects the text region of interest, then performs pre-processing operations and finally converts the characters extracted from the region of interest on the bottle in human-readable characters. The proposed solution has real time performance and it achieves high quality results on the evaluation set.

Research paper thumbnail of Stereo and Mono Depth Estimation Fusion for an Improved and Fault Tolerant 3D Reconstruction

2021 IEEE 17th International Conference on Intelligent Computer Communication and Processing (ICCP)

Research paper thumbnail of Pose Based Pedestrian Street Cross Action Recognition in Infrared Images

2021 IEEE 17th International Conference on Intelligent Computer Communication and Processing (ICCP)

Research paper thumbnail of Compact Solution for Low Earth Orbit Surveillance

2021 IEEE 17th International Conference on Intelligent Computer Communication and Processing (ICCP)

Research paper thumbnail of Real-time object detection using a sparse 4-layer LIDAR

2017 13th IEEE International Conference on Intelligent Computer Communication and Processing (ICCP)

The robust detection of obstacles, on a given road path by vehicles equipped with range measureme... more The robust detection of obstacles, on a given road path by vehicles equipped with range measurement devices represents a requirement for many research fields including autonomous driving and advanced driving assistance systems. One particular sensor system used for measurement tasks, due to its known accuracy, is the LIDAR (Light Detection and Ranging). The commercial price and computational intensiveness of such systems generally increase with the number of scanning layers. For this reason, in this paper, a novel six step based obstacle detection approach using a 4-layer LIDAR is presented. In the proposed pipeline we tackle the problem of data correction and temporal point cloud fusion and we present an original method for detecting obstacles using a combination between a polar histogram and an elevation grid. The results have been validated by using objects provided from other range measurement sensors.

Research paper thumbnail of Automatic Vision Inspection Solution for the Manufacturing Process of Automotive Components Through Plastic Injection Molding

2020 IEEE 16th International Conference on Intelligent Computer Communication and Processing (ICCP)

In the automotive industry, vehicle components can be obtained through the process of plastic inj... more In the automotive industry, vehicle components can be obtained through the process of plastic injection molding. The components can be fixed in the vehicle by using metal bushings, which are placed on the injection mold before the beginning of the plastic injection process. The incorrect placement or absence of the bushings leads to a defective product. Object classification has been a long-tackled problem in Computer Science, and the breakthroughs in Artificial Intelligence allows us to solve this problem with a high degree of accuracy and precision. In the context of industrial inspection, ensuring high-quality products is a matter of utmost importance. The automated vision inspection process facilitates the creation of a high volume of products, which are in conformity with the quality standards, in a short amount of time and can save the manufacturing company money. In this paper we propose a solution that automatically detects the injection mold and classifies the positioning of the bushings, warning the operator in case the current positioning can lead to a defective product. Furthermore, the bushing detection is optimized in such a way that, it is mandatory for the mold to be visible only in the first frame, thus reducing the running time of the whole pipeline. The results are validated by using a data set covering multiple possible working scenarios. Moreover, we compare the implemented classifier with other classifiers, highlighting the performance of the proposed solution with respect to running time and classification accuracy.

Research paper thumbnail of PartID – Individual Objects Tracking in Occupancy Grids Using Particle Identities

2020 IEEE 16th International Conference on Intelligent Computer Communication and Processing (ICCP)

Occupancy grid tracking algorithms see the world as made out of cells that can be either free or ... more Occupancy grid tracking algorithms see the world as made out of cells that can be either free or occupied, and can have speed probability densities attached to each cell. These algorithms estimate the overall state of the environment based on sensor data, but they are not aware of, nor concerned with the identity of individual objects. This paper proposes a new approach for individual objects tracking using the dynamic occupancy grids, which will embed the identity of the objects in the grid state. The particle based dynamic occupancy grid is extended by attaching identity information to each particle, thus achieving individual object tracking at grid level without the need of explicit modeling of the object’s shape. The position and dynamics of the world occupied cells are tracked independently of their identity, by the mechanism of the particle based occupancy grid. For achieving individual object tracking, this mechanism is extended with components for assigning and managing the identity of the particles. The designed system was tested on real world sequences, and was able to successfully track obstacles found on the road without making assumptions about their nature, shape or size.

Research paper thumbnail of Teeth Detection and Dental Problem Classification in Panoramic X-Ray Images using Deep Learning and Image Processing Techniques

2020 IEEE 16th International Conference on Intelligent Computer Communication and Processing (ICCP)

Deep convolutional neural networks, have gained a lot popularity in medical research due to their... more Deep convolutional neural networks, have gained a lot popularity in medical research due to their impressive results in detection, prediction and classification. Analysis of panoramic dental radiographies help specialists observe problems in poor visibility areas, inside the buccal cavity or in hard to reach areas. However, poor image quality or fatigue can cause the diagnosis to vary, which can ultimately hinder the treatment. In this paper we propose a novel approach of automatic teeth detection and dental problem classification using panoramic X-Ray images which can aid the medical staff in making decisions regarding the correct diagnosis. For this endeavor panoramic radiographies were collected from three dental clinics and annotated, highlighting 14 different dental issues that can appear. A CNN was trained using the annotated data for obtaining semantic segmentation information. Next, multiple image processing operations were performed for segmenting and refining the bounding boxes corresponding to the teeth detections. Finally, each tooth instance was labeled and the problem affecting it was identified using a histogram-based majority voting within the detected region of interest. The implemented solution was evaluated with respect to several metrics like intersection over union for the semantic segmentation and accuracy, precision, recall and F1-score for the generated bounding box detections. The results were compared qualitatively with the data obtained from other approaches illustrating the superiority of the proposed solution.

Pedestrian Street-Cross Action Recognition in Monocular Far Infrared Sequences

Multi-Object Tracking of 3D Cuboids Using Aggregated Features

2019 IEEE 15th International Conference on Intelligent Computer Communication and Processing (ICCP)

The unknown correspondences of measurements and targets, referred to as data association, are one of the main challenges of multi-target tracking. Each new measurement received could be the continuation of some previously detected target, the first detection of a new target or a false alarm. Tracking 3D cuboids is particularly difficult due to the large amount of data, which can include erroneous or noisy sensor information that leads to false measurements, detections from an unknown number of objects that may not be consistent over frames, and varying object properties like dimension and orientation. In the self-driving car context, the target tracking module plays an important role because the ego vehicle has to make predictions regarding the position and velocity of the surrounding objects in the next time epoch, plan actions and make the correct decisions. To tackle the above-mentioned problems and other issues coming from the self-driving car processing pipeline we propose three original contributions: 1) a novel affinity measurement function that associates measurements and targets using multiple types of features coming from LIDAR and camera, 2) a context-aware descriptor for 3D objects that improves the data association process, 3) a framework that includes a module for tracking the dimensions and orientation of objects. The implemented solution runs in real time, and experiments performed on real-world urban scenarios show that the presented method is effective and robust even in a highly dynamic environment.
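As a rough illustration of affinity-based data association, the sketch below scores every track and measurement pair and matches them greedily. The affinity terms (center distance and size similarity), the weights, the threshold and the greedy strategy are all hypothetical choices made for this example; the abstract does not specify the features or the assignment scheme used in the paper.

```python
import math


def affinity(track, meas, w_pos=0.6, w_size=0.4, pos_scale=2.0):
    """Hypothetical affinity in [0, 1] combining position and size cues.

    track, meas: dicts with "center" (x, y, z) and "size" (l, w, h),
    all sizes strictly positive.
    """
    dist = math.dist(track["center"], meas["center"])
    pos_term = math.exp(-dist / pos_scale)  # 1 when centers coincide
    ratios = [min(a, b) / max(a, b) for a, b in zip(track["size"], meas["size"])]
    size_term = sum(ratios) / len(ratios)  # 1 when dimensions match
    return w_pos * pos_term + w_size * size_term


def associate(tracks, measurements, min_affinity=0.5):
    """Greedy one-to-one matching by decreasing affinity.

    Returns (matches, unmatched): matches maps track index to measurement
    index; unmatched lists measurements left over, which are candidates
    for new targets or false alarms.
    """
    scored = sorted(
        (
            (affinity(t, m), ti, mi)
            for ti, t in enumerate(tracks)
            for mi, m in enumerate(measurements)
        ),
        reverse=True,
    )
    matches, used = {}, set()
    for score, ti, mi in scored:
        if score < min_affinity:
            break  # remaining pairs are all below the gate
        if ti in matches or mi in used:
            continue
        matches[ti] = mi
        used.add(mi)
    unmatched = [mi for mi in range(len(measurements)) if mi not in used]
    return matches, unmatched
```

A globally optimal assignment (for example, the Hungarian algorithm) could replace the greedy loop; the greedy version is used here only to keep the sketch short.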

Dot Matrix OCR for Bottle Validity Inspection

2019 IEEE 15th International Conference on Intelligent Computer Communication and Processing (ICCP)

Identifying expiration dates on water bottles for fast industrial processing of a large number of products mainly affects water bottling factories, food warehouses and supermarkets. The impact of this problem is the distribution of expired bottles of water or their presence on the market. One of the key issues for automatic character readers for bottles is dotted text. Furthermore, the transparent and curved nature of the bottle, combined with the presence of water, increases the difficulty of the text extraction. An optical character recognition solution using a convolutional neural network is proposed to solve this issue. The proposed solution segments the input image to extract the bottle, detects the text region of interest, performs pre-processing operations and finally converts the characters extracted from the region of interest on the bottle into human-readable characters. The proposed solution has real-time performance and achieves high-quality results on the evaluation set.

A Multi Patch Warping Approach for Improved Stereo Block Matching

Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, 2017

Stereo cameras are a suitable solution for reconstructing the 3D information of the observed scenes, and, because of their low price and ease of setup and operation, they can be used in a wide range of applications, from autonomous driving and advanced driver assistance systems to robotics. Due to the high quality of their results, energy-based reconstruction methods like semi-global matching have gained a lot of popularity in recent years. The disadvantages of semi-global matching are its large memory footprint and high computational complexity. In contrast, window-based matching methods have a lower complexity and are leaner with respect to memory consumption. The downside of block matching methods is that they are more error prone, especially on surfaces that are not parallel to the image plane. In this paper we present a novel block matching scheme that improves the quality of local stereo correspondence algorithms. The first contribution of the paper is an original method for reliably reconstructing the environment on slanted surfaces. The second contribution is a set of local constraints that filter out possible outlier disparity values. The third and final contribution is a refinement technique that improves the resulting disparity map. The proposed stereo correspondence approach has been validated on the KITTI stereo dataset.

Patch warping and local constraints for improved block matching stereo correspondence

2016 IEEE 12th International Conference on Intelligent Computer Communication and Processing (ICCP), 2016

Depth estimation of the surrounding environment using a stereoscopic camera setup is an important and fundamental research topic in computer vision. Due to its running time and quality in real situations, the semi-global matching algorithm is often used. The biggest disadvantage of the semi-global approach is its large memory footprint. On the other hand, block matching stereo is leaner when it comes to memory consumption and is therefore commonly used in resource-constrained applications to obtain coarse depth information about the environment. The poor quality of such algorithms makes them impractical for many real-life applications. In this paper we focus on improving the quality of the classical block matching (BM) stereo method by proposing a novel approach that tackles the problem of stereo matching for slanted and fronto-parallel surfaces by using different types of binary masks on the matching window. Another improvement consists in the use of different types of local constraints in the generation of the winning disparity for a specific position, such that possible outliers are eliminated from the start. Our results have been validated on the KITTI stereo benchmark dataset.
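For readers unfamiliar with the classical block matching baseline that this paper improves on, a minimal sketch follows. It uses a plain sum-of-absolute-differences (SAD) cost over a square window and a winner-takes-all disparity; it does not include the binary masks or local constraints proposed in the paper, and the function names and image layout (2D lists of intensities, rectified pair) are assumptions for illustration.

```python
def sad_cost(left, right, x, y, d, half):
    """Sum of absolute differences between the window centered at (x, y)
    in the left image and the window shifted by disparity d in the right image."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[y + dy][x + dx] - right[y + dy][x + dx - d])
    return total


def block_match(left, right, max_disp, half=1):
    """Winner-takes-all block matching on a rectified stereo pair.

    left, right: 2D lists of pixel intensities with identical dimensions.
    Returns a 2D list of disparities; border pixels where the window or
    the disparity search would leave the image are left at 0.
    """
    height, width = len(left), len(left[0])
    disparity = [[0] * width for _ in range(height)]
    for y in range(half, height - half):
        for x in range(half + max_disp, width - half):
            costs = [
                sad_cost(left, right, x, y, d, half)
                for d in range(max_disp + 1)
            ]
            # Lowest-cost disparity wins; ties resolve to the smallest d.
            disparity[y][x] = costs.index(min(costs))
    return disparity
```

Because the window is assumed to lie at a single disparity, this baseline degrades on slanted surfaces, which is the weakness the abstract sets out to address.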

Improving local stereo algorithms using binary shifted windows, fusion and smoothness constraint

2015 IEEE International Conference on Intelligent Computer Communication and Processing (ICCP), 2015

Stereo cameras are a viable solution for reconstructing 3D scenes and are well suited for advanced driver assistance systems, autonomous driving and robotics applications. Modern stereo reconstruction algorithms offer good results, but they require a lot of memory and their real-time capabilities are limited on modern processors. On the other hand, local window aggregation algorithms have a small memory footprint, are very fast and can be ported to embedded devices, although they provide a lower number of 3D reconstructed points and are more error prone in the case of occluded and slanted surfaces. In this paper we propose a novel local block matching method which has increased quality and is suitable for real-time processing with hardware acceleration, achieving a satisfactory running time. Our first contribution consists in the introduction of two new binary descriptors used for block matching. The second contribution lies in the shifting method implemented for the matching windows, in order to capture slanted surfaces, together with the fusion of the results obtained for fronto-parallel surfaces. Here we propose and compare two fusion methods: a naive approach and a gradient-based approach. The final contribution consists in a smoothness constraint applied to neighboring pixels. The method has been tested on images from the Middlebury benchmark and on real traffic scenes.

Vision algorithms and embedded solution for pedestrian detection with far infrared camera

2014 IEEE 10th International Conference on Intelligent Computer Communication and Processing (ICCP), 2014

In the automotive industry the issue of safety remains a major priority. This concern covers not just the driver but also other traffic participants, such as pedestrians. This paper describes a pedestrian detection system in which three different classification methods are used for detecting pedestrians with a far infrared camera. The three methods are tested and compared on a variable number of features in order to obtain a scalable solution. The authors propose a low-cost embedded implementation of the classification method that proved best with respect to accuracy and training time, using HOG as the feature descriptor for the region of interest.

Multimodal sparse LIDAR object tracking in clutter

2018 IEEE 14th International Conference on Intelligent Computer Communication and Processing (ICCP)

One of the key components of the perception system in an autonomous vehicle or ADAS is the target tracking module. Using target tracking in a sea of clutter, self-driving cars are able to better understand the environment and make predictions about the surrounding objects. Cuboids obtained from a sparse LIDAR often exhibit a fluctuating behavior due to segmentation problems and errors accumulated from the motion correction module. Furthermore, targets in real-life scenarios do not move in a predictable manner, so it is very difficult for a classical motion model to describe the complex behavior of road objects in such cases. In this paper we propose a two-step data association scheme that efficiently and effectively finds correspondences between tracks and measurements. We then aim to generate better position estimates for objects with an ambiguous dynamic behavior by associating and combining the results from two different motion models. The proposed solution runs in real time and was validated using a high-precision GPS, and also by projecting the prediction results into the corresponding intensity image and assessing whether the prediction falls on the correct item.

Robust Data Association Using Fusion of Data-Driven and Engineered Features for Real-Time Pedestrian Tracking in Thermal Images

Sensors

Object tracking is an essential problem in computer vision that has been extensively researched for decades. Tracking objects in thermal images is particularly difficult because of the lack of color information, low image resolution, or high similarity between objects of the same class. One of the main challenges in multi-object tracking, also referred to as the data association problem, is finding the correct correspondences between measurements and tracks and adapting to object appearance changes over time. We addressed this challenge of data association for thermal images by proposing three contributions. The first contribution consisted of the creation of a data-driven appearance score using five Siamese Networks, which operate on the image detection and on parts of it. Secondly, we engineered an original edge-based descriptor that improves the data association process. Lastly, we proposed a dataset consisting of pedestrian instances that were recorded in different scenarios an...

Stabilization and Validation of 3D Object Position Using Multimodal Sensor Fusion and Semantic Segmentation

Sensors 2020, 20(4), 2020

The stabilization and validation process of the measured position of objects is an important step for high-level perception functions and for the correct processing of sensory data. The goal of this process is to detect and handle inconsistencies between different sensor measurements, which result from the perception system. The aggregation of the detections from different sensors consists in the combination of the sensor data in one common reference frame for each identified object, leading to the creation of a super-sensor. The result of the data aggregation may contain errors such as false detections, misplaced object cuboids or an incorrect number of objects in the scene. The stabilization and validation process is focused on mitigating these problems. The current paper proposes four contributions for solving the stabilization and validation task for autonomous vehicles, using the following sensors: trifocal camera, fisheye camera, long-range RADAR (Radio Detection and Ranging), and 4-layer and 16-layer LIDARs (Light Detection and Ranging). We propose two original data association methods used in the sensor fusion and tracking processes. The first data association algorithm is created for tracking LIDAR objects and combines multiple appearance and motion features in order to exploit the available information for road objects. The second novel data association algorithm is designed for trifocal camera objects and has the objective of finding measurement correspondences to sensor-fused objects such that the super-sensor data are enriched by adding the semantic class information. The implemented trifocal object association solution uses a novel polar association scheme combined with a decision tree to find the best hypothesis-measurement correlations.
Another contribution, aimed at stabilizing the position of road objects with unpredictable behavior that are provided by multiple types of complementary sensors, is a fusion approach based on the Unscented Kalman Filter and a single-layer perceptron. The last novel contribution is related to the validation of the 3D object position, which is solved using a fuzzy logic technique combined with a semantic segmentation image. The proposed algorithms run in real time, achieving a cumulative running time of 90 ms, and have been evaluated using ground truth data extracted from a high-precision GPS (global positioning system) with 2 cm accuracy, obtaining an average error of 0.8 m.
