A Systematic Literature Review on Machine Learning and Deep Learning Methods for Semantic Segmentation

IJERT-A Survey on Semantic Segmentation using Deep Learning Techniques

International Journal of Engineering Research and Technology (IJERT), 2021

https://www.ijert.org/a-survey-on-semantic-segmentation-using-deep-learning-techniques https://www.ijert.org/research/a-survey-on-semantic-segmentation-using-deep-learning-techniques-IJERTCONV9IS05011.pdf

Semantic segmentation is a challenging task in computer vision. It is the process of classifying each pixel of an image as belonging to a particular label. It has many demanding applications, such as autonomous vehicles, human-computer interaction, robot navigation, and medical research, which motivates this survey of semantic segmentation architectures. Most of these methods are built using deep learning techniques. In this paper we review several state-of-the-art Convolutional Neural Network (CNN) architectures, such as AlexNet, GoogLeNet, VGGNet, and ResNet, which form the basis for semantic segmentation. We then present different semantic segmentation architectures, such as Fully Convolutional Network (FCN), ParseNet, Deconvolution Network, U-Net, Feature Pyramid Network (FPN), and Mask R-CNN. Finally, we compare the performance of all these architectures.
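
The per-pixel classification described above can be illustrated with a minimal sketch. It assumes a network has already produced one score per class for every pixel; the function name `label_map` and the toy scores are hypothetical and do not come from the paper.

```python
from typing import List

def label_map(scores: List[List[List[float]]]) -> List[List[int]]:
    """Turn per-pixel class scores into a label map.

    scores[r][c] holds one score per class for the pixel at row r, column c.
    Each pixel receives the index of its highest-scoring class.
    """
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]

# Toy 2x2 image with two classes (hypothetical scores).
scores = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.6, 0.4], [0.3, 0.7]],
]
mask = label_map(scores)
print(mask)
```

The output has the same height and width as the input, with one class index per pixel, which is what distinguishes semantic segmentation from whole-image classification.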

Deep-Learning-Based Approaches for Semantic Segmentation of Natural Scene Images: A Review

Electronics

The task of semantic segmentation holds a fundamental position in the field of computer vision. Assigning a semantic label to each pixel in an image is a challenging task. In recent times, significant advancements have been achieved in the field of semantic segmentation through the application of Convolutional Neural Networks (CNN) techniques based on deep learning. This paper presents a comprehensive and structured analysis of approximately 150 methods of semantic segmentation based on CNN within the last decade. Moreover, it examines 15 well-known datasets in the semantic segmentation field. These datasets consist of 2D and 3D image and video frames, including general, indoor, outdoor, and street scenes. Furthermore, this paper mentions several recent techniques, such as SAM, UDA, and common post-processing algorithms, such as CRF and MRF. Additionally, this paper analyzes the performance evaluation of reviewed state-of-the-art methods, pioneering methods, common backbone networks...

State of the Art Techniques to Advance Deep Networks for Semantic Segmentation

U.Porto Journal of Engineering

In recent times, the computer vision community has seen remarkable growth in the field of scene understanding. With such a wide prevalence of images, the importance of this field is growing rapidly along with the technologies involved in it. Semantic segmentation is an important step in scene understanding that requires assigning each pixel in an image to a pre-defined class. Achieving 100% accuracy is a challenging task, which makes it an active research topic. In this paper, an extensive study and review of existing Deep Learning (DL) based techniques for semantic segmentation is carried out, along with a summary of the datasets and evaluation metrics used for it. The study involved the meticulous selection of relevant research papers through searches based on several defined keywords. The study begins with a general and broader focus on Semantic Segmentation as a problem and further narrows its focus on existing Deep Lear...

How deep learning is empowering semantic segmentation

Multimedia Tools and Applications

Semantic segmentation involves extracting meaningful information from images or from video frames. It performs this extraction pixel by pixel using a classification approach, which yields more accurate and finer details from the data for further evaluation. Formerly, only a few techniques were available, based on unsupervised learning or on conventional image processing. Over time, techniques have improved, and more efficient segmentation methods are now available. Image segmentation is slightly simpler than semantic segmentation from a technical perspective, because semantic segmentation operates at the pixel level. The detected regions are then masked according to their labels, and each masked object is referred to by the class we have defined, with a relevant class name and a designated color. In this paper, we review almost all the supervised and unsupervised learning algorithms for semantic segmentation, from basic approaches to more advanced and efficient ones. Many deep learning techniques have already been developed for this task. We studied around 120 papers in this research area and conclude how deep learning helps solve the critical issues of semantic segmentation and gives more efficient results. We also reviewed and comprehensively studied different surveys on semantic segmentation, specifically those using deep learning.

Comparing the Semantic Segmentation of High-Resolution Images Using Deep Convolutional Networks: SegNet, HRNet, CSE-HRNet and RCA-FCN

Nashriyyahʼi farhangʼi Khurāsān, 2023

Semantic segmentation is a branch of computer vision, used extensively in image search engines, automated driving, intelligent agriculture, disaster management, and other machine-human interactions. Semantic segmentation aims to predict a label for each pixel from a given label set, according to semantic information. Among the proposed methods and architectures, researchers have focused on deep learning algorithms because of their good feature learning results. Thus, many studies have explored the structure of deep neural networks, especially convolutional neural networks. Most modern semantic segmentation models are based on fully convolutional networks (FCN), which replace the fully connected layers in common classification networks with convolutional layers to obtain pixel-level predictions. Many methods have since been proposed to improve on the results of the basic FCN. With the increasing complexity and variety of data structures, more powerful neural networks and further development of existing networks are needed. This study aims to segment a high-resolution (HR) image dataset into six separate classes. An overview of some important deep learning architectures is presented, with a focus on methods that produce remarkable scores in segmentation metrics such as accuracy and F1-score. Finally, their segmentation results are discussed, and we show that the methods that are superior in overall accuracy and overall F1-score are not necessarily the best in every class. The results therefore suggest choosing the segmentation algorithm according to the application and the relative importance of each class.
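
The gap between overall accuracy and per-class quality noted above can be reproduced on a toy example. The sketch below uses the standard confusion-matrix definitions of accuracy and F1. The two-class labels are invented for illustration and are not the paper's six-class data, and the function names are hypothetical.

```python
from typing import List, Sequence

def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """cm[t][p] counts pixels whose true class is t and predicted class is p."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def overall_accuracy(cm: List[List[int]]) -> float:
    """Fraction of all pixels that were labeled correctly."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total if total else 0.0

def per_class_f1(cm: List[List[int]]) -> List[float]:
    """F1 = 2*TP / (2*TP + FP + FN), computed separately for each class."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, truly another class
        fn = sum(cm[k]) - tp                       # truly k, predicted another class
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores

# Flattened toy label maps: class 0 dominates, class 1 is rare (made-up data).
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [0, 1]

cm = confusion_matrix(y_true, y_pred, n_classes=2)
print(overall_accuracy(cm))  # high, driven by the dominant class
print(per_class_f1(cm))      # the rare class scores noticeably lower
```

On this toy input the overall accuracy is 0.9 while the rare class reaches an F1 of only about 0.67, which is the kind of discrepancy that motivates choosing a model by the classes that matter for the application.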

Semantic segmentation of urban environments: Leveraging U-Net deep learning model for cityscape image analysis

PloS one, 2024

Semantic segmentation of cityscapes via deep learning is an essential and game-changing research topic that offers a more nuanced comprehension of urban landscapes. Deep learning techniques tackle urban complexity and diversity, which unlocks a broad range of applications. These include urban planning, transportation management, autonomous driving, and smart city efforts. Through rich context and insights, semantic segmentation helps decision-makers and stakeholders make informed decisions for sustainable and effective urban development. This study presents an in-depth exploration of cityscape image segmentation using the U-Net deep learning model. The proposed U-Net architecture comprises an encoder and a decoder. The encoder uses convolutional layers and downsampling to extract hierarchical information from input images. Each downsampling step reduces the spatial dimensions and increases the feature depth, aiding context acquisition. Batch normalization and dropout layers stabilize the model and prevent overfitting during encoding. The decoder reconstructs higher-resolution feature maps using "UpSampling2D" layers. Through extensive experimentation and evaluation on the Cityscapes dataset, this study demonstrates the effectiveness of the U-Net model in achieving state-of-the-art results in image segmentation. The results clearly show that the proposed model has higher accuracy, mean IoU, and mean Dice than existing models.
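
The encoder and decoder behavior described above can be traced as shape arithmetic. This is a rough sketch under stated assumptions: each downsampling step halves height and width and doubles the channel count, and each upsampling step does the reverse. The factor of 2, the base filter count of 64, the number of steps, and the input size are illustrative choices, not values reported by the study.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int]  # (height, width, channels)

def encoder_shapes(h: int, w: int, base_filters: int = 64,
                   steps: int = 3) -> List[Shape]:
    """Feature-map shapes along the encoder, from input level to bottleneck.

    Assumption: every downsampling step halves the spatial size and doubles
    the number of channels.
    """
    shapes: List[Shape] = [(h, w, base_filters)]
    for _ in range(steps):
        ph, pw, pc = shapes[-1]
        shapes.append((ph // 2, pw // 2, pc * 2))
    return shapes

def decoder_shapes(bottleneck: Shape, steps: int = 3) -> List[Shape]:
    """Feature-map shapes along the decoder, from bottleneck back to full size.

    Assumption: every upsampling step doubles the spatial size and halves
    the number of channels.
    """
    h, w, c = bottleneck
    shapes: List[Shape] = []
    for _ in range(steps):
        h, w, c = h * 2, w * 2, c // 2
        shapes.append((h, w, c))
    return shapes

enc = encoder_shapes(256, 512)   # hypothetical input resolution
dec = decoder_shapes(enc[-1])
for shape in enc + dec:
    print(shape)
```

Under these assumptions the decoder shapes mirror the encoder shapes level by level, which is what allows a U-Net to pair encoder and decoder feature maps of matching resolution.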

Deep Learning-Based Semantic Segmentation of Urban Features in Satellite Images: A Review and Meta-Analysis

Remote Sensing

Availability of very high-resolution remote sensing images and advancement of deep learning methods have shifted the paradigm of image classification from pixel-based and object-based methods to deep learning-based semantic segmentation. This shift demands a structured analysis and revision of the current status on the research domain of deep learning-based semantic segmentation. The focus of this paper is on urban remote sensing images. We review and perform a meta-analysis to juxtapose recent papers in terms of research problems, data source, data preparation methods including pre-processing and augmentation techniques, training details on architectures, backbones, frameworks, optimizers, loss functions and other hyper-parameters and performance comparison. Our detailed review and meta-analysis show that deep learning not only outperforms traditional methods in terms of accuracy, but also addresses several challenges previously faced. Further, we provide future directions of resea...

A Semantic-based Scene segmentation using convolutional neural networks

AEU - International Journal of Electronics and Communications, 2020

Semantic segmentation is a crucial operation in the computer vision field. One of the promising techniques is the convolutional neural network (CNN). It can be used with both single and multidimensional arrays and is useful for processing 2D arrays in computer vision tasks. In this paper, a new model for semantic scene segmentation is proposed. To enhance the segmentation results, the model starts by classifying the input scene as either indoor or outdoor. For this step, MobileNet is used, as it provides better results than the Inception-v3 and Inception-ResNet-v2 networks. In the next step, two models based on the Pyramid Scene Parsing Network (PSPNet) are used for image segmentation: indoor images are segmented by the indoor model and outdoor images by the outdoor model. Experimental results confirm that a scene-specific model can achieve higher accuracy than general scene models on the semantic segmentation task.
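
The two-stage routing described above can be sketched as a simple dispatch. The classifier and the two segmenters below are trivial stand-ins that replace the MobileNet and PSPNet models named in the abstract; the brightness threshold, the function names, and the constant masks are hypothetical and serve only to show the control flow.

```python
from typing import Callable, Dict, List

Image = List[List[int]]  # grayscale intensities, for illustration only
Mask = List[List[int]]   # one class index per pixel

def segment_by_scene(image: Image,
                     classify_scene: Callable[[Image], str],
                     segmenters: Dict[str, Callable[[Image], Mask]]) -> Mask:
    """Classify the scene first, then run the segmenter trained for that scene type."""
    scene = classify_scene(image)
    return segmenters[scene](image)

# Stand-in scene classifier: bright images count as outdoor (made-up rule).
def toy_classifier(image: Image) -> str:
    pixels = [v for row in image for v in row]
    return "outdoor" if sum(pixels) / len(pixels) > 127 else "indoor"

# Stand-in segmenters: each returns a constant mask so the routing is visible.
def toy_indoor_model(image: Image) -> Mask:
    return [[1 for _ in row] for row in image]

def toy_outdoor_model(image: Image) -> Mask:
    return [[2 for _ in row] for row in image]

models = {"indoor": toy_indoor_model, "outdoor": toy_outdoor_model}
bright = [[200, 220], [210, 230]]
dark = [[10, 20], [30, 40]]
print(segment_by_scene(bright, toy_classifier, models))
print(segment_by_scene(dark, toy_classifier, models))
```

Keeping the segmenters in a dictionary keyed by scene label means a further scene type could be added without changing the dispatch function.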

Street Scene understanding via Semantic Segmentation Using Deep Learning

Maǧallaẗ al-handasaẗ wa-al-tiknūlūǧiyā, 2022

Scene classification is an essential perception task used in robotics for understanding the environment, and deep learning has proved to play a major role in challenging scene understanding applications. Data augmentation is used to increase the dataset size, and K-means clustering is used as a preprocessor for the input dataset. The proposed hybrid model is generated by combining two deep neural networks, the Xception and U-Net models. Like the street scene, the outdoor scene is composed of images with depth that have greater variety than iconic object images. Image semantic segmentation is an important task for autonomous driving and mobile robotics applications because it provides much of the information needed for safe navigation and complex reasoning. This paper provides a model for semantic segmentation of outdoor scenes that classifies each object in the scene. The proposed network is a hybrid model that combines U-Net with Xception and works on the 2.5-dimensional Cityscapes dataset, which is used for 3D applications. The process contains two stages. The first is pre-processing of the RGB-D dataset (data augmentation and k-means clustering). The second is the design of the hybrid model, which achieves a pixel accuracy of 0.7874. The model was built on a computer with an NVIDIA GeForce RTX 2060 GPU with 6 GB of memory, using Python 3.7.
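
The abstract does not say how k-means is applied during pre-processing. One common reading, assumed here, is that pixel colors are clustered so that each pixel can be replaced by its cluster index or centroid before training. The sketch below is a plain k-means on RGB tuples with caller-supplied initial centroids; the function name, the toy pixels, and the starting centroids are hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]  # (R, G, B)

def _dist2(a: Pixel, b: Pixel) -> float:
    """Squared Euclidean distance between two colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_pixels(pixels: Sequence[Pixel], centroids: Sequence[Pixel],
                  iters: int = 10) -> Tuple[List[int], List[Pixel]]:
    """Cluster pixel colors with k-means, starting from the given centroids.

    Returns the cluster index of every pixel and the final centroids.
    """
    centers = list(centroids)
    assign = [0] * len(pixels)
    for _ in range(iters):
        # Assignment step: each pixel joins its nearest centroid.
        assign = [
            min(range(len(centers)), key=lambda k: _dist2(p, centers[k]))
            for p in pixels
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, a in zip(pixels, assign) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                centers[k] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return assign, centers

# Four made-up pixels: two near white, two near black.
pixels = [(250, 250, 250), (245, 240, 250), (10, 10, 10), (20, 15, 5)]
assign, centers = kmeans_pixels(pixels, centroids=[(255, 255, 255), (0, 0, 0)])
print(assign)
print(centers)
```

Passing the initial centroids explicitly keeps the example deterministic; a practical pipeline would more likely use a library implementation with randomized or k-means++ initialization.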

A Brief Survey and an Application of Semantic Image Segmentation for Autonomous Driving

Handbook of Deep Learning Applications, 2019

Deep learning is a fast-growing machine learning approach for perceiving and understanding large amounts of data. This paper gives general information about the deep learning approach, which has attracted much attention in machine learning in recent years, and carries out an application of semantic image segmentation to support autonomous driving. The application is implemented with Fully Convolutional Network (FCN) architectures, obtained by modifying deep-learning-based Convolutional Neural Network (CNN) architectures. The experimental studies use four FCN architectures: FCN-AlexNet, FCN-8s, FCN-16s, and FCN-32s. The FCNs are first trained separately, and the validation accuracies of the trained models on the dataset are compared. In addition, segmentation inferences are visualized to assess how precisely the FCN architectures can segment objects.