Enhancing Object Detection with Mask R-CNN: A Deep Learning Perspective

Object identification is a critical task in machine learning that focuses on locating and recognizing specific elements of interest within an image. This study presents a method for object detection that uses a Mask R-CNN model with a ResNet-50-FPN backbone, pre-trained on the COCO dataset, as a feature extractor. Relevant features are first extracted from the input images using the pre-trained models and are then used to train several neural networks for object recognition: a CNN, VGG-16, and Inception Net. The proposed method improves object detection accuracy and extracts relevant features efficiently by relying on pre-trained models. The three networks were evaluated on two datasets, Pascal VOC and COCO. The VGG-16 model achieved high accuracy on both, 89.9% and 95.4% respectively. The experimental results indicate that the proposed approach is effective, with the model achieving higher accuracy in object detection than other existing methods. This research contributes to the development of efficient and adaptable frameworks for recognizing and segmenting objects in images, which are essential for automating machine vision systems.