Arabic image caption generation using attention-based deep learning model

Linked Agent
Elnagar, Ashraf, Thesis advisor
Date Issued
2023
Language
English
Thesis Type
Thesis
Abstract
Automatic image caption generation is one of the most challenging topics in the machine-learning field, and over the past few years it has attracted the interest of many researchers. However, most current research focuses on generating captions in English only, mainly because publicly available datasets for other languages, Arabic in particular, are lacking. In addition, few of the available Arabic Image Captioning (AIC) studies have used attention in their proposed machine learning models. The attention layer in a neural network is expected to focus on the essential parts of the image that contribute to the caption generation process. In this thesis, an Arabic attention-based image captioning model was developed. The proposed model was implemented and trained on an Arabic version of the MSCOCO dataset, which was developed and made publicly available. A convolutional neural network (CNN) was used to extract image features, and a Long Short-Term Memory (LSTM) network, an extension of the recurrent neural network (RNN), was used to generate the caption. The model's results were evaluated and compared using widely used evaluation metrics, such as BLEU, ROUGE, and CIDEr. A comparison with the limited number of existing AIC systems was also performed. The results showed that the proposed model outperformed recently reported AIC systems, improving performance on the MSCOCO dataset and raising the BLEU-1 score by 12% over the next-best reported score. The results also showed that, although the generated captions for some images received low scores, the model produced correct captions that were more accurate than the reference captions and generated new words that were not present in the reference captions.
Note
A Dissertation Submitted in Partial Fulfilment of the Requirements for Master in Computer Science, University of Sharjah, May, 2023.
Category
Theses
Library of Congress Classification
Q325.5 .A883 2023
Local Identifier
b16375427
