Show simple item record

dc.contributor.author: Shrestha, Raju
dc.date.accessioned: 2023-08-09T12:39:43Z
dc.date.available: 2023-08-09T12:39:43Z
dc.date.created: 2022-11-01T12:11:14Z
dc.date.issued: 2022
dc.identifier.isbn: 978-1-4503-9570-0
dc.identifier.uri: https://hdl.handle.net/11250/3083209
dc.description.abstract: Images have become an integral part of digital and online media, where they are used for creative expression and the dissemination of knowledge. To address the accessibility challenges images pose to the visually impaired community, textual image descriptions or captions are provided that can be read aloud by screen readers. These descriptions may be human-authored or software-generated, yet most of them tend to be generic, inadequate, and often unreliable, leaving the images effectively inaccessible. Tools, methods, and metrics exist for evaluating the quality of generated text, but almost all of them are generic and based on word similarity. Standard guidelines, such as the NCAM image accessibility guidelines, help authors write accessible image descriptions; however, web content developers and authors rarely use them, possibly owing to a lack of awareness, an underestimation of the importance of accessibility, and the complexity and difficulty of understanding the guidelines. To our knowledge, none of the existing quality evaluation techniques take accessibility into account. To address this gap, a deep learning model based on the transformer, currently the most effective architecture in natural language processing, is proposed that measures the compliance of a given image description with ten NCAM guidelines. The experimental results confirm the effectiveness of the proposed model. This work could contribute to the growing research towards accessible images, not only on the web but on all digital devices. (An illustrative sketch of such a compliance classifier follows the record below.)
dc.language.iso: eng
dc.publisher: Association for Computing Machinery (ACM)
dc.relation.ispartof: Proceedings of the 14th International Conference on Machine Learning and Computing (ICMLC)
dc.title: A transformer-based deep learning model for evaluation of accessibility of image descriptions
dc.type: Chapter
dc.type: Peer reviewed
dc.type: Conference object
dc.description.version: acceptedVersion
cristin.ispublished: true
cristin.fulltext: original
cristin.fulltext: postprint
cristin.qualitycode: 1
dc.identifier.doi: https://doi.org/10.1145/3529836.3529856
dc.identifier.cristin: 2067296
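
The abstract above describes a transformer model fine-tuned to score an image description against ten NCAM guidelines. Below is a minimal Python sketch of how such a multi-label compliance classifier could be set up with a pretrained transformer encoder. It is not the authors' published implementation: the backbone checkpoint, the function name, and the example description are assumptions for illustration only.

    # Illustrative sketch only, NOT the paper's model: a transformer encoder
    # fine-tuned as a ten-output multi-label classifier, one output per
    # NCAM guideline. Backbone choice and example text are assumptions.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    MODEL_NAME = "bert-base-uncased"  # assumed backbone; the paper's may differ
    NUM_GUIDELINES = 10               # one label per NCAM guideline

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME,
        num_labels=NUM_GUIDELINES,
        problem_type="multi_label_classification",  # sigmoid outputs, BCE loss
    )
    model.eval()

    def compliance_scores(description: str) -> list[float]:
        """Return per-guideline compliance probabilities for one description."""
        inputs = tokenizer(description, truncation=True, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        return torch.sigmoid(logits).squeeze(0).tolist()

    # Hypothetical usage with an untrained head; real scores require fine-tuning.
    scores = compliance_scores("A golden retriever catching a red frisbee in a park.")
    print([round(s, 2) for s in scores])

Each sigmoid output can be read as the probability that the description satisfies the corresponding guideline; the paper's actual architecture, training data, and decision thresholds may differ.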


Associated file(s)


This item appears in the following Collection(s)
