Show simple item record

dc.contributor.author: Shrestha, Raju
dc.date.accessioned: 2023-08-09T12:39:43Z
dc.date.available: 2023-08-09T12:39:43Z
dc.date.created: 2022-11-01T12:11:14Z
dc.date.issued: 2022
dc.identifier.isbn: 978-1-4503-9570-0
dc.identifier.uri: https://hdl.handle.net/11250/3083209
dc.description.abstract: Images have become an integral part of digital and online media, where they are used for creative expression and the dissemination of knowledge. To address the accessibility challenges images pose to the visually impaired community, textual image descriptions or captions are provided that can be read aloud by screen readers. These descriptions may be human-authored or software-generated, yet most of them tend to be generic, inadequate, and often unreliable, leaving the images effectively inaccessible. Tools, methods, and metrics exist for evaluating the quality of generated text, but almost all of them are generic and based on word similarity. Standard guidelines, such as the NCAM image accessibility guidelines, help authors write accessible image descriptions; however, web content developers and authors rarely use them, possibly owing to a lack of awareness, an underestimation of the importance of accessibility, and the complexity and difficulty of understanding the guidelines. To our knowledge, none of the existing quality evaluation techniques take accessibility into account. To address this gap, a deep learning model based on the transformer, currently the most effective architecture in natural language processing, is proposed that measures the compliance of a given image description with ten NCAM guidelines. The experimental results confirm the effectiveness of the proposed model. This work could contribute to the growing research towards accessible images, not only on the web but on all digital devices. (An illustrative sketch of such a compliance classifier follows the record below.)
dc.language.iso: eng
dc.publisher: Association for Computing Machinery (ACM)
dc.relation.ispartof: Proceedings of the 14th International Conference on Machine Learning and Computing (ICMLC)
dc.title: A transformer-based deep learning model for evaluation of accessibility of image descriptions
dc.type: Chapter
dc.type: Peer reviewed
dc.type: Conference object
dc.description.version: acceptedVersion
cristin.ispublished: true
cristin.fulltext: original
cristin.fulltext: postprint
cristin.qualitycode: 1
dc.identifier.doi: https://doi.org/10.1145/3529836.3529856
dc.identifier.cristin: 2067296
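
The abstract above describes a transformer model fine-tuned to score an image description against ten NCAM guidelines. Below is a minimal Python sketch of how such a multi-label compliance classifier could be set up with a pretrained transformer encoder. It is not the authors' published implementation: the backbone checkpoint, the function name, and the example description are assumptions for illustration only.

    # Illustrative sketch only, NOT the paper's model: a transformer encoder
    # fine-tuned as a ten-output multi-label classifier, one output per
    # NCAM guideline. Backbone choice and example text are assumptions.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    MODEL_NAME = "bert-base-uncased"  # assumed backbone; the paper's may differ
    NUM_GUIDELINES = 10               # one label per NCAM guideline

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME,
        num_labels=NUM_GUIDELINES,
        problem_type="multi_label_classification",  # sigmoid outputs, BCE loss
    )
    model.eval()

    def compliance_scores(description: str) -> list[float]:
        """Return per-guideline compliance probabilities for one description."""
        inputs = tokenizer(description, truncation=True, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        return torch.sigmoid(logits).squeeze(0).tolist()

    # Hypothetical usage with an untrained head; real scores require fine-tuning.
    scores = compliance_scores("A golden retriever catching a red frisbee in a park.")
    print([round(s, 2) for s in scores])

Each sigmoid output can be read as the probability that the description satisfies the corresponding guideline; the paper's actual architecture, training data, and decision thresholds may differ.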


Associated file(s)


This item appears in the following Collection(s)
