dc.contributor.author: Gautam, Sushant
dc.contributor.author: Storås, Andrea
dc.contributor.author: Midoglu, Cise
dc.contributor.author: Hicks, Steven
dc.contributor.author: Thambawita, Vajira L B
dc.contributor.author: Halvorsen, Pål
dc.contributor.author: Riegler, Michael Alexander
dc.date.accessioned: 2024-11-08T08:00:12Z
dc.date.available: 2024-11-08T08:00:12Z
dc.date.created: 2024-11-07T13:49:11Z
dc.date.issued: 2024
dc.identifier.isbn: 9798400712074
dc.identifier.uri: https://hdl.handle.net/11250/3163977
dc.description.abstract: We introduce Kvasir-VQA, an extended dataset derived from the HyperKvasir and Kvasir-Instrument datasets, augmented with question-and-answer annotations to facilitate advanced machine learning tasks in Gastrointestinal (GI) diagnostics. This dataset comprises 6,500 annotated images spanning various GI tract conditions and surgical instruments, and it supports multiple question types including yes/no, choice, location, and numerical count. The dataset is intended for applications such as image captioning, Visual Question Answering (VQA), text-based generation of synthetic medical images, object detection, and classification. Our experiments demonstrate the dataset’s effectiveness in training models for three selected tasks, showcasing significant applications in medical image analysis and diagnostics. We also present evaluation metrics for each task, highlighting the usability and versatility of our dataset. The dataset and supporting artifacts are available at https://datasets.simula.no/kvasir-vqa. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Association for Computing Machinery (ACM) [en_US]
dc.relation.ispartof: VLM4Bio'24: Proceedings of the First International Workshop on Vision-Language Models for Biomedical Applications, in the 32nd ACM International Conference on Multimedia
dc.title: Kvasir-VQA: A Text-Image Pair GI Tract Dataset [en_US]
dc.type: Chapter [en_US]
dc.type: Peer reviewed [en_US]
dc.type: Conference object [en_US]
dc.description.version: acceptedVersion [en_US]
cristin.ispublished: true
cristin.fulltext: postprint
cristin.qualitycode: 1
dc.identifier.doi: https://doi.org/10.1145/3689096.3689458
dc.identifier.cristin: 2318696

