Kvasir-VQA: A Text-Image Pair GI Tract Dataset
dc.contributor.author | Gautam, Sushant | |
dc.contributor.author | Storås, Andrea | |
dc.contributor.author | Midoglu, Cise | |
dc.contributor.author | Hicks, Steven | |
dc.contributor.author | Thambawita, Vajira L B | |
dc.contributor.author | Halvorsen, Pål | |
dc.contributor.author | Riegler, Michael Alexander | |
dc.date.accessioned | 2024-11-08T08:00:12Z | |
dc.date.available | 2024-11-08T08:00:12Z | |
dc.date.created | 2024-11-07T13:49:11Z | |
dc.date.issued | 2024 | |
dc.identifier.isbn | 9798400712074 | |
dc.identifier.uri | https://hdl.handle.net/11250/3163977 | |
dc.description.abstract | We introduce Kvasir-VQA, an extended dataset derived from the HyperKvasir and Kvasir-Instrument datasets, augmented with question-and-answer annotations to facilitate advanced machine learning tasks in Gastrointestinal (GI) diagnostics. This dataset comprises 6,500 annotated images spanning various GI tract conditions and surgical instruments, and it supports multiple question types including yes/no, choice, location, and numerical count. The dataset is intended for applications such as image captioning, Visual Question Answering (VQA), text-based generation of synthetic medical images, object detection, and classification. Our experiments demonstrate the dataset’s effectiveness in training models for three selected tasks, showcasing significant applications in medical image analysis and diagnostics. We also present evaluation metrics for each task, highlighting the usability and versatility of our dataset. The dataset and supporting artifacts are available at https://datasets.simula.no/kvasir-vqa. | en_US |
dc.language.iso | eng | en_US |
dc.publisher | Association for Computing Machinery (ACM) | en_US |
dc.relation.ispartof | VLM4Bio'24: Proceedings of the First International Workshop on Vision-Language Models for Biomedical Applications, in the 32nd ACM International Conference on Multimedia | |
dc.title | Kvasir-VQA: A Text-Image Pair GI Tract Dataset | en_US |
dc.type | Chapter | en_US |
dc.type | Peer reviewed | en_US |
dc.type | Conference object | en_US |
dc.description.version | acceptedVersion | en_US |
cristin.ispublished | true | |
cristin.fulltext | postprint | |
cristin.qualitycode | 1 | |
dc.identifier.doi | https://doi.org/10.1145/3689096.3689458 | |
dc.identifier.cristin | 2318696 | |
This item appears in the following Collection(s)
- HV - Department of Rehabilitation Science and Health Technology [418]
- Publications from Cristin [3837]
- SAM - Department of Social Work, Child Welfare and Social Policy [545]
- TKD - Faculty of Technology, Art and Design [32]