[1] D. Koshti, A. Gupta, M. Kalla, and A. Sharma, “TRANS-VQA: Fully Transformer-Based Image Question-Answering Model Using Question-guided Vision Attention,” ia, vol. 27, no. 73, pp. 111–128, Jan. 2024.