Image Captioning using Convolutional Neural Networks
Objective
The inspiration for this project was the paper "Convolutional Image Captioning" by Jyoti Aneja, Aditya Deshpande, and Alexander G. Schwing at the University of Illinois at Urbana-Champaign. It was one of the first works to perform image captioning on the MSCOCO dataset using only CNNs, in contrast to traditional approaches built on Long Short-Term Memory (LSTM) networks.
Methodology
For our project, we implemented a similar captioning system and aimed to improve evaluation scores by using deeper encoders and by tuning hyperparameters such as the number of training epochs, the number of captions per image, and the sampling temperature. A minimal sketch of such a model is shown below.
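The sketch below illustrates the general shape of a convolutional captioning model in PyTorch: a pretrained CNN encoder produces image features, and a stack of causal (left-padded) 1D convolutions decodes the caption token by token. The specific choices here (ResNet-50 encoder, embedding size, number of layers) are assumptions for illustration, not the exact architecture used in the paper or in our project.

```python
import torch
import torch.nn as nn
import torchvision.models as models


class ConvCaptioner(nn.Module):
    """Sketch of a CNN encoder + causal convolutional decoder for captioning."""

    def __init__(self, vocab_size, embed_dim=512, num_layers=3, kernel_size=3):
        super().__init__()
        # Encoder: a pretrained ResNet with its classification head removed
        # (an assumed choice of "deeper encoder").
        resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        self.encoder = nn.Sequential(*list(resnet.children())[:-1])
        self.img_proj = nn.Linear(resnet.fc.in_features, embed_dim)

        # Decoder: word embeddings followed by causal 1D convolutions.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.pad = kernel_size - 1  # left-pad so each position sees only past tokens
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, embed_dim, kernel_size) for _ in range(num_layers)
        )
        self.out = nn.Linear(embed_dim, vocab_size)

    def forward(self, images, captions):
        # Image features, injected into every decoder position as a bias term.
        feats = self.encoder(images).flatten(1)          # (B, C)
        img = self.img_proj(feats).unsqueeze(2)          # (B, E, 1)

        x = self.embed(captions).transpose(1, 2)         # (B, E, T)
        for conv in self.convs:
            # Left padding keeps the convolution causal (no peeking ahead).
            h = conv(nn.functional.pad(x, (self.pad, 0)))
            x = torch.relu(h + img)
        return self.out(x.transpose(1, 2))               # (B, T, vocab_size)
```

At inference time, captions can be generated autoregressively by feeding the tokens produced so far back into the decoder and sampling the next word from the softmax of the logits, optionally divided by a temperature to control how diverse the sampled captions are.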
Tools & Technologies
Image Captioning, CNN, PyTorch, Linux, GPU