top of page
Autumn Road

Automated Image Captioning

Data labeling is an essential step in all supervised machine learning tasks.However, this can be a laborious process requiring several man-hours. In this project, we implement a model that automatically captions images for the purpose of image labeling.

The model consists of a convolutional part that uses transfer learning with VGG16 to process the images. The last layer is then mapped to a 3-stack LSTM. The LSTM is also fed word embeddings of the same size as the last layer of the CNN. The caption is generated by calculating the most probable sequence of words that describes the image. The model was trained on 8000 images from the Flickr8K dataset.

bottom of page