A computer vision pipeline to identify the "icons" in Christian paintings

Last update: Jul 30, 2022

Christian-Iconography

A bit about iconography.

Iconography is about identifying the subject depicted in an image. So, for instance, when I say Christian iconography I mean that I am trying to identify objects like a crucifix or, mainly in this project, the saints!

Inspiration

I was looking for an interesting problem to solve when I came across RedHenLab's barnyard of projects. It had some really wonderful ideas, and this particular one intrigued me. The site showed little progress on it because datasets on this subject had not been developed, but after surfing around I found one, and just like that I got started!

Dataset used.

The project uses the ArtDL dataset, which contains 42,479 images of artworks portraying Christian saints, divided into 10 classes: Saint Dominic (iconclass 11H(DOMINIC)), Saint Francis of Assisi (iconclass 11H(FRANCIS)), Saint Jerome (iconclass 11H(JEROME)), Saint John the Baptist (iconclass 11H(JOHN THE BAPTIST)), Saint Anthony of Padua (iconclass 11H(ANTONY OF PADUA)), Saint Mary Magdalene (iconclass 11HH(MARY MAGDALENE)), Saint Paul (iconclass 11H(PAUL)), Saint Peter (iconclass 11H(PETER)), Saint Sebastian (iconclass 11H(SEBASTIAN)) and Virgin Mary (iconclass 11F). All images come with high-level annotations specifying which iconography classes appear in them (from a minimum of 1 class to a maximum of 7 classes).

Sources

Preprocessing steps.

All the images were first padded so that their content is not distorted when the image is resized. A dash of normalization and some horizontal flips, and the dataset is ready to be eaten/trained on by our model xD.
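The padding step can be sketched as below. This is only a minimal sketch, assuming the images are padded symmetrically to a square before resizing; the README does not say exactly how the padding was done, and `square_padding` is a hypothetical helper, not code from the project.

```python
def square_padding(width, height):
    """Return (left, top, right, bottom) padding that makes an image square.

    Assumption: padding is split as evenly as possible between both sides,
    with any odd leftover pixel going to the right/bottom.
    """
    side = max(width, height)
    pad_w = side - width
    pad_h = side - height
    left = pad_w // 2
    top = pad_h // 2
    return (left, top, pad_w - left, pad_h - top)


# A tall 300x500 painting gets 100 px on the left and right, none on top/bottom.
print(square_padding(300, 500))
```

Once the image is square, resizing it to the network's input size scales both sides by the same factor, so the figures in the painting keep their proportions.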

Architecture used.

As mentioned, the ArtDL dataset has only around 42k images, so training a network from scratch on it wouldn't make sense. Hence a pretrained ResNet50 model was used.

But there is a twist.

Instead of training just the final classifying layer, we freeze only the initial layer, as it has already gotten good at recognizing patterns from the many images it was pretrained on. We then fine-tune the deeper layers so that they learn the art on top of that initial abstraction. Another deviation is replacing the final linear layer with a 1x1 conv layer to do the classification.
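To see why a 1x1 conv layer can stand in for the final linear layer, here is a small pure-Python sketch. It is illustrative only: `conv1x1` and `linear` are hypothetical helpers, the weights are made up, and the freezing and fine-tuning part is not shown. On a pooled feature map of spatial size 1x1, both compute the same class scores.

```python
def linear(features, weights, bias):
    """Linear layer: one score per class from a flat feature vector."""
    return [bias[k] + sum(w * f for w, f in zip(row, features))
            for k, row in enumerate(weights)]


def conv1x1(feature_map, weights, bias):
    """1x1 convolution over a [channels][height][width] feature map.

    Every spatial location is mixed across channels with the same weights,
    giving a [classes][height][width] map of scores.
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    return [[[bias[k] + sum(row[c] * feature_map[c][y][x] for c in range(channels))
              for x in range(width)]
             for y in range(height)]
            for k, row in enumerate(weights)]


# Made-up example: 3 feature channels, 2 classes.
weights = [[0.5, 0.0, -1.0], [1.0, 1.0, 1.0]]
bias = [0.0, 0.5]
pooled = [1.0, 2.0, 3.0]                # flat vector for the linear layer
pooled_map = [[[1.0]], [[2.0]], [[3.0]]]  # same values as a 3x1x1 map

print(linear(pooled, weights, bias))
print(conv1x1(pooled_map, weights, bias))
```

On a larger feature map the same 1x1 conv gives one score per class at every location, which is handy when you later want to look at where the evidence for a class comes from.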

Quantitative Results.

Training

I trained the network for 10 epochs, which took around 3 hours, using Stochastic Gradient Descent with LR=0.01 and momentum 0.9. The accuracy I got on the test set was 64%, which can be improved further.
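For reference, one SGD-with-momentum update using the settings above can be sketched in plain Python. This is a sketch under assumptions: the README does not name the framework or the exact momentum variant, so I use the common form v = momentum * v + grad, p = p - lr * v (the one documented for `torch.optim.SGD` without dampening or Nesterov). `sgd_momentum_step` is a hypothetical helper.

```python
def sgd_momentum_step(params, grads, velocities, lr=0.01, momentum=0.9):
    """Apply one SGD-with-momentum update and return (new_params, new_velocities)."""
    new_velocities = [momentum * v + g for v, g in zip(velocities, grads)]
    new_params = [p - lr * v for p, v in zip(params, new_velocities)]
    return new_params, new_velocities


# Toy example: one parameter, the same gradient seen twice in a row.
params, velocities = [1.0], [0.0]
for _ in range(2):
    params, velocities = sgd_momentum_step(params, [2.0], velocities)
print(params, velocities)
```

The velocity keeps growing while the gradient keeps pointing the same way, so the second step is larger than the first even though the learning rate is unchanged.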

Classification Report

From the classification report it is clear that the Virgin Mary class has the most samples in the training set, and its precision is high. The other classes have far fewer samples, so their scores are low. We can't infer much from this except that some of these classes need to be oversampled before we can get more meaningful results w.r.t. accuracy and, of course, these metrics as well.
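One simple way to oversample the rare classes is to draw training images with a probability inversely proportional to how common their class is. The sketch below assumes a single label per image, and the label counts are made up for illustration; `inverse_frequency_weights` and `oversample_indices` are hypothetical helpers, not part of the project.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Give each sample the weight 1 / (number of samples in its class)."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def oversample_indices(labels, num_samples, seed=0):
    """Draw sample indices with replacement so every class is equally likely."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=num_samples)


# Made-up, heavily imbalanced label list.
labels = ["MARY"] * 4 + ["DOMINIC"]
print(inverse_frequency_weights(labels))
print(oversample_indices(labels, num_samples=10))
```

With these weights each class carries the same total weight, so in expectation the rare class shows up as often as the dominant one during an epoch.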

Qualitative Results

We try an image of Saint Dominic and see what our classifier is really learning.

Saliency Map

We can notice that some regions are lighter than others, which could mean that our classifier at least knows where to look :p
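A saliency map of this kind is usually built from the gradient of the class score with respect to the input pixels, taking the largest absolute value across the colour channels at each pixel. The sketch below covers only that last step and assumes the gradients have already been computed by an autograd framework; `saliency_from_gradients` is a hypothetical helper and the numbers are made up.

```python
def saliency_from_gradients(gradients):
    """Collapse a [channels][height][width] gradient into a [height][width] map.

    Each pixel gets the largest absolute gradient over its colour channels,
    so brighter values mean the pixel influences the class score more.
    """
    height = len(gradients[0])
    width = len(gradients[0][0])
    return [[max(abs(channel[y][x]) for channel in gradients)
             for x in range(width)]
            for y in range(height)]


# Made-up gradients for a tiny 1x2 RGB image.
gradients = [[[0.1, -0.5]], [[-0.3, 0.2]], [[0.0, 0.4]]]
print(saliency_from_gradients(gradients))
```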

Guided-Backpropagation

What guided backprop really does is point out the positive influences when classifying an image. From this result we can see that the classifier is really ignoring the padding applied and focusing more on the body and, interestingly enough, the surroundings as well.
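The "positive influences only" part comes from how guided backprop changes the backward pass through each ReLU: a gradient is passed on only where the forward input was positive and the incoming gradient is positive too. Here is a minimal sketch of that rule on flat lists, with made-up numbers; `plain_relu_backward` and `guided_relu_backward` are hypothetical helpers, not the project's code.

```python
def plain_relu_backward(forward_inputs, upstream_grads):
    """Ordinary ReLU backward pass: block gradients where the input was <= 0."""
    return [g if x > 0 else 0.0 for x, g in zip(forward_inputs, upstream_grads)]


def guided_relu_backward(forward_inputs, upstream_grads):
    """Guided backprop rule: additionally block negative incoming gradients."""
    return [g if (x > 0 and g > 0) else 0.0
            for x, g in zip(forward_inputs, upstream_grads)]


forward_inputs = [1.0, -1.0, 2.0, 3.0]
upstream_grads = [0.5, 0.7, -0.3, 0.0]
print(plain_relu_backward(forward_inputs, upstream_grads))
print(guided_relu_backward(forward_inputs, upstream_grads))
```

The third entry is where the two differ: the plain rule lets the negative gradient through, while the guided rule zeroes it out.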

Grad-CAM!

As expected, Grad-CAM shows the hot regions in our image. They are around the face and, interestingly enough, the surroundings, so maybe the surroundings do play a role in the type of saint?
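For the curious, the Grad-CAM heatmap itself is a small computation once you have the activations of the last conv layer and the gradients of the class score with respect to them: average each channel's gradient to get a channel weight, take the weighted sum of the activation maps, and keep only the positive part. The sketch below assumes those two tensors are already available as nested lists; `grad_cam` is a hypothetical helper and the values are made up.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from [channels][height][width] inputs.

    weight_k = mean of gradients[k] over all spatial positions
    heatmap  = ReLU(sum over k of weight_k * activations[k])
    """
    height = len(activations[0])
    width = len(activations[0][0])
    weights = [sum(sum(row) for row in grad) / (height * width)
               for grad in gradients]
    return [[max(0.0, sum(w * act[y][x] for w, act in zip(weights, activations)))
             for x in range(width)]
            for y in range(height)]


# Made-up 2-channel, 2x2 example.
activations = [[[1.0, 0.0], [0.0, 2.0]],
               [[0.0, 3.0], [1.0, 0.0]]]
gradients = [[[1.0, 1.0], [1.0, 1.0]],
             [[-2.0, -2.0], [-2.0, -2.0]]]
print(grad_cam(activations, gradients))
```

In this toy case the second channel has a negative weight, so the locations it fires on are pushed below zero and clipped, leaving only the regions that support the class.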

Possible improvements.

Finding more datasets
Or working on the architecture maybe?
Using GANs to generate samples and make the classifier stronger

Citations

@misc{milani2020data,
title={A Data Set and a Convolutional Model for Iconography Classification in Paintings},
author={Federico Milani and Piero Fraternali},
eprint={2010.11697},
archivePrefix={arXiv},
primaryClass={cs.CV},
year={2020}
}

RedhenLab's barnyard of projects

Owner

Rishab Mudliar
