RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network trained to work with different pairs (images, texts).

Last update: Sep 20, 2022

Related tags

Text Data & NLP ru-clip-tiny

Overview

RuCLIPtiny

Zero-shot image classification model for Russian language

RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network trained to work with different pairs (images, texts). Our model is based on ConvNeXt-tiny and DistilRuBert-tiny, and is supported by extensive research zero-shot transfer, computer vision, natural language processing, and multimodal learning.

Result evaluation

Our model achieved 46.62% top1 and 73.18% top5 zero-shot accuracy on CIFAR100

Examples

Evaluate & Simple usage

Finetuning

ONNX conversion and speed testing

Model weights

Usage

Install rucliptiny module and requirements first. Use this trick

!gdown -O ru-clip-tiny.pkl https://drive.google.com/uc?id=1-3g3J90pZmHo9jbBzsEmr7ei5zm3VXOL
!pip install git+https://github.com/cene555/ru-clip-tiny.git

Example in 3 steps

Download CLIP image from repo

!wget -c -O CLIP.png https://github.com/openai/CLIP/blob/main/CLIP.png?raw=true

Import libraries

from rucliptiny.predictor import Predictor
from rucliptiny import RuCLIPtiny
import torch

torch.manual_seed(1)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

Load model

model = RuCLIPtiny()
model.load_state_dict(torch.load('ru-clip-tiny.pkl'))
model = model.to(device).eval()

Use predictor to get probabilities

predictor = Predictor()

classes = ['диаграмма', 'собака', 'кошка']
text_probs = predictor(model=model, images_path=["CLIP.png"],
                       classes=classes, get_probs=True,
                       max_len=77, device=device)

Cosine similarity Visualization Example

Speed Testing

NVIDIA Tesla K80 (Google Colab session)

TORCH	batch	encode_image	encode_text	total
RuCLIPtiny	2	0.011	0.004	0.015
RuCLIPtiny	8	0.011	0.004	0.015
RuCLIPtiny	16	0.012	0.005	0.017
RuCLIPtiny	32	0.014	0.005	0.019
RuCLIPtiny	64	0.013	0.006	0.019

We would like to express my gratitude to Sber AI for the grants provided, for which research was carried out, as part of the Artificial Intelligence International Junior Contest (AIIJC)

RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network trained to work with different pairs (images, texts).

Related tags

Overview

RuCLIPtiny

Result evaluation

Examples

Model weights

Usage

Example in 3 steps

Cosine similarity Visualization Example

Speed Testing

Owner

Shahmatov Arseniy

Findings of ACL 2021

Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing

NLP project that works with news (NER, context generation, news trend analytics)

SNCSE: Contrastive Learning for Unsupervised Sentence Embedding with Soft Negative Samples

Code for the paper TestRank: Bringing Order into Unlabeled Test Instances for Deep Learning Tasks

Toward a Visual Concept Vocabulary for GAN Latent Space, ICCV 2021

A method for cleaning and classifying text using transformers.

A python wrapper around the ZPar parser for English.

Python Implementation of ``Modeling the Influence of Verb Aspect on the Activation of Typical Event Locations with BERT'' (Findings of ACL: ACL 2021)

PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation

Seq2seq attn - Use the Seq2Seq method to implement machine translation and introduce Attention mechanism to improve the results

Rethinking the Truly Unsupervised Image-to-Image Translation - Official PyTorch Implementation (ICCV 2021)

Implementaion of our ACL 2022 paper Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation

🏆 • 5050 most frequent words in 109 languages

TensorFlow code and pre-trained models for BERT

Arabic-Phonetic-Output - You can input the phonetic version of any Arabic text here. This software will show you output in Arabic (with vowels)

Python library for Serbian Natural language processing (NLP)

PyABSA - Open & Efficient for Framework for Aspect-based Sentiment Analysis

Recognition of 38 speech commands in russian. Based on Yandex Cup 2021 ML Challenge: ASR

Named Entity Recognition API used by TEI Publisher