A Pythonic library for Nvidia Codec.

The project is still in active development; expect breaking changes.

Why another Python library for Nvidia Codec?

Comparison to Video-Processing-Framework

Methodologies

VPF is written entirely in C++ and uses pybind to expose Python interfaces. PNC is written entirely in Python and uses ctypes to access the Nvidia C interfaces. Our code tends to be more concise, less repetitive, and easier to read and write.
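To illustrate the ctypes approach in general terms, here is a minimal sketch that binds two functions from the CUDA driver library. It is not PNC's actual code: the helper names `load_cuda_driver` and `driver_version` are hypothetical, and the library file names tried are assumptions about a typical Linux or Windows install. `cuInit` and `cuDriverGetVersion` are real driver API entry points. The sketch falls back to `None` when no driver is present.

```python
import ctypes
import ctypes.util


def load_cuda_driver():
    """Return a ctypes handle to the CUDA driver library, or None if not found."""
    candidates = ["libcuda.so.1", "nvcuda.dll"]  # assumed typical file names
    found = ctypes.util.find_library("cuda")
    if found:
        candidates.insert(0, found)
    for name in candidates:
        try:
            return ctypes.CDLL(name)
        except OSError:
            continue
    return None


def driver_version(lib):
    """Query the driver version through the C interface; None on any failure."""
    # CUresult cuInit(unsigned int Flags)
    lib.cuInit.argtypes = [ctypes.c_uint]
    lib.cuInit.restype = ctypes.c_int
    # CUresult cuDriverGetVersion(int *driverVersion)
    lib.cuDriverGetVersion.argtypes = [ctypes.POINTER(ctypes.c_int)]
    lib.cuDriverGetVersion.restype = ctypes.c_int

    if lib.cuInit(0) != 0:  # 0 is CUDA_SUCCESS
        return None
    version = ctypes.c_int(0)
    if lib.cuDriverGetVersion(ctypes.byref(version)) != 0:
        return None
    return version.value


lib = load_cuda_driver()
version = driver_version(lib) if lib is not None else None
print("CUDA driver version:", version)
```

Declaring `argtypes` and `restype` next to a comment with the C prototype keeps each binding to a few lines, which is the kind of conciseness this approach allows compared with a compiled binding layer.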

Performance

Preliminary tests show little to no difference in performance, because the heavy lifting is done on the GPU anyway. Both libraries can saturate the GPU decoder. PNC uses more CPU than VPF, as expected from Python vs. C++, but the overhead is still negligible (less than 10% of a single Ryzen 3100 core for 8K*4K HEVC).

Resource Management

In VPF, a Surface given to the user is not owned by the user: it will be overwritten by new frames, which is counter-intuitive. Pictures are not exposed to the user at all; they are always mapped (post-processed and copied) to a Surface so that the Picture can be reused for new frames. The latter is inefficient when only a subset of Pictures is needed (e.g. screenshots).
This is because VPF allocates the bare minimum of resources needed for most decoding tasks. PNC lets the user specify the amount of resources to allocate for advanced applications. Users own the resources and decide when and whether to deal with them.
Managing resources is not painful: as in pycuda, we shift the burden of managing host/device resources to the Python garbage collector. Resources (such as Picture and Surface) are freed automatically when the user drops the last reference to them.
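The idea of letting the garbage collector release resources can be sketched with the standard library alone. The example below is an illustration under stated assumptions, not PNC's implementation: `FakeDevice` is a hypothetical stand-in for a GPU allocator, and the `Surface` class here only borrows the name from the text. It uses `weakref.finalize` to run the free call once the object becomes unreachable.

```python
import gc
import weakref


class FakeDevice:
    """Hypothetical stand-in for a GPU allocator; tracks live handles."""

    def __init__(self):
        self.live = set()
        self._next = 0

    def alloc(self):
        self._next += 1
        self.live.add(self._next)
        return self._next

    def free(self, handle):
        self.live.discard(handle)


class Surface:
    """Owns one device handle and releases it when garbage collected."""

    def __init__(self, device):
        self.handle = device.alloc()
        # The callback receives the handle, not self, so it does not
        # keep the Surface alive.
        self._finalizer = weakref.finalize(self, device.free, self.handle)


device = FakeDevice()
surface = Surface(device)
live_while_held = len(device.live)

del surface   # the user drops the only reference
gc.collect()  # not needed on CPython, but harmless on other interpreters
live_after_drop = len(device.live)

print(live_while_held, live_after_drop)
```

With this pattern the user never calls a free function explicitly; holding a reference is what keeps the resource alive.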

Things to come

TODO Cropping and scaling support in postprocessing
TODO Color space conversion from YUV (BT.601/709, full-range/limited-range) to RGB using pycuda
Encoder
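For reference, the arithmetic behind the planned YUV to RGB conversion can be sketched per pixel on the CPU. This is only an illustration of the formulas for 8-bit samples, not the planned pycuda implementation; the function name `yuv_to_rgb` and its parameters are hypothetical. The luma coefficients (Kr, Kb) are those defined by BT.601 and BT.709.

```python
# Luma coefficients (Kr, Kb) defined by each standard.
COEFFS = {"bt601": (0.299, 0.114), "bt709": (0.2126, 0.0722)}


def yuv_to_rgb(y, cb, cr, standard="bt709", full_range=False):
    """Convert one 8-bit Y'CbCr sample to an 8-bit (R, G, B) tuple."""
    kr, kb = COEFFS[standard]
    kg = 1.0 - kr - kb
    if full_range:
        # Full range: luma and chroma use the whole 0..255 scale.
        yn = y / 255.0
        cbn = (cb - 128) / 255.0
        crn = (cr - 128) / 255.0
    else:
        # Limited range: luma spans 16..235, chroma spans 16..240.
        yn = (y - 16) / 219.0
        cbn = (cb - 128) / 224.0
        crn = (cr - 128) / 224.0
    r = yn + 2.0 * (1.0 - kr) * crn
    b = yn + 2.0 * (1.0 - kb) * cbn
    g = (yn - kr * r - kb * b) / kg
    # Scale to 8 bits and clip out-of-gamut values.
    return tuple(min(255, max(0, round(255.0 * c))) for c in (r, g, b))


print(yuv_to_rgb(235, 128, 128))                    # limited-range white
print(yuv_to_rgb(16, 128, 128, standard="bt601"))   # limited-range black
```

A GPU version would apply the same per-pixel arithmetic to every pixel in parallel.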

Acknowledgements

Many thanks to @rarzumanyan for all the help and explanations!

Owner

Zesen Qian
