Genomics Workshop
FIXME: overview of workshop
Code of Conduct
All participants should agree to abide by The Carpentries Code of Conduct.
Authors
FIXME
Citation
Please cite as:
FIXME
FIXME: overview of workshop
All participants should agree to abide by The Carpentries Code of Conduct.
FIXME
Please cite as:
FIXME
npy-converter Matrix Data Converter. Expand matrix for multi-thread, multi-process Divid matrix for multi-thread, multi-process Support: .mtx, .npy, .
Flenser Have you ever been handed a dataset you've never seen before? Flenser is a simple, minimal, automated exploratory data analysis tool. It runs
ETL Pipeline with Airflow, Spark, s3, MongoDB and Amazon Redshift
We work with the accounting department of a company that sells motorcycle parts. The company operates three warehouses in a large metropolitan area.
Open-Domain Question-Answering for COVID-19 and Other Emergent Domains This repository contains the source code for an end-to-end open-domain question
Approximate Nearest Neighbor Search for Sparse Data in Python! This library is well suited to finding nearest neighbors in sparse, high dimensional spaces (like text documents).
ped-crash-techvol: Texas Ped Crash Tech Volume Pack In conjunction with the Final Report "Identifying Risk Factors that Lead to Increase in Fatal Pede
ZhuSuan is a Python probabilistic programming library for Bayesian deep learning, which conjoins the complimentary advantages of Bayesian methods and
unyt A package for handling numpy arrays with units. Often writing code that deals with data that has units can be confusing. A function might return
Larch: Data Analysis Tools for X-ray Spectroscopy and More Documentation: http://xraypy.github.io/xraylarch Code: http://github.com/xraypy/xraylarch L
Python asymptotic Partial Directed Coherence and Directed Coherence estimation package for brain connectivity analysis. Free software: MIT license Doc
Processing NYC Taxi Data using PySpark ETL pipeline Description This is an project to extract, transform, and load large amount of data from NYC Taxi
Created covid data pipeline using PySpark and MySQL that collected data stream from API and do some processing and store it into MYSQL database.
DataComPy DataComPy is a package to compare two Pandas DataFrames. Originally started to be something of a replacement for SAS's PROC COMPARE for Pand
Nested Collaborative Learning for Long-Tailed Visual Recognition This repository is the official PyTorch implementation of the paper in CVPR 2022: Nes
RPA Challenge in Python Projeto para realizar o RPA Challenge (www.rpachallenge.com), utilizando Python. O objetivo deste desafio é criar um fluxo de
Evaluate the performance of BigDL (Distributed Deep Learning on Apache Spark) in big data analysis problems.
Population Shift Monitoring popmon is a package that allows one to check the stability of a dataset. popmon works with both pandas and spark datasets.
pandas-datareader Up to date remote data access for pandas, works for multiple versions of pandas. Installation Install using pip pip install pandas-d
MidTerm Project for the Data Analysis FT Bootcamp, Adam Tycner and Florent ZAHOUI Hallo