Code for "My(o) Armband Leaks Passwords: An EMG and IMU Based Keylogging Side-Channel Attack" paper

Overview

Myo Keylogging

This is the source code for our paper My(o) Armband Leaks Passwords: An EMG and IMU Based Keylogging Side-Channel Attack by Matthias Gazzari, Annemarie Mattmann, Max Maass and Matthias Hollick in Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 5, Issue 4, 2021.

We include the software used for recording the dataset (record folder) and the software for training and running the neural networks (ml folder) as well as analyzing the results (analysis folder). The scripts folder provides some helper scripts for automating batches of hyperparameter optimization, model fitting, analyses and more. The results folder includes a pickled version of the predictions of our models, on which analyses can be run, e.g. to reproduce the paper results.

Installation

To install the project, first clone the repository and change directory into the fresh clone:

git clone https://github.com/seemoo-lab/myo-keylogging.git
cd myo-keylogging

You can use a python virtual environment (or any other virtual environment of your choice):

mkvirtualenv myo --system-site-packages
workon myo

To make sure you have the newest software versions you can run an upgrade:

pip install --upgrade pip setuptools

To install the requirements run:

pip install -r requirements.txt

Finally, import the training and test data into the project. The top level folder should include a folder train-data with all the records for training the models and a folder test-data with all the records for testing the models.

wget https://zenodo.org/record/5594651/files/myo-keylogging-dataset.zip
unzip myo-keylogging-dataset.zip

Using the record library, you can add you can extend this dataset.

Rerun of Results

To reproduce our results from the provided predictions of our models, go to the top level directory and run:

./scripts/create_results.sh

This will recreate all performance value files and plots in the subfolders of the results folder as used in the paper.

Run the following to list the fastest and slowest typists in order to determine their class imbalance in the results/train-data-skew.csv and the results/test-data-skew.csv files:

python -m analysis exp_key_data

To recreate the provided predictions and class skew files, execute the following from the top level directory:

./scripts/create_models.sh
./scripts/create_predictions.sh
./scripts/create_class_skew_files.sh

This will fit the models with the current choice of hyperparameters and run each model on the test dataset to create the required predictions for analysis. Additionally, the class skew files will be recreated.

To run the hyperparameter optimization either run the run_shallow_hpo.sh script or, alternatively, the slurm_run_shallow_hpo.sh script when on a SLURM cluster.

sbatch scripts/slurm_run_shallow_hpo.sh
./scripts/run_shallow_hpo.sh

Afterwards you can use the merge_shallow_hpo_runs.py script to combine the results for easier evaluation of the hyperparameters.

Fit Models

In order to fit and analyze your own models, go to the top level directory and run any of:

python -m ml crnn
python -m ml resnet
python -m ml resnet11
python -m ml wavenet

This will fit the respective model with the default parameters and in binary mode for keystroke detection. In order to fit multiclass models for keystroke identification, use the encoding parameter, e.g.:

python -m ml crnn --encoding "multiclass"

In order to test specific sensors, ignore the others (note that quaternions are ignored by default), e.g. to use only EMG on a CRNN model, use:

python -m ml crnn --ignore "quat" "acc" "gyro"

To run a hyperparameter optimization, run e.g.:

python -m ml crnn --func shallow_hpo --step 5

To gain more information on possible parameters, run e.g.:

python -m ml crnn --help

Some parameters for the neural networks are fixed in the code.

Analyze Models

In order to analyze your models, run apply_models to create the predictions as pickled files. On these you can run further analyses found in the analysis folder.

To run apply_models on a binary model, do:

python -m analysis apply_models --model_path results/<PATH_TO_MODEL> --encoding binary --data_path test-data/ --save_path results/<PATH_TO_PKL> --save_only --basenames <YOUR MODELS>

To run a multiclass model, do:

python -m analysis apply_models --model_path results/<PATH_TO_MODEL> --encoding multiclass --data_path test-data/ --save_path results/<PATH_TO_PKL> --save_only --basenames <YOUR MODELS>

To chain a binary and multiclass model, do e.g.:

python -m analysis apply_models --model_path results/<PATH_TO_MODEL> --encoding chain --data_path test-data/ --save_path results/<PATH_TO_PKL> --save_only --basenames <YOUR MODELS> --tolerance 10

Further parameters interesting for analyses may be a filter on the users with the parameter (--users known or --users unknown) or on the data (--data known or --data unknown) to include only users (not) in the training data or include only data typed by all or no other user respectively.

For more information, run:

python -m analysis apply_models --help

To later recreate model performance results and plots, run:

python -m analysis apply_models --encoding <ENCODING> --load_results results/<PATH_TO_PKL> --save_path results/<PATH_TO_PKL> --save_only

with the appropriate encoding of the model used to create the pickled results.

To run further analyses on the generated predictions, create or choose your analysis from the analysis folder and run:

python -m analysis <ANALYSIS_NAME>

Refer to the help for further information:

python -m analysis <ANALYSIS_NAME> --help

Record Data

In order to record your own data(set), switch to the record folder. To record sensor data with our recording software, you will need one to two Myo armbands connected to your computer. Then, you can start a training data recording, e.g.:

python tasks.py -s 42 -l german record touch_typing --left_tty <TTY_LEFT_MYO> --left_mac <MAC_LEFT_MYO> --right_tty <TTY_RIGHT_MYO> --right_mac <MAC_RIGHT_MYO> --kb_model TADA68_DE

for a German recording with seed 42, a touch typist and a TADA68 German physical keyboard layout or

python tasks.py -s 42 -l english record touch_typing --left_tty <TTY_LEFT_MYO> --left_mac <MAC_LEFT_MYO> --right_tty <TTY_RIGHT_MYO> --right_mac <MAC_RIGHT_MYO> --kb_model TADA68_US

for an English recording with seed 42, a hybrid typist and a TADA68 English physical keyboard layout.

In order to start a test data recording, simply run the passwords.py instead of the tasks.py.

After recording training data, please execute the following script to complete the meta data:

python update_text_meta.py -p ../train-data/

After recording test data, please execute the following two scripts to complete the meta data:

python update_pw_meta.py -p ../test-data/
python update_cuts.py -p ../test-data/

For further information, check:

python tasks.py --help
python passwords.py --help

Note that the recording software includes text extracts as outlined in the acknowledgments below.

Links

Acknowledgments

This work includes the following external materials to be found in the record folder:

  1. Various texts from Wikipedia available under the CC-BY-SA 3.0 license.
  2. The EFF's New Wordlists for Random Passphrases available under the CC-BY 3.0 license.
  3. An extract of the Top 1000 most common passwords by Daniel Miessler, Jason Haddix, and g0tmi1k available under the MIT license.

License

This software is licensed under the GPLv3 license, please also refer to the LICENSE file.

Owner
Secure Mobile Networking Lab
Secure Mobile Networking Lab
A library of extension and helper modules for Python's data analysis and machine learning libraries.

Mlxtend (machine learning extensions) is a Python library of useful tools for the day-to-day data science tasks. Sebastian Raschka 2014-2020 Links Doc

Sebastian Raschka 4.2k Jan 02, 2023
ANEA: Distant Supervision for Low-Resource Named Entity Recognition

ANEA: Distant Supervision for Low-Resource Named Entity Recognition ANEA is a tool to automatically annotate named entities in unlabeled text based on

Saarland University Spoken Language Systems Group 15 Mar 30, 2022
Realtime Face Anti Spoofing with Face Detector based on Deep Learning using Tensorflow/Keras and OpenCV

Realtime Face Anti-Spoofing Detection 🤖 Realtime Face Anti Spoofing Detection with Face Detector to detect real and fake faces Please star this repo

Prem Kumar 86 Aug 03, 2022
Main repository for the HackBio'2021 Virtual Internship Experience for #Team-Greider ❤️

Hello 🤟 #Team-Greider The team of 20 people for HackBio'2021 Virtual Bioinformatics Internship 💝 🖨️ 👨‍💻 HackBio: https://thehackbio.com 💬 Ask us

Siddhant Sharma 7 Oct 20, 2022
Colossal-AI: A Unified Deep Learning System for Large-Scale Parallel Training

ColossalAI An integrated large-scale model training system with efficient parallelization techniques. arXiv: Colossal-AI: A Unified Deep Learning Syst

HPC-AI Tech 7.9k Jan 08, 2023
Deep Learning for humans

Keras: Deep Learning for Python Under Construction In the near future, this repository will be used once again for developing the Keras codebase. For

Keras 57k Jan 09, 2023
RoboDesk A Multi-Task Reinforcement Learning Benchmark

RoboDesk A Multi-Task Reinforcement Learning Benchmark If you find this open source release useful, please reference in your paper: @misc{kannan2021ro

Google Research 66 Oct 07, 2022
Pairwise learning neural link prediction for ogb link prediction

Pairwise Learning for Neural Link Prediction for OGB (PLNLP-OGB) This repository provides evaluation codes of PLNLP for OGB link property prediction t

Zhitao WANG 31 Oct 10, 2022
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

Master status: Development status: Package information: TPOT stands for Tree-based Pipeline Optimization Tool. Consider TPOT your Data Science Assista

Epistasis Lab at UPenn 8.9k Dec 30, 2022
A research toolkit for particle swarm optimization in Python

PySwarms is an extensible research toolkit for particle swarm optimization (PSO) in Python. It is intended for swarm intelligence researchers, practit

Lj Miranda 1k Dec 30, 2022
CountDown to New Year and shoot fireworks

CountDown and Shoot Fireworks About App This is an small application make you re

5 Dec 31, 2022
Applying PVT to Semantic Segmentation

Applying PVT to Semantic Segmentation Here, we take MMSegmentation v0.13.0 as an example, applying PVTv2 to SemanticFPN. For details see Pyramid Visio

35 Nov 30, 2022
Convert BART models to ONNX with quantization. 3X reduction in size, and upto 3X boost in inference speed

fast-Bart Reduction of BART model size by 3X, and boost in inference speed up to 3X BART implementation of the fastT5 library (https://github.com/Ki6a

Siddharth Sharma 19 Dec 09, 2022
Semantic Image Synthesis with SPADE

Semantic Image Synthesis with SPADE New implementation available at imaginaire repository We have a reimplementation of the SPADE method that is more

NVIDIA Research Projects 7.3k Jan 07, 2023
Pseudo-mask Matters in Weakly-supervised Semantic Segmentation

Pseudo-mask Matters in Weakly-supervised Semantic Segmentation By Yi Li, Zhanghui Kuang, Liyang Liu, Yimin Chen, Wayne Zhang SenseTime, Tsinghua Unive

33 Oct 14, 2022
Generalizing Gaze Estimation with Outlier-guided Collaborative Adaptation

Generalizing Gaze Estimation with Outlier-guided Collaborative Adaptation Our paper is accepted by ICCV2021. Picture: Overview of the proposed Plug-an

Yunfei Liu 32 Dec 10, 2022
Implementation based on Paper - Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling

Implementation based on Paper - Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling

HamasKhan 3 Jul 08, 2022
GND-Nets (Graph Neural Diffusion Networks) in TensorFlow.

GNDC For submission to IEEE TKDE. Overview Here we provide the implementation of GND-Nets (Graph Neural Diffusion Networks) in TensorFlow. The reposit

Wei Ye 3 Aug 08, 2022
Multiple types of NN model optimization environments. It is possible to directly access the host PC GUI and the camera to verify the operation. Intel iHD GPU (iGPU) support. NVIDIA GPU (dGPU) support.

mtomo Multiple types of NN model optimization environments. It is possible to directly access the host PC GUI and the camera to verify the operation.

Katsuya Hyodo 24 Mar 02, 2022
DTCN IJCAI - Sequential prediction learning framework and algorithm

DTCN This is the implementation of our paper "Sequential Prediction of Social Me

Bobby 2 Jan 24, 2022