Picka: A Python module for data generation and randomization.

Last update: Nov 30, 2021

Author:	Anthony Long
Version:	1.0.1 - Fixed the broken image stuff. Whoops

What is Picka?

Picka generates randomized data for testing.

Data is either drawn from an included database of known-good values or generated behind the scenes with string formatting so that it is realistic (valid).
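
As a rough illustration of these two approaches, the sketch below picks a name from a small built-in list (standing in for the bundled database) and builds an email address with string formatting. The names `FIRST_NAMES`, `sample_name`, and `sample_email` are hypothetical and are not part of Picka's API.

```python
import random
import string

# Hypothetical stand-in for the database of known-good data that Picka ships.
FIRST_NAMES = ["Jack", "Anna", "Maria", "Logan"]


def sample_name(rng=random):
    """Return a name drawn from the list of known-good values."""
    return rng.choice(FIRST_NAMES)


def sample_email(length, extension="example.org", rng=random):
    """Build a realistic-looking address with string formatting."""
    local = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    return "{0}@{1}".format(local, extension)


if __name__ == "__main__":
    print(sample_name())
    print(sample_email(10))
```

Passing `rng` explicitly makes it possible to seed a `random.Random` instance when a test needs reproducible values.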

Picka has a function for any field you need filled in. With Selenium, something like this would populate the "field-name-here" box 100 times with random names:

for x in range(100):
        self.selenium.type('field-name-here', picka.male_name())

But this is just the beginning. Another way to use Picka is to fill a dict:

user_information = {
        "first_name": picka.male_name(),
        "last_name": picka.last_name(),
        "email_address": picka.email(10, extension='example.org'),
        "password": picka.password_numerical(6),
}

This would provide:

{
        "first_name": "Jack",
        "last_name": "Logan",
        "email_address": "[email protected]",
        "password": "485444"
}

Don't forget: since all of the data is considered "clean" or valid, you can also use it to fill selects and other form fields with pre-defined values. For example, if you generate a state with picka.state(), the result might be "Alabama", which you can use to select a state directly in an address drop-down box.

Examples:

Selenium

def search_for_garbage():
        selenium.open('http://yahoo.com')
        selenium.type('id=search_box', picka.random_string(10))
        selenium.submit()

def test_search_for_garbage_results():
        search_for_garbage()
        selenium.wait_for_page_to_load('30000')
        assert selenium.get_xpath_count('id=results') == 0

Webdriver

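from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.get("http://somesite.com")
# Map each field to its CSS selector and the value to type into it.
x = {
        "name": [
                "#name",
                picka.name()
        ]
}
selector, value = x["name"]
driver.find_element(By.CSS_SELECTOR, selector).send_keys(value)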

Funcargs / pytest

def pytest_generate_tests(metafunc):
        # Run test_func once for each of 10 random 20-character strings.
        if "test_string" in metafunc.fixturenames:
                metafunc.parametrize(
                        "test_string",
                        [picka.random_string(20) for i in range(10)]
                )

def test_func(test_string):
        assert test_string.isalpha()
        assert len(test_string) == 20

MySQL / SQLite

first, last, age = picka.first_name(), picka.last_name(), picka.age()
# "?" is the sqlite3 placeholder; MySQL drivers such as MySQLdb use "%s" instead.
cursor.execute(
   "INSERT INTO user_data (first_name, last_name, age) VALUES (?, ?, ?)",
   (first, last, age)
)
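
For a version that runs end to end, the sketch below uses an in-memory SQLite database from the standard library. The literal values stand in for the picka calls, and the column types of `user_data` are an assumption, since the original does not show the table definition.

```python
import sqlite3

# Stand-in values; in a real test these would come from
# picka.first_name(), picka.last_name() and picka.age().
first, last, age = "Jack", "Logan", 42

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
# Assumed schema for illustration only.
cursor.execute(
    "CREATE TABLE user_data (first_name TEXT, last_name TEXT, age INTEGER)"
)
# Values are bound through "?" placeholders, not interpolated into the SQL.
cursor.execute(
    "INSERT INTO user_data (first_name, last_name, age) VALUES (?, ?, ?)",
    (first, last, age),
)
connection.commit()

rows = cursor.execute(
    "SELECT first_name, last_name, age FROM user_data"
).fetchall()
print(rows)  # [('Jack', 'Logan', 42)]
connection.close()
```

Binding parameters this way matters with random data, because a generated string may contain quotes or other characters that would break a hand-built SQL statement.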

HTTP

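import httplib  # named http.client on Python 3

def post(host, path, data):
        # POST data to the given path on host and return the response.
        conn = httplib.HTTPConnection(host)
        conn.request("POST", path, data)
        return conn.getresponse()

def test_post_result():
        post("www.spam.egg", "/bacon.htm", picka.random_string(10))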

Comments

No test suite

Slightly ironic: a test data generation toolkit that doesn't have a test suite.

Also, setup.py doesn't declare Python 3 support, hence the need for a test suite to validate that it works correctly.

opened by jayvdb 1
Additional Functionality for Testers to Add Their Own Data

Picka provides general data for testing. This change builds on that work so that testers can supply their own custom test data, which means test data is no longer limited to the preconfigured values. Custom data can be accessed sequentially, randomly, or all at once.

opened by bkuehlhorn 1
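
The three access modes mentioned in this request (sequential, random, complete) could look roughly like the sketch below. The class `CustomData` and its method names are hypothetical; they illustrate the idea and are not taken from Picka or from the proposed change.

```python
import random


class CustomData(object):
    """Hypothetical container for tester-supplied values."""

    def __init__(self, values, rng=None):
        self._values = list(values)
        self._rng = rng if rng is not None else random.Random()
        self._position = 0

    def next(self):
        """Sequential access: return values in order, wrapping at the end."""
        value = self._values[self._position % len(self._values)]
        self._position += 1
        return value

    def random(self):
        """Random access: return any one of the supplied values."""
        return self._rng.choice(self._values)

    def all(self):
        """Complete access: return a copy of every supplied value."""
        return list(self._values)


plans = CustomData(["free", "pro", "enterprise"])
print(plans.next(), plans.next(), plans.next(), plans.next())
print(plans.random() in plans.all())
```

Returning a copy from `all()` keeps callers from accidentally modifying the stored values.
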
Fixed test file, added alternative sentence maker
Fixed usage of number in tests (it takes one arg, not two)

Added sentence_actual, which returns an actual sentence from the Sherlock text.

Added _picka._Book class to hold the text and split sentences read from Sherlock. Users can call sentence() without reading the entire file again and again.

Added test of sentence_actual to picka.tests

The sentence_actual function has some nice features:

You're much less likely to get a sentence fragment

You can specify a minimum and maximum number of words

It should be relatively efficient, because the split sentences are cached by the _Book class.

The sentences aren't always perfect, but I think that has to do with the source. A book other than Sherlock Holmes, preferably one with less dialog, would give more "normal" sentences.
opened by TadLeonard 1
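
The caching idea described in this change can be sketched as follows. This is not the actual `_picka._Book` implementation: the class `Book`, the naive regular-expression split, and the made-up sample text are all assumptions for illustration.

```python
import random
import re


class Book(object):
    """Hypothetical holder that splits a text into sentences only once."""

    def __init__(self, text):
        self._text = text
        self._sentences = None  # filled lazily on first use

    @property
    def sentences(self):
        if self._sentences is None:
            # Naive split after ., ! or ? when followed by whitespace.
            parts = re.split(r"(?<=[.!?])\s+", self._text.strip())
            self._sentences = [p for p in parts if p]
        return self._sentences

    def sentence(self, min_words=1, max_words=None, rng=random):
        """Return a random cached sentence within the word-count bounds."""
        candidates = [
            s for s in self.sentences
            if len(s.split()) >= min_words
            and (max_words is None or len(s.split()) <= max_words)
        ]
        if not candidates:
            raise ValueError("no sentence matches the requested length")
        return rng.choice(candidates)


# Made-up sample text, not a quotation from the Sherlock source.
book = Book("It was late. The lamp burned low on the table. Why?")
print(book.sentence(min_words=3, max_words=4))  # It was late.
```

Because the split result is stored on the instance, repeated calls to `sentence()` only filter the cached list and never re-read or re-split the text.
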
Library does not take locale into account
The library assumes an English locale (e.g., hardcoded English month names). Ideally the library would use locale-dependent constants so that computations are still done correctly (e.g., the number of days in a month in month_and_day):

>>> locale.setlocale(locale.LC_ALL, 'it_IT')
'it_IT'
>>> picka.month()
'Marzo'
>>> picka.month_and_day()
'Maggio 2'
opened by svisser 0
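
One way the requested behaviour could be approached is sketched below, using the standard library's `calendar` module, whose `month_name` sequence follows the active locale. The functions `month` and `month_and_day` here are hypothetical re-implementations, not Picka's code, and the fixed default `year` is an assumption made to keep the example simple.

```python
import calendar
import random


def month(rng=random):
    """Return a month name in the currently active locale."""
    # calendar.month_name is resolved against the locale at access time.
    return calendar.month_name[rng.randint(1, 12)]


def month_and_day(year=2021, rng=random):
    """Return 'Month D' with a day that exists in that month of `year`."""
    number = rng.randint(1, 12)
    # monthrange returns (weekday of the first day, number of days).
    days_in_month = calendar.monthrange(year, number)[1]
    day = rng.randint(1, days_in_month)
    return "{0} {1}".format(calendar.month_name[number], day)


if __name__ == "__main__":
    print(month())
    print(month_and_day())
```

The key design choice is to compute with the month number and only convert to a name for display, so the day range never depends on the language of the month name.
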
picka.age will return ages outside of the bounds

If I call picka.age(1, 1) repeatedly, I get 1 and 2 as results. I would have expected it to always return 1. Note that this situation can occur when passing variables to picka.age; I don't expect people to write this in their code themselves.

I can also get ages outside of the bounds when I call picka.age(0, 1), which resorts to using the default values and can therefore return any age within the default range.

opened by svisser 0
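
The behaviour expected in this report can be sketched as below. This is a hypothetical re-implementation, not Picka's source: the default bounds `DEFAULT_MIN_AGE` and `DEFAULT_MAX_AGE` are assumed values, and the guess that a truthiness check on the arguments causes the fallback for 0 is only one plausible explanation.

```python
import random

# Assumed defaults for illustration; Picka's real defaults may differ.
DEFAULT_MIN_AGE = 1
DEFAULT_MAX_AGE = 99


def age(min_age=None, max_age=None, rng=random):
    """Return a random integer with min_age <= result <= max_age."""
    # Compare against None rather than truthiness so 0 is a usable bound.
    if min_age is None:
        min_age = DEFAULT_MIN_AGE
    if max_age is None:
        max_age = DEFAULT_MAX_AGE
    if min_age > max_age:
        raise ValueError("min_age must not exceed max_age")
    # randint includes both endpoints, so age(1, 1) can only return 1.
    return rng.randint(min_age, max_age)


if __name__ == "__main__":
    print(age(1, 1))  # 1
    print(age(0, 1))
```

With inclusive bounds and explicit `None` checks, both cases from the report stay inside the requested range.
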
Module name means "cunt"

I'm not sure if this is a real issue, but when I look at this module I cannot do so with a straight face. "Picka" is "cunt" in Serbian, Macedonian, Bosnian, Croatian, and I'm unsure as to whether there are other languages where this holds.

While not grounds for any specific action, I find this largely amusing and just wanted to share.

opened by geomaster 2

Releases(v0.96)

v0.96(Jan 17, 2014)

hex, rbg, image and more.

Owner

Anthony

GitHub Repository http://antlong.com

