Statistical tests for the sequential locality of graphs

Last update: Nov 23, 2021

Related tags

Overview

Statistical tests for the sequential locality of graphs

You can assess the statistical significance of the sequential locality of an adjacency matrix (graph + vertex sequence) using sequential_locality.py.

This file also includes ORGM.py that generates an instance of the ordered random graph model (ORGM) [1] and spectral.py that yields an optimized vertex sequence based on the spectral ordering algorithms.

Please find Ref. [1] for the details of the statistical tests.

sequential_locality.py

sequential_locality.py executes statistical tests with respect to the sequential locality.

Simple example

import numpy as np
import igraph
import sequential_locality as seq

s = seq.SequentialLocality(
		g = igraph.Graph.Erdos_Renyi(n=20,m=80), 
		sequence = np.arange(20)
		)
s.H1()

{'H1': 1.0375,
 'z1': 0.5123475382979811,
 'H1 p-value (ER/ORGM)': 0.6957960998835012,
 'H1 p-value (random)': 0.7438939644617626,
 'bandwidth_opt': None}

Please find Demo.ipynb for more examples.

SequentialLocality

This is a class to be instantiated to assess the sequential locality.

Input parameters

Either g or edgelist must be provided as an input.

Parameter	Value	Default	Description
g	graph	None	Graph (undirected, unweighted, no self-loops) in igraph or graph-tool.
edgelist	list of tuples	None	Edgelist as a list of tuples.
sequence	1-dim array	None	Array (list or ndarray) indicating the vertex ordering. If provided, the vertex indices in the graph will be replaced based on `sequence` . If `sequence` is `None`, the intrinsic vertex indices in the graph or edgelist will be used as the `sequence` .
format	`'igraph'` or `'graph-tool'`	`'igraph'`	Input graph format
simple	Boolean	True	If `True`, the graph is assumed to be a simple graph, otherwise the graph is assumed to be a multigraph.

H1

This is a method that returns H₁ and z₁ test statistics and p-values of the input data.

Input parameters

Parameter	Value	Default	Description
random_sequence	`'analytical'` or `'empirical'`	`'analytical'`	If `'analytical'` is selected, the p-value based on the normal approximation will be returned for the test of vertex sequence `H1 p-value (random)`. If `'empirical'` is selected, the p-value based on random sequences specified by `samples` will be returned.
n_samples	Integer	10,000	Number of samples to be drawn as a set of random sequences. This is used only when `random_sequence = 'empirical'`.
in_envelope	Boolean	`False`	If `False`, the p-value based on the ER model will be returned. If `True`, the p-value based on the ORGM will be returned. That is, the matrix elements outside of the bandwidth `r` will be ignored.
r	Integer	None	An integer between `1` and `N-1`. If provided, `r` will be used as the bandwidth when `in_envelope=True`.

Output parameters

Parameter	Description
H1	H₁ test statistic of the input data (graph & vertex sequence)
z1	z₁ test statistic of the input data
H1 p-value (ER/ORGM)	p-value under the null hypothesis of the ER random graph (when `in_envelope=False`) or the ORGM (when `in_envelope=True`).
H1 p-value (random)	p-value under the null hypothesis of random sequences
bandwidth_opt	Maximum likelihood estimate (MLE) of the bandwidth (when `r=None` in the input) or the input bandwidth `r`

HG

This is a method that returns H_G and z_G test statistics and p-values of the input data.

There is no in_envelope option for the test based on H_G.
random_sequence = 'analytical' can be computationally demanding.

Input parameters

Parameter	Value	Default	Description
random_sequence	`'analytical'` or `'empirical'`	`'empirical'`	If `'analytical'` is selected, the p-value based on the normal approximation will be returned for the test of vertex sequence `H1 p-value (random)`. If `'empirical'` is selected, the p-value based on random sequences specified by `samples` will be returned.
n_samples	Integer	10,000	Number of samples to be drawn as a set of random sequences. This is used only when `random_sequence = 'empirical'`.

Output parameters

Parameter	Description
HG	H_G test statistic of the input data (graph & vertex sequence)
zG	z_G test statistic of the input data
HG p-value (ER)	p-value under the null hypothesis of the ER random graph.
HG p-value (random)	p-value under the null hypothesis of random sequences

ORGM.py

ORGM.py is a random graph generator. It generates an ORGM [1] instance that has a desired strength of sequentially lcoal structure.

Simple example

import ORGM as orgm

edgelist, valid = orgm.ORGM(
	N=20, M=80, bandwidth=10, epsilon=0.25
	)

Input parameters

Parameter	Value	Default	Description
N	Integer	required input	Number of vertices
M	Integer	required input	Number of edges
bandwidth	Integer	required input	Bandwidth of the ORGM
epsilon	Float (in [0,1])	required input	Density ratio between the adjacency matrix elements inside & outside of the envelope. When `epsilon=1`, the ORGM becomes a uniform model. When `epsilon=0`, the nonzero matrix elements are strictly confined in the envelope.
simple	Boolean	`True`	If `True`, the graph is constrained to be simple. If `False`, the graph is allowed to have multiedges.

spectral.py

spectral.py is an implementation of the spectral ordering [2].

Simple example

import graph_tool.all as gt
import spectral

g_real = gt.collection.ns['karate/77']
inferred_sequence = spectral.spectral_sequence(
	g= g_real, 
	format='graph-tool'
	)

Parameter	Value	Default	Description
g	graph	required input	graph (undirected, unweighted, no self-loops) in igraph or graph-tool
normalized	Boolean	`True`	Normalized Laplacian (`True`) vs unnormalized (combinatorial) Laplacian (`False`)
format	`'igraph'` or `'graph-tool'`	`'igraph'`	Input graph format

Citation

Please use Ref. [1] for the citation of the present code.

References

[1] Tatsuro Kawamoto and Teruyoshi Kobayashi, "Sequential locality of graphs and its hypothesis testing," arXiv:2111.11267 (2021).
[2] Chris Ding and Xiaofeng He, "Linearized Cluster Assignment via Spectral Ordering," Proceedings of the Twenty-First International Conference on Machine Learning (ICML) (2004).

Statistical tests for the sequential locality of graphs

Related tags

Overview

Statistical tests for the sequential locality of graphs

sequential_locality.py

Simple example

SequentialLocality

Input parameters

H1

Input parameters

Output parameters

HG

Input parameters

Output parameters

ORGM.py

Simple example

Input parameters

spectral.py

Simple example

Citation

References

Owner

pytest plugin for distributed testing and loop-on-failures testing modes.

A collection of testing examples using pytest and many other libreris

Generic automation framework for acceptance testing and RPA

hyppo is an open-source software package for multivariate hypothesis testing.

Front End Test Automation with Pytest Framework

Generates realistic traffic for load testing tile servers

This project demonstrates selenium's ability to extract files from a website.

Aioresponses is a helper for mock/fake web requests in python aiohttp package.

Data-Driven Tests for Python Unittest

Multi-asset backtesting framework. An intuitive API lets analysts try out their strategies right away

A suite of benchmarks for CPU and GPU performance of the most popular high-performance libraries for Python :rocket:

Object factory for Django

Automatic SQL injection and database takeover tool

Rerun pytest when your code changes

RAT-el is an open source penetration test tool that allows you to take control of a windows machine.

The evaluator covering all of the metrics required by tasks within the DUE Benchmark.

Main purpose of this project is to provide the service to automate the API testing process

Pytest support for asyncio.

A simple python script that uses selenium(chrome web driver),pyautogui,time and schedule modules to enter google meets automatically

Fully functioning price detector built with selenium and python