funcnet.coupling_analysis¶

Provides classes for analyzing spatially embedded complex networks, handling multivariate data. Written by Jakob Runge.

class pyunicorn.funcnet.coupling_analysis.CouplingAnalysis(data, silence_level=0)[source]¶

Bases: object

Contains methods to calculate coupling matrices from large arrays of scalar time series. Comprises linear and information-theoretic measures, lagged and directed couplings.

__init__(data, silence_level=0)[source]¶

Initialize an instance of CouplingAnalysis from data array.

Parameters:

data (multidimensional numpy array) – The time series array with time in first dimension.
silence_level (int >= 0) – The higher, the less progress info is output.

__str__()[source]¶: Return a string representation of the CouplingAnalysis object.

__weakref__¶: list of weak references to the object

static _par_corr_to_cmi(par_corr)[source]¶

Transformation of partial correlation to conditional mutual information scale using the (multivariate) Gaussian assumption.

Parameters: par_corr (float or array) – partial correlation
Return type: float
Returns: transformed partial correlation.
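The transformation can be illustrated with the standard identity for jointly Gaussian variables, I = -1/2 ln(1 - ρ²). The sketch below assumes this identity and natural logarithms; the function name `par_corr_to_cmi_sketch` is hypothetical and the library's own implementation is not reproduced here.

```python
import numpy as np

def par_corr_to_cmi_sketch(par_corr):
    """Map a (partial) correlation to the CMI scale under a Gaussian assumption."""
    par_corr = np.asarray(par_corr, dtype=float)
    # For jointly Gaussian variables: I = -1/2 * ln(1 - rho^2), in nats.
    return -0.5 * np.log(1.0 - par_corr ** 2)

# The sign of the correlation is lost; only its magnitude matters.
cmi_values = par_corr_to_cmi_sketch([0.0, 0.5, -0.5])
```

The result is zero for uncorrelated variables and grows without bound as the absolute correlation approaches 1.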

static _quantile_bin_array(array, bins=6)[source]¶

Returns a symbolized array using equal-quantile binning.

This partition results in a uniform distribution of the marginals.

Parameters:

array (array) – data
bins (int) – number of bins

Return type:

array

Returns:

converted data
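A rank-based construction is one way to obtain such a partition. The following is a minimal sketch for a one-dimensional series; the helper name `quantile_bin_sketch` is hypothetical, and the handling of ties and of higher-dimensional arrays in the library may differ.

```python
import numpy as np

def quantile_bin_sketch(array, bins=6):
    """Assign each value to one of `bins` equally populated bins via its rank."""
    array = np.asarray(array, dtype=float)
    T = array.shape[0]
    # Rank of each sample along the time axis (0 .. T-1).
    ranks = np.argsort(np.argsort(array, axis=0), axis=0)
    # Scale ranks to integer bin labels 0 .. bins-1.
    return (ranks * bins) // T

symbols = quantile_bin_sketch(np.array([0.3, 2.5, -1.0, 0.9, 4.2, 1.1]), bins=3)
# Each of the three symbols occurs exactly twice: the marginal is uniform.
```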

static bincount_hist(symb_array)[source]¶

Computes histogram from symbolic array.

Parameters: symb_array (array of integers) – symbolic data
Return type: array
Returns: (unnormalized) histogram
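For a one-dimensional symbol sequence, an unnormalized histogram of this kind can be obtained with NumPy's `np.bincount`. This is only an illustration of the idea, not the library's code, which may also handle joint histograms over several variables.

```python
import numpy as np

symb_array = np.array([0, 2, 0, 1, 2, 1, 0])
# np.bincount counts how often each non-negative integer symbol occurs.
hist = np.bincount(symb_array)
# The counts sum to the sample length because the histogram is unnormalized.
```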

static create_plogp(T)[source]¶

Precalculation of p*log(p) needed for entropies.

Parameters: T (int) – sample length
Return type: array
Returns: p*log(p) array from p=1 to p=T
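The point of such a table is that entropies can be read off by indexing with histogram counts instead of calling the logarithm repeatedly. The sketch below assumes a table indexed by count n = 0 .. T holding (n/T) log(n/T); this indexing and the name `create_plogp_sketch` are assumptions, not the library's layout.

```python
import numpy as np

def create_plogp_sketch(T):
    """Table whose entry n holds (n/T) * log(n/T) for counts n = 0 .. T."""
    counts = np.arange(T + 1, dtype=float)
    p = counts / T
    plogp = np.zeros(T + 1)
    # Entry 0 stays 0, following the convention 0 * log(0) = 0.
    plogp[1:] = p[1:] * np.log(p[1:])
    return plogp

hist = np.array([3, 2, 2])            # unnormalized histogram, T = 7 samples
table = create_plogp_sketch(int(hist.sum()))
entropy = -table[hist].sum()          # Shannon entropy in nats via table lookup
```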

cross_correlation(tau_max=0, lag_mode='max')[source]¶

Return cross correlation between all pairs of nodes.

Two lag-modes are available (default: lag_mode='max'):

lag_mode = 'all': Return 3-dimensional array of lagged cross correlations between all pairs of nodes. An entry $(i, j, \tau)$ corresponds to $\rho(X^{i}_{t-\tau}, X^{j}_{t})$ for positive lags tau, i.e., the direction i --> j for $\tau \neq 0$.

lag_mode = 'max': Return matrix of absolute maxima and corresponding lags of lagged cross correlation (CC) between all pairs of nodes. Returns two usually asymmetric matrices of CC values and lags: In each matrix, an entry $(i, j)$ corresponds to the (positive or negative) value and lag, respectively, at absolute maximum of $\rho(X^{i}_{t-\tau}, X^{j}_{t})$ for positive lags tau, i.e., the direction i --> j for $\tau > 0$. The matrices are, thus, asymmetric. The function symmetrize_by_absmax() can be used to obtain a symmetric matrix.

Example:

>>> coup_ana = CouplingAnalysis(CouplingAnalysis.test_data())
>>> similarity_matrix, lag_matrix = coup_ana.cross_correlation(
...     tau_max=5, lag_mode='max')
>>> r((similarity_matrix, lag_matrix))
(array([[ 1.   ,  0.757 ,  0.779 ,  0.7536],
       [ 0.4847,  1.    ,  0.4502,  0.5197],
       [ 0.6219,  0.5844,  1.    ,  0.5992],
       [ 0.4827,  0.5509,  0.4996,  1.    ]]),
 array([[0, 4, 1, 2], [0, 0, 0, 0], [0, 3, 0, 1], [0, 2, 0, 0]]))

Parameters:

tau_max (int [int>=0]) – maximum lag of cross correlation lag function.
lag_mode (str [('max'|'all')]) – lag-mode of cross correlations to return.

Return type:

3D-array or tuple of matrices

Returns:

all-lag array or matrices of value and lag at the absolute maximum.
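To make the lag convention concrete, the following sketch computes the lag function for a single pair of series and then picks the lag at the absolute maximum, as lag_mode='max' is described to do. The helper `lagged_corr_sketch` and the synthetic data are hypothetical; the library may, for instance, use a common overlap window for all lags.

```python
import numpy as np

def lagged_corr_sketch(x, y, tau_max):
    """Correlation of x[t - tau] with y[t] for tau = 0 .. tau_max."""
    T = len(x)
    corr = np.empty(tau_max + 1)
    for tau in range(tau_max + 1):
        # Pair x shifted back by tau with the matching stretch of y.
        corr[tau] = np.corrcoef(x[:T - tau], y[tau:])[0, 1]
    return corr

rng = np.random.default_rng(0)
x = rng.standard_normal(500)
# y[t] follows x[t - 3] plus weak noise, i.e. the direction x --> y at lag 3.
y = np.roll(x, 3) + 0.1 * rng.standard_normal(500)

lag_function = lagged_corr_sketch(x, y, tau_max=5)      # one row of lag_mode='all'
best_lag = int(np.argmax(np.abs(lag_function)))         # lag at the absolute maximum
best_value = lag_function[best_lag]                     # signed value at that lag
```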

static get_nearest_neighbors(array, xyz, k, standardize=True)[source]¶

Returns nearest-neighbors for conditional mutual information estimator.

Reference: [Kraskov2004]

Parameters:

array (array (float)) – data array.
xyz (array [int(0|1|2)]) – identifier of X, Y, Z in CMI
k (int [int>=1]) – nearest-neighbor MI estimation parameter.
standardize (bool) – standardize array before estimation. (default: True)

Return type:

tuple of arrays

Returns:

nearest neighbors for each sample point.
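The estimator of [Kraskov2004] starts from the maximum-norm distance of each sample to its k-th nearest neighbor in the joint space. The brute-force sketch below illustrates only that first step; it is not the library's implementation, its return value differs from the tuple documented above, and the name `kth_neighbor_distance_sketch` is hypothetical.

```python
import numpy as np

def kth_neighbor_distance_sketch(points, k):
    """Max-norm distance from each sample to its k-th nearest neighbor."""
    points = np.asarray(points, dtype=float)
    # Pairwise Chebyshev (max-norm) distances, shape (T, T).
    diff = np.abs(points[:, None, :] - points[None, :, :])
    dist = diff.max(axis=2)
    # After sorting each row, column 0 is the point itself (distance 0).
    return np.sort(dist, axis=1)[:, k]

samples = np.array([[0.0, 0.0], [1.0, 0.5], [3.0, 0.0], [0.0, 2.0]])
eps = kth_neighbor_distance_sketch(samples, k=1)
```

The pairwise distance matrix needs O(T²) memory, so this form is only suitable for short series; a tree-based neighbor search would be the usual choice for longer ones.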

information_transfer(tau_max=0, estimator='knn', knn=10, past=1, cond_mode='ity', lag_mode='max')[source]¶

Return bivariate information transfer between all pairs of nodes.

Two condition modes of information transfer are available as described in [Runge2012b].

Information transfer to Y (ITY): $I(X^{i}_{t-\tau}, X^{j}_{t} \mid X^{j}_{t-1}, \ldots, X^{j}_{t-\mathrm{past}})$
Momentary information transfer (MIT): $I(X^{i}_{t-\tau}, X^{j}_{t} \mid X^{j}_{t-1}, \ldots, X^{j}_{t-\mathrm{past}}, X^{i}_{t-\tau-1}, \ldots, X^{i}_{t-\tau-\mathrm{past}})$

Two estimators are available:

estimator = 'knn' (Recommended): Based on k-nearest-neighbors [Kraskov2004], version 1 in their paper. A larger k gives smaller variance but larger (typically negative) bias, and vice versa.

estimator = 'gauss': Captures only the linear part of the association. Essentially estimates a transformed partial correlation.

Two lag-modes are available (default: lag_mode='max'):

lag_mode = 'all': Return 3-dimensional array of lag-functions between all pairs of nodes. An entry $(i, j, \tau)$ corresponds to $I(X^{i}_{t-\tau}, X^{j}_{t} \mid \ldots)$ for positive lags tau, i.e., the direction i --> j for $\tau \neq 0$.

lag_mode = 'max': Return matrix of absolute maxima and corresponding lags of lag-functions between all pairs of nodes. Returns two usually asymmetric matrices of values and lags: In each matrix, an entry $(i, j)$ corresponds to the value and lag, respectively, at absolute maximum of $I(X^{i}_{t-\tau}, X^{j}_{t} \mid \ldots)$ for positive lags tau, i.e., the direction i --> j for $\tau > 0$. The matrices are, thus, asymmetric. The function symmetrize_by_absmax() can be used to obtain a symmetric matrix.

Example:

>>> coup_ana = CouplingAnalysis(CouplingAnalysis.test_data())
>>> similarity_matrix, lag_matrix = coup_ana.information_transfer(
...     tau_max=5, estimator='knn', knn=10)
>>> r((similarity_matrix, lag_matrix))
(array([[ 0.    ,  0.1544,  0.3261,  0.3047],
       [  0.0218,  0.    ,  0.0394,  0.0976],
       [  0.0134,  0.0663,  0.    ,  0.1502],
       [  0.0066,  0.0694,  0.0401,  0.    ]]),
array([[0, 2, 1, 2], [5, 0, 0, 0], [5, 1, 0, 1], [5, 0, 0, 0]]))

Parameters:

tau_max (int [int>=0]) – maximum lag of ITY lag function.
past (int [int>=1]) – maximum lag of past history.
knn (int [int>=1]) – nearest-neighbor ITY estimation parameter. (default: 10)
estimator (str [('knn'|'gauss')]) – ITY estimator. (default: 'knn')
cond_mode (str [('ity'|'mit')]) – condition mode. (default: 'ity')
lag_mode (str [('max'|'all')]) – lag-mode of ITY to return.

Return type:

3D-array or tuple of matrices

Returns:

all-lag array or matrices of value and lag at the absolute maximum.
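The 'gauss' estimator is described above as a transformed partial correlation. A minimal sketch of that idea for the ITY condition set follows: regress the conditions out of both variables, correlate the residuals, and map the result to the CMI scale with the Gaussian identity. The function `ity_gauss_sketch`, the synthetic process, and the use of least-squares residuals are assumptions for illustration, not the library's code.

```python
import numpy as np

def ity_gauss_sketch(x, y, tau, past=1):
    """Gaussian estimate of I(X_{t-tau}; Y_t | Y_{t-1}, ..., Y_{t-past}) in nats."""
    T = len(x)
    start = max(tau, past)
    target = y[start:]                      # Y_t
    source = x[start - tau:T - tau]         # X_{t-tau}
    # Condition set: intercept plus the `past` most recent values of y.
    cond = np.column_stack(
        [np.ones(T - start)] + [y[start - p:T - p] for p in range(1, past + 1)])
    # Residuals after linearly regressing out the condition set.
    res_t = target - cond @ np.linalg.lstsq(cond, target, rcond=None)[0]
    res_s = source - cond @ np.linalg.lstsq(cond, source, rcond=None)[0]
    par_corr = np.corrcoef(res_s, res_t)[0, 1]
    return -0.5 * np.log(1.0 - par_corr ** 2)

# Hypothetical test process: y is driven by its own past and by x at lag 2.
rng = np.random.default_rng(1)
T = 1000
x = rng.standard_normal(T)
y = np.zeros(T)
for t in range(2, T):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 2] + 0.5 * rng.standard_normal()

ity_lag2 = ity_gauss_sketch(x, y, tau=2)    # true coupling lag: clearly positive
ity_lag4 = ity_gauss_sketch(x, y, tau=4)    # no direct link: close to zero
```

Conditioning on the past of y is what suppresses the value at lag 4, where a plain lagged correlation could still pick up dependence carried through the autocorrelation of y.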

mutual_information(tau_max=0, estimator='knn', knn=10, bins=6, lag_mode='max')[source]¶

Return mutual information (MI) between all pairs of nodes.

Three estimators are available:

estimator = 'knn' (Recommended): Based on k-nearest-neighbors [Kraskov2004], version 1 in their paper. A larger k gives smaller variance but larger (typically negative) bias, and vice versa.

estimator = 'binning': Binning estimator based on equal-quantile binning.

estimator = 'gauss': Captures only the linear part of the association. Essentially estimates a transformed partial correlation.

Two lag-modes are available (default: lag_mode='max'):

lag_mode = 'all': Return 3-dimensional array of lagged MI between all pairs of nodes. An entry $(i, j, \tau)$ corresponds to $I(X^{i}_{t-\tau}, X^{j}_{t})$ for positive lags tau, i.e., the direction i --> j for $\tau \neq 0$.

lag_mode = 'max': Return matrix of absolute maxima and corresponding lags of lagged MI between all pairs of nodes. Returns two usually asymmetric matrices of MI values and lags: In each matrix, an entry $(i, j)$ corresponds to the value and lag, respectively, at absolute maximum of $I(X^{i}_{t-\tau}, X^{j}_{t})$ for positive lags tau, i.e., the direction i --> j for $\tau > 0$. The matrices are, thus, asymmetric. The function symmetrize_by_absmax() can be used to obtain a symmetric matrix.

Reference: [Kraskov2004]

Example:

>>> coup_ana = CouplingAnalysis(CouplingAnalysis.test_data())
>>> similarity_matrix, lag_matrix = coup_ana.mutual_information(
...     tau_max=5, knn=10, estimator='knn')
>>> r(similarity_matrix)
array([[ 4.6505,  0.4387,  0.4652,  0.4126],
       [ 0.147 ,  4.6505,  0.1065,  0.1639],
       [ 0.2483,  0.2126,  4.6505,  0.2204],
       [ 0.1209,  0.199 ,  0.1453,  4.6505]])
>>> lag_matrix
array([[0, 4, 1, 2],
       [0, 0, 0, 0],
       [0, 2, 0, 1],
       [0, 2, 0, 0]], dtype=int8)

Parameters:

tau_max (int [int>=0]) – maximum lag of MI lag function.
knn (int [int>=1]) – nearest-neighbor MI estimation parameter. (default: 10)
bins (int [int>=2]) – binning MI estimation parameter. (default: 6)
estimator (str [('knn'|'binning'|'gauss')]) – MI estimator. (default: 'knn')
lag_mode (str [('max'|'all')]) – lag-mode of MI to return.

Return type:

3D-array or tuple of matrices

Returns:

all-lag array or matrices of value and lag at the absolute maximum.
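The 'binning' estimator can be pictured as a plug-in estimate on equal-quantile symbols: MI = H(X) + H(Y) - H(X, Y). The sketch below works on a single pair of series at lag zero; the name `mi_binning_sketch` is hypothetical and the library's implementation may differ in details such as bias handling.

```python
import numpy as np

def mi_binning_sketch(x, y, bins=6):
    """Plug-in MI estimate (nats) from equal-quantile symbols of x and y."""
    T = len(x)
    # Rank-based symbols 0 .. bins-1 with (nearly) uniform marginals.
    sx = (np.argsort(np.argsort(x)) * bins) // T
    sy = (np.argsort(np.argsort(y)) * bins) // T

    def entropy(counts):
        p = counts[counts > 0] / T
        return -np.sum(p * np.log(p))

    # Joint symbols are encoded as a single integer sx * bins + sy.
    h_joint = entropy(np.bincount(sx * bins + sy, minlength=bins * bins))
    return entropy(np.bincount(sx)) + entropy(np.bincount(sy)) - h_joint

rng = np.random.default_rng(2)
a = rng.standard_normal(600)
b = rng.standard_normal(600)
mi_self = mi_binning_sketch(a, a)          # equals log(bins) for identical series
mi_indep = mi_binning_sketch(a, b)         # small positive bias for independent series
```

For identical series the estimate saturates at log(bins), which is why binned MI values have an upper bound that depends on the bins parameter.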

silence_level¶: (int>=0) higher -> less progress info

symmetrize_by_absmax(similarity_matrix, lag_matrix)[source]¶

Returns symmetrized similarity matrix.

For each pair of entries (i,j) and (j,i), determines the one with the larger absolute value and writes it to both positions. The matrices of measures and lags are modified in place and returned. A negative lag for an entry (i,j) in the lag_matrix then indicates a 'direction' j --> i regarding the peak of the lag function, and vice versa for a positive lag.

Example:

>>> coup_ana = CouplingAnalysis(CouplingAnalysis.test_data())
>>> similarity_matrix, lag_matrix = coup_ana.cross_correlation(
...     tau_max=2)
>>> r((similarity_matrix, lag_matrix))
(array([[ 1.    , 0.698 , 0.7788, 0.7535],
        [ 0.4848, 1.    , 0.4507, 0.52  ],
        [ 0.6219, 0.5704, 1.    , 0.5996],
        [ 0.4833, 0.5503, 0.5002, 1.    ]]),
 array([[0, 2, 1, 2], [0, 0, 0, 0],
        [0, 2, 0, 1], [0, 2, 0, 0]]))
>>> r(coup_ana.symmetrize_by_absmax(similarity_matrix, lag_matrix))
(array([[ 1.    , 0.698 , 0.7788, 0.7535],
        [ 0.698 , 1.    , 0.5704, 0.5503],
        [ 0.7788, 0.5704, 1.    , 0.5996],
        [ 0.7535, 0.5503, 0.5996, 1.    ]]),
 array([[ 0, 2, 1, 2], [-2, 0, -2, -2],
        [-1, 2, 0, 1], [-2, 2, -1, 0]]))

Parameters:

similarity_matrix (array-like [float]) – array-like [node, node] matrix of similarity estimates
lag_matrix (array-like [int>=0]) – array-like [node, node] matrix of lags

Return type:

tuple of arrays

Returns:

the value at the absolute maximum and the (positive or negative) lag.
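The rule can be written compactly with NumPy. The sketch below is an illustration, not the library's code: it returns new arrays instead of modifying its inputs in place, the name `symmetrize_by_absmax_sketch` is hypothetical, and exact ties between (i,j) and (j,i) are left unflipped. The input values are the ones printed in the example above.

```python
import numpy as np

def symmetrize_by_absmax_sketch(similarity_matrix, lag_matrix):
    """Keep, for each pair (i, j) / (j, i), the entry with the larger absolute value."""
    sim = np.array(similarity_matrix, dtype=float)
    lag = np.array(lag_matrix, dtype=int)
    # True where the transposed entry (j, i) has the larger absolute value.
    flip = np.abs(sim.T) > np.abs(sim)
    sym_sim = np.where(flip, sim.T, sim)
    # A lag taken from (j, i) changes sign when written to (i, j).
    sym_lag = np.where(flip, -lag.T, lag)
    return sym_sim, sym_lag

sim_in = [[1.0, 0.698, 0.7788, 0.7535],
          [0.4848, 1.0, 0.4507, 0.52],
          [0.6219, 0.5704, 1.0, 0.5996],
          [0.4833, 0.5503, 0.5002, 1.0]]
lag_in = [[0, 2, 1, 2], [0, 0, 0, 0], [0, 2, 0, 1], [0, 2, 0, 0]]
sym_sim, sym_lag = symmetrize_by_absmax_sketch(sim_in, lag_in)
```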

static test_data()[source]¶: Return example test data as discussed in the pyunicorn description paper.