funcnet.coupling_analysis
Provides classes for analyzing spatially embedded complex networks, handling multivariate data. Written by Jakob Runge.
- class pyunicorn.funcnet.coupling_analysis.CouplingAnalysis(data, silence_level=0)
Bases: object
Contains methods to calculate coupling matrices from large arrays of scalar time series. Comprises linear and information-theoretic measures, lagged and directed couplings.
- __init__(data, silence_level=0)
Initialize an instance of CouplingAnalysis from data array.
- Parameters:
data (multidimensional numpy array) – The time series array with time in first dimension.
silence_level (int >= 0) – The higher the value, the less progress information is output.
- __weakref__
list of weak references to the object
- static _par_corr_to_cmi(par_corr)
Transformation of partial correlation to conditional mutual information scale using the (multivariate) Gaussian assumption.
- Parameters:
par_corr (float or array) – partial correlation
- Return type:
float
- Returns:
transformed partial correlation.
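For jointly Gaussian variables, conditional mutual information and partial correlation are related by I = -1/2 ln(1 - rho^2). The following minimal sketch illustrates that relation only; the function name `par_corr_to_cmi_sketch` is hypothetical, the natural logarithm (nats) is an assumption, and the snippet is not taken from the library's source.

```python
import math


def par_corr_to_cmi_sketch(par_corr):
    """Map a partial correlation in (-1, 1) to the CMI scale (in nats).

    Hypothetical helper illustrating the Gaussian relation
    I = -0.5 * ln(1 - rho**2); not pyunicorn's own code.
    """
    return -0.5 * math.log(1.0 - par_corr ** 2)


# The sign of the partial correlation is lost; only its magnitude matters.
values = [par_corr_to_cmi_sketch(rho) for rho in (0.0, 0.5, -0.5, 0.9)]
```

The transformed value is zero for uncorrelated variables and grows without bound as the absolute partial correlation approaches 1.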
- static _quantile_bin_array(array, bins=6)
Returns symbolified array with equal-quantile binning.
This partition results in a uniform distribution of the marginals.
- Parameters:
array (array) – data
bins (int) – number of bins
- Return type:
array
- Returns:
converted data
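Equal-quantile binning can be sketched by ranking the samples and mapping ranks to bins, so that every bin receives (nearly) the same number of samples. The pure-Python function below is a hypothetical illustration under that assumption; it works on a flat list, breaks ties by sort order, and is not the library's implementation.

```python
def quantile_bin_sketch(values, bins=6):
    """Assign each value to one of `bins` equal-quantile bins (0 .. bins-1).

    Hypothetical illustration: rank the samples, then map rank r of n
    samples to bin r * bins // n.
    """
    n = len(values)
    # Indices of the samples, ordered by ascending value.
    order = sorted(range(n), key=lambda idx: values[idx])
    symbols = [0] * n
    for rank, idx in enumerate(order):
        symbols[idx] = rank * bins // n
    return symbols


data = [0.3, 2.5, -1.0, 0.9, 7.2, 0.1, 4.4, -3.3, 1.8, 0.0, 5.1, 2.2]
symbols = quantile_bin_sketch(data, bins=6)
# symbols == [2, 4, 0, 2, 5, 1, 4, 0, 3, 1, 5, 3]: two samples per bin
```

Because the bin edges follow the empirical quantiles, the marginal distribution of the symbols is uniform regardless of the distribution of the input.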
- static bincount_hist(symb_array)
Computes histogram from symbolic array.
- Parameters:
symb_array (array of integers) – symbolic data
- Return type:
array
- Returns:
(unnormalized) histogram
- static create_plogp(T)
Precalculation of p*log(p) needed for entropies.
- Parameters:
T (int) – sample length
- Return type:
array
- Returns:
p*log(p) array from p=1 to p=T
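A precomputed table of n*log(n) values is useful because the Shannon entropy of a histogram with counts n_i and sample length T can be written as H = log(T) - (1/T) * sum_i n_i*log(n_i). The sketch below shows this with hypothetical helper names; the table layout (index 0 holding 0 for empty bins) is an assumption made for convenience, not a statement about the library's array.

```python
import math


def create_plogp_sketch(T):
    """Table g with g[n] = n * log(n) for n = 0..T (g[0] = 0 by convention)."""
    return [0.0] + [n * math.log(n) for n in range(1, T + 1)]


def bincount_hist_sketch(symb_array):
    """Unnormalized histogram: counts[s] = number of occurrences of symbol s."""
    counts = [0] * (max(symb_array) + 1)
    for s in symb_array:
        counts[s] += 1
    return counts


def entropy_from_counts(counts, plogp):
    """Shannon entropy in nats: H = log(T) - (1/T) * sum_i n_i * log(n_i)."""
    T = sum(counts)
    return math.log(T) - sum(plogp[n] for n in counts) / T


symbols = [0, 1, 2, 3, 0, 1, 2, 3]
counts = bincount_hist_sketch(symbols)
entropy = entropy_from_counts(counts, create_plogp_sketch(len(symbols)))
# Four equally frequent symbols give H = log(4).
```

Looking up n*log(n) in a table avoids recomputing logarithms for every bin of every pair of nodes.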
- cross_correlation(tau_max=0, lag_mode='max')
Return cross correlation between all pairs of nodes.
Two lag-modes are available (default: lag_mode=’max’):
lag_mode = ‘all’: Return a 3-dimensional array of lagged cross correlations between all pairs of nodes. An entry \((i, j, \tau)\) corresponds to \(\rho(X^i_{t-\tau}, X^j_t)\) for positive lags \(\tau\), i.e., the direction i --> j for \(\tau > 0\).
lag_mode = ‘max’: Return two matrices, one with the values and one with the corresponding lags at the absolute maximum of the lagged cross correlation (CC) between all pairs of nodes. In each matrix, an entry \((i, j)\) corresponds to the (positive or negative) value and lag, respectively, at the absolute maximum of \(\rho(X^i_{t-\tau}, X^j_t)\) for positive lags \(\tau\), i.e., the direction i --> j for \(\tau > 0\). The matrices are therefore usually asymmetric. The function symmetrize_by_absmax() can be used to obtain a symmetric matrix.
Example:
>>> coup_ana = CouplingAnalysis(CouplingAnalysis.test_data())
>>> similarity_matrix, lag_matrix = coup_ana.cross_correlation(
...     tau_max=5, lag_mode='max')
>>> r((similarity_matrix, lag_matrix))
(array([[ 1.    ,  0.757 ,  0.779 ,  0.7536],
        [ 0.4847,  1.    ,  0.4502,  0.5197],
        [ 0.6219,  0.5844,  1.    ,  0.5992],
        [ 0.4827,  0.5509,  0.4996,  1.    ]]),
 array([[0, 4, 1, 2],
        [0, 0, 0, 0],
        [0, 3, 0, 1],
        [0, 2, 0, 0]]))
- Parameters:
tau_max (int [int>=0]) – maximum lag of cross correlation lag function.
lag_mode (str [('max'|'all')]) – lag-mode of cross correlations to return.
- Return type:
3D-array or tuple of matrices
- Returns:
all-lag array or matrices of value and lag at the absolute maximum.
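The 'max' lag mode can be illustrated for a single ordered pair of series: correlate X^i shifted back by each lag with X^j and keep the value and lag where the absolute correlation peaks. The sketch below uses plain Python lists and hypothetical function names; it is an illustration under these assumptions and does not reproduce the library's array-based implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def lagged_cc_max(xi, xj, tau_max):
    """Return (value, lag) at the absolute maximum of rho(X^i_{t-tau}, X^j_t)."""
    T = len(xi)
    best_value, best_lag = 0.0, 0
    for tau in range(tau_max + 1):
        # Pair X^i at time t - tau with X^j at time t, for t = tau .. T-1.
        rho = pearson(xi[:T - tau], xj[tau:])
        if abs(rho) > abs(best_value):
            best_value, best_lag = rho, tau
    return best_value, best_lag


# X^j is X^i delayed by two steps, so the peak appears at lag 2 for i --> j.
xi = [math.sin(0.7 * t) + 0.3 * math.cos(2.3 * t) for t in range(60)]
xj = [0.0, 0.0] + xi[:-2]
value_ij, lag_ij = lagged_cc_max(xi, xj, tau_max=5)
```

Calling the same function with the arguments swapped probes the direction j --> i, which is why the resulting matrices are generally asymmetric.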
- static get_nearest_neighbors(array, xyz, k, standardize=True)
Returns nearest-neighbors for conditional mutual information estimator.
Reference: [Kraskov2004]
- Parameters:
array (array (float)) – data array.
xyz (array [int(0|1|2)]) – identifier of X, Y, Z in CMI
k (int [int>=1]) – nearest-neighbor MI estimation parameter.
standardize (bool) – standardize array before estimation. (default: True)
- Return type:
tuple of arrays
- Returns:
nearest neighbors for each sample point.
- information_transfer(tau_max=0, estimator='knn', knn=10, past=1, cond_mode='ity', lag_mode='max')
Return bivariate information transfer between all pairs of nodes.
Two condition modes of information transfer are available as described in [Runge2012b].
- Information transfer to Y (ITY):
\[I(X^i_{t-\tau}, X^j_t \mid X^j_{t-1}, \ldots, X^j_{t-\text{past}})\]
- Momentary information transfer (MIT):
\[I(X^i_{t-\tau}, X^j_t \mid X^j_{t-1}, \ldots, X^j_{t-\text{past}}, X^i_{t-\tau-1}, \ldots, X^i_{t-\tau-\text{past}})\]
Two estimators are available:
estimator = ‘knn’ (Recommended): Based on k-nearest-neighbors [Kraskov2004], version 1 in their paper. A larger k gives smaller variance but larger (typically negative) bias, and vice versa.
estimator = ‘gauss’: Captures only linear part of association. Essentially estimates a transformed partial correlation.
Two lag-modes are available (default: lag_mode=’max’):
lag_mode = ‘all’: Return a 3-dimensional array of lag functions between all pairs of nodes. An entry \((i, j, \tau)\) corresponds to \(I(X^i_{t-\tau}, X^j_t \mid \ldots)\) for positive lags \(\tau\), i.e., the direction i --> j for \(\tau > 0\).
lag_mode = ‘max’: Return two matrices, one with the values and one with the corresponding lags at the absolute maximum of the lag functions between all pairs of nodes. In each matrix, an entry \((i, j)\) corresponds to the value and lag, respectively, at the absolute maximum of \(I(X^i_{t-\tau}, X^j_t \mid \ldots)\) for positive lags \(\tau\), i.e., the direction i --> j for \(\tau > 0\). The matrices are therefore usually asymmetric. The function symmetrize_by_absmax() can be used to obtain a symmetric matrix.
Example:
>>> coup_ana = CouplingAnalysis(CouplingAnalysis.test_data())
>>> similarity_matrix, lag_matrix = coup_ana.information_transfer(
...     tau_max=5, estimator='knn', knn=10)
>>> r((similarity_matrix, lag_matrix))
(array([[ 0.    ,  0.1544,  0.3261,  0.3047],
        [ 0.0218,  0.    ,  0.0394,  0.0976],
        [ 0.0134,  0.0663,  0.    ,  0.1502],
        [ 0.0066,  0.0694,  0.0401,  0.    ]]),
 array([[0, 2, 1, 2],
        [5, 0, 0, 0],
        [5, 1, 0, 1],
        [5, 0, 0, 0]]))
- Parameters:
tau_max (int [int>=0]) – maximum lag of ITY lag function. (default: 0)
estimator (str [('knn'|'gauss')]) – ITY estimator. (default: ‘knn’)
knn (int [int>=1]) – nearest-neighbor ITY estimation parameter. (default: 10)
past (int [int>=1]) – maximum lag of past history. (default: 1)
cond_mode (str [('ity'|'mit')]) – condition mode. (default: ‘ity’)
lag_mode (str [('max'|'all')]) – lag-mode of ITY to return. (default: ‘max’)
- Return type:
3D-array or tuple of matrices
- Returns:
all-lag array or matrices of value and lag at the absolute maximum.
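The two condition modes differ only in which lagged values enter the conditioning set. The sketch below makes the index alignment explicit for one ordered pair of series; the helper `lagged_samples` is hypothetical, and it assumes that MIT additionally conditions on the past of X^i at time t - tau, in line with [Runge2012b]. It only assembles the samples and leaves the estimation step out.

```python
def lagged_samples(xi, xj, tau, past=1, cond_mode="ity"):
    """Collect (x, y, z) triples for I(X^i_{t-tau}, X^j_t | Z).

    Hypothetical helper: Z holds X^j_{t-1}, ..., X^j_{t-past}; for
    cond_mode="mit" it also holds X^i_{t-tau-1}, ..., X^i_{t-tau-past}.
    """
    if cond_mode not in ("ity", "mit"):
        raise ValueError("cond_mode must be 'ity' or 'mit'")
    T = len(xj)
    # Earliest t for which every lagged index is non-negative.
    start = max(tau, past) if cond_mode == "ity" else tau + past
    samples = []
    for t in range(start, T):
        z = [xj[t - p] for p in range(1, past + 1)]
        if cond_mode == "mit":
            z += [xi[t - tau - p] for p in range(1, past + 1)]
        samples.append((xi[t - tau], xj[t], tuple(z)))
    return samples


xi = list(range(100, 110))  # X^i_t = 100 + t, so values reveal their time index
xj = list(range(200, 210))  # X^j_t = 200 + t
ity = lagged_samples(xi, xj, tau=2, past=2, cond_mode="ity")
mit = lagged_samples(xi, xj, tau=2, past=2, cond_mode="mit")
# ity[0] == (100, 202, (201, 200))
# mit[0] == (102, 204, (203, 202, 101, 100))
```

MIT uses a larger conditioning set and therefore fewer usable time steps for the same series length, which is worth keeping in mind when choosing past and tau_max for short records.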
- mutual_information(tau_max=0, estimator='knn', knn=10, bins=6, lag_mode='max')
Return mutual information (MI) between all pairs of nodes.
Three estimators are available:
estimator = ‘knn’ (Recommended): Based on k-nearest-neighbors [Kraskov2004], version 1 in their paper. A larger k gives smaller variance but larger (typically negative) bias, and vice versa.
estimator = ‘binning’: Binning estimator based on equal-quantile binning.
estimator = ‘gauss’: Captures only linear part of association. Essentially estimates a transformed partial correlation.
Two lag-modes are available (default: lag_mode=’max’):
lag_mode = ‘all’: Return a 3-dimensional array of lagged MI between all pairs of nodes. An entry \((i, j, \tau)\) corresponds to \(I(X^i_{t-\tau}, X^j_t)\) for positive lags \(\tau\), i.e., the direction i --> j for \(\tau > 0\).
lag_mode = ‘max’: Return two matrices, one with the MI values and one with the corresponding lags at the absolute maximum of the lagged MI between all pairs of nodes. In each matrix, an entry \((i, j)\) corresponds to the value and lag, respectively, at the absolute maximum of \(I(X^i_{t-\tau}, X^j_t)\) for positive lags \(\tau\), i.e., the direction i --> j for \(\tau > 0\). The matrices are therefore usually asymmetric. The function symmetrize_by_absmax() can be used to obtain a symmetric matrix.
Reference: [Kraskov2004]
Example:
>>> coup_ana = CouplingAnalysis(CouplingAnalysis.test_data())
>>> similarity_matrix, lag_matrix = coup_ana.mutual_information(
...     tau_max=5, knn=10, estimator='knn')
>>> r(similarity_matrix)
array([[ 4.6505,  0.4387,  0.4652,  0.4126],
       [ 0.147 ,  4.6505,  0.1065,  0.1639],
       [ 0.2483,  0.2126,  4.6505,  0.2204],
       [ 0.1209,  0.199 ,  0.1453,  4.6505]])
>>> lag_matrix
array([[0, 4, 1, 2],
       [0, 0, 0, 0],
       [0, 2, 0, 1],
       [0, 2, 0, 0]], dtype=int8)
- Parameters:
tau_max (int [int>=0]) – maximum lag of MI lag function.
knn (int [int>=1]) – nearest-neighbor MI estimation parameter. (default: 10)
bins (int [int>=2]) – binning MI estimation parameter. (default: 6)
estimator (str [('knn'|'binning'|'gauss')]) – MI estimator. (default: ‘knn’)
lag_mode (str [('max'|'all')]) – lag-mode of MI to return.
- Return type:
3D-array or tuple of matrices
- Returns:
all-lag array or matrices of value and lag at the absolute maximum.
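The idea behind the k-nearest-neighbor estimator (version 1 in [Kraskov2004]) can be sketched for two scalar series: for every sample, take the max-norm distance to its k-th neighbor in the joint space, count how many samples fall strictly within that distance in each marginal space, and combine the counts through the digamma function. The brute-force code below is a hypothetical illustration under these assumptions; it omits any preprocessing the library may apply and is not its implementation.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{m=1}^{n-1} 1/m."""
    return -EULER_GAMMA + sum(1.0 / m for m in range(1, n))


def ksg_mi(x, y, k=3):
    """Kraskov et al. estimator 1 of I(X, Y) in nats, O(N^2) brute force."""
    n = len(x)
    total = 0.0
    for i in range(n):
        # Max-norm distances from sample i to all other samples (joint space).
        dists = sorted(
            max(abs(x[i] - x[j]), abs(y[i] - y[j])) for j in range(n) if j != i
        )
        eps = dists[k - 1]  # distance to the k-th nearest neighbor
        # Count samples strictly closer than eps in each marginal space.
        n_x = sum(1 for j in range(n) if j != i and abs(x[i] - x[j]) < eps)
        n_y = sum(1 for j in range(n) if j != i and abs(y[i] - y[j]) < eps)
        total += digamma_int(n_x + 1) + digamma_int(n_y + 1)
    return digamma_int(k) + digamma_int(n) - total / n


random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(200)]
noise = [random.gauss(0.0, 1.0) for _ in range(200)]
mi_dependent = ksg_mi(x, [a + 0.1 * e for a, e in zip(x, noise)])
mi_independent = ksg_mi(x, noise)
```

The estimate for the strongly coupled pair should be clearly positive, while the estimate for the independent pair should scatter around zero and can be slightly negative.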
- silence_level
(int >= 0) The higher the value, the less progress information is output.
- symmetrize_by_absmax(similarity_matrix, lag_matrix)
Returns symmetrized similarity matrix.
Compares the absolute values of each pair of entries (i, j) and (j, i), keeps the larger one for both, and returns the matrices of measures and lags, which are modified in place. A negative lag for an entry (i, j) in lag_matrix then indicates a ‘direction’ j --> i regarding the peak of the lag function, and a positive lag indicates i --> j.
Example:
>>> coup_ana = CouplingAnalysis(CouplingAnalysis.test_data())
>>> similarity_matrix, lag_matrix = coup_ana.cross_correlation(
...     tau_max=2)
>>> r((similarity_matrix, lag_matrix))
(array([[ 1.    ,  0.698 ,  0.7788,  0.7535],
        [ 0.4848,  1.    ,  0.4507,  0.52  ],
        [ 0.6219,  0.5704,  1.    ,  0.5996],
        [ 0.4833,  0.5503,  0.5002,  1.    ]]),
 array([[0, 2, 1, 2],
        [0, 0, 0, 0],
        [0, 2, 0, 1],
        [0, 2, 0, 0]]))
>>> r(coup_ana.symmetrize_by_absmax(similarity_matrix, lag_matrix))
(array([[ 1.    ,  0.698 ,  0.7788,  0.7535],
        [ 0.698 ,  1.    ,  0.5704,  0.5503],
        [ 0.7788,  0.5704,  1.    ,  0.5996],
        [ 0.7535,  0.5503,  0.5996,  1.    ]]),
 array([[ 0,  2,  1,  2],
        [-2,  0, -2, -2],
        [-1,  2,  0,  1],
        [-2,  2, -1,  0]]))
- Parameters:
similarity_matrix (array-like [float]) – array-like [node, node] matrix of similarity estimates
lag_matrix (array-like [int>=0]) – array-like [node, node] matrix of lags
- Return type:
tuple of arrays
- Returns:
the value at the absolute maximum and the (positive or negative) lag.
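The symmetrization rule can be sketched with nested lists: for each pair of nodes, the entry with the larger absolute value is copied to its mirror position, and the mirrored lag is negated. The function name and the tie-breaking choice (the upper-triangle entry wins) are assumptions of this sketch, which is not the library's code; the input values are the ones from the example above.

```python
def symmetrize_by_absmax_sketch(similarity, lags):
    """Symmetrize nested-list matrices in place; return (similarity, lags)."""
    n = len(similarity)
    for i in range(n):
        for j in range(i + 1, n):
            if abs(similarity[i][j]) >= abs(similarity[j][i]):
                # Peak lies in direction i --> j: mirror it with a negated lag.
                similarity[j][i] = similarity[i][j]
                lags[j][i] = -lags[i][j]
            else:
                # Peak lies in direction j --> i.
                similarity[i][j] = similarity[j][i]
                lags[i][j] = -lags[j][i]
    return similarity, lags


# Input values copied from the cross_correlation(tau_max=2) example above.
sim = [[1.0, 0.698, 0.7788, 0.7535],
       [0.4848, 1.0, 0.4507, 0.52],
       [0.6219, 0.5704, 1.0, 0.5996],
       [0.4833, 0.5503, 0.5002, 1.0]]
lag = [[0, 2, 1, 2],
       [0, 0, 0, 0],
       [0, 2, 0, 1],
       [0, 2, 0, 0]]
sim_sym, lag_sym = symmetrize_by_absmax_sketch(sim, lag)
```

Applied to these inputs, the sketch yields the same symmetrized values and signed lags as the documented example output.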