API Reference

This page provides an auto-generated summary of HYEENNA's API. For more details and examples, refer to the rest of the documentation.

Estimators

hyeenna.estimators.conditional_entropy(X: numpy.array, Y: numpy.array, k: int = 5) → float[source]

Computes the conditional Shannon entropy of a sample of a random variable X given a sample of another random variable Y, using an adaptation of the KL and KSG estimators.

Parameters:
  • X (np.array) – Sample from a random variable
  • Y (np.array) – Sample from a random variable
  • k (int, optional) – Number of neighbors to use in estimation
Returns:

cent – estimated conditional entropy

Return type:

float

References

[0] Goria, M. N., Leonenko, N. N., Mergel, V. V., & Inverardi, P. L. N. (2005). A new class of random vector entropy estimators and its applications in testing statistical hypotheses. Journal of Nonparametric Statistics, 17(3), 277–297.
[1] Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E - Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics, 69(6), 16. https://doi.org/10.1103/PhysRevE.69.066138
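
A minimal usage sketch (the synthetic data, seed, and sample size below are illustrative assumptions, not part of the API):

import numpy as np
from hyeenna.estimators import conditional_entropy

rng = np.random.default_rng(42)
Y = rng.normal(size=5000)            # conditioning sample
X = Y + 0.5 * rng.normal(size=5000)  # X depends strongly on Y
# H(X|Y) should fall well below the unconditional entropy of X
print(conditional_entropy(X, Y, k=5))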

hyeenna.estimators.conditional_mutual_info(X: numpy.array, Y: numpy.array, Z: numpy.array, k: int = 5) → float[source]

Compute the conditional mutual information of X and Y given Z.

Parameters:
  • X (np.array) – Sample from random variable X
  • Y (np.array) – Sample from random variable Y
  • Z (np.array) – Sample from random variable Z
  • k (int, optional) – Number of neighbors to use in estimation
Returns:

estimated conditional mutual information

Return type:

float

References

[0] Vlachos, I., & Kugiumtzis, D. (2010). Non-uniform state space reconstruction and coupling detection. Physical Review E, 82, 016207. https://doi.org/10.1103/PhysRevE.82.016207
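
An illustrative sketch (synthetic data; X and Y are linked only through Z, so the conditional mutual information should be near zero):

import numpy as np
from hyeenna.estimators import conditional_mutual_info

rng = np.random.default_rng(42)
Z = rng.normal(size=5000)
X = Z + 0.1 * rng.normal(size=5000)  # driven by Z
Y = Z + 0.1 * rng.normal(size=5000)  # driven by Z
print(conditional_mutual_info(X, Y, Z, k=5))  # ~0 once Z is known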

hyeenna.estimators.conditional_transfer_entropy(X: numpy.array, Y: numpy.array, Z: numpy.array, tau: int = 1, omega: int = 1, nu: int = 1, k: int = 1, l: int = 1, m: int = 1, neighbors: int = 5, **kwargs) → float[source]

Compute the transfer entropy from a source variable, X, to a target variable, Y, conditioned on other variables contained in Z.

Parameters:
  • X (np.array) – Source sample from a random variable X
  • Y (np.array) – Target sample from a random variable Y
  • Z (np.array) – Conditioning variable(s).
  • tau (int (default: 1)) – Number of timestep lags for the source variable
  • omega (int (default: 1)) – Number of timestep lags for the target variable conditioning
  • nu (int (default: 1)) – Number of timestep lags for the source variable conditioning
  • k (int (default: 1)) – Width of window for the source variable.
  • l (int (default: 1)) – Width of window for the target variable conditioning.
  • m (int (default: 1)) – Width of window for the source variable conditioning.
  • neighbors (int (default: 5)) – Number of neighbors to use in estimation.
  • **kwargs – Other arguments (undocumented, for internal usage)
Returns:

conditional_transfer_entropy – Computed via conditional_mutual_info

Return type:

float

References

[0] Schreiber, T. (2000). Measuring information transfer. Physical Review Letters, 85(2), 461–464. https://doi.org/10.1103/PhysRevLett.85.461
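
A sketch with synthetic series (illustrative only; np.roll wraps the series around, which is acceptable for a toy example):

import numpy as np
from hyeenna.estimators import conditional_transfer_entropy

rng = np.random.default_rng(42)
n = 5000
Z = rng.normal(size=n)
X = rng.normal(size=n)
Y = np.roll(X, 1) + 0.5 * Z + 0.1 * rng.normal(size=n)  # Y lags X by one step
print(conditional_transfer_entropy(X, Y, Z, tau=1, omega=1, nu=1,
                                   k=1, l=1, m=1, neighbors=5))
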
hyeenna.estimators.entropy(X: numpy.array, k: int = 5) → float[source]

Computes the Shannon entropy of a random variable X using the KL (Kozachenko-Leonenko) nearest neighbor estimator.

The formula is given by:
$$ \hat{H}(X) = \psi(N) - \psi(k) + \log(C_d) + d \langle \log(\epsilon) \rangle $$

where
  • $N$ is the number of samples
  • $k$ is the number of neighbors
  • $\psi$ is the digamma function
  • $\langle \cdot \rangle$ is the mean
  • $C_d$ is the volume of the $d$-dimensional unit ball (in the norm used for distances)
  • $\epsilon_i$ is 2 times the distance to the $k^{th}$ nearest neighbor
Parameters:
  • X (np.array) – Sample from a random variable
  • k (int, optional) – Number of neighbors to use in estimation
Returns:

ent – estimated entropy

Return type:

float

References

[0] Goria, M. N., Leonenko, N. N., Mergel, V. V., & Inverardi, P. L. N. (2005). A new class of random vector entropy estimators and its applications in testing statistical hypotheses. Journal of Nonparametric Statistics, 17(3), 277–297. https://doi.org/10.1080/104852504200026815
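
A sanity-check sketch against the analytic entropy of a standard normal (this assumes the estimate is returned in nats, i.e. natural logarithms):

import numpy as np
from hyeenna.estimators import entropy

rng = np.random.default_rng(42)
X = rng.normal(size=10000)
analytic = 0.5 * np.log(2 * np.pi * np.e)  # H of N(0, 1), roughly 1.42 nats
print(entropy(X, k=5), analytic)           # the two should be close
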
hyeenna.estimators.kl_divergence(P: numpy.array, Q: numpy.array, k: int = 5)[source]

Compute the Kullback-Leibler (KL) divergence between two sampled distributions, P and Q.

Parameters:
  • P (np.array) – Sample from random variable P
  • Q (np.array) – Sample from random variable Q
  • k (int, optional) – Number of neighbors to use in estimation
Returns:

estimated KL divergence D(P||Q)

Return type:

float

References

[0] Wang, Q., Kulkarni, S. R., & Verdu, S. (2006). A Nearest-Neighbor Approach to Estimating Divergence between Continuous Random Vectors. In 2006 IEEE International Symposium on Information Theory. https://doi.org/10.1109/ISIT.2006.261842
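
A sanity-check sketch against the analytic divergence of two unit-variance Gaussians (again assuming estimates in nats):

import numpy as np
from hyeenna.estimators import kl_divergence

rng = np.random.default_rng(42)
P = rng.normal(loc=0.0, size=5000)
Q = rng.normal(loc=1.0, size=5000)
analytic = 0.5  # D(N(0,1) || N(1,1)) = (0 - 1)^2 / 2 = 0.5 nats
print(kl_divergence(P, Q, k=5), analytic)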

hyeenna.estimators.marginal_neighbors(X: numpy.array, R: numpy.array, metric='chebyshev') → list[source]

Number of neighbors within the radius given by R.

hyeenna.estimators.mi_local_nonuniformity_correction(X, *args, k: int = 5, alpha=1.05, **kwargs)[source]

Compute the local nonuniformity correction (LNC) factor for strongly dependent variables. The correction is calculated from the structure of the space of k-nearest neighbors: the volume of the maximum-norm bounding hyper-rectangle used in the k-nearest neighbor estimation is compared to the volume of the hyper-rectangle bounding the principal components of the covariance matrix of the k-nearest neighbor locations.

Parameters:
  • X (np.array) – A sample from a random variable
  • *args (List[np.array]) – Samples from random variables
  • k (int, optional) – Number of neighbors to use in estimation.
  • alpha (float, optional) – Sensitivity parameter for filtering non-dependent volumes
  • **kwargs (np.array) – Samples from random variables
Returns:

lnc – The correction factor to be subtracted from the mutual information

Return type:

float

References

[0] Gao, S., Steeg, G. V., & Galstyan, A. (2014). Efficient Estimation of Mutual Information for Strongly Dependent Variables. Retrieved from https://arxiv.org/abs/1411.2003v3
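
A sketch of applying the correction (the subtraction follows the description of the return value above; the data are illustrative):

import numpy as np
from hyeenna.estimators import mutual_info, mi_local_nonuniformity_correction

rng = np.random.default_rng(42)
X = rng.normal(size=5000)
Y = X + 0.01 * rng.normal(size=5000)  # nearly deterministic dependence
mi = mutual_info(X, Y, k=5)
lnc = mi_local_nonuniformity_correction(X, Y, k=5, alpha=1.05)
print(mi - lnc)  # LNC-corrected mutual information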

hyeenna.estimators.multi_mutual_info(X: numpy.array, *args, k: int = 5, **kwargs) → float[source]

Computes the multivariate mutual information of several random variables using the KSG nearest neighbor estimator.

The formula is given by:
$$ \hat{I}(X_1, \ldots, X_m) = (m-1) \cdot \psi(N) + \psi(k) - \frac{m-1}{k} - \langle \psi(n_{X_1} + 1) + \ldots + \psi(n_{X_m} + 1) \rangle $$

where
  • $N$ is the number of samples
  • $m$ is the number of variables
  • $k$ is the number of neighbors
  • $\psi$ is the digamma function
  • $\langle \cdot \rangle$ is the mean
  • $n_{X_i}$ is the number of points within the distance of the $k^{th}$ nearest neighbor when projected into the subspace spanned by $X_i$
Parameters:
  • X (np.array) – A sample from a random variable
  • *args (List[np.array]) – Samples from random variables
  • k (int, optional) – Number of neighbors to use in estimation.
  • **kwargs (np.array) – Samples from random variables
Returns:

mi – The mutual information

Return type:

float

References

[0] Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E - Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics, 69(6), 16. https://doi.org/10.1103/PhysRevE.69.066138
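
An illustrative sketch with three variables sharing a common driver (extra samples are passed through *args):

import numpy as np
from hyeenna.estimators import multi_mutual_info

rng = np.random.default_rng(42)
W = rng.normal(size=5000)            # shared driver
X = W + 0.5 * rng.normal(size=5000)
Y = W + 0.5 * rng.normal(size=5000)
Z = W + 0.5 * rng.normal(size=5000)
print(multi_mutual_info(X, Y, Z, k=5))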

hyeenna.estimators.mutual_info(X: numpy.array, Y: numpy.array, k: int = 5) → float[source]

Computes the mutual information of two random variables, X and Y, using the KSG nearest neighbor estimator.

The formula is given by:
$$ \hat{I}(X,Y) = \psi(N) + \psi(k) - \frac{1}{k} - \langle \psi(n_X + 1) + \psi(n_Y + 1) \rangle $$

where
  • $N$ is the number of samples
  • $k$ is the number of neighbors
  • $\psi$ is the digamma function
  • $\langle \cdot \rangle$ is the mean
  • $n_X$ and $n_Y$ are the number of points within the distance of the $k^{th}$ nearest neighbor when projected into the X and Y subspaces, respectively
Parameters:
  • X (np.array) – A sample from a random variable
  • Y (np.array) – A sample from a random variable
  • k (int, optional) – Number of neighbors to use in estimation.
Returns:

mi – The mutual information

Return type:

float

References

[0] Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E - Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics, 69(6), 16. https://doi.org/10.1103/PhysRevE.69.066138
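
A sanity-check sketch against the analytic mutual information of correlated Gaussians (assuming estimates in nats):

import numpy as np
from hyeenna.estimators import mutual_info

rng = np.random.default_rng(42)
rho = 0.8
X = rng.normal(size=10000)
Y = rho * X + np.sqrt(1 - rho**2) * rng.normal(size=10000)
analytic = -0.5 * np.log(1 - rho**2)  # MI of bivariate Gaussians with correlation rho
print(mutual_info(X, Y, k=5), analytic)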

hyeenna.estimators.nearest_distances(X: numpy.array, Y: numpy.array = None, k: int = 5, metric='chebyshev') → list[source]

Compute the distance from each point to its kth nearest neighbor.

hyeenna.estimators.nearest_distances_vec(X: numpy.array, Y: numpy.array = None, k: int = 5, metric='chebyshev') → numpy.array[source]

Find the vector of distances to all k nearest neighbors.

hyeenna.estimators.transfer_entropy(X: numpy.array, Y: numpy.array, tau: int = 1, omega: int = 1, k: int = 1, l: int = 1, neighbors: int = 5, **kwargs) → float[source]

Compute the transfer entropy from a source variable, X, to a target variable, Y.

Parameters:
  • X (np.array) – Source sample from a random variable X
  • Y (np.array) – Target sample from a random variable Y
  • tau (int (default: 1)) – Number of timestep lags for the source variable
  • omega (int (default: 1)) – Number of timestep lags for the target variable conditioning
  • k (int (default: 1)) – Width of window for the source variable.
  • l (int (default: 1)) – Width of window for the target variable conditioning.
  • neighbors (int (default: 5)) – Number of neighbors to use in estimation.
  • **kwargs – Other arguments (undocumented, for internal usage)
Returns:

transfer_entropy – Computed via conditional_mutual_info

Return type:

float

References

[0] Schreiber, T. (2000). Measuring information transfer. Physical Review Letters, 85(2), 461–464. https://doi.org/10.1103/PhysRevLett.85.461
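
An illustrative sketch where X drives Y with a one-step lag, so the transfer entropy should be strongly asymmetric:

import numpy as np
from hyeenna.estimators import transfer_entropy

rng = np.random.default_rng(42)
n = 5000
X = rng.normal(size=n)
Y = np.roll(X, 1) + 0.25 * rng.normal(size=n)  # Y follows X by one step
te_xy = transfer_entropy(X, Y, tau=1, omega=1, k=1, l=1, neighbors=5)
te_yx = transfer_entropy(Y, X, tau=1, omega=1, k=1, l=1, neighbors=5)
print(te_xy, te_yx)  # expect te_xy >> te_yx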

Analysis

hyeenna.analysis.estimate_info_transfer_network(varlist: list, names: list, tau: int = 1, omega: int = 1, nu: int = 1, k: int = 1, l: int = 1, m: int = 1, condition: bool = True, nruns: int = 10, sample_size: int = 3000) → pandas.core.frame.DataFrame[source]

Compute the pairwise transfer entropy for a list of given variables, resulting in an information transfer network.

Parameters:
  • varlist (list) – List of given variable data
  • names (list) – List of names corresponding to the data given in varlist
  • tau (int (default=1)) – Lag value for source variables
  • omega (int (default=1)) – Lag value for conditioning target variable history
  • nu (int (default=1)) – Lag value for conditioning source variable histories
  • k (int (default=1)) – Window length for source variables (applied to the same variable as the tau parameter)
  • l (int (default=1)) – Window length for target variable histories (applied to the same variable as the omega parameter)
  • m (int (default=1)) – Window length for source conditioning variables (applied to the same variable as the nu parameter)
  • condition (bool (default=True)) – Whether to condition on all variables, or just the target variable history.
  • nruns (int (default=10)) – Number of samples to compute for each connection. The median value is reported.
  • sample_size (int (default=3000)) – Size of samples to take during estimation of transfer entropy.
Returns:

df – Dataframe representing the information transfer network. Both rows and columns are populated with the given names.

Return type:

pd.DataFrame
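
A sketch of building a small network from three synthetic series (the reduced nruns and sample_size here are just to keep the example fast):

import numpy as np
from hyeenna.analysis import estimate_info_transfer_network

rng = np.random.default_rng(42)
n = 5000
a = rng.normal(size=n)
b = np.roll(a, 1) + 0.5 * rng.normal(size=n)  # a drives b
c = rng.normal(size=n)                        # independent series
df = estimate_info_transfer_network([a, b, c], ['a', 'b', 'c'],
                                    nruns=5, sample_size=1000)
print(df)  # rows and columns are labeled 'a', 'b', 'c'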

hyeenna.analysis.estimate_timescales(X: numpy.ndarray, Y: numpy.ndarray, lag_list: list, window_list: list, sample_size: int = 5000) → pandas.core.frame.DataFrame[source]

Compute the transfer entropy (TE) over a range of lag counts and window sizes.

Parameters:
  • X (np.array) – Source data
  • Y (np.array) – Target data
  • lag_list (list) – A list enumerating the lag counts to compute TE with
  • window_list (list) – A list enumerating the window widths to compute TE with
  • sample_size (int) – Number of samples to use when computing TE
Returns:

out – A dataframe containing the computed transfer entropies for every combination of lag and window given in the input parameters

Return type:

pd.DataFrame
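
An illustrative sweep over candidate lags and window widths for a pair of series coupled at a three-step lag:

import numpy as np
from hyeenna.analysis import estimate_timescales

rng = np.random.default_rng(42)
n = 5000
X = rng.normal(size=n)
Y = np.roll(X, 3) + 0.25 * rng.normal(size=n)  # coupling at lag 3
out = estimate_timescales(X, Y, lag_list=[1, 2, 3, 4, 5],
                          window_list=[1, 2], sample_size=2000)
print(out)  # TE for every lag/window combination; lag 3 should stand out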

hyeenna.analysis.estimator_stats(estimator: callable, data: dict, params: dict, nruns: int = 10, sample_size: int = 3000) → dict[source]

Compute summary statistics for a given estimator over repeated runs on random subsamples of the data.

Parameters:
  • estimator (callable) – The estimator to compute statistics on. Suggested to be from the HYEENNA library.
  • data (dict) – Input data to feed into the estimator
  • params (dict) – Parameters to feed into the estimator
  • nruns (int (default: 10)) – Number of times to run the estimator.
  • sample_size (int (default 3000)) – Size of sample to draw from data to feed into the estimator
Returns:

stats – A dictionary containing sample statistics along with the actual results from each run of the estimator.

Return type:

dict
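
A sketch of wrapping mutual_info (the assumption here, not stated above, is that the keys of data and params mirror the estimator's argument names):

import numpy as np
from hyeenna.estimators import mutual_info
from hyeenna.analysis import estimator_stats

rng = np.random.default_rng(42)
X = rng.normal(size=10000)
Y = X + 0.5 * rng.normal(size=10000)
# assumed convention: data/params keys match mutual_info's signature
stats = estimator_stats(mutual_info, data={'X': X, 'Y': Y},
                        params={'k': 5}, nruns=10, sample_size=3000)
print(stats)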

hyeenna.analysis.shuffle_test(estimator: callable, data: dict, params: dict, confidence: float = 0.99, nruns: int = 10, sample_size: int = 3000) → dict[source]

Compute a one-tailed Z-test against a sample of shuffled surrogates.

Parameters:
  • estimator (callable) – The estimator to compute statistics on. Suggested to be from the HYEENNA library.
  • data (dict) – Input data to feed into the estimator
  • params (dict) – Parameters to feed into the estimator
  • confidence (float (default: 0.99)) – Confidence level to conduct the test at.
  • nruns (int (default: 10)) – Number of times to run the estimator.
  • sample_size (int (default: 3000)) – Size of sample to draw from data to feed into the estimator
Returns:

stats – A dictionary with statistics from the standard estimator_stats function, along with statistics computed on the shuffled surrogates. The most important keys are ‘test_value’ and ‘significant’: the value the test is performed on, and whether the result was statistically significant at the given confidence level.

Return type:

dict
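
A sketch of a significance test for a mutual information estimate (same assumed key convention as in the estimator_stats example above):

import numpy as np
from hyeenna.estimators import mutual_info
from hyeenna.analysis import shuffle_test

rng = np.random.default_rng(42)
X = rng.normal(size=10000)
Y = X + 0.5 * rng.normal(size=10000)
result = shuffle_test(mutual_info, data={'X': X, 'Y': Y}, params={'k': 5},
                      confidence=0.99, nruns=10, sample_size=3000)
print(result['test_value'], result['significant'])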

Plotting

class hyeenna.plot.NumpyEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]

Credit: https://stackoverflow.com/questions/26646362/numpy-array-is-not-json-serializable

default(obj)[source]

Implement this method in a subclass such that it returns a serializable object for obj, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

from json import JSONEncoder  # required for the base-class fallback below

def default(self, o):
    try:
        iterable = iter(o)      # succeeds for any iterable (e.g. numpy arrays)
    except TypeError:
        pass
    else:
        return list(iterable)   # serialize iterables as JSON lists
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
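
A minimal sketch of serializing a dict containing a numpy array (assuming, per the linked recipe, that NumpyEncoder converts arrays to lists):

import json
import numpy as np
from hyeenna.plot import NumpyEncoder

payload = {'name': 'entropy_results', 'samples': np.arange(3)}
# cls= hands json.dumps our encoder for types it cannot handle natively
print(json.dumps(payload, cls=NumpyEncoder))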