API Reference
This page provides an auto-generated summary of HYEENNA's API. For more details and examples, refer to the rest of the documentation.
Estimators
hyeenna.estimators.conditional_entropy(X: numpy.array, Y: numpy.array, k: int = 5) → float
Computes the conditional Shannon entropy of a sample of a random variable X given another sample of a random variable Y, using an adaptation of the KL and KSG estimators.
Parameters: - X (np.array) – Sample from a random variable
- Y (np.array) – Sample from a random variable
- k (int, optional) – Number of neighbors to use in estimation
Returns: cent – estimated conditional entropy
Return type: float
References
[0] Goria, M. N., Leonenko, N. N., Mergel, V. V., & Inverardi, P. L. N. (2005). A new class of random vector entropy estimators and its applications in testing statistical hypotheses. Journal of Nonparametric Statistics, 17(3), 277–297.
[1] Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6), 066138. https://doi.org/10.1103/PhysRevE.69.066138
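Example: a minimal sketch based on the signature above. The synthetic data are illustrative assumptions; since Y is informative about X here, the conditional entropy should come out below the unconditional entropy.

    import numpy as np
    from hyeenna.estimators import conditional_entropy, entropy

    # Construct X so that Y carries information about it; then
    # H(X | Y) should come out below the unconditional H(X).
    rng = np.random.default_rng(42)
    y = rng.normal(size=5000)
    x = y + 0.5 * rng.normal(size=5000)

    print(entropy(x, k=5))                 # unconditional entropy
    print(conditional_entropy(x, y, k=5))  # should be noticeably lower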
hyeenna.estimators.conditional_mutual_info(X: numpy.array, Y: numpy.array, Z: numpy.array, k: int = 5) → float
Compute the conditional mutual information of X and Y given Z.
Parameters: - X (np.array) – Sample from random variable X
- Y (np.array) – Sample from random variable Y
- Z (np.array) – Sample from random variable Z
- k (int, optional) – Number of neighbors to use in estimation
Returns: estimated conditional mutual information
Return type: float
References
[0] Vlachos, I., & Kugiumtzis, D. (2010). Non-uniform state space reconstruction and coupling detection. Physical Review E, 82(1), 016207. https://doi.org/10.1103/PhysRevE.82.016207
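Example: a minimal sketch in which X and Y are related only through a shared driver Z, so the conditional mutual information should be near zero; the construction is an illustrative assumption, not part of the API.

    import numpy as np
    from hyeenna.estimators import conditional_mutual_info, mutual_info

    # X and Y are related only through the shared driver Z, so
    # I(X; Y) is large but I(X; Y | Z) should be near zero.
    rng = np.random.default_rng(0)
    z = rng.normal(size=5000)
    x = z + 0.3 * rng.normal(size=5000)
    y = z + 0.3 * rng.normal(size=5000)

    print(mutual_info(x, y, k=5))
    print(conditional_mutual_info(x, y, z, k=5))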
hyeenna.estimators.conditional_transfer_entropy(X: numpy.array, Y: numpy.array, Z: numpy.array, tau: int = 1, omega: int = 1, nu: int = 1, k: int = 1, l: int = 1, m: int = 1, neighbors: int = 5, **kwargs) → float
Compute the transfer entropy from a source variable, X, to a target variable, Y, conditioned on other variables contained in Z.
Parameters: - X (np.array) – Source sample from a random variable X
- Y (np.array) – Target sample from a random variable Y
- Z (np.array) – Conditioning variable(s).
- tau (int (default: 1)) – Number of timestep lags for the source variable
- omega (int (default: 1)) – Number of timestep lags for the target variable conditioning
- nu (int (default: 1)) – Number of timestep lags for the source variable conditioning
- k (int (default: 1)) – Width of window for the source variable.
- l (int (default: 1)) – Width of window for the target variable conditioning.
- m (int (default: 1)) – Width of window for the source variable conditioning.
- neighbors (int (default: 5)) – Number of neighbors to use in estimation.
- **kwargs – Other arguments (undocumented, for internal usage)
Returns: conditional_transfer_entropy – Computed via conditional_mutual_info
Return type: float
References
[0] Schreiber, T. (2000). Measuring information transfer. Physical Review Letters, 85(2), 461–464. https://doi.org/10.1103/PhysRevLett.85.461
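Example: an illustrative sketch in which Z drives both X and Y at different delays; conditioning on Z should suppress the spurious X → Y transfer that unconditioned transfer entropy would report. The data and coefficients are assumptions for demonstration.

    import numpy as np
    from hyeenna.estimators import conditional_transfer_entropy

    # Z drives both X and Y at different delays; conditioning on Z
    # should suppress the spurious X -> Y transfer.
    rng = np.random.default_rng(0)
    n = 5000
    z = rng.normal(size=n)
    x = np.roll(z, 1) + 0.3 * rng.normal(size=n)
    y = np.roll(z, 2) + 0.3 * rng.normal(size=n)

    cte = conditional_transfer_entropy(x, y, z, tau=1, omega=1, nu=1,
                                       k=1, l=1, m=1, neighbors=5)
    print(cte)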
hyeenna.estimators.entropy(X: numpy.array, k: int = 5) → float
Computes the Shannon entropy of a random variable X using the KL nearest neighbor estimator.
The formula is given by:
$$ \hat{H}(X) = \psi(N) - \psi(k) + \log(C_d) + d \langle \log(\epsilon) \rangle $$
where
- $N$ is the number of samples
- $k$ is the number of neighbors
- $\psi$ is the digamma function
- $C_d$ is the volume of the $d$-dimensional unit ball
- $\langle \cdot \rangle$ is the mean
- $\epsilon_i$ is two times the distance to the $k^{th}$ nearest neighbor.
Parameters: - X (np.array) – Sample from a random variable
- k (int, optional) – Number of neighbors to use in estimation
Returns: ent – estimated entropy
Return type: float
References
[0] Goria, M. N., Leonenko, N. N., Mergel, V. V., & Inverardi, P. L. N. (2005). A new class of random vector entropy estimators and its applications in testing statistical hypotheses. Journal of Nonparametric Statistics, 17(3), 277–297. https://doi.org/10.1080/104852504200026815
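Example: a minimal sketch assuming a one-dimensional sample can be passed as a flat array; the analytic Gaussian entropy is included only as a sanity check.

    import numpy as np
    from hyeenna.estimators import entropy

    # Standard normal sample; its true differential entropy is
    # 0.5 * log(2 * pi * e), roughly 1.419 nats.
    rng = np.random.default_rng(42)
    x = rng.normal(size=5000)

    h_est = entropy(x, k=5)
    h_true = 0.5 * np.log(2 * np.pi * np.e)
    print(f"estimated: {h_est:.3f}, analytic: {h_true:.3f}")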
hyeenna.estimators.kl_divergence(P: numpy.array, Q: numpy.array, k: int = 5)
Compute the Kullback–Leibler (KL) divergence D(P|Q) between samples from P and Q.
Parameters: - P (np.array) – Sample from random variable P
- Q (np.array) – Sample from random variable Q
- k (int, optional) – Number of neighbors to use in estimation
Returns: estimated KL divergence D(P|Q)
References
[0] Wang, Q., Kulkarni, S. R., & Verdú, S. (2006). A nearest-neighbor approach to estimating divergence between continuous random vectors. In 2006 IEEE International Symposium on Information Theory. https://doi.org/10.1109/ISIT.2006.261842
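Example: an illustrative sketch comparing two unit-variance Gaussian samples, for which the analytic divergence is 0.5·(μ_P − μ_Q)² = 0.5 nats.

    import numpy as np
    from hyeenna.estimators import kl_divergence

    # Two unit-variance Gaussians with means 0 and 1; the analytic
    # divergence D(P || Q) is 0.5 nats.
    rng = np.random.default_rng(0)
    p = rng.normal(loc=0.0, size=5000)
    q = rng.normal(loc=1.0, size=5000)

    print(kl_divergence(p, q, k=5))  # should be roughly 0.5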
hyeenna.estimators.marginal_neighbors(X: numpy.array, R: numpy.array, metric='chebyshev') → list
Count the number of neighbors within a given radius R.
hyeenna.estimators.mi_local_nonuniformity_correction(X, *args, k: int = 5, alpha=1.05, **kwargs)
Compute the local nonuniformity correction factor for strongly dependent variables. This correction is calculated based on the structure of the space of k-nearest neighbors. The volume of the hyper-rectangle of the maximum-norm bounding box for the k-nearest neighbor estimation is compared to that of the hyper-rectangle bounding the principal components of the covariance matrix of the k-nearest neighbor locations.
Parameters: - X (np.array) – A sample from a random variable
- *args (List[np.array]) – Samples from random variables
- k (int, optional) – Number of neighbors to use in estimation.
- alpha (float, optional) – Sensitivity parameter for filtering non-dependent volumes
- **kwargs (np.array) – Samples from random variables
Returns: lnc – The correction factor to be subtracted from the mutual information
Return type: float
References
[0] Gao, S., Ver Steeg, G., & Galstyan, A. (2014). Efficient estimation of mutual information for strongly dependent variables. https://arxiv.org/abs/1411.2003v3
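Example: a hedged sketch of combining the correction with a raw KSG estimate, following the return description above (the factor is subtracted from the mutual information); the near-deterministic data is an illustrative assumption.

    import numpy as np
    from hyeenna.estimators import mutual_info, mi_local_nonuniformity_correction

    # Nearly deterministic dependence, the regime where the plain KSG
    # estimator underestimates MI and the LNC correction matters.
    rng = np.random.default_rng(0)
    x = rng.normal(size=5000)
    y = x + 1e-3 * rng.normal(size=5000)

    mi = mutual_info(x, y, k=5)
    lnc = mi_local_nonuniformity_correction(x, y, k=5, alpha=1.05)
    print(mi - lnc)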
hyeenna.estimators.multi_mutual_info(X: numpy.array, *args, k: int = 5, **kwargs) → float
Computes the multivariate mutual information of several random variables using the KSG nearest neighbor estimator.
The formula is given by:
$$ \hat{I}(X_1, \ldots, X_m) = (m-1)\cdot\psi(N) + \psi(k) - \frac{m-1}{k} - \langle \psi(n_{X_1}+1) + \ldots + \psi(n_{X_m}+1) \rangle $$
where
- $N$ is the number of samples
- $m$ is the number of variables
- $k$ is the number of neighbors
- $\psi$ is the digamma function
- $\langle \cdot \rangle$ is the mean
- $n_i$ is the number of points within the distance of the $k^{th}$ nearest neighbor when projected into the subspace spanned by variable $i$.
Parameters: - X (np.array) – A sample from a random variable
- *args (List[np.array]) – Samples from random variables
- k (int, optional) – Number of neighbors to use in estimation.
- **kwargs (np.array) – Samples from random variables
Returns: mi – The mutual information
Return type: float
References
[0] Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6), 066138. https://doi.org/10.1103/PhysRevE.69.066138
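Example: a minimal sketch passing three samples that share a common signal as positional arguments, per the (X, *args) signature; the data is an illustrative assumption.

    import numpy as np
    from hyeenna.estimators import multi_mutual_info

    # Three variables sharing a common signal z, passed positionally.
    rng = np.random.default_rng(0)
    z = rng.normal(size=5000)
    x1 = z + 0.5 * rng.normal(size=5000)
    x2 = z + 0.5 * rng.normal(size=5000)
    x3 = z + 0.5 * rng.normal(size=5000)

    print(multi_mutual_info(x1, x2, x3, k=5))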
hyeenna.estimators.mutual_info(X: numpy.array, Y: numpy.array, k: int = 5) → float
Computes the mutual information of two random variables, X and Y, using the KSG nearest neighbor estimator.
The formula is given by:
$$ \hat{I}(X,Y) = \psi(N) + \psi(k) - \frac{1}{k} - \langle \psi(n_X + 1) + \psi(n_Y + 1) \rangle $$
where
- $N$ is the number of samples
- $k$ is the number of neighbors
- $\psi$ is the digamma function
- $\langle \cdot \rangle$ is the mean
- $n_i$ is the number of points within the distance of the $k^{th}$ nearest neighbor when projected into the subspace spanned by $i$.
Parameters: - X (np.array) – A sample from a random variable
- Y (np.array) – A sample from a random variable
- k (int, optional) – Number of neighbors to use in estimation.
Returns: mi – The mutual information
Return type: float
References
[0] Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6), 066138. https://doi.org/10.1103/PhysRevE.69.066138
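Example: a minimal sketch with a correlated bivariate Gaussian, for which the analytic mutual information is −0.5·log(1 − ρ²), roughly 0.511 nats at ρ = 0.8; the data is an illustrative assumption.

    import numpy as np
    from hyeenna.estimators import mutual_info

    # Correlated bivariate Gaussian with correlation rho = 0.8.
    rng = np.random.default_rng(0)
    rho = 0.8
    x = rng.normal(size=5000)
    y = rho * x + np.sqrt(1 - rho**2) * rng.normal(size=5000)

    print(mutual_info(x, y, k=5))  # roughly -0.5 * log(1 - rho**2)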
hyeenna.estimators.nearest_distances(X: numpy.array, Y: numpy.array = None, k: int = 5, metric='chebyshev') → list
Compute the distance from each point to its kth nearest neighbor.
hyeenna.estimators.nearest_distances_vec(X: numpy.array, Y: numpy.array = None, k: int = 5, metric='chebyshev') → numpy.array
Find the vector of distances to all k nearest neighbors.
hyeenna.estimators.transfer_entropy(X: numpy.array, Y: numpy.array, tau: int = 1, omega: int = 1, k: int = 1, l: int = 1, neighbors: int = 5, **kwargs) → float
Compute the transfer entropy from a source variable, X, to a target variable, Y.
Parameters: - X (np.array) – Source sample from a random variable X
- Y (np.array) – Target sample from a random variable Y
- tau (int (default: 1)) – Number of timestep lags for the source variable
- omega (int (default: 1)) – Number of timestep lags for the target variable conditioning
- k (int (default: 1)) – Width of window for the source variable.
- l (int (default: 1)) – Width of window for the target variable conditioning.
- neighbors (int (default: 5)) – Number of neighbors to use in estimation.
- **kwargs – Other arguments (undocumented, for internal usage)
Returns: transfer_entropy – Computed via conditional_mutual_info
Return type: float
References
[0] Schreiber, T. (2000). Measuring information transfer. Physical Review Letters, 85(2), 461–464. https://doi.org/10.1103/PhysRevLett.85.461
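Example: an illustrative sketch in which Y is driven by X with a one-step delay, so TE(X → Y) should be clearly positive while TE(Y → X) sits near zero; the data and coefficients are assumptions for demonstration.

    import numpy as np
    from hyeenna.estimators import transfer_entropy

    # Y is driven by X with a one-step delay.
    rng = np.random.default_rng(0)
    n = 5000
    x = rng.normal(size=n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = 0.8 * x[t - 1] + 0.2 * rng.normal()

    print(transfer_entropy(x, y, tau=1, omega=1, k=1, l=1, neighbors=5))
    print(transfer_entropy(y, x, tau=1, omega=1, k=1, l=1, neighbors=5))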
Analysis
hyeenna.analysis.estimate_info_transfer_network(varlist: list, names: list, tau: int = 1, omega: int = 1, nu: int = 1, k: int = 1, l: int = 1, m: int = 1, condition: bool = True, nruns: int = 10, sample_size: int = 3000) → pandas.core.frame.DataFrame
Compute the pairwise transfer entropy for a list of given variables, resulting in an information transfer network.
Parameters: - varlist (list) – List of given variable data
- names (list) – List of names corresponding to the data given in varlist
- tau (int (default=1)) – Lag value for source variables
- omega (int (default=1)) – Lag value for conditioning target variable history
- nu (int (default=1)) – Lag value for conditioning source variable histories
- k (int (default=1)) – Window length for source variables (applied to the same variable as the tau parameter)
- l (int (default=1)) – Window length for target variable histories (applied to the same variable as the omega parameter)
- m (int (default=1)) – Window length for source conditioning variables (applied to the same variable as the nu parameter)
- condition (bool (default=True)) – Whether to condition on all variables or just the target variable history.
- nruns (int (default=10)) – Number of samples to compute for each connection. The median value is reported.
- sample_size (int (default=3000)) – Size of samples to take during estimation of transfer entropy.
Returns: df – Dataframe representing the information transfer network. Both rows and columns are populated with the given names.
Return type: pd.DataFrame
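Example: a minimal sketch on three toy series; b lags a by one step and c is independent, so the a → b cell of the returned network should dominate. The series are illustrative assumptions.

    import numpy as np
    from hyeenna.analysis import estimate_info_transfer_network

    rng = np.random.default_rng(0)
    n = 5000
    a = rng.normal(size=n)
    b = np.roll(a, 1) + 0.3 * rng.normal(size=n)  # b lags a by one step
    c = rng.normal(size=n)                        # independent series

    df = estimate_info_transfer_network(
        varlist=[a, b, c], names=["a", "b", "c"],
        tau=1, omega=1, nu=1, k=1, l=1, m=1,
        condition=True, nruns=10, sample_size=3000)
    print(df)  # rows and columns labeled with the given names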
hyeenna.analysis.estimate_timescales(X: numpy.ndarray, Y: numpy.ndarray, lag_list: list, window_list: list, sample_size: int = 5000) → pandas.core.frame.DataFrame
Compute the transfer entropy (TE) over a range of lag counts and window sizes.
Parameters: - X (np.array) – Source data
- Y (np.array) – Target data
- lag_list (list) – A list enumerating the lag counts to compute TE with
- window_list (list) – A list enumerating the window widths to compute TE with
- sample_size (int) – Number of samples to use when computing TE
Returns: out – A dataframe containing the computed transfer entropies for every combination of lag and window given in the input parameters
Return type: pd.DataFrame
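Example: an illustrative sketch where the target lags the source by three steps, so the computed TE should peak near lag 3; the data is an assumption for demonstration.

    import numpy as np
    from hyeenna.analysis import estimate_timescales

    rng = np.random.default_rng(0)
    n = 20000
    x = rng.normal(size=n)
    y = np.roll(x, 3) + 0.5 * rng.normal(size=n)  # true lag of 3 steps

    out = estimate_timescales(x, y, lag_list=[1, 2, 3, 4, 5],
                              window_list=[1, 2], sample_size=5000)
    print(out)  # TE should peak near lag 3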
hyeenna.analysis.estimator_stats(estimator: callable, data: dict, params: dict, nruns: int = 10, sample_size: int = 3000) → dict
Compute summary statistics for a given estimator over repeated runs on random subsamples of the data.
Parameters: - estimator (callable) – The estimator to compute statistics on. Suggested to be from the HYEENNA library.
- data (dict) – Input data to feed into the estimator
- params (dict) – Parameters to feed into the estimator
- nruns (int (default: 10)) – Number of times to run the estimator.
- sample_size (int (default 3000)) – Size of sample to draw from data to feed into the estimator
Returns: stats – A dictionary containing sample statistics along with the actual results from each run of the estimator.
Return type: dict
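Example: a hedged sketch. That the data dict keys match the estimator's argument names ("X", "Y") and that remaining keyword arguments go in params follows the parameter descriptions above but is an assumption, not confirmed by them.

    import numpy as np
    from hyeenna.estimators import mutual_info
    from hyeenna.analysis import estimator_stats

    rng = np.random.default_rng(0)
    x = rng.normal(size=10000)
    y = x + 0.5 * rng.normal(size=10000)

    # data holds the samples, params the remaining estimator arguments.
    stats = estimator_stats(estimator=mutual_info,
                            data={"X": x, "Y": y},
                            params={"k": 5},
                            nruns=10, sample_size=3000)
    print(stats)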
hyeenna.analysis.shuffle_test(estimator: callable, data: dict, params: dict, confidence: float = 0.99, nruns: int = 10, sample_size: int = 3000) → dict
Compute a one-tailed Z-test against a sample of shuffled surrogates.
Parameters: - estimator (callable) – The estimator to compute statistics on. Suggested to be from the HYEENNA library.
- data (dict) – Input data to feed into the estimator
- params (dict) – Parameters to feed into the estimator
- confidence (float (default: 0.99)) – Confidence level to conduct the test at.
- nruns (int (default: 10)) – Number of times to run the estimator.
- sample_size (int (default: 3000)) – Size of sample to draw from data to feed into the estimator
Returns: stats – A dictionary with statistics from the standard estimator_stats function along with statistics computed on the shuffled surrogates. Most important are the ‘test_value’ and ‘significant’ keys: the value the test is performed on, and whether the result was statistically significant at the given confidence level.
Return type: dict
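Example: a hedged sketch with the same data-dict assumption as the estimator_stats example above; a strongly dependent pair should come out significant against shuffled surrogates.

    import numpy as np
    from hyeenna.estimators import mutual_info
    from hyeenna.analysis import shuffle_test

    rng = np.random.default_rng(0)
    x = rng.normal(size=10000)
    y = x + 0.5 * rng.normal(size=10000)

    result = shuffle_test(estimator=mutual_info,
                          data={"X": x, "Y": y},
                          params={"k": 5},
                          confidence=0.99, nruns=10, sample_size=3000)
    print(result["test_value"], result["significant"])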
Plotting
class hyeenna.plot.NumpyEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)
A json.JSONEncoder subclass whose default method handles NumPy types. Credit: https://stackoverflow.com/questions/26646362/numpy-array-is-not-json-serializable
default(obj)
Implement this method in a subclass such that it returns a serializable object for obj, or calls the base implementation (to raise a TypeError). For example, to support arbitrary iterators, you could implement default like this:
    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class default method raise the TypeError
        return JSONEncoder.default(self, o)
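Example: a hedged usage sketch, assuming NumpyEncoder subclasses json.JSONEncoder (its constructor signature matches, and its default method falls back to JSONEncoder.default); the payload is illustrative.

    import json
    import numpy as np
    from hyeenna.plot import NumpyEncoder

    # NumPy arrays and scalars are not JSON-serializable by default;
    # passing the encoder through cls handles the conversion.
    payload = {"values": np.arange(3), "score": np.float64(0.5)}
    print(json.dumps(payload, cls=NumpyEncoder))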