Utilities

Contains general utility functions for the package.

Euclidean_distances(data)

Calculate the Euclidean distances between the rows of a data array.

Parameters:

  • data (ndarray) –

    Array in which the rows represent data points.

Returns:

  • ndarray

    Matrix with Euclidean distances between the rows of the data input.

Source code in pykda\utilities.py
def Euclidean_distances(data: np.ndarray) -> np.ndarray:
    """
    Calculate the Euclidean distances between the rows of a data array.

    Parameters
    ----------
    data : np.ndarray
        Array in which the rows represent data points.

    Returns
    -------
    np.ndarray
        Matrix with Euclidean distances between the rows of the data input.

    """

    return np.sqrt(((data[:, np.newaxis] - data) ** 2).sum(axis=2))
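
The broadcasting step in the listing above can be checked on a small input. The following is a minimal sketch that re-declares the function locally (under the lowercase name `euclidean_distances`, chosen here for illustration) so that it runs without the package installed. The sample points are made up.

```python
import numpy as np


def euclidean_distances(data: np.ndarray) -> np.ndarray:
    # data[:, np.newaxis] has shape (n, 1, d); subtracting data (n, d)
    # broadcasts to an (n, n, d) array of all pairwise differences.
    diffs = data[:, np.newaxis] - data
    return np.sqrt((diffs**2).sum(axis=2))


# Three collinear points spaced 5 apart (3-4-5 triangles).
points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
D = euclidean_distances(points)
print(D)
```

The result is a symmetric n x n matrix with a zero diagonal. Note that the intermediate array has shape (n, n, d), so memory use grows quadratically with the number of rows.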

Gaussian_similarity(data, scale=1)

Calculate the Gaussian similarity function between the rows of an array.

Parameters:

  • data (ndarray) –

    Array in which the rows represent data points.

  • scale (float, default: 1 ) –

    The variance of the data points is divided by this value, so larger values of scale mean smaller data neighborhoods, and vice versa. The default is 1; the value 6.5 is taken from Berkhout and Heidergott (2019).

Returns:

  • ndarray

    Array of Gaussian similarity function values between the data rows.

Source code in pykda\utilities.py
def Gaussian_similarity(data: np.ndarray, scale: float = 1) -> np.ndarray:
    """
    Calculate the Gaussian similarity function between the rows of an array.

    Parameters
    ----------
    data : np.ndarray
        Array in which the rows represent data points.
    scale : float
        The variance of the data points is divided by this value, so larger
        values of scale mean smaller data neighborhoods, and vice versa. The
        default is 1; the value 6.5 is taken from Berkhout and Heidergott
        (2019).

    Returns
    -------
    np.ndarray
        Array of Gaussian similarity function values between the data rows.

    """

    distances = Euclidean_distances(data)
    n = len(data)
    var = np.sum(np.linalg.norm(data - np.mean(data, axis=0), axis=1) ** 2) / (
        n - 1
    )
    scaled_variance = var / scale

    return np.exp(-(distances**2) / scaled_variance)
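
To illustrate how `scale` shrinks the neighborhoods, here is a minimal sketch that re-implements the computation from the listing above in a local function (named `gaussian_similarity` here for illustration). The sample points are made up, and 6.5 is used only because the documentation mentions it.

```python
import numpy as np


def gaussian_similarity(data: np.ndarray, scale: float = 1.0) -> np.ndarray:
    # Squared Euclidean distances between all pairs of rows.
    sq_dists = ((data[:, np.newaxis] - data) ** 2).sum(axis=2)
    # Total sample variance: summed squared deviations from the mean row,
    # divided by n - 1.
    n = len(data)
    var = np.sum(np.linalg.norm(data - data.mean(axis=0), axis=1) ** 2) / (n - 1)
    # Dividing the variance by scale narrows the kernel when scale > 1.
    return np.exp(-sq_dists / (var / scale))


points = np.array([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0]])
S_default = gaussian_similarity(points)           # scale = 1
S_narrow = gaussian_similarity(points, scale=6.5)
print(S_default.round(3))
print(S_narrow.round(3))
```

Each point has similarity 1 with itself, and every off-diagonal similarity under `scale=6.5` is smaller than under the default, which is what "smaller data neighborhoods" means in practice.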

create_graph_dict(A)

Creates a graph dictionary based upon adjacency matrix A, where each (i, j) for which A(i, j) > 0 is an edge by assumption.

Parameters:

  • A (ndarray) –

    An adjacency matrix.

Returns:

  • graph ( dict ) –

    graph[i] gives a list of nodes that can be reached from node i.

Source code in pykda\utilities.py
def create_graph_dict(A: np.ndarray) -> dict:
    """Creates a graph dictionary based upon adjacency matrix A, where each
    (i, j) for which A(i, j) > 0 is an edge by assumption.

    Parameters
    ----------
    A : np.ndarray
        An adjacency matrix.

    Returns
    -------
    graph : dict
        graph[i] gives a list of nodes that can be reached from node i.
    """

    return {
        i: np.where(A[i] > constants.VALUE_ZERO)[0].tolist()
        for i in range(len(A))
    }
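
As an illustration, the sketch below applies the same dictionary comprehension to a small, made-up adjacency matrix. `VALUE_ZERO` stands in for `constants.VALUE_ZERO`; its value here (1e-10) is an assumption, since the package constant is not shown on this page.

```python
import numpy as np

VALUE_ZERO = 1e-10  # assumed tolerance; the package's constant may differ


def create_graph_dict(A: np.ndarray) -> dict:
    # For every row i, collect the column indices with a positive weight.
    return {i: np.where(A[i] > VALUE_ZERO)[0].tolist() for i in range(len(A))}


A = np.array(
    [
        [0.0, 0.5, 0.5],  # node 0 -> nodes 1 and 2
        [0.0, 0.0, 1.0],  # node 1 -> node 2
        [1.0, 0.0, 0.0],  # node 2 -> node 0
    ]
)
graph = create_graph_dict(A)
print(graph)  # {0: [1, 2], 1: [2], 2: [0]}
```

The lists contain direct successors only; nodes reachable through longer paths have to be found by traversing the dictionary.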

eigenvec_centrality(A)

Compute the eigenvector centrality of a given non-negative adjacency matrix.

Assumes that the matrix A contains one connected component.

Parameters:

  • A (ndarray) –

    Adjacency matrix to calculate the eigenvector centrality of.

Returns:

  • ndarray

    The eigenvector centrality of A.

  • float

    The largest eigenvalue of A, to which the returned eigenvector belongs.

Source code in pykda\utilities.py
def eigenvec_centrality(A: np.ndarray) -> tuple[np.ndarray, float]:
    """
    Compute the eigenvector centrality of a given non-negative adjacency matrix.

    Assumes that the matrix A contains one connected component.

    Parameters
    ----------
    A : np.ndarray
        Adjacency matrix to calculate the eigenvector centrality of.

    Returns
    -------
    np.ndarray
        The eigenvector centrality of A.
    float
        The largest eigenvalue of A, to which the returned eigenvector belongs.
    """

    assert is_nonnegative_matrix(A), "Ensure the matrix elements are >= 0."

    eigenvalues, eigenvectors = np.linalg.eig(A.T)
    max_idx = np.argmax(eigenvalues)

    return eigenvectors[:, [max_idx]], eigenvalues[max_idx]
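
The vector returned by `np.linalg.eig` has unit Euclidean norm and an arbitrary sign, so some post-processing is usually needed before it can be read as a centrality score. The sketch below re-declares the function locally and normalizes the result to sum to one; that normalization step and the star-graph example are illustrative assumptions, not part of the package.

```python
import numpy as np


def eigenvec_centrality(A: np.ndarray):
    assert (A >= 0).all(), "Ensure the matrix elements are >= 0."
    # Eigenvectors of A.T are the left eigenvectors of A.
    eigenvalues, eigenvectors = np.linalg.eig(A.T)
    max_idx = np.argmax(eigenvalues)
    return eigenvectors[:, [max_idx]], eigenvalues[max_idx]


# Undirected star graph: node 0 is connected to nodes 1 and 2.
A = np.array([[0.0, 1.0, 1.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
vec, val = eigenvec_centrality(A)

# Drop any zero imaginary part, fix the sign, and rescale to sum to one.
centrality = np.abs(np.real(vec)).ravel()
centrality /= centrality.sum()
print(centrality, np.real(val))
```

Taking absolute values is safe here because, for a connected non-negative matrix, all entries of the leading eigenvector share the same sign. The hub (node 0) receives the highest score and the two leaves tie.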

expand_matrix_with_row_and_column(A)

Expands the given matrix with an extra row and column of zeros at the start.

Parameters:

  • A (ndarray) –

    Matrix to which a first row and a first column are added.

Returns:

  • ndarray

    A with a row and a column of zeros added at the start.

Source code in pykda\utilities.py
def expand_matrix_with_row_and_column(A: np.ndarray) -> np.ndarray:
    """
    Expands the given matrix with an extra row and column of zeros at the start.

    Parameters
    ----------
    A : np.ndarray
        Matrix to which to add first row and column.

    Returns
    -------
    np.ndarray
        A where a first row and column of zeros is added at the start.

    """

    extra_column = np.zeros((A.shape[0], 1))
    A = np.hstack([extra_column, A])
    extra_row = np.zeros((1, A.shape[1]))
    return np.vstack([extra_row, A])
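
A small check of the stacking logic, re-declared locally so it runs on its own. The comparison with `np.pad` is added here only as a cross-check of the result; it is not how the package implements the function.

```python
import numpy as np


def expand_matrix_with_row_and_column(A: np.ndarray) -> np.ndarray:
    # Prepend a zero column, then a zero row of the new width.
    extra_column = np.zeros((A.shape[0], 1))
    A = np.hstack([extra_column, A])
    extra_row = np.zeros((1, A.shape[1]))
    return np.vstack([extra_row, A])


A = np.array([[1.0, 2.0], [3.0, 4.0]])
expanded = expand_matrix_with_row_and_column(A)
print(expanded)

# Cross-check: pad one row before axis 0 and one column before axis 1.
padded = np.pad(A, ((1, 0), (1, 0)))
print(np.array_equal(expanded, padded))
```

The original entries end up in the lower-right block, so index (i, j) of the input becomes (i + 1, j + 1) in the output.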

has_positive_row_sums(A)

Check if the row sums of a given matrix are positive.

Parameters:

  • A (ndarray) –

    Matrix to be checked.

Returns:

  • bool

    True if the row sums of A are positive, False otherwise.

Source code in pykda\utilities.py
def has_positive_row_sums(A: np.ndarray) -> bool_:
    """
    Check if the row sums of a given matrix are positive.

    Parameters
    ----------
    A : np.ndarray
        Matrix to be checked.

    Returns
    -------
    bool
        True if the row sums of A are positive, False otherwise.

    """
    return (A.sum(axis=1) > constants.VALUE_ZERO).all()
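
One situation where such a check is useful is before dividing rows by their sums. The sketch below is an assumption about possible usage, not code from the package: `normalize_rows` is a hypothetical helper, and `VALUE_ZERO` is an assumed stand-in for `constants.VALUE_ZERO`.

```python
import numpy as np

VALUE_ZERO = 1e-10  # assumed tolerance; the package's constant may differ


def has_positive_row_sums(A: np.ndarray) -> bool:
    return bool((A.sum(axis=1) > VALUE_ZERO).all())


def normalize_rows(A: np.ndarray) -> np.ndarray:
    """Hypothetical helper: scale every row so that it sums to one."""
    if not has_positive_row_sums(A):
        raise ValueError("Every row needs a positive sum to be normalized.")
    return A / A.sum(axis=1, keepdims=True)


W = np.array([[2.0, 2.0], [1.0, 3.0]])
P = normalize_rows(W)
print(P)

# A matrix with an all-zero row fails the check.
print(has_positive_row_sums(np.array([[1.0, 0.0], [0.0, 0.0]])))
```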

is_nonnegative_matrix(A)

Check if a given matrix is non-negative.

Parameters:

  • A (ndarray) –

    Matrix to be checked.

Returns:

  • bool

    True if A is non-negative, False otherwise.

Source code in pykda\utilities.py
def is_nonnegative_matrix(A: np.ndarray) -> bool_:
    """
    Check if a given matrix is non-negative.

    Parameters
    ----------
    A : np.ndarray
        Matrix to be checked.

    Returns
    -------
    bool
        True if A is non-negative, False otherwise.

    """
    return (A >= 0).all()

is_stochastic_matrix(A)

Check if a given matrix is a stochastic matrix.

Parameters:

  • A (ndarray) –

    Matrix to be checked.

Returns:

  • bool

    True if A is a stochastic matrix, False otherwise.

Source code in pykda\utilities.py
def is_stochastic_matrix(A: np.ndarray) -> bool:
    """
    Check if a given matrix is a stochastic matrix.

    Parameters
    ----------
    A : np.ndarray
        Matrix to be checked.

    Returns
    -------
    bool
        True if A is a stochastic matrix, False otherwise.

    """
    return is_nonnegative_matrix(A) and row_sums_are_1(A)
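
The two conditions can fail independently, as the following sketch shows. The helpers are re-declared locally, and `VALUE_ZERO` is an assumed stand-in for `constants.VALUE_ZERO`. The example matrices are made up.

```python
import numpy as np

VALUE_ZERO = 1e-10  # assumed tolerance; the package's constant may differ


def is_nonnegative_matrix(A: np.ndarray) -> bool:
    return bool((A >= 0).all())


def row_sums_are_1(A: np.ndarray) -> bool:
    return bool(np.all(np.abs(A.sum(axis=1) - 1) < VALUE_ZERO))


def is_stochastic_matrix(A: np.ndarray) -> bool:
    return is_nonnegative_matrix(A) and row_sums_are_1(A)


P = np.array([[0.9, 0.1], [0.4, 0.6]])             # valid transition matrix
negative = np.array([[1.5, -0.5], [0.4, 0.6]])     # rows sum to 1, one entry < 0
unnormalized = np.array([[1.0, 1.0], [0.4, 0.6]])  # non-negative, first row sums to 2

for name, M in [("P", P), ("negative", negative), ("unnormalized", unnormalized)]:
    print(name, is_stochastic_matrix(M))
```

Only `P` passes: `negative` fails the non-negativity check, and `unnormalized` fails the row-sum check.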

perturb_stochastic_matrix(P, i, j, theta=10 ** -4)

Perturbs P towards (i, j) with rate theta according to the method from Berkhout and Heidergott (2019) "Analysis of Markov influence graphs".

Parameters:

  • P (ndarray) –

    A stochastic matrix.

  • i (int) –

    The row index of the perturbation.

  • j (int) –

    The column index of the perturbation.

  • theta (float, default: 10 ** -4 ) –

    The perturbation parameter.

Returns:

  • ndarray

    P perturbed into the direction of (i, j) with rate theta.

Source code in pykda\utilities.py
def perturb_stochastic_matrix(
    P: np.ndarray, i: int, j: int, theta: float = 10 ** (-4)
) -> np.ndarray:
    """Perturbs P towards (i, j) with rate theta according to the method
    from Berkhout and Heidergott (2019) "Analysis of Markov influence graphs".

    Parameters
    ----------
    P : np.ndarray
        A stochastic matrix.
    i : int
        The row index of the perturbation.
    j : int
        The column index of the perturbation.
    theta : float
        The perturbation parameter.

    Returns
    -------
    np.ndarray
        P perturbed into the direction of (i, j) with rate theta.
    """

    P_perturbed = P.copy()
    P_perturbed[i, :] *= 1 - theta
    P_perturbed[i, j] += theta

    return P_perturbed
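
In the listing above, row i is replaced by (1 - theta) * P[i, :] plus theta at position j, so the row still sums to one. The sketch below re-declares the function locally and checks this on a made-up matrix with a deliberately large `theta` to make the effect visible.

```python
import numpy as np


def perturb_stochastic_matrix(
    P: np.ndarray, i: int, j: int, theta: float = 1e-4
) -> np.ndarray:
    P_perturbed = P.copy()          # leave the caller's matrix untouched
    P_perturbed[i, :] *= 1 - theta  # shrink row i by a factor (1 - theta)
    P_perturbed[i, j] += theta      # move the freed mass theta to entry (i, j)
    return P_perturbed


P = np.array([[0.5, 0.5, 0.0], [0.2, 0.3, 0.5], [1.0, 0.0, 0.0]])
Q = perturb_stochastic_matrix(P, i=0, j=2, theta=0.1)
print(Q[0])           # row 0 after the perturbation
print(Q.sum(axis=1))  # every row still sums to one
```

Only row i changes, and an entry that was zero in P can become positive. Because the row is scaled in place, P should be a floating-point array; NumPy refuses the in-place multiplication by `1 - theta` on an integer array.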

row_sums_are_1(A)

Check if the row sums of a given matrix are equal to one.

Parameters:

  • A (ndarray) –

    Matrix to be checked.

Returns:

  • bool

    True if the row sums of A are equal to one, False otherwise.

Source code in pykda\utilities.py
def row_sums_are_1(A: np.ndarray) -> bool:
    """
    Check if the row sums of a given matrix are equal to one.

    Parameters
    ----------
    A : np.ndarray
        Matrix to be checked.

    Returns
    -------
    bool
        True if the row sums of A are equal to one, False otherwise.

    """
    return np.all(np.abs(A.sum(axis=1) - 1) < constants.VALUE_ZERO)
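
The comparison uses a tolerance instead of exact equality, so small floating-point errors in the row sums are accepted. A minimal sketch of that behavior follows; the value of `VALUE_ZERO` (1e-10) is an assumption, since `constants.VALUE_ZERO` is not shown on this page.

```python
import numpy as np

VALUE_ZERO = 1e-10  # assumed tolerance; the package's constant may differ


def row_sums_are_1(A: np.ndarray) -> bool:
    # Each row sum may deviate from 1 by less than VALUE_ZERO.
    return bool(np.all(np.abs(A.sum(axis=1) - 1) < VALUE_ZERO))


base = np.array([[0.25, 0.75], [0.5, 0.5]])

tiny_error = base.copy()
tiny_error[0, 0] += 1e-13   # well inside the assumed tolerance

large_error = base.copy()
large_error[0, 0] += 1e-3   # clearly outside the assumed tolerance

print(row_sums_are_1(base), row_sums_are_1(tiny_error), row_sums_are_1(large_error))
```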