API
Preprocessing
- pyemb.preprocessing.find_connected_components(A, attributes, n_components=None)
Find connected components of a multipartite graph.
- Parameters:
A (scipy.sparse.csr_matrix) – The matrix of the graph.
attributes (list of lists) – The attributes of the nodes. The first list contains the attributes of the nodes in rows. The second list contains the attributes of the nodes in the columns.
n_components (int) – The number of components to be found.
- Returns:
The matrices of the connected components and their attributes.
- Return type:
list of scipy.sparse.csr_matrix, list of lists
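As a rough illustration of what a connected component means for a bipartite matrix, the sketch below treats row nodes and column nodes as separate vertices and runs a breadth-first search. It is not pyemb's implementation: the helper name bipartite_components and the dense list-of-lists input are assumptions made for the example.

```python
from collections import deque


def bipartite_components(A):
    """Group the row and column nodes of a bipartite matrix into components.

    A is a dense list of lists; A[i][j] != 0 links row node i to column node j.
    Returns a list of (row_indices, col_indices) pairs.
    Hypothetical helper, for illustration only.
    """
    n_rows = len(A)
    n_cols = len(A[0]) if n_rows else 0
    seen_rows = [False] * n_rows
    seen_cols = [False] * n_cols
    components = []

    for start in range(n_rows):
        if seen_rows[start]:
            continue
        rows, cols = [], []
        queue = deque([("row", start)])
        seen_rows[start] = True
        while queue:
            side, idx = queue.popleft()
            if side == "row":
                rows.append(idx)
                for j in range(n_cols):
                    if A[idx][j] != 0 and not seen_cols[j]:
                        seen_cols[j] = True
                        queue.append(("col", j))
            else:
                cols.append(idx)
                for i in range(n_rows):
                    if A[i][idx] != 0 and not seen_rows[i]:
                        seen_rows[i] = True
                        queue.append(("row", i))
        components.append((sorted(rows), sorted(cols)))

    # Column nodes without any edge form single-node components.
    for j in range(n_cols):
        if not seen_cols[j]:
            components.append(([], [j]))
    return components


A = [
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
]
components = bipartite_components(A)
```

Here rows 0 and 1 share column 0, row 2 is linked only to column 1, and column 2 is isolated, so three components are found.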
- pyemb.preprocessing.find_subgraph(A, attributes, subgraph_attributes)
Find a subgraph of a multipartite graph.
- Parameters:
A (scipy.sparse.csr_matrix) – The matrix of the multipartite graph.
attributes (list of lists) – The attributes of the nodes. The first list contains the attributes of the nodes in rows. The second list contains the attributes of the nodes in the columns.
subgraph_attributes (list of lists) – The attributes of the nodes wanted in the subgraph. The first list contains the attributes of the nodes wanted in the rows. The second list contains the attributes of the nodes wanted in the columns.
- Returns:
The matrix and attributes of the subgraph.
- Return type:
scipy.sparse.csr_matrix, list of lists
- pyemb.preprocessing.graph_from_dataframes(tables, relationship_cols, same_attribute=False, dynamic_col=None, weight_col=None, join_token='::')
Create a graph from a list of tables and relationships.
- Parameters:
tables (pandas.DataFrame or list of pandas.DataFrames) – Dataframe of relationships or list of dataframes. The column names of the dataframe(s) indicate the partition of the entities therein.
relationship_cols (list of lists) – The pairs of partitions we are interested in. This can be in one of two formats. Either a list of pairs of partitions, e.g. [['A','B'], ['C','B']], where each pair is looked for in each table; this allows for the case where the same relationship appears in different tables. Or len(relationship_cols) == len(tables), where the pairs of partitions to create relationships from for each table are given at the corresponding index of the list.
same_attribute (bool) – Whether the entities in the columns are from the same attribute. This allows for intra-partition relationships.
dynamic_col (str or list of str) – The name of the column containing the time information. If a list is given, then dynamic_col[i] is the name of the time column for tables[i]. If None, the time information is not used.
weight_col (str or list of str) – The name of the column containing the edge weight information. If a list is given, then weight_col[i] is the name of the weight column for tables[i]. If None, the weight information is not used.
join_token (str) – The token used to join the names of the partitions and the names of the entities to create a unique ID. Default is '::'.
- Returns:
A (scipy.sparse.csr_matrix) – The adjacency matrix of the graph.
attributes (list of lists) – The attributes of the nodes. The first list contains the attributes of the nodes in the rows. The second list contains the attributes of the nodes in the columns.
Examples
>>> import pandas as pd
>>> import pyemb as eb
>>> # Create dataframes
>>> df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
>>> df2 = pd.DataFrame({'A': [1, 2, 3], 'C': [7, 8, 9]})
>>> # Create graph from dataframes
>>> A, attributes = eb.graph_from_dataframes([df1, df2], [['A', 'B'], ['A', 'C']])
>>> print(A.todense())
>>> print(attributes)
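The role of join_token can be pictured with a small sketch. It assumes, without confirmation from the source, that an ID is built as the partition name, then the token, then the entity name. make_node_id is a hypothetical helper, not a pyemb function.

```python
def make_node_id(partition, entity, join_token="::"):
    """Build a node ID such as 'A::1' (assumed format, for illustration)."""
    return f"{partition}{join_token}{entity}"


# Entity 1 appears in both partition 'A' and partition 'B'; prefixing the
# partition name keeps the two nodes distinct.
ids = [make_node_id("A", 1), make_node_id("B", 1)]

# Splitting on the token recovers the partition and the entity (as strings).
partition, entity = ids[0].split("::")
```

A token that cannot occur inside partition or entity names keeps the split unambiguous.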
- pyemb.preprocessing.largest_cc_of(A, attributes, partition, dynamic=False)
Find the connected component containing the most nodes from a partition.
- Parameters:
A (scipy.sparse.csr_matrix) – The matrix of the graph.
attributes (list of lists) – The attributes of the nodes. The first list contains the attributes of the nodes in rows. The second list contains the attributes of the nodes in the columns.
partition (str) – The partition to be searched.
dynamic (bool) – Whether we want the connected component containing the most nodes from the dynamic part or not.
- Returns:
The matrix of the connected component and its attributes.
- Return type:
scipy.sparse.csr_matrix, list of lists
- pyemb.preprocessing.text_matrix_and_attributes(data, column_name, remove_stopwords=True, clean_text=True, remove_email_addresses=False, update_stopwords=None, **kwargs)
Create a matrix from a dataframe containing text data. This needs to have one column containing the text data and each row should be a document.
- Parameters:
data (pandas.DataFrame) – The data to be used to create the matrix.
column_name (str) – The name of the column containing the text data.
remove_stopwords (bool) – Whether to remove stopwords. Default is True.
clean_text (bool) – Whether to clean the text data. Default is True. This removes symbols and makes everything lower case.
remove_email_addresses (bool) – Whether to remove email addresses. Default is False.
update_stopwords (list of str) – The list of additional stopwords to be removed. Default is None.
kwargs (dict) – Other arguments to be passed to sklearn.feature_extraction.text.TfidfVectorizer.
- Returns:
The matrix created from the text data and the attributes of the nodes. The first list contains the attributes of the nodes in rows. The second list contains the attributes of the nodes in the columns.
- Return type:
numpy.ndarray, list of lists
- pyemb.preprocessing.time_series_matrix_and_attributes(data, time_col, drop_nas=True)
Create a matrix from a time series.
- Parameters:
data (pandas.DataFrame) – The data to be used to create the matrix.
time_col (str) – The name of the column containing the time information.
drop_nas (bool) – Whether to drop rows with missing values. Default is
True
.
- Returns:
The matrix created from the time series and the attributes of the nodes. The first list contains the attributes of the nodes in rows. The second list contains the attributes of the nodes in the columns.
- Return type:
numpy.ndarray, list of lists
- pyemb.preprocessing.to_networkx(A, attributes, symmetric=None)
Convert a multipartite graph to a networkx graph.
Embedding
- pyemb.embedding.ISE(As, d, flat=True, procrustes=False, consistent_orientation=True)
Computes the independent spectral embedding (ISE), i.e. a separate spectral embedding for each adjacency snapshot.
- Parameters:
As (numpy.ndarray) – An adjacency matrix series of shape (T, n, n).
d (int) – Embedding dimension.
flat (bool, optional) – Whether to return a flat embedding (n*T, d) or a 3D embedding (T, n, d). Default is True.
procrustes (bool, optional) – Whether to align each embedding with the previous embedding. Default is False.
consistent_orientation (bool, optional) – Whether to ensure the eigenvector orientation is consistent. Default is True.
- Returns:
Dynamic embedding of shape (n*T, d) or (T, n, d).
- Return type:
numpy.ndarray
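The relationship between the flat (n*T, d) layout and the 3D (T, n, d) layout can be sketched as follows. The sketch assumes the flat embedding stacks the T time blocks one after another, so node i at time t sits in row t*n + i; this ordering is an assumption, and flat_to_3d and flat_row are hypothetical helpers written with plain lists.

```python
def flat_to_3d(flat, n, T):
    """Split a flat (n*T, d) embedding into T blocks of n rows each."""
    if len(flat) != n * T:
        raise ValueError("expected n*T rows")
    return [flat[t * n:(t + 1) * n] for t in range(T)]


def flat_row(t, i, n):
    """Row of the flat embedding holding node i at time t (assumed ordering)."""
    return t * n + i


n, T = 2, 3
# Six rows of a d = 2 embedding; the values encode (time, node) for readability.
flat = [[t, i] for t in range(T) for i in range(n)]
cube = flat_to_3d(flat, n, T)
```

With NumPy arrays the same conversion would be a reshape, under the same ordering assumption.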
- pyemb.embedding.OMNI(As, d, flat=True, sparse_matrix=False)
Computes the omnibus dynamic spectral embedding. For more details, see: https://arxiv.org/abs/1705.09355
- Parameters:
As (numpy.ndarray) – Adjacency matrices of shape (T, n, n).
d (int) – Embedding dimension.
flat (bool, optional) – Whether to return a flat embedding (n*T, d) or a 3D embedding (T, n, d). Default is True.
sparse_matrix (bool, optional) – Whether to use sparse matrices. Default is False.
- Returns:
Dynamic embedding of shape (n*T, d) or (T, n, d).
- Return type:
numpy.ndarray
- pyemb.embedding.UASE(As, d, flat=True, sparse_matrix=False, return_left=False)
Computes the unfolded adjacency spectral embedding (UASE). For more details, see: https://arxiv.org/abs/2007.10455 https://arxiv.org/abs/2106.01282
- Parameters:
As (numpy.ndarray) – An adjacency matrix series of shape (T, n, n).
d (int) – Embedding dimension.
flat (bool, optional) – Whether to return a flat embedding (n*T, d) or a 3D embedding (T, n, d). Default is True.
sparse_matrix (bool, optional) – Whether the adjacency matrices are sparse. Default is False.
return_left (bool, optional) – Whether to return the left (anchor) embedding as well as the right (dynamic) embedding. Default is False.
- Returns:
numpy.ndarray – Dynamic embedding of shape (n*T, d) or (T, n, d).
numpy.ndarray, optional – Anchor embedding of shape (n, d) if return_left is True.
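The "unfolded" matrix behind UASE is usually described as the column-wise concatenation of the T snapshots into a single (n, n*T) matrix, whose left singular vectors give the anchor embedding and whose right singular vectors give the dynamic embedding. The sketch below shows only the unfolding step, with plain lists. unfold is a hypothetical helper, and how pyemb builds the matrix internally is not stated in the source.

```python
def unfold(As):
    """Concatenate T matrices of shape (n, n) column-wise into one (n, n*T) matrix."""
    n = len(As[0])
    return [[value for A in As for value in A[i]] for i in range(n)]


# Two snapshots of a two-node graph.
As = [
    [[0, 1], [1, 0]],
    [[0, 0], [0, 1]],
]
unfolded = unfold(As)
```

Each row of the unfolded matrix collects one node's connections across all time points, which is why a single anchor position per node comes out of the decomposition.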
- pyemb.embedding.dyn_embed(As, d=50, method='UASE', regulariser='auto', flat=True)
Computes the dynamic embedding using a specified method.
- Parameters:
As (numpy.ndarray or list) – An adjacency matrix series which is either a numpy array of shape (T, n, n), a list of numpy arrays of shape (n, n), or a series of CSR matrices.
d (int, optional) – Embedding dimension. Default is 50.
method (str, optional) – The embedding method to use. Options are ISE, ISE PROCRUSTES, UASE, OMNI, ULSE, URLSE, RANDOM. Default is UASE.
regulariser (float or auto, optional) – Regularisation parameter for the Laplacian matrix. If auto, the regulariser is set to the average node degree. Default is auto.
flat (bool, optional) – Whether to return a flat embedding (n*T, d) or a 3D embedding (T, n, d). Default is True.
- Returns:
Dynamic embedding of shape (n*T, d) or (T, n, d).
- Return type:
numpy.ndarray
- Raises:
Exception – If the specified method is not recognized.
- pyemb.embedding.eigen_decomp(A, dim=None)
Perform eigenvalue decomposition of a matrix.
- Parameters:
A (numpy.ndarray) – The matrix to be decomposed.
dim (int) – The number of eigenvalues and eigenvectors to be returned. If None, all eigenvalues and eigenvectors are returned.
- Returns:
eigenvalues (numpy.ndarray) – The eigenvalues.
eigenvectors (numpy.ndarray) – The eigenvectors.
- pyemb.embedding.embed(Y, d=50, version='sqrt', return_right=False, flat=True, make_laplacian=False, regulariser=0)
Embed a matrix.
- Parameters:
Y (numpy.ndarray or list of numpy.ndarray) – The matrix to embed.
d (int) – The number of dimensions to embed into.
version (str) – Whether to take the square root of the singular values. Options are full or sqrt (default).
return_right (bool) – Whether to return the right embedding.
flat (bool) – Whether to return a flat embedding (n*T, d) or a 3D embedding (T, n, d).
make_laplacian (bool) – Whether to use the Laplacian matrix.
regulariser (float) – The regulariser to be added to the degrees of the nodes (only used if make_laplacian=True).
- Returns:
numpy.ndarray – The left embedding.
numpy.ndarray – The right embedding.
- pyemb.embedding.regularised_ULSE(As, d, regulariser='auto', flat=True, sparse_matrix=False, return_left=False)
Computes the regularised unfolded Laplacian spectral embedding (regularised ULSE).
- Parameters:
As (numpy.ndarray) – An adjacency matrix series of shape (T, n, n).
d (int) – Embedding dimension.
regulariser (float, optional) – Regularisation parameter for the Laplacian matrix. By default, this is the average node degree.
flat (bool, optional) – Whether to return a flat embedding (n*T, d) or a 3D embedding (T, n, d). Default is True.
sparse_matrix (bool, optional) – Whether the adjacency matrices are sparse. Default is False.
return_left (bool, optional) – Whether to return the left (anchor) embedding as well as the right (dynamic) embedding. Default is False.
- Returns:
numpy.ndarray – Dynamic embedding of shape (n*T, d) or (T, n, d).
numpy.ndarray, optional – Anchor embedding of shape (n, d) if return_left is True.
- pyemb.embedding.wasserstein_dimension_select(Y, dims, split=0.5)
Select the number of dimensions using Wasserstein distances.
- Parameters:
Y (numpy.ndarray) – The array or matrix.
dims (list of int) – The dimensions to be considered.
split (float) – The proportion of the data to be used for training.
- Returns:
list of numpy.ndarray – The Wasserstein distances between the training and test data for each number of dimensions.
int – The recommended number of dimensions. The dimension recommended is the one with the smallest Wasserstein distance.
Plotting
- pyemb.plotting.get_fig_legend_handles_labels(fig)
Get the legend handles and labels from a figure.
- Parameters:
fig (matplotlib.figure.Figure) – The figure object.
- Returns:
The handles and labels of the legend
- Return type:
list, list
- pyemb.plotting.quick_plot(embedding, n, T=1, node_labels=None, **kwargs)
Produces an interactive plot of an embedding. If the embedding is dynamic (i.e. T > 1), then the embedding will be animated over time.
- Parameters:
embedding (numpy.ndarray of shape (n*T, d) or (T, n, d)) – The dynamic embedding.
n (int) – The number of nodes.
T (int (optional)) – The number of time points (> 1 animates the embedding).
node_labels (list of length n (optional)) – The labels of the nodes (time-invariant).
return_df (bool (optional)) – Option to return the plotting dataframe.
title (str (optional)) – The title of the plot.
- pyemb.plotting.snapshot_plot(embedding, n=None, node_labels=None, c=None, idx_of_interest=None, max_cols=4, title=None, title_fontsize=20, sharex=False, sharey=False, tick_labels=False, xaxis_label='', yaxis_label='', axis_fontsize=12, figsize_scale=5, figsize=None, add_legend=False, move_legend=(0.5, -0.1), loc='lower center', max_legend_cols=4, **kwargs)
Plot a snapshot of an embedding at a given time point.
- Parameters:
embedding (np.ndarray or list of np.ndarray) – The embedding to plot.
n (int (optional)) – The number of nodes in the graph. Should be provided if the embedding is a single numpy array and n is not the first dimension of the array.
node_labels (list (optional)) – The labels of the nodes. Default is None.
c (list or dict (optional)) – The colours of the nodes. If a list is provided, it should be a list of length n. If a dictionary is provided, it should map each unique label to a colour.
idx_of_interest (list (optional)) – The indices to plot. For example, if embedding is a list, idx_of_interest can be used to plot only a subset of the embeddings. By default, all embeddings are plotted.
max_cols (int (optional)) – The maximum number of columns in the plot. Default is 4.
title (str (optional)) – The title of the plot. If a list is provided, each element will be the title of a subplot. Default is None.
title_fontsize (int (optional)) – The fontsize of the title. Default is 20.
sharex (bool (optional)) – Whether to share the x-axis across subplots. Default is False.
sharey (bool (optional)) – Whether to share the y-axis across subplots. Default is False.
tick_labels (bool (optional)) – Whether to show tick labels. Default is False.
xaxis_label (str (optional)) – The x-axis label. Default is '' (an empty string).
yaxis_label (str (optional)) – The y-axis label. Default is '' (an empty string).
figsize_scale (int (optional)) – The scale of the figure size. Default is 5.
figsize (tuple (optional)) – The figure size. Default is None.
add_legend (bool (optional)) – Whether to add a legend to the plot. Default is False.
loc (str (optional)) – The anchor point for where the legend will be placed. Default is lower center.
move_legend (tuple (optional)) – This adjusts the exact coordinates of the anchor point. Default is (0.5, -0.1).
max_legend_cols (int (optional)) – The maximum number of columns in the legend. Default is 4.
kwargs (dict (optional)) – Additional keyword arguments for the scatter plot.
- Returns:
The figure object.
- Return type:
matplotlib.figure.Figure
Hierarchical Clustering
- class pyemb.hc.ConstructTree(point_cloud=None, model=None, epsilon=0.25)
Bases: object
Construct a condensed tree from a hierarchical clustering model.
- Parameters:
model (AgglomerativeClustering, optional) – The fitted model.
point_cloud (ndarray, optional) – The data points.
epsilon (float, optional) – The threshold for condensing the tree.
**kwargs (dict, optional) – Additional keyword arguments.
- model
The fitted model.
- Type:
AgglomerativeClustering
- point_cloud
The data points.
- Type:
ndarray
- epsilon
The threshold for condensing the tree.
- Type:
float
- linkage
The linkage matrix.
- Type:
ndarray
- tree
The condensed tree.
- Type:
nx.Graph
- collapsed_branches
The collapsed branches.
- Type:
dict
- fit(**kwargs)
Fit the condensed tree.
- plot(labels=None, colours=None, colour_threshold=0.5, prog='sfdp', forceatlas_iter=250, node_size=10, scaling_node_size=1, add_legend=True, max_legend_cols=4, move_legend=(0.5, -0.1), loc='lower center', **kwargs)
Plot the condensed tree.
- Parameters:
labels (list, optional) – Labels for each leaf node. Default is None.
colours (dict, optional) – A dictionary of colours for each label. Default is None, in which case leaf nodes are coloured light blue and internal nodes are black.
colour_threshold (float, optional) – Used for cluster nodes: colour by a type if num_of_type / num_points_in_cluster > colour_threshold. Default is 0.5.
prog (str, optional) – The graphviz program to use for layout. Default is sfdp.
forceatlas_iter (int, optional) – The number of iterations to run the forceatlas2 algorithm. If it is set to zero, forceatlas2 won’t be used. Default is 250.
node_size (int, optional) – The size of the nodes of one data point. Default is 10.
scaling_node_size (int, optional) – The scaling factor for the size of the cluster nodes: node_size + scaling_node_size * num_points_in_cluster. Default is 1.
add_legend (bool, optional) – Whether to add a legend to the plot. Default is True.
max_legend_cols (int, optional) – The maximum number of columns in the legend. Default is 4.
loc (str (optional)) – The anchor point for where the legend will be placed. Default is lower center.
move_legend (tuple (optional)) – This adjusts the exact coordinates of the anchor point. Default is (0.5, -0.1).
**kwargs (dict, optional) – Additional keyword arguments to pass to nx.draw.
- class pyemb.hc.DotProductAgglomerativeClustering(metric='dot_product', linkage='average', distance_threshold=0, n_clusters=None)
Bases: object
Perform hierarchical clustering using dot product as the metric. If a different metric is used, the AgglomerativeClustering class from scikit-learn is used.
- Parameters:
metric (str, optional) – The metric to use for clustering. Default is dot_product.
linkage (str, optional) – The linkage criterion to use. Default is average.
distance_threshold (float, optional) – The linkage distance threshold above which clusters will not be merged. Default is 0.
n_clusters (int, optional) – The number of clusters to find. Default is None.
- distances_
Distance between the corresponding nodes in children_.
- Type:
ndarray
- children_
The children of each non-leaf node.
- Type:
ndarray
- labels_
Cluster labels of each point.
- Type:
ndarray
- n_clusters_
The number of clusters.
- Type:
int
- n_connected_components_
The estimated number of connected components.
- Type:
int
- n_leaves_
The number of leaves.
- Type:
int
- n_features_in_
The number of features seen during fit.
- Type:
int
- fit(X)
- pyemb.hc.branch_lengths(Z, point_cloud=None)
Calculate branch lengths for a hierarchical clustering dendrogram.
- Parameters:
Z (ndarray) – The linkage matrix.
point_cloud (ndarray, optional) – The data points. If not provided, the leaf heights are set to the maximum height.
- Returns:
Matrix of branch lengths.
- Return type:
ndarray
- pyemb.hc.cophenetic_distances(Z)
Calculate the cophenetic distances between each observation and internal nodes.
- Parameters:
Z (ndarray) – The linkage matrix.
- Returns:
d – The full cophenetic distance matrix, of shape (2n-1) x (2n-1).
- Return type:
ndarray
- pyemb.hc.find_descendents(Z, node, desc=None, just_leaves=True)
Find all descendants of a given node in a hierarchical clustering tree.
- Parameters:
Z (ndarray) – The linkage matrix.
node (int) – The node to find descendants of.
desc (dict, optional) – Dictionary to store descendants.
just_leaves (bool, optional) – Whether to include only leaf nodes.
- Returns:
List of descendants.
- Return type:
list
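The traversal behind a descendant lookup can be sketched as below, assuming Z follows the SciPy linkage convention: with n leaves, row k merges the clusters Z[k][0] and Z[k][1] into a new node with index n + k. The helper leaf_descendants is hypothetical and covers only the leaves-only case; it is not pyemb's implementation.

```python
def leaf_descendants(Z, node):
    """Return the sorted leaf indices below `node` in a SciPy-style linkage matrix.

    Z has n - 1 rows; indices below n are leaves, and index n + k refers to
    the cluster created by row k.
    """
    n = len(Z) + 1
    stack, leaves = [int(node)], []
    while stack:
        current = stack.pop()
        if current < n:
            leaves.append(current)
        else:
            left, right = Z[current - n][0], Z[current - n][1]
            stack.extend([int(left), int(right)])
    return sorted(leaves)


# Four leaves: 0 and 1 merge into node 4, 2 and 3 into node 5,
# then nodes 4 and 5 merge into the root, node 6.
Z = [
    [0, 1, 0.5, 2],
    [2, 3, 0.7, 2],
    [4, 5, 1.2, 4],
]
below_4 = leaf_descendants(Z, 4)
below_root = leaf_descendants(Z, 6)
```

An explicit stack avoids recursion limits on deep, unbalanced trees.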
- pyemb.hc.get_ranking(model)
Get the ranking of the samples.
- Parameters:
model (AgglomerativeClustering) – The fitted model.
- Returns:
mh_rank – The ranking of the samples.
- Return type:
numpy.ndarray
- pyemb.hc.kendalltau_similarity(model, true_ranking)
Calculate the Kendall’s tau similarity between the model and true ranking.
- Parameters:
model (AgglomerativeClustering) – The fitted model.
true_ranking (array-like, shape (n_samples, n_samples)) – The true ranking of the samples.
- Returns:
The mean Kendall’s tau similarity between the model and true ranking.
- Return type:
float
- pyemb.hc.linkage_matrix(model)
Convert a hierarchical clustering model to a linkage matrix.
- Parameters:
model (AgglomerativeClustering) – The fitted model.
- Returns:
The linkage matrix.
- Return type:
ndarray
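A linkage matrix can be assembled from the children_ and distances_ attributes of a fitted model by counting the samples under each merge, following the recipe used in scikit-learn's dendrogram example. The sketch below uses plain lists in place of a fitted model; build_linkage is a hypothetical helper, and pyemb's function may differ, for instance in how dot-product similarities are turned into heights.

```python
def build_linkage(children, distances, n_samples):
    """Build SciPy-style rows [left, right, distance, count] for each merge."""
    counts = []
    for left, right in children:
        count = 0
        for child in (left, right):
            if child < n_samples:
                count += 1  # a leaf contributes one sample
            else:
                count += counts[child - n_samples]  # an earlier merge
        counts.append(count)
    return [
        [float(left), float(right), float(dist), float(count)]
        for (left, right), dist, count in zip(children, distances, counts)
    ]


# Stand-ins for model.children_ and model.distances_ with four samples.
children = [(0, 1), (2, 3), (4, 5)]
distances = [0.5, 0.7, 1.2]
linkage = build_linkage(children, distances, n_samples=4)
```

The fourth column is what SciPy's dendrogram routines expect as the cluster size.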
- pyemb.hc.plot_dendrogram(model, dot_product_clustering=True, rescale=False, **kwargs)
Create the linkage matrix and then plot the dendrogram.
- Parameters:
model (AgglomerativeClustering) – The fitted model to plot.
**kwargs (dict) – Keyword arguments for dendrogram function.
- Return type:
None
- pyemb.hc.sample_hyperbolicity(data, metric='dot_products', num_samples=5000)
Calculate the hyperbolicity of the data.
- Parameters:
data (numpy.ndarray) – The data to calculate the hyperbolicity.
metric (str) – The metric to use. Options are dot_products, cosine_similarity, precomputed or any metric supported by scikit-learn.
num_samples (int) – The number of samples to calculate the hyperbolicity.
- Returns:
The hyperbolicity of the data.
- Return type:
float
Datasets
- pyemb.datasets.load_lyon()
Load the Lyon dataset. Returns a dictionary with the following keys.
- Returns:
data (numpy array of shape (n_edges, 3)) – The edges of the network. The first column is time and the second and third columns are the nodes. The nodes are indexed from 0.
labels (numpy array of shape (n_nodes,)) – The labels of the nodes. The index of the label corresponds to the node index.
- pyemb.datasets.load_newsgroup()
Load the Newsgroup dataset. Returns a pandas DataFrame with the following columns.
- Returns:
data (str) – The text of the newsgroup post.
target (int) – The label of the newsgroup post.
target_names (str) – The label name of the newsgroup post.
layer1 (str) – The category of the newsgroup post.
layer2 (str) – The subcategory of the newsgroup post.
- pyemb.datasets.load_planaria()
Load the Planaria dataset. Returns a dictionary with the following keys.
- Returns:
Y (numpy array of shape (n_samples, n_features)) – The preprocessed data matrix.
labels (numpy array of shape (n_samples,)) – The cell type of each data point.
labels (numpy array) – The unique cell types.
colour_dict (dict) – A dictionary mapping cell types to colours.
Matrix and Graph Tools
- pyemb.tools.degree_correction(embedding)
Perform degree correction.
- Parameters:
embedding (numpy.ndarray) – The embedding of the graph, either 2D or 3D.
- Returns:
The degree-corrected embedding.
- Return type:
numpy.ndarray
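One common form of degree correction projects each embedded point onto the unit sphere by dividing its row by the row's Euclidean norm. The sketch below illustrates that form for a 2D embedding with plain lists. Whether pyemb uses exactly this normalisation is an assumption, and normalise_rows and its tol parameter are hypothetical.

```python
import math


def normalise_rows(embedding, tol=1e-12):
    """Scale each row to unit Euclidean norm; rows with norm <= tol are left unchanged."""
    corrected = []
    for row in embedding:
        norm = math.sqrt(sum(x * x for x in row))
        corrected.append([x / norm for x in row] if norm > tol else list(row))
    return corrected


embedding = [[3.0, 4.0], [0.0, 0.0], [0.3, 0.4]]
corrected = normalise_rows(embedding)
```

The first and third rows point in the same direction and differ only in length, so they coincide after the correction; the zero row is left alone to avoid dividing by zero.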
- pyemb.tools.recover_subspaces(embedding, attributes)
Recover the subspaces for each partition from an embedding.
- Parameters:
embedding (numpy.ndarray) – The embedding of the graph.
attributes (list of lists) – The attributes of the nodes. The first list contains the attributes of the nodes in rows. The second list contains the attributes of the nodes in the columns.
- Returns:
The embeddings and attributes of the partitions.
- Return type:
dict, dict
- pyemb.tools.select(embedding, attributes, select_attributes)
Select portion of embedding and attributes associated with a set of attributes.
- Parameters:
embedding (numpy.ndarray) – The embedding of the graph.
attributes (list of lists) – The attributes of the nodes. The first list contains the attributes of the nodes in rows. The second list contains the attributes of the nodes in the columns.
select_attributes (dict or list of dicts) – The attributes to select by. If a list of dicts is provided, the intersection of the nodes satisfying each dict is selected.
- Returns:
The selected embedding and its attributes.
- Return type:
numpy.ndarray, list of lists
- pyemb.tools.to_laplacian(A, regulariser=0)
Convert an adjacency matrix to a Laplacian matrix.
- Parameters:
A (scipy.sparse.csr_matrix) – The adjacency matrix.
regulariser (float) – The regulariser to be added to the degrees of the nodes. If auto, the regulariser is set to the mean of the degrees.
- Returns:
The Laplacian matrix.
- Return type:
scipy.sparse.csr_matrix
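To make the effect of regulariser concrete, the sketch below computes a symmetrically normalised matrix in which the regulariser is added to every row and column degree before taking inverse square roots. This normalised form is an assumption about what "Laplacian" means here, and regularised_laplacian is a hypothetical helper working on dense lists rather than a CSR matrix.

```python
import math


def regularised_laplacian(A, regulariser=0.0):
    """Return D_row^{-1/2} A D_col^{-1/2} with `regulariser` added to each degree.

    Assumed form, for illustration. Nodes whose regularised degree is zero
    get a scaling factor of zero instead of a division by zero.
    """
    row_deg = [sum(row) + regulariser for row in A]
    col_deg = [sum(col) + regulariser for col in zip(*A)]

    def inv_sqrt(x):
        return 1.0 / math.sqrt(x) if x > 0 else 0.0

    r = [inv_sqrt(x) for x in row_deg]
    c = [inv_sqrt(x) for x in col_deg]
    return [[r[i] * A[i][j] * c[j] for j in range(len(c))] for i in range(len(r))]


# A three-node star: node 0 is linked to nodes 1 and 2.
A = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
L = regularised_laplacian(A)
L_reg = regularised_laplacian(A, regulariser=2.0)
```

Adding a positive regulariser shrinks every entry and damps the influence of low-degree nodes, which is the usual motivation for regularising sparse graphs.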
- pyemb.tools.varimax(Phi, gamma=1, q=20, tol=1e-06)
Perform varimax rotation.
- Parameters:
Phi (numpy.ndarray) – The matrix to rotate.
gamma (float, optional) – The gamma parameter.
q (int, optional) – The number of iterations.
tol (float, optional) – The tolerance.
- Returns:
The rotated matrix.
- Return type:
numpy.ndarray
Simulation
- pyemb.simulation.SBM(n=200, B=array([[0.5, 0.5], [0.5, 0.4]]), pi=array([0.5, 0.5]))
Generate an adjacency matrix from a stochastic block model.
- Parameters:
n (int, optional) – The number of nodes. Default is 200.
B (numpy.ndarray, optional) – The block matrix. Default is a 2-by-2 matrix.
pi (numpy.ndarray, optional) – The block probability vector. Default is a vector of 1/2.
- Returns:
The adjacency matrix and the block assignment.
- Return type:
tuple
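The sampling scheme of a stochastic block model can be sketched with the standard library alone: each node draws a block from pi, then each pair of nodes is linked with the probability given by the corresponding entry of B. The sketch assumes an undirected graph without self-loops; sample_sbm and its seed parameter are hypothetical and not part of pyemb.

```python
import random


def sample_sbm(n, B, pi, seed=0):
    """Sample an undirected, loop-free adjacency matrix and block labels."""
    rng = random.Random(seed)
    blocks = rng.choices(range(len(pi)), weights=pi, k=n)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Link i and j with the probability for their pair of blocks.
            if rng.random() < B[blocks[i]][blocks[j]]:
                A[i][j] = A[j][i] = 1
    return A, blocks


A, Z = sample_sbm(n=20, B=[[0.5, 0.5], [0.5, 0.4]], pi=[0.5, 0.5])
```

Repeating the edge-sampling step T times with the same block labels would give a sequence of independent snapshots from the same model.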
- pyemb.simulation.iid_SBM(n=200, T=2, B=array([[0.5, 0.5], [0.5, 0.4]]), pi=array([0.5, 0.5]))
Generate dynamic adjacency matrices from a stochastic block model.
- Parameters:
n (int, optional) – The number of nodes. Default is 200.
T (int, optional) – The number of time steps. Default is 2.
B (numpy.ndarray, optional) – The block matrix. Default is a 2-by-2 matrix.
pi (numpy.ndarray, optional) – The block probability vector. Default is a vector of 1/2.
- Returns:
The sequence of adjacency matrices and the block assignment.
- Return type:
tuple
- pyemb.simulation.symmetrises(A, diag=False)
Symmetrise a matrix.
- Parameters:
A (numpy.ndarray) – The matrix to symmetrise.
diag (bool, optional) – Whether to include the diagonal. Default is False.
- Returns:
The symmetrised matrix.
- Return type:
numpy.ndarray
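One way to symmetrise a square matrix is to mirror one triangle onto the other and optionally keep the diagonal. The sketch below mirrors the lower triangle. Which triangle pyemb keeps is not stated in the source, so this choice is an assumption, and symmetrise_lower is a hypothetical helper.

```python
def symmetrise_lower(A, diag=False):
    """Mirror the strictly lower triangle of A; keep the diagonal only if diag is True."""
    n = len(A)
    S = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):
            S[i][j] = S[j][i] = A[i][j]
        if diag:
            S[i][i] = A[i][i]
    return S


# The entries above the diagonal (the 9s) are discarded.
A = [[1, 9, 9], [2, 3, 9], [4, 5, 6]]
without_diag = symmetrise_lower(A)
with_diag = symmetrise_lower(A, diag=True)
```

Dropping the diagonal corresponds to a graph without self-loops.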