smart.MNN
- smart.MNN.MMN_batch(X, batches, far_frac=0.6, top_k=2, random_state=None, verbose=True)
Compute mutual nearest neighbors (MNNs) across batches and construct (anchor, positive, negative) triplets.
- Parameters:
X (np.ndarray) – Feature matrix of shape (n_cells, n_features).
batches (np.ndarray) – Batch labels of length n_cells.
far_frac (float, default=0.6) – Fraction of the farthest same-batch neighbors to consider as candidate negatives.
top_k (int, default=2) – Number of nearest neighbors to consider between batches.
random_state (int, optional) – Seed for reproducibility.
verbose (bool, default=True) – Whether to print progress.
- Returns:
anchors (np.ndarray) – Anchor indices.
positives (np.ndarray) – Positive (MNN) indices.
negatives (np.ndarray) – Negative sample indices.
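To make the triplet idea concrete, here is a minimal NumPy sketch of mutual-nearest-neighbor triplet construction. It is not the smart.MNN implementation: the function name mnn_triplets_sketch, the dense Euclidean distance matrix, and the rule for picking negatives (a random draw from the farthest far_frac share of same-batch cells) are assumptions made for illustration, and the library may differ in any of these details.

```python
import numpy as np


def mnn_triplets_sketch(X, batches, far_frac=0.6, top_k=2, random_state=None):
    """Illustrative sketch only; not the smart.MNN implementation."""
    rng = np.random.default_rng(random_state)
    X = np.asarray(X, dtype=float)
    batches = np.asarray(batches)
    # Dense pairwise Euclidean distances (only sensible for small inputs).
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

    anchors, positives, negatives = [], [], []
    labels = np.unique(batches)
    for a in labels:
        idx_a = np.flatnonzero(batches == a)
        for b in labels:
            if a == b:
                continue
            idx_b = np.flatnonzero(batches == b)
            k = min(top_k, len(idx_a), len(idx_b))
            # Global indices of the k nearest cross-batch neighbors, both directions.
            nn_ab = idx_b[np.argsort(D[np.ix_(idx_a, idx_b)], axis=1)[:, :k]]
            nn_ba = idx_a[np.argsort(D[np.ix_(idx_b, idx_a)], axis=1)[:, :k]]
            row_of_b = {int(j): r for r, j in enumerate(idx_b)}
            for r, i in enumerate(idx_a):
                same = idx_a[idx_a != i]
                if same.size == 0:
                    continue  # no same-batch cell available as a negative
                # Same-batch cells ordered near -> far; keep the farthest fraction.
                ordered = same[np.argsort(D[i, same])]
                n_far = max(1, int(np.ceil(far_frac * ordered.size)))
                far_pool = ordered[-n_far:]
                for j in nn_ab[r]:
                    # Mutual: i must also be among j's nearest neighbors in batch a.
                    if i in nn_ba[row_of_b[int(j)]]:
                        anchors.append(int(i))
                        positives.append(int(j))
                        negatives.append(int(rng.choice(far_pool)))
    return (np.array(anchors, dtype=int),
            np.array(positives, dtype=int),
            np.array(negatives, dtype=int))


# Toy data: two batches, each containing the same two well-separated clusters.
X = np.array([[0, 0], [0.1, 0], [5, 5], [5.1, 5],
              [0, 0.05], [0.1, 0.05], [5, 5.05], [5.1, 5.05]])
batches = np.array([0, 0, 0, 0, 1, 1, 1, 1])
anchors, positives, negatives = mnn_triplets_sketch(
    X, batches, far_frac=0.5, top_k=1, random_state=0
)
```

In this toy setup each positive comes from the other batch and the same cluster as its anchor, while each negative comes from the anchor's own batch but the distant cluster.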
- smart.MNN.Mutual_Nearest_Neighbors(adata, key=None, n_nearest_neighbors=1, farthest_ratio=0.5, max_samples=20000)
Find mutual nearest neighbors (MNNs) and construct triplets with optional sampling.
- Parameters:
adata (AnnData) – Input dataset.
key (str, optional) – Key in adata.obsm to use as features. If None, adata.X is used.
n_nearest_neighbors (int, default=1) – Number of nearest neighbors to consider.
farthest_ratio (float, default=0.5) – Fraction of farthest neighbors to consider when sampling negatives.
max_samples (int, default=20000) – Maximum number of cells to process. If the dataset is larger, cells are randomly subsampled.
- Returns:
anchors (list[int]) – Indices of anchor points in the original adata.
positives (list[int]) – Indices of positive samples (MNNs).
negatives (list[int]) – Indices of negative samples (randomly sampled farthest neighbors).
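Because the returned indices refer to the original adata even when max_samples triggers subsampling, positions found within the subsample have to be mapped back to the full dataset. The sketch below shows one way that mapping could work using NumPy only. The helper names subsample_indices and to_original are hypothetical, and the seeding and sorting choices are assumptions rather than documented behavior of the library.

```python
import numpy as np


def subsample_indices(n_cells, max_samples=20000, seed=0):
    """Hypothetical helper: choose which original cell indices to keep."""
    if n_cells <= max_samples:
        return np.arange(n_cells)  # small enough: keep every cell
    rng = np.random.default_rng(seed)
    # Sample without replacement; sorting keeps the original cell order.
    return np.sort(rng.choice(n_cells, size=max_samples, replace=False))


def to_original(local_idx, kept):
    """Hypothetical helper: map positions in the subsample to original indices."""
    return kept[np.asarray(local_idx, dtype=int)].tolist()


kept = subsample_indices(n_cells=100, max_samples=10, seed=1)
# Positions 0 and 9 of the subsample correspond to these original cells.
original_anchors = to_original([0, 9], kept)
```

Triplets computed on the subsample would be passed through the same mapping, so that anchors, positives, and negatives can all index the original dataset directly.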
- smart.MNN.fastSort32(a)
Perform fast argsort for float32 arrays using Numba parallelization.
- Parameters:
a (np.ndarray (float32)) – 2D array of distances.
- Returns:
b – Indices that would sort each row of a.
- Return type:
np.ndarray (int32)
- smart.MNN.fastSort64(a)
Perform fast argsort for float64 arrays using Numba parallelization.
- Parameters:
a (np.ndarray (float64)) – 2D array of distances.
- Returns:
b – Indices that would sort each row of a.
- Return type:
np.ndarray (int32)
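For reference, the result both fastSort functions are described as returning (the indices that sort each row, as int32) can be reproduced with plain NumPy. This is only a reference sketch of the expected output, not the Numba code itself; how the library orders tied values is not documented here and may differ from NumPy's stable sort.

```python
import numpy as np

# A small 2D distance array; float64 input would be handled the same way.
a = np.array([[0.3, 0.1, 0.2],
              [2.0, 3.0, 1.0]], dtype=np.float32)

# Row-wise argsort, cast to int32 to match the documented return type.
b = np.argsort(a, axis=1, kind="stable").astype(np.int32)

# Applying the indices row by row yields each row in ascending order.
sorted_rows = np.take_along_axis(a, b, axis=1)

print(b.tolist())  # [[1, 2, 0], [2, 0, 1]]
```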