smart.MNN

smart.MNN.MMN_batch(X, batches, far_frac=0.6, top_k=2, random_state=None, verbose=True)

Compute mutual nearest neighbors across batches and construct triplets.

Parameters:
  • X (np.ndarray) – Feature matrix of shape (n_cells, n_features).

  • batches (np.ndarray) – Batch labels of length n_cells.

  • far_frac (float, default=0.6) – Fraction of farthest same-batch neighbors to consider as negatives.

  • top_k (int, default=2) – Number of nearest neighbors to consider between batches.

  • random_state (int, optional) – Seed for reproducibility.

  • verbose (bool, default=True) – Whether to print progress.

Returns:

  • anchors (np.ndarray) – Anchor indices.

  • positives (np.ndarray) – Positive (MNN) indices.

  • negatives (np.ndarray) – Negative sample indices.
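
To make the triplet structure concrete, here is a minimal NumPy sketch of mutual-nearest-neighbor triplets across batches. It is not the library's implementation: the function name `mnn_triplets_sketch`, the Euclidean metric, and the reading of far_frac as "draw one negative at random from the farthest far_frac of same-batch cells" are all assumptions made for illustration.

```python
import numpy as np

def mnn_triplets_sketch(X, batches, far_frac=0.6, top_k=2, random_state=None):
    """Illustrative sketch only; not the library's code."""
    rng = np.random.default_rng(random_state)
    X = np.asarray(X, dtype=np.float64)
    batches = np.asarray(batches)
    # Dense pairwise Euclidean distances (O(n^2) memory; fine for a toy example).
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))

    anchors, positives, negatives = [], [], []
    labels = np.unique(batches)
    for a in labels:
        idx_a = np.flatnonzero(batches == a)
        for b in labels:
            if a == b:
                continue
            idx_b = np.flatnonzero(batches == b)
            d_ab = dist[np.ix_(idx_a, idx_b)]
            # top_k nearest b-cells for each a-cell, and the reverse.
            nn_ab = np.argsort(d_ab, axis=1)[:, :top_k]
            nn_ba = np.argsort(d_ab.T, axis=1)[:, :top_k]
            for i, anchor in enumerate(idx_a):
                # Assumed negative pool: farthest far_frac of same-batch cells.
                order = np.argsort(dist[anchor, idx_a])
                n_far = max(1, int(np.ceil(far_frac * len(order))))
                pool = idx_a[order[-n_far:]]
                pool = pool[pool != anchor]
                if pool.size == 0:
                    continue
                for j in nn_ab[i]:
                    if i in nn_ba[j]:  # mutual: each is in the other's top_k
                        anchors.append(anchor)
                        positives.append(idx_b[j])
                        negatives.append(rng.choice(pool))
    return (np.array(anchors, dtype=int),
            np.array(positives, dtype=int),
            np.array(negatives, dtype=int))

# Toy data: two clusters, four cells in batch 0 and two cells in batch 1.
X = np.array([[0.0, 0.0], [0.1, 0.0], [10.0, 0.0], [10.1, 0.0],
              [0.0, 0.05], [10.0, 0.05]])
batches = np.array([0, 0, 0, 0, 1, 1])
anchors, positives, negatives = mnn_triplets_sketch(
    X, batches, far_frac=0.5, top_k=1, random_state=0)
```

In this toy data, cells 0 and 4 are mutual nearest neighbors, as are cells 2 and 5. Each pair appears once per direction, and every negative comes from the anchor's own batch but from the opposite cluster.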

smart.MNN.Mutual_Nearest_Neighbors(adata, key=None, n_nearest_neighbors=1, farthest_ratio=0.5, max_samples=20000)

Find mutual nearest neighbors (MNNs) and construct triplets with optional sampling.

Parameters:
  • adata (AnnData) – Input dataset.

  • key (str, optional) – Key in adata.obsm to use as features. If None, adata.X is used.

  • n_nearest_neighbors (int, default=1) – Number of nearest neighbors to consider.

  • farthest_ratio (float, default=0.5) – Fraction of farthest neighbors to consider when sampling negatives.

  • max_samples (int, default=20000) – Maximum number of cells to process. If the dataset contains more cells, a random subsample is used.

Returns:

  • anchors (list[int]) – Indices of anchor points in original adata.

  • positives (list[int]) – Indices of positive samples (MNNs).

  • negatives (list[int]) – Indices of negative samples (randomly sampled from the farthest neighbors).
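
The feature selection and subsampling described above can be sketched as follows. This only illustrates the documented behavior and is not the library's code: `select_features`, `subsample_cells`, `to_original`, and the `seed` argument are hypothetical names, uniform sampling without replacement is an assumption, and a `SimpleNamespace` stands in for a real AnnData object.

```python
import numpy as np
from types import SimpleNamespace

def select_features(adata, key=None):
    """Use adata.obsm[key] when a key is given, otherwise adata.X."""
    return adata.X if key is None else adata.obsm[key]

def subsample_cells(n_cells, max_samples=20000, seed=0):
    """Indices of the cells kept; all cells when n_cells <= max_samples."""
    if n_cells <= max_samples:
        return np.arange(n_cells)
    rng = np.random.default_rng(seed)  # seed is a hypothetical parameter
    return np.sort(rng.choice(n_cells, size=max_samples, replace=False))

def to_original(kept, local_idx):
    """Map positions within the subsample back to indices in the full dataset."""
    return kept[np.asarray(local_idx, dtype=int)].tolist()

# Stand-in for an AnnData object, with only the two attributes used here.
adata = SimpleNamespace(X=np.zeros((50, 100)), obsm={"X_pca": np.ones((50, 5))})
feats = select_features(adata, key="X_pca")
kept = subsample_cells(feats.shape[0], max_samples=10, seed=0)
anchors = to_original(kept, [0, 3, 9])
```

The mapping step matters because the returned indices refer to the original adata. Any triplets found on a subsample have to be translated back before they are used to index the full dataset.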

smart.MNN.fastSort32(a)

Perform fast argsort for float32 arrays using Numba parallelization.

Parameters:

  • a (np.ndarray, float32) – 2D array of distances.

Returns:

  • b (np.ndarray, int32) – Indices that would sort each row of a.
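
For reference, the documented result (indices that sort each row, returned as int32) corresponds to a row-wise `np.argsort`. The sketch below is a plain NumPy equivalent for checking expectations on small inputs. `rowwise_argsort_reference` is a hypothetical name, and the ordering of tied values in the Numba version may differ from NumPy's stable sort.

```python
import numpy as np

def rowwise_argsort_reference(a):
    """NumPy reference: indices that sort each row of a, as int32."""
    a = np.asarray(a)
    if a.ndim != 2:
        raise ValueError("expected a 2D array of distances")
    return np.argsort(a, axis=1, kind="stable").astype(np.int32)

a = np.array([[0.3, 0.1, 0.2],
              [2.0, 3.0, 1.0]], dtype=np.float32)
b = rowwise_argsort_reference(a)
# b is [[1, 2, 0], [2, 0, 1]]; indexing a[0] with b[0] gives [0.1, 0.2, 0.3]
```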

smart.MNN.fastSort64(a)

Perform fast argsort for float64 arrays using Numba parallelization.

Parameters:

  • a (np.ndarray, float64) – 2D array of distances.

Returns:

  • b (np.ndarray, int32) – Indices that would sort each row of a.
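
Because there is one function per floating-point precision, calling code may want to choose between them based on the array's dtype. The helper below is a hypothetical sketch of such a dispatcher, and `argsort_by_dtype` is not part of the library. It takes the two sort functions as arguments, so in practice one would pass smart.MNN.fastSort32 and smart.MNN.fastSort64; here NumPy stand-ins are used so the example runs on its own. Making the input contiguous and casting other dtypes to float64 are assumptions, not documented requirements.

```python
import numpy as np

def argsort_by_dtype(dist, sort32, sort64):
    """Call sort32 for float32 input and sort64 otherwise (cast to float64)."""
    dist = np.ascontiguousarray(dist)
    if dist.dtype == np.float32:
        return sort32(dist)
    return sort64(dist.astype(np.float64, copy=False))

# NumPy stand-in so the sketch runs without the library installed.
def numpy_sort(a):
    return np.argsort(a, axis=1).astype(np.int32)

d64 = np.array([[3.0, 1.0, 2.0]])
order = argsort_by_dtype(d64, numpy_sort, numpy_sort)
```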