scHopfield.dynamics.simulate_perturbation

scHopfield.dynamics.simulate_perturbation(adata: AnnData, perturb_condition: Dict[str, float], cluster_key: str = 'cell_type', target_clusters: List[str] | None = None, n_propagation: int = 3, dt: float = 1.0, use_cluster_specific_GRN: bool = True, clip_delta_X: bool = True, x_max_percentile: float = 99.0, residual_gene_dynamics: bool = False, verbose: bool = True) AnnData[source]

Simulate gene expression changes after perturbation using direct GRN effects.

Computes the effect of perturbed genes on all other genes using iterative signal propagation:

x_i^new = x_i^current + dt * sum_k W_ik * (sigmoid_k(x_k^current) - sigmoid_k(x_k^original))

The propagation works as follows: - Step 1: Only the manually perturbed genes propagate their effects - Steps 2+: All TFs (genes with outgoing edges in the GRN) that have changed

from their original state propagate their effects

This captures the cascade where perturbed genes affect other TFs, which then also contribute to further propagation through the network.

Parameters:
  • adata (AnnData) – Annotated data object with fitted interactions (W matrices)

  • perturb_condition (dict) – Perturbation conditions as {gene_name: value}. Examples: - Knockout: {“Gata1”: 0.0} - Overexpression: {“Gata1”: 5.0} - Multiple: {“Gata1”: 0.0, “Tal1”: 2.0}

  • cluster_key (str, optional (default: 'cell_type')) – Key in adata.obs for cluster labels

  • target_clusters (list of str, optional) – List of cluster names to simulate perturbation in. If None, simulates in all clusters. Cells not in target clusters will have delta_X = 0.

  • n_propagation (int, optional (default: 3)) – Number of signal propagation steps through the GRN. Higher values capture more indirect effects.

  • dt (float, optional (default: 1.0)) – Scaling factor for each propagation step.

  • use_cluster_specific_GRN (bool, optional (default: True)) – If True, uses cluster-specific W matrices. If False, uses the ‘all’ W matrix for all cells.

  • clip_delta_X (bool, optional (default: True)) – If True, clips final simulated values to the observed expression range to avoid out-of-distribution predictions.

  • x_max_percentile (float, optional (default: 99.0)) – Percentile of expression to use as upper bound during propagation. This prevents divergence by clipping values at each step. Set to None to disable step-wise upper bound clipping.

  • residual_gene_dynamics (bool, optional (default: False)) – If False, perturbed genes are held fixed at their perturbed values throughout all propagation steps. If True, perturbed genes can change according to the GRN dynamics after the initial perturbation is applied.

  • verbose (bool, optional (default: True)) – Whether to show progress information.

Returns:

Modified adata with added layers: - ‘simulated_count’: Simulated gene expression after perturbation - ‘delta_X’: Difference between simulated and original expression

And added to adata.uns[‘scHopfield’]: - ‘perturb_condition’: The perturbation conditions used - ‘n_propagation’: Number of propagation steps - ‘dt’: Scaling factor used

Return type:

AnnData

References

Logic for the transition vector field is inspired by the perturbation simulation workflow in CellOracle: Kamimoto et al. (2023). Nature. https://doi.org/10.1038/s41586-022-05688-9

Examples

>>> import scHopfield as sch
>>> # Knockout simulation
>>> sch.dyn.simulate_perturbation(adata, {"Gata1": 0.0})
>>> # Overexpression
>>> sch.dyn.simulate_perturbation(adata, {"Gata1": 5.0})
>>> # Check results
>>> delta = adata.layers['delta_X']