This page was generated from docs/notebooks/02_energy_analysis.ipynb.

Energy Analysis

Energy Analysis

Energy landscapes provide a quantitative measure of cellular state stability in the Hopfield network framework.

Energy Function

The total energy is decomposed into three components:

\[E_{total} = E_{interaction} + E_{degradation} + E_{bias}\]

Where:

\[\begin{split}\begin{aligned} E_{interaction} &= -\frac{1}{2} s^T W s \\ E_{degradation} &= \sum_i \gamma_i \int \sigma^{-1}(s_i) ds_i \\ E_{bias} &= -I^T s \end{aligned}\end{split}\]

Computing Energies

Visualization

scHopfield provides comprehensive plotting functions for all analysis results.

Energy Plots

See energy_analysis for details.

The scHopfield energy functional decomposes into three biologically interpretable

components:

\[E = E_{\text{interaction}} + E_{\text{degradation}} + E_{\text{bias}}\]
  • E_interaction: Energy stored in gene–gene regulatory interactions

  • E_degradation: Energy stored in mRNA decay terms

  • E_bias: Energy stored in external input / basal expression bias

Lower total energy ≈ more stable attractor state. This notebook shows how to

compute, visualise, and interpret these energies.

Setup

[1]:
import itertools

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import scanpy as sc
import scHopfield as sch

import warnings
warnings.filterwarnings('ignore', category=FutureWarning)

# Assumes the model was saved in notebook 01
DATA_PATH  = './scratch/Data/'
DATASET_FILE = 'hematopoiesis.h5ad'
MODEL_FILE = 'model.h5sch'

CLUSTER_KEY = 'cell_type'
SPLICED_KEY  = 'M_t'

CELL_TYPE_ORDER = ['Meg', 'Ery', 'MEP-like', 'HSC', 'GMP-like', 'Mon', 'Bas', 'Neu']

adata = sc.read_h5ad(DATA_PATH + DATASET_FILE)
adata = sch.tl.load_model(adata, MODEL_FILE)
print(adata)


# Set seed for reproducibility
np.random.seed(42)

/home/bernaljp/miniconda3/envs/SCH/lib/python3.11/site-packages/louvain/__init__.py:54: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
  from pkg_resources import get_distribution, DistributionNotFound
/tmp/ipykernel_1221845/3037205977.py:23: UserWarning: adata has 1956 genes but the model was trained on 1728.  A subsetted copy is being returned; the original adata is NOT modified.  Reassign the return value:
    adata = sch.tl.load_model(adata, filename)
  adata = sch.tl.load_model(adata, MODEL_FILE)
Model loaded from 'model.h5sch'  |  clusters=['Bas', 'Ery', 'GMP-like', 'HSC', 'MEP-like', 'Meg', 'Mon', 'Neu', 'all']  |  genes=1728
AnnData object with n_obs × n_vars = 1947 × 1728
    obs: 'batch', 'time', 'cell_type', 'nGenes', 'nCounts', 'pMito', 'pass_basic_filter', 'new_Size_Factor', 'initial_new_cell_size', 'total_Size_Factor', 'initial_total_cell_size', 'spliced_Size_Factor', 'initial_spliced_cell_size', 'unspliced_Size_Factor', 'initial_unspliced_cell_size', 'Size_Factor', 'initial_cell_size', 'ntr', 'cell_cycle_phase', 'leiden', 'control_point_pca', 'inlier_prob_pca', 'obs_vf_angle_pca', 'pca_ddhodge_div', 'pca_ddhodge_potential', 'acceleration_pca', 'curvature_pca', 'n_counts', 'mt_frac', 'jacobian_det_pca', 'manual_selection', 'divergence_pca', 'curv_leiden', 'curv_louvain', 'SPI1->GATA1_jacobian', 'jacobian', 'umap_ori_leiden', 'umap_ori_louvain', 'umap_ddhodge_div', 'umap_ddhodge_potential', 'curl_umap', 'divergence_umap', 'acceleration_umap', 'control_point_umap_ori', 'inlier_prob_umap_ori', 'obs_vf_angle_umap_ori', 'curvature_umap_ori'
    var: 'gene_name', 'gene_id', 'nCells', 'nCounts', 'pass_basic_filter', 'use_for_pca', 'frac', 'ntr', 'time_3_alpha', 'time_3_beta', 'time_3_gamma', 'time_3_half_life', 'time_3_alpha_b', 'time_3_alpha_r2', 'time_3_gamma_b', 'time_3_gamma_r2', 'time_3_gamma_logLL', 'time_3_delta_b', 'time_3_delta_r2', 'time_3_bs', 'time_3_bf', 'time_3_uu0', 'time_3_ul0', 'time_3_su0', 'time_3_sl0', 'time_3_U0', 'time_3_S0', 'time_3_total0', 'time_3_beta_k', 'time_3_gamma_k', 'time_5_alpha', 'time_5_beta', 'time_5_gamma', 'time_5_half_life', 'time_5_alpha_b', 'time_5_alpha_r2', 'time_5_gamma_b', 'time_5_gamma_r2', 'time_5_gamma_logLL', 'time_5_bs', 'time_5_bf', 'time_5_uu0', 'time_5_ul0', 'time_5_su0', 'time_5_sl0', 'time_5_U0', 'time_5_S0', 'time_5_total0', 'time_5_beta_k', 'time_5_gamma_k', 'use_for_dynamics', 'gamma', 'gamma_r2', 'use_for_transition', 'gamma_k', 'gamma_b', 'I_Bas', 'I_Ery', 'I_GMP-like', 'I_HSC', 'I_MEP-like', 'I_Meg', 'I_Mon', 'I_Neu', 'I_all', 'scHopfield_used', 'sigmoid_exponent', 'sigmoid_mse', 'sigmoid_offset', 'sigmoid_threshold'
    uns: 'PCs', 'VecFld_pca', 'VecFld_umap', 'X_umap_neighbors', 'cell_phase_genes', 'cell_type_colors', 'dynamics', 'explained_variance_ratio_', 'feature_selection', 'grid_velocity_pca', 'grid_velocity_umap', 'grid_velocity_umap_ori_perturbation', 'grid_velocity_umap_test', 'jacobian_pca', 'leiden', 'neighbors', 'pca_mean', 'pp', 'response', 'scHopfield'
    obsm: 'X', 'X_pca', 'X_pca_SparseVFC', 'X_umap', 'X_umap_SparseVFC', 'X_umap_ori_perturbation', 'X_umap_test', 'acceleration_pca', 'acceleration_umap', 'cell_cycle_scores', 'curvature_pca', 'curvature_umap', 'j_delta_x_perturbation', 'velocity_pca', 'velocity_pca_SparseVFC', 'velocity_umap', 'velocity_umap_SparseVFC', 'velocity_umap_ori_perturbation', 'velocity_umap_test'
    layers: 'M_n', 'M_nn', 'M_t', 'M_tn', 'M_tt', 'X_new', 'X_total', 'velocity_alpha_minus_gamma_s'
    obsp: 'X_umap_connectivities', 'X_umap_distances', 'connectivities', 'cosine_transition_matrix', 'distances', 'fp_transition_rate', 'moments_con', 'pca_ddhodge', 'perturbation_transition_matrix', 'umap_ddhodge'
    varp: 'W_Bas', 'W_Ery', 'W_GMP-like', 'W_HSC', 'W_MEP-like', 'W_Meg', 'W_Mon', 'W_Neu', 'W_all'

2.1 Compute & Decompose Energies

[2]:
sch.tl.compute_energies(adata, cluster_key=CLUSTER_KEY, spliced_key=SPLICED_KEY)

# Inspect stored energy columns
energy_cols = ['energy_total', 'energy_interaction', 'energy_degradation', 'energy_bias']
print(adata.obs[energy_cols].describe())

       energy_total  energy_interaction  energy_degradation  energy_bias
count   1947.000000         1947.000000         1947.000000  1947.000000
mean       0.465193           -6.238026            6.703232    -0.000013
std        2.017105            4.142948            2.293970     0.000014
min       -8.506401          -22.371951            3.737046    -0.000058
25%        0.257570           -7.225530            4.937362    -0.000021
50%        1.199007           -4.855515            6.226903    -0.000014
75%        1.606413           -3.356720            7.367836    -0.000005
max        2.778031           -1.868195           14.112776     0.000053
[3]:
# Summary statistics per cell type (including coefficient of variation)
summary = adata.obs[[CLUSTER_KEY] + energy_cols].groupby(CLUSTER_KEY).describe()
for energy in energy_cols:
    summary[(energy, 'cv')] = summary[(energy, 'std')] / summary[(energy, 'mean')]

print("\nTotal energy statistics:")
print(summary['energy_total'].round(3))


Total energy statistics:
           count   mean    std    min    25%    50%    75%    max     cv
cell_type
Bas        177.0 -0.475  1.625 -4.416 -1.506 -0.197  0.840  2.274 -3.419
Ery        234.0  0.538  0.726 -1.788  0.057  0.502  1.054  2.369  1.349
GMP-like   161.0  1.386  0.306  0.547  1.190  1.391  1.593  2.092  0.221
HSC        309.0  1.433  0.374  0.309  1.227  1.465  1.663  2.383  0.261
MEP-like   457.0  1.659  0.350  0.224  1.430  1.656  1.871  2.778  0.211
Meg        154.0 -4.099  2.206 -8.506 -5.636 -4.294 -2.506  0.812 -0.538
Mon        423.0  0.672  0.888 -1.749  0.173  0.757  1.325  2.478  1.322
Neu         32.0 -6.661  1.309 -7.924 -7.730 -7.065 -6.085 -3.137 -0.196

2.2 Energy Boxplots by Cell Type

[4]:
adata
[4]:
AnnData object with n_obs × n_vars = 1947 × 1728
    obs: 'batch', 'time', 'cell_type', 'nGenes', 'nCounts', 'pMito', 'pass_basic_filter', 'new_Size_Factor', 'initial_new_cell_size', 'total_Size_Factor', 'initial_total_cell_size', 'spliced_Size_Factor', 'initial_spliced_cell_size', 'unspliced_Size_Factor', 'initial_unspliced_cell_size', 'Size_Factor', 'initial_cell_size', 'ntr', 'cell_cycle_phase', 'leiden', 'control_point_pca', 'inlier_prob_pca', 'obs_vf_angle_pca', 'pca_ddhodge_div', 'pca_ddhodge_potential', 'acceleration_pca', 'curvature_pca', 'n_counts', 'mt_frac', 'jacobian_det_pca', 'manual_selection', 'divergence_pca', 'curv_leiden', 'curv_louvain', 'SPI1->GATA1_jacobian', 'jacobian', 'umap_ori_leiden', 'umap_ori_louvain', 'umap_ddhodge_div', 'umap_ddhodge_potential', 'curl_umap', 'divergence_umap', 'acceleration_umap', 'control_point_umap_ori', 'inlier_prob_umap_ori', 'obs_vf_angle_umap_ori', 'curvature_umap_ori', 'energy_total', 'energy_interaction', 'energy_degradation', 'energy_bias'
    var: 'gene_name', 'gene_id', 'nCells', 'nCounts', 'pass_basic_filter', 'use_for_pca', 'frac', 'ntr', 'time_3_alpha', 'time_3_beta', 'time_3_gamma', 'time_3_half_life', 'time_3_alpha_b', 'time_3_alpha_r2', 'time_3_gamma_b', 'time_3_gamma_r2', 'time_3_gamma_logLL', 'time_3_delta_b', 'time_3_delta_r2', 'time_3_bs', 'time_3_bf', 'time_3_uu0', 'time_3_ul0', 'time_3_su0', 'time_3_sl0', 'time_3_U0', 'time_3_S0', 'time_3_total0', 'time_3_beta_k', 'time_3_gamma_k', 'time_5_alpha', 'time_5_beta', 'time_5_gamma', 'time_5_half_life', 'time_5_alpha_b', 'time_5_alpha_r2', 'time_5_gamma_b', 'time_5_gamma_r2', 'time_5_gamma_logLL', 'time_5_bs', 'time_5_bf', 'time_5_uu0', 'time_5_ul0', 'time_5_su0', 'time_5_sl0', 'time_5_U0', 'time_5_S0', 'time_5_total0', 'time_5_beta_k', 'time_5_gamma_k', 'use_for_dynamics', 'gamma', 'gamma_r2', 'use_for_transition', 'gamma_k', 'gamma_b', 'I_Bas', 'I_Ery', 'I_GMP-like', 'I_HSC', 'I_MEP-like', 'I_Meg', 'I_Mon', 'I_Neu', 'I_all', 'scHopfield_used', 'sigmoid_exponent', 'sigmoid_mse', 'sigmoid_offset', 'sigmoid_threshold'
    uns: 'PCs', 'VecFld_pca', 'VecFld_umap', 'X_umap_neighbors', 'cell_phase_genes', 'cell_type_colors', 'dynamics', 'explained_variance_ratio_', 'feature_selection', 'grid_velocity_pca', 'grid_velocity_umap', 'grid_velocity_umap_ori_perturbation', 'grid_velocity_umap_test', 'jacobian_pca', 'leiden', 'neighbors', 'pca_mean', 'pp', 'response', 'scHopfield'
    obsm: 'X', 'X_pca', 'X_pca_SparseVFC', 'X_umap', 'X_umap_SparseVFC', 'X_umap_ori_perturbation', 'X_umap_test', 'acceleration_pca', 'acceleration_umap', 'cell_cycle_scores', 'curvature_pca', 'curvature_umap', 'j_delta_x_perturbation', 'velocity_pca', 'velocity_pca_SparseVFC', 'velocity_umap', 'velocity_umap_SparseVFC', 'velocity_umap_ori_perturbation', 'velocity_umap_test'
    layers: 'M_n', 'M_nn', 'M_t', 'M_tn', 'M_tt', 'X_new', 'X_total', 'velocity_alpha_minus_gamma_s', 'sigmoid'
    obsp: 'X_umap_connectivities', 'X_umap_distances', 'connectivities', 'cosine_transition_matrix', 'distances', 'fp_transition_rate', 'moments_con', 'pca_ddhodge', 'perturbation_transition_matrix', 'umap_ddhodge'
    varp: 'W_Bas', 'W_Ery', 'W_GMP-like', 'W_HSC', 'W_MEP-like', 'W_Meg', 'W_Mon', 'W_Neu', 'W_all'
[5]:
# Retrieve cell-type colours from the dataset (or define your own dict)
# colors = {ct: f'C{i}' for i, ct in enumerate(CELL_TYPE_ORDER)}
colors = dict(zip(CELL_TYPE_ORDER, adata.uns['cell_type_colors']))

# All four energy components
sch.pl.plot_energy_boxplots(
    adata,
    cluster_key=CLUSTER_KEY,
    order=CELL_TYPE_ORDER,
    colors=colors
)
plt.show()

/home/bernaljp/packages/scHopfield/scHopfield/plotting/energy.py:204: UserWarning: FixedFormatter should only be used together with FixedLocator
  ax.set_xticklabels(ax.get_xticklabels(), rotation=45, ha='right')
/home/bernaljp/packages/scHopfield/scHopfield/plotting/energy.py:204: UserWarning: FixedFormatter should only be used together with FixedLocator
  ax.set_xticklabels(ax.get_xticklabels(), rotation=45, ha='right')
/home/bernaljp/packages/scHopfield/scHopfield/plotting/energy.py:204: UserWarning: FixedFormatter should only be used together with FixedLocator
  ax.set_xticklabels(ax.get_xticklabels(), rotation=45, ha='right')
/home/bernaljp/packages/scHopfield/scHopfield/plotting/energy.py:204: UserWarning: FixedFormatter should only be used together with FixedLocator
  ax.set_xticklabels(ax.get_xticklabels(), rotation=45, ha='right')
../_images/notebooks_02_energy_analysis_10_1.png
[6]:
# Total energy only
sch.pl.plot_energy_boxplots(
    adata,
    cluster_key=CLUSTER_KEY,
    plot_energy='total',
    order=CELL_TYPE_ORDER,
    colors=[colors[k] for k in CELL_TYPE_ORDER]
)
plt.show()

/home/bernaljp/packages/scHopfield/scHopfield/plotting/energy.py:204: UserWarning: FixedFormatter should only be used together with FixedLocator
  ax.set_xticklabels(ax.get_xticklabels(), rotation=45, ha='right')
../_images/notebooks_02_energy_analysis_11_1.png

2.3 Energy on UMAP

[7]:
# Scatter plots of each energy component overlaid on UMAP embedding
sch.pl.plot_energy_scatters(
    adata,
    cluster_key=CLUSTER_KEY,
    basis='umap',
    show_legend=True,
    colors=colors,
)
plt.show()

../_images/notebooks_02_energy_analysis_13_0.png
[8]:
# Interaction energy only
sch.pl.plot_energy_scatters(
    adata,
    cluster_key=CLUSTER_KEY,
    plot_energy='interaction',
    colors=colors,
)
plt.show()

../_images/notebooks_02_energy_analysis_14_0.png

2.4 Energy–Gene Correlations

Identifies which genes drive energy differences across cell types.

[9]:
sch.tl.energy_gene_correlation(
    adata,
    spliced_key=SPLICED_KEY,
    cluster_key=CLUSTER_KEY
)

# Tabulate top correlated genes
df_correlations = sch.tl.get_correlation_table(
    adata,
    cluster_key=CLUSTER_KEY,
    energy_type='total',
    n_top_genes=100,
    order=CELL_TYPE_ORDER
)
df_correlations.head(10)

/home/bernaljp/miniconda3/envs/SCH/lib/python3.11/site-packages/numpy/lib/function_base.py:2897: RuntimeWarning: invalid value encountered in divide
  c /= stddev[:, None]
/home/bernaljp/miniconda3/envs/SCH/lib/python3.11/site-packages/numpy/lib/function_base.py:2898: RuntimeWarning: invalid value encountered in divide
  c /= stddev[None, :]
/home/bernaljp/miniconda3/envs/SCH/lib/python3.11/site-packages/numpy/lib/function_base.py:2897: RuntimeWarning: invalid value encountered in divide
  c /= stddev[:, None]
/home/bernaljp/miniconda3/envs/SCH/lib/python3.11/site-packages/numpy/lib/function_base.py:2898: RuntimeWarning: invalid value encountered in divide
  c /= stddev[None, :]
/home/bernaljp/miniconda3/envs/SCH/lib/python3.11/site-packages/numpy/lib/function_base.py:2897: RuntimeWarning: invalid value encountered in divide
  c /= stddev[:, None]
/home/bernaljp/miniconda3/envs/SCH/lib/python3.11/site-packages/numpy/lib/function_base.py:2898: RuntimeWarning: invalid value encountered in divide
  c /= stddev[None, :]
/home/bernaljp/miniconda3/envs/SCH/lib/python3.11/site-packages/numpy/lib/function_base.py:2897: RuntimeWarning: invalid value encountered in divide
  c /= stddev[:, None]
/home/bernaljp/miniconda3/envs/SCH/lib/python3.11/site-packages/numpy/lib/function_base.py:2898: RuntimeWarning: invalid value encountered in divide
  c /= stddev[None, :]
[9]:
Meg Ery MEP-like HSC GMP-like Mon Bas Neu
Gene Correlation Gene Correlation Gene Correlation Gene Correlation Gene Correlation Gene Correlation Gene Correlation Gene Correlation
0 FBL 0.807304 FABP5 0.665676 PRSS57 0.315137 CCDC137 0.482047 CITED2 0.434898 ELF1 0.649138 EXO1 0.807626 FBL 0.900663
1 PHB2 0.778194 CD33 0.650139 IL17RB 0.307605 NOP9 0.436700 ITGA2B 0.412077 HERC2P9 0.563224 FABP5 0.742007 ERG 0.882317
2 ZNRF1 0.775508 RASSF2 0.641904 CHD3 0.284941 FUCA2 0.425116 FCER1A 0.400928 TOP2A 0.536342 PHB2 0.725082 ACSS1 0.868937
3 RPL18A 0.748102 PALM2AKAP2 0.618501 RAB20 0.280315 SOD2 0.425081 HDC 0.399843 KANTR 0.532998 RPL18A 0.711273 GFI1 0.866927
4 HMGB3 0.744607 IL2RG 0.611712 PEX6 0.279536 PTCD1 0.420303 AL157895.1 0.384206 STAT3 0.531445 ARV1 0.693583 KLHDC2 0.861760
5 RPL35 0.731323 SATB1 0.597738 RASSF2 0.272160 COTL1 0.393799 CPA3 0.371276 ASPM 0.526317 MFSD2B 0.679917 POLR1C 0.840239
6 RPUSD4 0.723665 MBOAT7 0.560575 SRGN 0.269023 PSEN1 0.387846 DDIT4 0.370603 HERC2P3 0.515384 DDX41 0.675017 AQP3 0.826446
7 RECQL4 0.713247 DBN1 0.556944 USE1 0.268064 RECQL4 0.387596 GATA2 0.369129 CD34 0.507963 FBL 0.672615 ASPM 0.824554
8 EIF3K 0.710026 AC244502.1 0.550963 CEBPA 0.266089 MFSD2B 0.384711 HPGDS 0.366681 CASP7 0.504359 GPI 0.672221 ERCC6 0.823418
9 HPGDS 0.703534 LPCAT2 0.549897 CEACAM1 0.259691 DDX41 0.379453 ZFPM1 0.366653 RDX 0.500153 APRT 0.667679 INKA1 0.822812
[10]:
# Pairwise scatter plots (all combinations of cell-type pairs)
sch.pl.plot_correlations_grid(
    adata,
    cluster_key=CLUSTER_KEY,
    energy='total',
    order=CELL_TYPE_ORDER,
    colors=colors,
    x_low=-0.4, x_high=0.4,
    y_low=-0.4, y_high=0.4
)
plt.show()

../_images/notebooks_02_energy_analysis_17_0.png
[11]:
# Highlight specific cell-type pairs
fig, ax = plt.subplots(1, 3, figsize=(18, 6), tight_layout=True)

sch.pl.plot_gene_correlation_scatter(
    adata, 'Meg', 'Ery',
    cluster_key=CLUSTER_KEY, energy='total',
    annotate=6, ax=ax[0],
    clus1_low=-0.4, clus1_high=0.4,
    clus2_low=-0.4, clus2_high=0.4
)
sch.pl.plot_gene_correlation_scatter(
    adata, 'Neu', 'Bas',
    cluster_key=CLUSTER_KEY, energy='total',
    annotate=6, ax=ax[1]
)
sch.pl.plot_gene_correlation_scatter(
    adata, 'Neu', 'Mon',
    cluster_key=CLUSTER_KEY, energy='total',
    annotate=6, ax=ax[2]
)
plt.show()

../_images/notebooks_02_energy_analysis_18_0.png

2.5 Corner Gene Identification

“Corner genes” have high-magnitude correlation with one cell type and opposing

(or low) correlation with another — these are candidate lineage-specific

regulatory genes.

[12]:
import itertools
genes_mask  = sch._utils.io.get_genes_used(adata)
gene_names  = adata.var.index[genes_mask]

# Build per-cluster correlation arrays
correlation = {}
for cluster in CELL_TYPE_ORDER:
    col = f'correlation_total_{cluster}'
    if col in adata.var.columns:
        correlation[cluster] = adata.var[col].values[genes_mask]

# Thresholds for "corner" classification
clus1_low  = -0.4
clus1_high =  0.4
clus2_low  = -0.4
clus2_high =  0.4
nn = 5   # top-n genes per pair

corner_genes = np.array([])

for corr1, corr2 in itertools.combinations(CELL_TYPE_ORDER, 2):
    if corr1 not in correlation or corr2 not in correlation:
        continue

    c1 = correlation[corr1]
    c2 = correlation[corr2]

    mask_corner = np.logical_or(
        np.logical_and(c1 >= clus1_high, c2 <= clus2_low),
        np.logical_and(c1 <= clus1_low,  c2 >= clus2_high)
    )

    idxs = np.where(mask_corner)[0]
    top  = np.argsort(c1[idxs] ** 2 + c2[idxs] ** 2)[-nn:]
    corner_genes = np.concatenate((corner_genes, gene_names[idxs[top]]))

corner_genes = np.unique(corner_genes)
print(f"Found {len(corner_genes)} corner genes:")
print(corner_genes)

Found 54 corner genes:
['ACSS1' 'AHNAK' 'ARL6IP5' 'ASPM' 'AURKA' 'AURKAIP1' 'CENPE' 'CITED2'
 'CNPY3' 'CORO1A' 'COTL1' 'CPA3' 'CYBA' 'E2F4' 'EIF3K' 'FUCA2' 'GATA2'
 'GCSAML' 'GFI1' 'HDC' 'HEMGN' 'HERC5' 'HLA-DMA' 'HLA-DPB1' 'HLA-DQB1'
 'HLA-DRB6' 'HPGDS' 'IL18R1' 'IL2RG' 'ITGA2B' 'KLF1' 'LMO2' 'LPCAT2'
 'LTBP1' 'NKG7' 'PF4' 'PRICKLE1' 'RAB27B' 'RABGGTA' 'RASSF2' 'RPL35'
 'RPS19' 'RPS21' 'SEMA7A' 'SLC1A4' 'SNCA' 'SOD2' 'STON2' 'SUN2' 'TAP1'
 'TMEM273' 'TOP2A' 'TPI1P1' 'ZNF263']
[13]:
# Visualise corner gene correlation table
df_corr_corners = pd.DataFrame.from_dict(correlation, orient='columns')
df_corr_corners.index = gene_names
df_corr_corners = df_corr_corners.loc[corner_genes]
df_corr_corners.round(3)

[13]:
Meg Ery MEP-like HSC GMP-like Mon Bas Neu
ACSS1 0.060 -0.467 -0.204 0.066 -0.304 -0.172 -0.109 0.869
AHNAK 0.553 0.461 0.199 0.076 0.077 -0.282 -0.917 -0.715
ARL6IP5 -0.787 0.108 0.236 -0.213 0.037 0.427 -0.398 -0.858
ASPM 0.504 0.024 -0.071 -0.603 0.142 0.526 -0.211 0.825
AURKA 0.417 -0.009 -0.038 -0.529 0.079 0.316 0.100 -0.411
AURKAIP1 0.622 -0.169 -0.110 -0.099 0.095 -0.733 -0.517 -0.563
CENPE 0.361 0.040 -0.100 -0.605 0.081 0.463 -0.168 0.410
CITED2 -0.396 -0.062 -0.013 0.089 0.435 0.187 -0.408 -0.041
CNPY3 0.275 0.206 0.015 0.289 -0.127 -0.633 -0.486 0.734
CORO1A 0.401 0.545 0.218 -0.072 -0.128 -0.674 -0.319 -0.059
COTL1 -0.800 0.444 0.174 0.394 0.123 -0.159 0.055 -0.850
CPA3 0.352 0.501 0.221 0.249 0.371 0.387 -0.932 0.042
CYBA -0.418 0.510 0.159 -0.194 0.018 -0.579 -0.103 -0.816
E2F4 -0.123 -0.203 -0.127 -0.033 -0.437 -0.058 -0.460 0.723
EIF3K 0.710 -0.266 -0.116 0.011 -0.136 -0.684 0.214 0.183
FUCA2 0.186 -0.078 0.115 0.425 0.279 -0.075 -0.417 0.642
GATA2 -0.083 0.359 0.124 0.103 0.369 0.422 -0.905 0.714
GCSAML -0.576 -0.241 -0.028 0.120 0.055 0.403 -0.874 0.694
GFI1 0.278 0.408 0.023 -0.498 -0.188 -0.170 -0.314 0.867
HDC 0.440 0.549 0.194 0.365 0.400 0.241 -0.818 0.802
HEMGN -0.178 -0.568 -0.086 -0.281 -0.015 0.425 0.574 0.032
HERC5 0.115 0.451 -0.029 -0.527 -0.126 -0.059 -0.362 0.648
HLA-DMA 0.489 0.441 0.081 -0.016 -0.031 -0.185 0.446 -0.887
HLA-DPB1 0.420 0.273 -0.038 -0.326 -0.406 -0.040 0.621 -0.909
HLA-DQB1 0.299 0.482 -0.069 -0.151 -0.068 -0.091 0.323 -0.878
HLA-DRB6 0.535 0.530 0.122 -0.230 0.138 0.413 0.625 -0.877
HPGDS 0.704 0.365 0.184 0.158 0.367 0.196 -0.948 0.524
IL18R1 -0.640 0.444 0.235 -0.456 0.023 0.111 -0.884 0.499
IL2RG 0.540 0.612 0.073 -0.038 0.024 -0.644 -0.569 -0.417
ITGA2B -0.876 -0.423 -0.028 0.099 0.412 0.231 -0.476 0.713
KLF1 0.517 -0.583 -0.339 0.093 0.215 0.116 0.337 0.269
LMO2 0.180 -0.663 -0.238 -0.520 -0.211 0.333 0.418 -0.839
LPCAT2 0.656 0.550 0.173 0.082 0.144 0.178 -0.909 0.691
LTBP1 -0.926 -0.485 -0.036 -0.034 -0.216 -0.158 -0.463 0.627
NKG7 0.000 0.338 0.024 0.217 -0.310 -0.676 -0.272 0.801
PF4 -0.929 -0.394 0.099 0.031 0.220 0.007 -0.593 0.607
PRICKLE1 -0.857 -0.423 -0.122 -0.171 -0.019 -0.098 -0.407 0.739
RAB27B -0.833 -0.361 -0.094 -0.248 -0.193 0.136 -0.865 0.796
RABGGTA 0.486 -0.411 -0.280 -0.525 -0.222 -0.073 -0.034 0.378
RASSF2 0.132 0.642 0.272 0.077 0.033 0.200 -0.805 -0.644
RPL35 0.731 -0.036 0.101 0.243 0.094 -0.659 0.627 0.305
RPS19 0.580 -0.109 0.004 -0.019 -0.152 -0.709 0.561 0.650
RPS21 0.694 -0.121 -0.155 0.109 -0.232 -0.701 0.558 0.281
SEMA7A 0.579 0.436 0.193 0.300 0.271 0.197 -0.923 0.285
SLC1A4 0.424 0.445 0.011 -0.140 -0.322 -0.497 0.040 -0.795
SNCA -0.919 -0.658 -0.174 -0.077 0.298 0.116 0.618 -0.411
SOD2 -0.745 -0.084 -0.052 0.425 0.049 0.051 0.641 -0.755
STON2 -0.869 -0.236 -0.055 0.239 0.160 0.411 -0.536 0.605
SUN2 -0.195 0.410 0.010 -0.576 -0.106 -0.371 -0.618 0.602
TAP1 -0.715 0.496 0.059 0.197 -0.131 -0.255 0.239 0.087
TMEM273 0.466 0.478 0.172 -0.119 0.054 0.258 -0.928 -0.469
TOP2A 0.426 0.256 -0.113 -0.579 0.162 0.536 -0.152 0.816
TPI1P1 -0.726 0.412 0.085 0.041 -0.233 -0.050 0.180 0.631
ZNF263 0.110 -0.465 -0.259 0.105 -0.412 -0.806 -0.578 0.455