regdiffusion.data.load_beeline

regdiffusion.data.load_beeline(data_dir='data', benchmark_data='hESC', benchmark_setting='500_STRING')

Load BEELINE data and its ground truth (download if necessary).

Paper: Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data

Paper link: https://www.nature.com/articles/s41592-019-0690-6

BEELINE consists of 7 single-cell datasets (hESC, hHep, mDC, mESC, mHSC, mHSC-GM, and mHSC-L) and 3 sets of ground truth networks (STRING, Non-ChIP, ChIP-seq).

Parameters:
  • data_dir (str) – Parent directory to save and load the data. If the path does not exist, it will be created. Data will be saved in a subdirectory under the provided path.

  • benchmark_data (str) – Benchmark dataset. Choose among "hESC", "hHep", "mDC", "mESC", "mHSC", "mHSC-GM", and "mHSC-L".

  • benchmark_setting (str) – Benchmark setting. Choose among "500_STRING", "1000_STRING", "500_Non-ChIP", "1000_Non-ChIP", "500_ChIP-seq", "1000_ChIP-seq", "500_lofgof", and "1000_lofgof". If either of the "lofgof" settings is chosen, only "mESC" data is available.
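The constraints on benchmark_data and benchmark_setting can be checked before any download is attempted. The following is a minimal sketch, not part of regdiffusion: the helper name `check_beeline_args` and the two constants are hypothetical, and the allowed values are copied from the parameter descriptions on this page.

```python
# Allowed values, copied from the parameter descriptions on this page.
BENCHMARK_DATA = ("hESC", "hHep", "mDC", "mESC", "mHSC", "mHSC-GM", "mHSC-L")
BENCHMARK_SETTINGS = (
    "500_STRING", "1000_STRING",
    "500_Non-ChIP", "1000_Non-ChIP",
    "500_ChIP-seq", "1000_ChIP-seq",
    "500_lofgof", "1000_lofgof",
)


def check_beeline_args(benchmark_data="hESC", benchmark_setting="500_STRING"):
    """Hypothetical helper (not a regdiffusion function).

    Raises ValueError if the combination is not one of the documented choices;
    otherwise returns the pair unchanged.
    """
    if benchmark_data not in BENCHMARK_DATA:
        raise ValueError(f"Unknown benchmark_data: {benchmark_data!r}")
    if benchmark_setting not in BENCHMARK_SETTINGS:
        raise ValueError(f"Unknown benchmark_setting: {benchmark_setting!r}")
    # The "lofgof" ground truth is documented as available for mESC only.
    if benchmark_setting.endswith("lofgof") and benchmark_data != "mESC":
        raise ValueError('"lofgof" settings are only available for "mESC" data')
    return benchmark_data, benchmark_setting
```

The validated pair can then be passed as the benchmark_data and benchmark_setting arguments of load_beeline.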

Returns:

A tuple containing two objects for a single BEELINE benchmark. The first element is a scanpy AnnData object with cells on rows and genes on columns. The second element is a NumPy array holding the adjacency list of the ground truth network.

Return type:

tuple
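A typical call might look like the sketch below. It assumes regdiffusion is installed and that the benchmark files can be downloaded; the wrapper name `load_default_benchmark` and the `summary` dictionary are illustrative and not part of the library.

```python
def load_default_benchmark(data_dir="data"):
    """Sketch: load the hESC / 500_STRING benchmark and summarize its size."""
    # Imported lazily so this function can be defined without regdiffusion installed.
    from regdiffusion.data import load_beeline

    # The call downloads the data into a subdirectory of data_dir if necessary.
    adata, ground_truth = load_beeline(
        data_dir=data_dir,
        benchmark_data="hESC",
        benchmark_setting="500_STRING",
    )

    # AnnData keeps cells on rows (n_obs) and genes on columns (n_vars).
    summary = {
        "n_cells": adata.n_obs,
        "n_genes": adata.n_vars,
        "ground_truth_shape": ground_truth.shape,
    }
    return adata, ground_truth, summary
```

Calling `load_default_benchmark()` returns the two documented objects plus the summary. No expected sizes are shown here because they depend on the downloaded benchmark.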