scallops.reads.merge_sbs_phenotype

scallops.reads.merge_sbs_phenotype(df_labels, df_phenotype, df_barcode, sbs_cycles, how='outer')

Combine sequencing and phenotype tables with one row per label.

The sequencing and phenotype tables must share the same index (e.g., both generated from the same segmentation).

The barcode table is then joined on its barcode column to the most abundant (barcode_0) and second-most abundant (barcode_1) barcodes of each label. The barcode prefix used for the join is determined by sbs_cycles. Duplicate prefixes are dropped from the barcode table before joining (e.g., when too few cycles were sequenced to disambiguate two barcodes).
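The prefix join described above can be sketched in plain pandas. This is a hedged illustration under stated assumptions, not the library's code: the helper barcode_prefix, the gene column, and the example barcodes are hypothetical; cycle c is assumed to correspond to base position c of the barcode; and every ambiguous prefix is assumed to be discarded entirely rather than keeping one copy.

```python
import pandas as pd


def barcode_prefix(barcode, cycles):
    """Keep the bases read at the given 1-based cycles (hypothetical helper)."""
    return "".join(barcode[c - 1] for c in cycles)


sbs_cycles = [1, 2, 3]

# Hypothetical barcode library; the first two barcodes share the prefix "ACG".
df_barcode = pd.DataFrame(
    {
        "barcode": ["ACGTA", "ACGTC", "TTGAC", "GGCAT"],
        "gene": ["A", "B", "C", "D"],
    }
)
df_barcode["prefix"] = df_barcode["barcode"].map(
    lambda b: barcode_prefix(b, sbs_cycles)
)

# Assumption: ambiguous prefixes are removed entirely (keep=False).
lookup = df_barcode.drop_duplicates("prefix", keep=False).set_index("prefix")

# Hypothetical merged table with the two most abundant barcodes per label.
df_merged = pd.DataFrame(
    {"barcode_0": ["TTG", "ACG"], "barcode_1": ["GGC", "AAA"]},
    index=pd.Index([1, 2], name="label"),
)

# Join the barcode annotations once per barcode column.
for col, suffix in (("barcode_0", "_0"), ("barcode_1", "_1")):
    df_merged = df_merged.join(lookup[["gene"]].add_suffix(suffix), on=col)

print(df_merged)
```

In this sketch, label 2 receives no annotation for barcode_0 because "ACG" matches two library barcodes and was removed from the lookup table.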

Parameters:
  • df_labels (DataFrame | DataFrame) – Data frame containing SBS reads

  • df_phenotype (DataFrame | DataFrame) – Data frame with phenotype calls

  • df_barcode (DataFrame) – Barcode information data frame

  • sbs_cycles (Sequence[int]) – List of cycles used (starting at 1)

  • how (Literal['left', 'right', 'inner', 'outer', 'cross']) – How to merge (default: 'outer')

Returns:

Combined table

Return type:

DataFrame | DataFrame