Rebalancing single-cell pool with preliminary sequencing run

An interactive note inspired by Bret Victor's work on reactive documents and his notes on the problem of climate change.

Code heavily borrowed from Tangle and the notes on the problem of climate change.

Nikolay Markov, October 2020

Single-cell libraries are often too small to be sequenced alone on a high-output flowcell. Doing so would give far more sequencing depth than needed, i.e. high sequencing saturation, and waste reads.

To overcome this, multiple libraries are loaded on one high-output flowcell, so that the reads are distributed between the libraries and each library reaches the desired sequencing depth.

To gain more control over the resulting sequencing depth, we can use a low-output flowcell run to estimate library quality and composition, and then adjust the proportion of each library in the pool.

Example

We have 4 libraries. We sequenced them on a low-output flowcell and got the following results:

	Number of cells	Proportion of reads	Number of reads	Mean reads per cell

You can change the value of green numbers by dragging them left or right and see how the resulting calculations are affected.
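
The derived columns of the table follow from the two measured ones. The sketch below assumes that the proportion of reads is a library's reads divided by the total reads of the run, and that mean reads per cell is reads divided by cells. The library names and all counts are hypothetical placeholders, not results from an actual run.

```python
# Hypothetical preliminary-run results: names and counts are made up
# for illustration and do not come from a real sequencing run.
libraries = {
    "lib_A": {"cells": 5000, "reads": 2_000_000},
    "lib_B": {"cells": 2000, "reads": 1_000_000},
    "lib_C": {"cells": 8000, "reads": 4_000_000},
    "lib_D": {"cells": 1000, "reads": 1_000_000},
}


def summarize(libraries):
    """Add the derived table columns to the measured cells and reads."""
    total_reads = sum(lib["reads"] for lib in libraries.values())
    summary = {}
    for name, lib in libraries.items():
        summary[name] = {
            "cells": lib["cells"],
            # Share of the whole run that this library received.
            "proportion_of_reads": lib["reads"] / total_reads,
            "reads": lib["reads"],
            # Sequencing depth of the library in the preliminary run.
            "mean_reads_per_cell": lib["reads"] / lib["cells"],
        }
    return summary


if __name__ == "__main__":
    for name, row in summarize(libraries).items():
        print(name, row)
```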

Loading factor

After the preliminary run, we compute loading factors for the libraries. In Illumina report this is done either by or by sequencing .

= max() /
= max() /
=

You can select how the is computed and see how the resulting calculations are affected.
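
The formulas above share the form max(x) / x. A minimal sketch of that form, assuming x is some per-library metric from the preliminary run; which metric to use is left as a parameter here, and the function name and example values are hypothetical.

```python
def loading_factors(metric):
    """Return max(metric) / metric for every library.

    `metric` maps a library name to a positive number from the
    preliminary run. The library with the largest value gets a factor
    of 1, every other library gets a factor above 1.
    """
    if not metric:
        raise ValueError("metric must contain at least one library")
    if any(value <= 0 for value in metric.values()):
        raise ValueError("metric values must be positive")
    top = max(metric.values())
    return {name: top / value for name, value in metric.items()}


# Hypothetical mean reads per cell from a preliminary run.
depth = {"lib_A": 400.0, "lib_B": 500.0, "lib_C": 500.0, "lib_D": 1000.0}
factors = loading_factors(depth)

if __name__ == "__main__":
    for name, factor in factors.items():
        print(name, factor)
```

With this form, the library that scored highest on the chosen metric keeps its loading, and the others are scaled up relative to it.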

Result

Assuming that the loading factor linearly scales the number of reads each library gets, we can look at the projected distribution of the 2 key metrics: the new number of reads and the new depth of each library.

	Number of cells	Loading factor	New reads	New depth
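
The projection can be sketched under the linear assumption stated above: new reads are the preliminary reads multiplied by the loading factor, and new depth is new reads divided by the number of cells. All inputs below are hypothetical, and the sketch does not model the total output of the high-output flowcell, so only the relative distribution between libraries is meaningful.

```python
def project(cells, reads, factors):
    """Project reads and depth per library after rebalancing.

    Assumes the loading factor scales a library's reads linearly.
    All three arguments map a library name to a number.
    """
    projection = {}
    for name in cells:
        new_reads = reads[name] * factors[name]
        projection[name] = {
            "cells": cells[name],
            "loading_factor": factors[name],
            "new_reads": new_reads,
            "new_depth": new_reads / cells[name],
        }
    return projection


# Hypothetical inputs: made-up counts and factors for illustration only.
cells = {"lib_A": 5000, "lib_B": 2000, "lib_C": 8000, "lib_D": 1000}
reads = {"lib_A": 2_000_000, "lib_B": 1_000_000,
         "lib_C": 4_000_000, "lib_D": 1_000_000}
factors = {"lib_A": 2.5, "lib_B": 2.0, "lib_C": 2.0, "lib_D": 1.0}

projection = project(cells, reads, factors)

if __name__ == "__main__":
    for name, row in projection.items():
        print(name, row)
```

In this made-up example the factors were derived from mean reads per cell, so the projected depth comes out equal across the four libraries.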