Rebalancing single-cell pool with preliminary sequencing run
An interactive note inspired by Bret Victor's work on reactive documents and his notes on the problem of climate change.
Code heavily borrowed from Tangle and the notes on the problem of climate change.
Nikolay Markov, October 2020
Single-cell libraries are often too small to be sequenced alone on a high-output flowcell. Doing so would waste sequencing depth, i.e. result in high sequencing saturation.
To overcome this, multiple libraries are loaded on one high-output flowcell, which distributes the reads between the libraries so that each reaches the desired sequencing depth.
To gain more control over the resulting sequencing depth, we can use a low-output flowcell run to estimate library quality and composition, and then adjust the proportion of each library in the pool.
Example
We have 4 libraries. We sequenced them on a low-output flowcell and got the following results:
Number of cells | Proportion of reads | Number of reads | Mean reads per cell |
You can change the value of green numbers by dragging them left or right and see how the resulting calculations are affected.
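The derived columns of the table follow from two measured quantities per library: the number of cells and the number of reads. Here is a minimal Python sketch of that arithmetic. The library names and counts are hypothetical stand-ins for a preliminary run, not values from this note.

```python
# Hypothetical preliminary-run results; names and counts are illustrative only.
libraries = {
    "lib_A": {"cells": 5000, "reads": 2_000_000},
    "lib_B": {"cells": 2000, "reads": 1_000_000},
    "lib_C": {"cells": 8000, "reads": 4_000_000},
    "lib_D": {"cells": 1000, "reads": 1_000_000},
}


def summarize(libs):
    """Return proportion of reads and mean reads per cell for each library."""
    total_reads = sum(lib["reads"] for lib in libs.values())
    return {
        name: {
            "cells": lib["cells"],
            "reads": lib["reads"],
            # Share of the whole run that this library received.
            "proportion": lib["reads"] / total_reads,
            # Sequencing depth: mean reads per cell.
            "reads_per_cell": lib["reads"] / lib["cells"],
        }
        for name, lib in libs.items()
    }


summary = summarize(libraries)
for name, row in summary.items():
    print(f"{name}: {row['proportion']:.1%} of reads, "
          f"{row['reads_per_cell']:.0f} reads/cell")
```

In this made-up example, lib_D has the fewest cells but the highest depth, which is the kind of imbalance the rebalancing step is meant to correct.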
Loading factor
After the preliminary run, we compute loading factors for the libraries. In Illumina report this is done either by or by sequencing .
- = max() /
- = max() /
- =
You can select how the loading factor is computed and see how the resulting calculations are affected.
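The formulas above share the form max(metric) / metric. The sketch below assumes that the metric is either the number of reads per library or the sequencing depth (mean reads per cell). Under that assumption, the library with the highest value gets a factor of 1 and the others get proportionally more. The function name, the `by` argument, and the numbers are hypothetical.

```python
def loading_factors(reads, cells, by="depth"):
    """Loading factor per library, assumed to be max(metric) / metric.

    by="reads": the metric is the number of reads per library.
    by="depth": the metric is mean reads per cell (reads / cells).
    """
    if by == "reads":
        metric = [float(r) for r in reads]
    elif by == "depth":
        metric = [r / c for r, c in zip(reads, cells)]
    else:
        raise ValueError("by must be 'reads' or 'depth'")
    top = max(metric)
    return [top / m for m in metric]


# Hypothetical preliminary-run counts for four libraries.
reads = [2_000_000, 1_000_000, 4_000_000, 1_000_000]
cells = [5000, 2000, 8000, 1000]

by_reads = loading_factors(reads, cells, by="reads")
by_depth = loading_factors(reads, cells, by="depth")
print(by_reads)  # [2.0, 4.0, 1.0, 4.0]
print(by_depth)  # [2.5, 2.0, 2.0, 1.0]
```

The two choices can rank libraries differently: balancing by reads ignores how many cells share those reads, while balancing by depth aims for similar coverage per cell.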
Result
Assuming that the loading factor linearly scales the number of reads each library gets, we can look at the projected distribution of the 2 key metrics: reads and depth per library.
Number of cells | Loading factor | New reads | New depth |
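The projection can be sketched as follows, under the linearity assumption stated above. Each library's share of the rebalanced pool is taken to be proportional to its preliminary reads multiplied by its loading factor, and the shares are renormalized to the total output of the high-output run. The `total_reads` parameter and all numbers are hypothetical, and the renormalization step is an assumption of this sketch rather than something stated in the note.

```python
def project(reads, cells, factors, total_reads):
    """Project reads and depth per library after rebalancing.

    Assumes a library's share of the pool scales linearly with
    (preliminary reads * loading factor), renormalized so that the
    shares add up to the flowcell's total output.
    """
    weights = [r * f for r, f in zip(reads, factors)]
    total_weight = sum(weights)
    new_reads = [total_reads * w / total_weight for w in weights]
    new_depth = [nr / c for nr, c in zip(new_reads, cells)]
    return new_reads, new_depth


# Hypothetical preliminary counts and depth-based loading factors.
reads = [2_000_000, 1_000_000, 4_000_000, 1_000_000]
cells = [5000, 2000, 8000, 1000]
factors = [2.5, 2.0, 2.0, 1.0]

new_reads, new_depth = project(reads, cells, factors, total_reads=160_000_000)
for nr, nd in zip(new_reads, new_depth):
    print(f"{nr:,.0f} reads, {nd:,.0f} reads/cell")
```

With depth-based factors, the projected depth comes out equal across libraries in this example, because each library's weight becomes proportional to its number of cells.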