Introduction to LODopt

Reuben Thomas

2025-07-16

Introduction

The LODopt package provides tools for analyzing associations between cell type abundances and experimental conditions in single-cell RNA-seq data.

Quick Start

Example Data

We simulate cell counts for 30 samples (15 per group) across 25 clusters, of which four (c0, c2, c7 and c14) differ in abundance between the two groups.

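# Load the package that provides simulate_cellCounts_fromTissue()
library(LODopt)

# Create example count matrix
set.seed(123)
# Number of clusters in the single-cell dataset
K <- 25
# Number of cells in the tissue
depth <- 1e9
# Total number of samples
nsamp <- 30
# Dirichlet concentration parameters, drawn log-uniformly between 0.5 and 10
alpha <- 10^runif(K, min = log10(0.5), max = log10(10))
# Draw baseline cluster proportions and sort them in increasing order
p <- dirmult::rdirichlet(alpha = alpha) |> sort()
# Drop very rare clusters and renormalize so the proportions sum to 1
p <- p[p > 0.001]
p <- p / sum(p)
size <- rep(10, length(p))
# Multiplicative change per cluster; 1 means no change
change_mean <- rep(1, length(p))
# Indices of the changed clusters (1-based, so they correspond to c0, c2, c7 and c14)
change_mean[c(1, 3, 8, 15)] <- c(0.2, 2, 0.2, 2)
# Simulate counts
counts_res <- simulate_cellCounts_fromTissue(props = p, nsamp = nsamp, depth = depth,
                                             size = size, change_mean = change_mean)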
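
The indices passed to change_mean are 1-based positions in p, while the clusters are reported as c0, c1, and so on. The short sketch below makes that mapping explicit and records the log of each simulated change, which is a convenient scale for comparison with log odds ratio estimates. It assumes that clusters are labelled in the same order as p and that change_mean acts as a multiplicative change; the names changed_idx, fold_change and truth are introduced here for illustration only.

```r
# 1-based positions set in change_mean, copied from the simulation above
changed_idx <- c(1, 3, 8, 15)
fold_change <- c(0.2, 2, 0.2, 2)

# Assumption: clusters are labelled c0, c1, ... in the same order as `p`
changed_clusters <- paste0("c", changed_idx - 1)

# Small lookup table of the simulated "truth" (hypothetical helper object)
truth <- data.frame(cluster_id = changed_clusters,
                    change_mean = fold_change,
                    log_change = log(fold_change))
print(truth)
```

Values of change_mean below 1 correspond to negative log changes and values above 1 to positive ones.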

Set up the required SummarizedExperiment object

library(SummarizedExperiment)

# Count matrix returned by the simulation
counts <- counts_res$counts

# Sample-level metadata: 15 samples per group, with sample IDs as row names
pheno_data <- data.frame(sampleID = paste0("S", 1:30),
                         groupid = c(rep("group0", 15), rep("group1", 15)))
pheno_data <- tibble::column_to_rownames(pheno_data, "sampleID")

model_formula <- "groupid"
cellcomp_se <- SummarizedExperiment(assays = list(counts = counts),
                                    colData = pheno_data,
                                    metadata = list(modelFormula = model_formula,
                                                    coef_of_interest_index = 2,
                                                    reference_levels_of_variables = list(c("groupid", "group0")),
                                                    random_seed = 123456,
                                                    unchanged_cluster_indices = NULL))
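
To see why coef_of_interest_index is set to 2, it can help to look at the design matrix implied by the model formula and the reference level. The sketch below uses only base R and a hypothetical stand-in for the sample metadata (pheno and design are illustrative names). It assumes that the index refers to fixed-effect coefficients in model-matrix order; how LODopt builds its design internally is not shown in this vignette.

```r
# Hypothetical stand-in for colData: 15 samples per group, as above
pheno <- data.frame(groupid = c(rep("group0", 15), rep("group1", 15)),
                    row.names = paste0("S", 1:30))

# Make group0 the reference level, mirroring reference_levels_of_variables
pheno$groupid <- relevel(factor(pheno$groupid), ref = "group0")

# Fixed-effects design implied by modelFormula = "groupid"
design <- model.matrix(~ groupid, data = pheno)

# Column 1 is the intercept; column 2 is the group1-versus-group0 term
print(colnames(design))
```

Under these assumptions the second coefficient is the group1 versus group0 contrast, so index 2 selects the comparison of interest.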

Running Association Analysis

cellcomp_res <- logodds_optimized_normFactors(cellcomp_se)
#> [1] "Running iteration no. 1"
#> boundary (singular) fit: see help('isSingular')
#> [1] "Running iteration no. 2"
#> [1] "Running iteration no. 3"
#> [1] "Found stable solution"

Summarizing Results

The estimates column contains the log odds ratio estimate for each cluster's association with groupid (group1 relative to the reference level group0). The estimates_significance column contains the raw p-value for the null hypothesis that this log odds ratio equals 0.

print(cellcomp_res$res)
#>    cluster_id    comparison   estimates estimates_significance
#> 1          c0 groupidgroup1 -1.52100359           1.079388e-17
#> 2          c1 groupidgroup1 -0.17344082           2.124728e-01
#> 3          c2 groupidgroup1  0.79391408           1.975057e-09
#> 4          c3 groupidgroup1  0.04588954           7.117915e-01
#> 5          c4 groupidgroup1 -0.04211118           6.974807e-01
#> 6          c5 groupidgroup1 -0.13292183           2.514397e-01
#> 7          c6 groupidgroup1 -0.02823263           7.608435e-01
#> 8          c7 groupidgroup1 -1.75714635           3.002863e-50
#> 9          c8 groupidgroup1  0.13039553           1.830506e-01
#> 10         c9 groupidgroup1  0.14547388           2.132688e-01
#> 11        c10 groupidgroup1 -0.09124305           5.273222e-01
#> 12        c11 groupidgroup1  0.09725929           3.803643e-01
#> 13        c12 groupidgroup1  0.02971688           8.012406e-01
#> 14        c13 groupidgroup1 -0.07937818           4.649689e-01
#> 15        c14 groupidgroup1  0.68735275           9.847319e-12
#> 16        c15 groupidgroup1 -0.04634329           7.559317e-01
#> 17        c16 groupidgroup1  0.08155154           4.830891e-01
#> 18        c17 groupidgroup1 -0.11924900           3.281300e-01
#> 19        c18 groupidgroup1 -0.14010580           2.215115e-01
#> 20        c19 groupidgroup1  0.01754712           8.798781e-01
#> 21        c20 groupidgroup1  0.09118643           4.688317e-01
#> 22        c21 groupidgroup1  0.06205020           4.739529e-01
#> 23        c22 groupidgroup1  0.10001258           4.572840e-01
#> 24        c23 groupidgroup1 -0.06205290           5.987648e-01
#> 25        c24 groupidgroup1  0.08948168           5.602581e-01
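
Because the p-values above are raw, one possible follow-up is to adjust them for multiple testing and to put the estimates on the odds ratio scale. The sketch below copies the printed values into a standalone data frame (res is a hypothetical copy, so the example runs without the package) and applies the Benjamini-Hochberg adjustment from base R. The choice of Benjamini-Hochberg and the 0.05 cutoff are assumptions made for illustration, not something the package prescribes here.

```r
# Values copied from the printed results above
res <- data.frame(
  cluster_id = paste0("c", 0:24),
  estimates = c(-1.52100359, -0.17344082, 0.79391408, 0.04588954,
                -0.04211118, -0.13292183, -0.02823263, -1.75714635,
                0.13039553, 0.14547388, -0.09124305, 0.09725929,
                0.02971688, -0.07937818, 0.68735275, -0.04634329,
                0.08155154, -0.11924900, -0.14010580, 0.01754712,
                0.09118643, 0.06205020, 0.10001258, -0.06205290,
                0.08948168),
  estimates_significance = c(1.079388e-17, 2.124728e-01, 1.975057e-09,
                             7.117915e-01, 6.974807e-01, 2.514397e-01,
                             7.608435e-01, 3.002863e-50, 1.830506e-01,
                             2.132688e-01, 5.273222e-01, 3.803643e-01,
                             8.012406e-01, 4.649689e-01, 9.847319e-12,
                             7.559317e-01, 4.830891e-01, 3.281300e-01,
                             2.215115e-01, 8.798781e-01, 4.688317e-01,
                             4.739529e-01, 4.572840e-01, 5.987648e-01,
                             5.602581e-01)
)

# Exponentiate the log odds ratios to obtain odds ratios
res$odds_ratio <- exp(res$estimates)

# Benjamini-Hochberg adjustment of the raw p-values (an illustrative choice)
res$p_adj <- p.adjust(res$estimates_significance, method = "BH")

# Clusters below an assumed 0.05 threshold on the adjusted p-values
sig <- res[res$p_adj < 0.05, c("cluster_id", "estimates", "odds_ratio", "p_adj")]
print(sig)
```

With these values, the clusters that remain below the threshold are the four that were changed in the simulation, and the signs of their estimates match the direction of the simulated change (negative for c0 and c7, positive for c2 and c14).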

Conclusion

The LODopt package provides a simple interface for testing associations between cell type abundances and experimental conditions, with support for multiple statistical methods and comprehensive result visualization.