Detecting Monotone Association in an Unspecified Subpopulation
Speaker: Joe Verducci (Ohio State University)
Date: Thu 30th April 2009
Location: Statistics Seminar Room, Library building
Motivation: The US National Cancer Institute maintains a database of 60 cancer cell-lines (NCI-60) representing nine different tissues of origin. Enormous stores of information about each cell-line may be compared to find similar behaviour across subsets of this panel. Here we focus on finding associations between gene expression and resistance to anti-cancer drugs. Previous studies have found some expected and some surprising associations in subpopulations, e.g., associations over estrogen-regulated cell-lines and associations over less differentiated cell-lines. The proposed methods would greatly reduce the effort required to detect such associations.
Methods: We use a sequence of Kendall's tau statistics, constructed according to an optimal/admissible ordering, to form a path; the path's crossing of pre-specified boundaries provides a statistical test for association in an unspecified subpopulation. For each cell-line, we also provide a posterior probability that it is included in the association. Finally, we apply the test to convex combinations of two predictors (e.g., pairs of gene expressions) and produce diagnostics that suggest asymmetric roles in the type of association each predictor provides.
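Illustration: The idea of a path of Kendall's tau statistics over nested subsets can be sketched in a few lines of Python. This is only a rough sketch under stated assumptions, not the speaker's procedure: the abstract does not specify the optimal/admissible ordering, so the function concordance_order below is a hypothetical stand-in that ranks observations by their net concordance with all others. The boundary-crossing test and the posterior inclusion probabilities are not reproduced here. All function names are invented for this example.

```python
from itertools import combinations


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / total pairs."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    s = sum(
        sign(x[i] - x[j]) * sign(y[i] - y[j])
        for i, j in combinations(range(n), 2)
    )
    return s / (n * (n - 1) / 2)


def concordance_order(x, y):
    """ASSUMPTION: a simple heuristic ordering, not the ordering from the talk.

    Each observation is scored by (concordant - discordant) pairs it forms
    with every other observation; the most concordant observations come first.
    """
    n = len(x)
    score = [
        sum(sign(x[i] - x[j]) * sign(y[i] - y[j]) for j in range(n) if j != i)
        for i in range(n)
    ]
    return sorted(range(n), key=lambda i: -score[i])


def tau_path(x, y, order):
    """Kendall's tau on the first k observations of `order`, for k = 2..n."""
    path = []
    for k in range(2, len(order) + 1):
        idx = order[:k]
        path.append(kendall_tau([x[i] for i in idx], [y[i] for i in idx]))
    return path


if __name__ == "__main__":
    # Toy data (made up): the last two observations break the monotone trend.
    x = [1, 2, 3, 4, 5, 6]
    y = [1, 2, 3, 4, 6, 5]
    order = concordance_order(x, y)
    for k, t in enumerate(tau_path(x, y, order), start=2):
        print(f"first {k} observations: tau = {t:.3f}")
```

Under this heuristic, a path that stays near 1 (or -1) for the early subsets and then decays toward the whole-population tau is the kind of pattern that hints at a strongly associated subset; deciding whether it is significant would need boundaries such as those described in the talk.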
Results: Compared with Kendall's classical test for association over the whole population, the tau-path test has substantially greater power in situations where there is a strong association over a subset. This comes at the small cost of slightly less power when there is a weak association over the whole population. The tau-path method confirms a strong negative correlation between ASNS expression level and L-asparaginase potency over the leukemia and ovarian subsets of the NCI-60; antisense technology has previously shown a causal link in the ovarian cell-lines, but only in estrogen-sensitive cell-lines. The tau-path method also found significant negative compound-gene associations between all five quassinoid drugs under test and the IGFBP6 gene, with a common subset of 36 cells for all compounds. We hope to verify this association through laboratory experiments.
Extensions: The method is also currently being used to detect which of the 210 designated market areas of the US demonstrate the most sensitivity to incentives to repurchase products.
(This talk is part of the Statistics and Actuarial Science series.)