|
| |
|
|
| |
 |
Seminar schedule for Semester 1, 2009/2010.
|
| |
| |
Semester 2, 2009/2010 |
| |
Hybrid Monte-Carlo: A little-known MCMC algorithm |
| |
Speaker: Alex Beskos (University College London) |
3pm, Thursday February 18th |
| Room 550B, Library Building, UCD |
| |
We are interested in the behavior of MCMC algorithms in high dimensions. Results in the literature have shown that the 'vanilla' Random Walk Metropolis scales as 1/n, with n being the dimension of the state space. The so-called Metropolis-adjusted Langevin algorithm scales as 1/n^{1/3}, as it uses information about the gradient of the target density. We consider an MCMC algorithm popular amongst physicists (but not statisticians): the Hybrid Monte-Carlo (HMC) algorithm. HMC scales as 1/n^{1/4}. In connection with related results for other MCMC algorithms, for a simple class of target distributions we identify a single asymptotically optimal acceptance probability for HMC (which is 0.651, to three decimal places), irrespective of the particular choice of target distribution. |
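For readers unfamiliar with HMC, the following is a minimal sketch of the algorithm rather than the speaker's own setup. It assumes a standard normal target (so the gradient of the potential is simply x), a leapfrog integrator, and arbitrary illustrative values for the step size and the number of leapfrog steps; the function name `hmc_standard_normal` is hypothetical.

```python
import math
import random


def hmc_standard_normal(n_dim, n_iter, step, n_leapfrog, seed=0):
    """Hybrid Monte Carlo for a standard normal target in n_dim dimensions.

    Assumed potential U(x) = 0.5 * |x|^2, so grad U(x) = x.
    Returns (samples, acceptance_rate).
    """
    rng = random.Random(seed)
    x = [0.0] * n_dim
    samples = []
    accepted = 0
    for _ in range(n_iter):
        # Draw a fresh Gaussian momentum for every proposal.
        p = [rng.gauss(0.0, 1.0) for _ in range(n_dim)]
        h_old = 0.5 * sum(v * v for v in x) + 0.5 * sum(v * v for v in p)

        # Leapfrog integration: half momentum step, alternating full
        # position and momentum steps, then a final half momentum step.
        x_new = list(x)
        p_new = [pi - 0.5 * step * xi for pi, xi in zip(p, x_new)]
        for i in range(n_leapfrog):
            x_new = [xi + step * pi for xi, pi in zip(x_new, p_new)]
            if i < n_leapfrog - 1:
                p_new = [pi - step * xi for pi, xi in zip(p_new, x_new)]
        p_new = [pi - 0.5 * step * xi for pi, xi in zip(p_new, x_new)]

        h_new = 0.5 * sum(v * v for v in x_new) + 0.5 * sum(v * v for v in p_new)

        # Metropolis accept/reject based on the change in total energy.
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
            accepted += 1
        samples.append(list(x))
    return samples, accepted / n_iter


# Illustrative run; the tuning values here are arbitrary, not optimal.
samples, rate = hmc_standard_normal(n_dim=10, n_iter=500, step=0.3, n_leapfrog=5)
print("acceptance rate:", rate)
```

In this sketch, rerunning with a larger `step` lowers the acceptance rate, which is the trade-off that the optimal acceptance probability in the abstract addresses.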
|
|
| |
Prediction from Gene Expression Data using Iterative Bayesian Model Averaging. |
| |
Speaker: Kevin Hayes (University of Limerick) |
3pm, Thursday January 21st |
| Room 550B, Library Building, UCD |
|
|
| |
Spatial Cluster Detection Using the Number of Connected Components of a Graph |
| |
Speaker: Avner Bar-Hen (Universite Paris Descartes) |
3pm, Thursday January 14th |
| Room 550B, Library Building, UCD |
| |
The aim of this work is to detect spatial clusters based on the number of connected components of a graph. We link the Erdős graph and the Poisson point process. We give the probability distribution function (pdf) of the number of connected components for an Erdős graph and obtain the pdf of the number of clusters for a Poisson process. We also obtain the pdf of the number of clusters bigger than a given threshold for a Poisson process. We extend these results to marked point processes by using a multi-class Erdős graph. Using this result, we obtain a test for complete spatial randomness (CSR) and also identify the clusters that violate the CSR hypothesis. Border effects are computed. We illustrate our results on a tropical forest example.
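The statistic at the centre of this abstract, the number of connected components of a graph, can be computed with a union-find structure. The sketch below is only an illustration and makes an assumption: it reads "Erdős graph" as a random graph in which each pair of n vertices is joined independently with probability p. The helper names `count_components` and `random_graph_edges` are hypothetical.

```python
import random


def count_components(n, edges):
    """Count connected components of a graph on vertices 0..n-1 via union-find."""
    parent = list(range(n))

    def find(a):
        # Follow parent links to the root, halving the path as we go.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    components = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1  # merging two components removes one
    return components


def random_graph_edges(n, p, rng):
    """Edges of a random graph: each pair is joined independently with probability p (assumed model)."""
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]


# Illustrative simulation of the component count for an arbitrary n and p.
rng = random.Random(2010)
counts = [count_components(50, random_graph_edges(50, 0.03, rng)) for _ in range(200)]
print("mean number of components:", sum(counts) / len(counts))
```

Repeating the simulation many times gives an empirical distribution of the component count, which could be compared against an analytical pdf such as the one described in the talk.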
|
|
|
| |
|
| |
Semester 1, 2009/2010 |
| |
Bayesian non-parametric modelling for quantile regression |
| |
Speaker: Milovan Krnjajic (NUIG) |
3pm, Thursday November 19th |
| Room 550B, Library Building, UCD |
| |
Abstract: Quantile regression may be used to analyze higher or lower quantiles of the response distribution when they exhibit a different structural relationship to the covariates than the average responses. We develop Bayesian semiparametric models based on Dirichlet process (DP) mixture priors for the error distribution, and dependent DP priors for error densities that change with covariates. We also study a fully nonparametric model using Gaussian process priors for the quantile regression function and a DP mixture prior for the error distribution. The stochastic prior specifications enable uncovering of non-linearity in the quantile regression function and non-standard features in the response distribution. Inference is based on a combination of posterior simulation methods for Dirichlet process mixtures. The proposed models are illustrated using simulated and real data sets. |
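As background to the abstract above, classical (non-Bayesian) quantile regression rests on the check loss, whose minimiser is the requested quantile. The sketch below shows only that building block for an intercept-only model on made-up data; it is not the Bayesian nonparametric method of the talk, and the function names are hypothetical.

```python
def check_loss(u, tau):
    """Check (pinball) loss: tau * u for u >= 0, (tau - 1) * u for u < 0."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def sample_quantile_by_check_loss(ys, tau):
    """Return the data point q that minimises the total check loss over ys."""
    return min(ys, key=lambda q: sum(check_loss(y - q, tau) for y in ys))


# Made-up responses, for illustration only.
responses = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(sample_quantile_by_check_loss(responses, 0.5))
print(sample_quantile_by_check_loss(responses, 0.9))
```

Replacing the constant q with a function of the covariates gives the usual quantile regression objective.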
|
|
| |
Mortality modelling |
| |
Speaker: Mary Hall (UCD) |
3pm, Friday November 6th |
| Room 550B, Library Building, UCD |
| |
Abstract: Modelling of past mortality trends and projecting future mortality rates is an important area of actuarial research. In this presentation two statistical methods – generalised additive and generalised linear mixed models – are considered for modelling Irish mortality data. Generalised additive models are used to project future mortality rates and are compared with two methods recently reviewed by the UK Actuarial Profession. The models are applied to Irish population data (from the Central Statistics Office) and mortality rates projected to approximately 2050. The generalised linear mixed model is used to model the impact of pension amount on Irish pensioner mortality. Mortality rates amongst pensioners are known to vary by pension amount and the generalised linear mixed model is compared with traditional actuarial approaches for analysing the impact of pension amount on mortality. |
|
|
| |
Semester 2 Seminars 2008/2009 |
| |
Detecting Monotone Association in an Unspecified Subpopulation |
| |
Speaker: Joe Verducci (Ohio State University) |
3pm, April 30th |
| Room 550B, Library Building, UCD |
| |
Motivation: The US National Cancer Institute maintains a database of 60 cancer cell-lines (NCI-60) representing nine different tissues of origin. Enormous stores of information about each cell-line may be compared to find like behaviour across subsets of this panel. Here we focus on finding associations between gene expression and resistance to anti-cancer drugs. Previous studies have found some expected and some surprising associations in subpopulations, e.g., associations over estrogen-regulated cell-lines and association over less differentiated cell-lines. The proposed methods would greatly reduce the amount of effort required to detect such associations.
Methods: We use a sequence of Kendall's tau statistics, constructed according to an optimal/admissible ordering, to form a path, whose crossing over pre-specified boundaries provides a statistical test for association in an unspecified subpopulation. For each cell-line, we also provide a posterior probability that it is included in the association. Finally, we apply the test to convex combinations of two predictors (e.g., pairs of gene expressions) and produce diagnostics that suggest asymmetric roles in the type of association each predictor provides.
Results: Compared with Kendall's classical test for association over the whole population, the tau-path test has substantially greater power in situations where there is a strong association over a subset. This comes at a small cost of having slightly less power when there is a weak association over the whole population. The tau-path method confirms a strong, negative correlation between ASNS expression level and L-asparaginase potency over the leukemia and ovarian subsets of the NCI-60; antisense technology has previously shown a causal link in the ovarian cell-lines, but only in estrogen sensitive cell-lines. The tau-path method also found significant negative compound-gene associations between all five quassinoid drugs under test and IGFBP6 gene, with a common subset of 36 cells for all compounds. We hope to verify this association through laboratory experiments.
Extensions: The method is also currently being used to detect which of the 210 designated market areas of the US demonstrate the most sensitivity to incentives to repurchase products. |
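The building block of the tau-path test is Kendall's tau. The sketch below computes the plain tau-a statistic on made-up data in which a strong monotone association holds only over a subset; it does not implement the ordering or boundary-crossing steps of the tau-path method, and the variable names and values are purely illustrative.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant pairs - discordant pairs) / total pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    score = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            score += 1   # concordant pair
        elif s < 0:
            score -= 1   # discordant pair (ties contribute 0)
    n_pairs = len(x) * (len(x) - 1) // 2
    return score / n_pairs


# Made-up data: a perfectly decreasing relationship over the first four
# observations only, with no such pattern in the remaining four.
expression = [1, 2, 3, 4, 5, 6, 7, 8]
resistance = [4, 3, 2, 1, 6, 8, 5, 7]

print("tau over the whole sample:", kendall_tau(expression, resistance))
print("tau over the first four:  ", kendall_tau(expression[:4], resistance[:4]))
```

The whole-sample statistic masks the association present in the subset, which is the situation the abstract says the classical test handles poorly.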
|
|
| |
There will be a statistics mini-colloquium on "Network models in the health sciences" held on Friday, April 24th in room L532 of the James Joyce library. Details are as follows:
|
| 11 - 11.30 |
Speaker: Jim McCann (Queens University, Belfast) |
| Title: Stochastic models of MRSA infection |
|
| 11.30 - 12 |
Speaker: Adrian Barnett (Queensland University of Technology) |
| Title: Estimating the effect of MRSA infection on length of stay using a longitudinal model |
|
| 12 - 12.30 |
Speaker: Helen McAneney (Queens University, Belfast) |
| Title: The role of Networks within Public Health |
| |
|
|
| |
Inferring ecologically recent migration rates from multilocus genotype data: one use of a mixture model |
| |
Speaker: Jon Yearsley, School of Biology and Environmental Science, UCD |
3pm, January 15th |
| Room 550B, Library Building, UCD |
| |
Dispersal is a process which mixes individuals between local populations. It is important in many areas of ecology, but often difficult to observe directly. We have been looking at the use of individual genotype data for inferring rates of recent dispersal between pairs of populations. We have been concentrating upon using Bayesian mixture models as a novel approach to analysing this kind of data. I’ll briefly describe why dispersal is important, the problems of inferring dispersal, the advantages of using genotype data over, say, mark-recapture data, and some of the current approaches that use genotype data. I’ll then describe our approach, its relationship with other mixing problems, and its performance on simulated data of greater white-toothed shrew populations. The model performance suggests new avenues of research in estimating selection within the genome and estimation of dispersal kernel parameters. |
|
|
| |
A Parental Care Game |
| |
Speaker: David Ramsey, University of Limerick |
3pm, February 5th |
| Room 550B, Library Building, UCD |
| |
In the 1970s Maynard Smith considered a simple parental game, in which parents simply decided whether to care for or desert their offspring. In his simplest model, the expected payoff (number of surviving offspring) of both parents depended simply on their decisions. However, he was well aware of the drawbacks involved in this model. For example, on average a male deserter will be able to father more broods than a male carer, but the expected gains from deserting will depend on the strategies used by the population as a whole (the strategy profile). Among other factors, the larger the proportion of male deserters, the more competition he will face in obtaining extra partners. The talk will start by considering the system of parental care used in St. Peter’s fish, which highlights many of the game theoretic aspects of such problems. A simple model will then be presented which takes into account the importance of the profile of strategies used by the population in determining the success of a particular strategy. Conditions for various systems of parental care (no care, uniparental care and biparental care) to be evolutionarily stable are given. It will also be shown that there can be such equilibria in which, for example, females always care, but males use both strategies (even when there is no difference between individuals). The talk will end by discussing ways of developing such models. |
|
|
| |
Bayesian inference for the p* model |
| |
Speaker: Alberto Caimo |
3pm, March 25th |
| Room 550B, Library Building, UCD |
|
|
| |
|
| |
Semester 1 Seminars 2008/2009 |
| |
Modelling Irish Mortality at the Highest Ages |
| |
Speaker: Dr. Shane Whelan (University College Dublin) |
3pm, September 25th |
| Room 550B, Library Building, UCD |
| |
We discuss the problems of estimating mortality at ages above 80 years, taking Irish data as a case study. We demonstrate that the official Irish Life tables have systematically overstated mortality since the 1950s with the bias tending to increase with increasing age. Using several models we re-estimate mortality rates at advanced ages, basing the crude rates on the method of extinct generations. We show that the Kannisto version of Perks’s logistic curve appears best. We conclude by outlining an extension to the method of extinct generations that appears more natural than the survivor ratio method. |
|
|
| |
Mixture Model Component Trees: Visualizing the Hierarchical Structure of Complex Groups |
|
Speaker: Dr. Nema Dean (University of Glasgow) |
3pm, October 16th |
| Room 550B, Library Building, UCD |
| |
Cluster analysis is the search for unknown group structure in data. In general within cluster analysis we may wish to find more than one type of group shape. Model-based clustering is a parametric method for clustering based on mixture models which is gaining in popularity over earlier algorithmic methods. For continuous data, the most common form is a mixture of (multivariate) Gaussians (possibly with a single uniform noise component to pick up outliers). The popularity of model-based clustering is due to the fact that model choice methods can be used to make automatic decisions about selecting the number of components that best describes the data, unlike, for example, k-means, which requires k to be specified in advance. The frequently made assumption is that the groups or subpopulations making up the overall population (modelled by the mixture density) are modelled/estimated by individual components in the mixture. If the groups in the data are not from distributions with elliptical contours (with no skewness or heavy tails), this assumption is likely false. For more complex group shapes, the number of components is an overestimate of the number of groups. However, the mixture model will still usually give a good density estimate of the data. In order to represent the hierarchical structure of the mixture components (where a combination/sub-mixture of components can be deemed a cluster), the density estimate from the mixture is used to create a (dis)similarity measure between the components, which is then used as input to an agglomerative hierarchical clustering algorithm, the result of which can be viewed as a dendrogram: the component tree. Similarly, heat-maps of the similarity matrix with ordering based on the hierarchical clustering can be useful in examining the structure and similarity of components and underlying groups.
In addition to the component tree, the density-based dendrogram of individual observations can be helpful in visualizing group information in high-dimensional data. |
|
|
| |
| |
| |
Speaker: Dr. Luc Bijnens (Johnson & Johnson) |
4pm, Friday November 7th |
| Room 550B, Library Building, UCD |
| |
(Title to be confirmed) |
|
|
| |
| |
| |
Speaker: Dr. Adele Marshall (Queens University Belfast) |
3pm, November 13th |
| Room 550B, Library Building, UCD |
| |
(Title to be confirmed) |
|
|
| |
| |
| |
Speaker: Prof. Julian Besag (University of Bath) |
3pm, December 4th |
| Room 550B, Library Building, UCD |
| |
(Title to be confirmed) |
|
| |
|
|
|