allstat@JISCMAIL.AC.UK


Subject: Introduction to Mixed (Hierarchical) models for life sciences using R (IMLS01)
From: Oliver Hooker <[log in to unmask]>
Reply-To: Oliver Hooker <[log in to unmask]>
Date: Tue, 15 May 2018 21:37:56 +0100
Content-Type: text/plain

Introduction to Mixed (Hierarchical) models for life sciences using R (IMLS01) 

https://www.psstatistics.com/course/introduction-to-mixed-hierarchical-models-for-life-sciences-using-r-imls01/

This course will be delivered by Prof Subhash Lele in Glasgow City Centre from the 24th - 28th September 2018.

Course Overview:
Mixed models, also known as hierarchical models and multilevel models, are a useful class of models for many applied sciences, including psychology, ecology and evolution. The goal of this course is to give a thorough introduction to the logic, theory and, most importantly, implementation of these models to solve practical problems in psychology and ecology. Participants are not expected to know mathematics beyond basic algebra and calculus. Participants are expected to know some R programming and to be familiar with linear and generalized linear regression. We will be using JAGS (Just Another Gibbs Sampler) for Markov Chain Monte Carlo (MCMC) simulations for analyzing mixed models. The course will be conducted so that participants have substantial hands-on experience and will address both frequentist and Bayesian approaches.

Intended Audience
Research postgraduates, practicing academics and principal investigators in psychology, ecology and evolutionary biology, as well as management and professionals in government and industry.

Course Programme
Monday 24th – Classes from 09:30 to 17:30
Linear and Generalized linear models

To understand mixed models, the most important first step is to thoroughly understand the linear and generalized linear models. Also, when conducting the data analysis, it is useful to fit a simpler fixed effects model before trying to fit a more complex mixed effects model. Hence, we will start with a very detailed review of these models. We are assuming that the participants are familiar with these models and hence we will emphasize some important, but not commonly covered, topics. 

This will also give us an opportunity to unify the notation, review the basic R commands and fill in any gaps in knowledge and understanding of these topics.

1. We will show the use of non-parametric exploratory techniques such as classification and regression trees (CART) for learning about important covariates and possible non-linearities in the relationships.

2. We will emphasize graphical and simulation based methods (e.g. Gelman and Hill, 2006) to understand and explore the implications of the fitted model.

3. We will discuss graphical tools such as marginal and conditional plots that are useful for conveying the results of a multiple regression model to a lay person.

4. We will emphasize the use of graphical tools to conduct regression diagnostics and to assess the appropriateness of the model.

5. We will discuss the important concepts of confounding, effect modification and interaction. These are particularly important for drawing causal, not just correlational, inferences from observational studies.

Tuesday 25th – Classes from 09:30 to 17:30
Computational inference

Many of the topics that will be covered involve the use of matrix algebra and calculus. While these mathematical techniques are essential tools for a mathematical statistician who is trying to understand the theory behind the methods, they can be avoided in practice by using simulation based techniques. Built-in functions such as 'lm' and 'glm' fit regression models using the method of maximum likelihood to estimate the parameters and conduct statistical inference. We will discuss the use of JAGS (Just Another Gibbs Sampler) and the R package 'dclone' to fit the same models. We will use a different statistical philosophy, namely Bayesian inference, to fit these models. We will show how the Bayesian approach can be tricked into giving frequentist answers using data cloning (Lele et al. 2007, Ecology Letters). We will also discuss the rudiments of frequentist and Bayesian inference, although we will not go into their pros and cons at this time. That will be covered during sessions 3 and 4 of the fifth day (and over beer afterwards).

1. What makes an inference statistical inference?

2. What do we mean by probability of an event?

3. How do we quantify uncertainty in an inferential statement in the frequentist framework?

4. How do we quantify uncertainty in an inferential statement in the Bayesian framework?

We will then discuss the simulation based methods to quantify uncertainty.

1. Parametric bootstrap to quantify frequentist uncertainty

2. Markov Chain Monte Carlo to quantify Bayesian uncertainty

3. Fitting LM and GLM using JAGS and Bayesian approach
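
As a rough illustration of item 1 above, here is a minimal sketch of a parametric bootstrap. It is written in Python using only the standard library, not the R and JAGS setup used in the course. The exponential model, the function name parametric_bootstrap_rate and all settings are assumptions chosen for illustration and are not course material. The idea is to fit the model, simulate many new data sets from the fitted model, refit each one, and read the uncertainty off the spread of the refitted estimates.

```python
import random
import statistics


def parametric_bootstrap_rate(data, n_boot=2000, seed=1):
    """Percentile interval for an exponential rate via parametric bootstrap.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    n = len(data)
    # Maximum likelihood estimate of the rate under an exponential model.
    rate_hat = 1.0 / statistics.fmean(data)
    boot_estimates = []
    for _ in range(n_boot):
        # Simulate a new data set of the same size from the fitted model ...
        simulated = [rng.expovariate(rate_hat) for _ in range(n)]
        # ... and re-estimate the parameter from it.
        boot_estimates.append(1.0 / statistics.fmean(simulated))
    boot_estimates.sort()
    lower = boot_estimates[int(0.025 * n_boot)]
    upper = boot_estimates[int(0.975 * n_boot) - 1]
    return rate_hat, lower, upper


# Made-up data: 200 draws from an exponential distribution with rate 2.
data_rng = random.Random(42)
data = [data_rng.expovariate(2.0) for _ in range(200)]

rate_hat, lower, upper = parametric_bootstrap_rate(data)
print("estimate:", rate_hat, "approximate 95% interval:", (lower, upper))
```

The interval here is a simple percentile interval; other bootstrap intervals exist and may behave better in small samples.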

Wednesday 26th – Classes from 09:00 to 17:00
Linear Mixed Models
Historically, linear mixed models arose in the study of quantitative genetics and heritability. They were successfully applied in animal breeding and led to the 'white' revolution, with an abundant milk supply for the developing world. They were also used in horse racing and other such fun areas. The other context in which linear mixed effects models were developed was growth curve analysis. We will follow this historical trajectory of mixed models, paying tribute to the great statisticians R. A. Fisher, C. R. Rao and Jerzy Neyman, and study linear mixed models first. The questions they tried to answer included deciding the genetic value of a sire and/or a dam, studying heritability of traits and studying co-evolution of traits.

These can be answered provided we assume that the sires and dams in our experiment or sample are merely a sample from a super-population of sires and dams. In growth curve analysis, we need to take into account that each individual is unique in its own way but is also a part of a population. How do we discuss both individual level and population level inferences? In modern times, linear mixed effects models have arisen in the context of small area estimation in survey sampling, where one is interested in making inferences about a census tract based on county or state level data. These models also arise in the context of combining remotely sensed data from different resolutions and types. The main issues that we will be discussing are:

1. What is a random effect? What is a fixed effect? How do we decide if an effect is random or fixed?

2. How do we modify a linear regression model to accommodate random effects?

3. Why bother fitting a mixed effects model? What do we gain?

4. How do we modify the JAGS linear model program to fit a linear mixed effects model?

5. What is the difference between a Bayesian and a frequentist inference?

6. What is a prior? What is a non-informative prior?

7. How do we interpret the results of a linear mixed effects model fit? Graphical and simulation based methods

8. How do we do model selection with mixed effects models?

9. How do we do model diagnostics in mixed effects models?

10. Parameter identifiability issues in linear mixed models

As we discuss these applications, we will discuss some subtle computational issues involved in using MCMC. In my recollection (which may be biased as it has been about 25 years since the quote), Daryl Pregibon said: MCMC is the crack cocaine of modern statistics; it is addictive, seductive and destructive. Hence, it is important for a practitioner to understand these issues in order not to misuse the MCMC technique.

1. What is a Markov Chain Monte Carlo method? Why is it necessary for mixed models?

2. What are the subtleties in implementing MCMC? Convergence of the algorithm, mixing of the chains.

3. Pros and cons of using MCMC
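
To make the questions above a little more concrete, here is a minimal sketch of a random-walk Metropolis sampler, one simple MCMC method. It is written in Python using only the standard library and is not the JAGS code used in the course. The model (a normal mean with known standard deviation and a wide normal prior), the function names and the tuning values are all assumptions made for illustration. Two chains are started from very different values; after a burn-in period their averages should agree closely if the chains have converged and are mixing well.

```python
import math
import random
import statistics


def log_posterior(mu, data, sigma=1.0, prior_sd=10.0):
    """Log posterior (up to a constant) for a normal mean with known sigma."""
    log_prior = -0.5 * (mu / prior_sd) ** 2
    log_lik = -0.5 * sum(((y - mu) / sigma) ** 2 for y in data)
    return log_prior + log_lik


def metropolis(data, start, n_iter=5000, step=0.3, seed=0):
    """Random-walk Metropolis chain for mu (hypothetical helper)."""
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current, data)
    chain = []
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


# Made-up data: 50 observations from a normal distribution with mean 3.
data_rng = random.Random(123)
data = [data_rng.gauss(3.0, 1.0) for _ in range(50)]

burn_in = 1000  # discard early iterations that still depend on the start value
chain_a = metropolis(data, start=-5.0, seed=1)[burn_in:]
chain_b = metropolis(data, start=10.0, seed=2)[burn_in:]
mean_a = statistics.fmean(chain_a)
mean_b = statistics.fmean(chain_b)
print("posterior mean, chain A:", mean_a)
print("posterior mean, chain B:", mean_b)
```

Comparing chain averages is only a crude check. In practice one would also look at trace plots and formal convergence diagnostics.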

Thursday 27th – Classes from 09:00 to 17:00
Generalised Linear Mixed Models
We will again start the discussion of GLMM in its historical context. One of the initial uses of mixed models was in the context of overdispersion in count data. Zero inflated count data was another important example. The example that drove the current revolution in the use of GLMM was in the context of spatial epidemiology. Clayton and Kaldor (1987, Biometrics) showed that one can use spatial correlation to improve the prediction in mapping disease rates. This was also an example of the application of Empirical Bayes methods that allow one to pool information from different spatial areas (or studies, or scales, and so on).

1. Zero inflated data: In many practical situations, we observe that there are many locations where there are zero counts, far in excess of what would be expected under the Poisson regression model. This can be effectively modelled using a mixed model framework. The mixed models framework allows us to use much more complex and realistic models.

2. Overdispersion in GLM, Spatial GLM, Spatio-temporal GLM: The Poisson regression model assumes that the mean and variance are equal. This is often not true in practice. Generally the variance in the data exceeds the mean. One can show that such overdispersion can be modelled using a mixed effects model. These models also arise in the context of capture-recapture sampling where capture probabilities vary across space or time or individuals.

3. Longitudinal or panel data with discrete response variable: Many times we have data on different individuals where within the individual there is temporal dependence but individuals are independent of each other. Cluster sampling is another situation where we have dependence within a cluster but independence between clusters. The analysis of such data needs to take into account the innate variation between individuals before one can discuss the effect of interesting covariates or risk factors. Such data are effectively modelled as GLMM.

4. Measurement error, missing data: Missing data and measurement error are ubiquitous in ecological studies. Mixed models provide a convenient way to take these difficulties into account and make inferences about the underlying processes of interest. We will discuss these issues in the context of Population Viability Analysis, Spatial population dynamics and source-sink analysis, Occupancy and abundance surveys. These also arise in the usual linear and generalized linear models if the covariates are measured with error.

5. Additional topics depending on the interest of the participants. These may include, for example, discussion of Species Distribution Models, Resource Selection Functions and Animal movement models.

6. Computational issues: Advanced topics.
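
The overdispersion point in item 2 above can be checked with a small simulation. This is a minimal sketch in Python using only the standard library, not the course's R or JAGS code. The gamma-distributed random effect, the parameter values and the helper names poisson_draw and dispersion are assumptions chosen for illustration. Counts drawn with a fixed Poisson mean should have a variance-to-mean ratio close to 1, while counts whose mean varies from unit to unit (a random effect) should show a ratio well above 1.

```python
import math
import random
import statistics


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def dispersion(counts):
    """Sample variance divided by sample mean."""
    return statistics.variance(counts) / statistics.fmean(counts)


rng = random.Random(2018)
n = 20000

# Fixed mean of 4 for every unit: the plain Poisson model.
plain_counts = [poisson_draw(rng, 4.0) for _ in range(n)]

# Unit-specific means drawn from a gamma distribution (shape 2, scale 2),
# so the average mean is still 4 but it now varies between units.
mixed_counts = [poisson_draw(rng, rng.gammavariate(2.0, 2.0)) for _ in range(n)]

plain_ratio = dispersion(plain_counts)
mixed_ratio = dispersion(mixed_counts)
print("variance/mean, plain Poisson:", plain_ratio)
print("variance/mean, with random effect:", mixed_ratio)
```

With these assumed settings the theoretical ratio for the mixed counts is 3, because the variance of the counts is the average mean (4) plus the variance of the unit-specific means (8).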

Friday 28th – Classes from 09:30 to 16:00
Mixed Models in a Bayesian Framework
MCMC is not the only approach to analysing mixed models. We will briefly discuss Laplace approximation based techniques (INLA, in particular) along with approximate techniques such as Composite likelihood and Approximate Bayesian Computation. Because of its mathematical nature, this discussion will be somewhat limited, only giving the basics and hinting at the important issues.
7. Philosophical issues: Sophie’s choice

1. What are the philosophical problems with using the frequentist quantification of uncertainty?

2. What are the philosophical problems with using the Bayesian quantification of uncertainty?

3. Sophie’s choice?

If you have any questions please email [log in to unmask]

Oliver Hooker PhD.
PS statistics

Introduction to Bayesian hierarchical modelling using R (IBHM02) 
https://www.psstatistics.com/course/introduction-to-bayesian-hierarchical-modelling-using-r-ibhm02/

Behavioural data analysis using maximum likelihood in R (BDML01)
https://www.psstatistics.com/course/behavioural-data-analysis-using-maximum-likelihood-bdml01/

Social Network Analysis for Behavioural Scientists using R (SNAR01)
https://www.psstatistics.com/course/social-network-analysis-for-behavioral-scientists-snar01/
 
Introduction to statistical modelling for psychologists in R (IPSY01)
https://www.psstatistics.com/course/introduction-to-statistics-using-r-for-psychologists-ipsy01/
 
Introduction to Mixed (Hierarchical) models for life sciences using R (IMLS01)
https://www.psstatistics.com/course/introduction-to-mixed-hierarchical-models-for-life-sciences-using-r-imls01/
 
Statistical modelling of time-to-event data using survival analysis: an introduction for animal behaviourists, ecologists and evolutionary biologists (TTED01)
https://www.psstatistics.com/course/statistical-modelling-of-time-to-event-data-using-survival-analysis-tted01/

