Detectability

We’re looking at detectability this week in Environmental Monitoring & Audit. Here are some relevant links:

1. First, check out Guru and Jose’s video explaining why detectability is important in species distribution models (there are also some bloopers).

2. Then we have Georgia’s post about setting minimum survey effort requirements to detect a species at a site.

3. Another by Georgia about her trait-based model of detection.

4. And finally, a paper showing that Georgia’s time to detection model can efficiently estimate detectability.

And if you want more about detectability, check out a few posts of mine.

Some statistics to get started

The subject Environmental Monitoring and Audit starts today. We’ll be delving into some statistics, so my introductory chapter on statistical inference for an upcoming book might be useful.

And we’ll be using R, so if you need a quick introduction, check out Liz Martin’s blog.

Edit: And if you want some more information about double sampling (from Angus’ lecture today), please read this blog post.

How many surveys to demonstrate absence?

In the lecture today in Environmental Monitoring and Audit, I mentioned the model examining how much search effort is required to be sufficiently sure of the absence of a species at a site. This was based on a paper by Brendan Wintle et al. (2012).

You can read more about this topic here, with an attempt at an intuitive interpretation of the model, and some links to other examples where the prior probability (base rate) matters.

If you are particularly keen, you can read a copy of the manuscript here.
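
The heart of the model is just Bayes’ rule applied to a run of surveys in which the species is not detected. Here is a minimal sketch in R of that kind of calculation (not the code from the paper); the prior probability of presence and the per-survey detectability are made-up numbers for illustration:

# Sketch of the Bayes-rule calculation behind this kind of survey-effort
# question (not the code from the paper); the numbers are illustrative only.

psi <- 0.5   # prior probability that the species is present at the site
p <- 0.3     # probability of detecting the species in one survey, given presence

# posterior probability of presence after n surveys with no detections
posterior_presence <- function(n, psi, p) {
  psi * (1 - p)^n / (psi * (1 - p)^n + (1 - psi))
}

# number of non-detection surveys needed before the posterior probability of
# presence drops below some threshold alpha
surveys_needed <- function(alpha, psi, p) {
  ceiling(log(alpha * (1 - psi) / (psi * (1 - alpha))) / log(1 - p))
}

posterior_presence(1:10, psi, p)  # the posterior declines as surveys accumulate
surveys_needed(0.05, psi, p)      # effort required to be "95% sure" of absence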

Reference

Wintle, B.A., Walshe, T.V., Parris, K.M. and McCarthy, M.A. (2012). Designing occupancy surveys and interpreting non-detection when observations are imperfect. Diversity and Distributions 18: 417-424.

Is the temperature rising?

In 2009, Senators Penny Wong and Steven Fielding debated whether the Earth’s temperature was increasing. Senator Wong was the Minister for Climate Change at the time, and Senator Fielding sat on the crossbenches in the Senate; the government was hoping for his support for action on climate change.

Senator Fielding’s position was essentially that, based on data from the previous 10 years, the Earth was no longer warming. Senator Wong argued that, using data over a few decades, the Earth was clearly still warming.

The question about whether the Earth’s temperature is increasing is rather simple – one would hope that data might help answer it. In a previous post on my research site, I looked at how much data would be required to answer this question with confidence. This analysis suggested that more than 15 years of data would be required to be sure that the rate of temperature increase had slowed, even when the observed trend was flat. This was because the temperature record is noisy – measured global temperatures fluctuate.
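
To get a feel for why so many years are needed, here is a rough simulation in R (not the analysis from that earlier post): it generates a truly flat series of annual anomalies with noise of roughly the size seen in the global record, and asks how precisely the trend can be estimated from a given number of years. The noise level and the warming rate mentioned in the comments are assumptions chosen only for illustration.

# Rough illustration (not the analysis from the earlier post): with noisy
# annual data, how precisely can the trend be estimated from n years?
set.seed(1)

trend_ci_width <- function(n_years, sd_noise = 0.1, n_sims = 1000) {
  widths <- replicate(n_sims, {
    year <- 1:n_years
    temp <- rnorm(n_years, mean = 0, sd = sd_noise)  # a truly flat series
    fit <- lm(temp ~ year)
    diff(confint(fit)["year", ])  # width of the 95% CI on the annual trend
  })
  mean(widths)
}

trend_ci_width(10)  # average CI width (degrees C per year) from 10 years of data
trend_ci_width(20)  # and from 20 years
# With only about 10 years, the interval is wide enough to include both a flat
# trend and a continuing warming trend of roughly 0.015-0.02 degrees C per year.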

Here we are, a few years later – where does the evidence lie? Well, let’s examine the relationship between global temperature and atmospheric concentrations of CO2. This might be the world’s most mindless climate model, but let’s assume that temperature is related linearly to CO2 concentration. Using HadCRUT3 data and CO2 measured at Mauna Loa (data downloaded in July 2013), the relationship looks like this:

Annual average global temperature anomaly as measured by HadCRUT3 (relative to the average for the period 1961-1990) versus the annual average carbon dioxide concentration measured at Mauna Loa. Data are shown for the period up to 2008, which were the data available at the time of the debate between Senators Fielding and Wong.

We can characterize Senator Wong’s position as there being a linear relationship between temperature and CO2 that will continue in the future. Thus, we can fit a linear regression to the data up to 2008, predict an interval in which we would expect the data to fall (e.g., a 95% prediction interval), and check for departure from that. Here is that prediction and the subsequent data:

A linear regression of the global temperature anomaly as measured by HadCRUT3 versus the annual average carbon dioxide concentration. The regression was fitted to data up to 2008, and the data for the next 4 years are shown in blue. These data points fall within the 95% prediction interval, which is given by the dashed line.

Code for fitting this linear regression in OpenBUGS is here.
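
For those who would rather work in R than OpenBUGS, the same idea can be sketched with lm() and predict(); the data frame and column names below (climate, year, co2, anomaly) are placeholders rather than the actual files used here:

# Sketch in R of the same idea (the post's code is in OpenBUGS). 'climate' is a
# placeholder data frame with columns year, co2 and anomaly.

pre  <- subset(climate, year <= 2008)  # data available at the time of the debate
post <- subset(climate, year > 2008)   # the subsequent observations

fit <- lm(anomaly ~ co2, data = pre)   # a single linear trend with CO2

# 95% prediction interval for new observations at the later CO2 concentrations
pred <- predict(fit, newdata = post, interval = "prediction", level = 0.95)

# Do the later years fall inside the interval predicted from the earlier data?
cbind(post["year"], pred,
      inside = post$anomaly >= pred[, "lwr"] & post$anomaly <= pred[, "upr"])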

The data remain within the bounds predicted, even if they are slightly on the low side of the trend. But we cannot be sure that the temperature has departed from its previous increasing trend.

Now, let’s characterize Senator Fielding’s position as there having been an increase in temperature with CO2 that plateaued after 1998. We fit a linear regression to the data up to 1998, and a flat line after 1998, with the two lines joining in 1998. Note that in this case I excluded the observed temperature for 1998, because doing so reduces the influence of cherry picking. Code for fitting this relationship in OpenBUGS is here.
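
Again, here is a rough R sketch of this piecewise relationship; it is a simple least-squares analogue of the model rather than the OpenBUGS version, and it uses the same placeholder data frame as above:

# Sketch in R of the piecewise relationship: linear in CO2 up to its 1998
# concentration, flat thereafter. A least-squares analogue of the OpenBUGS
# model, using the same placeholder data frame 'climate' as above.

co2_1998 <- climate$co2[climate$year == 1998]

# Fit to data up to 2008, excluding 1998 itself to reduce the effect of
# cherry picking the change point.
fit_data <- subset(climate, year <= 2008 & year != 1998)

# Capping CO2 at its 1998 value makes the fitted line horizontal beyond 1998.
fit2 <- lm(anomaly ~ pmin(co2, co2_1998), data = fit_data)

# Prediction interval for the years after 2008 under this plateau model
post <- subset(climate, year > 2008)
predict(fit2, newdata = post, interval = "prediction", level = 0.95)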

A regression of the global temperature anomaly as measured by HadCRUT3 versus the annual average carbon dioxide concentration using data up to 2008. The linear relationship up to 1998 was estimated, while the trend was assumed to be flat after that year. To reduce the influence of cherry picking the change point as 1998 (the year with the highest temperature), this year was excluded from the analysis. The data for the next 4 years are shown in blue. These data points fall within the 95% prediction interval, which is given by the dashed line.

The data remain within the bounds predicted, even if they are slightly on the high side of the prediction. But we cannot be sure that the temperature has departed from a flat relationship since 1998.

What does all this mean? Well, as in my previous post, it shows that many years (e.g., >15 years) of temperature data (or very large and clear changes in temperature) are required to distinguish between these two points of view. It is unfortunate that the number of years of data required is considerably longer than the election cycle; while politicians might debate the issue, definitive evidence to support or refute their positions will not arise within typical political lifetimes.

It is also unfortunate that the data used in this debate represent only a small fraction of the heat content of the planet. Thus, while the heat content of the planet seems to be rising consistently, the surface temperature, which is measured by HadCRUT3 and experienced by most people, is noisy. Because the temperature data are equivocal over timescales of a decade or so, the other evidence about increases in the heat content of the Earth (e.g., melting of ice caps, warming of the sub-surface of the ocean) becomes more important, but is less tangible to people.

Further debate about this topic also includes whether it is plausible that temperatures might increase with CO2 over some periods and not others. Such debates become much more subtle than simple measures of the temperature at the Earth’s surface.

Yet the noisy data (and subtle changes relative to that noise) cloud the debate about whether the Earth’s temperature is continuing to increase, let alone the debate about appropriate responses to human-caused climate change.

The most dangerous equation

The equation for the standard error is pretty basic, right? It is:

se = \dfrac{\sigma}{\sqrt{n}},

where n is the sample size and \sigma is the standard deviation of the data.

It measures potential error in an estimate that has been obtained from simple random sampling.

The equation was first published by the French mathematician Abraham de Moivre in 1730, yet Howard Wainer describes it as “the most dangerous equation”. Why? Not because it is dangerous to use, but because ignorance of it causes huge waste. Check out the five examples he provides here.
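
A quick simulation in R illustrates the point behind those examples: means of small samples are far more variable than means of large samples, exactly as de Moivre’s equation predicts (the numbers below are arbitrary):

# Illustration of de Moivre's equation: means of small samples are much more
# variable than means of large samples, in line with se = sigma / sqrt(n).
set.seed(1)

sigma <- 10
sim_se <- function(n, n_sims = 10000) {
  sd(replicate(n_sims, mean(rnorm(n, mean = 0, sd = sigma))))
}

# Simulated standard errors of the mean versus the formula
data.frame(n = c(5, 50, 500),
           simulated = sapply(c(5, 50, 500), sim_se),
           formula = sigma / sqrt(c(5, 50, 500)))
# Small groups (e.g., small schools or sparsely populated counties) will tend
# to appear at both the best and worst ends of any ranking simply by chance.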

Learn to wield de Moivre’s equation to avoid the dangers of ignoring it.