Tuesday, November 15, 2016

Fishes of Ohio

I remember clearly reading Fishes of Ohio by Milton Trautman over 20 years ago. First published in 1957, the book is primarily a key to, and description of, the fishes of the state.

The part I remember most was the introduction. I remember it describing Ohio when settlers first arrived. Trautman summarized early records and painted a picture of lands with clear waters and an amazing abundance of fish.

In my mind, the abundance of fish was represented by a story that the boardwalks of Cleveland were built on the backs of fish.

I related that story the other day, but thought I should go back and find if I was remembering the details correctly. So, I purchased a copy, sat down, and started working through the 700+ page tome.

"The state of Ohio, situated in the midlands of the United States, is squarish in outline". It's not the most flattering beginning to a book, but it's a true representation of the state.

After this, Trautman describes historical accounts of Ohio. The waters were so clear that "early pioneers drank as readily from flowing streams as they did from springs." The abundances of fish are characterized, too. Before 1800, in the Maumee River, "A spear may be thrown into the water at random, and will rarely miss killing one!"

After 1800, things start to go downhill.

I looked through this introduction and couldn't find the line about the boardwalk.

I went into the sections on individual fish.

In my mind, the boardwalks were built on the backs of sturgeon or maybe blue pike, a subspecies of walleye.

The section on lake sturgeon describes them being so abundant that fishermen sometimes placed them in large piles and set fire to them. Nothing on boardwalks.

Blue pike? 26 million pounds caught in the 1950s, but nothing on their use as a base for sidewalks.

Google Books has been no help. Neither has Google. Nor Bing.

I don't think it was in this book.

I have no idea where I read that.

Still, it was good to read through it again. Books like these just don't get written much anymore. And certainly there are few individuals left who would spend more than 25 years making more than 2,000 collections of fish--some half million individuals identified--to help map the distribution of the fishes of a state.

Tuesday, October 4, 2016

Map of streams of US

I found this map of the US today. It shows all of the streams and rivers in the lower 48 states. You can find it here.

It seems like everywhere in the US, even dry places, has at least temporary streams. But not everywhere.

What fascinates me about the map is not that there are so many streams, but the regions where there aren't any.

That strip in the Dakotas is the Missouri Coteau. It's the western extent of the eastern glaciers. It looks something like this:

There really are no streams there. Pothole lakes and well-drained soil. But really no streams.

In the middle, you can also see the Sandhills of Nebraska. Again, almost no streams. At least on the surface. Almost all the water drains through the soil and feeds aquifers. 

You can also see the high plains of Texas to the south of there, and farther south, the coastal sand plain of Texas. Again, no streams there.

In the northwest, there are the buttes of the Snake River valley:

Again, no streams or rivers up there.

South of the Snake River Valley is the Great Salt Lake Basin. 

Down in Florida, the Everglades are prominent.

I really have no other insight about the map, except it's an interesting way to look at the geography of the US. 

Friday, September 23, 2016

Book review: Statistics Done Wrong: the Woefully Complete Guide

How to Lie with Statistics came out in 1954. It has long been considered a classic with over a half-million copies sold.

A second edition of Statistics Done Wrong might one day be a true successor to that classic. The first edition, recently released, is already a good read for any scientist.

The author, Alex Reinhart, covers some basics about statistics and then empirical cases where statistics have been used incorrectly.

It's a good book. I learned a few things while reading it and was impressed to see that important examples from recent news were included as cautionary tales. I think most scientists should spend the time to read through this. If they don't learn anything, they should at least feel good about that. My guess is that they would.

That said, the difficulty with the book is that it isn't mature yet. The section on p-values is probably the most important, but I finished it not quite sure what the author wanted to impress upon the reader. The reader is admonished to use confidence intervals instead of p-values, but it just isn't clear why. As the book matures, the author should have better examples to get his points across. In contrast, his examples for base rate fallacies were mature. They were poignant, and the reader should come away with a clear idea of how to calculate what percentage of positives are likely false.
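For concreteness, here is a minimal sketch of that base rate calculation (my own illustration with hypothetical numbers, not an example taken from the book): given the fraction of tested hypotheses that are actually true, the power of the test, and the significance threshold, what percentage of positives are likely false?

```python
# Base rate fallacy: what fraction of "significant" results are false positives?
# All the numbers below are hypothetical, chosen only for illustration.

def false_positive_fraction(base_rate, power, alpha):
    """Expected fraction of positive results that are false positives."""
    true_hits = base_rate * power          # true effects that get detected
    false_hits = (1 - base_rate) * alpha   # null effects that pass p < alpha
    return false_hits / (true_hits + false_hits)

# If only 10% of tested hypotheses are true, with 80% power at alpha = 0.05:
print(round(false_positive_fraction(base_rate=0.1, power=0.8, alpha=0.05), 2))  # 0.36
```

Under those assumptions, over a third of "discoveries" would be false, even with every test run correctly.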

Another problem with any statistics book is that statistics is too broad to be covered in any thin volume, any more than a single book could cover how to lie with language.

At any rate, Statistics Done Wrong is an admirable effort. At the very least it should set off alarm bells with researchers to ask questions about their statistics a little more deeply.

I'm looking forward to seeing what the second edition holds.

Tuesday, September 13, 2016

The trajectory of nitrogen availability

It is a simple fact that N availability is rising throughout the world, likely causing a planetary boundary to be exceeded. Considering that humans have doubled global N2 fixation, it's impossible that it hasn't.

It is also a simple fact that CO2 concentrations have been rising, which likely should be causing N to become progressively more limiting.

It is also a simple fact that no one has taken the time to comprehensively address whether N availability has been increasing or decreasing in the ecosystems of the world. There are almost no time series of direct measurements of N supplies or availability to test whether N availability is going up or down.

As a result, it is unresolved as to whether N availability is increasing or decreasing in ecosystems across the world.

Andrew Elmore and Dave Nelson (with a little help from me) report new data in the latest issue of Nature Plants that look at whether N availability is increasing or decreasing in US eastern deciduous forests.

Short answer: N availability looks to be decreasing.

Using ratios of N isotopes in wood as a proxy for N availability, Elmore et al. show that N availability has been declining in the forests they examined for some time.

That's a pretty big result.

Not only do they show this, but they also show that the declines are tied to spring phenology. Years with warmer springs have the lowest N availability.

Mechanistically, one link between phenology and N availability is that years with warmer springs have greater increases in plant demand for N than in N supplies, leading to declines in N availability.

One question that arises from this work...if N availability is declining in these forests, how sure are we that we have crossed a planetary boundary for N? Are the world's terrestrial ecosystems really eutrophying?

Elmore, A. J., D. M. Nelson, and J. M. Craine. 2016. Earlier springs are causing reduced nitrogen availability in North American eastern deciduous forests. Nat Plants 2:16133.

Monday, September 5, 2016

Study on old trees


The oldest trees in the world are often in the most stressful environments, or so it seems. Yet, until now, there had never been a quantitative attempt to assess tree longevity.

Di Filippo et al. make a first attempt at this by analyzing tree-ring data for broad-leaved deciduous trees in the Northern Hemisphere.

Given the massive impact of humans on old-growth forests, any study like this will have caveats, but the data are interesting.

For example, they report that 300-400 years is a good baseline for tree lifespan (if that concept even applies to trees).  They also report a maximum longevity of 600-700 years for deciduous trees in general.

They also show that the really old trees spent a long time growing slowly. The idea is that mortality rates increase with size, so staying small is a good way to avoid mortal blows like wind throw.

The relationship they show with maximum age for Fagus was interesting. Essentially, in warm places, the maximum age of Fagus was a lot less than in cold places. They cannot answer whether this is a direct or indirect effect, but they did not find the same relationship for Quercus species.

The authors don't believe the evidence assembled indicates a biological limitation to longevity in trees, e.g. meristems senesce after a certain amount of time.

Instead, trees can only roll the dice so many times. And it's hard to roll the dice for more than a few hundred years and not lose.

Di Filippo, A., N. Pederson, M. Baliva, M. Brunetti, A. Dinella, K. Kitamura, H. D. Knapp, B. Schirone, and G. Piovesan. 2015. The longevity of broadleaf deciduous trees in Northern Hemisphere temperate forests: insights from tree-ring series. Frontiers in Ecology and Evolution 3.

Thursday, September 1, 2016

Quantifying cattle diet across broad gradients

Just a quick note on a new paper that we just published.

I've worked before with Texas A&M's GANLab to assess patterns of forage quality for cattle across the US. In that paper, we saw that cattle in cool, wet regions had the highest forage quality, which suggested that warming would reduce forage quality. It was an important paper for understanding how global warming would affect the ability of grasslands to sustain grazers.

Although that work showed geographic patterns of forage quality, we couldn't tell how the species that the cattle consumed might be changing.

Just this month, we published a new paper where we sequenced the plant DNA in fecal samples of cattle across the US to answer that question.

Those results are pretty interesting, too.

In short, cattle in warmer grasslands are relying more on forbs than cattle in cooler grasslands. That suggests warming will shift the diet of cattle, potentially to compensate for lower forage quality. This is pretty similar to what we saw for bison.

The specific results are important, but the general approach is even more interesting. This is the first time the diet of an herbivore was quantified over such a large spatial scale and with such specificity. For example, we could see the species of grasses shift as one moved south, and the unique diet of cattle in southern Texas (a fair amount of live oak there).

Plant productivity and climate: a back and forth

The process of science is one we do not talk about much. There are reams of studies on statistical tests for a given data set, and meta-analyses have moved science forward by bringing together different data sets to test an idea.

But how does science decide the "truth" when there are different assumptions between different studies? What process gets used when words are used in different ways? No statistical test or meta-analysis can bridge that gap.

Here's an example...

In August of 2014, Michaletz et al. published a paper that analyzed data on plant production for over a thousand forests across the world. It has long been understood that production is greatest in warm, wet forests (think tropical rain forests) and least in cold, dry forests (think bristlecone pine). When we warm or irrigate forests, they grow more, too. Seems like the role of climate is pretty well settled.

In their analysis, the authors found that, indeed, production correlated with temperature and precipitation, but according to them, this was too simple. When viewed through metabolic scaling theory, climate had only an indirect effect on production. The authors asserted that "age and biomass explained most of the variation in production whereas temperature and precipitation explained almost none". In short, warm, wet forests are more productive only because they tend to be older and larger there, not because warm or wet conditions promote growth. By this idea, if you compare two forests of equal size and age, but one forest was in a cold, dry environment, and the other was in a warm, wet environment, there would be no difference in their production.

The authors have published many excellent papers on metabolic scaling, really developing a line of reasoning to begin to unify some fractured thinking about how plants work. If this result held, it would be a coup de grâce in many ways.

So how did the authors rule out that climate directly affected production?

The authors calculated a rate of monthly production by dividing production by the length of the growing season. This removed the influence of differences in the length of the growing season to compare forests across the world more equally, essentially asking if forests in warm, wet places grow more each month than ones in cold, dry places. When they did this, they found that "In contrast to results for NPP, average growing season temperature,...mean annual precipitation, and mean growing season precipitation explained little to no variation in global [monthly production]."

And with that result, the authors move on to test other factors, such as stand age and biomass, independent of climate, finding that "A large proportion of variation in NPP...was explained by just two variables: stand biomass and plant age."

The Michaletz paper was published in Nature, which often publishes some of the most important results in our discipline, and only after intense scrutiny. It seemed like the question was settled. Climate only affects how big forests get and how old they are; it doesn't make a given forest grow any faster per se.

Well, I guess it can be said that one person's assumption is another person's legerdemain.

This past January a new paper was published in Global Change Biology. Chu et al. reanalyze the Michaletz data and start with the title "Does climate directly influence NPP globally?" The authors assert that the Michaletz study had "flaws that affected that study’s conclusions". They also "present novel analyses to disentangle the effects of stand variables and climate in determining NPP."

In short, the authors state that ruling out climate's direct effect by calculating monthly production was erroneous. Growing season length and mean climate are highly correlated. In their view, it was incorrect to rule out the direct effect of climate by dividing production by growing season length and then examining the resultant metric against climate variables.**

**This debate, in part, is the Knops-Vitousek debate all over again...

Instead, using different analytic techniques, Chu et al. simultaneously test the roles of growing season length and other climate variables on production.

Their conclusion? Climate does directly affect production.

At this point, I'm not writing about this to weigh in on which side is right or more right or right under specific conditions.

I only pose this question.

Now what?

How does our discipline resolve the tension here?

Were the assumptions by Michaletz right? Are the two camps' differences semantic? Which conclusion should be accepted? Does climate directly affect production or only indirectly?

At this point, if it is convenient for a scientist's argument for climate not to affect production, they can just cite the Michaletz paper. If the contrary holds, just cite the Chu paper.

In the legal world, when different circuit courts come to different conclusions, this can lead to "forum shopping" where a plaintiff can simply go to the circuit that is most favorable to their case. That shouldn't be if the goal is to have one set of laws to govern a nation.

Like the legal world, it seems like being able to cite either one of two opposing ideas is not sustainable for science either.

It is interesting to note that in the US federal court system, two contrary ideas existing at the same time would be the equivalent of a "circuit split", where two circuit courts come to different conclusions about how to interpret the law. That tension is often resolved at the next higher court, the Supreme Court, whose decision settles the differences of opinion.

All I note here is that science doesn't have that. We have no formalized process for resolving a split. Split conclusions can theoretically last indefinitely. And scientists can cite whichever side they believe in more or find most convenient.

I think that is fascinating.

Tuesday, May 31, 2016

CUDOS in science

Here's something I hadn't seen before.

I've thought that it's interesting that there isn't a Hippocratic oath for scientists (scientists didn't exist in the days of Hippocrates). It turns out there are norms of scientific society.

I read about this in Wootton's book, but hadn't ever heard about them. Apparently, these norms were described in 1942 by Merton in his description of the sociology of science.

The norms of science go by the acronym of CUDOS:

Communalism

Universalism

Disinterestedness

Organized Skepticism

The Wikipedia page describes them fairly well. As does this blog post.

In short, these norms describe the ideals of science. The results should be open to everyone. Ideas (and opportunities) are evaluated blind to the characteristics of the individual. Scientists report results independent of the consequences of the outcome. All ideas are subject to scrutiny.

Wootton's analogy between science and the law in these norms is pretty interesting. The legal profession holds similar ideals**. For example, evidence should never be withheld from opposing parties, which is similar to communalism.

**Though Wootton does not delve into it, the adversarial nature of legal actions is not replicated generally in science even though there is still a tradition of a "defense" of theses or dissertations.

Most scientists believe they deserve more kudos for the work they do. Apparently CUDOS are built into science.

Monday, May 30, 2016

Book review: The Invention of Science

Currently, science is undergoing a convulsion. The very way that science operates is changing. It's a change that appears to be unprecedented in modern times.

For the first time, science is being forced to deal with bias. Questions of reproducibility have become a crisis. The review process is under renewed scrutiny. The nature and openness of publishing is being assaulted legally and illegally. Everyone from editors to scientists to funding agencies is being forced to reckon with the consequences of retractions at an unprecedented rate.

The Invention of Science, a new book by David Wootton, addresses none of these modern ills. But, sometimes, modern crises are an important time to revisit our history. The Invention of Science is an unparalleled examination of the long, slow (and sometimes convulsive) birth of science.

Note, this thing is a wrist-breaker. 600 pages before you get to the endnotes. That's a good thing. Understanding the history of a topic is not something to do in CliffsNotes form. You need a comfortable chair and a pen for the margins to absorb the lessons.

The thesis of this book is that science (as we currently define it) once did not exist. Knowledge was generated through means other than science. In order for science to be invented, a number of conventions had to be created, too. We needed a new vocabulary. People needed to act and interact differently. The conceptual framework that we recognize as scientific had to not only be assembled, it had to displace previous frameworks.

A book of this scope is hard to summarize with any justice.

Here is the first sentence of the book. "Modern science was invented between 1572, when Tycho Brahe saw a nova...and 1704, when Newton published his Opticks...." Science took a bit over 100 years to invent. It's only a bit over 300 years old.

Over a hundred years to invent something that seems so simple that we do it every day? Why so long?

The book answers why it wasn't as easy as people might think.

The middle chapters are the ones I've spent the most time on. These are their titles: Facts, Experiments, Laws, Hypotheses/Theories, Evidence and Judgment.

These chapters lay out the history of the main elements of the modern scientific approach.

I'm going to have to read these chapters one or two more times before I can crystallize them, but their scopes are the raw material for anyone trying to understand if not shape modern science.

For example, the word "fact" (with its modern meaning) did not exist in any language. The Greeks and Romans had no word for "fact". The concept of a "fact" did not exist. And facts are not the same as the truth.

Let me quote here.

"What is a fact? It is a sort of trump card in an intellectual game...Facts are a linguistic device which ensures that experience always trumps authority and reason."

Facts are a linguistic device? Since when is the truth a device? Facts must be something other than what we recognize them to be.

The experiments chapter describes a number of the early experiments. Here's a quote: "This is the first 'proper' experiment, in that it involves a carefully designed procedure, verification (the onlookers are there to ensure this really is a reliable account), repetition and independent replication, followed rapidly by dissemination."

When did this happen? In 1648, when a brother-in-law of Pascal climbed a mountain with a barometer.

But note his definition of an experiment. It involves verification. Repetition. Replication. Followed by dissemination. Our modern crisis comes about because of a lack of verification, repetition, and replication (or reproducibility, as we refer to it). The author only touches on it, but he highlights the motto of an Italian society: provando e riprovando. Test and retest. Hard to imagine that as any modern society's motto.

The evidence and judgment chapter has interesting nuggets, too. In part, it examines the legal frameworks of different European countries, which affected how scientists came to prove things. Drawing proof techniques from a judicial system that relies on judges versus one that relies on juries leads to different ways of conducting science. That thumbprint is still with us today. Like any organism that has evolved, modern science still bears the marks of its history and past forms.

Here's a quick example he provides:

"A friend of mine was once in hospital in Paris. The doctors told him that they had an hypothesis regarding the nature of his illness which they intended to prove, where in England they would have told him that he had certain symptoms which suggested a diagnosis which they would run tests to confirm."

This is a subtle difference, but one whose distinctions should be obvious to anyone practicing science. Different paths do not always lead to the same destination, so choose the path wisely.

Right now, our science is in the middle of a transformation. The question is whether a new layer will simply be added or whether parts will be torn down and rebuilt. Anyone who offers an opinion on how science should be reformulated is wise to know its history. This is a good book to start on that.

But buy it in hardcover so you can write in the margins.

The only drawback is that the margins are not wide enough.

Tuesday, May 24, 2016

Why we cite papers

Scholarly works are set apart from other types of writings by the use of citations. An essay on natural history might cover a scientific topic, but it is just an essay until it contains citations. Scientific papers are not scientific without citations.***

***This blog post is certainly not scientific...no citations here. OK, maybe one.

Most scientists do not question the need for citations nor the role they play in the paper itself. Yet when we do not have a common understanding of the role of citation, we have trouble determining when citations are improper and what to do when what we think to be true shifts.

Most of us think that the big debates about citations are about formatting. Do we number our citations or list the authors and dates each time? But there are deeper issues than that. They have nothing to do with formatting.

The first time I really thought about citations was when I read that Stephen Jay Gould once got into trouble for citing a paper in his thesis that was not contained in his school's library.* His advisors questioned the link between the statement he was making and the original citation. They were not claiming that his statement was untrue. Only that he didn't know it was true, because he could not have examined the original source. Another author's judgment on the assessment of truth was insufficient. That's how rigorous citation can be.

*This is a place where a citation is really needed. But, I can't remember which of his books I read this in. Structure of Evolutionary Theory? Panda's Thumb? I'm fuzzy on the details here, but whether it happened or not, it could have happened, which is all that is necessary here.

When I think about how I use citations, I feel there are two types of citations that I use.

The first I call vertical citations.

Vertical citations are the links between what has been found to be true in the past and a statement we currently would like to make to establish the truth.

For example, here is the first sentence of a paper that I just submitted to a journal:

There are approximately 1 billion cattle in the world with cattle populations steadily increasing over the past few decades (Estell et al. 2014).

This is a vertical citation. I am going back into the literature to provide evidence of the truth of a statement. I personally have not counted how many cattle there are in the world. Nor have I determined whether cattle populations are increasing or decreasing over the past few decades. So, instead of going out and counting cattle, I cite a paper that has established this to be true or has cited the papers that have established this to be true.  The paper I chose to cite is Estell et al. 2014***

***et al. stands for et alia (in the neuter form), which means and others in Latin. Et alia is almost always abbreviated et al., which is funny because we really aren't saving that many characters. Really just one. I think, in part, it gets abbreviated because the actual Latin phrase depends on whether the "others" are male, female, or both. Easier to write "et al." than determine whether et alii, et aliae, et alia is more appropriate.

So, when are vertical citations necessary?

Any time we make a statement in a scientific paper about what we consider to be true outside of the personal experience we are describing, a citation is necessary.

Any time.

If we want to say that there are a billion cattle in the world, we need a citation. If we want to say that atmospheric CO2 concentrations are increasing, we need a citation. The sky is blue? Citation. Gravity exists? Citation.

Now, if we want to say that we performed a certain procedure in an experiment, we do not need a citation. We hold it true that we might have measured something at a certain temperature, but there is no citation for this since it comes from our experience, not the literature.

Vertical citations go back into the literature to provide justification for the truth of statements we are making. Think of Newton's phrase: if I have seen further, it is by standing on the shoulders of giants. When we cite a previous work, we are placing our foot on the shoulder of a giant that has come before us. We are reaching down vertically to build something taller.

As opposed to vertical citations, there are also horizontal citations. Like vertical citations, they reach into the literature to establish the truth, but the purpose is different.

Horizontal citations are primarily for context. In the introduction, horizontal citations are typically used to identify intellectual tension. Study A found this. Study B found that. This and that cannot be both true under our current intellectual framework. We cite these papers to show what other researchers have found to justify our work.

In the discussion, horizontal citations are used in a similar manner, but it is not to establish that there is intellectual tension, but to see if there is intellectual tension. Study A found this. Study B found that. We found this, too. Therefore, it seems like this is more likely to be true than that.

With horizontal citations, we are not citing other giants, but instead other dwarfs (or other Isaac Newtons).**

**the original metaphor was "dwarfs standing on the shoulders of giants". Citation here. We think of Newton as a giant now, but originally he would have been a dwarf in the metaphor.

So, when I think about how I reference the literature, it is generally vertically or horizontally. I am either reaching down to stand taller, or reaching across to build linkages.

That's probably a long enough post for now. Down the line, I should cover the consequences of failing to cite the literature correctly and the consequences of determining that the findings of a published paper were not true: what happens when a giant tumbles?

Mostly as a note to myself, comparing legal citations and scientific citations is also instructive. The law only cares about what was legally true at the time the law was being examined. Science cares about what is known to be true at the time the scientific fact was established and after. Hence, changes in the law and changes in scientific understanding have much different consequences.

Wednesday, March 9, 2016

Declines in tree nutrient concentration over past 25 years

I've been trying to catch up on journals lately. Apparently, I hadn't read anything from Global Change Biology over the past 2 years. Must have been distracted. No time like the present...

Here's one that struck me as amazing.

Researchers resampled forest leaves from 1992 to 2009 across a large number of plots in Europe. At each site, for a subset of species, they assessed nutrient concentrations and leaf mass--a pretty simple and standard measurement. Doing this allowed them to examine the trajectory of nutrient concentrations (and contents). Nutrient concentrations in leaves are critical to determining tree productivity as well as interactions with herbivores, so knowing whether concentrations are going up or down is critical to modeling the future productivity of these forests.

Here's the simplified result: almost all nutrient concentrations were declining. Twenty nutrients had declining concentrations; two were increasing.

Here's an example of the pattern for beech (white bars are concentrations, grey bars contents).

The authors focus on P nutrition the most, emphasizing the role of N deposition in promoting P limitation. Yet, even N concentrations were declining. These declines must be more than just N deposition causing imbalances, especially since N deposition has been declining over the time period. 

The authors suggest elevated atmospheric CO2 might also be playing a role, as well as droughts and warming, but this paper mostly describes the pattern, which is fine.

The big question is: What is causing this massive, continental decline in nutrient concentrations?

Monday, March 7, 2016

ASA statement on P-values

The American Statistical Association's statement on the use of p-values can be found here.

The short list is:

  1. P-values can indicate how incompatible the data are with a specified statistical model. 
  2. P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone. 
  3. Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold. 
  4. Proper inference requires full reporting and transparency. 
  5. A p-value, or statistical significance, does not measure the size of an effect or the importance of a result. 
  6. By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis. 
My personal take is that a few corrections are needed in how p-values are used.

1) p < 0.05 is arbitrary. Report exact p-values and think of them as a continuum. A paper shouldn't be accepted just because p < 0.05, nor rejected just because p > 0.05.

2) A reported p-value needs to be contextualized with the number of comparisons made. This is where p-hacking shows up. If you run 20 independent analyses, one is likely to return p < 0.05 by chance alone. If you report the 20th, you need to state that you ran the other 19. And if you went back and added more data or hunted for outliers because a p-value wasn't low enough, that needs to be reported, too.
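The 20-analyses point can be checked directly. Here's a minimal Python sketch (the 20-test scenario is the hypothetical from the paragraph above) showing that even when every null hypothesis is true, a batch of 20 independent tests has better-than-even odds of producing at least one "significant" result:

```python
import random

# If every null hypothesis is true, each test alone has a 5% chance of a
# false positive (p < 0.05). Across 20 independent tests, the chance that
# at least one comes up "significant" is much higher.
alpha = 0.05
n_tests = 20

# Analytic answer: 1 - P(no test is significant)
p_at_least_one = 1 - (1 - alpha) ** n_tests
print(f"analytic: {p_at_least_one:.2f}")  # 0.64

# Monte Carlo check: under the null, p-values are uniform on (0, 1)
random.seed(0)
trials = 10_000
hits = sum(
    any(random.random() < alpha for _ in range(n_tests))
    for _ in range(trials)
)
print(f"simulated: {hits / trials:.2f}")
```

So roughly two times in three, an all-null batch of 20 analyses hands you a p < 0.05 to report — which is why the other 19 have to be disclosed.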

3) p-values and effect sizes must be reported together. An independent assessment of whether the measured effect is biologically relevant is needed.

#2 on the list is the hardest to comprehend because it involves logical assumptions of the test. 

The manuscript's explanation of this is:

Researchers often wish to turn a p-value into a statement about the truth of a null hypothesis, or about the probability that random chance produced the observed data. The p-value is neither. It is a statement about data in relation to a specified hypothetical explanation, and is not a statement about the explanation itself.

At RetractionWatch, the author explains it this way:

Retraction Watch: Some of the principles seem straightforward, but I was curious about #2 – I often hear people describe the purpose of a p value as a way to estimate the probability the data were produced by random chance alone. Why is that a false belief? 
Ron Wasserstein: Let’s think about what that statement would mean for a simplistic example. Suppose a new treatment for a serious disease is alleged to work better than the current treatment. We test the claim by matching 5 pairs of similarly ill patients and randomly assigning one to the current and one to the new treatment in each pair. The null hypothesis is that the new treatment and the old each have a 50-50 chance of producing the better outcome for any pair. If that’s true, the probability the new treatment will win for all five pairs is (½)^5 = 1/32, or about 0.03. If the data show that the new treatment does produce a better outcome for all 5 pairs, the p-value is 0.03. It represents the probability of that result, under the assumption that the new and old treatments are equally likely to win. It is not the probability the new treatment and the old treatment are equally likely to win.
This is perhaps subtle, but it is not quibbling.  It is a most basic logical fallacy to conclude something is true that you had to assume to be true in order to reach that conclusion.  If you fall for that fallacy, then you will conclude there is only a 3% chance that the treatments are equally likely to produce the better outcome, and assign a 97% chance that the new treatment is better. You will have committed, as Vizzini says in “The Princess Bride,” a classic (and serious) blunder.
I'm still looking for the right wording on this one, but what people actually want seems to be the probability that the null hypothesis is true given the observed effect--which is not what a p-value provides.
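Wasserstein's example can be worked through numerically. The sketch below computes his p-value, then computes what P(null | data) would actually be--which requires two ingredients the p-value never uses: a prior on the null and a concrete alternative. Both values (the 50/50 prior and the 75% win rate under the alternative) are my hypothetical choices, not part of his example:

```python
# p-value: probability of 5-of-5 wins if the two treatments are equally good
p_value = 0.5 ** 5
print(f"p-value = {p_value:.3f}")  # 1/32, about 0.031

# P(null | data) needs assumptions the p-value never uses.
# Both numbers below are hypothetical, chosen for illustration.
prior_null = 0.5   # assumed 50/50 prior on the null being true
p_win_alt = 0.75   # assumed: under the alternative, new treatment wins 75% of pairs

like_null = 0.5 ** 5        # P(5/5 wins | treatments equal)
like_alt = p_win_alt ** 5   # P(5/5 wins | alternative)

# Bayes' rule
posterior_null = (prior_null * like_null) / (
    prior_null * like_null + (1 - prior_null) * like_alt
)
print(f"P(null | data) = {posterior_null:.3f}")  # about 0.116
```

With these (hypothetical) inputs, the p-value is 0.031 but the probability the null is true given the data is about 0.116--nearly four times larger. The two quantities answer different questions, which is exactly principle #2.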

Saturday, March 5, 2016

Biogeochemical Planetary Boundary: Beyond the zone of uncertainty? (Part II)

I think of scientists as having two jobs.

One is to create intellectual tension.

The other is to resolve it.

Creating intellectual tension is generating hypotheses. A hypothesis whose truth we do not yet know represents intellectual tension. Competing hypotheses about how the world works are also intellectual tension. We do not know which is true. That is the tension.

Resolving intellectual tension can sometimes occur by identifying logical flaws in one hypothesis. Generally, though, intellectual tension is resolved by collecting data. It is a fair question whether a hypothesis can ever be fully proven or disproven, and therefore whether intellectual tension is ever fully resolved, but the process of science works to reduce intellectual tension by favoring some hypotheses over others.

In the previous post, I identified some important intellectual tension in the scientific world.

There is the hypothesis that the planet has exceeded a biogeochemical "planetary boundary". Too much nitrogen is being fixed and entering ecosystems. This is the hypothesis.

Yet, it is unclear whether this is causing planetary-scale eutrophication of terrestrial ecosystems or aquatic ecosystems.

On the one hand, we have a hypothesis where the world is awash in nitrogen. We fix more nitrogen than ever and apply it to ecosystems on a massive scale. As a result, nitrogen is leaking out into waterways creating dead zones in the oceans. Nitrogen is also entering the atmosphere and raining down on even the most remote ecosystems on earth. As a result, terrestrial ecosystems are becoming eutrophied. Species adapted to low nitrogen availability are being crowded out by faster growing plants. Biodiversity is plummeting. Productivity is increasing unsustainably. With all this extra nitrogen, we have exceeded a biogeochemical planetary boundary. Civilization as we know it is threatened.

Yet, the intellectual tension on this hypothesis actually takes the form of a competing hypothesis. It is possible that not only have we not exceeded a planetary boundary for nitrogen, but ecosystems might be becoming more nitrogen limited over time. As temperatures warm and atmospheric CO2 builds up, demand for N might be stimulated faster than it is being supplied. Plants and microbes become more limited by N. Plant N concentrations decline. Photosynthesis declines. Plants that compete well for N become more dominant. Less N leaks out of ecosystems into streams. Productivity becomes more and more constrained by the lack of nitrogen. Vegetation sequesters less carbon than it otherwise could, all because there is not enough nitrogen. As a result, more CO2 accumulates in the atmosphere than would if forests had more nitrogen. Climates warm even faster. Civilization as we know it is threatened.

Intellectual tension could hardly be starker.

If you reduce the world to one pixel, there is either too much nitrogen. Or there is too little.

Resolving this tension requires data. On the one hand, we know that N is being fixed in ever greater amounts. On the other hand, CO2 continues to increase, which shifts demand for N even higher. Back again, N is still raining down on ecosystems at an elevated rate. Yet NO3- concentrations in some streams are so low that the water approaches the NO3- concentration of distilled water.

The only way to resolve this tension is to collect data on N availability.

Yet we need long-term measurements of N availability to know for sure whether N is becoming more or less limiting.

We don't have these.

We could use the species composition of plant communities in conjunction with indices of which plant species indicate low or high N availability, but again, we have not invested in long-term monitoring of our plant communities.

The tension of whether the world is becoming more eutrophic or more oligotrophic has existed for a long time now.

It probably is not a bad thing to think that civilization is threatened. But we should at least know whether it is because there is too much nitrogen or too little before we try to fix it. Or else our remedies might exacerbate the situation.

Without the right data, we cannot resolve this tension. That means we start monitoring key indices like N availability and species composition now and try to answer the question in 10 years.

Or we find a different dataset that allows us to reconstruct N availability on broad spatial scales far enough back in time to discern the trajectory of N availability.

Do we have the data to resolve this tension?

I think we might...

Let's see what reviewers say.

Biogeochemical Planetary Boundary: Beyond the zone of uncertainty? (Part I)

The cycling of nitrogen in a terrestrial ecosystem determines its primary (and secondary) productivity, its diversity, and how much (and how) nitrogen is lost to the atmosphere and waters. In general, plant productivity is limited by the availability of nitrogen. Add a little more nitrogen, and not much changes: productivity increases, plant N concentrations increase a bit, but the ecosystem remains qualitatively similar to the unfertilized one. The change is quantitative, not qualitative.

Keep fertilizing the ecosystem with N, and eventually the ecosystem reaches a threshold. Not only does productivity increase, but a lot of other things change. Suddenly, plant N concentrations increase a lot. The plant community shifts towards plants that thrive under higher N. They have high N concentrations, they use alkaloids instead of tannins to defend themselves, their leaves are built to capture as much light as possible, rather than avoid capturing too much light. In the soil, the soil microbial community shifts and the richness of N causes N to start leaving the soils in ways it hadn't before. More NO3- comes out in the waters. More gaseous N is lost to the atmosphere.

This threshold has been repeated experimentally in individual ecosystems throughout the world. And we've seen it when we non-experimentally add a lot of N to pastures or croplands or even forests.

What we see at the plot level, or even at the level of the stand or region, could potentially have analogs at the planetary level. As humans fix more and more N, and more and more N is added to ecosystems, could the whole planet flip states and autocatalyze from an oligotrophic world to a eutrophic world? Could N limitation become the exception, rather than the rule?

In 2009, Rockstrom et al. published their summary of the state of the earth in respect to Planetary Boundaries (see my 2012 post on the issue here). These planetary boundaries are planet-wide environmental boundaries or ‘tipping points’. Exceed these thresholds, and humanity is at risk.

That paper was updated last year by Steffen et al. As before, the authors state that for climate change, we have entered a "zone of uncertainty" with "increasing risk". Despite all the warming, the sea level rise, the collapsing ice sheets, the potential for a shutdown of the thermohaline circulation, losses of coral reefs, thawing of permafrost, and climatic reorganization underway, their summary is that humanity is still in a safe operating space climatically.

In contrast, for the global nitrogen cycle, the status is the same as in 2009. We are apparently beyond the zone of uncertainty, and humanity is currently at high risk of exceeding a planetary threshold.

That sounds pretty dire.

But are we?

The basis for this assessment is from a recent paper by de Vries et al. 2013.

Reading the paper, apparently, for the planet to have exceeded a planetary boundary for N, at least one of the following (according to the authors) must have exceeded its safe operating space:

1) eutrophication of terrestrial ecosystems
2) eutrophication of marine ecosystems
3) acidification of soils and fresh waters
4) NOx, a greenhouse gas
5) ozone formation
6) groundwater contamination
7) stratospheric ozone depletion

There is really no evidence of too much tropospheric ozone or too much groundwater contamination for humans to safely inhabit the planet. Soils do not appear to be acidifying globally due to N deposition and fertilization. NOx levels are not dangerously high. Stratospheric ozone levels are still recovering from the CFC phase-out.

Therefore, if humanity has exceeded a biogeochemical planetary boundary, then there must be evidence of planetary-scale eutrophication of terrestrial or marine ecosystems.

In a future post, I'll examine the intellectual tension about this idea...