# What is a locus, anyway?

”Locus” is one of those confusing genetics terms (its meaning, not just its pronunciation). We can probably all agree with a dictionary and with Wikipedia that it means a place in the genome, but a place of what and in what sense? We also use place-related words like ”site” and ”region” that one might think were synonymous, but don’t seem to be.

For an example, we can look at this relatively recent preprint (Chebib & Guillaume 2020) about a model of the causes of genetic correlation. They have pairs of linked loci where each locus affects one trait (that’s the tight linkage condition), and also a set of loci that affect both traits (the pleiotropic condition), correlated Gaussian stabilising selection, and different levels of mutation, migration and recombination between the linked pairs. A mutation means adding a number to the effect of an allele.

This means that loci in this model can have a large number of alleles with quantitatively different effects. The alleles at a locus share a distribution of mutation effects, that can be either two-dimensional (with pleiotropy) or one-dimensional. They also share a recombination rate with all other loci, which is constant.
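As a minimal sketch of what ”adding a number to the effect of an allele” could look like, here is a toy version of the two kinds of loci. The function names, the Gaussian mutational effects, the standard deviation of 0.1 and the lack of correlation between the two traits are all my assumptions for illustration, not details taken from the preprint.

```python
import random

def mutate_pleiotropic(allele, sd=0.1, rng=random):
    """Add a two-dimensional mutational effect to a pleiotropic allele.

    `allele` is a tuple (effect on trait 1, effect on trait 2).
    The two draws are independent here, which is an assumption.
    """
    return (allele[0] + rng.gauss(0.0, sd), allele[1] + rng.gauss(0.0, sd))

def mutate_single_trait(allele, sd=0.1, rng=random):
    """Add a one-dimensional mutational effect to a single-trait allele."""
    return allele + rng.gauss(0.0, sd)

rng = random.Random(1)
pleiotropic_allele = (0.0, 0.0)
linked_pair = [0.0, 0.0]  # locus 0 affects trait 1, locus 1 affects trait 2

for _ in range(5):
    # One mutation at the pleiotropic locus changes both trait effects at once.
    pleiotropic_allele = mutate_pleiotropic(pleiotropic_allele, rng=rng)
    # In the linked pair, each mutation hits only one of the two loci.
    i = rng.randrange(2)
    linked_pair[i] = mutate_single_trait(linked_pair[i], rng=rng)

print(pleiotropic_allele, linked_pair)
```

Because every mutation produces a new real-valued effect, the number of possible alleles at such a locus is effectively unlimited.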

What kind of DNA sequences can have these properties? Single nucleotide sites are out of the question, as they can have four, or maybe five alleles if you count a deletion. Larger structural variants, such as inversions or allelic series of indels might work. A protein-coding gene taken as a unit could have a huge number of different alleles, but they would probably have different distributions of mutational effects in different sites, and (relatively small) differences in genetic distance to different sites.

It seems to me that we’re talking about an abstract group of potential alleles that have sufficiently similar effects and that are sufficiently closely linked. This is fine; I’m not saying this to criticise the model, but to explore how strange a locus really is.

They find that there is less genetic correlation with linkage than with pleiotropy, unless the mutation rate is high, which leads to a discussion about mutation rate. This reasoning about the mutation rate of a locus illustrates the issue:

A high rate of mutation ($10^{-3}$) allows for multiple mutations in both loci in a tightly linked pair to accumulate and maintain levels of genetic covariance near to that of mutations in a single pleiotropic locus, but empirical estimations of mutation rates from varied species like bacteria and humans suggests that per-nucleotide mutation rates are in the order of $10^{-8}$ to $10^{-9}$ … If a polygenic locus consists of hundreds or thousands of nucleotides, as in the case of many quantitative trait loci (QTLs), then per-locus mutation rates may be as high as $10^{-5}$, but the larger the locus the higher the chance of recombination between within-locus variants that are contributing to genetic correlation. This leads us to believe that with empirically estimated levels of mutation and recombination, strong genetic correlation between traits are more likely to be maintained if there is an underlying pleiotropic architecture affecting them than will be maintained due to tight linkage.

I don’t know if it’s me or the authors who are conceptually confused here. If they are referring to QTL mapping, it is true that the quantitative trait loci that we detect in mapping studies often are huge. ”Thousands of nucleotides” is being generous to mapping studies: in many cases, we’re talking millions of them. But the size of a QTL region from a mapping experiment doesn’t tell us how many nucleotides in it matter to the trait. It reflects our poor resolution in delineating the one or more causative variants that give rise to the association signal. That being said, it might be possible to use tricks like saturation mutagenesis to figure out which mutations within a relevant region could affect a trait. Then, we could actually observe a locus in the above sense.

Another recent theoretical preprint (Chantepie & Chevin 2020) phrases it like this:

[N]ote that the nature of loci is not explicit in this model, but in any case these do not represent single nucleotides or even genes. Rather, they represent large stretches of effectively non-recombining portions of the genome, which may influence the traits by mutation. Since free recombination is also assumed across these loci (consistent with most previous studies), the latter can even be thought of as small chromosomes, for which mutation rates of the order to $10^{-2}$ seem reasonable.
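To put numbers on the two quoted passages, here is a back-of-the-envelope sketch. It assumes that the per-locus mutation rate is simply the number of sites that matter times the per-site rate, which is a reasonable approximation only while the product is small. The function names are mine.

```python
def per_locus_rate(n_sites, per_site_rate):
    """Approximate per-locus mutation rate: number of sites times per-site rate."""
    return n_sites * per_site_rate

def implied_sites(per_locus, per_site_rate):
    """Number of mutable sites needed to reach a given per-locus rate."""
    return per_locus / per_site_rate

# Per-locus rates for loci of different sizes.
for n_sites in (100, 1_000, 1_000_000):
    for mu in (1e-8, 1e-9):
        rate = per_locus_rate(n_sites, mu)
        print(f"{n_sites:>9} sites, per-site rate {mu:.0e}: {rate:.0e} per locus")

# Locus sizes implied by the per-locus rates mentioned in the two preprints.
for target in (1e-5, 1e-3, 1e-2):
    sites = implied_sites(target, 1e-8)
    print(f"per-locus rate {target:.0e} needs about {sites:.0e} sites")
```

At a per-site rate of $10^{-8}$, a per-locus rate of $10^{-3}$ corresponds to roughly a hundred thousand sites that all affect the trait, and $10^{-2}$ to roughly a million, which is why the ”small chromosomes” reading is the one that fits.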

Literature

Chebib and Guillaume. ”Pleiotropy or linkage? Their relative contributions to the genetic correlation of quantitative traits and detection by multi-trait GWA studies.” bioRxiv (2019): 656413.

Chantepie and Chevin. ”How does the strength of selection influence genetic correlations?” bioRxiv (2020).

# Adrian Bird on genome ecology

I recently read this essay by Adrian Bird on ”The Selfishness of Law-Abiding Genes”. That is a colourful title in itself, but it doesn’t stop there; this is an extremely metaphor-rich piece. In terms of the theoretical content, there is not much new under the sun. Properties of the organism like complexity, redundancy, and all those exquisite networks of developmental gene regulation may be the result of non-adaptive processes, like constructive neutral evolution and intragenomic conflict. As the title suggests, Bird argues that this kind of thinking is generally accepted about things like transposable elements (”selfish DNA”), but that the same logic applies to regular ”law-abiding” genes. They may also be driven by other evolutionary forces than a net fitness gain at the organismal level.

He gives a couple of possible examples: toxin–antitoxin gene pairs, RNA editing and MeCP2 (probably Bird’s favourite protein, which he has done a lot of work on). He gives this possible description of MeCP2 evolution:

Loss of MeCP2 via mutation in humans leads to serious defects in the brain, which might suggest that MeCP2 is a fundamental regulator of nervous system development. Evolutionary considerations question this view, however, as most animals have nervous systems, but only vertebrates, which account for a small proportion of the animal kingdom, have MeCP2. This protein therefore appears to be a late arrival in evolutionary terms, rather than being a core ancestral component of brain assembly. A conventional view of MeCP2 function is that by exerting global transcriptional restraint it tunes gene expression in neurons to optimize their identity, but it is also possible to devise a scenario based on self-interest. Initially, the argument goes, MeCP2 was present at low levels, as it is in non-neuronal tissues, and therefore played little or no role in creating an optimal nervous system. Because DNA methylation is sparse in the great majority of the genome, sporadic mutations that led to mildly increased MeCP2 expression would have had a minimal dampening effect on transcription that may initially have been selectively neutral. If not eliminated by drift, further chance increases might have followed, with neuronal development incrementally adjusting to each minor hike in MeCP2-mediated repression through compensatory mutations in other genes. Mechanisms that lead to ‘constructive neutral evolution’ of this kind have been proposed. Gradually, brain development would accommodate the encroachment of MeCP2 until it became an essential feature. So, in response to the question ‘why do brains need MeCP2?’, the answer under this speculative scenario would be: ‘they do not; MeCP2 has made itself indispensable by stealth’.

I think this is a great passage, and it can be read both as a metaphorical reinterpretation, and as a substantive hypothesis. The empirical question ”Did MeCP2 offer an important innovation to vertebrate brains as it arose?” is a bit hard to answer with data, though. On the other hand, if we just consider the metaphor, can’t you say the same about every functional protein? Sure, it’s nice to think of p53 as the Guardian of the Genome, but can’t it also be viewed as a gangster extracting protection money from the organism? ”Replicate me, or you might get cancer later …”

The piece argues for a gene-centric view, one that thinks of molecules and the evolutionary pressures they face. This doesn’t seem to be the fashionable view (sorry, extended synthesists!) but Bird argues that it would be healthy for molecular cell biologists to think more about the alternative, non-adaptive, bottom-up perspective. I don’t think the point is to advocate that way of thinking to the exclusion of all others. To me, the piece reads more like an invitation to use a broader set of metaphors and verbal models to aid hypothesis generation.

There are too many good quotes in this essay, so I’ll just quote one more from the end, where we’ve jumped from the idea of selfish law-abiding genes, over ”genome ecology” (not in the sense of using genomics in ecology, but in the sense of thinking of the genome as some kind of population of agents with different niches and interactions, I guess) to ”Genetics Meets Sociology?”

Biologists often invoke parallels between molecular processes of life and computer logic, but a gene-centered approach suggests that economics or social science may be a more appropriate model …

I feel like there is a circle of reinforcing metaphors here. Sometimes when we have to explain how something came to be, for example a document, a piece of computer code or the way we do things in an organisation, we say ”it grew organically” or ”it evolved”. Sometimes we talk about the genome as a computer program, and sometimes we talk about our messy computer program code as an organism. Like viruses are just like computer viruses, only biological.

Literature

Bird, Adrian. ”The Selfishness of Law-Abiding Genes.” Trends in Genetics 36.1 (2020): 8-13.

# Journal club of one: ”Evolutionary stalling and a limit on the power of natural selection to improve a cellular module”

This is a relatively recent preprint on how correlations between genetic variants can limit the response to selection, with experimental evolution in bacteria.

Experimental evolution and selection experiments live on the gradient from modelling to observations in the wild. Experimental evolution researchers can design the environments and the genotypes to pose problems for evolution, and then watch in real time as organisms solve them. With sequencing, they can also watch how the genome responds to selection.

In this case, the problem posed is how to improve a particular cellular function (”module”). The researchers started out with engineered Escherichia coli that had one component of their translation machinery manipulated: they started out with only one copy of an elongation factor gene (where E. coli normally has two) that could be either from another species, a reconstructed ancestral form, or the regular E. coli gene as a control.

Then, they sequenced samples from replicate populations over time, and looked for potentially adaptive variants: that is, operationally, variants that had large changes in frequency (>20%) and occurred in genes that had more than one candidate adaptive variant.
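The operational definition can be sketched as a two-step filter. This is my reading of the criteria as described above, not the authors’ pipeline; the data layout, the function name and the gene names in the toy data are hypothetical.

```python
from collections import Counter

def candidate_adaptive_variants(variants, min_change=0.2, min_per_gene=2):
    """Return variants with a large frequency change that fall in multi-hit genes.

    `variants` is a list of dicts with keys 'gene' and 'freqs', where
    'freqs' holds the variant's frequency at successive time points.
    """
    # Step 1: keep variants whose frequency changed by more than the threshold.
    large = [v for v in variants
             if max(v["freqs"]) - min(v["freqs"]) > min_change]
    # Step 2: keep only genes hit by at least `min_per_gene` such variants.
    hits = Counter(v["gene"] for v in large)
    return [v for v in large if hits[v["gene"]] >= min_per_gene]

# Toy data, made up for illustration.
toy = [
    {"gene": "geneA", "freqs": [0.0, 0.3, 0.9]},
    {"gene": "geneA", "freqs": [0.0, 0.25, 0.1]},
    {"gene": "geneB", "freqs": [0.0, 0.5, 1.0]},   # large change, but a single hit
    {"gene": "geneC", "freqs": [0.0, 0.05, 0.1]},  # change too small
]
for v in candidate_adaptive_variants(toy):
    print(v["gene"], v["freqs"])
```

The second step is what makes this a parallelism argument: a gene hit repeatedly by large-change variants is less likely to be a hitchhiker.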

Finally, they looked at what genes these variants occurred in: were they related to translation (”TM-specific” as they call it) or not (”generic”)? That gave them trajectories of potentially adaptive variants like this. The horizontal axis is time and the vertical frequency of the variant. The letters are populations of different origin, and the numbers replicates thereof. The colour shows the classification of variants. (”fimD” and ”trkH” in the figure are genes in the ”generic” category that are of special interest for other reasons. The orange–brown shading signifies structural variation at the elongation factor gene.)

This figure shows their main observations:

• V, A and P populations had more adaptive variants in translation genes, and also worse fitness at the start of the experiment. This goes together with improving more during the experiment. If a population has poor translation, a variant in a translation gene might help. If it has decent translation efficiency already, there is less scope for improvement, and adaptive variants in other kinds of genes happen more often.

We found that populations whose TMs were initially mildly perturbed (incurring ≲ 3% fitness cost) adapted by acquiring mutations that did not directly affect the TM. Populations whose TM had a moderately severe defect (incurring ~19% fitness cost) discovered TM-specific mutations, but clonal interference often prevented their fixation. Populations whose TMs were initially severely perturbed (incurring ~35% fitness cost) rapidly discovered and fixed TM-specific beneficial mutations.

• Adaptive variants in translation genes tended to increase fast and early during the experiment and often get fixed, suggesting that they have larger effects than generic variants. Again, if your translation capability is badly broken, a large-effect variant in a translation gene might help.

Out of the 14 TM-specific mutations that eventually fixed, 12 (86%) did so in the first selective sweep. As a result, an average TM-specific beneficial mutation reached fixation by generation 300 ± 52, and only one (7%) reached fixation after generation 600 … In contrast, the average fixation time of generic mutations in the V, A and P populations was 600 ± 72 generations, and 9 of them (56%) fixed after the first selective sweep

• After one adaptive variant in a translation gene has fixed, adaptation in translation seems to stop at that.

The question is: when there aren’t further adaptive variants in translation genes, is that because it’s impossible to improve translation any further, or because of interference from other variants? They use the term ”evolutionary stalling”, kind of asexual linked selection. Because variants occur together, selection acts on the net effect of all the variants in an individual. Adaptation in a certain process (in this case translation) might stall, if there are large-effect adaptive variants in other, potentially completely unrelated processes, that swamp the effect on translation.
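A minimal deterministic sketch of this kind of interference, assuming three asexual lineages with no recombination, no drift and no new mutations. The selection coefficients and starting frequencies are made up for illustration and are not estimates from the paper.

```python
def compete(freqs, fitness, generations):
    """Deterministic haploid selection among asexual lineages.

    Returns a list of frequency vectors, one per generation,
    including the starting generation.
    """
    trajectory = [list(freqs)]
    for _ in range(generations):
        weighted = [p * w for p, w in zip(trajectory[-1], fitness)]
        total = sum(weighted)  # mean fitness of the population
        trajectory.append([x / total for x in weighted])
    return trajectory

# Lineages: ancestor, small-effect translation variant, large-effect generic variant.
start = [0.98, 0.01, 0.01]
fitness = [1.00, 1.02, 1.10]
traj = compete(start, fitness, generations=200)

tm = [gen[1] for gen in traj]
generic = [gen[2] for gen in traj]
print(f"translation variant: start {tm[0]:.3f}, peak {max(tm):.3f}, end {tm[-1]:.1e}")
print(f"generic variant: end {generic[-1]:.3f}")
```

The translation variant is beneficial and rises at first, but once the generic lineage pushes mean fitness above 1.02 it declines and is lost, even though nothing about translation itself has changed.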

They argue for three kinds of indirect evidence that the adaptation in translation has stalled in at least some of the populations:

1. Some of the replicate populations of V didn’t fix adaptive translation variants.
2. In some populations, there was a second adaptive translation variant, not yet fixed.
3. There have been adaptive translation mutations in the Long Term Evolution Experiment, which is based on E. coli with unmanipulated translation machinery.

Stalling depends on large-effect variants, but after they have fixed, adaptation might resume. They use the metaphor of natural selection ”shifting focus”. The two non-translation genes singled out in the above figure might be examples of that:

While we did not observe resumption of adaptive evolution in [translation] during the duration of this experiment, we find evidence for a transition from stalling to adaptation in trkH and fimD genes. Mutations in these two genes appear to be beneficial in all our genetic backgrounds (Figure 4). These mutations are among the earliest to arise and fix in E, S and Y populations where the TM does not adapt … In contrast, mutations in trkH and fimD arise in A and P populations much later, typically following fixations of TM-specific mutations … In other words, natural selection in these populations is initially largely focused on improving the TM, while adaptation in trkH and fimD is stalled. After a TM-specific mutation is fixed, the focus of natural selection shifts away from the TM to other modules, including trkH and fimD.

This is all rather indirect, but interesting. Both ”the focus of natural selection shifting” and ”coupling of modules by the emergent neutrality threshold” are inspiring ways to think about the evolution of genetic architecture, and new to me.

Literature

Venkataram, Sandeep, et al. ”Evolutionary Stalling and a Limit on the Power of Natural Selection to Improve a Cellular Module.” bioRxiv (2019): 850644.

# Paper: ‘Integrating selection mapping with genetic mapping and functional genomics’

If you’re the kind of geneticist who wants to know about causative variants that affect selected traits, you have probably thought about how to combine genome scans for signatures of selection with genome-wide association studies. There is one simple problem: once you’ve found a selective sweep, the association signal is gone, because the causative variant is fixed (or close to). So you need some tricks.
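To see why the signal disappears, consider the genetic variance that a single biallelic locus contributes. The sketch below uses the textbook expression $2p(1-p)a^2$, and assumes a diploid population in Hardy-Weinberg proportions and a purely additive effect $a$; the function name is mine.

```python
def additive_variance(p, a):
    """Additive genetic variance from one biallelic locus: 2 p (1 - p) a^2."""
    return 2.0 * p * (1.0 - p) * a ** 2

# As the favoured allele sweeps towards fixation, the variance goes to zero.
for p in (0.5, 0.9, 0.99, 1.0):
    print(f"p = {p:<4}: variance = {additive_variance(p, a=1.0):.4f}")
```

An association study can only detect a variant through the trait variance it explains, so a locus with the same effect size contributes less and less signal as the allele approaches fixation.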

This is a short review that I wrote for a research topic on the genomics of adaptation. It occurred to me that one can divide the ways to combine selection mapping and genetic mapping into three categories. The review contains examples from the literature of how people have done it, and this mock genome-browser style figure to illustrate them.

You can read the whole thing in Frontiers in Genetics.

Johnsson, Martin. Integrating selection mapping with genetic mapping and functional genomics. Frontiers in Genetics 9 (2018): 603.

# Journal club of one: ‘Splendor and misery of adaptation, or the importance of neutral null for understanding evolution’

In this paper from a couple of years ago, Eugene Koonin takes on naïve adaptationism, in the style of The Spandrels of San Marco and the Panglossian paradigm (Gould & Lewontin 1979). The spandrels paper is one of those classics that divide people. One of its problems was that it is easy to point out what one shouldn’t do (tell adaptive stories without justification), but harder to say what one should do. But anti-adaptationism has moved forward since the Spandrels, and the current paper has a prescription.

Spandrels contained a list of possible alternatives to adaptation, which I think breaks down into two categories: population genetic alternatives (including neutral or deleterious fixations due to drift and runaway selection driving destructive features rather than fit to the environment), and physiological or physical alternatives (features that arise due to selection on something else, which are the metaphorical spandrels of the title, and fit to the environment that happens due to natural laws unrelated to biological evolution).

Eugene Koonin elaborates on the population genetic part, concentrating more on chance and less on constraint. He brings up examples of molecular structures that may have arisen through neutral evolution. The main idea is that when a feature has fixed, it doesn’t go away so easily, and there can be a ratchet-like process of increasing complexity. Evolution doesn’t Haussmannise, but patches, pieces, and cobbles together what is already there.

As a theoretical example, Michael Lynch (2007) used population genetic models to derive conditions for when molecular networks can extend and become complex by neutral means. (Spoiler: it’s when transcription factor binding motifs arise often in the weakly constrained DNA around genes.) Eugene Koonin thinks that the thing to do with this insight is to use it as a null model:

A simplified and arguably the most realistic approach is to assume a neutral null model and then seek evidence of selection that could falsify it. Null models are standard in physics but apparently not in biology. However, if biology is to evolve into a ”hard” science, with a solid theoretical core, it must be based on null models, no other path is known.

I disagree with this for two reasons. I’m not at all convinced that biology must be based on setting up null models and rejecting them … or that physics is. In some statistical approaches, inference proceeds by setting up a null hypothesis (and model), and trying to shoot it down. But those hypotheses are different from substantial scientific hypotheses. I would suspect that biology spends too much time rejecting nulls, not too little.

Bausman & Halina (2018) summarise the argument against null hypotheses like this in their recent paper in Biology & Philosophy:

The pseudo-null strategy is an attempt to move hypotheses away from parity by shifting the burden of disproving the null to the alternative hypotheses on the authority of statistics. As we have argued, there is no clear justification for this strategy, however, so the hypotheses should be treated on a par.

That is, they reject the analogy between statistical testing and scientific reasoning. They take their examples from ecology and psychology, but there is the same tendency in molecular evolution.

Also, constructive neutral evolution is a pretty elaborate process. Just like adaptation should not be assumed as a default model without positive supporting evidence, neither should constructive neutral evolution. The default alternative for some elaborate feature of an organism need not be ‘constructive neutral evolution’, but ‘we don’t know how it came about’.

On the other hand, maybe the paper shouldn’t be read as an attempt to set constructive neutral evolution up as the default, but, like Spandrels, to repeat that adaptation isn’t everything:

It is important to realize that this changed paradigm by no means denies the importance of adaptation, only requires that it is not taken for granted. As discussed above, adaptation is common even in the weak selection regime where non-adaptive processes dominate. But the adaptive processes change their character as manifested in the switch from local to global evolutionary solutions, CNE, and pervasive (broadly understood) exaptation.

Naïve adaptationism is certainly not dead, but just whisper $\frac {1}{N_e s}$ and the ghost goes away. I would have been more interested in an attack on sophisticated adaptationism. How about the organismal level? Do ratchet-like neutral processes bias or direct the evolution of form and behaviour of say animals and plants?
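The $\frac{1}{N_e s}$ incantation refers to the idea that selection only overpowers drift when $N_e s$ is large compared with 1. As a hedged illustration, here is Kimura’s diffusion approximation for the fixation probability of a new mutation, assuming a diploid population, additive (genic) selection, and census size equal to effective size unless stated otherwise. The function name and the example values are mine.

```python
import math

def fixation_probability(s, n_e, n=None):
    """Kimura's diffusion approximation for a single new mutation.

    s: selection coefficient of the heterozygote (additive selection),
    n_e: effective population size, n: census size (defaults to n_e).
    Intended for s >= 0; large negative N_e * s would overflow math.exp.
    """
    if n is None:
        n = n_e
    p0 = 1.0 / (2.0 * n)  # starting frequency of one new copy
    if s == 0:
        return p0  # neutral case: fixation probability equals frequency
    return (1.0 - math.exp(-4.0 * n_e * s * p0)) / (1.0 - math.exp(-4.0 * n_e * s))

n_e = 1000
neutral = fixation_probability(0.0, n_e)
for s in (1e-5, 1e-3, 1e-2):
    u = fixation_probability(s, n_e)
    print(f"N_e s = {n_e * s:g}: fixation probability is {u / neutral:.2f} times neutral")
```

With these values, the mutation with $N_e s$ well below 1 fixes at close to the neutral rate, while the one with $N_e s = 10$ fixes far more often than a neutral mutation would.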

Literature

Bausman W & Halina M (2018) Not null enough: pseudo-null hypotheses in community ecology and comparative psychology Biology & Philosophy

Gould SJ & Lewontin R (1979) The spandrels of San Marco and the Panglossian paradigm: a critique of the adaptationist programme Proceedings of the Royal Society B.

Koonin EV (2016) Splendor and misery of adaptation, or the importance of neutral null for understanding evolution BMC Biology.

Lynch M (2007) The evolution of genetic networks by non-adaptive processes Nature Reviews Genetics.

# Journal club of one: ‘Sacred text as cultural genome: an inheritance mechanism and method for studying cultural evolution’

This is a fun paper about something I don’t know much about: Hartberg & Sloan Wilson (2017) ‘Sacred text as cultural genome: an inheritance mechanism and method for studying cultural evolution’. It does exactly what it says on the package: it takes an image from genome science, that of genomic DNA and gene expression, and uses it as a metaphor for how pastors in Christian churches use the Bible. So, the Bible is the genome, churches are cells, and citing Bible passages in a sermon is gene expression, or at least something along those lines.

The authors use a quantitative analysis analogous to differential gene expression to compare the Bible passages cited in sermons from six Protestant churches in the US with different political leanings (three conservative and three progressive; coincidentally, N = 3 is kind of the stereotypical sample size of an early 2000s gene expression study). The main message is that the churches use the Bible differently, that the conservative churches use more of the text, and that even when they draw on the same book, they use different verses.

They exemplify with Figure 3, which shows a ‘Heat map showing the frequency with which two churches, one highly conservative (C1) and one highly progressive (P1), cite specific verses within chapter 3 of the Gospel According to John in their Sunday sermons.’ I will not reproduce it for copyright reasons, but it pretty clearly shows how P1 often cites the first half of the chapter but doesn’t use the second half at all. C1, instead, uses verses from the whole chapter, but its three most used verses are all in the latter half, that is the block that P1 doesn’t use at all. What are these verses? The paper doesn’t quote them except 3:16 ‘For God so loved the world, that he gave his one and only Son, that whoever believes in him should not perish, but have eternal life’, which is the exception to the pattern: it’s the most common verse in both churches (and generally, a very famous passage).

Chapter 3 of the Gospel of John is the story of how Jesus teaches Nicodemus. Here is John 3:1-17:

1 Now there was a man of the Pharisees named Nicodemus, a ruler of the Jews. 2 The same came to him by night, and said to him, ”Rabbi, we know that you are a teacher come from God, for no one can do these signs that you do, unless God is with him.”
3 Jesus answered him, ”Most certainly, I tell you, unless one is born anew, he can’t see God’s Kingdom.”
4 Nicodemus said to him, ”How can a man be born when he is old? Can he enter a second time into his mother’s womb, and be born?”
5 Jesus answered, ”Most certainly I tell you, unless one is born of water and spirit, he can’t enter into God’s Kingdom. 6 That which is born of the flesh is flesh. That which is born of the Spirit is spirit. 7 Don’t marvel that I said to you, ‘You must be born anew.’ 8 The wind blows where it wants to, and you hear its sound, but don’t know where it comes from and where it is going. So is everyone who is born of the Spirit.”
9 Nicodemus answered him, ”How can these things be?”
10 Jesus answered him, ”Are you the teacher of Israel, and don’t understand these things? 11 Most certainly I tell you, we speak that which we know, and testify of that which we have seen, and you don’t receive our witness. 12 If I told you earthly things and you don’t believe, how will you believe if I tell you heavenly things? 13 No one has ascended into heaven but he who descended out of heaven, the Son of Man, who is in heaven. 14 As Moses lifted up the serpent in the wilderness, even so must the Son of Man be lifted up, 15 that whoever believes in him should not perish, but have eternal life. 16 For God so loved the world, that he gave his one and only Son, that whoever believes in him should not perish, but have eternal life. 17 For God didn’t send his Son into the world to judge the world, but that the world should be saved through him.”

This is the passage that P1 uses a lot, but they break before they get to the verses that come right after: John 3:18-21. The conservative church uses them the most out of this chapter.

18 Whoever believes in him is not condemned, but whoever does not believe stands condemned already because they have not believed in the name of God’s one and only Son. 19 This is the verdict: Light has come into the world, but people loved darkness instead of light because their deeds were evil. 20 Everyone who does evil hates the light, and will not come into the light for fear that their deeds will be exposed. 21 But whoever lives by the truth comes into the light, so that it may be seen plainly that what they have done has been done in the sight of God.

So this is consistent with the idea of the paper: In the progressive church, the pastor emphasises the story about doubt and the possibility of salvation, where Nicodemus comes to ask Jesus for explanations, and Jesus talks about being born again. It also has some beautiful perplexing Jesus-style imagery with the spirit being like the wind. In the conservative church, the part about condemnation and evildoers hating the light gets more traction.

As for the main analogy between the Bible and a genome, I’m not sure that it works. The metaphors are mixed, and it’s not obvious what the unit of inheritance is. For example, when the paper talks about ‘fitness-enhancing information’, does that refer to the fitness of the church, the members of the church, or the Bible itself? The paper sometimes talks as if the Bible was passed on from generation to generation, for instance here in the introduction:

Any mechanism of inheritance must transmit information across generations with high fidelity and translate this information into phenotypic expression during each generation. In this article we argue that sacred texts have these properties and therefore qualify as important inheritance mechanisms in cultural evolution.

But the sacred text isn’t passed on from generation to generation. The Bible is literally a book that is transmitted by printing. What may be passed on is the way pastors interpret it and, in the authors’ words, ‘cherry pick’ verses to cite. But clearly, that is not stored in the Bible ‘genome’ but somehow in the culture of churches and the institutions of learning that pastors attend.

If we want to stick to the idea of the Bible as a genome, I think this story makes just as much sense: Don’t think about how this plasticity of interpretation may be adaptive for humans. Instead, take a sacred text-centric perspective, analogous to the gene-centric perspective. Think of the plasticity in interpretation as preserving the fitness of the Bible by making it fit community values. Because the Bible can serve as source material for churches with otherwise different values, it survives as one of the most important and widely read books in the world.

Literature

Hartberg, Yasha M., and David Sloan Wilson. ”Sacred text as cultural genome: an inheritance mechanism and method for studying cultural evolution.” Religion, Brain & Behavior 7.3 (2017): 178-190.

The Bible quotes are from the World English Bible translation.

# Journal club of one: ”Give one species the task to come up with a theory that spans them all: what good can come out of that?”

This paper by Hanna Kokko on human biases in evolutionary biology and behavioural biology is wonderful. The style is great, and it’s full of ideas. The paper asks, pretty much, the question in the title. How much do particularities of human nature limit our thinking when we try to understand other species?

Here are some of the points Kokko comes up with:

The use of introspection and perspective-taking in invention of hypotheses. The paper starts out with a quote from Robert Trivers advocating introspection in hypothesis generation. This is interesting, because I’m sure researchers do this all the time, but to celebrate it in public is another thing. To understand evolutionary hypotheses one often has to take the perspective of an animal, or some other entity like an allele of an enhancer or a transposable element, and imagine what its interests are, or how its situation resembles a social situation such as competition or a conflict of interest.

If this sounds fuzzy or unscientific, we try to justify it by saying that such language is a short-hand, and what we really mean is some impersonal, mechanistic account of variation and natural selection. This is true to some extent; population genetics and behavioural ecology make heavy use of mathematical models that are free of such fuzzy terms. However, the intuitive and allegorical parts of the theory really do play an important role both in invention and in understanding of the research.

While scientists avoid using such anthropomorphizing language (to an extent; see [18,19] for critical views), it would be dishonest to deny that such thoughts are essential for the ease with which we grasp the many dilemmas that individuals of other species face. If the rules of the game change from A to B, the expected behaviours or life-history traits change too, and unless a mathematical model forces us to reconsider, we accept the implicit ‘what would I do if…’ as a powerful hypothesis generation tool. Finding out whether the hypothesized causation is strong enough to leave a trace in the phylogenetic pattern then necessitates much more work. Being forced to examine whether our initial predictions hold water when looking at the circumstances of many species is definitely part of what makes evolutionary and behavioural ecology so exciting.

Bias against hermaphrodites and inbreeding. There is a downside, of course. Two of the examples Kokko gives of human biases possibly hampering evolutionary thought are hermaphroditism and inbreeding — two things that may seem quite strange and surprising from a mammalian perspective, but are the norm in a substantial number of taxa.

Null models and default assumptions. One passage clashes with how I like to think. Kokko brings up null models, or default assumptions, and identifies a correct null assumption with being ”simpler, i.e. more parsimonious”. I tend to think that null models may occasionally be useful for statistical inference, but are a bit suspect in scientific reasoning, both because there’s an asymmetry in defaulting to one model and putting the burden of proof on any alternative, and because parsimony is quite often in the eye of the beholder, or in the structure of the theories you’ve already accepted. But I may be wrong, at least in this case. If you want to formulate an evolutionary hypothesis about a particular behaviour (in this case, female multiple mating), it really does seem to matter for what needs explaining whether the behaviour could be explained by a simple model (bumping into mates randomly and not discriminating between them).

However, I think that in this case, what needs explaining is not actually a question about scope and explanatory power, but about phylogeny. There is an ancestral state and what needs explaining is how it evolved from there.

Group-level and individual-level selection. The most fun part, I think, is the speculation that our human biases may make us particularly prone to think of group-level benefits. I’ll just leave this quote here:

Although I cannot possibly prove the following claim, I consider it an interesting conjecture to think about how living in human societies makes us unusually strongly aware of the group-level consequences of our actions. Whether innate, or frequently enough drilled during upbringing to become part of our psyche, the outcome is clear. By the time a biology student enters university, there is a belief in place that evolution in general produces traits because they benefit entire species. /…/ What follows, then, is that teachers need to point out the flaws in one set of ideas (e.g. ‘individuals die to avoid overpopulation’) much more strongly than the other. After the necessary training, students then graduate with the lesson not only learnt but also generalized, at which point it takes the form ‘as soon as someone evokes group-level thinking, we’ve entered “bad logic territory”’.

Literature

Kokko, Hanna. (2017) ”Give one species the task to come up with a theory that spans them all: what good can come out of that?” Proc. R. Soc. B. Vol. 284. No. 1867.

# Selected, causal, and relevant

What is ”function”? In discussions about junk DNA people often make the distinction between ”selected effects” and ”causal roles”. Doolittle & Brunet (2017) put it like this:

By the first (selected effect, or SE), the function(s) of trait T is that (those) of its effects E that was (were) selected for in previous generations. They explain why T is there. … [A]ny claim for an SE trait has an etiological justification, invoking a history of selection for its current effect.

/…/

ENCODE assumed that measurable effects of various kinds—being transcribed, having putative transcription factor binding sites, exhibiting (as chromatin) DNase hypersensitivity or histone modifications, being methylated or interacting three-dimensionally with other sites — are functions prima facie, thus embracing the second sort of definition of function, which philosophers call causal role …

In other words, their argument goes: a DNA sequence can lack a selected effect while still having one, or potentially several, causal roles. Therefore, junk DNA isn’t dead.

First, if we want to know the fraction of the genome that is functional, we’d like to talk about positions in some reference genome, but the selected effect definition really only works for alleles. Positions aren’t adaptive, but alleles can be. They use the word ”trait”, but we can think of an allele as a trait (with really simple genetics: its genetic basis is its presence or absence in the genome).

Also, unfortunately for us, selection doesn’t act on alleles in isolation; there is linked selection, where alleles can be affected by selection without causally contributing anything to the adaptive trait. In fact, they may counteract the adaptive trait. It stands to reason that linked variants are not functional in the selected effect sense, but they complicate analysis of recent adaptation.

The authors note that there is a problem with alleles that have not seen positive selection, but only purifying selection (that could happen in constructive neutral evolution, which is when something becomes indispensable through a series of neutral or deleterious substitutions). Imagine a sequence where most mutations are neutral, but deleterious mutations can happen rarely. A realistic example could be the causal mutation for Friedreich’s ataxia: microsatellite repeats in an intron that occasionally expand enough to prevent transcription (Bidichandani et al. 1998, Ohshima et al. 1998; I recently read about it in Nessa Carey’s ”Junk DNA”). In such cases, selection does not preserve any function of the microsatellite. That a thing can break in a dangerous way is not enough to know that it was useful when whole.

Second, these distinctions may be relevant to the junk DNA debate, but for any research into the genetic basis of traits currently or in the future, such as medical genetics or breeding, neither of these perspectives is what we need. The question is not what parts of the genome come from adaptive alleles, nor what parts of the genome have causal roles. The question is what parts of the genome have causal roles that are relevant to the traits we care about.

The same example is relevant. It seems like the Friedreich’s ataxia-associated microsatellite does not fulfill the selected effect criterion. It does, however, have a causal role, and a causal role relevant to human disease, at that.

I do not dare to guess whether the set of sequences with causal roles relevant to human health is bigger or smaller than the set of sequences with selected effects. But they are not identical. And I will dare to guess that the relevant set, like the selected effect set, is a small fraction of the genome.

Literature

Doolittle, W. Ford, and Tyler DP Brunet. ”On causal roles and selected effects: our genome is mostly junk.” BMC biology 15.1 (2017): 116.

# Johan Frostegård ”Evolutionen och jag”

I read Johan Frostegård’s book about evolution and humans over Christmas. Frostegård is well-read and writes pleasantly about a bit of everything: some human prehistory, open evolutionary questions such as sexual reproduction, altruism, typically human traits, two chapters about syphilis, plus the author’s views on the philosophy of science, consciousness and morality. And God and Bob Dylan. It is fun to read a book about evolution with so many literary quotations. The best chapter is probably chapter 18, ”Immunologi, evolutionen och jag” (”Immunology, evolution and I”), which touches on his own research.

But I have a couple of objections. It moves too fast. I can’t keep up. The book never stays very long on any topic. There is an overarching theme, though: that various fields (medicine, ethics, economics, the humanities) would benefit from an evolutionary analysis. Unfortunately, the evolutionary analysis in the book is sometimes not very good. Here are two examples in detail:

This is what page 89 says about colour vision:

Just think of colour blindness, which is much more common in men than in women, and where a fairly reasonable explanation may be that it gives an advantage in vision over long distances, where the colour-blind person is considered to have a greater ability to discern contrasts, which has been exploited even in modern armies. Its prevalence is, statistically speaking, in many places roughly as if one person in every hunting party were colour blind.

It is not out of the question that red–green colour blindness comes with certain advantages and could also be subject to natural selection in humans under some circumstances. As mentioned, there is research suggesting that there are advantages and disadvantages to seeing two versus three colours. And apparently it is not all that unusual for primates to have variation in colour vision within a species (Surridge, Osorio & Mundy 2003).

But the question is, if it is better (hypothetically, mind you) to see two colours rather than three, why aren’t all men colour blind? There are several circumstances in which natural selection can maintain multiple variants of a gene in a population. That is, multiple variants of the gene continue to exist after the new variant has arisen through mutation. This happens when a variant is good sometimes and bad sometimes, and is called balancing selection.

It may be that a genetic variant has both positive and negative properties, so that the individuals who have one copy of it (carry it in the heterozygous state) get the best balance of advantages and disadvantages. Another alternative is that a genetic variant gives an advantage when it is rare in the population, but is bad when many others carry it.
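
The first of these scenarios, heterozygote advantage, can be sketched with the standard one-locus selection recursion. This is only a generic illustration, not a model of colour blindness (which is X-linked): it assumes an autosomal locus with two alleles, random mating and an infinite population, and the selection coefficients `s` and `t` are hypothetical numbers chosen for the example.

```python
def next_generation(p, w_aa, w_ab, w_bb):
    """Frequency of allele A after one generation of viability selection.

    Assumes a single autosomal locus with two alleles (A, B), random mating
    and an effectively infinite population (no drift, no mutation).
    """
    q = 1.0 - p
    mean_fitness = p * p * w_aa + 2 * p * q * w_ab + q * q * w_bb
    return (p * p * w_aa + p * q * w_ab) / mean_fitness


def iterate(p0, generations, w_aa, w_ab, w_bb):
    """Apply the recursion repeatedly, starting from frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_generation(p, w_aa, w_ab, w_bb)
    return p


# Hypothetical selection coefficients against the two homozygotes.
s, t = 0.1, 0.3
fitnesses = dict(w_aa=1 - s, w_ab=1.0, w_bb=1 - t)

# Standard result for heterozygote advantage: stable equilibrium at t / (s + t).
expected_equilibrium = t / (s + t)

from_rare = iterate(0.01, 2000, **fitnesses)
from_common = iterate(0.99, 2000, **fitnesses)
print(expected_equilibrium, from_rare, from_common)
```

Whether the allele starts out rare or common, its frequency moves towards the same intermediate value, so both variants stay in the population. That is the sense in which selection is ”balancing” here.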

But it is also possible that colour blindness arises fairly often through mutation and is not particularly harmful, and may be common for that reason.

Which of these is the case is an empirical question. An idea about how something could be an advantage is not enough to make a good evolutionary hypothesis. What does the reader take away from this reasoning if they don’t already know what balancing selection is? Well, a kind of speculation (if there is heritable variation in trait X, perhaps it is because it has an evolutionary advantage) without further data or evidence, which is common but misleading.

Example 2: There are some passages about the evolution of altruism and the debate about kinship and group selection.

E.O. Wilson describes the social ability of humankind, called eusociality, as a central trait, and even puts forward group selection as an underlying mechanism, the latter being something that has been much questioned. [38, 53] Group selection means that the competition in nature, which is the engine of natural selection, takes place not only at the individual level but also at the group level. (p. 91)

/…/

But a smaller group argues for the theory, with the doyen of sociobiology, E.O. Wilson, as a prominent name. He published an article in the prestigious journal Nature in which he, with two co-authors and mathematical models, described group selection as an explanation for social cooperation in social animals such as humans [38].

The study was immediately debated and harshly criticised, among others by Richard Dawkins, who argues that the theory of group selection disregards that it is the genes that are at the centre of evolution, by virtue of being replicators. Wilson does not deny this either. However, the last word has not been said, and my guess is that Wilson’s view will gain ground [37, 256]. (p. 307)

Yes, altruism nerds, reference number 38 is none other than Nowak, Tarnita & Wilson (2010). Number 256 is the reply that 140 evolutionary biologists wrote in the same journal. And no, it is not exactly common for a scientific paper to be followed by a protest petition in the same journal. (Number 37 is a review that Dawkins wrote of one of Wilson’s books.)

This is not an easy debate to summarise, and as you can see, it goes somewhat deeper than an exchange of opinions between Wilson and Dawkins. And Nowak, Tarnita & Wilson (2010) is not an easy paper to read. That is probably not only the authors’ fault, but also the fault of the journal’s space constraints. It consists of six pages of ”article” and 43 pages of ”supplementary materials” with all the details. The mathematical model gets a little over half a page in the article itself, without either results or a description of the method.

What can we say about it?

First: ”eusociality” is not really a word for ”the special social nature of humans”. It is the particular social system where animals live in colonies in which only a minority reproduce and the others are sterile. Think bee colonies, ant colonies and colonies of naked mole-rats. The authors evidently think that eusociality has enough in common with division of labour in humans for it to be an interesting analogy, but all they write about human social evolution in the paper is this:

We have not addressed the evolution of human social behavior here, but parallels with the scenarios of animal eusocial evolution exist, and they are, we believe, well worth examining.

Second: this is a debate about mathematical models. There is nothing wrong with that. Mathematical models and theoretical research are excellent, especially if you want to study something that cannot be observed. In this case, how a certain behaviour arose in the long-extinct ancestors of a species. But a discussion about the best way to build a mathematical model of a hypothetical scenario easily becomes a bit … theoretical.

If we want to build mathematical models of how altruism arose, there are a few different ways to do the accounting. Think of the worker bees in a bee colony. Why have they lost the ability to lay eggs? One way is to calculate how many offspring they can get indirectly through the queen, that is their mother, laying eggs. If their work makes the queen lay enough eggs, that may be a more effective way for them to spread their genes than if they were to go out into the world and lay eggs on their own. This is kin selection (Frostegård describes it on p. 304), and the way of accounting is called ”inclusive fitness”. ”Fitness” means reproductive success, and ”inclusive fitness” is reproductive success with the relatives’ contribution included.
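
The usual textbook summary of this accounting is Hamilton’s rule, which is not spelled out in the book passage or in this post: helping is favoured when relatedness times the benefit to the relative exceeds the cost to the helper. A minimal sketch follows, where the function names and all the numbers are hypothetical and chosen only for illustration.

```python
def inclusive_fitness_effect(relatedness, benefit, cost):
    """Net inclusive fitness effect of a helping behaviour: r * b - c.

    relatedness: genetic relatedness between helper and recipient (r)
    benefit:     extra offspring the recipient gets thanks to the help (b)
    cost:        offspring the helper gives up by helping (c)
    """
    return relatedness * benefit - cost


def helping_is_favoured(relatedness, benefit, cost):
    """Hamilton's rule: helping is favoured when r * b > c."""
    return inclusive_fitness_effect(relatedness, benefit, cost) > 0


# Hypothetical case: a helper gives up one offspring of its own (c = 1)
# to help a relative with r = 0.5 (e.g. a full sibling in a diploid species).
print(helping_is_favoured(0.5, benefit=3.0, cost=1.0))  # r * b = 1.5 > 1
print(helping_is_favoured(0.5, benefit=1.5, cost=1.0))  # r * b = 0.75 < 1
```

With these made-up numbers, giving up one offspring pays off if it lets the relative raise three more, but not if it only adds one and a half.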

Third, Nowak, Tarnita & Wilson (2010) is not about group selection. Not directly, at least. The paper is an attack on kin selection as an explanation for eusociality. They claim instead that their model, which does not calculate the workers’ inclusive fitness but instead describes how a mutation that makes workers stay in the nest spreads in a population, is more realistic. But above all they seem to think it is more elegant. This is what they write in the paper:

By formulating a mathematical model of population genetics and family structure, we see that there is no need for inclusive fitness theory. The competition between the eusocial and the solitary allele is described by a standard selection equation. There is no paradoxical altruism, no payoff matrix, no evolutionary game. A ”gene-centered” approach for the evolution of eusociality makes inclusive fitness theory unnecessary.

And then in comments on the Nowak group’s website:

Our paper does not study group selection, and it does not compare group selection and inclusive fitness. But given the limitations of inclusive fitness it is clear that many models of group selection cannot be analyzed in terms of inclusive fitness. Also note that our model for the evolution of eusociality is not a group selection model; instead it describes selection operating at the level of genes.

As I said, this debate is rather technical and, in plain language, a bloody mess. I understand not wanting to go into details in a popular science book on the subject. I don’t want to go into details either. But once again one can ask whether a reader who is not already familiar with the subject becomes any wiser from this. What do we take away, apart from the mistaken impression that eusociality is ”the social ability of humankind” and an argument from authority for group selection?

Literature

Frostegård, Johan. (2017) Evolutionen och jag. Volante. Stockholm.

Nowak, Martin A., Corina E. Tarnita, Edward O. Wilson. (2010) ”The evolution of eusociality.” Nature 466.7310

Abbot, Patrick, et al. (2011) ”Inclusive fitness theory and eusociality.” Nature 471.7339

Surridge, Alison K., Daniel Osorio, and Nicholas I. Mundy. (2003) ”Evolution and selection of trichromatic vision in primates.” Trends in Ecology & Evolution 18.4

# European Society for Evolutionary Biology congress, Groningen, 2017

The European Society for Evolutionary Biology meeting this year took place August 20–25 in Groningen, Netherlands. As usual, the meeting was great, with lots of good talks and posters. I was also happy to meet colleagues, including people from Linköping who I’ve missed a lot since moving.

Here are some of my subjective highlights:

There were several interesting talks in the recombination symposium, spanning from theory to molecular biology and from within-population variation to phylogenetic distances. For example: Irene Tiemann-Boege talked about recombination hotspot evolution from the molecular perspective with mutation bias and GC-biased gene conversion (Arbeithuber & al 2015), while Francisco Úbeda de Torres presented a population genetic model of recombination hotspots. I would need to pore over the paper to understand what was going on and if the model solves the hotspot paradox (as the title said), and how it is different from his previous model (Úbeda & Wilkins 2011).

There were also talks about young sex chromosomes. Alison Wright talked about recombination suppression on the evolving guppy sex chromosomes (Wright & al 2017), and Bengt Hansson about the autosome–sex chromosome fusion in Sylvioidea birds (Pala & al 2012).

Piter Bijma gave two (!) talks on social genetic effects. That is when your trait value depends not just on your genotype, but on the genotypes of others around you, a situation that is probably not at all uncommon. After all, animals often live in groups, and plants have to stay put where they are. One can model this, which leads to a slightly wacky quantitative genetics where heritable variance can be greater than the trait variance, and where the individual and social effects can cancel each other out and prevent response to selection.
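
To see how heritable variance can exceed trait variance, here is a small numerical sketch. It assumes the usual formulation of these models as I understand it: groups of n unrelated individuals, where each individual’s phenotype is its own direct genetic and environmental effect plus the social effects of its n − 1 group mates, and where an individual’s total breeding value is its direct effect plus n − 1 times its social effect. The function names and variance components are hypothetical, chosen only for illustration.

```python
def total_heritable_variance(n, var_direct, var_social, cov_direct_social):
    """Variance of total breeding values, TBV = A_D + (n - 1) * A_S."""
    return (var_direct
            + 2 * (n - 1) * cov_direct_social
            + (n - 1) ** 2 * var_social)


def phenotypic_variance(n, var_direct, var_social, var_env_direct, var_env_social):
    """Phenotypic variance, assuming the n group members are unrelated."""
    return (var_direct + var_env_direct
            + (n - 1) * (var_social + var_env_social))


n = 8  # hypothetical group size

# Case 1: modest social genetic variance, no direct-social covariance.
heritable = total_heritable_variance(n, var_direct=1.0, var_social=0.2,
                                     cov_direct_social=0.0)
phenotypic = phenotypic_variance(n, var_direct=1.0, var_social=0.2,
                                 var_env_direct=2.0, var_env_social=0.1)
print(heritable, phenotypic, heritable / phenotypic)

# Case 2: direct and social effects perfectly negatively correlated and
# scaled so that they cancel: A_D = -(n - 1) * A_S for every individual.
cancelled = total_heritable_variance(n, var_direct=1.0,
                                     var_social=1.0 / (n - 1) ** 2,
                                     cov_direct_social=-1.0 / (n - 1))
print(cancelled)
```

The social genetic variance enters the total heritable variance multiplied by (n − 1)², but the phenotypic variance only multiplied by n − 1, which is why the ratio can be above one. In the second case, the total heritable variance is essentially zero even though there is direct genetic variance, so there is nothing for selection to act on.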

I first heard about this at ICQG in Edinburgh a few years ago (if memory serves, it was Bruce Walsh presenting Bijma’s slides?), but have only made a couple of fairly idle and unsuccessful attempts to understand it since. I got the feeling that social genetic effects should have some bearing on debates about kin selection versus multilevel selection, but I’m not sure how it all fits together. It is nice that it comes with a way to estimate effects (given that we know which individuals are in groups together and their relatedness), and there are some compelling case studies (Wade & al 2010). On the other hand, separating social genetic effects from other social effects must be tricky; for example, early social environment effects can look like indirect genetic effects (Canario, Lundeheim & Bijma 2017).

Philipp Gienapp talked about using realised relatedness (i.e. genomic relationships a.k.a. throw all the markers into the model and let partial pooling sort them out) to estimate quantitative genetic parameters in the wild. There is a lot of relevant information in the animal breeding and human genetics literature, but applying these things in the wild comes with challenges that deserve some new research to sort things out. Evolutionary genetics, similar to human genetics, is more interested in parameter estimation than prediction of phenotypes or breeding values. On the other hand, human genetics methods often work on GWAS summary statistics. In this way, evolutionary genetics is probably more similar to breeding. Also, the relatedness structure of the populations may matter. Evolution happens in all kinds of populations, large and small, structured and well-mixed. Therefore, evolutionary geneticists may work with populations that are different from those in breeding and human genetics.

For example, someone asked about estimating genetic correlations with genomic relationships. There are certainly animal breeding and human genetics papers about realised relatedness and genetic correlation (Jia & Jannink 2012, Visscher & al 2014 etc), because of course, breeders need to deal a lot with correlated traits and human geneticists really like finding genetic correlations between different GWAS traits.

Speaking of population structure, Fst scans are still all the rage. There was a lot of discussion about trying to find regions of the genome that stand out as more differentiated in closely related populations (”genomic islands of speciation/divergence/differentiation”), and as less differentiated in mostly separated populations (introgression, possibly adaptive). But it’s not just Fst outliers. It’s encouraging to see different kinds of quantitative and population genomic methods applied in the same systems. On the hybrid and introgression side of things, Leslie Turner (Turner & Harr 2014) and Jun Kitano (Ravinet & al 2017) gave interesting talks on mice and sticklebacks, respectively. Daniele Filiault showed a super impressive integrative GWAS and selection mapping study of local adaptation in Swedish Arabidopsis thaliana (Kerdaffrec & al 2016).

Susan Johnston spoke about recombination mapping in Soay sheep and Rum deer (Johnston & al 2016, 2017). Given how few large long term genetic studies like this there are, it’s marvelous to see the same kind of analysis in two parallel systems. Jason Munshi-South gave what seemed like a fascinating talk about rodent evolution in New York City (Harris & Munshi-South 2017). Unfortunately, too many other people thought so too, and I mostly failed to eavesdrop from the corridor.

Finally, Nina Wedell gave a wonderful presidential address about Evolution in the 21st century. ”Because I can. I’m the president now.” Yes!

The talk was about threats to evolutionary biology, examples of its usefulness and a series of calls to action. I liked the part about celebrating science much more than the common call to explain science to people. You know, like you hear at seminars and the march for science: We need to ”get out there” (where?) and ”explain what we’re doing” (to whom?). Because if it is true that science and scientists are being questioned, then scientists should speak in a way that works even if they’re not starting by default from a position of authority. Scientists need not just explain the science, but justify why the science is worth listening to in the first place.

”As your current president, I encourage you to celebrate evolution!”

I think this is precisely right, and it made me so happy. Of course, it leaves questions like ”What does that mean?” and ”How do we do it?”, but as a two-word slogan, I think it is perfect.

Celebration aligns with sound rhetorical strategy in two ways. First, explanation is fine when someone asks for it, or is otherwise already disposed to listen to an explanation. But otherwise, it is more important to awaken interest and a positive state of mind before laying out the facts. (I can’t claim to be any kind of rhetorics expert. But see Rhetoric: for Herennius, Book I, V-VII for ancient wisdom on the topic.) By the way, I’m sure this is what people who are good at science communication actually do. Second, celebration means concentrating on the excitement and wonder, and the good things science can do. In that way, it prevents the trap of listing all the bad things that will happen if Trumpists, creationists and anti-vaccine activists get their way.

Nina Wedell also gave examples of the usefulness of evolution: biomimicry, directed evolution of enzymes, the power of evolutionary algorithms, plant and animal breeding, and prevention of resistance to herbicides and antibiotics. These are all good, worthy things, but also quite a limited subset of evolutionary biology? Maybe the idea is that evolutionary biology should be a basic science supporting applications like these. In line with that, she brought up how serendipitous useful things can come from studying strange diverse organisms and figuring out how they do things. The example in the talk was the CRISPR–Cas system. Similar stories apply to other proteins used as biomedical and biotechnology tools, such as Taq polymerase and green fluorescent protein.

I have to question a remark about reproducibility, though. The list of threats included ”critique of the scientific method” and concerns over reproducibility, as if this was something that came from outside of science. I may have misunderstood. It was a very brief comment. But if problems with reproducibility are a threat to science, and I think they can be, then it’s not just a problem of image but a problem with how scientists perform, analyse, and report their science.

Evolutionary biology hasn’t been in the reproducibility crisis news the same way as psychology or behavioural genetics, but I don’t know if that is because of better quality, or just that no one has looked that carefully for the problems. There are certainly contradictory results here too, and the same overly flexible data analysis and selective reporting practices that cause problems elsewhere must be common in evolution too. I can think of some reasons why evolutionary biology may be better off. Parts of the field default to analysing data with multilevel or mixed models. Mixed models are not perfect, but they help with some multiple testing problems by fitting and partially pooling a lot of coefficients in the same model. Also, studies that use classical model organisms may be able to get a lot of replication, low variance, and large sample sizes in a way that is impossible for example with human experiments.
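
As a rough illustration of what partial pooling does, here is a sketch of the simplest case: several noisy group estimates shrunk towards a common mean. It assumes normally distributed effects with a known between-group variance, which a real mixed model would estimate from the data; the function name and the numbers are hypothetical.

```python
def partial_pool(estimates, std_errors, between_var):
    """Shrink group estimates towards their precision-weighted mean.

    Each estimate is pulled towards the grand mean by a factor that depends
    on how noisy it is relative to the assumed between-group variance.
    """
    weights = [1.0 / (se ** 2 + between_var) for se in std_errors]
    grand_mean = (sum(w * y for w, y in zip(weights, estimates))
                  / sum(weights))
    pooled = []
    for y, se in zip(estimates, std_errors):
        shrinkage = between_var / (between_var + se ** 2)
        pooled.append(grand_mean + shrinkage * (y - grand_mean))
    return grand_mean, pooled


# Three hypothetical group effects, equally noisy.
grand_mean, pooled = partial_pool([-2.0, 0.0, 2.0], [1.0, 1.0, 1.0],
                                  between_var=1.0)
print(grand_mean, pooled)

# The same estimates, but the last one is much noisier and gets shrunk more.
_, pooled_noisy = partial_pool([-2.0, 0.0, 2.0], [1.0, 1.0, 3.0],
                               between_var=1.0)
print(pooled_noisy)
```

Extreme estimates that rest on little information are pulled in the most, which is why fitting many coefficients in one model is more forgiving than testing each of them separately.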

So I don’t know if there is a desperate need for large initiatives for replication of key results, preregistration of studies, and improvement of data analysis practice in evolution; there may or there may not. But wouldn’t it still be wonderful if we had them?

Bingo! I don’t have a ton of photos from Groningen, but here is my conference bingo card. Note what conspicuously isn’t filled in: the poster sessions took place in a nice big room, and were not that loud. In retrospect, I probably didn’t go to enough of the non-genetic inheritance talks, and I should’ve put Fisher 1930 instead of 1918.