Different ways to cite papers

The journals Genetics and Nature Genetics seem to take opposite views on citations. See first this editorial from Nature Genetics: ”Neutral citation is poor scholarship”. It is strongly worded in a way that is surprising and entertaining:

The journal deplores and will decline to consider manuscripts that fail to identify the key findings of published articles and that—deliberately or inadvertently—omit the reason the prior work is cited.

(All the emphasis in all the quotes was added by me.)

The passage that suggests a difference in citation policy occurs at the end:

Authors are of course free to select the literature that is relevant to their current work and to cite in their arguments only those publications that meet their standards of evidence and quality.

Genetics, on the other hand, says this in the instructions for preparing a manuscript:

Authors are encouraged to:

  • cite the supporting literature completely rather than select a subset of citations;
  • provide important background citations, including relevant review papers (to help orient the non-specialist reader);
  • to cite similar work in other organisms.

I’m sure the editors of Genetics also don’t support scattershot citation of tangentially related papers (as in ”This field exists [1-20]”), but they seem to take a different stance on how to choose what to cite.

I wonder what the writers of the respective recommendations would make of these, in my opinion delightful, opening sentences (from Yun & Agrawal 2014). Note the absence of hundreds of citations.

Inbreeding depression has been estimated hundreds of times in a wide variety of taxa. From this body [of] work, it is clear that inbreeding depression is common but also that it is highly variable in magnitude.

On DNA Day: DNA metaphors

There are different metaphors for deoxyribonucleic acid and what it means to us. DNA can be a blueprint, a recipe, a program, or writing.

It is almost impossible to say anything about molecular genetics without metaphors. With quantitative genetics it is a little easier, at least until the statistical models and calculations come out. Quantitative genetics is about things that everyone can see in everyday life, such as family resemblance and kinship. Molecular genetics is about things that are, admittedly, part of the public consciousness, but that are not visible around us.

But metaphors can be unhelpful and lead our thinking astray. The image of DNA as a blueprint of the organism can seem too simple and lead to thoughts of genetic determinism. Now, even though I am supposed to be an engineer, I don't know much about blueprints. In several ways the metaphor is not so bad: a blueprint represents what is to be built, using a specialised visual language, in a lower dimension. A house is in 3D, but a blueprint is in 2D. Proteins are three-dimensional; the genetic code describes them in one dimension. But it may be true that the word ”blueprint” (”ritning” or ”blåkopia” in Swedish) suggests something that is too exact and too much of a depiction.

An alternative is that DNA is a recipe (many have suggested this, among them Richard Dawkins in The Blind Watchmaker, 1986). The recipe has the advantage that it describes a process with both ingredients and instructions. That is a bit like the development of an organism from a fertilised egg to an adult. ”Add maternal bicoid at one end and nanos at the other; let the proteins mix freely”, and so on (Gilbert 2000). Another advantage is that it naturally reminds us that DNA is not everything. The same recipe, with local differences in ingredients and improvisations by the cook, turns into different dishes. On the other hand, the recipe exaggerates what is in the DNA. Which genes are expressed where and when is an interplay between DNA and the proteins and RNA that are already present in a cell at a given time.

Or DNA is a program. Programs are also instructions, so the program has the same advantages and disadvantages as the recipe on that point. On the other hand, programs are abstract and free from concrete ingredients and associations with cooking. A bit like a blueprint, it sounds mechanical and exact. It clearly also matters what DNA is supposed to be a blueprint of or a recipe for. Calling DNA a blueprint of proteins is rather different from calling it a recipe for an organism.

Finally, there are metaphors written into the terminology itself. When geneticists talk about DNA, how it is passed on and how it is used, we talk about it as a written language. It is called copying, or replication, when DNA is reproduced before cells divide. It is called transcription, that is, copying but with a note of transfer into another form or another medium, when RNA is produced from DNA. It is called translation when RNA in turn serves as a template for protein synthesis. On top of it all, we write DNA with an alphabet of four letters: A, C, T, G. It is an image so fitting that it is almost true.
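As a toy illustration of the written-language metaphor, the three steps can be sketched as string operations in Python. This is a minimal sketch, not a model of the biology: the function names and the example sequence are made up for illustration, the input is assumed to be the coding strand, and the lookup table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, with codons ordered by the bases U, C, A, G
# (first base varies slowest, third base fastest); "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def copy_strand(dna):
    """'Copying': return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def transcribe(coding_strand):
    """'Transcription': same text in another medium (T is written as U)."""
    return coding_strand.replace("T", "U")


def translate(rna):
    """'Translation': read three letters at a time until a stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCCTGA"  # hypothetical example sequence, not from any real gene
print(copy_strand(dna))
print(transcribe(dna))
print(translate(transcribe(dna)))
```

The sketch also shows where the image stops being true: real transcription reads the template strand, and nothing here knows about reading frames, splicing, or the machinery that does the reading.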

(On 25 April 1953, the papers presenting the structure of the DNA molecule were published. Hence DNA Day. Earlier DNA Day posts: Genetik utan dna (2016), Gener, orsak och verkan (2015), På dna-dagen (2014).)

Clearly, obviously

This is my kind of letter to Nature:

This is a friendly suggestion to colleagues across all scientific disciplines to think twice about ever again using the words ‘obviously’ and ‘clearly’ in scientific and technical writing. These words are largely unhelpful, particularly to students, who may be counterproductively discouraged if what is described is not in fact obvious or clear to them.

Clearly, this is easier said than done. It is common writers’ advice to remove adverbs, and to a lesser extent adjectives. These words may be pointless filler words, and when they’re not, there is a risk of telling the reader what to think in a manner that seems impolite. But they also do some work to make the text flow, and prose without them can seem sterile and disconnected.

If we could also get rid of ”surprisingly”, I would be happy.

Journal club of one: ”Give one species the task to come up with a theory that spans them all: what good can come out of that?”

This paper by Hanna Kokko on human biases in evolutionary biology and behavioural biology is wonderful. The style is great, and it’s full of ideas. The paper asks, pretty much, the question in the title. How much do particularities of human nature limit our thinking when we try to understand other species?

Here are some of the points Kokko comes up with:

The use of introspection and perspective-taking in invention of hypotheses. The paper starts out with a quote from Robert Trivers advocating introspection in hypothesis generation. This is interesting, because I’m sure researchers do this all the time, but to celebrate it in public is another thing. To understand evolutionary hypotheses one often has to take the perspective of an animal, or some other entity like an allele of an enhancer or a transposable element, and imagine what its interests are, or how its situation resembles a social situation such as competition or a conflict of interest.

If this sounds fuzzy or unscientific, we try to justify it by saying that such language is a short-hand, and what we really mean is some impersonal, mechanistic account of variation and natural selection. This is true to some extent; population genetics and behavioural ecology make heavy use of mathematical models that are free of such fuzzy terms. However, the intuitive and allegorical parts of the theory really do play an important role both in invention and in understanding of the research.

While scientists avoid using such anthropomorphizing language (to an extent; see [18,19] for critical views), it would be dishonest to deny that such thoughts are essential for the ease with which we grasp the many dilemmas that individuals of other species face. If the rules of the game change from A to B, the expected behaviours or life-history traits change too, and unless a mathematical model forces us to reconsider, we accept the implicit ‘what would I do if…’ as a powerful hypothesis generation tool. Finding out whether the hypothesized causation is strong enough to leave a trace in the phylogenetic pattern then necessitates much more work. Being forced to examine whether our initial predictions hold water when looking at the circumstances of many species is definitely part of what makes evolutionary and behavioural ecology so exciting.

Bias against hermaphrodites and inbreeding. There is a downside, of course. Two of the examples Kokko gives of human biases possibly hampering evolutionary thought are hermaphroditism and inbreeding — two things that may seem quite strange and surprising from a mammalian perspective, but are the norm in a substantial number of taxa.

Null models and default assumptions. One passage clashes with how I like to think. Kokko brings up null models, or default assumptions, and identifies a correct null assumption with being ”simpler, i.e. more parsimonious”. I tend to think that null models may be occasionally useful for statistical inference, but are a bit suspect in scientific reasoning. Both because there’s an asymmetry in defaulting to one model and putting the burden of proof on any alternative, and because parsimony is quite often in the eye of the beholder, or in the structure of the theories you’ve already accepted. But I may be wrong, at least in this case. If you want to formulate an evolutionary hypothesis about a particular behaviour (in this case, female multiple mating), it really does seem to matter for what needs explaining if the behaviour could be explained by a simple model (bumping into mates randomly and not discriminating between them).

However, I think that in this case, what needs explaining is not actually a question about scope and explanatory power, but about phylogeny. There is an ancestral state and what needs explaining is how it evolved from there.

Group-level and individual-level selection. The most fun part, I think, is the speculation that our human biases may make us particularly prone to think of group-level benefits. I’ll just leave this quote here:

Although I cannot possibly prove the following claim, I consider it an interesting conjecture to think about how living in human societies makes us unusually strongly aware of the group-level consequences of our actions. Whether innate, or frequently enough drilled during upbringing to become part of our psyche, the outcome is clear. By the time a biology student enters university, there is a belief in place that evolution in general produces traits because they benefit entire species. /…/ What follows, then, is that teachers need to point out the flaws in one set of ideas (e.g. ‘individuals die to avoid overpopulation’) much more strongly than the other. After the necessary training, students then graduate with the lesson not only learnt but also generalized, at which point it takes the form ‘as soon as someone evokes group-level thinking, we’ve entered “bad logic territory”’.

Literature

Kokko, Hanna. (2017) ”Give one species the task to come up with a theory that spans them all: what good can come out of that?” Proc. R. Soc. B. Vol. 284. No. 1867.

NASA and Orphan Black

A few months ago I wrote a post about the (fictitious, and also evil) clone experiment in Orphan Black. I said that comparison of complex traits between a handful of individuals isn’t, even in principle, a ”scientifically beautiful setup to learn myriad things”, but garbage. You can’t take two humans, even if they’re clones, put them in different environments, and expect to learn much of anything.

Funnily enough, it seems like NASA has been doing just that with the NASA twin study: there are two astronauts who are twins, and researchers have compared various things between them and before/after one of them went to space. Of course, those various things include headline-attracting assays like telomere length and DNA methylation (including ”epigenetic age” — something like Horvath 2013, I assume).

The news coverage has been confused, mixing up DNA methylation, gene expression and mutation. But can one blame news outlets for reporting about ”7% changes to his DNA” and ”space genes” when the press release said this:

Another interesting finding concerned what some call the “space gene”, which was alluded to in 2017. Researchers now know that 93% of Scott’s genes returned to normal after landing. However, the remaining 7% point to possible longer term changes in genes related to his immune system, DNA repair, bone formation networks, hypoxia, and hypercapnia.

Someone who knows some biology can guess that this doesn’t refer to mutation, but it’s not making things easy for the reader, and when put like that, the 7% could be DNA methylation, gene expression, or something else transient and genomic. (They’ve since clarified that it was gene expression — in some sample; my bet is on white blood cells.)

Now that we’ve made fun of NASA a little, there are some circumstances when we can learn useful things from studies of even a single individual. For example, if Chaser the Border Collie can learn the names of 1000 toys, and learn new toy names through reasoning by exclusion (Pilley & Reid 2011), then we can safely assume that this is within the realm of dog abilities. Another example is a reference genome, which in the best case is made from a single individual, ideally an individual who is as homozygous as possible. When comparing the reference genome to that of other species, we feel confident enough to publish genome papers with comparisons of gene content, gene family evolution, and selection on protein coding sequences over evolutionary timescales. But when it comes to functional genomics, many variable molecular trait measurements all along the genome? No.

The study is not out. It may be better than the advertisement. It seems they’ve compared the two men before and after, so they can get some handle on differences that came about in the years leading up to the study. And maybe they’ve run a crazy number of technical replicates to make sure that the value they get from each data point is as good a measurement as possible. And maybe there is data on what happens with these kinds of assays when people do other strenuous things, putting the differences into context. Maybe.
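To see why technical replicates only go so far, here is a minimal sketch in Python of a toy model (my assumption, not anything from the NASA study): each man's true value for some assay differs by individual-level variation, each measurement adds independent noise, and there is no effect of space at all. The function name and the numbers are hypothetical.

```python
import math


def sd_of_pair_difference(sd_individual, sd_measurement, n_replicates):
    """SD of the difference between two individuals' replicate means
    when there is no treatment effect (toy model, assumed independence)."""
    # Variance of one individual's mean: individual-level variation plus
    # measurement noise, which shrinks with the number of replicates.
    var_one = sd_individual ** 2 + sd_measurement ** 2 / n_replicates
    # The difference between two independent individuals doubles the variance.
    return math.sqrt(2 * var_one)


# Hypothetical numbers: individual variation and measurement noise both 1.0.
for n_replicates in (1, 10, 1000):
    print(n_replicates, round(sd_of_pair_difference(1.0, 1.0, n_replicates), 3))
```

However many replicates are run, the spread of differences expected without any treatment never drops below the square root of two times the individual-level standard deviation. With one person per group, that floor cannot be told apart from an effect of going to space.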

Literature

Pilley, John W., and Alliston K. Reid. ”Border collie comprehends object names as verbal referents.” Behavioural processes 86.2 (2011): 184-195.

Horvath, Steve. ”DNA methylation age of human tissues and cell types.” Genome biology 14.10 (2013): R115.

Speaking Swedish

Now that I don't need to talk about genetics in Swedish at all, it suddenly feels extra important to think about it.

Ideally, of course, I would like to be able to talk about genetics in Swedish using terms that are understandable, convenient, and don't feel contrived. What feels contrived is naturally a matter of taste. Should one write ”enbaspolymorfier” (single-base polymorphisms) or ”snippar” (SNPs, pronounced as a word)? The first sounds like government-office Swedish and the second is a funny noise with genital associations.

I can come up with all sorts of excuses not to talk about genetics in Swedish (”it sounds dorky”; ”there are no words”), but they are not particularly good. It is also true, of course, that someone like me is better at my mother tongue than at a second language I learned in school, and probably both thinks and writes more efficiently and with more nuance in Swedish than in English.

What are the best sources of Swedish genetic terms? I assume that most Swedish-speaking geneticists do as I do and rely on a mixture of: what we have heard older academics say, reference works such as Nationalencyklopedin and Wikipedia, Biotermgruppen's list, perhaps the KI library's Swedish MeSH terms and, if all else fails, translating from English as we see fit.

Genetic terminology has several troublesome properties. For one, there are many loanwords from Latin and Greek, such as ”epistasi” (epistasis), ”pleiotropi” (pleiotropy), ”eukaryot” (eukaryote), …, that are probably not directly self-explanatory even to someone who knows Latin or Greek. About ”epistasi”, by the way … Biotermgruppen calls it ”epistas”, KI-MeSH writes ”epistasi” and Wikipedia ”epistasis”. Naturally, geneticists in different specialities also use the same word in different ways. ”Pleiotropy” means three different things (Paaby & Rockman 2013). Or was it seven different things (Hodgkin 1998)?

Then there are lots of words that mean roughly the same thing. What is the difference between ”variant” and ”allele”? Does ”gene” mean the same thing as ”locus”, or is it ”variant” and ”locus” that mean the same thing? It depends on who answers.

And finally, geneticists seem to believe that it helps the reader, or makes them look clever, if they coin lots of abbreviations. And then, preferably, as with the ”snippar” above, turn the abbreviations into funny little noises. Snipp and BLUP and tork and kvark were six little dwarfs.

Selected, causal, and relevant

What is ”function”? In discussions about junk DNA people often make the distinction between ”selected effects” and ”causal roles”. Doolittle & Brunet (2017) put it like this:

By the first (selected effect, or SE), the function(s) of trait T is that (those) of its effects E that was (were) selected for in previous generations. They explain why T is there. … [A]ny claim for an SE trait has an etiological justification, invoking a history of selection for its current effect.

/…/

ENCODE assumed that measurable effects of various kinds—being transcribed, having putative transcription factor binding sites, exhibiting (as chromatin) DNase hypersensitivity or histone modifications, being methylated or interacting three-dimensionally with other sites — are functions prima facie, thus embracing the second sort of definition of function, which philosophers call causal role …

In other words, their argument goes: a DNA sequence can be without a selected effect while it has, potentially several, causal roles. Therefore, junk DNA isn’t dead.

Two things about these ideas:

First, if we want to know the fraction of the genome that is functional, we’d like to talk about positions in some reference genome, but the selected effect definition really only works for alleles. Positions aren’t adaptive, but alleles can be. They use the word ”trait”, but we can think of an allele as a trait (with really simple genetics: its genetic basis is its presence or absence in the genome).

Also, unfortunately for us, selection doesn’t act on alleles in isolation; there is linked selection, where alleles can be affected by selection without causally contributing anything to the adaptive trait. In fact, they may counteract the adaptive trait. It stands to reason that linked variants are not functional in the selected effect sense, but they complicate analysis of recent adaptation.

The authors note that there is a problem with alleles that have not seen positive selection, but only purifying selection (that could happen in constructive neutral evolution, which is when something becomes indispensable through a series of neutral or deleterious substitutions). Imagine a sequence where most mutations are neutral, but deleterious mutations can happen rarely. A realistic example could be the causal mutation for Friedreich’s ataxia: microsatellite repeats in an intron that occasionally expand enough to prevent transcription (Bidichandani et al. 1998, Ohshima et al. 1998; I recently read about it in Nessa Carey’s ”Junk DNA”). In such cases, selection does not preserve any function of the microsatellite. That a thing can break in a dangerous way is not enough to know that it was useful when whole.

Second, these distinctions may be relevant to the junk DNA debate, but for any research into the genetic basis of traits currently or in the future, such as medical genetics or breeding, neither of these perspectives is what we need. The question is not what parts of the genome come from adaptive alleles, nor what parts of the genome have causal roles. The question is what parts of the genome have causal roles that are relevant to the traits we care about.

The same example is relevant. It seems like the Friedreich’s ataxia-associated microsatellite does not fulfill the selected effect criterion. It does, however, have a causal role, and a causal role relevant to human disease, at that.

I do not dare to guess whether the set of sequences with causal roles relevant to human health is bigger or smaller than the set of sequences with selected effects. But they are not identical. And I will dare to guess that the relevant set, like the selected effect set, is a small fraction of the genome.

Literature

Doolittle, W. Ford, and Tyler DP Brunet. ”On causal roles and selected effects: our genome is mostly junk.” BMC biology 15.1 (2017): 116.