Journal club: ”Template plasmid integration in germline genome-edited cattle”

(This time it’s not just a Journal Club of One, because this post is based on a presentation given at the Hickey group journal club.)

The backstory goes like this: Polled cattle lack horns, and it would be safer and more convenient if more cattle were born polled. Unfortunately, not all breeds have a lot of polled cattle, and that means that breeding hornless cattle is difficult. Gene editing could help (see Bastiaansen et al. (2018) for a model).

In 2013, Tan et al. reported taking cells from horned cattle and editing them to carry the polled allele. In 2016, Carlson et al. cloned bulls based on a couple of these cell lines. The plan was to use the bulls, now grown, to breed polled cattle in Brazil (Molteni 2019). But a few weeks ago, FDA scientists (Norris et al 2019) posted a preprint that found inadvertent plasmid insertion in the bulls, using the public sequence data from 2016. Recombinetics, the company making the edited bulls, conceded that they’d missed the insertion.

”We weren’t looking for plasmid integrations,” says Tad Sonstegard, CEO of Recombinetics’ agriculture subsidiary, Acceligen, which was running the research with a Brazilian consulting partner. ”We should have.”


For context: To gene edit a cell, one needs to bring both the editing machinery (proteins in the case of TALENs, the method used here; proteins and RNA in the case of CRISPR) and the template DNA into the cell. The template DNA is the DNA you want to put in instead of the piece that you’re changing. There are different ways to get the components into the cell. In this case, the template was delivered as part of a plasmid, which is a bacterially derived circular DNA molecule.

The idea is that the editing machinery should find a specific place in the genome (where the variant that causes polledness is located), make a cut in the DNA, and the cell, in its efforts to repair the cut, will incorporate the template. Crucially, it’s supposed to incorporate only the template, and not the rest of the plasmid. But in this case, the plasmid DNA snuck in too, and became part of the edited chromosome. Biological accidents happen.

How did they miss that, and how did the FDA team detect it? Both the 2016 and 2019 papers are short letters where a lot of the action is relegated to the supplementary materials. Here are pertinent excerpts from Carlson & al 2016:

A first PCR assay was performed using (btHP-F1: 5’- GAAGGCGGCACTATCTTGATGGAA; btHP-R2- 5’- GGCAGAGATGTTGGTCTTGGGTGT) … The PCR creates a 591 bp product for Pc compared to the 389 bp product from the horned allele.

Secondly, clones were analyzed by PCR using the flanking F1 and R1 primers (HP1748-F1- 5’- GGGCAAGTTGCTCAGCTGTTTTTG; HP1594_1748-R1- 5’-TCCGCATGGTTTAGCAGGATTCA) … The PCR creates a 1,748 bp product for Pc compared to the 1,546 bp product from the horned allele.

All PCR products were TOPO cloned and sequenced.

Thus, they checked that the animals were homozygotes for the polled allele (called ”Pc”) by amplifying two diagnostic regions and sequencing them to check the edit. This shows that the target DNA is there.
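As a quick sanity check on the quoted numbers: both assays amplify across the same edit, so they should show the same size difference between the alleles. Here is a minimal Python sketch that uses only the product sizes given in the supplement; the dictionary layout and function name are my own.

```python
# Product sizes in bp, as quoted from the supplement of Carlson & al (2016).
# The data structure and names are illustrative only.
ASSAYS = {
    "btHP-F1/btHP-R2": {"Pc": 591, "horned": 389},
    "HP1748-F1/HP1594_1748-R1": {"Pc": 1748, "horned": 1546},
}


def size_difference(sizes):
    """Return how many bp longer the Pc product is than the horned product."""
    return sizes["Pc"] - sizes["horned"]


differences = {name: size_difference(sizes) for name, sizes in ASSAYS.items()}
for name, diff in differences.items():
    print(f"{name}: Pc product is {diff} bp longer")
```

Both assays come out at a 202 bp difference, which is easy to tell apart on a gel. What the assays cannot say is whether anything else came along next to the edit.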

Then, they used whole-genome short read sequencing to check for off-target edits:

Samples were sequenced to an average 20X coverage on the Illumina HiSeq 2500 high output mode with paired end 125 bp reads were compared to the bovine reference sequence (UMD3.1).

Structural variations were called using CLC probabilistic variant detection tools, and those with >7 reads were further considered even though this coverage provides only a 27.5% probability of accurately detecting heterozygosity.

Upon indel calls for the original non-edited cell lines and 2 of the edited animals, we screened for de novo indels in edited animal RCI-001, which are not in the progenitor cell-line, 2120.

We then applied PROGNOS4 with reference bovine genome build UMD3.1 to compute all potential off-targets likely caused by the TALENs pair.

For all matching sequences computed, we extract their corresponding information for comparison with de novo indels of RCI-001 and RCI-002. BEDTools was adopted to find de novo indels within 20 bp distance of predicted potential targets for the edited animal.

Only our intended edit mapped to within 10 bp of any of the identified degenerate targets, revealing that our animals are free of off-target events and further supporting the high specificity of TALENs, particularly for this locus.

That means they sequenced the animals’ genomes in short fragments, puzzled them together by aligning them to the cow reference genome, and looked for insertions and deletions in regions that look similar enough to the target that they might also be cut by the TALENs. And because they didn’t find any insertions or deletions close to these potential off-target sites, they concluded that the edits were fine.
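The logic of that screen can be sketched in a few lines of Python. This is not the authors' pipeline (they used CLC, PROGNOS and BEDTools); it is a toy version under my own assumptions: indels are treated as single positions, and all coordinates are made up.

```python
from typing import NamedTuple


class Site(NamedTuple):
    chrom: str
    pos: int


def de_novo(edited, progenitor):
    """Indels seen in the edited animal but not in the progenitor cell line."""
    return sorted(set(edited) - set(progenitor))


def near_predicted(indels, predicted, window=20):
    """Keep indels within `window` bp of any predicted off-target site."""
    return [
        indel
        for indel in indels
        if any(
            indel.chrom == site.chrom and abs(indel.pos - site.pos) <= window
            for site in predicted
        )
    ]


# Made-up toy data: one inherited indel, two new ones, two predicted sites
progenitor = [Site("chr5", 1000)]
edited = [Site("chr5", 1000), Site("chr1", 500), Site("chr7", 90000)]
predicted = [Site("chr1", 510), Site("chr3", 42)]

hits = near_predicted(de_novo(edited, progenitor), predicted)
print(hits)
```

Note what this kind of filter can and cannot see: it only ever reports indels that the variant caller found in the first place, and only those close to a predicted site.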

The problem is that short read sequencing is notoriously bad at detecting larger insertions and deletions, especially of sequences that are not in the reference genome. In this case, the plasmid is not normally part of a cattle genome, and thus not in the reference genome. That means that short reads deriving from the inserted plasmid sequence would probably not be aligned anywhere, but thrown away in the alignment process. The irony is that with short reads, the bigger something is, the harder it is to detect. If you want to see a plasmid insertion, you have to make special efforts to look for it.
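To make the point concrete, here is a toy illustration in Python. The "mapper" is a stand-in of my own that only accepts exact matches, and the sequences are invented; real aligners are far more sophisticated, but the basic problem is the same: a read has nowhere to go if its sequence is not in the reference.

```python
def map_reads(reads, reference):
    """Toy mapper: a read 'aligns' only if it occurs verbatim in a reference sequence."""
    mapped, unmapped = [], []
    for read in reads:
        hit = next((name for name, seq in reference.items() if read in seq), None)
        (mapped if hit else unmapped).append((read, hit))
    return mapped, unmapped


# Invented sequences: a short 'genome' and a 'plasmid' that shares no 8-mer with it
genome = "ACGTTGCAAGGCTTAACCGGATCGATTACG"
plasmid = "TTTTCCCCGGGGAAAATTTTCCCC"

# The sequenced sample carries the plasmid inserted in the middle of the genome
sample = genome[:15] + plasmid + genome[15:]
reads = [sample[i:i + 8] for i in range(len(sample) - 7)]

# Reference without the plasmid: reads from the insert are simply lost
_, lost = map_reads(reads, {"genome": genome})

# Reference with the plasmid added as an extra sequence
found, still_lost = map_reads(reads, {"genome": genome, "plasmid": plasmid})
plasmid_hits = [read for read, name in found if name == "plasmid"]

print(len(lost), "reads unmapped against the genome alone")
print(len(plasmid_hits), "reads map to the plasmid once it is in the reference")
print(len(still_lost), "reads (spanning junctions) still unmapped")
```

The reads that remain unmapped in the second run are the ones that span a junction between genome and plasmid. In real data those tend to show up as partially aligned reads, which is what makes them useful as evidence.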

Tan et al. (2013) were aware of the risk of plasmid insertion, though, at least when concerned with the plasmid delivering the TALEN. Here is a quote:

In addition, after finding that one pair of TALENs delivered as mRNA had similar activity as plasmid DNA (SI Appendix, Fig. S2), we chose to deliver TALENs as mRNA to eliminate the possible genomic integration of TALEN expression plasmids. (my emphasis)

As a sidenote, the variant calling method used to look for off-target effects (CLC Probabilistic variant detection) doesn’t even seem that well suited to the task. The manual for the software says:

The size of insertions and deletions that can be found depend on how the reads are mapped: Only indels that are spanned by reads will be detected. This means that the reads have to align both before and after the indel. In order to detect larger insertions and deletions, please use the InDels and Structural Variation tool instead.

The CLC InDels and Structural Variation tool looks at the unaligned (soft-clipped) ends of short sequence reads, which is one way to get at structural variation with short read sequences. However, it might not have worked either; structural variation calling is a hard task, and the tool does not seem to be built for this kind of task.
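For the curious, this is roughly what "looking at soft-clipped ends" means in practice. In SAM-formatted alignments, the CIGAR string records soft-clipped bases with the letter S. The sketch below is my own minimal Python version, not the CLC algorithm: it pulls out reads with a long clipped end, and the read names, CIGAR strings and the 10-base threshold are made up for illustration.

```python
import re

# One CIGAR operation: a length followed by an operation letter (SAM format)
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar):
    """Return (leading, trailing) soft-clipped lengths from a SAM CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) may sit outside the soft clips; ignore them
    ops = [(n, op) for n, op in ops if op != "H"]
    if not ops:
        return (0, 0)  # e.g. "*" when no alignment is available
    lead = ops[0][0] if ops[0][1] == "S" else 0
    trail = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return (lead, trail)


def clipped_reads(records, min_clip=10):
    """Names of reads with a soft clip of at least `min_clip` bases at either end."""
    return [name for name, cigar in records if max(soft_clips(cigar)) >= min_clip]


# Made-up (read name, CIGAR) pairs
records = [
    ("r1", "125M"),    # aligned end to end
    ("r2", "80M45S"),  # last 45 bases do not match the reference here
    ("r3", "5S120M"),  # short clip, probably just noise
    ("r4", "30S95M"),  # first 30 bases do not match
    ("r5", "*"),       # unmapped
]
print(clipped_reads(records))
```

A pile of reads clipped at the same position suggests a breakpoint there; the clipped bases themselves can then be compared against candidate sequences, such as a plasmid.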

What did Norris & al (2019) do differently? They took the published sequence data and aligned it to a cattle reference genome with the plasmid sequence added. Then, they loaded the alignment into the trusty Integrative Genomics Viewer and manually looked for reads aligning to the plasmid and reads supporting junctions between plasmid, template DNA and genome. This bespoke analysis is targeted at finding plasmid insertions. The FDA authors must have gone ”nope, we don’t buy this” and decided to look for the plasmid.

Here is what they claim happened (Fig 1): The template DNA is there, as evidenced by the PCR genotyping, but it inserted twice, with the rest of the plasmid in-between.


Here is the evidence (Supplementary figs 1 and 2): These are two annotated screenshots from IGV. The first shows alignments of reads from the calves and the unedited cell lines to the plasmid sequence. In the unedited cells, there are only stray reads, probably misplaced, but in the edited calves, there are reads covering the plasmid throughout. Unless the samples were contaminated in some other way, this shows that the plasmid is somewhere in their genomes.
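The difference between "stray reads" and "reads covering the plasmid throughout" is something one could put a number on, for example as the fraction of plasmid positions covered by at least one read. A minimal Python sketch with made-up alignments follows; the 1,000 bp contig length and the read positions are assumptions for illustration, not values from the preprint.

```python
def covered_fraction(alignments, contig_length):
    """Fraction of positions on a contig covered by at least one read.

    `alignments` holds (start, length) pairs with 0-based start positions.
    """
    covered = [False] * contig_length
    for start, length in alignments:
        for pos in range(max(start, 0), min(start + length, contig_length)):
            covered[pos] = True
    return sum(covered) / contig_length


PLASMID_LENGTH = 1000  # hypothetical length, for illustration only

# Unedited cell line: a couple of stray 125 bp reads
unedited = [(120, 125), (700, 125)]
# Edited calf: 125 bp reads tiled along the whole plasmid
edited = [(start, 125) for start in range(0, PLASMID_LENGTH, 50)]

print("unedited:", covered_fraction(unedited, PLASMID_LENGTH))
print("edited:  ", covered_fraction(edited, PLASMID_LENGTH))
```

A summary like this would not replace looking at the alignments, but it might be a first step towards automating the screen.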


Where is it then? This second supplementary figure shows alignments to expected junctions: where template DNA and genome are supposed to join. The colourful letters are mismatches, showing where unexpected DNA shows up. This is the evidence for where the plasmid integrated and what kind of complex rearrangement of template, plasmid and genome happened at the cut site. This must have been found by looking at alignments, hypothesising an insertion, and looking for the junctions supporting it.


Why didn’t the PCR and targeted sequencing find this? As this third supplementary figure shows, the PCRs used could, theoretically, produce longer products including plasmid sequence. But they are way too long for regular PCR.


Looking at this picture, I wonder if there were a few attempts to make a primer pair that went from insert into the downstream sequence, that failed and got blamed on bad primer design or PCR conditions.

In summary, the 2019 preprint finds indirect evidence of the plasmid insertion by looking hard at short read alignments. Targeted sequencing or long read sequencing could give better evidence by observing the whole insertion. Recombinetics have acknowledged the problem, which makes me think that they’ve gone back to the DNA samples and checked.

Where does that leave us with quality control of gene editing? There are three kinds of problems to worry about:

  • Off-target edits in similar places in other parts of the genome; this seems to be what people used to worry about the most, and what Carlson & al checked for
  • Complex rearrangements around the cut site (probably due to repeated cutting); this became a big concern after Kosicki & al (2018), and should apply to both on- and off-target cuts
  • Insertion of plasmid or mutated target; this is what happened here

The ways people check gene edits (targeted Sanger sequencing and short read sequencing) don’t detect any of them particularly well, at least not without bespoke analysis. Maybe the kind of analysis that Norris & al do could be automated to some extent, but currently, the state of the art seems to be to manually look closely at alignments. If I were reviewing the preprint, I would have liked the manuscript to give a fuller description of how they arrived at this picture, and exactly what the evidence for this particular complex rearrangement is. As it stands, it is a bit hard to follow.

Finally, is this embarrassing? On the one hand, this is important stuff, and plasmid integration is a known problem, so the original researchers probably should have looked harder for it. On the other hand, the cell lines were edited and the clones born before a lot of the discussion and research on off-target edits and on-target rearrangements that came out of CRISPR being widely applied, and when long read sequencing was a lot less common. Maybe it was easier to think that the short read off-target analysis was enough then. In any case, we need a solid way to quality check edits.


Molteni M. (2019) Brazil’s plan for gene edited-cows got scrapped–here’s why. Wired.

Carlson DF, et al. (2016) Production of hornless dairy cattle from genome-edited cell lines. Nature Biotechnology.

Norris AL, et al. (2019) Template plasmid integration in germline genome-edited cattle. BioRxiv.

Tan W, et al. (2013) Efficient nonmeiotic allele introgression in livestock using custom endonucleases. Proceedings of the National Academy of Sciences.

Bastiaansen JWM, et al. (2018) The impact of genome editing on the introduction of monogenic traits in livestock. Genetics Selection Evolution.

Kosicki M, Tomberg K & Bradley A. (2018) Repair of double-strand breaks induced by CRISPR–Cas9 leads to large deletions and complex rearrangements. Nature Biotechnology.

‘All domestic animals and plants are genetically modified already’

There is an argument among people who, like yours truly, support (or at least are not in principle against) applications of genetic modification in plant and animal breeding that ‘all domestic animals and plants are genetically modified already’ because of domestication and breeding. See for example Food Evolution or this little video from Sonal Katyal.

This is true in one sense, but it’s not very helpful, for two reasons.

First, it makes it look as if the safety and efficacy of genome editing turns on a definition. I don’t know what response the people who pull out this idea in discussion expect: that the people who worry about genetic modification as some kind of threat will go ‘oh, that was a clever turn of phrase; I guess it’s no problem then’? Again, I think the honest thing to say is that genetic modification (be it mutagenesis, transgenics, or genome editing) is a different thing than classic breeding, but that it’s still okay.

Second, I also fear that it promotes the misunderstanding that selective breeding is somehow outdated and unimportant. This video is an example (and I don’t mean to bash on the video; I think what’s said in it is true, but not the whole story). Yes, genome editing allows us to introduce certain genetic changes precisely and independently of the surrounding region. This is as opposed to introducing a certain variant by crossing, when other undesired genetic variants will follow along. However, we need to know what to edit and what to put instead, so until knowledge of causative variants is near perfect (spoiler: never), selection will still play a role.

Genome editing in EU law

The European Court of Justice recently produced a judgement (Case C-528/16) that means that genome edited organisms will be regarded as genetically modified and subject to the EU directive 2001/18 about genetically modified organisms, which is bad news for anyone who wants to use genome editing to do anything with plant or animal breeding in Europe.

The judgement is in legalese, but I actually found it more clear and readable than the press coverage about it. The court does not seem conceptually confused: it knows what genome editing is, and makes reasonable distinctions. It’s just that it’s bound by the 2001 directive, and if we want genome editing to be useful, we need something better than that.

First, let’s talk about what ‘genetic modification’, ‘transgenics’, ‘mutagenesis’, and ‘genome editing’ are. This is how I understand the terms.

  • A genetically modified organism, the directive says, is ‘an organism, with the exception of human beings, in which the genetic material has been altered in a way that does not occur naturally by mating and/or natural recombination’. The directive goes on to clarify with some examples that count as genetic modification, and some that don’t, including in vitro fertilisation as well as bacterial and viral processes of horizontal gene transfer. As far as I can tell, this is sensible. The definition isn’t unassailable, of course, because a lot hinges on what counts as a natural process, but no definition in biology ever is.
  • Transgenics are organisms that have had new DNA sequences introduced into them, for example from a different species. As such, their DNA is different in a way that is very unlikely to happen by spontaneous mutation. For technical reasons, this kind of genetic modification, even if it may seem more dramatic than changing a few basepairs, is easier to achieve than genome editing. This is the old, ‘classic’, genetic modification that the directive was written to deal with.
  • Mutagenesis is when you do something to an organism to change the rate of spontaneous mutation, e.g. treat it with some mutagenic chemical or radiation. With mutagenesis, you don’t control what change will happen (but you may be able to affect the probability of causing a certain type of mutation, because mutagens have different properties).
  • Finally, genome editing means changing a genetic variant into another. These are changes that could probably be introduced by mutagenesis or crossing, but they can be made more quickly and precisely with editing techniques. This is what people often envisage when we talk about using Crispr/Cas9 in breeding or medicine.

On these definitions, Crispr/Cas9 (and related systems) can be used to do either transgenics, mutagenesis or editing. You could use it for mutagenesis to generate targeted cuts, and let the cell repair by non-homologous end joining, which introduces deletions or rearrangements. This is how Crispr/Cas9 is used in a lot of molecular biology research, to knock out genes by directing disruptive mutations to them. You could also use it to make transgenics by introducing a foreign DNA sequence. For example, this is what happens when Crispr/Cas9 is used to create artificial gene drive systems. Or, you could edit by replacing alleles with other naturally occurring alleles.

Looking back at what is in the directive, it defines genetically modified organisms, and then it goes on to make a few exceptions — means of genetic modification that are exempted from the directive because they’re considered safe and accepted. The top one is mutagenesis, which was already old hat in 2001. And that takes us to the main question that the judgment answers: Should genome editing methods be slotted in there, with chemical and radiation mutagenesis, which are exempt from the directive even if they’re actually a kind of genetic modification, or should they be subject to the full regulatory weight of the directive, like transgenics? Unfortunately, the court found the latter. They write:

[T]he precautionary principle was taken into account in the drafting of the directive and must also be taken into account in its implementation. … In those circumstances, Article 3(1) of Directive 2001/18, read in conjunction with point 1 of Annex I B to that directive [these passages are where the exemption happens — MJ], cannot be interpreted as excluding, from the scope of the directive, organisms obtained by means of new techniques/methods of mutagenesis which have appeared or have been mostly developed since Directive 2001/18 was adopted. Such an interpretation would fail to have regard to the intention of the EU legislature … to exclude from the scope of the directive only organisms obtained by means of techniques/methods which have conventionally been used in a number of applications and have a long safety record.

My opinion is this: Crispr/Cas9, whether used for genome editing, targeted mutagenesis, or even to make transgenics, is genetic modification, but genetic modification can be just as safe as old mutagenesis methods. So what do we need instead of the current genetic modification directive?

First, one could include genome edited and targeted mutagenesis products among the exclusions to the directive. There is no reason to think they’d be any less safe than varieties developed by traditional mutagenesis or by crossing. In fact, the new techniques will give you fewer unexpected other variants as side effects. However, EU law does not seem to acknowledge that kind of argument. There would need to be a new law that isn’t based on the precautionary principle.

Second, one could reform the entire directive to something less draconian. It’s not obvious how to do that, though. On the one hand, the directive is based on perceived risks to human health and the environment of genetic modification itself that have little basis in fact. Maybe starting from the precautionary principle was a reasonable position when the directive was written, but now we know that transgenic organisms in themselves are not a threat to human health, and there is no reason to demand each product be individually evaluated to establish that. On the other hand, one can see the need for some risk assessment of transgenic systems. Say for instance that synthetic gene drives become a reality. We really would want to see some kind of environmental risk assessment before they were used outside of the lab.

Immunosuppressive antibodies probably don’t help against depression

It has been a while since I wrote anything about medicine, because medicine is not really my thing. Anyway, the other day Svt and DN ran articles about a study that tested the effect of the biological drug infliximab on treatment-resistant depression. Both articles also initially described this particular (and expensive) immunosuppressive antibody as a painkiller tablet, but that has since been corrected almost everywhere. So called science wrote about it and also dug up the scientific paper: A Randomized Controlled Trial of the Tumor Necrosis Factor Antagonist Infliximab for Treatment-Resistant Depression (unfortunately behind a paywall).

The authors’ hypothesis is that inflammation causes at least some cases of depression, and that an immunosuppressive drug such as infliximab could help some of those who suffer from treatment-resistant depression. Unfortunately, the result seems to be that it doesn’t help. Infliximab is a biotechnologically produced antibody that targets TNF-alpha, a signalling molecule involved in inflammation. So 60 people with depression were treated with infliximab or placebo. The authors measured depression with the Hamilton depression scale, and as a measure of inflammation they measured the concentration of, among other things, CRP, a protein that is found in the blood and rises during inflammation. The latter was to make sure that people with inflammation were evenly distributed between the two groups. Here it is appropriate to admit that neither depression nor inflammation is my subject, so I simply read the paper and trust the authors.

After twelve weeks and three infusions of antibodies, both groups had improved (a decrease of about 7 points on the scale on average), but there was no difference between those who received antibodies and those who received placebo. The main result is negative. So where do the supposedly positive results come from? Well, after the fact they ran further analyses to see whether there might be an effect in some smaller subset of the patients, some new hypothesis to look at more closely in a new experiment. There they found that people with a high level of CRP in the blood (that is, a lot of inflammation in the body) may respond somewhat better to infliximab (figures 3 and 4 in the paper), but the uncertainty is still very large. According to the plot in figure 3, the effect could very well be zero or negative. Their cut-off for ”high CRP” was, after all, set by looking at the results of the same study, and when they compare only the people with high CRP, there are just 22 participants left. If the authors themselves believe in this and have the money, they will hopefully do a follow-up that tests the effect of infliximab only on people with high CRP.
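Why is a cut-off chosen after seeing the data a problem? A small simulation makes the point. The sketch below is my own illustration in Python, not a reanalysis of the trial: it simulates trials in which the drug truly does nothing, then searches over cut-offs for the subgroup with the largest apparent effect. The 60 patients and the improvement of about 7 points come from the text above; the spread of the scores, the distribution of CRP, the minimum arm size of 5 and the number of repetitions are assumptions.

```python
import random
from statistics import mean


def arm_difference(patients):
    """Mean change in treated minus placebo, or None if either arm has fewer than 5."""
    treated = [p["change"] for p in patients if p["treated"]]
    placebo = [p["change"] for p in patients if not p["treated"]]
    if len(treated) < 5 or len(placebo) < 5:
        return None
    return mean(treated) - mean(placebo)


def simulate_trial(rng, n=60):
    """Simulate one trial in which the drug has no effect at all."""
    patients = [
        {
            "treated": i % 2 == 0,            # alternate arms: 30 treated, 30 placebo
            "crp": rng.lognormvariate(0, 1),  # assumed biomarker distribution
            "change": rng.gauss(-7, 7),       # same improvement in both arms
        }
        for i in range(n)
    ]
    overall = arm_difference(patients)
    # Try every observed CRP value as the cut-off for "high CRP"
    subgroup = [
        arm_difference([p for p in patients if p["crp"] >= cut])
        for cut in sorted(p["crp"] for p in patients)
    ]
    best = max((d for d in subgroup if d is not None), key=abs)
    return overall, best


rng = random.Random(1)
results = [simulate_trial(rng) for _ in range(500)]
print("mean |overall difference|:      ", round(mean(abs(o) for o, _ in results), 2))
print("mean |best subgroup difference|:", round(mean(abs(b) for _, b in results), 2))
```

Because the search always includes the whole sample as one of the candidate subgroups, the best subgroup can never look worse than the overall comparison, and it will usually look better, even though there is nothing to find. That is why a subgroup found this way needs to be tested in a new study.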

It has also been a while since I complained about the lack of references when newspapers write about scientific papers. But just for the record: neither DN nor Svt manages to include a link to the paper in question, and that is never okay. If there had been a link to the original paper, we could have clicked through and, without even needing access to the paper itself, read the following sentence in the abstract:

Results  No overall difference in change of HAM-D scores between treatment groups across time was found.


Charles L. Raison, Robin E. Rutherford, Bobbi J. Woolwine, Chen Shuo, Pamela Schettler, Daniel F. Drake, Ebrahim Haroon, Andrew H. Miller. (2013) A Randomized Controlled Trial of the Tumor Necrosis Factor Antagonist Infliximab for Treatment-Resistant Depression. JAMA Psychiatry 70, pp. 31-41. doi:10.1001/2013.jamapsychiatry.4

On synthetic biology and artificial life

Earlier this year there were headlines about artificial life. It was Daniel Gibson & co, led by Craig Venter, perhaps the best-known face of modern biology (with some justification: the organisations he leads do cool projects, and he is a rather fun speaker), who published a paper about the first cell with a completely synthesised genome. It is an extremely impressive feat, but what it means beyond that is something the learned (and others) dispute. We won’t give an exhaustive answer to that either, but rather dig a little into how it was done.

For those with access to subscriptions, the paper can be read here. Otherwise, listen to Venter.

Synthetic biology is a buzzword at the moment (in Swedish, let us write ”syntetbiologi”). Without wading into the discussion about definitions, let us simply call it an updated variant of biotechnology: not all the way there, but one more step towards being able to build and rebuild biological systems as we see fit. That is not entirely easy, in case anyone has failed to notice. Biological systems are, to use Drew Endy’s expression, not built to be easy to understand and change. Strictly speaking, they are not built at all, but have come about through a long, irrational and capricious evolutionary process. I know that my engineer friends doubt it sometimes, but however counterintuitive and poorly documented technical systems may be, they are still constructed by a (more or less) intelligent human mind.

Drew Endy mentions the synthesis of long DNA strands as a central technique, and that is exactly what this paper excels at. DNA, the stable storage medium for hereditary data in us and in all other organisms, is at bottom a very long chain molecule built on a sugar-phosphate backbone, and it can be produced by controlled synthetic means. So that is what is synthetic about the synthetic cell that Gibson & co have produced: the first template for its genome was made in a DNA synthesis machine.


Salmon and reference lists

(Salmon eggs from Wikipedia. Not genetically modified, as far as I know.)

Just a short note about something we ought to look at more closely later. Today an article was published in the science section of DN’s website about genetically modified salmon that may soon be ”here”; that is, the Food and Drug Administration in the USA is investigating whether to approve a genetically modified salmon as human food.

The fish is called the AquaBounty Salmon after AquaBounty Technologies, which sells it and, we may guess, holds the patents.

But first, a point about the DN article and a striking difference between science journalism and scientific writing. Scientific texts are expected to give sources where the reader can go to find out what support there is for the claims, and the support should preferably be other peer-reviewed scientific texts. Journalists, of course, have no such requirements. And when journalists write about science, the result is sometimes a comical clash. For example, it is the exception for a science journalist who writes about a new result published in a scientific journal to give a full reference. In the better cases, they state which journal the paper is published in. The author’s name is given only if the author has made some fun statement that can be quoted in the text. In the worst case, we have to make do with being told that ”American researchers” have found something, presumably that something causes or protects against cancer.

There are probably all sorts of reasons for this. I have gathered that science journalists often work with press kits of advance material from the big journals, so they may not know exactly when and where the paper will appear. Most often, expensive subscriptions are needed to access it, and it is not a given that we as readers would understand the original paper even if we saw it. So perhaps the reference is not always that important to us. But it would still be good if it were there, so that we could go to the original paper if we wanted to know more, especially when it concerns something controversial.

In any case, this article is not about any particular breakthrough published in a scientific paper. But it appears to be a summary of this article (and perhaps some others) in The Guardian. Granted, the article seems to contain a fairly good summary of what is genetically modified about the salmon in question (seems; as I said, we ought to look more closely at that, perhaps at some of the scientific references that actually exist). But it still has the character of a summary of a summary of a summary. Most of all, it is reminiscent of the telephone game.

What are biological drugs?

A few days ago there was a story from Ekot, which echoed here and there, about biological drugs and how the prescription of them differs between different parts of the country.

But it does not say much about what kind of drugs these actually are, other than that they are new, good and expensive. So what are biological drugs? I probably cannot answer why they are more expensive (beyond the fact that they are new, relatively unexplored, and still under patent), but instead we will take a little excursion to the promised, and rather complicated, land of immunology.
