The tale of the blue soy products – from contaminated soy milk to a new publication

A new paper is out from my lab. This one is a remarkable story of work by PhD Student Marina E. De León (https://phylogenomics.me/people/marina-de-leon/).

It started with her pouring out some soy milk from her fridge that was blue.

See her Tweet about this here: https://twitter.com/MicrobialFuture/status/1220399781165461504?s=20

She then isolated bacteria from the soy milk and from some blue tofu in her fridge, identified them, and ran experiments to see whether the isolates could turn soy milk blue. Some did. She sequenced their genomes and analyzed them to show that these isolates had similar properties to other bacteria known to cause blue discoloration of food products. A truly remarkable piece of work.

See the paper here: “Draft Genome Sequences and Genomic Analysis for Pigment Production in Bacteria Isolated from Blue Discolored Soymilk and Tofu”

And thanks to Guillaume Jospin and Harriet Wilson, who helped with the work, and to all the people in my lab and on social media who encouraged and supported Marina along the way.

Matt Hahn @3rdreviewer talk at #UCDavis – pen and paper notes

Matt Hahn was at UC Davis giving a talk yesterday.

I did not have my laptop available so took notes with – gasp – a pen and paper. I thought it was quite a nice talk so am posting my notes here. More about Matt and his work can be found here: http://www.indiana.edu/~hahnlab/.

Nina Jablonski talk at #UCDavis on Evolution of Skin Pigmentation

Tandy Warnow at #UCDavis on species trees and gene trees

1/15 at #UCDavis: Tandy Warnow: New methods for species tree estimation in the presence of gene tree heterogeneity

Special Seminar:

Tandy Warnow

The University of Illinois at Urbana-Champaign

New methods for species tree estimation in the presence of gene tree heterogeneity

Friday, January 15, 2016

1:30 PM

GBSF 1005

Abstract.

Estimating the Tree of Life will likely involve a two-step procedure, where in the first step trees are estimated on many genes, and then the gene trees are combined into a tree on all the taxa. However, the true gene trees may not agree with the species tree due to biological processes such as deep coalescence, gene duplication and loss, and horizontal gene transfer. Statistically consistent methods based on the multi-species coalescent model have been developed to estimate species trees in the presence of incomplete lineage sorting; however, the relative accuracy of these methods compared to the usual “concatenation” approach is a matter of substantial debate within the research community.

I will present results showing that coalescent-based estimation methods are impacted by gene tree estimation error, so that they can be less accurate than concatenation in many cases. I will also present two new methods, ASTRAL (Mirarab et al., Bioinformatics 2014) and statistical binning (Mirarab et al., Science 2014, Bayzid et al., PLOS One 2015) for estimating species trees in the presence of gene tree conflict due to ILS.  Statistical binning and weighted statistical binning are used to improve gene tree estimation, while ASTRAL is a coalescent-based method that is provably statistically consistent and that can construct very accurate large species trees. Finally, I will present theoretical results investigating whether statistically consistent accurate species tree estimation is possible when gene trees have estimation error, and discuss the controversy about statistical binning (Liu and Edwards, Science 2015, Mirarab et al. Science 2015).

See Dr. Warnow’s home page for more information on her work: http://tandy.cs.illinois.edu

Host: Jonathan Eisen

 

 

Notes from Searching for Life meeting Dec 2015 #NewLife15

I helped organize this meeting, which took place Dec 16-17 in Pacifica, CA. It was mainly organized by people from DOE-JGI and Global Viral, and was officially titled “Exploring Diversity of Life.”

BLAST from the past – a bit of history behind Craig Pikaard’s discovery in 2000 of RNA Pol IV in Arabidopsis

I saw this post by Craig Pikaard on Facebook and it brought back some memories:

New paper from my lab in which we identified the RNAs made by RNA Polymerase IV, an enzyme we discovered ~15 years ago. Took us more than ten years to find the little buggers, but we finally got ’em. The paper is “open access”, meaning that anyone can read it without paying a download fee or subscription. So have at it if you need a nap. 

 

And the post included a link to a new paper in eLife. This brought back memories because I had a small part in the discovery (or more accurately, some post discovery analysis). So – let’s step into a time machine here provided by, well, me keeping all my email forever I guess.

It was September 2000. I was working as a faculty member at TIGR (The Institute for Genomic Research) and I was doing some evolutionary analysis of the Arabidopsis thaliana genome, for what would become my most highly cited paper: Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. And then on Sept 6 I got an email from someone I had gotten to know a little bit who was also analyzing the genome:

———————————-
9/6

Dear Jonathan, 

In helping Mike Bevan search for the general transcription machinery, I’ve
stumbled across something odd that might also interest you given its
evolutionary implications. 

There should be three related genes in the Arabidopsis genome (or more if
any of the genes are duplicated) encoding ~135 kd (2nd largest)
DNA-dependent RNA polymerase subunits – one each for pol I, II and III.
These subunits are similar and are clearly related to one another (also to
the B subunit of the single bacterial RNA polymerase) yet they have
distinct motifs that allow them to be placed in each class (pol I, II, or
III) based on clustal analysis with orthologs from other species. Anyway,
there ARE three distinct ~135 kd subunit genes in the thaliana genome and
based on multiple alignments vs. mouse, yeast, drosophila etc genes, and
clustal analysis to draw phylogenetic trees, one is clearly for pol II, and
one is clearly for pol III. The third paralog is strange- it does not group
with other pol I 135 kd subunits (from yeast, Drosophila, Euplotes, mouse,
C. elegans), nor with pol II or III subunits. In fact, it appears as an
outgroup even when archael subunits (e.g. Sulfolobus) are included in the
analysis: archael subunits are more closely related to the pol II second
largest subunit than the mystery subunit is to other pol I, II, or III
subunits. By BLAST searching Genbank, the mystery subunit does not match
anything better than eukaryotic 135 kd subunits and it doesn’t look like a
chloroplast or mitochondrial subunit. I’m wondering if a plant Pol I can
really be that weird. 

Is this something you would be interested in looking at if I send you the
protein sequences for clustal analysis? 

Cheers
Craig 

Craig S. Pikaard
Associate Professor
Biology Department, Washington University
Campus Box 1137, One Brookings Drive
St. Louis, MO 63130

Now this certainly seemed interesting and as I was doing a variety of analyses of RNA polymerase homologs for some studies of the evolution of microbes, it was something I actually knew a little bit about. So I wrote back immediately:

Craig 

This sounds quite interesting. I have found that for many of the DNA repair genes I have been looking at, the A. thaliana genes do show quite long branches, so long branches might be a possibility. A good phylogenetic analysis should be able to determine if that is the case. If you send me the sequences and/or an alignment, I would be happy to put them through a more detailed phylogenetic analysis.

Jonathan

Then, a few minutes later I got another email:

Hi Jonathan,

I’m pasting below the sequences I used for the multiple alignments (using
DNAStar), starting with the mystery gene and then known second subunits of
pol I, II, III, and archae.
Thanks for having a look at this.
Craig
———–

Arabidopsis mystery gene (from chromosome 3):
DEFINITION DNA-dependent RNA polymerase II [Arabidopsis thaliana].
ACCESSION BAB02021
A. thal chromosome III sequence. Does not group with pol I, II or III
despite its description. Two chromosome 3 P1 clones and two partial cDNAs
(that are the same)from developing seeds match it (see accessions below,
with match scores)
GSDB:S:3264005|AB020749|AB020749|Arabidopsis thaliana genomic D… 598 0.0
GSDB:S:4681131|AP000377|AP000377|Arabidopsis thaliana genomic D… 566 0.0
GSDB:S:1038672|Z19120|ATRNAPIIM|A.thaliana mRNA for RNA polymer… 504 e-142
GSDB:S:8430488|BE522782|BE522782|M28H12STM Arabidopsis developi… 171 1e-046
GSDB:S:8430529|BE522823|BE522823|M29C3STM Arabidopsis developin… 171 2e-041

mdvdeiesagqiniselgesflqtfckkaatsffeefglishqlnsynffiehglqnvfesfgdilvepsfdvikkkdgd
wryatvfkkivikhdkfktgqdeyvekeildvkkqdiligsipvmvksvlcktsekgkenckkgncafdqggyfvikgae
kvfiaqeqmctkrlwisnspwtvsfrsetkrnrfivrlsenekaedykimekvltvyflsteipvwllffalgvssdkea
mdliafdgddasitnsliasiheadavceafrcgnnaltyvehqikstkfppaesvddclrlylfpclqglkkkarflgy
mvkcllsayagkrkcenrdsfrnkrielagellereirvhlaharrkmtramqkqlsgdgdlkpiehyldasvitnglnr
afstgawshpfrkmervsgvvanlgranplqtlidlrrtrqqvlytgkvgdarhphpshwgrvcflstpdgencglvknm
sllglvstqglesvvemlftcgmeelmndtstplcgkhkvllngdwvglcadsesfvgelksrrrqselplemeikrdkd
dnevriftdagrllrpllvvenlhklkqdkptqypfkhlldqgileligieeeedcttawgikqllkepknythceldls
fllgvscaivpfanhdhgkrvlyqsqkhcqqaigfsstnpnircdtlsqqlfypqkplfktlaseclekevlfngqnaiv
avnvhlgynqedsivmnkaslergmfrseqirsykaevdtkdsekrkkmdelvqfgktyskigkvdsleddgfpfiganm
stgdivigrctesgadhsiklkhtergivqkvvlssndegknfaavslrqvrspclgdkfssmhgqkgvlgyleeqqnfp
ftiqgivpdivinphafpsrqtpgqlleaalskgiacpiqkkegssaaytkltrhatpfstpgvteiteqlhragfsrwg
nervyngrsgemmrslifmgptfyqrlvhmsenkvkfrntgpvhpltrqpvadrkrfggirfgemerdcliahgasanlh
erlftlsdssqmhicrkcktyanviertpssgrkirgpycrvcassdhvvrvyvpygakllcqelfsmgitlnfdtklc
——————————— 

Known Pol I ~135 kd subunits: 

Yeast (S. cerevisae)
MSKVIKPPGQARTADFRTLERESRFINPPKDKSAFPLLQEAVQPHIGSFNALTEGPDGGLLNLGVKDIGEKVIFDGKPLN
SEDEISNSGYLGNKLSVSVEQVSIAKPMSNDGVSSAVERKVYPSESRQRLTSYRGKLLLKLKWSVNNGEENLFEVRDCGG
LPVMLQSNRCHLNKMSPYELVQHKEESDEIGGYFIVNGIEKLIRMLIVQRRNHPMAIIRPSFANRGASYSHYGIQIRSVR
PDQTSQTNVLHYLNDGQVTFRFSWRKNEYLVPVVMILKALCHTSDREIFDGIIGNDVKDSFLTDRLELLLRGFKKRYPHL
QNRTQVLQYLGDKFRVVFQASPDQSDLEVGQEVLDRIVLVHLGKDGSQDKFRMLLFMIRKLYSLVAGECSPDNPDATQHQ
EVLLGGFLYGMILKEKIDEYLQNIIAQVRMDINRGMAINFKDKRYMSRVLMRVNENIGSKMQYFLSTGNLVSQSGLDLQQ
VSGYTVVAEKINFYRFISHFRMVHRGSFFAQLKTTTVRKLLPESWGFLCPVHTPDGSPCGLLNHFAHKCRISTQQSDVSR
IPSILYSLGVAPASHTFAAGPSLCCVQIDGKIIGWVSHEQGKIIADTLRYWKVEGKTPGLPIDLEIGYVPPSTRGQYPGL
YLFGGHSRMLRPVRYLPLDKEDIVGPFEQVYMNIAVTPQEIQNNVHTHVEFTPTNILSILANLTPFSDFNQSPRNMYQCQ
MGKQTMGTPGVALCHRSDNKLYRLQTGQTPIVKANLYDDYGMDNFPNGFNAVVAVISYTGYDMDDAMIINKSADERGFGY
GTMYKTEKVDLALNRNRGDPITQHFGFGNDEWPKEWLEKLDEDGLPYIGTYVEEGDPICAYFDDTLNKTKIKTYHSSEPA
YIEEVNLIGDESNKFQELQTVSIKYRIRRTPQIGDKFSSRHGQKGVCSRKWPTIDMPFSETGIQPDIIINPHAFPSRMTI
GMFVESLAGKAGALHGIAQDSTPWIFNEDDTPADYFGEQLAKAGYNYHGNEPMYSGATGEELRADIYVGVVYYQRLRHMV
NDKFQVRSTGPVNSLTMQPVKGRKRHGGIRVGEMERDALIGHGTSFLLQDRLLNSSDYTQASVCRECGSILTTQQSVPRI
GSISTVCCRRCSMRFEDAKKLLTKSEDGEKIFIDDSQIWEDGQGNKFVGGNETTTVAIPFVLKYLDSELSAMGIRLRYNV
EPK
 

C. elegans
MDCDIASYHVDSFDFLVSKGCQFAAQAVPAEKFRLKNGDAVTMKFTSAQLHKPTLDTGAKLTSDTLPLLPAECRQRGLTY
AGNLKVGIDVHVNGSRLDIIEIILGKVPIMLRSEGCHLRGMSRKELVVAGEEPIEKGGYFIVNGSEKVIRLLIANRRNFP
IAIIRKTFKEKGKLFSEFGVMMRSVKENHTAVMMTLHYLDTGTMQLALQFRREIFYVPLMYIVKALTDKNDAVISAGFKR
GRNQDQFYSSCILNMLAQCQEEEILNQEAAIRAIGSRFRVAVSDRVAPWEDDLEAGRFIIRECVLIHLDSDEEKFHTLAY
MTQKLIALVKGECAPETPDNPQFQEASVSGHILLLILRERMENIIGMVRRKLEYMSSRKDFILTSAAILKALGNHTGGEI
TRGMAYFLATGNLVTRVGLALQQESGFSVIAERINQLRFVSHFRAIHRGAFFMEMRTTDVRKLRPEAWGFICPVHTPDGA
PCGLLNHVTASCRIVTDLSDNSNVPSLLAELGMYTHKTVALAPPGEELYPVLMNGRFLGYVPITKAASIERYLRCAKVAK
DARIPYTSEIALVRRSTDIKNIQTQYPGIYILSDAGRLIRPVRNLAMDAVEHIGTFEQVYLSVVLDPEEAEPGVTMHQEL
HPSCLFSFAGNLIPFPDHNQSPRNVYQCQMGKQTMGTAVHAWHSRADNKMYRLQFPQQPMLKLEAYEKYEMDEYPLGTNA
CVAVISYTGYDMEDAMTINKASYQRGFAHGTVIKVERINLVTERERKTIFYRNPREEIKTVGPDGLPIPGRRYFLDEVYY
VTFNMETGDFRTHKFHYAEPAYCGLVRIVEQGEGDSGAKHALIQWRIERNPIIGDKFASRHGQKGINSFLWPVESLPFSE
TGMVPDIIFNPHGFPSRMTIGMMIESMAGKAAATHGENYDASPFVFNEDNTAINHFGELLTKAGYNYYGNETFYSGVDGR
QMEMQIFFGIVYYQRLRHMIADKFQVRATGPIDPITHQPVKGRKKGGGIRFGEMERDAIIAHGTSFVLQDRLLNCSDRDV
AYACRRCGSLLSVLMSSRAGSHLLKKKRKDDEPLDYTETQRCRTCDKDDQVFLLQVPRVFRYLTAELAAMNVKIKLGIEH
PSKVTGS
 

D. melanogaster
MLEEMQQMKTIPVLTNSRPEFKQIPKKLSRHLANLGGPHVDSFDEMLTVGLDNSAKHMIPNHWLSPAGEKISMKVESIWI
AKPKVPQDVIDVRTREIYPTDSRQLHVSYSGMCSVRLGWSVNGVQKTPINMDLGEVPIMLRSKACNLGQATPEEMVKHGE
HDSEWGGIFVIRGNEKIVRMLIMTRRNHPICVKRSSWKDRGQNFSDLGMLVQTVREDESSLSNVVHYLNNGTAKFMFSHV
KRLSYVPVCLILKCLMDYTDEEIYNRLVQGYESDQYYVSCVQAMLREVQNENVYTHAQCKSFIGNLFRARFPEVPEWQPD
DDVTDFILRERVMIHLDTYEDKFQLIVFMIQKLFQCAQGKYKVENVDSSMMQEVLLPGHLYQKYLSERVESWVSQVRRCL
QKKLTSPDALVTSAVMTQCMRQAGGVGRAIESFLATGNIASRTGLGLMQNSGLVIMAENINRMRYMSHFRAIHRGSYFTT
MRTTEARQLLPDAWGFICPVHTPDGTPCGLLNHLTLTCEISMRPDPKLVKAIPKHLIDMGMMPLSNRRYLGEKLYVVFLD
GKHLGHIHQSEAEKIVDELRYGKIFGTLPQMMEIGFIPFKKNGQFPGLYIATGPARLMRPVWNLKWKRVEYIGTLEQLYM
EIAIDAKEMYPDFTTHLELAKTHFMSNLANLIPMPDYNQSPRNMYQCQMGKQTMGTPCLNWPKQAANKLYRLQTPGTPLF
RPVHYDIIQLDDFAMGTNAIVAVISYTGYDMEDAMIINKAAYERGFAYGSIYKTKFLTLDKKSSYFARHPHMPELIKHLD
TDGLPHPGSKLSYGSPLYCYFDGEVATYKVVKMDEKEDCIVESIRQLGSFDLSPTKMVAITLRVPRPATIGDKFASRAGQ
KGICSQKYPAEDLPFTESGLIPDIVFNPHGFPSRMTIAMMIETMAGKGAAIHGNVYDATPFRFSEENTAIDYFGKMLEAG
GYNYYGTERLYSGVDGREMTADIFFGVVHYQRLRHMVFDKWQVRSTGAVEARTHQPIKGRKRGGGVRFGEMERDALISHG
AAFLLQDRLFHNSDKTHTLVCHKCGSILAPLQRIVKRNETGGLSSQPDTCRLCGDNSSVSMIEIPFSFKYLVTELSSVNI
NARFKLNEI
 

mouse
MDVDGRWRNLPSGPSLKHLTDPSYGIPPEQQKAALQDLTRAHVDSFNYAALEGLSHAVQAIPPFEFAFKDERISLTIVDA
VISPPSVPKGTICKDLNVYPAECRGRKSTYRGRLTADISWAVNGVPKGIIKQFLGYVPIMVKSKLCNLYNLPPRVLIEHH
EEAEEMGGYFIINGIEKVIRMLIEPRRNFPVAMVRPKWKSRGLGYTQFGVSMRCVREEHSAVNMNLHYVENGTVMLNFIY
RKELFFLPLGFALKALVSFSDYQIFQELIKGKEEDSFFRNSVSQMLRIVIEEGCHSQKQVLNYLGECFRVKLSLPDWYPN
VEAAEFLLNQGICIHLQSNTDKFYLRCLMTRKLFALARGECMDDNPDSLVNQEVLSPGQLFLMFLKEKMENWLVSIKIVL
DKRAQKANVSINNENLMKIFSMGTELTRPFEYLLATGNLRSKTGLGFLEDSGLCVVADKLNFLRYLSHFRCVHRGAAFAK
MRTTTVRRLLPESWGFLCPVHTPDGAPCGLLNHLTAVCEVVTKFGDTASIPALLCGLGVTGADTAPCRPYSDCYPVLLDG
VMVGWVDKDLAPEVADTLRRFKVLREKRIPPWMEVALIPMTGKPSLYPGLFLFTTPCRLVRPVQNLELGREELIGTMEQL
FMNVAIFEDEVFGGISTHQELFPHSLLSVIANFIPFSDHNQSPRNMYQCQMGKQTMGFPLLTYQNRSDNKLYRLQTPQSP
LVRPCMYDFYDMDNYPIGTNAIVAVISYTGYDMEDAMIVNKASWERGFAHGSVYKSEFIDLSEKFKQGEDNLVFGVKPGD
PRVMQKLDDDGLPSIGAKLEYGDPYYSYLNLNTGEGFVVYYKSKENCVVDNIKVCSNDMGSGKFKCICITVRIPRNPTIG
DKFASRHGQKGILSRLWPAEDMPFTESGMMPDILFNPHGFPSRMTIGMLIESMAGKSAALHGLCHDATPFIFSEENSALE
YFGEMLKAAGYNFYGTERLYSGISGMELEADIFIGVVYYQRLRHMVSDKFQVRTTGARDKVTNQPLGGRNVQGGIRFGEM
ERDALLAHGTSFLLHDRLFNCSDRSVAHVCVECGSLLSPLLEKPPPSWSAMRNRKYNCTVCGRSDTIDTVSVPYVFRYFV
AELAAMNIKVKLDVI
 

Euplotes
MKTNAKFDRKEISKIYKNIARHHIDSFDFAMSTCLNRACEHMLPFDYIVPEESASCGFKKLTLWYDSFELGQPSLGEIDY
DSHILYPSECRQRKMTYTIPLFATIFKKFDDEMVDNFKVKLGDIPTMGRKKFCNLKGLTKKELAKRGEDMLEFGGYFIVN
GNEKVIRMLIVPKRNFPIAFKRSKFLERGKDFTDYGVQMRCVRDDFTAQTITLTYLSDGSVSLRLIYQKQEFLIPIILIL
KALKNCTDRQIYERIVKGNFNQRQISDRVEAILAVGKDLNIYDSDQSKALIGSRFRIVLAGITSETSDIDAGDLFLSKHI
CIHTDSYEAKFDTLILMIDKLYASVANEVELDNLDSVAMQDVLLGGHLYLQILSEKLFDCLHINLRARLNKELKRHNFDP
MKFRDVLTNQKINCGIGLIGKRMENFLATGNLISRTNLDLMQTSGFCIIGDKLNNIRFLSHFRSIHRGQYFAEQKTTSVR
KLLPESWGFICPVHTPDGAPCGLLNHISMSCVPIGSEEKQIDIDKFRNILGELGMNSISSDLCLNYHTGYYPVIFDGIHL
GYVEKDIGESFVEGLRYLKCTQSQPDYAIPRTLEIAFIPFSGYSRNLQWPGIFLASTPARFTRPVKNLHYNCIEWISPLE
QMNLSIACTDEDITPETTHQELDPINILSIVASVGVFAEYNQSPRNMYQCQMAKQTMGTPYHNHQFRTDNKIYRLLFPHR
PIVKTRTQVDFDIEEYPSGTNAVVAVISYTGYDLEDAMIINKSSYERGFGHGVVYKSYTHDLNESNSQSTRGIKSSVRYK
FLNNVSQKDKSKIKLENIDPDGLPKIGSQLTKGKPELCIFDTLKRGAKLSKFKDSEKARIETVRVCGNDDKNPDNLSIGY
TIRYSRIPVIGDKFSSRHGQKGVLSVLWPQVDMPFTENGITPDLIINPHAFPSRMTMGMLIQSMAAKSGSLRGEFKTVET
FQRYDDNDIVGHFGKELLDKGFNYHGNELMYSGIFGTPLKADIFIGVVYYQRLRHMVSDKSQARGTGPIDILTHQPVKGR
KKGGGIRFGEMERDSLLAHGAAYCLNDRLFRSSDYSEGFVCQNCGSILSCYVNRAIMKTQTFIPPSLDESNKDTEDKEIH
MNEKVICKVCKKNSNCKKVALPFVLRFLANELASMGIKLKFTVNDF
——————–
 

Pol II second largest subunits 

S. cerevisae
msdlansekyydedpygfedesapitaedswavisaffrekglvsqqldsfnqfvdytlqdiicedstlileqlaqhtte
sdnisrkyeisfgkiyvtkpmvnesdgvthalypqearlrnltyssglfvdvkkrtyeaidvpgrelkyeliaeesedds
esgkvfigrlpimlrskncylseatesdlyklkecpfdmggyfiingsekvliaqersagnivqvfkkaapspishvaei
rsalekgsrfistlqvklygregssartikatlpyikqdipiviifralgiipdgeilehicydvndwqmlemlkpcved
gfviqdretaldfigrrgtalgikkekriqyakdilqkeflphitqlegfesrkafflgyminrlllcaldrkdqddrdh
fgkkrldlagpllaqlfktlfkkltkdifrymqrtveeahdfnmklainaktitsglkyalatgnwgeqkkamssragvs
qvlnrytysstlshlrrtntpigrdgklakprqlhnthwglvcpaetpegqacglvknlslmscisvgtdpmpiitflse
wgmepledyvphqspdatrvfvngvwhgvhrnparlmetlrtlrrkgdinpevsmirdirekelkiftdagrvyrplfiv
eddeslghkelkvrkghiaklmateyqdieggfedveeytwssllneglveyidaeeeesiliamqpedlepaeaneend
ldvdpakrirvshhattfthceihpsmilgvaasiipfpdhnqsprntyqsamgkqamgvfltnynvrmdtmanilyypq
kplgttrameylkfrelpagqnaivaiacysgynqedsmimnqssidrglfrslffrsymdqekkygmsitetfekpqrt
ntlrmkhgtydkldddgliapgvrvsgedviigkttpispdeeelgqrtayhskrdastplrstengivdqvlvttnqdg
lkfvkvrvrttkipqigdkfasrhgqkgtigityrredmpftaegivpdliinphaipsrmtvahliecllskvaalsgn
egdaspftditvegiskllrehgyqsrgfevmynghtgkklmaqiffgptyyqrlrhmvddkiharargpmqvltrqpve
grsrdgglrfgemerdcmiahgaasflkerlmeasdafrvhicgicglmtviaklnhnqfeckgcdnkidiyqihipyaa
kllfqelmamnitprlytdrsrdf
 

C. elegans
myddedemvndpmdgdyiddsdeisaeawqeacwvvisayfdekglvrqqldsfdefvqmnvqrivedsppvelqsenqh
lgtdmenpakfslkfnqiylskpthwekdgapmpmmpnearlrnltyasplyvditkvvtrddsatekvydkvfvgkvpv
mlrssycmlsnmtdrdltelnecpldpggyfvingsekvliaqekmatntvyvfsmkdgkyafktecrsclenssrptst
mwvnmlargggggkktamgqriigilpyikqeipimivfralgfvsdrdilghiiydfndpemmemvkpsldeafviqeq
nvalnfigargakpgvtreqrikyareilqkellphvgvsehcetkkaffigymvhrlllaalgrrelddrdhignkrld
lagpllaflfrslfrnllkemrmtaqkyinknddfaldvcvktstitrgltyslatgnwgdqkkahqsragvsqvlnrlt
ytatlshlrranspigregklakprqlhntqwgmvcpaetpegqavglvknlalmayisvgslpepilefleewsmenle
evspsaiadatkifvngawvgihrepdqlmttlkklrrqmdiivsevsmvrdirdreiriytdagrvcrpllivenqkla
lkkrhidqlkeaadeankytwsdlvgggvvelidsmeeetsmiammpedlrsggycdththceihpamilgvcasiipfp
dhnqsprntyqsamgkqamgvyttnfhvrmdtlahvlyypqkplvttrsmeylrfnelpaginaivailsysgynqedsv
imnnsaidrglfrsvfyrsyrdneanldnaneeliekptrekcsgmrhslydkldedgiispgmrvsgddviigktvalp
didddldasgkkypkrdastflrssetgivdqvmlslnsdgnkfvkirmrsvrlpqigdkfasrhgqkgtmgimyrqedm
pftaegltpdiiinphavpsrmtighlieclqgklsankgeigdatpfndtvnvqkisgllceygyhlrgnevmynghtg
kklttqiffgptyyqrlkhmvddkihsrargpiqmmnrqpmegrardgglrfgemerdcqishgatqflrerlfevsdpy
hvyvcnncglivvanlrtnsfeckacrnktqvsavripyackllfqelmsmsiaprlmvkprqskrskhqsea
 

Drosophila
msvqrivedspaielqaerqhtsgevetpprfslkfeqiylskpthwekdgspspmmpnearlrnltysaplyvditktk
nvegldpvetqhqktfigkipimlrstycllsqltdrdltelnecpldpggyfiingsekvliaqekmatntvyvfsmkd
gkyafkteirsclehssrptstlwvnmmargsqnikksaigqriiailpyikqeipimivfralgfvadrdilehiiydf
ddpemmemvkpsldeafvvqeqnvalnfigargarpgvtkdkrikyakeilqkemlphvgvsdfcetkkayflgymvhrl
llaslgrrelddrdhygnkrldlagpllaflfrglfknlmkevrmytqkfidrgkdfnlelaiktniitdglryslatgn
wgdqkkahqaragvsqvlnrltfastlshlrrvnspigrdgklakprqlhntlwgmlcpaetpegaavglvknlalmayi
svgsqpspilefleewsmenleeiapsaiadatkifvngcwvgihrdpeqlmatlrklrrqmdiivsevsmirdirdrei
riytdagricrpllivengslllkkthvemlkerdyknyswqvlvasgvveymytleeetvmiamspydlkqdkdyayct
tythceihpamilgvcasiipfpdhnqsprntyqsamgkqamgvyitnfhvrmdtlahvlyypmkplvttrsmeylrfre
lpaginsivailcytgynqedsvilnasavergffrsvfyrsykdsenkrvgdqeenfekphrgtcqgmrnahydklddd
giiapgirvsgddvvigktitlpenddeldsntkrfskrdastflrnsetgivdqvmltlnsegykfckirvrsvripqi
gdkfasrhgqkgtcgiqyrqedmaftceglapdiiinphaipsrmtighlieclqgklgsnkgeigdatpfndavnvqki
stflqeygyhlrgnevmynghtgrkinaqvflgptyyqrlkhmvddkihsrargpvqilvrqpmegrardgglrfgemer
dcqishgaaqflrerlfevsdpyrvhicnfcgliaianlrnntfeckgcknktqisqvrlpyaakllfqelmsmniaprl
mvt
 

Human
mydadedmqydedddeitpdlwqeacwivissyfdekglvrqqldsfdefiqmsvqrivedappidlqaeaqhasgevee
ppryllkfeqiylskpthwerdgapspmmpnearlrnltysaplyvditktvikegeeqlqtqhqktfigkipimlrsty
cllngltdrdlcelnecpldpggyfiingsekvliaqekmatntvyvfakkdskyaytgecrsclenssrptstiwvsml
arggqgakksaigqrivatlpyikqevpiiivfralgfvsdrdilehiiydfedpemmemvkpsldeafviqeqnvalnf
igsrgakpgvtkekrikyakevlqkemlphvgvsdfcetkkayflgymvhrlllaalgrrelddrdhygnkrldlagpll
aflfrgmfknllkevriyaqkfidrgkdfnlelaiktriisdglkyslatgnwgdqkkahqaragvsqvlnrltfastls
hlrrlnspigrdgklakprqlhntlwgmvcpaetpeghavglvknlalmayisvgsqpspilefleewsmenleeispaa
iadatkifvngcwvgihkdpeqlmntlrklrrqmdiivsevsmirdirereiriytdagricrpllivekqklllkkrhi
dqlkereynnyswqdlvasgvveyidtleeetvmlamtpddlqekevaycstythceihpsmilgvcasiipfpdhnqsp
rntyqsamgkqamgvyitnfhvrmdtlahvlyypqkplvttrsmeylrfrelpaginsivaiasytgynqedsvimnrsa
vdrgffrsvfyrsykeqeskkgfdqeevfekptretcqgmrhaiydkldddgliapgvrvsgddviigktvtlpenedel
estnrrytkrdcstflrtsetgivdqvmvtlnqegykfckirvrsvripqigdkfasrhgqkgtcgiqyrqedmpftceg
itpdiiinphaipsrmtighlieclqgkvsankgeigdatpfndavnvqkisnllsdygyhlrgnevlyngftgrkitsq
ifigptyyqrlkhmvddkihsrargpiqilnrqpmegrsrdgglrfgemerdcqiahgaaqflrerlfeasdpyqvhvcn
lcgimaiantrthtyecrgcrnktqislvrmpyackllfqelmsmsiaprmmsv
Peperomia (Plant)
wgmmcpaetpegqacglvknlalmvyitvgsaanpilefleewstenfeeispavipqatkifvngcwvgihrnpdllvk
tlrqlrrqidvntevgvirdirlkelrlytdygrcsrplfivenqkllikkrdiqalqqretqeegwhflvskgfieyvd
teeeettmismtindlvqarrskdaysttythceihpslilgvcasiipfpdhnqsprntyqsamgkqamgiyvtnyqlr
mdtlayvlyypqkplvttramehlhfrqlpaginaivaiacysgynqedsvimnqssidrgffrslffrsyrdeekkmgt
lvkedfgrpnrentmgmrhgsydkldddglappgtrvsgedviigktspiaqdesqgqasrynrrdhstslrhsesgmvd
qvllttnadglrfvkvrmrsvripqigdkfssrhgqkgtvgmtytqedmpwtaegitpdiivnqhaipsrmtigqlieci
mgkvaahmgkegdatpftdvtvdniskalhkcgyqmrgfetmynghtgrrlsamiflgptyyqrlkhmvddkih
 

Arabidopsis
Columbia; BAC clone F17L22.
Essentially identical to Larkin and Guilfoyle sequence for pol II 2nd
largest subunit
MEYNEYEPEPQYVEDDDDEEITQEDAWAVISAYFEEKGLVRQQLDSFDEFIQNTMQEIVDESADIEIRPESQHNPGHQSD
FAETIYKISFGQIYLSKPMMTESDGETATLFPKAARLRNLTYSAPLYVDVTKRVIKKGHDGEEVTETQDFTKVFIGKVPI
MLRSSYCTLFQNSEKDLTELGECPYDQGGYFIINGSEKVLIAQEKMSTNHVYVFKKRQPNKYAYVGEVRSMAENQNRPPS
TMFVRMLARASAKGGSSGQYIRCTLPYIRTEIPIIIVFRALGFVADKDILEHICYDFADTQMMELLRPSLEEAFVIQNQL
VALDYIGKRGATVGVTKEKRIKYARDILQKEMLPHVGIGEHCETKKAYYFGYIIHRLLLCALGRRPEDDRDHYGNKRLDL
AGPLLGGLFRMLFRKLTRDVRSYVQKCVDNGKEVNLQFAIKAKTITSGLKYSLATGNWGQANAAGTRAGVSQVLNRLTYA
STLSHLRRLNSPIGREGKLAKPRQLHNSQWGMMCPAETPEGQACGLVKNLALMVYITVGSAAYPILEFLEEWGTENFEEI
SPSVIPQATKIFVNGMWVGVHRDPDMLVKTLRRLRRRVDVNTEVGVVRDIRLKELRIYTDYGRCSRPLFIVDNQKLLIKK
RDIYALQQRESAEEDGWHHLVAKGFIEYIDTEEEETTMISMTISDLVQARLRPEEAYTENYTHCEIHPSLILGVCASIIP
FPDHNQSPRNTYQSAMGKQAMGIYVTNYQFRMDTLAYVLYYPQKPLVTTRAMEHLHFRQLPAGINAIVAISCYSGYNQED
SVIMNQSSIDRGFFRSLFFRSYRDEEKKMGTLVKEDFGRPDRGSTMGMRHGSYDKLDDDGLAPPGTRVSGEDVIIGKTTP
ISQDEAQGQSSRYTRRDHSISLRHSETGMVDQVLLTTNADGLRFVKVRVRSVRIPQIGDKFSSRHGQKGTVGMTYTQEDM
PWTIEGVTPDIIVNPHAIPSRMTIGQLIECIMGKVAAHMGKEGDATPFTDVTVDNISKALHKCGYQMRGFERMYNGHTGR
PLTAMIFLGPTYYQRLKHMVDDKIHSRGRGPVQILTRQPAEGRSRDGGLRFGEMERDCMIAHGAAHFLKERLFDQSDAYR
VHVCEVCGLIAIANLKKNSFECRGCKNKTDIVQVYIPYACKLLFQELMSMAIAPRMLTKHLKSAKGRQ
—————–
 

RNA polymerase III 

Yeast (cerevisae)
mvaatkrrkthihkhvkdeafddllkpvykgkkltdeintaqdkwhllpaflkvkglvkqhldsfnyfvdtdlkkiikan
qlilsdvdpefylkyvdirvgkksssstkdyltpphecrlrdmtysapiyvdieytrgrniimhkdveigrmpimlrsnk
cilydadeskmaklnecpldpggyfivngtekvilvqeqlsknriiveadekkgivqasvtsstherksktyvitkngki
ylkhnsiaeeipiaivlkacgilsdleimqlvcgndssyqdifavnleesskldiytqqqaleyigakvktmrrqkltil
qegieaiattviahltvealdfrekalyiammtrrvvmamynpkmiddrdyvgnkrlelagqlisllfedlfkkfnndfk
lsidkvlkkpnrameydallsinvhsnnitsglnraistgnwslkrfkmeragvthvlsrlsyisalgmmtrissqfeks
rkvsgpralqpsqfgmlctadtpegeacglvknlalmthittddeeepikklcyvlgveditlidsaslhlnygvylngt
ligsirfptkfvtqfrhlrrtgkvsefisiysnshqmavhiatdggricrpliivsdgqsrvkdihlrklldgeldfddf
lklglveyldvneendsyialyekdivpsmthleiepftilgavaglipyphhnqsprntyqcamgkqaigaiaynqfkr
idtllylmtypqqpmvktktielidydklpagqnatvavmsysgydiedalvlnkssidrgfgrcetrrktttvlkryan
htqdiiggmrvdengdpiwqhqslgpdglgevgmkvqsgqiyinksvptnsadapnpnnvnvqtqyreapviyrgpepsh
idqvmmsvsdndqalikvllrqnrrpelgdkfssrhgqkgvcgiivkqedmpfndqgivpdiimnphgfpsrmtvgkmie
lisgkagvlngtleygtcfggskledmskilvdqgfnysgkdmlysgitgeclqayiffgpiyyqklkhmvldkmharar
gpravltrqptegrsrdgglrlgemerdcviaygasqlllerlmissdafevdvcdkcglmgysgwcttcksaeniikmt
ipyaakllfqellsmniaprlrledifqq
S. pombe
mgvntagdpqksqpkinkggigkdesfgalfkpvykgkkladpvptiedkwqllpaflkvkglvkqhldsynyfvdvdlk
kivqanekvtsdvepwfylkyldirvgapvrtdadaiqasisphecrlrdltyganiyvdieytrgkqvvrrrnvpigrm
pvmlrsnkcvlsgknememaalnecpldpggyfivkgtekvilvqeqlsknriiveaepkkglwqasvtsstherkskty
vitkngklylkhnsvaddipivvvlkamglqsdqeifelvagaeasyqdlfapsieecaklniytaqqaleyigarvkvn
rraganrlppheealevlaavvlahinvfnlefrpkavyigimarrvlmamvdplqvddrdyvgnkrlelagqllallfe
dlfkkfnsdlklnidkvlkkphrtqefdaynqltvhsdhitqgmvralstgnwslkrfkmeragvthvlsrlsyisalgm
mtritsqfektrkvsgprslqasqfgmlctsdtpegeacglvknlalmthittdeeeepiiklayafgiedihvisgrel
hshgtylvylngailgisrypslfvasfrklrrsgkispfigifinthqravfistdggricrpliivqnglpkveskhi
rllkegkwgfedflkqglveyvdvneendslisvyerditpdtthleiepftilgavaglipyphhnqsprntyqcamgk
qaigaiaynqlqridtllylmvypqqpmvktktieligydklpagqnatvaimsysgydiedalvlnkssidrgfgrcqv
fhkhsvivrkypngthdrigdpqrdpetgevvwkhgvveddglagvgcrvqpgqiyvnkqtptnaldnsitlghtqtves
gykatpmtykapepgyidkvmltttdsdqtlikvlmrqtrrpelgdkfssrhgqkgvcgvivqqedmpfndqgicpdiim
nphgfpsrmtvgkmiellsgkvgvlrgtleygtcfggtkvedasrilvehgynysgkdmltsgitgetleayifmgpiyy
qklkhmvmdkmharargpravltrqptegrsrdgglrlgemerdcliaygasqlllerlmissdacdvdvcgqcgllgyk
gwcnscqstrevvkmtipyaakllfqellsmnivprlaledefky
 

Drosophila
mvelkmgdhnveattwdpgdskdwsvpikpltekwklvpaflqvkglvkqhidsfnhfinvdikkivkanelvtsgadpl
fylkyldvrvgkpdiddgfnitkattphecrlrdttysapitvdieytrgtqrikrnnlligrmplmlrsncaltgksef
elsklnecpldpggyfvvrgqekviliqeqlswnkmltedfngvvqcqvtssthekksrtlvlskhgkyylkhnsmtddi
pivvifkalgvvsdqeiqsligidsksqnrfgaslidaynlkvftqqraleymgsklvvkrfqsattktpseearelllt
tilahvpvdnfnlqmkaiyvsmmvrrvmaaeldktlfddrdyygnkrlelagsllsmmfedlfkrmnwelktiadknipk
vkaaqfdvvkhmraaqitaglesaissgnwtikrfkmeragvtqvlsrlsyisalgmmtrvnsqfektrkvsgprslqps
qwgmlcpsytpegeacglvknlalmthitteveerpvmivafnagvedirevsgnpinnpnvflvfingnvlgltlnhkh
lvrnlrymrrkgrmgsyvsvhtsytqrciyihtdggrlcrpyvivenrrplvkqhhldelnrgirkfddflldglieyld
vneendsfiawnedqiedrtthleietftllgvcaglvpyphhnqsprntyqcamgkqamgmigynhnnridslmynlvy
phapmvksktieltnfdklpagqnatvavmsysgydiedalilnkasidrgygrclvyknskctvkryanqtfdrimgpm
kdaltnkvifkhdvldtdgivapgeqvqnkqiminkempavtsmnplqgqsaqvpytavpisykgpepsyiervmvsana
eedflikillrqtriprgdkfssrhgqkgvtgliveqedmpfndfgicpdmimnphgfpsrmtvgktlellggkaglleg
kfhygtafggskvediqaelerhgfnyvgkdffysgitgtpleayiysgpvyyqklkhmvqdkmharargpkavltrqpt
qgrsregglrlgemerdclisygasmlimerlmissdafevdvcrtcgrmaycswchfcqssanvskismpyackllfqe
ltsmnvvpkmileny
 

A. thaliana (chromosome 5)
DEFINITION DNA-directed RNA polymerase subunit [Arabidopsis thaliana].
ACCESSION BAB11387
mliiflhgfqitdsliaklramgldqedldltnddhfidkeklsapikstadkfqlvpeflkvrglvkqhldsfnyfinv
gihkivkansritstvdpsiylrfkkvrvgepsiinvntveninphmcrladmtyaapifvnieyvhgshgnkaksakdn
viigrmpimlrscrcvlhgkdeeelarlgecpldpggyfiikgtekvlliqeqlsknriiidsdkkgninasvtsstemt
ksktviqmekekiylflhrfvkkipiiivlkamgmesdqeivqmvgrdprfsasllpsieecvsegvntqkqaldyleak
vkkisygtppekdgralsilrdlflahvpvpdnnfrqkcfyvgvmlrrmieamlnkdamddkdyvgnkrlelsgqlisll
fedlfktmlseaiknvdhilnkpirasrfdfsqclnkdsrysislglertlstgnfdikrfrmhrkgmtqvltrlsfigs
mgfitkispqfeksrkvsgprslqpsqwgmlcpcdtpegescglvknlalmthvttdeeegplvamcyklgvtdlevlsa
eelhtpdsflvilnglilgkhsrpqyfanslrrlrragkigefvsvftnekqhcvyvasdvgrvcrplviadkgisrvkq
hhmkelqdgvrtfddfirdglieyldvneennalvclraeaakadtthieiepftilgvvaglipyphhnqsprntyqca
mgkqamgniaynqlnrmdtllyllvypqrpllttrtielvgydklgagqnatvavmsfsgydiedaivmnkssldrgfgr
civmkkivamsqkydnctadrilipqrtgpdaekmqildddglatpgeiirpndiyinkqvpvdtvtkftsalsdsqyrp
areyfkgpegetqvvdrvalcsdkkgqlcikyiirhtrrpelgdkfssrhgqkgvcgiiiqqedfpfselgicpdlimnp
hgfpsrmtvgkmiellgskagvscgrfhygsafgersghadkvetisatlvekgfsysgkdllysgisgepveayifmgp
iyyqklkhmvldkmhargsgprvmmtrqptegkskngglrvgemerdcliaygasmliyerlmissdpfevqvcracgll
gyynyklkkavcttckngdniatmklpyackllfqvktiglffklklstsshlendkiilisgykflpkisknh

——————— 

Archae second subunit (only one polymerase, though multi-subunit similar to
eukaryotes)
 

Sulfolobus
DEFINITION DNA-DIRECTED RNA POLYMERASE SUBUNIT B.
ACCESSION P11513
PID g133422
mldtesrwaiaesffktrglvrqhldsfndflrnklqqviyeqgeivtevpglkiklgkiryekpsiretdkgpmreitp
mearlrnltysspiflsmipvenniegepieiyigdlpimlksvadptsnlpidklieigedpkdpggyfivngsekmii
aqedlatnrvlvdygksgsnithvakvtssaagyrvqvmierlkdstiqisfatvpgripfaiimralgfvtdrdivyav
sldpqiqnellpsleqassitsaeealdfignrvaigqkrenriqkaeqvidkyflphlgtspedrkkkgyylasavnki
lelylgrrepddkdhyankrvrlagdlftslfrvafkafvkdlvyqlekskvrgrrlsltalvradiiterirhalatgn
wvggrtgvsqlldrtnwlsmlshlrrvvsslargqpnfeardlhgtqwgrmcpfetpegpnsglvknlallaqvsvgine
svvervayelgvvsvedvirriseqnedvekymswskvylngrllgyyedgkelakkiresrrqgklsdevnvayiatdy
lnevhincdagrvrrpliivnngtplvdtedikklkngeitfddlvkqgkiefidaeeeenayvalnpqdltpdhthlei
wpsailgiiasiipypehnqsprntyqsamakqslglyasnyqirtdtrahllhypqmplvqtrmlgvigyndrpagana
ilaimsytgynmedsiimnkssiergmyrstffrlysteevkypggqedkivtpeagvkgykgkdyyrlledngvvspev
evkggdvligkvspprflqefkelspeqakrdtsivtrhgengivdlvlitetlegnklvkvrvrdlripeigdkfatrh
gqkgvvgilidqvdmpytakgivpdiilnphalpsrmtigqimeaiggkyaalsgkpvdatpfletpklqemqkeilklg
hlpdstevvydgrtgqklksrilfgivyyqklhhmvadkmharargpvqiltrqptegraregglrfgemerdcligfgt
amlikdrlldnsykavvyicdqcgyvgwydrsknryvcpvhgdksvlhpvtvsyafklliqelmsmvisprlilgekvnl
ggasne

Well, this was helpful: sequences and useful notes about them. So I played around with the sequences, searched for some other homologs, built a few alignments, built some masks to filter out poorly aligned regions, and then fed the data into PAUP and built a tree. (I know about this because, amazingly, I still have all the files.)
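For readers who have not met the idea of a mask for poorly aligned regions, here is a minimal sketch in Python. It is not a reconstruction of what was done in 2000. The gap-fraction rule, the 0.5 threshold, the function names `gap_mask` and `apply_mask`, and the toy sequences are all assumptions made for illustration. The idea is simply to mark each alignment column as keep or drop, then strip the dropped columns from every sequence before tree building.

```python
def gap_mask(alignment, max_gap_fraction=0.5):
    """Return one boolean per alignment column: True means keep the column.

    Assumed rule (illustrative only): drop a column when more than
    max_gap_fraction of the sequences have a gap ('-') at that position.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    mask = []
    for col in range(length):
        gaps = sum(1 for seq in alignment if seq[col] == "-")
        mask.append(gaps / len(alignment) <= max_gap_fraction)
    return mask


def apply_mask(alignment, mask):
    """Remove every column whose mask entry is False."""
    return ["".join(ch for ch, keep in zip(seq, mask) if keep) for seq in alignment]


# Toy alignment (made-up sequences, not the RNA polymerase data).
toy = [
    "MK-TA",
    "MR-TA",
    "MKLT-",
    "M--TA",
]
mask = gap_mask(toy, max_gap_fraction=0.5)
masked = apply_mask(toy, mask)
print(mask)
print(masked)
```

In this toy case only the third column is dropped, because three of the four sequences have a gap there. Real masks are often built by eye or with column-quality scores rather than a single gap threshold.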

And I wrote back to Mike Bevan and Craig on Sept 8:

Mike and Craig 

Attached is a phylogenetic tree of RNA polymerase subunits (Craig suggested I look at these because of an unusual protein in the A. thaliana genome). A. thaliana has representatives in five different subfamilies – Pol-I, Pol-II, Pol-III and RpoB (for the chloroplast) as would be expected and then this novel Pol which I have called Pol-IV. 

I do not know much about RNA polymerase, but it seems like this is a pretty big deal and I think should be emphasized in the paper. What do you think? I could try to make a pretty tree figure to show the different families. 

Jonathan

I got an email back:

Dear Jonathan (and Mike), 

Many thanks for the detailed phylogenetic tree of the mystery pol subunit.
I think a figure is the only way to show clearly that this protein defines
a new clade. Is there room for such a figure, Mike? 

In the lab we have also been calling it a putative pol IV subunit just for
the shock value of saying the words (a radical idea in the transcription
field), though in the absence of knowing what other subunits associate with
it, I’m not sure what to call it in the annotation or figure. Maybe
“oddpol” or “atypical polymerase 2nd subunit”. It takes more than a dozen
subunits to make a eukaryotic polymerase, so it is not clear that one
unusual subunit is enough to confer new properties-i.e. a true pol IV.
Obviously, that will require quite a bit of work. 

Cheers,
Craig

Me to Craig on 9/11/00:

Yes 

I agree that it is too early to call it a true polIV, and I was doing it for the shock value too 

Jonathan 

PS. Do you mind if I present this at the TIGR GSAC meeting later this week 

Jonathan

Craig to me 9/11:

Hi Jonathan, 

Feel free to show the data. In thinking more about this, it is worth also
making a phylogenetic tree for the largest pol subunit (the equivalent of
eubacterial B’) just to see if there might be a fourth class out there for
the largest subunit, too. If there is, pol IV may not be such a wild idea. 

In case you are interested in giving this a try, I’m including some
sequences below. In the meantime, is there a good web site for performing
the types of extensive phylogenetic trees you’ve done for the mystery
subunit? I should do this for many of the general transcription factors
just to be sure they really group with the correct homologs, as you
suggested. 

Anyway, here are some largest subunit sequences for pol I, II and III.
Vive la difference! 

Craig

Pol I:
rat
mlaskhtpwrrlqgisfgmysaeelkklsvksitnpryvdslgnpsadglydlalgpadskevcstcvqdfnncsghlgh
idlpltvynpllfdklylllrgsclnchmltcpraaihllvcqlkvldvgalqavyelerilsrfleetsdpsafeiqee
leeytskilqnnllgsqgahvknvcesrsklvahfwkthmaakrcphcktgrsvvrkehnskltitypamvhkksgqkda
elpegapaapgideaqmgkrgyltpssaqehlfaiwknegfflnylfsglddigpessfnpsmffldfivvppsryrpin
rlgdqmftngqtvnlqavmkdavlirkllavmaqeqklpcemteitidkendssgaidrsflsllpgqsltdklyniwir
lqshvnivfdsdmdklmlekypgirqilekkeglfrkhmmgkrvdyaarsvicpdmyintneigipmvfatkltypqpvt
pwnvqelrqavingpnvhpgasmvinedgsrtalsavdatqreavakqlltpstgipkpqgakvvcrhvkngdilllnrq
ptlhrpsiqahrahilpeekvlrlhyanckaynadfdgdemnahfpqselgraeayvlactdqqylvpkdgqplagliqd
hmvsganmtirgcfftreqymelvyrgltdkvgrvklfppailkpfplwtgkqvvstlliniipedytplnltgkakigs
kawvkekprpvpdfdpdsmcesqviiregellcgvldkahygssayglvhccyeiyggetsgrvltclarlftaylqlyr
gftlgvedilvkpnadvmrqriieestqcgpravraalnlpeaascdeiqgkwqdaiwrkdqrdfnmidmkfkeevnhys
neinkacmpfglhrqfpennlqmmvqsgakgstvntmqiscllgqielegrrpplmasgkslpcfepyeftpraggfvtg
rfltgirppefffhcmagreglvdtavktsrsgylqrciikhleglviqydltvrdsdgsvvqflygedgldipktqflq
pkqfpflasnyevimkskhlhevlsradpqkvlrhfraikkwhhrhssallrkgaflsfsqkiqaavkalnlegktqngr
spetqqmlqmwheldeqsrrkyqkraapcpdpslsvwrpdihfasvsetfekkiddysqewaaqaekshnrselsldrlr
tllqlkwqrslcdpgeavgllaaqsigepstqmtlntfhfagrgemnvtlgiprlreilmvasaniktpmmsvpvfntkk
alrrvkslkkqltrvclgevlqkvdiqesfcmgekqnkfrvyelrfqflphayyqqekclrpedilhfmetrffkllmea
ikkknskasafrsvntrratqkdlddtedsgrnrreeerdeeeegnivdaeaeegdadasdtkrkekqeeevdyeseeeg
eeeeeedvqeeenikgegahqthepdeeegsgleeessqnppcrhsrpqgaeamerriqavreshsfiedyqydteeslw
cqvtvklplmkinfdmsslvvslahnaivyttkgitrcllnetinsknekefvlnteginlpelfkysevldlrrlysnd
ihavantygieaalrviekeikdvfavygiavdprhlslvadymcfegvykplnrfgiqssssplqqmtfetsfqflkqa
tmmgshdelkspsaclvvgkvvkggtglfelkqplr
 

Drosophila
mgskramdvhmfpsdlefavftdqeirklsvvkvitgitfdalghaipgglydirmgsygrcmdpcgtclklqdcpghmg
hielgtpvynpffikfvqrllcifclhcyklqmkdheceiimlqlrlidagyiieaqelelfkseivcqntenlvaikng
dmvhphiaamykllekneknssnstktscslrtaithsalqrlgkkcrhcnksmrfvrymhrrlvfyvtladikervgtg
aetggqnkvifadecrrylrqiyanypellkllvpvlglsntdltqgdrspvdlffmdtlpvtpprarplnmvgdmlkgn
pqtdiyiniiennhvlnvvlkymkggqeklteeakaayqtlkgetaheklytawlalqmsvdvlldvnmsremksgeglk
qiiekkcglirshmmgkrvnyaartvitpaypninvdeigipdifakklsypvpvtewnvtdvrkmvmngpdvhpganyi
qdkngfttyipadnaskreslaklllsnpkdgikivhrhvlngdvlllnrqpslhkpsimghkarilhgektfrlhysnc
kaynadfdgdemnahypqsevaraeaynlvnvasnylvpkdgtplggliqdhvisgvklsirgrffnredyqqlvfqgls
qlkkdikllpptilkpavlwsgkqilstiiiniipegyerinldsfakiagknwnvsrprppicgtnpegndlsesqvqi
rngellvgvldkqqygattyglihcmyelyggdvstllvtaftkvftfflqlegftlgvkdilvtdvadrkrrkiirecr
nvgnsavaaaleledepphdelvekmeaayvkdskfrvlldrkykslldgytndinstclprglitkfpsnnlqlmvlsg
akgsmvntmqiscllgqielegkrpplmisgkslpsftsfetspksggfidgrfmtgiqpqdfffhcmagreglidtavk
tsrsgylqrclikhleglsvhydltvrdsdnsvvqflygedgldilkskffndkfcadfltqnatailrpaqlqlmkdee
qlvkvqrhekhirswekkkpaklraafthfseelreevevkrpnevnsktgrrrfdegllklwkkadaedkalyrkkyar
cpdptvavykqdlyygsvsertrklitdyakrkpalketiadimrvktikslaapgepvgliaaqsigepstqmtlntfh
fagrgemnvtlgiprlreilmlassniktpsmdipikpgqqhqaeklrinlnsvtlanlleyvhvstgltldpersyeyd
mrfqflprevykedygvrpkhiikymhqtffkqlipppilkvsnasrttkivviddkkdadkdddndldngdevgrskak
andddssddnddddatgvklkqrktdekdyddpddveelhdanddddeaededdeekgqdgndndgddkaverllsndmv
kaytydkenhlwcqvklnlsvryqkpdltsiirelagksvvhqvqhikraiiykgndddqllktdginigemfqhnkild
lnrlysndihaiartygieaasqvivkevsnvfkvygitvdrrhlsliadymtfdgtfqplsrkgmehsssplqqmsfes
slqflksaagfgradelsspssrlmvglpvrngtgafelltkic
 

yeast (S.c.)
mdiskpvgseitsvdfgiltakeirnlsakqitnptvldnlghpvsgglydlalgaflrnlcstcgldekfcpghqghie
lpvpcynplffnqlyiylrasclfchhfrlksvevhryacklrllqyglidesykldeitlgslnssmytddeaiedted
emdgegskqskdisstllnelkskrseyvdmaiakalsdgrttergsftatvnderkklvhefhkkllsrgkcdncgmfs
pkfrkdgftkifetalnekqitnnrvkgfirqdmikkqkqakkldgsneasandeesfdvgrnpttrpktgstyilstev
knildtvfrkeqcvlqyvfhsrpnlsrklvkadsffmdvlvvpptrfrlpsklgeevhensqnqllskvlttsllirdln
ddlsklqkdkvsledrrvifsrlmnafvtiqndvnafidstkaqgrtsgkvpipgvkqalekkeglfrkhmmgkrvnyaa
rsvispdpnietneigvppvfavkltypepvtayniaelrqavingpdkwpgatqiqnedgslvsligmsveqrkalanq
lltpssnvsthtlnkkvyrhiknrdvvlmnrqptlhkasmmghkvrvlpnektlrlhyantgaynadfdgdemnmhfpqn
enaraealnlantdsqyltptsgspvrgliqdhisagvwltskdsfftreqyqqyiygcirpedghttrskivtlpptif
kpyplwtgkqiittvllnvtppdmpginlisknkikneywgkgslenevlfkdgallcgildksqygaskygivhslhev
ygpevaakvlsvlgrlftnyitataftcgmddlrltaegnkwrtdilktsvdtgreaaaevtnldkdtpaddpellkrlq
eilrdnnksgildavtsskvnaitsqvvskcvpdgtmkkfpcnsmqamalsgakgsnvnvsqimcllgqqalegrrvpvm
vsgktlpsfkpyetdamaggyvkgrfysgikpqeyyfhcmagreglidtavktsrsgylqrcltkqlegvhvsydnsird
adgtlvqfmyggdaiditkeshmtqfefcldnyyallkkynpsaliehldvesalkyskktlkyrkkhskephykqsvky
dpvlakynpakylgsvsenfqdklesfldknsklfkssdgvnekkfralmqlkymrslinpgeavgiiasqsvgepstqm
tlntfhfaghgaanvtlgiprlreivmtasaaiktpqmtlpiwndvsdeqadtfcksiskvllsevidkvivtettgtsn
taggnaarsyvihmrffdnneyseeydvskeelqnvisnqfihlleaaivkeikkqkrttgpdigvavprlqtdvansss
nskrleedndeeqshkktkqavsydepdedeietmreaekssdeegidsdkesdsdsededvdmneqinksiveannnmn
kvqrdrqsaiishhrfitkynfddesgkwcefklelaadtekllmvniveeicrksiirqiphidrcvhpepengkrvlv
tegvnfqamwdqeafidvdgitsndvaavlktygveaarntivneinnvfsryaisvsfrhldliadmmtrqgtylafnr
qgmetstssfmkmsyettcqfltkavldnereqldspsarivvgklnnvgtgsfdvlakvpnaa
 

Arabidopsis
MAHAQTTEVCLSFHRSLLFPMGASQVVESVRFSFMTEQDVRKHSFLKVTSPILHDNVGNPFPGGLYDLKLGPKDDKQACN
SCGQLKLACPGHCGHIELVFPIYHPLLFNLLFNFLQRACFFCHHFMAKPEDVERAVSQLKLIIKGDIVSAKQLESNTPTK
SKSSDESCESVVTTDSSEECEDSDVEDQRWTSLQFAEVTAVLKNFMRLSSKSCSRCKGINPKLEKPMFGWVRMRAMKDSD
VGANVIRGLKLKKSTSSVENPDGFDDSGIDALSEVEDGDKETREKSTEVAAEFEEHNSKRDLLPSEVRNILKHLWQNEHE
FCSFIGDLWQSGSEKIDYSMFFLESVLVPPTKFRPPTTGGDSVMEHPQTVGLNKVIESNNILGNACTNKLDQSKVIFRWR
NLQESVNVLFDSKTATVQSQRDSSGICQLLEKKEGLFRQKMMGKRVNHACRSVISPDPYIAVNDIGIPPCFALKLTYPER
VTPWNVEKLREAIINGPDIHPGATHYSDKSSTMKLPSTEKARRAIARKLLSSRGATTELGKTCDINFEGKTVHRHMRDGD
IVLVNRQPTLHKPSLMAHKVRVLKGEKTLRLHYANCSTYNADFDGDEMNVHFPQDEISRAEAYNIVNANNQYARPSNGEP
LRALIQDHIVSSVLLTKRDTFLDKDHFNQLLFSSGVTDMVLSTFSGRSGKKVMVSASDAELLTVTPAILKPVPLWTGKQV
ITAVLNQITKGHPPFTVEKATKLPVDFFKCRSREVKPNSGDLTKKKEIDESWKQNLNEDKLHIRKNEFVCGVIDKAQFAD
YGLVHTVHELYGSNAAGNLLSVFSRLFTVFLQTHGFTCGVDDLIILKDMDEERTKQLQECENVGERVLRKTFGIDVDVQI
DPQDMRSRIERILYEDGESALASLDRSIVNYLNQCSSKGVMNDLLSDGLLKTPGRNCISLMTISGAKGSKVNFQQISSHL
GQQDLEGKRVPRMVSGKTLPCFHPWDWSPRAGGFISDRFLSGLRPQEYYFHCMAGREGLVDTAVKTSRSGYLQRCLMKNL
ESLKVNYDCTVRDADGSIIQFQYGEDGVDVHRSSFIEKFKELTINQDMVLQKCSEDMLSGASSYISDLPISLKKGAEKFV
EAMPMNERIASKFVRQEELLKLVKSKFFASLAQPGEPVGVLAAQSVGEPSTQMTLNTFHLAGRGEMNVTLGIPRLQEILM
TAAANIKTPIMTCPLLKGKTKEDANDITDRLRKITVADIIKSMELSVVPYTVYENEVCSIHKLKINLYKPEHYPKHTDIT
EEDWEETMRAVFLRKLEDAIETHMKMLHRIRGIHNDVTGPIAGNETDNDDSVSGKQNEDDGDDDGEGTEVDDLGSDAQKQ
KKQETDEMDYEENSEDETNEPSSISGVEDPEMDSENEDTEVSKEDTPEPQEESMEPQKEVKGVKNVKEQSKKKRRKFVRA
KSDRHIFVKGEGEKFEVHFKFATDDPHILLAQIAQQTAQKVYIQNSGKIERCTVANCGDPQVIYHGDNPKERREISNDEK
KASPALHASGVDFPALWEFQDKLDVRYLYSNSIHDMLNIFGVEAARETIIREINHVFKSYGISVSIRHLNLIADYMTFSG
GYRPMSRMGGIAESTSPFCRMTFETATKFIVQAATYGEKDTLETPSARICLGLPALSGTGCFDLMQRVEL
 

Pol II:
Arabidopsis
mdtrfpfspaevskvrvvqfgilspdeirqmsvihvehsettekgkpkvgglsdtrlgtidrkvkcetcmanmaecpghf
gylelakpmyhvgfmktvlsimrcvcfncskiladeamkiknpknrlkkildacknktkcdggddiddvqshstdepvkk
srggcgaqqpkltiegmkmiaeyknskeendepdqlpepaerkqtlgadrvlsvlkrisdadcqllgfnpkfarpdwmil
evlpippppvrpsvmmdatsrseddlthqlamiirhnenlkrqekngaprhiisrftqllqfhiatyfdnelpgqpratq
ksgrpiksicsrlkakegrirgnlmgkrvdfsartvitpdptinidelgvpwsialnltypetvtpynierlkelvdygp
hpppgktgakyiirddgqrldlrylkkssdqhlelgyryvllsysihsthkrlflevvifmlswsqverhlqdgdfvlfn
rqpslhkmsimghririmpystfrlnlsvtspynadfdgdemnmhvpqsfetraevlelmmvpkcivspqanrpvmgivq
dtllgcrkitkrdtfiekdvfmntlmwwedfdgkvpapailkprplwtgkqvfnliipkqinllrysawhadtetgfitp
gdtqvriergellagtlckktlgtsngslvhviweevgpdaarkflghtqwlvnywllqngftigigdtiadsstmekin
etisnaktavkdlirqfqgkeldpepgrtmrdtfenrvnqvlnkarddagssaqkslaetnnlkamvtagskgsfinisq
mtacvgqqnvegkripfgfdgrtlphftkddygpesrgfvensylrgltpqefffhamggreglidtavktsetgyiqrr
lvkamedimvkydgtvrnslgdviqflygedgmdavwiesqkldslkmkksefdrtfkyeiddenwnptylsdehledlk
girelrdvfdaeyskletdrfqlgteiatngdstwplpvnikrhiwnaqktfkidlrkisdmhpveivdavdklqerllv
vpgddalsveaqknatlffnillrstlaskrvleeyklsreafewvigeiesrflqslvapgemigcvpaqsigepatqm
tlntfhyagvsaknvtlgvprlreiinvakriktpslsvyltpeaskskegaktvqcaleyttlrsvtqatevwydpdpm
stiieedfefvrsyyempdedvspdkispwllrielnremmvdkklsmadiaekinlefdddltcifnddnaqklilrir
imndegpkgelqdesaeddvflkkiesnmltemalrgipdinkvfikqvrksrfdeeggfktseewmldtegvnllavmc
hedvdpkrttsnhlieiievlgieavrralldelrvvisfdgsyvnyrhlailcdtmtyrghlmaitrhginrndtgplm
rcsfeetvdilldaaayaetdclrgvtenimlgqlapigtgdcelylndemlknaielqlpsymdglefgmtparspvsg
tpyhegmmspnyllspnmrlspmsdaqfspyvggmafspssspgyspsspgysptspgysptspgysptspgysptspty
spsspgysptspaysptspsysptspsysptspsysptspsysptspsysptspsysptspaysptspaysptspayspt
spsysptspsysptspsysptspsysptspsysptspaysptspgysptspsysptspsygptspsynpqsakyspsiay
spsnarlspaspysptspnysptspsysptspsyspssptyspsspyssgaspdyspsagysptlpgyspsstgqytphe
gdkkdktgkkdaskddkgnp
 

Drosophila
mstptdskaplrqvkrvqfgilspdeirrmsvteggvqfaetmeggrpklgglmdprqgvidrtsrcqtcagnmtecpgh
fghidlakpvfhigfitktikilrcvcfycskmlvsphnpkikeivmksrgqprkrlayvydlckgkticeggedmdltk
enqqpdpnkkpghggcghyqpsirrtgldltaewkhqnedsqekkivvsaervweilkhitdeecfilgmdpkyarpdwm
ivtvlpvpplavrpavvmfgaaknqddlthklsdiikannelrkneasgaaahviqenikmlqfhvatlvdndmpgmpra
mqksgkplkaikarlkgkegrirgnlmgkrvdfsartvitpdpnlridqvgvprsiaqnltfpelvtpfnidrmqelvrr
gnsqypgakyivrdngeridlrfhpkssdlhlqcgykverhlrdddlvifnrqptlhkmsmmghrvkvlpwstfrmnlsc
tspynadfdgdemnlhvpqsmetraevenihitprqiitpqankpvmgivqdtltavrkmtkrdvfitreqvmnllmflp
twdakmpqpcilkprplwtgkqifsliipgnvnmirthsthpdeedegpykwispgdtkvmvehgelimgilckkslgts
agsllhicflelghdiagrfygniqtvinnwllfeghsigigdtiadpqtyneiqqaikkakddvinviqkahnmelept
pgntlrqtfenkvnrilndahdktggsakkslteynnlkamvvsgskgsninisqviacvgqqnvegkripygfrkrtlp
hfikddygpesrgfvensylagltpsefyfhamggreglidtavktaetgyiqrrlikamesvmvnydgtvrnsvgqliq
lrygedglcgelvefqnmptvklsnksfekrfkfdwsnerlmkkvftddvikemtdsseaiqeleaewdrlvsdrdslrq
ifpngeskvvlpcnlqrmiwnvqkifhinkrlptdlspirvikgvktllercvivtgndriskqanenatllfqclirst
lctkyvseefrlsteafewlvgeietrfqqaqanpgemvgalaaqslgepatqmtlntfhfagvssknvtlgvprlkeii
niskkpkapsltvfltggaardaekaknvlcrlehttlrkvtantaiyydpdpqrtvisedqefvnvyyempdfdptris
pwllrieldrkrmtdkkltmeqiaekinvgfgedlncifnddnadklvlririmnneenkfqdedeavdkmeddmflrci
eanmlsdmtlqgieaigkvymhlpqtdskkrivitetgefkaigewlletdgtsmmkvlserdvdpirtssndiceifqv
lgieavrksvekemnavlqfyglyvnyrhlallcdvmtakghlmaitrhginrqdtgalmrcsfeetvdvlmdaaahaet
dpmrgvseniimgqlpkmgtgcfdllldaekcrfgieipntlgnsmlggaamfigggstpsmtppeldsawancntpryf
sppghvsamtpggpsfspsaasdasgmspswspahpgsspsspgpsmspyfpaspsvspsysptspnytasspggaspny
spsspnysptsplyaspryasttpnfnpqstgyspsssgysptspvysptvqfqsspsfagsgsniyspgnayspsssny
spnspsysptspsyspsspsysptspcysptspsysptspnytpvtpsysptspnysaspqyspaspaysqtgvkyspts
ptysppspsydgspgspqytpgspqyspaspkysptsplyspsspqhspsnqysptgstysatspryspnmsiyspsstk
ysptsptytptarnysptspmysptapshysptspayspssptfeesedvrkggrg
 

human
mhgggppsgdsacplrtikrvqfgvlspdelkrmsvteggikypetteggrpklgglmdprqgviertgrcqtcagnmte
cpghfghielakpvfhvgflvktmkvlrcvcffcskllvdsnnpkikdilakskgqpkkrlthvydlckgkniceggeem
dnkfgveqpegdedltkekghggcgryqprirrsglelyaewkhvnedsqekkillspervheifkrisdeecfvlgmep
ryarpewmivtvlpvpplsvrpavvmqgsarnqddlthkladivkinnqlrrneqngaaahviaedvkllqfhvatmvdn
elpglpramqksgrplkslkqrlkgkegrvrgnlmgkrvdfsartvitpdpnlsidqvgvprsiaanmtfaeivtpfnid
rlqelvrrgnsqypgakyiirdngdridlrfhpkpsdlhlqtgykverhmcdgdivifnrqptlhkmsmmghrvrilpws
tfrlnlsvttpynadfdgdemnlhlpqsletraeiqelamvprmivtpqsnrpvmgivqdtltavrkftkrdvflergev
mnllmflstwdgkvpqpailkprplwtgkqifsliipghincirthsthpddedsgpykhispgdtkvvvengelimgil
ckkslgtsagslvhisylemghditrlfysniqtvinnwllieghtigigdsiadsktyqdiqntikkakqdvievieka
hnneleptpgntlrqtfenqvnrilndardktgssaqkslseynnfksmvvsgakgskinisqviavvgqqnvegkripf
gfkhrtlphfikddygpesrgfvensylagltptefffhamggreglidtavktaetgyiqrrliksmesvmvkydatvr
nsinqvvqlrygedglagesvefqnlatlkpsnkafekkfrfdytneralrrtlqedlvkdvlsnahiqnelerefermr
edrevlrvifptgdskvvlpcnllrmiwnaqkifhinprlpsdlhpikvvegvkelskklvivngddplsrqaqenatll
fnihlrstlcsrrmaeefrlsgeafdwllgeieskfnqaiahpgemvgalaaqslgepatqmtlntfhyagvsaknvtlg
vprlkeliniskkpktpsltvfllgqsardaerakdilcrlehttlrkvtantaiyydpnpqstvvaedqewvnvyyemp
dfdvarispwllrveldrkhmtdrkltmeqiaekinagfgddlncifnddnaeklvlririmnsdenkmqeeeevvdkmd
ddvflrciesnmltdmtlqgieqiskvymhlpqtdnkkkiiitedgefkalqewiletdgvslmrvlsekdvdpvrttsn
diveiftvlgieavrkalerelyhvisfdgsyvnyrhlallcdtmtcrghlmaitrhgvnrqdtgplmkcsfeetvdvlm
eaaahgesdpmkgvsenimlgqlapagtgcfdllldaekckygmeiptnipglgaagptgmffgsapspmggispamtpw
nqgatpaygawspsvgsgmtpgaagfspsaasdasgfspgyspawsptpgspgspgpsspyipspggamspsysptspay
eprspggytpqspsysptspsysptspsysptspnysptspsysptspsysptspsysptspsysptspsysptspsysp
tspsysptspsysptspsysptspsysptspsysptspsysptspsysptspsysptspsysptspnysptspnytptsp
sysptspsysptspnytptspnysptspsysptspsysptspsyspssprytpqsptytpsspsyspsspsysptspkyt
ptspsyspsspeytpaspkysptspkysptspkysptsptyspttpkysptsptysptspvytptspkysptsptyspts
pkysptsptysptspkgstysptspgysptsptysltspaispddsdeen
 

yeast (S.c)
mvgqqyssaplrtvkevqfglfspeevraisvakirfpetmdetqtrakigglndprlgsidrnlkcqtcqegmnecpgh
fghidlakpvfhvgfiakikkvcecvcmhcgkllldehnelmrqalaikdskkrfaaiwtlcktkmvcetdvpseddptq
lvsrggcgntqptirkdglklvgswkkdratgdadepelrvlsteeilnifkhisvkdftslgfnevfsrpewmiltclp
vppppvrpsisfnesqrgeddltfkladilkanisletlehngaphhaieeaesllqfhvatymdndiagqpqalqksgr
pvksirarlkgkegrirgnlmgkrvdfsartvisgdpnleldqvgvpksiaktltypevvtpynidrltqlvrngpnehp
gakyvirdsgdridlryskragdiqlqygwkverhimdndpvlfnrqpslhkmsmmahrvkvipystfrlnlsvtspyna
dfdgdemnlhvpqseetraelsqlcavplqivspqsnkpcmgivqdtlcgirkltlrdtfieldqvlnmlywvpdwdgvi
ptpaiikpkplwsgkqilsvaipngihlqrfdegttllspkdngmliidgqiifgvvekktvgssngglihvvtrekgpq
vcaklfgniqkvvnfwllhngfstgigdtiadgptmreitetiaeakkkvldvtkeaqanlltakhgmtlresfednvvr
flneardkagrlaevnlkdlnnvkqmvmagskgsfiniaqmsacvgqqsvegkriafgfvdrtlphfskddyspeskgfv
ensylrgltpqefffhamggreglidtavktaetgyiqrrlvkaledimvhydnttrnslgnviqfiygedgmdaahiek
qsldtiggsdaafekryrvdllntdhtldpsllesgseilgdlklqvlldeeykqlvkdrkflrevfvdgeanwplpvni
rriiqnaqqtfhidhtkpsdltikdivlgvkdlqenllvlrgkneiiqnaqrdavtlfccllrsrlatrrvlqeyrltkq
afdwvlsnieaqflrsvvhpgemvgvlaaqsigepatqmtlntfhfagvaskkvtsgvprlkeilnvaknmktpsltvyl
epghaadqeqaklirsaiehttlksvtiaseiyydpdprstvipedeeiiqlhfslldeeaeqsfdqqspwllrleldra
amndkdltmgqvgerikqtfkndlfviwsedndekliircrvvrpksldaeteaeedhmlkkientmlenitlrgvenie
rvvmmkydrkvpsptgeyvkepewvletdgvnlsevmtvpgidptriytnsfidimevlgieagraalykevynviasdg
syvnyrhmallvdvmttqggltsvtrhgfnrsntgalmrcsfeetveilfeagasaelddcrgvsenvilgqmapigtga
fdvmideeslvkympeqkiteiedgqdggvtpysnesglvnadldvkdelmfsplvdsgsndamaggftayggadygeat
spfgaygeaptspgfgvsspgfsptsptysptspaysptspsysptspsysptspsysptspsysptspsysptspsysp
tspsysptspsysptspsysptspsysptspsysptspsysptspsysptspsysptspaysptspsysptspsysptsp
sysptspsysptspnysptspsysptspgyspgspayspkqdeqkhnenensr

pol III:
human
mvkeqfretdvakktshicfgmkspeemrqqahiqvvsknlysqdnqhapllygvldhrmgtsekdrpcetcgknladcl
ghygyidlelpcfhvgyfravigilqmicktcchimlsqeekkqfldylkrpgltylqkrglkkkisdkcrkknichhcg
afngtvkkcgllkiihekyktnkkvvdpivsnflqsfetaiehnkevepllgraqenlnplvvlnlfkripaedvplllm
npeagkpsdliltrllvpplcfrpsvvsdlksgtneddltmklteiiflndvikkhrisgaktqmimedwdflqlqcaly
inselsgiplnmapkkwtrgfvqrlkgkqgrfrgnlsgkrvdfsgrtvispdpnlridevavpvhvakiltfpekvnkan
inflrklvqngpevhpganfiqqrhtqmkrflkygnrekmaqelkygdiverhlidgdvvlfnrqpslhklsimahlarv
kphrtfrfnecvctpynadfdgdemnlhlpqteeakaealvlmgtkanlvtprngepliaaiqdfltgaylltlkdtffd
rakacqiiasilvgkdekikvrlppptilkpvtlwtgkqifsvilrpsddnpvranlrtkgkqycgkgedlcandsyvti
qnselmsgsmdkgtlgsgsknnifyillrdwgqneaadamsrlarlapvylsnrgfsigigdvtpgqgllkakyellnag
ykkcdeyiealntgklqqqpgctaeetlealilkelsvirdhagsaclreldksnspltmalcgskgsfinisqmiacvg
qqaisgsrvpdgfenrslphfekhsklpaakgfvansfysgltptefffhtmagreglvdtavktaetgymqrrlvksle
dlcsqydltvrsstgdiiqfiyggdgldpaamegkdeplefkrvldnikavfpcpsepalsknelilttesimkkseflc
cqdsflqeikkfikgvsekikktrdkygindngtteprvlyqldritptqvekfletcrdkymraqmepgsavgalcaqs
igepgtqmtlktfhfggvasmnitlgvprikeiinaskaistpiitaqldkdddadyarlvkgriektllgeiseyieev
flpddcfilvklslerirllrlevnaetvrysictsklrvkpgdvavhgeavvcvtprenskssmyyvlqflkedlpkvv
vqgipevsravihideqsgkekykllvegdnlravmathgvkgtrttsnntyevektlgieaarttiineiqytmvvnhg
msidrrhvmllsdlmtykgevlgitrfglakmkesvlmlasfektadhlfdaayfgqkdsvcgvseciimgipmnigtgl
fkllhkadrdpnppkrplifdtnefhiplvt

trypanosome
mlkgssstsfllpqqfveplphapveisalhygllsrndvhrlsvlpcrrvvgdvkeygvndarlgvcdrlsicetcgln
siecvghpghidleapvfhlgffttvlricrtickrcshvllddteidyykrrlssssleplqrtmliktiqtdayktrv
clkcgglngvvrrvrpmrlvhekyhveprrgegprenpggffdaelrtacaynkvvgecrefvhdfldpvrvrqlflavp
pgevillglapgvsptdllmttllvppvpvrprgcagtttvrdddltaqyndilvstdtmqdgsldatrytetwemlqmr
aarlldsslpgfppnvrtsdlksyaqrlkskhgrfrcnlsgkrvdysgrsvispdpnldvdelavplhvarvltypqrvf
kanhelmrrlvrngphvhpgattvylaqegskkslknerdrhrlaarlavgdiverhvmngdlvlfnrqpslhrvsmmah
rarvlpfrtfrfnecccapynadfdgdemnvhfvqtekaraealqlmstarniisakngepiiactqdflaaaylvtsrd
vffdrgefsqmvshwlgpvtqfrlpipailkpvelwtgkqlfelivrpspevdvllsfeaptkfytrkgkhdcaeegyva
fldscfisgrldkkllgggakdglfarlhtiagggytarvmsriaqftsryltnygfslglgdvaptpelnkqkaavlar
svevcdgliksaktgrmiplpgltvkqslearlntelskvrdecgtaavqtlsihnntplimvqsgskgsalniaqmmac
vgqqtvsgkrildafqdrslphfhrfeeapaargfvansfysglsptefffhtmagreglvdtavktaetgyiyrrlmka
menlsvrydgtvrntkgdviqlrfgedgldpqlmegnsgtplnleqewlsvraayarwvvgllagsktasdgnairdnen
yfnefismlptegpsfveaclngdqealkvceeqesredalhnsngktndresrprtgrlrravlishlvkvcsrkfkdd
iqdffvkkvreqqrirnllnlpntsrertegggdnsgpiankrtkkrapslkvkdskeggrvselrdlemlqtellpltr
gmvtrfiaqcaskylrkacepgtpcgaiaaqsvgepstqmtlrtfhfagvasmsitqgvprlvevinanrniatpvvtap
vllmegeenhceifrkrarfvkaqiervllrevvseivevcsdtefylrvhlnmsvitklhlpinaitvrqrilaaaght
msplrmlnedcievfsldtlavyphfqdarwvhfslrrilgllpdvvvggigginramissngtevlaegaelravmnlw
gvdstrvvcnhvavvervlgieaarrvivdeiqnilkayslsidvrhvylladlmtqrgvvlgitrygiqkmnfnvltma
sferttdhlynaaatqrvdrdlsvsdsiivgkpvplgttsfdllldgsisndilppqrcvkrgmgpnfhtakrhhlvpla
aegvfrldlf 

yeast (S.c)
mkevvvsetpkrikglefsalsaadivaqsevevstrdlfdlekdrapkangaldpkmgvsssslecatchgnlaschgh
fghlklalpvfhigyfkatiqilqgickncsaillsetdkrqflhelrrpgvdnlrrmgilkkildqckkqrrclhcgal
ngvvkkaaagagsaalkiihdtfrwvgkksapekdiwvgewkevlahnpeleryvkrcmddlnplktlnlfkqiksadce
llgidatvpsgrpetyiwrylpappvcirpsvmmqdspasneddltvklteivwtsslikagldkgisinnmmehwdylq
ltvamyinsdsvnpamlpgssngggkvkpirgfcqrlkgkqgrfrgnlsgkrvdfsgrtvispdpnlsidevavpdrvak
vltypekvtrynrhklqelivngpnvhpganyllkrnedarrnlrygdrmklaknlqigdvverhledgdvvlfnrqpsl
hrlsilshyakirpwrtfrlnecvctpynadfdgdemnlhvpqteearaeainlmgvknnlltpksgepiiaatqdfitg
sylishkdsfydratltqllsmmsdgiehfdipppaimkpyylwtgkqvfsllikpnhnspvvinldaknkvfvppksks
lpnemsqndgfviirgsqilsgvmdksvlgdgkkhsvfytilrdygpqeaanamnrmaklcarflgnrgfsigindvtpa
ddlkqkkeelveiayhkcdelitlfnkgeletqpgcneeqtleakiggllskvreevgdvcineldnwnaplimatcgsk
gstlnvsqmvavvgqqiisgnrvpdgfqdrslphfpknsktpqskgfvrnsffsglsppeflfhaisgreglvdtavkta
etgymsrrlmksledlscqydntvrtsangivqftyggdgldplemegnaqpvnfnrswdhaynitfnnqdkgllpyaim
etaneilgpleerlvrydnsgclvkredlnkaeyvdqydaerdfyhslreyingkatalanlrksrgmlglleppakelq
gidpdetvpdnvktsvsqlyriseksvrkfleialfkyrkarlepgtaigaigaqsigepgtqmtlktfhfagvasmnvt
lgvprikeiinaskvistpiinavlvndnderaarvvkgrvektllsdvafyvqdvykdnlsfiqvridlgtidklqlel
tiediavaitrasklkiqasdvniigkdriainvfpegykaksistsakepsendvfyrmqqlrralpdvvvkglpdisr
avinirddgkrellvegyglrdvmctdgvigsrtttnhvlevfsvlgieaarysiireinytmsnhgmsvdprhiqllgd
vmtykgevlgitrfglskmrdsvlqlasfekttdhlfdaafymkkdavegvseciilgqtmsigtgsfkvvkgtnisekd
lvpkrclfeslsneaalkan

9/11 me to Craig

sorry .. no useful sites out there for doing phylogenetic analysis … I am working on such a type of thing right now. It is tricky because to do it correctly you need to filter out parts of a multiple sequence alignment to remove badly aligned regions as well as hypervariable regions.
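
The kind of filtering described in that email can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used for the RNA polymerase tree: the function name `filter_alignment`, the two thresholds, and the toy alignment are all assumptions made for the example. Columns that are mostly gaps stand in for "badly aligned" regions, and columns where no single residue is common stand in for "hypervariable" regions.

```python
from collections import Counter


def filter_alignment(seqs, max_gap_frac=0.5, min_major_frac=0.3):
    """Drop gap-rich and highly variable columns from an alignment.

    seqs maps sequence names to aligned strings of equal length.
    Returns (filtered_seqs, kept_column_indices).
    Both thresholds are illustrative assumptions, not published values.
    """
    names = list(seqs)
    length = len(seqs[names[0]])
    if any(len(seqs[n]) != length for n in names):
        raise ValueError("all aligned sequences must have the same length")

    kept = []
    for i in range(length):
        column = [seqs[n][i] for n in names]
        residues = [c for c in column if c != "-"]
        gap_frac = 1 - len(residues) / len(column)
        # Mostly-gap columns: rough proxy for poorly aligned regions.
        if not residues or gap_frac > max_gap_frac:
            continue
        # No residue reaches the minimum share: proxy for hypervariable.
        major_frac = Counter(residues).most_common(1)[0][1] / len(residues)
        if major_frac < min_major_frac:
            continue
        kept.append(i)

    filtered = {n: "".join(seqs[n][i] for i in kept) for n in names}
    return filtered, kept


# Toy alignment (made up for illustration).
toy = {"a": "MK-LAV", "b": "MK-LGV", "c": "MR-LCV", "d": "MKTLDV"}
filtered, kept = filter_alignment(toy)
print(kept)
print(filtered)
```

Real masking tools use more careful criteria, but the principle is the same: decide column by column what goes into the tree-building step, and keep a record of which columns were kept so others can check the choice.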

9/12 Craig to Me

Dear Mike, 

Yes, I can do this for the atypical RNA polymerase 2nd subunit. I have
already done multiple alignments with it against pol I, II, III subunits
and it is clear that the atypical subunit has amino acid differences that
set it apart, rather than large indels that skew the data. So I think
Jonathan is safe to go ahead and make a figure while I examine the gene
sequences and gene models more carefully. 

Any comments on the tone/amount of detail in the section I wrote on the
general transcription machinery? Either way, I will add some references
and send you an updated version as soon as I can. 

cheers
Craig

———————
>Speaking on behalf of the editorial committee whom I have not consulted, I
>would be delighted to have this in our section. But we need to check out the
>gene structure in detail (dodgy gene prediction, missing exons etc.). Craig,
>could you do this as you know most about these enzymes?
>
>All the best
>
>Mike

Me to Craig

Craig 

I am still working on a slightly better figure … but I have attached the latest version … I think it is sufficient for submission 

I have attached it in a few different formats. 

I will be out of town for a few days but checking email. 

Jonathan

Craig to Me:

Hi Jonathan,

The phylogenetic tree figure for the atypical pol subunit looks good
though the font size may need to be reduced to fit “Fungal Plasmids”
between the dividing lines for the adjacent categories. Have you sent a
copy to Mike?

Craig

Craig again

Hi Jonathan, 

I forwarded a copy to Mike. Did you ever have a chance to do a tree for the
largest subunit to further test the hypothesis of a pol IV? 

Hope you are having fun in LA 

Craig

> I am not sure if I sent a copy to mike
>
>I am in LA right now and it would be easier if you could send mike a copy to
>make sure he has one. I will try and edit the figure and send one with a
>smaller font.
>
>J

10/3 Me to Craig:

Craig 

Attached is a new version of the rna pol tree with fonts corrected. I am going to add a few more sequences and rerun it and make a new tree tomorrow. 

Jonathan 

PS Also … here is a potential figure legend 

Figure. Phylogenetic tree of RNA polymerase homologs. Homologs of RNA polymerase were identified by searching sequence databases with representatives of the major known RNA polymerase subfamilies. These proteins, as well as six DNA polymerase homologs from A. thaliana, were aligned using clustalx using default settings. Phylogenetic trees were generated from the alignment (with ambiguously aligned regions and hypervariable regions excluded) using the PAUP* program. The tree shown was generated using the neighbor-joining algorithm with pairwise distances between sequences calculated with a PAM-like matrix. Numbers on the branches are bootstrap values indicating the percentage of 100 trees in which the proteins to the right of the node grouped together to the exclusion of all other proteins.
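
For readers unfamiliar with bootstrap values, here is a rough Python sketch of the idea in that legend: resample alignment columns with replacement, redo the analysis on each replicate, and report the percentage of replicates in which a grouping shows up. To stay short it makes two loud simplifications that are assumptions of this example, not what PAUP* does: distances are plain p-distances rather than a PAM-like matrix, and "grouping together" is approximated by two sequences being each other's nearest neighbor instead of building a full neighbor-joining tree. All function names are hypothetical.

```python
import random


def p_distance(a, b):
    """Fraction of non-gap aligned positions at which two sequences differ."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)


def nearest(name, seqs):
    """Name of the sequence closest to `name` (ties broken alphabetically)."""
    others = [n for n in seqs if n != name]
    return min(others, key=lambda n: (p_distance(seqs[name], seqs[n]), n))


def bootstrap_pair_support(seqs, pair, replicates=100, seed=0):
    """Percentage of bootstrap replicates in which `pair` are mutual
    nearest neighbors. A stand-in for clade support on a real tree."""
    rng = random.Random(seed)
    names = list(seqs)
    length = len(seqs[names[0]])
    a, b = pair
    hits = 0
    for _ in range(replicates):
        # Resample alignment columns with replacement.
        cols = [rng.randrange(length) for _ in range(length)]
        resampled = {n: "".join(seqs[n][i] for i in cols) for n in names}
        if nearest(a, resampled) == b and nearest(b, resampled) == a:
            hits += 1
    return 100.0 * hits / replicates


# Toy alignment (made up for illustration).
toy = {"a": "AAAAAAAA", "b": "AAAAAAAA", "c": "CCCCCCCC", "d": "CCCCCCCC"}
print(bootstrap_pair_support(toy, ("a", "b")))
```

A grouping that holds up whichever columns happen to be sampled gets a value near 100, while one that depends on a handful of columns gets a low value.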

Craig 10/3

Hi Jonathan, 

I will look forward to seeing the final tree, as will Mike, I’m sure. For
the legend, the fact that this is an alignment of second-largest subunits
should be made clear. Here is a stab at a minor revision:

Figure—–. Phylogenetic tree for the second-largest subunit of
DNA-dependent RNA polymerases. Homologs of RNA polymerase second-largest
subunits were identified by searching sequence databases with
representatives of the major known subfamilies (e.g. pol I, II, III and
eubacterial beta subunits). Identified proteins, including six homologs
from A. thaliana, were
aligned using clustalx using default settings. Phylogenetic trees were
generated from the alignment (with ambiguously aligned regions and
hypervariable regions excluded) using the PAUP* program. The tree was
generated using the neighbor-joining algorithm with pairwise distances
between sequences calculated with a PAM-like matrix. Numbers on the
branches are bootstrap values indicating the percentage of 100 trees in
which the proteins to the right of the node group together to the
exclusion of all other proteins.

Thanks,
Craig

Me:

much better figure legend 

j

Anyway – and so it went. Alas, for a variety of reasons not much made it into the final paper. What was there was this:

Unexpectedly, Arabidopsis has two genes encoding a fourth class of largest subunit and second-largest subunit (Supplementary Information Fig. 5). It will be interesting to determine whether the atypical subunits comprise a polymerase that has a plant-specific function.

And of course, this Supplemental Information is not exactly easy to find and does not actually work correctly anymore:

Downloading the Zip file and opening first page.htm gets one to this

And then clicking on the Figure 5 you get a broken page w/o the Figure.

But there, hidden in the folder with the Supplemental Information is the figure

So that is the beginning of the story on RNA Pol IV in Arabidopsis.

Go read the eLife paper and some of what it cites for the last 15 years of the story.


Wrap up of talk by Rich Lenski at UC Davis

Rich Lenski gave a talk today at UC Davis – part of a two talk series. This was a presentation more for the public and tomorrow he gives one more for the science crowd. Today’s talk was a really nice overview of Lenski’s work on long term evolution experiments in E. coli. I made a Storify of the tweets about the talk:

Horizontal gene transfer into humans? I am not convinced. Full text of my comments to reporters here

Some news stories about a new paper claiming evidence for horizontal gene transfer into humans and other chordates. I got asked by many reporters about it and some used some of my email comments in their articles.

See for example

 Here is the full text of my responses:


“got asked by another reporter to comment on this

so – have seen the paper 

it is interesting .. but I am not overwhelmed by what they present in the paper itself. For example, the HAS story seems really incomplete as presented (e.g., the Figure showing the tree does not have all the HAS1, HAS2, HAS3 genes even though they imply they studied that). “


I have been looking through the supplemental information. I find it impossible to judge the quality of this paper without being able to see the alignments they used for each phylogenetic tree. I cannot find alignments for the trees even after going to their Figshare site with the trees. I therefore think there is not much to say about the paper until being able to see those. 

Without seeing the alignments I offer multiple alternative hypotheses for their findings

  1. They have identified genes for which they are unable to produce reasonable alignments. Alignments are central to phylogenetic analysis and if their alignments are poor quality then the trees will show all sorts of anomalies that have nothing to do with phylogenetic history. By scanning through 1000s of genes and flagging those with unusual patterns they may be selectively identifying genes for which producing good alignments between species is tough. I note – clustalw is a bit notorious for not producing ideal alignments in some cases.
  2. I do not buy their arguments for why gene loss is not a possible explanation. They need to present more detail on how many gene losses would be required for each gene family under consideration and then present some evidence for why that # of gene losses is less likely than HGT.
  3. They have not even considered as far as I can tell, the possibility of divergent evolution (as opposed to gene loss) in many taxa which could lead to them being unable to identify homologs in some species
  4. I am not convinced by the arguments against long branch attraction as an explanation for some of the tree patterns.
  5. Related to alignments, they need to show which regions of the alignments they excluded from the phylogenetic analysis.
  6. Convergent evolution could also explain some of the patterns they observe.
  7. I could go on. I am NOT saying that HGT into chordates is impossible. It seems plausible. But it is up to them to exclude other MORE plausible alternatives and I just do not think they have done that.
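
To make point 2 concrete, here is a minimal Python sketch of how one might count the gene losses that a pure vertical-inheritance explanation would require. It is an illustration under stated assumptions, not the method of the paper under discussion: the species tree is given as nested tuples, the gene is assumed to have been present in the common ancestor of every species in the tree, and a loss is charged once for each maximal subtree in which no species has the gene. The taxon names and function names are hypothetical.

```python
def count_losses(tree, present):
    """Return (has_gene, losses) for a nested-tuple species tree.

    `tree` is either a leaf name (str) or a tuple of subtrees.
    `present` is the set of leaf names in which the gene is found.
    """
    if isinstance(tree, str):
        return (tree in present, 0)
    results = [count_losses(child, present) for child in tree]
    if not any(has for has, _ in results):
        # Whole subtree lacks the gene: the loss is counted higher up.
        return (False, 0)
    losses = sum(n for _, n in results)
    # Each child subtree entirely without the gene needs one loss.
    losses += sum(1 for has, _ in results if not has)
    return (True, losses)


def min_losses(tree, present):
    """Minimum losses if the gene was present at the root of `tree`."""
    has, losses = count_losses(tree, present)
    return losses if has else 1


# Hypothetical taxa: gene found in two bacteria and two chordates only.
tree = (("bacterium_A", "bacterium_B"),
        ("fungus", ("fly", ("fish", "human"))))
present = {"bacterium_A", "bacterium_B", "fish", "human"}
print(min_losses(tree, present))
```

A count like this, done for each gene family, is the number that would then have to be weighed against a single transfer event. Deciding which explanation is more likely still needs an explicit model of how often losses and transfers happen.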

Reporter: asking if it was OK to quote me

Yes it is OK to quote from me. I would like to reiterate – I am not saying they are wrong. Just that I would like to see (1) all the data (e.g., alignments) that underlies their conclusions and (2) them do more to exclude other possibilities.


Reporter asking what other analyses could they do

So – I don’t want to be difficult, but it is their job to figure out how to do such tests before claiming they have strong evidence for HGT. 

In general, this is pretty typical of claims of HGT. Many researchers show evidence that is consistent with the occurrence of HGT (which they did here) but few actually explicitly test alternative hypotheses such as gene loss, bad alignments, convergence, divergence, contamination, random noise, and more. I think their work is certainly interesting, but they just have not tested all of these alternatives. And I personally have grown a bit tired of pointing out how people can do better controls for their papers.


Reporter asking about initial impressions:

I see little here that is particularly convincing evidence for HGT …


My follow up email

Note – I am not saying that this is a bad paper — just that I am not overwhelmed by their evidence and especially by what they put in the paper. 

For example, the HAS1 gene story seems incomplete.  Figure 3 seems to show just HAS1 but in the text they say they show the same thing for all HAS genes.  And the tree they show shows a tiny subset of all the available sequences (e.g., HAS1 HAS2 HAS3 and fungal and bacterial homologs).  They claim that they now have proof that HAS1 was transferred near the base of chordates but I just don’t see how they tested alternative hypotheses …


Some related links:

Also here are some presentations from many years ago with some discussion of HGT

Repeated, extremely biased ratio of M:F at meetings from SFB 680 "Evolutionary Innovations" group #YAMMM

Well, this is disappointing, to say the least – there is a conference coming up in July 2015 on “Forecasting Evolution”:  SFB 680 | Molecular Basis of Evolutionary Innovations at the Gulbenkian Foundation in Lisbon.

Here is the listed lineup of invited speakers:

  1. Andersson (Uppsala University), (NOTE I AM ASSUMING THIS IS DAN ANDERSSON)
  2. Trevor Bedford (Hutchinson Cancer Research Center), 
  3. Jesse Bloom (Fred Hutchinson Cancer Research Center), 
  4. Arup Chakraborty (MIT)
  5. Michael Desai (Harvard University), 
  6. Michael Doebeli (University of British Columbia), 
  7. Marco Gerlinger (Institute of Cancer Research, London), 
  8. Michael Hochberg (CNRS, Montpellier), 
  9. Christopher Illingworth (Cambridge University), 
  10. Roy Kishony (Harvard University), 
  11. Richard Lenski (Michigan State University), 
  12. Stanislas Leibler (Rockefeller University), 
  13. Marta Luksza (IAS Princeton), 
  14. Luke Mahler (University of California, Davis), 
  15. Leonid Mirny (MIT), 
  16. Richard Neher (MPI Tuebingen), 
  17. Julian Parkhill (Sanger Institute), 
  18. Colin Russell (University of Cambridge), 
  19. Sohrab Shah (University of British Columbia), 
  20. Boris Shraiman (UCSB), 
  21. Olivier Tenaillon (Inserm Paris).

For a whopping 20:1 ratio of men to women or 4.8% women. And this in a field that is just overflowing with excellent female researchers.

So I dug around a little bit.  Here is another meeting from the same group at the University of Cologne – a group known as SFB 680. SFB 680: Molecular Ecology and Evolution: Cologne Spring Meeting 2012.

Speakers:

  1. Ian Thomas Baldwin, MPI Jena
  2. Nitin Baliga, ISB Seattle 
  3. Andrew Beckerman, University of Sheffield 
  4. Joy Bergelson, University of Chicago
  5. Michael Boots, University of Sheffield 
  6. John Colbourne, Indiana University 
  7. David Conway, LSHTM London
  8. Santiago Elena, IBMCP Valencia
  9. Duncan Greig, MPI Plön 
  10. Bryan Grenfell, Princeton University 
  11. Eddie Holmes, Pennsylvania State University 
  12. Peter Keightley, University of Edinburgh
  13. Britt Koskella, University of Oxford
  14. Juliette de Meaux, University of Münster 
  15. Thomas Mitchell-Olds, Duke University
  16. Hélène Morlon, Ecole Polytechnique Paris 
  17. Wayne Potts, University of Utah 
  18. Michael Purugganan, New York University
  19. Andrew Rambaut, University of Edinburgh 
  20. Walter Salzburger, University of Basel 
  21. Johanna Schmitt, Brown University
  22. Ralf Sommer, MPI Tübingen
  23. Miltos Tsiantis, University of Oxford 
  24. Diethard Tautz, MPI Plön 
  25. Daniel Weinreich, Brown University

Session and Meeting Chairs:

  1. Michael Lässig
  2. Maarten Koornneef
  3. Eric von Elert
  4. Thomas Wiehe
  5. Jonathan Howard

That would be 25:5 or 16.7% female.

And then there was this: Perspectives in Biophysics in October 2014

  1. Konstantin Doubrovinski
  2. Tobias Bollenbach
  3. Stefano Pagliara
  4. Damien Faivre
  5. Ingmar Schön
  6. Kurt Schmoller
  7. Max Ulbrich
  8. Florian Rehfeld
  9. Steffen Sahl
  10. Timo Betz
  11. Alexandre Persat
  1. Rubén Alcázar (MPI for Plant Breeding Research, Cologne)
  2. John Baines (Christian-Albrechts-University, Kiel)
  3. Thomas Bataillon (University of Aarhus)
  4. Frank Chan (MPI for Evolutionary Biology, Plön)
  5. George Coupland (MPI for Plant Breeding Research, Cologne)
  6. Susanne Foitzik (Johannes Gutenberg-University, Mainz)
  7. Isabel Gordo (Instituto Gulbenkian, Lisbon)
  8. Oskar Hallatschek (MPI for Dynamics and Self-Organization, Göttingen)
  9. Jonathan Howard (University of Cologne)
  10. JinYong Hu (MPI for Plant Breeding Research, Cologne)
  11. Jeffrey Jensen (University of Massachusetts Medical School, Worcester)
  12. Michael Lässig (University of Cologne)
  13. Dirk Metzler (Ludwig-Maximilians-University, Munich)
  14. Ville Mustonen (Wellcome Trust Sanger Institute)
  15. John Parsch (Ludwig-Maximilians-University, Munich)
  16. Frank Rosenzweig (University of Montana, Missoula)
  17. Christian Schlötterer (University of Veterinary Medicine, Vienna)
  18. Shamil Sunyaev (Brigham & Women’s Hospital and Harvard Medical School) 
  19. Karl Schmid (University of Hohenheim)
  20. Ana Sousa (Instituto Gulbenkian, Lisbon)
  21. Diethard Tautz (MPI for Evolutionary Biology, Plön)
  22. Xavier Vekemans (University of Lille)
Session and Meeting Chairs
  • Wolfgang Stephan
  • Michael Lässig
  • Berenike Maier
  • Wolfgang Stephan
  • Peter Pfaffelhuber
  • Juliette de Meaux

For a 19:3 ratio, or 13.6% women, among the speakers; if you include session chairs it comes to 23:5, or 18% female in total.

And Evolutionary Innovations in 2010. 

Invited speakers:

  1. R. Bundschuh (Ohio State University), 
  2. C. Callan (Princeton University),
  3. A. Clark (Cornell University), 
  4. J. Colbourne (Indiana University),
  5. E. Dekel (Weizmann Institute),
  6. L. Hurst (University of Bath), 
  7. S. Elena (Universidad Politécnica de Valencia), 
  8. E. Koonin (National Center for Biotechnology Information), 
  9. M. Kreitman (University of Chicago),
  10. S. Leibler (Rockefeller University, New York and Institute for Advanced Study, Princeton),
  11. T. Lengauer (Max Planck Institute for Informatics), 
  12. S. Maerkl (Ecole Polytechnique de Lausanne), 
  13. C. Marx (Harvard University), 
  14. L. Mirny (Massachusetts Institute of Technology), 
  15. V. Mustonen (Sanger Institute), 
  16. C. Pal (Biological Research Center, Szeged),
  17. D. Petrov (Stanford University), 
  18. B. Shraiman (Kavli Institute for Theoretical Physics, Santa Barbara),
  19. S. Sunyaev (Harvard University), 
  20. D. Tautz (Max-Planck-Institute for Evolutionary Biology)
Plus session chairs 
  1. Johannes Berg
  2. Siegfried Roth
  3. Wolfgang Werr
  4. Martin Lercher
And additional speakers not listed on their invited speakers page:
  1. Michael Lassig
  2. Ruben Alcazar
  3. Juliette de Meaux
  4. Joachim Krug

For a whopping ratio of 27:1 or 3.6% women.
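
For anyone who wants to check my arithmetic, here is a minimal sketch of how the percentages follow from the men:women tallies quoted in this post. The function name `percent_women` and the dictionary of tallies are just my own illustration, not anything from the meeting organizers.

```python
def percent_women(men: int, women: int) -> float:
    """Return the share of women among all counted people, as a percentage."""
    total = men + women
    if total == 0:
        raise ValueError("no people counted")
    return 100.0 * women / total


# (men, women) tallies as quoted in this post; keys are just the ratios.
tallies = {
    "20:1": (20, 1),
    "25:5": (25, 5),
    "19:3": (19, 3),
    "23:5": (23, 5),
    "27:1": (27, 1),
}

for label, (men, women) in tallies.items():
    print(f"{label} -> {percent_women(men, women):.1f}% women")
```

The percentage is taken over everyone counted (men plus women), not over the men alone, which is why 20:1 works out to about 4.8% rather than 5%.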

The only meeting from them I could find with a decent / non massively skewed ratio was the following very small one: Evolution of Development

  1. Cassandra Extavour
  2. Angela Hay
  3. Felicity Jones
  4. Nicolas Gompel
  5. Kristen Panfilio
  6. Christiane Kiefer
This is a nice case. But it really seems like an exception in a long list of meetings with a much smaller representation of female speakers than one would expect based on the researchers in the fields. I think SFB 680 seriously needs to consider what is causing these biases, and they should do something about it.

———————————————
See this page for other posts of mine on this and related topics.