Well, just getting around to writing up some thoughts on the iEVOBIO meeting I went to earlier this week. It was really quite excellent, so here are some thoughts/notes. Today I am writing about the background and Day 1. Most of this is simply a catalog of what happened along with some Twitter details. In a few days I will write up a post on what I think it meant.
The background: how I heard about iEVOBIO (skip to below if you just want to know about what happened in the meeting)
The first I heard about the meeting was on Dec 7, 2009, in a Direct Message on Twitter from @rdmpage. That would be Rod Page, who I had never met, but followed remotely via Twitter, his blog, his software and his papers. He wrote:
Hi Jonathan, hope you got my email about speaking at iEvoBio in June. No pressure, just checking that it made it into your in box.
I had known about Rod for a long time since I had used his software since I was in grad school. For example, I used to use Treeview for all phylogenetic tree viewing/drawing etc. From its history, it seems to have been available since 1996. I am not 100% sure when I started using it, but it was around then. Then I switched over to using Treeview X a few years later. And I have used some of his other software on and off. More recently I have followed his blog/tweets/web sites closely.
When Rod invited me, I was on a mini vacation in Monterrey and I had not actually seen his email yet (I am ALWAYS behind in reading email). So I found the email, inviting me to give a Keynote at this cool sounding iEVOBIO meeting focusing on informatics for phylogenetics, evolution and biodiversity. Sounded great actually. Especially the part about Open Source:
iEvoBio and its sponsors are dedicated to promoting the practice and philosophy of Open Source software development and reuse within the research community. For this reason, if a submitted talk concerns a specific software system for use by the research community, that software must be licensed with a recognized Open Source License, and be available for download, including source code, by a tar/zip file accessed through ftp/http or through a widely used version control system like cvs, Subversion, git, Bazaar, or Mercurial.
I also liked the notion of a challenge – in this case there was a challenge for new visualization methods for evolutionary data. In summary the challenge was:
From phylogenetic trees to population networks, whether on printed pages or in GoogleEarth, visualizing evolution is a key part of our discipline. Inspired by the challenges and opportunities visualizing presents for our field, the first iEvoBio challenge is “To create a new visualization tool or platform to support evolutionary science”.
Alas, since I was on vacation I did not have all my schedule information with me, so I said I was not sure. Fortunately, when I got back, it looked like good timing with the Evolution meeting just before so I said sure.
Going to iEVOBIO (skip to below if you just want to know about what happened in the meeting)
Anyway – jump to last week, skipping over some of the preparatory stuff for the meeting. I was planning on being in Oregon for almost a week, including the SSE meeting just before iEVOBIO and a meeting for my iSEEM project in Eugene before that. But I just could not deal with being away for that long including over the weekend, after having not really taken any time off in a while.
So I went home and skipped SSE2010 and then headed back to iEVOBIO on Monday the 28th. I flew on Southwest from Sacramento to Oregon, took the light rail into the city, and walked the last bit to my hotel. I arranged to have dinner with Aaron Darling, a Research Scientist working in/with my lab who was at SSE. We had a good dinner and then I went back to my room and stayed up until about 3:30 AM working on my keynote talk.
I really wanted to include some new stuff and also include some background on microbes and microbial diversity, and so worked very late making new slides, piecing together slides from multiple talks, and then trying to delete slides since my talk was way way way too long. The final product I did not finish that night.
I set my alarm on my phone and asked for a wake up call and got about 3 hours sleep. I got up around 6:30, worked on my slides for an hour, and then took a shower and headed downstairs where I borrowed a hotel bike (god, I love Portland – free bikes at my hotel) and biked along the river, over the bridge, and after a little hunting around found where I could park and lock the bike. And I went in.
I worked on my talk for another 30 minutes in an isolated corner and then went over to the main part of the conference center. Finally I was done. I was amazed at how crowded it was. Were all of those people there for iEVOBIO? Alas, no – SSE2010 was still going (I did not realize it would still be on).
After asking around I found the meeting room and met the one and only Rod Page (we had never met). I made sure my laptop would connect with their system and then headed out to get coffee – there was a Starbucks in the hallway outside the meeting room. Alas there was a giant line and my talk was in 25 minutes. Fortunately, Aaron Darling was in the line and he agreed to purchase a latte for me. I went back in, made sure everything was set, and paced around until I got my coffee from Aaron and then it was time for the meeting to start.
The meeting itself: Day 1 part 1: Keynote by me
The meeting kicked off with a few details from some of the organizers including Rod Page and Todd Vision. We found out who the organizers were: Rod Page (University of Glasgow), Cecile Ane (University of Wisconsin at Madison), Rob Guralnick (University of Colorado at Boulder), Hilmar Lapp (NESCent), and Cynthia Parr (Encyclopedia of Life). We also found out who helped fund the meeting: the US National Evolutionary Synthesis Center (NESCent) and the Society of Systematic Biologists (SSB). I am no longer sure exactly what else they said. But there seems to have been at least one tweet about the intro:
- Todd Vision mentions how computational biology is a guild, full of people that take great pride in their craft.
And then Rod introduced me. Pretty funny actually. He gave me grief for writing about bad omics words and yet inventing and then using phylogenomics for everything. And then my talk was on. Here it is on slideshare.
Also – there were a few tweets about my talk including the following:
- IEvoBio starts with keynote by @
- toranaga Jonathan Eisen – Phylogenomics of microbes: the dark matter of biology
- Need @ input to Biodiversity Science Triage BoF: using informatics to find the gaps in expertise and knowledge.
- Eisen is recording his own talk. Not out of ego – he’ll put his presentation (and voice recording) on slideshare
- An homage to Donald Rumsfeld by Jonathan Eisen: “There are known knowns. There are known unknowns, there are also unknown unknowns”
- He’s mentions how rRNA has been used to study the diversity of microbes, esp via molecular phylogenetics
- Quote of the conference so far: “Microbes run the planet” – Jonathan Eisen
- @ mentions that rRNA analysis doesn’t capture all of the variation in nature, esp at functional level
- @ discussing metagenomics and how messy the analysis of it is (this is what I’m trying to solve!)
- A major challenge is the binning of metagenomic data, whereby sequences are sorted into their appropriate genomes
- Lineage sorting may be major contributor of noise in microbial phylogeny
- Nice talk on microbial diversity from @. Many calls to the community re: assistance building new computational tools.
- Here are my slides from my talk at “Phylogenomics of microbes – the dark matter of biology”
- RT @: Jonathan Eisen talk at 2010
- I can see @ on @ 2010 slides. Phylogenomics + Nutrigenomics = efficient, nutrition rich probiotics 🙂 ?
- Some great slides at the beginning RT @: Jonathan Eisen talk at 2010
- Ha lk 2 C it 2 RT @: I cn C @ on @ 2010 slides. Phylogenomics + Nutrigenomics => probiotics+++++
- Eisen: Analysis of metagenomics is an absolute mess
I got asked some great questions afterwards including one by Joe Felsenstein, one by Arlin Stoltzfus, and one by James McInerney (well, McInerney mostly gave me grief about how he disagreed with me about the extent of lateral gene transfer).
Day 1 part 2: Short talks
After my talk, thankfully for all involved, there was a coffee break. And then we were back with short, ~15 min talks. These are listed below with some information, most of it from Twitter.
- Vince Smith: Top-down and bottom-up informatics: who has the high ground?
- Vincent Smith discussing scratchpads – looks like neat system for website/research/publication
- Top down projects have low usage, bottom up have high user base @
- top-down processes tend to be correlated with low usage and slow development. bottom up dependent on user feedback
- Smith conclusions: top down projects need institutional support; bottom up depend on many users
- @ says sequential greedy hill climbing is flawed; his simulated annealing method is better
- Cynthia Parr: Community content building for evolutionary biology: Lessons learned from LepTree and Encyclopedia of Life
- Cynthia Parr – Community content building for evolutionary biology – a talk on LepTree and Encyclopedia of Life
- Now @ is comparing LepTree and Encyclopedia of Life
- @ on
- and for the interested parties.
- @ showing info from LepTree including molecular & morphological & fossil data
- @ using Drupal for LepTree to support community interaction
- Interesting to see playing such an important role in the various biodiversity web tools people are describing at
- @ built a taxon template for LepTree using semantic tools on top of Drupal
- @ now talking about @ and LifeDesks
- Semanticizing not useful, communities hard, divide and conqueror scales
- @ LepTree much more structured DB than EOL
- Users have used tools that they asked for @
- @ Reminds us that in biology, the usability adage holds true: Users really don’t know what they want.
- my talk is over so now I can really have fun!
- My SlideShare upload :Community content building for evolutionary …
- Arlin Stoltzfus: EvoIO: Interop technology meets community science
- NExt up at Arlin Stoltzfus from NIST/UMD discussing – informatics/standards for big scale phylogenetics
- http://.evoio.org at
- Interoperability is an important consideration when building an infrastructure
- Stoltzfus discussing Great Fire of Baltimore 1904 & how it lack of interoperability led to standards for fireplugs
- as e.g. Of interop failure
- While standards can be recommended, often times adoption of standard is voluntary.
- Standards are voluntary, conformance in case of fire hydrants follows disaster
- Standards developed by stakeholders, compliance is voluntary business decision
- Standards are voluntary; to further interoperability need to mitigate cost & enhance benefits of compliance; hence EvoIO
- aims to compliance with standards easier
- The EvoIO Stack: Data semantics -> Ontoloties (CDAO), data syntax -> NeXML Format, data access -> phyloWS API
- CDAO
- Phylows
- Hackathons at @ play a big role in developing these standards
- Arlin Stolzfus is asked “What would evoio.org do with 3 million pounds?” Hackathons, translators, etc.
- Brandon Chisham: CDAO-Store: A New Vision for Data Integration
- Next up Brandon Chisham @ on CDAO Store for Comparative Data Analysis Ontology
- CDAO store populated with TreeBase data
- CDAO-Store queried with Phylows
- CDAO using Prefuse framework for tree viewing searching see
- CDAO future, SPARQL, taxonomy ids, other stores
- CDAO on Twitter @
- If anyone was wondering where the CDAO ontology comes from:
- Presentation went well having a working lunch. Working on phenex integration 🙂
- Joe Felsenstein: Using molecular and morphological data to connect fossils to a phylogeny
- Next up at , the one, the only Joe Felsenstein on molecular data & paleontology
- Joe Felsenstein speaking on fossils and molecular dates
- Felsenstein discussing how most approaches use synapomorphies/discrete traits and try to date them
- Joe Felsenstein @ on placing fossils on the molecular tree. My old adviser has experience doing this:
- Felsenstein presenting a brownian motion model for dating that sounds similar to independent contrasts
- Felsenstein: Brownian motion model for continuous (extant) characters to figure out where to connect fossils to tree. Cool.
- I have a suspicion that Joe Felsenstein dreams in greek
- Felsenstein specifically says this dating method is analogous to independent contrasts
- Felsenstein, simple to compute max likelihood placement of fossil taxa on molecular tree # ievobio
- But catch is need calibrated tree
- Felsenstein method designed to work if you have a time calibrated molecular clock
- Now handling case where tree not calibrated
- Felsenstein: Traffic light visualization shows probabilities of potential placements in tree.
- Missing data in fossils is tiresome
- Felsenstein: Cannot mention non-open-source pub here, cannot thank NSF because they have no clue how to fund methodology
- Felsenstein said he was distressed with difficulty in getting funds for software/methods from NSF
- WTF! Funding agencies aren’t picking up Joe Felsenstein’s research?!
- Question “How many characters do you need to be precise?” Felsenstein answer “Infinity”
- Victor Hanson-Smith: Phylogenetic Mixture Models and Optimization by Simulated Annealing
- Shoot – was having a conversation about frogs and I missed the title of the next talk. ‘mabadblogger at
- Hanson-Smith making very big statements re possible problems with studies using Max-Likelihood phylogeny b/c of tree search method
- PhyESTA @
Day 1 part 3: Lunch – here are some tweets that came out around that time
- At least evolutionary biologists know what is important: big crowd watching vs
- Lightning talks, Challenge Entries, and Software bazaar coming up at
- Good idea. Maybe a NeXML / PhyloXML / BEASTXML discussion? RT @: Plz rt: do we need a NeXML BoF? @
Day 1 part 4: Challenge talks
And then the post lunch challenge talks began. These related to visualization tools entered into the meeting challenge mentioned above and described here.
- Mike Porter: GenGIS. Here is a link to their submission.
- Porter: GenGIS visualizes phylogenies & pie charts on maps, hooks to R for statistics, can record animations. Impressive. challenge
- challenge up and running. GenGIS and up first
- Michael Porter now talking about GenGIS system for mapping biodiversity data
- GenGIS example of katydids is same @ uses for its poster
- GenGIS can be scripted using Python
- Kris Urie: VoLE (Viewer of Life in EOL). Here is a link to their challenge submission.
- Next up VOLE “Viewer of Life in EOL”
- Treemap viewer for @ at
- For background entry on @ entry see
- Kris Urie: Viewer of Life on EOL. A treemap, using AJAJ (AJAX + JSON). Shoutout for EOL API. challenge
- Arlin Stoltzfus again: Nexplorer3. Here is their challenge entry.
- Next challenge entry Nexplorer
- “Never do live demo in fron of life audience…” @
- Nexplorer3 built upon @
- Arlin is back, showing Neplorer3, which uses the EvoIO stack and CDAO, has sparql query window, tree & sequence vis challenge
- Nexplorer3 displays trees, alignments, supports SPARQL o
- Interesting that RDF and SPARQL crop up several times, but nobody has explained what they are
- Don’t vote for nexplorer3! 😉 Gov’t employee can’t accept money prize
- Andrew Hill: PhyloBox. Here is a link to the challenge entry.
- Sam Smits: jsPhyloSVG. Here is a link to the challenge entry.
- Smits and Ouverney: library for visualizing vector-based phylogenies on web challenge
- Next up jsPhyloSVG
- Next up at Samuel Smiths on Visualizing interactive vector based trees on web; goal to make interactive tool
- Fractured web (support for standards, devices) make web development tricky
- I still want to see a user study on effectiveness of circular tree vis. What to do for large trees in print?
- Great point about how diversity of devices and tools frustrates modern web dev
- Static phylogeny images can’t be mined
- jsPhyloSVG uses SVG via @
- Smits: created a java script jsPhyloSVG to render trees in SVG ; scalable; runs on most OS/browsers
- jsPhyloSVG written entirely in Javascript
- Using js to circumvent need for a server – lets the browser parse tree file and render image
- PhyloTouch, touch enabled for mobile devices. Slick.
- PhyloTouch will be available for us Android folks, good to hear’
- Smits: phyloTouch for browsing trees on touch-screen enabled systems
- Smits: Making tree graphics interactive and searchable , exposing data — all in the markup & javascript interpreted by browser
- Some really awesome visualization tools here at. Excellent resources for open, collaborative, web-enabled science
Day 1 part 5: Lightning talks
- Now time for the lightning talks. Gives me a good idea what to expect for mine tomorrow. Glad I’m on day 2!
- lightening talks! 5 minutes and then the gong goes off
- A. Thessen: New Biology: The Data Conservancy and Data Driven Discovery
- Next up at Anne Thessen on – working on data sharing methods
- Anne Thessen asks: “How do we make data sharing part of the normal work flow of the life sciences?” A great, important question.
- Thesson types of data: observational, experimental, high throughput, monitoring, simulation
- Lighting talks: Thessen attempts to describe Data Conservancy in five minutes. Good luck!
- I agree with Thessen that data visualization, esp in big data projects, is often key to discovery.
- arrival of @: tweets >> tweets
- There are not enough power sources here in the conference room. Running out of juice!
- Curious about what’s happening at , check out
- B. Gemeinholzer: DNA Bank Network – a virtual linkage of natural history collections’ voucher specimens and documentation with physical DNA, sequences, and publications
- Gemeinholzer: molecular data metadata frequently missing from studies; linkage to vouchers also limited
- Next up: Birgit Gemeinholzer linking molecular and specimen data
- B. Gemeinholzer: DNA Bank Network
- I just realized that the Smithsonian logo, currently on a slide at , looks a lot like the @profile pic
- M. Porter: iBarcode-nextgen: tools for next generation biodiversity analysis
- Next up Michael Porter on “iBarcode-nextgen: tools for next generation biodiversity analysis”
- lightning talks would be even more thrilling with a slowly charging van de graaf generator used as an overtime alarm
- Microbes and metagenomics getting many mentions at including GOS; binning; phylotyping; rRNA
- C.T. Hittinger: Leveraging skewed transcript abundance by next-generation sequencing to increase the genomic depth of the tree of life
- Next up: C.T. Hittinger on leveraging skewed transcript abundance by next-gen seq to increase the genomic depth of tree of life
- @ running a tight ship at
- First mention of “phylogenomics” as a method for inferring phylogeny using genomes
- Wow, an empirical data-based lightening talk. Impressively efficient used of time from C.T. Hittinger at
- The PhD comics take on starting up a new conference. Ouch!
- Susanna Lewis: Functional Gene Ontology Annotation across Species using PAINT
- Next up SuZanna Lewis: Functional Gene Ontology Annotation across Species using PAINT – annotate gene families
- Lewis: PAINT “Phylogenetic Annotation INference Tool” allows one to annotate a single gene family across many species
- Suzanna Lewis: propogating protein properties (GO terms) using PAINT . Finally a good use of power of semantics!
- For moving presentations across computers @ was a saviour
- J. Balhoff: Phenex: Ontological Annotation of Phenotypic Diversity
- Next up: J. Balhoff: Phenex: Ontological Annotation of Phenotypic Diversity see
- Jim Balhoff: “Phenes – Ontological Annotation of Phenotypic Diversity”
- And here is the paper on Phenex
- Phenex
- Balhof: Phenex.–> Powerful though assumes biocurators, not biologists.
- P. Midford: The Teleost Taxonomy Ontology
- Next up P. Midford: The Teleost Taxonomy Ontology includes all species in Eschmeyer’s Catalog of Fishes’
- Midford: Teleost Taxonomy Ontology <– Number of talks including semantic buzzwords is huge. Time to compile refs to prove value?
- Linnaean taxonomy easier than phylogeny to deal with as an ontology
- T.M. Keesey: Toward a Complete Phyloreferencing Language
- Next up @ on
- Next up: Mike Keesey: Toward a Complete Phyloreferencing Language (“sort of a SQL for phylogeny”)
- Downloading Flash 10.1 … … … vobio
- Keesey is @; working on h/t @
- OK, Flash 10.1 installed, looks nice
- R. Buels: GMOD for Evolutionary Biology
- Next up R. Buels & D. Clements on GMOD for Evolutionary Biology
- Last lightning talk is on GMOD
- Yay! A GMOD tag-team presentation at
- GMOD is making use of Chado “Natural diversity module”
- Chado Natural Diversity Module
- interesting to see so many folks (at least 2 out of 5 of the projects in the visualization challenge at ) using
- Great day @ a lot of great talks and demos
- RT @ GMOD having an evo hackathon
- Very productive day @, @ a lot of interesting talks and demos. Some great stuff to take back.
Day 1 part 6: Software bazaar and demos
Then there was the software bazaar and the challenge demonstrations, which, alas, I mostly skipped because of the lack of sleep the night before. It seemed quite packed in there and I was just exhausted. So I went back to my hotel, slowly riding the borrowed hotel bike back along the river.
Here is what I missed:
- Software bazaar
- W. Berendsohn: The EDIT Platform for Cybertaxonomy
- R.J. Challis: Pipefinder – semantic pipelines made easy
- B. Gemeinholzer: DNA Bank Network – a virtual linkage of natural history collections’ voucher specimens and documentation with physical DNA, sequences, and publications
- M.J. Favé: eFECTIV: Shape analysis using elliptical harmonics
- T.M. Keesey: Names on Nodes: Automating the Application of Taxonomic Names within a Phylogenetic Context
- S. Lewis: Functional Gene Ontology Annotation across Species using PAINT
- S. McKay: GBrowse_syn
- M. Porter: iBarcode-nextgen: tools for next generation biodiversity analysis
- D. Rosauer: Biodiverse, a tool for spatial analysis of biological diversity
- R. Scherle: The Dryad Digital Repository
- C.L. Strope: indel-Seq-Gen version 2.0
- M. Youngblood: mt-tRNA-Draw
- Challenge demonstrations
- M. Porter: GenGIS
- K. Urie: VoLE (Viewer of Life in EOL)
- V. Gopalan: Nexplorer3
- A. Hill: PhyloBox
- S. Smits: jsPhyloSVG
Summary of Day 1.
Here are some tweets summarizing Day 1:
- Open science and data sharing at Evolution 2010 and iEvoBio: Posted by petersuber to oa.notes oa.biology oa.new oa…
- @ @ sounds like is going well & interesting. v. sorry I couldn’t work it in!
- Demos of Biodiverse, GenGIS, various tree visualizers… very cool stuff at today!
- someone explain to me how the ievobio wants open source yet they charged for registration???
- data visualization stuff was way cool today at
At the end of the day, I had dinner with Steven Kembel and Tom Sharpton (@toranaga), who I work with on a Gordon and Betty Moore Foundation funded project we call iSEEM. Dinner and conversation were great. I then went for a walk along the river and went back to my room.