Calling all computational biologists – do as C. Titus Brown does – submit your pubs to arXiv

Can I just express my love/respect for C. Titus Brown?  Not only is he into openness in science and metagenomics and such, but he practices what he preaches.  For example see – Daily Life in an Ivory Basement : /mar-12/diginorm-paper-posted in which he describes his new submission to arXiv and some background.  I know I am big on Open Access and all, but even we have been lame about submitting things to preprint servers like arXiv.  Gonna do my best to fix that and try and copy Titus.

California Breast Cancer Research Fund Tax Checkoff; wondering about Open Access policies

Just got this email below about what seems to be a worthy cause:

March 8, 2012 

Dear UC Colleagues, Throughout California and UC, researchers are developing new approaches to prevent, treat and cure cancer. I am writing you to share important information with those of you who will soon file your California state tax return. At the end of California Tax form 540, there is a section in which you can donate to two highly regarded cancer research programs that are administered by the UC Office of the President.

If you go to line 405, you can contribute to the California Breast Cancer Research Fund and if you go to line 413, you can contribute to the California Cancer Research Fund.

No contribution is too small, and 95 percent of contributions to these two programs via this tax check-off go directly to cancer research or community-based education.

Donations from line 405 go to our California Breast Cancer Research Program, which is renowned not only for its cutting-edge research, but also for working with community advocates and health care providers in targeting the issues and needs of patients and families, especially the underserved.

In recent years, donations from line 405 have supported critical research including: identifying environmental factors that potentially cause breast cancer; developing targeted therapies to block breast cancer from spreading to other organs; and improving support networks to empower patients as they maneuver the health care system. See this website for more information: http://cabreastcancer.org/taxcheckoff/

Donations to line 413 go to the California Cancer Research Fund, which is helping to provide prevention and awareness programs in communities disproportionately affected by cancer. One major ongoing project is increasing the understanding of the impact of tobacco use and cancer on vulnerable populations. This research could lead to reduced smoking, increased cancer awareness and strengthened prevention programs.

I wanted to be sure you were aware of this means of investing in research that can serve all Californians and our communities.

Sincerely,

Lawrence Pitts, M.D. Provost and Executive Vice President University of California, Office of the President

However, I wonder about the open access requirements of the fund. I sniffed around at their web site http://cbcrp.org/about/ and could not find anything about guaranteeing access to the results of the work supported by this fund. That is too bad – this seems to be a great case where openness could be both a good thing and a useful marketing tool (to get people to chip in money from their taxes).

Not sure what to make of this new "Datasets.Com" effort from Hindawi

Just got this email and I thought I would share.  Not sure what to make of this effort.  I do support the sharing of data sets but I think we probably do not need a whole new cadre of data journals to handle this data.

But there is a spread of what some have called “Predatory” open access publishers (see http://metadata.posterous.com/83235355 for example).  Hindawi, who is behind this, seems to have a mix of good and predatory tendencies and this seems like it may fit into the more predatory categorization.  And I just thought it would be good to bring this a bit more into the open to discuss it.

Dear Dr. Eisen,

My name is Safa Tahoon and I am a Journal Developer for the Hindawi Publishing Corporation. We are in the process of launching a new peer-reviewed, open access journal titled Dataset Papers in Genetics, which will publish Dataset Papers in all areas of genetics research, and I am writing to invite you to join the Editorial Board of this new journal.

Dataset Papers in Genetics is part of a new journal platform that Hindawi is developing called Datasets International (http://www.datasets.com). The main objective of Datasets International is to help researchers in all academic disciplines archive, document, and distribute the datasets produced in their research to the entire academic community. In addition to publishing a series of journals devoted to the dissemination of Dataset Papers, Datasets International hosts the underlying data behind these Dataset Papers and makes it accessible to all researchers worldwide.

The journal will be run using a collaborative editorial model which is designed to provide a fast peer review process for all submitted manuscripts. The journal will be edited by a distributed Editorial Board, and it aims for an average review time of 4 weeks from submission until a final decision has been reached.

Manuscripts that are submitted to the journal will be sent to a number of Editorial Board Members (typically each manuscript will be sent to at least 5 Editors), who will have two weeks to provide either a recommendation for the publication of the manuscript, along with a written commentary detailing any improvements that the authors should make to their manuscript, or a written critique of why the manuscript should not be published.

After the two-week period has elapsed, if the majority of the editorial evaluations recommend the manuscript be rejected, the manuscript will be rejected. If all the editorial evaluations that are received recommend that the manuscript be accepted for publication, the manuscript will be accepted. Otherwise, the editorial evaluations will be anonymously communicated to all of the Editors who participated in the peer review process. Each Editor will be given an additional week to review the feedback of the other Editors and to either confirm or revise their earlier editorial recommendations. If the majority of the editorial evaluations that are received by the end of this second round of review recommend the manuscript be accepted for publication, the manuscript will be accepted. Otherwise, the manuscript will be rejected. If the manuscript is accepted for publication, the names of the Editors who recommended the publication of the manuscript will be published alongside the manuscript.

More information on the journal is available on the following web pages:

http://www.datasets.com/ (Datasets International Home Page)
http://www.datasets.com/journals/genetics/ (Journal Home Page)
http://www.datasets.com/journals/genetics/workflow/ (Editorial Workflow)
http://www.datasets.com/journals/genetics/editors/ (Editorial Board)

The journal will be published using an open access model, which allows disseminating scholarly articles by removing the access barriers imposed by the subscription model, in order to make the full-text of all published articles freely available for any interested reader. In this model the publication costs of an article are covered in the form of Article Processing Charges, which are publication fees paid from the research budget of accepted authors. In this model authors retain the copyright of their work, and we make every possible effort to ensure that the full-text of every published article is both visible and accessible to all potential readers.

Manuscripts that are submitted by the members of the Editorial Board of Dataset Papers in Genetics to the journal will automatically receive a 50% reduction in their Article Processing Charges.

Please do visit the web pages above and let me know if you have any questions or comments. We hope you will accept to join the Editorial Board of the journal and I will be looking forward to hearing from you soon.

Best regards,

Safa Tahoon

——————————
Safa Tahoon
Journal Developer
Hindawi Publishing Corporation
http://www.hindawi.com/
——————————

Nice #openaccess review on the ecology of chemosynthetic symbioses from @chicaScientific & Guus Roeselers

Figure 1 from 10.1007/s00253-011-3819-9. Sediment cross section exposing the characteristic Y-shaped burrow dug by S. velum. Positioning itself at the triple junction of the Y, the bivalve alternates between actively pumping oxygenated water from the upper arms of the burrow through the mantle cavity and across the gills and accessing reduced sulfur compounds diffusing up from the anoxic zones below and pumped through a ventral incurrent opening in the mantle. Scale bar equals 2.5 cm.

For those who do not know, I got my first taste of microbiology research when I was an undergrad at Harvard and I did my senior/honors research project in the lab of Colleen Cavanaugh. Colleen studied (and in fact still studies) symbioses between invertebrates and chemosynthetic bacteria. The bacteria basically allow these invertebrates to function like plants in many ways. Some of these invertebrates (like the giant tube worms in hydrothermal vents) have lost their mouths and digestive systems and basically live by bringing in high energy chemicals for their symbionts which then make sugars, vitamins, amino acids and other goodies for the host.
Anyway – I am still very interested in these symbioses and have published a few papers on the topic here and there. All that lead in is to simply point everyone out there to a nice new Open Access review paper by Guus Roeselers and Irene Newton: On the evolutionary ecology of symbioses between chemosynthetic bacteria and bivalves. When I first saw the reference in the “Applied Microbiology and Biotechnology” journal I was worried I would not have access to it, but I clicked on the link and discovered it was published using Springer’s version of Open Access. Yippee.  The article is worth a look.

ResearchBlogging.org Roeselers, G., & Newton, I. (2012). On the evolutionary ecology of symbioses between chemosynthetic bacteria and bivalves Applied Microbiology and Biotechnology, 94 (1), 1-10 DOI: 10.1007/s00253-011-3819-9

In case you didn’t hear – #openness WON – Research Works Act shelved

See this story from infojustice.org: Research Works Act Shelved by Sponsors
This is just awesome news. The act was completely inane.
Hooray for Open Access and Open Science and Openness in general.

Calling on AAAS to Deposit all Archives of Science in Pubmed Central

Much has been written recently about a call to boycott Elsevier due to their outrageous policies regarding academic publishing.  I support the boycott but I also agree with many others who have said it perhaps unnecessarily singles out one publisher over others who also have publishing policies that could, well, use a bit of work.  And one such publisher is AAAS – the American Association for the Advancement of Science.

The annual meeting of AAAS begins today in Vancouver.  I was supposed to be there by now, but thanks to some technical problems at Alaska Airlines, I am back in Davis for the day.  AAAS has some policies regarding openness that I believe are unnecessary and not in the general interest of scientific progress.  One is the strange “talk embargoes” I have written about recently.  Another, which is much more problematic, is the fact that Science Magazine (published by AAAS) does not deposit archival content in Pubmed Central.  Now, mind you, I think all scientific publishing funded by taxpayer money should be openly and freely available immediately. But that is not going to happen immediately.

One helpful tool in making scientific literature freely available is Pubmed Central.  Most scientific societies I know of deposit published material in Pubmed Central after an initial delay of 3, 6, or 12 months.  But for reasons that are not entirely clear (to me at least, or to a Google search), AAAS clings to their archival material, making it available only through their own web site.  Sure – they do allow authors to deposit their version of their manuscripts in Pubmed Central after a delay.  But most, alas, do not do this.  And I note – this option is only open to NIH and Wellcome Trust funded work.  So much material cannot be deposited anyway.

AAAS’s policy seems unnecessarily closed accessy and limits the impact and spread of the knowledge contained within papers in Science.  I note – this policy is yet another reason to not publish in Science and to instead choose either fully open access journals or ones that at least release their stranglehold on the papers after a short delay.

Today I call on AAAS to make archival literature from Science Magazine available in Pubmed Central.  And I call on others out there, such as those at the AAAS meeting, to pressure AAAS to do this.  Write blog posts.  Call and email AAAS members and leadership.  Email AAAS.  And so on.

Ideally everyone would just publish in fully open access journals and the journals would deposit material in archives.  But until that happens, we need to make every effort to increase the amount of literature getting into Pubmed Central and other archives.  So – pressure AAAS.  And while everyone is at it, please deposit whatever you can in preprint servers, in various repositories and in Pubmed Central.  Every little bit helps.

Trolls and flames discuss #NotSoFunny satire at the Scholarly Kitchen

Bit of a tiff going on over at the Scholarly Kitchen over a “satire” someone named Ken Anderson wrote related to the Research Works Act. The piece was about the “Restaurant Works Act.” Someone pointed me to the post and I found the satire to be, well, unfunny so I chose to ignore it. My brother, alas, could not ignore it, nor could some others, and there is some discussion going on there now.

I will skip commenting on the discussion itself – go read it. But a few things there annoyed me. One of these is that Anderson has resorted to criticizing the punctuation of some of his critics there. That is pretty lame.  See start of thread below

Alex Merz wrote 

The inappropriateness of the analogy was clear by the end of paragraph 2. For the rest: TL;DR

To which Anderson responded

For those of us not as hip as Alex, TL DR means “too long, didn’t read.” I won’t comment on the inappropriateness of the semicolon in his Urban Dictionaryesque construction. The post is about 850 words, by the way

To which Alex re-responded

It is sad when an overly serious someone attempts a grammar or usage flame, and fails. 

“Too long; didn’t read” is both proper usage and a more effective construction than “too long, didn’t read.” 

Bryan Garner: “Fourth, the semicolon sometimes appears simply to give a weightier pause than a comma would. This use is discretionary. A comma would do, but the writer wants a stronger stop—e.g.: “There is never anything sexy about Lautrec’s art; but there also is never anything deliberately, sarcastically anti-feminist in it.” Aldous Huxley, “Doodles in the Dictionary” (1956), in Aldous Huxley: Selected Essays 198, 206 (1961).” 

Don’t be sad, though. Like you, a lot of smart people don’t know their way around a semicolon. 

If you’re too timid to wade into Fowler, Strunk & White, or Garner, there is help:
http://theoatmeal.com/comics/semicolon

 

To which Anderson responded

Nice try, but you don’t have any reason to use a semicolon there. In any event, I think you’re just covering up a typo with sophistry, so let’s move on.

To which Merz responded

You attempt a punctuation flame. When your own-goal is pointed out (with reference to authoritative sources) you mumble that your flame was correct (though it wasn’t), and you indicate that we should drop the discussion of punctuation that *you initiated.* 

Do you have *any* idea that makes you look? 

I’m guessing that you don’t: http://en.wikipedia.org/wiki/Dunning–Kruger_effect

and so on …

And before the folks at Scholarly Kitchen accuse me of having no sense of humor about such things – I suggest they look at my history of making fun of EVERYONE in publishing all the time.  The key to me is to be funny first and if you have some political comments you want to make, make them in that context.  I found the cooking / food RWA story to just not be funny so I did not pay any attention to its other messages.  Though clearly those messages bothered some folks, like my brother.

Storification of Fake Science Publishing @fakeelsevier @fakeplos @realelsevier @fakeeisen @closedaccessj

So – I have been enjoying all the Fake Scientific Publishing Posts on Twitter from @fakeelsevier @fakeplos @realelsevier @fakeeisen @closedaccessj and others.  Now I understand why some people think I am behind some of these (e.g., here are some of my Fake Science News posts).  But alas though I WISH I was behind some of these accounts, I am not.  Anyway – I created a storification of the beginning of some of these postings if you want to see some of the origins of the fakery.

View the story “Fake Scientific Publishing” on Storify: http://storify.com/phylogenomics/fake-scientific-publishing

New openaccess paper from my lab on "Zorro" software for automated masking of sequence alignments

A new Open Access paper from my lab was just published in PLoS One: Accounting For Alignment Uncertainty in Phylogenomics. Wu M, Chatterji S, Eisen JA (2012) Accounting For Alignment Uncertainty in Phylogenomics. PLoS ONE 7(1): e30288. doi:10.1371/journal.pone.0030288

The paper describes the software “Zorro” which is used for automated “masking” of sequence alignments.  Basically, if you have a multiple sequence alignment you would like to use to infer a phylogenetic tree, in some cases it is desirable to block out regions of the alignment that are not reliable.  This blocking is called “masking.”

Masking is thought by many to be important because sequence alignments are in essence a hypothesis about the common ancestry of specific residues in different genes/proteins/regions of the genome.  This “positional homology” is not always easy to assign and for regions where positional homology is ambiguous it may be better to ignore such regions when inferring phylogenetic trees from alignments.

Historically, masking has been done by hand/eye looking for columns in a multiple sequence alignment that seem to have issues and then either eliminating those columns or giving them a lower weight and using a weighting scheme in the phylogenetic analysis.

What Zorro does is remove much of the subjectivity of this process and generate automated masking patterns for sequence alignments.  It does this by assigning confidence scores to each column in a multiple sequence alignment. These scores can then be used to account for alignment accuracy in phylogenetic inference pipelines.
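To make the idea of score-based masking concrete, here is a minimal Python sketch.  It is not Zorro's algorithm and does not read Zorro's output format; it simply assumes you already have one confidence score per alignment column (the 0 to 1 scale, the threshold of 0.5, and the function name mask_alignment are all made up for illustration) and drops the columns that fall below the threshold.

```python
from typing import Dict, List


def mask_alignment(
    alignment: Dict[str, str],
    scores: List[float],
    threshold: float = 0.5,
) -> Dict[str, str]:
    """Return a copy of the alignment keeping only confidently aligned columns.

    alignment: sequence name -> aligned sequence (all the same length).
    scores:    one hypothetical confidence score per column (assumed 0..1).
    threshold: columns scoring below this value are masked (removed).
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1 or lengths.pop() != len(scores):
        raise ValueError("sequences and scores must all have the same length")

    # Indices of the columns we trust enough to keep.
    keep = [i for i, score in enumerate(scores) if score >= threshold]

    return {
        name: "".join(seq[i] for i in keep)
        for name, seq in alignment.items()
    }


# Toy example: the third column (index 2) has a low score and is masked.
toy_alignment = {"seqA": "MK-LV", "seqB": "MKALV", "seqC": "MR-LI"}
toy_scores = [0.9, 0.8, 0.2, 0.95, 0.7]
masked = mask_alignment(toy_alignment, toy_scores)
for name, seq in masked.items():
    print(name, seq)
```

A hard cutoff like this is only one option.  The same per-column scores could instead be passed along as column weights to a phylogenetic program that supports weighting, which matches the weighting approach described above for manual masking.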

The software is available at Sourceforge: ZORRO – probabilistic masking for phylogenetics.  It was written primarily by Martin Wu (who is now a Professor at the University of Virginia) and Sourav Chatterji with a little help here and there from Aaron Darling I think.  The development of Zorro was part of my “iSEEM” project that was supported by the Gordon and Betty Moore Foundation.

In the interest of sharing, since the paper is fully open access, I am posting it here below the fold. UPDATE 2/9 – decided to remove this since it got in the way of getting to the comments …

Interesting new metagenomics paper w/ one big big big caveat – critical software not available

Very very strange.  There is an interesting new metagenomics paper that has come out in Science this week.  It is titled “Untangling Genomes from Metagenomes: Revealing an Uncultured Class of Marine Euryarchaeota” and it is from the Armbrust lab at U. Washington.

One of the main points of this paper is that the lab has developed software that apparently can help assemble the complete genomes of organisms that are present in low abundance in a metagenomic sample.  At some point I will comment on the science in the paper (which seems very interesting), though as the paper is not Open Access I feel uncomfortable doing so since many of the readers of this blog will not be able to read it.

But something else relating to this paper is worth noting and it is disturbing to me.  In a Nature News story on the paper by Virginia Gewin there is some detail about the computational method used in the paper:

“He developed a computational method to break the stitched metagenome into chunks that could be separated into different types of organisms. He was then able to assemble the complete genome of Euryarchaeota, even though it was rare within the sample. He plans to release the software over the next six months.”

What?  It is imperative that software that is so critical to a publication be released in association with the paper.  It is really unacceptable for the authors to say “we developed a novel computational method” and then to say “we will make it available in six months”.  I am hoping the authors change their mind on this but I find it disturbing that Science would allow publication of a paper highlighting a new method and then not have the method be available.  If the methods and results in a paper are not usable how can one test/reproduce the work?