VenomKB, a new knowledge base for facilitating the validation of putative venom therapies

Interview on VenomKB and open data

Margaux Phares, a Master of Science student in Comparative Media Studies at MIT, asked me a few questions regarding VenomKB [1] for an article she was writing. VenomKB catalogs therapeutic uses of animal venoms. With permission, I've posted Margaux's questions and my replies below. Our email conversation took place on December 3, 2015.

As a user, what have you gotten out of VenomKB so far?

My research integrates public data to predict new uses for existing drugs. One big struggle has been compiling a machine-readable catalog of which drugs treat which diseases. This information is essential, since our computational method requires a "gold standard" of efficacious therapies to learn how drugs work.

So I'm super excited when anyone releases open data on a broad range of diseases. I hadn't given much thought to including venom therapies, but the VenomKB paper convincingly argues why venoms are a pharmacological treasure trove.

I don't have any immediate plans to incorporate the VenomKB data, but hope to in the future, especially as the database matures. For example, it would help if the venoms and effects were converted to standardized terminologies, which provide the structure needed for machine learning.

Where do you see the field of bioinformatics and this sort of open source accessibility in science going?

I'm a big believer that the best thing to do with your data will be thought of by someone else. In bioinformatics, your code and data will often be much more influential than your results. Scientists have been slow to adapt to this reality.

I was happy to see VenomKB released openly prior to applied follow-up. In the past, most exciting new data resources have been coupled with high-impact analyses by the same group. This coupling delays the availability of the data and generally leads to lower-quality work, because components that require separate skillsets and evaluations are bundled together.

Going forward, I expect open data will become the norm, because those who don't make open data will fade into irrelevance. My own project has exposed the problematic nature of data copyright and the barriers to using resources that are not openly licensed. As data becomes a first-class scientific output, researchers will favor the data that allows them to research without restriction.

Cite this as
Daniel Himmelstein (2016) Interview on VenomKB and open data. Thinklab. doi:10.15363/thinklab.d151

Creative Commons License