Project:
Rephetio: Repurposing drugs on a hetnet [rephetio]

Publication:
Uberon, an integrative multi-species anatomy ontology

Tissue Node


Instead of using the BRENDA Tissue Ontology, I would suggest using Uberon for anatomy, which incorporates CL. These ontologies provide greater breadth of anatomy and cell types for various species. In addition, I think that with the integration of Disease Ontology into EFO you can add more data sources to this network as well as links to further experiments.

ENCODE makes use of Uberon; you can read more in [1].

  • Daniel Himmelstein: @vsmalladi, you may consider switching the associated publication to the Uberon paper [1] and moving the ENCODE use-case paper [2] to an inline citation.

    1. Uberon, an integrative multi-species anatomy ontology
       Christopher J Mungall, Carlo Torniai, Georgios V Gkoutos, Suzanna E Lewis, Melissa A Haendel (2012) Genome Biol. doi:10.1186/gb-2012-13-1-r5
    2. Ontology application and use at the ENCODE DCC
       V. S. Malladi, D. T. Erickson, N. R. Podduturi, L. D. Rowe, E. T. Chan, J. M. Davidson, B. C. Hitz, M. Ho, B. T. Lee, S. Miyasato, G. R. Roe, M. Simison, C. A. Sloan, J. S. Strattan, F. Tanaka, W. J. Kent, J. M. Cherry, E. L. Hong (2015) Database. doi:10.1093/database/bav010
  • Daniel Himmelstein: @jspauld, see the citation display failure in the previous note.

  • Jesse Spaulding: @dhimmel Yeah, the inline citations don't work in the notes right now. I suspect it will be rare that people will try to use them in notes given that most of the substantive conversation is intended to take place in comments. So, for the time being I think I'll wait and see how much demand there will be. [Edit: After speaking with @dhimmel offline he has convinced me this was a bug. Inline note citations now work!]

  • Jesse Spaulding: With regard to the associated publication — at the present moment this cannot be changed by the user. @vsmalladi, if you agree it makes sense to change it please just let me know and I'll take care of it.

  • Venkat Malladi: @jspauld yes please update the citation

  • Jesse Spaulding: Updated. If you'd like to do an inline citation to the original DOI you were referencing just insert the following in your markdown: [@10.1093/database/bav010]

    1. Ontology application and use at the ENCODE DCC
       V. S. Malladi, D. T. Erickson, N. R. Podduturi, L. D. Rowe, E. T. Chan, J. M. Davidson, B. C. Hitz, M. Ho, B. T. Lee, S. Miyasato, G. R. Roe, M. Simison, C. A. Sloan, J. S. Strattan, F. Tanaka, W. J. Kent, J. M. Cherry, E. L. Hong (2015) Database. doi:10.1093/database/bav010

Hi @vsmalladi, thanks for the Uberon [1] suggestion.

The project has good documentation, a nice user interface, and is actively maintained. These are three important features when choosing an ontology (and areas where the BRENDA Tissue Ontology (BTO) [2] sometimes lags behind).

In our previous network [3], we had under 100 tissues and only used BTO as a common vocabulary. Since the current project is in an early stage, it is difficult to know whether we will take full advantage of the rich structure and cross-referencing provided by next-generation ontologies. However, I agree it makes sense to build an extensible and forward-thinking network and Uberon will assist in these pursuits. We also prefer ontologies that will have the widest adoption and are free of restrictive licensing. Do you know Uberon's license?

I noticed Uberon includes mappings to other ontologies, called bridges. A BTO bridge doesn't currently exist, but perhaps we could contribute one for the 77 BTO terms (download link) used in our disease network.

Hi @dantericci, I believe Uberon has no restrictive licensing and is open to use.

As for adoption, many big projects and consortia have adopted Uberon, such as EBI and other adopters.

Uberon does have cross-references to BTO, so I don't think we need to make a bridge.

I updated the proposal to replace BTO with Uberon.

As for the BTO mapping, I didn't see any bridges on the GitHub and the BTO bridge download produces an error. Am I looking in the wrong place or are the BTO mappings not populated yet?

@dhimmel Within the Uberon file there are DbXrefs. For example:

<owl:Class rdf:about="http://purl.obolibrary.org/obo/UBERON_0000002">
    <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">uterine cervix</rdfs:label>
    <oboInOwl:hasDbXref rdf:datatype="http://www.w3.org/2001/XMLSchema#string">BTO:0001421</oboInOwl:hasDbXref>
    <oboInOwl:hasDbXref rdf:datatype="http://www.w3.org/2001/XMLSchema#string">BTO:0002249</oboInOwl:hasDbXref>
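
For readers who want to pull these cross-references out programmatically, here is a minimal sketch using Python's standard library. It is only an illustration: the inline `SAMPLE` document is a hand-completed version of the fragment above (the closing tags and the `rdf:RDF` wrapper are added so that it parses), and the function name `bto_xrefs` is hypothetical. For the full Uberon release a streaming parser or a dedicated ontology library would likely be more practical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used in OBO-style OWL files.
NS = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "oboInOwl": "http://www.geneontology.org/formats/oboInOwl#",
}

# Small stand-in for the real file; wrapper and closing tags are assumptions.
SAMPLE = """<rdf:RDF
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:oboInOwl="http://www.geneontology.org/formats/oboInOwl#">
  <owl:Class rdf:about="http://purl.obolibrary.org/obo/UBERON_0000002">
    <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">uterine cervix</rdfs:label>
    <oboInOwl:hasDbXref rdf:datatype="http://www.w3.org/2001/XMLSchema#string">BTO:0001421</oboInOwl:hasDbXref>
    <oboInOwl:hasDbXref rdf:datatype="http://www.w3.org/2001/XMLSchema#string">BTO:0002249</oboInOwl:hasDbXref>
  </owl:Class>
</rdf:RDF>"""


def bto_xrefs(owl_text):
    """Return {term id: [BTO xrefs]} for every class that has a BTO xref."""
    root = ET.fromstring(owl_text)
    about_attr = "{%s}about" % NS["rdf"]
    mapping = {}
    for cls in root.iterfind("owl:Class", NS):
        iri = cls.get(about_attr)
        if iri is None:
            continue  # skip anonymous classes
        # Turn ".../UBERON_0000002" into "UBERON:0000002".
        term_id = iri.rsplit("/", 1)[-1].replace("_", ":", 1)
        xrefs = [
            x.text
            for x in cls.iterfind("oboInOwl:hasDbXref", NS)
            if x.text and x.text.startswith("BTO:")
        ]
        if xrefs:
            mapping[term_id] = xrefs
    return mapping


print(bto_xrefs(SAMPLE))
```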

Hi @vsmalladi, we have proceeded with Uberon and incorporated the MeSH cross-references. Specifically, we identified disease–anatomy localization using literature mining.

We would like a way to restrict terms to structures in humans. Does anyone know how to implement a species filter?

We would also like to incorporate the Cell Ontology (CL) for cell information [1]. However, there is a MeSH xref issue that will need to be remedied first.

@dhimmel You can restrict structures by NCBI taxon ID. Human would be Taxon:9606.

Can you elaborate on the MeSH issue? I might be able to help.

Restricting to human terms

@vsmalladi, thanks for the pointer. In the obo header I see:

treat-xrefs-as-reverse-genus-differentia: DHBA part_of NCBITaxon:9606
treat-xrefs-as-reverse-genus-differentia: EHDAA2 part_of NCBITaxon:9606
treat-xrefs-as-reverse-genus-differentia: FMA part_of NCBITaxon:9606
treat-xrefs-as-reverse-genus-differentia: HBA part_of NCBITaxon:9606
treat-xrefs-as-reverse-genus-differentia: HsapDv part_of NCBITaxon:9606

Therefore, I speculate the best way to identify human-applicable terms would be to identify all terms with a cross-reference to the above resources, plus all broader terms in the hierarchy. Is that what you suggest?

@dhimmel

Actually what you want to do is for each term filter on present_in_taxon:

property_value: present_in_taxon NCBITaxon:7777

@vsmalladi, the obo only contains relationship: present_in_taxon NCBITaxon:9606 for four terms. Is there a more extensive listing of present_in_taxon relationships that you are aware of?

Meanwhile the results of the treat-xrefs-as-reverse-genus-differentia method look satisfactory. Terms were annotated as human (in_human = 1) if they or any terms they subsumed (along is_a, part_of, and develops_from relationships) contained a cross-reference to any of the following resources: DHBA, EHDAA2, FMA, HBA, and HsapDv (notebook). Most terms where in_human = 0 are not appropriate for humans.
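
The propagation described above can be sketched roughly as follows. This is not the code from the linked notebook; it is a small illustration that assumes the ontology has already been loaded into two plain dictionaries (a hypothetical layout), and the term ids and xref values in the toy data are made up.

```python
from collections import deque

# Resources treated as human-specific in the obo header.
HUMAN_PREFIXES = {"DHBA", "EHDAA2", "FMA", "HBA", "HsapDv"}


def human_positive_evidence(xrefs, parents):
    """Return the set of terms with in_human = 1.

    xrefs:   term -> list of cross-references such as "FMA:0000"
    parents: term -> list of broader terms reached along is_a,
             part_of, or develops_from
    """
    # Seed with every term that directly carries a human-resource xref.
    seeds = [
        term
        for term, refs in xrefs.items()
        if any(ref.split(":", 1)[0] in HUMAN_PREFIXES for ref in refs)
    ]
    in_human = set(seeds)
    queue = deque(seeds)
    # Walk upward so that every broader term is also marked as human.
    while queue:
        term = queue.popleft()
        for parent in parents.get(term, ()):
            if parent not in in_human:
                in_human.add(parent)
                queue.append(parent)
    return in_human


# Toy data: ids and xrefs are placeholders, not real Uberon content.
toy_xrefs = {
    "T:leaf_a": ["FMA:0000"],
    "T:leaf_b": ["ZFA:0000"],
    "T:mid": [],
    "T:root": [],
    "T:other": ["ZFA:0001"],
}
toy_parents = {
    "T:leaf_a": ["T:mid"],
    "T:leaf_b": ["T:other"],
    "T:mid": ["T:root"],
    "T:other": ["T:root"],
}

print(sorted(human_positive_evidence(toy_xrefs, toy_parents)))
```

Matching on the exact prefix before the colon avoids treating a DHBA xref as an HBA xref by accident.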

So you probably don't want structures that are uniquely human or that evolved after the human-chimp common ancestor - there are probably only a handful of these, e.g. certain minor glands and brain regions.

You probably want structures that are typically present in humans but not necessarily absent from other species. There are two ways to go about answering this, based on your tolerance for accidentally including a non-human structure vs accidentally excluding a human structure.

  1. List all structures except those that are known not to be found in humans.
  2. List structures for which there is evidence that the structure is found in humans.

Currently we are well geared up for answering (1), but you have to tolerate the occasional inclusion of some obscure brain region that was only actually observed in macaques or mice. We call these 'taxon modules'. We are not well geared up for (2) but this could be prioritized. Using cross-references to DHBA, EHDAA2, FMA, HBA, and HsapDv is a good start but you may still miss some things.

Either way, you need to do something more specific than just look up the direct properties of the class. You need inference over both the anatomical graph, and the taxonomy graph. For (1) you need to make use of negative evidence, which intuitively 'reverses the flow' of inference. So if a larval stage is never found in amniotes, then any structure that necessarily exists at the larval stage is never found in any descendant of amniotes.
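
To make the 'reversed flow' concrete, here is a toy sketch of how a negative taxon constraint could propagate over both graphs. It is a deliberate simplification and not how an OWL reasoner works internally: every relationship between structures is treated as a necessary dependency, and the dictionary layout, function names, and toy taxa and structures are all hypothetical.

```python
def ancestors(node, parents):
    """Return all nodes reachable from node by following parent links."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def excluded_from_taxon(structure, taxon, never_in, anatomy_parents, taxon_parents):
    """True if negative evidence rules the structure out for the taxon.

    never_in:        structure -> set of taxa it is never found in
    anatomy_parents: structure -> structures it necessarily depends on
    taxon_parents:   taxon -> parent taxa
    """
    # A constraint on any ancestor taxon also applies to the taxon itself.
    lineage = {taxon} | ancestors(taxon, taxon_parents)
    # A constraint on any structure we depend on also applies to us.
    for candidate in {structure} | ancestors(structure, anatomy_parents):
        if never_in.get(candidate, set()) & lineage:
            return True
    return False


# Toy graphs (illustrative names only).
toy_taxa = {"human": ["amniotes"], "amniotes": ["vertebrates"], "frog": ["vertebrates"]}
toy_anatomy = {"larval structure": ["larval stage"]}
toy_never_in = {"larval stage": {"amniotes"}}

print(excluded_from_taxon("larval structure", "human", toy_never_in, toy_anatomy, toy_taxa))
print(excluded_from_taxon("larval structure", "frog", toy_never_in, toy_anatomy, toy_taxa))
```

A structure with no applicable negative evidence is never excluded, which matches the tolerance described for (1): some non-human structures will slip through.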

You'd be better off using an OWL reasoner or some existing tooling for this.

Some background on the taxon axioms:
https://github.com/obophenotype/uberon/wiki/Taxon-constraints

Hmm the taxon subsets files could do with better documentation:

http://uberon.github.io/downloads.html#subsets

(these follow (1) above)

We should really make a ready-made human subset here. It would probably be more popular than the Aves one.

These are built using owltools

owltools --use-catalog ext.owl --reasoner elk --make-species-subset -t NCBITaxon:9606 --remove-dangling --assert-inferred-subclass-axioms --useIsInferred --remove-dangling --set-ontology-id $(OBO)/uberon/subsets/human.owl -o human.owl

Human constraint: positive versus no negative evidence

@chrismungall, could not have hoped for a more informative response!

I am going to refer to your two methods as:

  1. no negative evidence, which should include all human structures and some non-human structures
  2. positive evidence, which should include some human structures and exclude all non-human structures

I created a comparison of the two methods for Uberon terms, using @fbastian's implementation of @chrismungall's owltools command above for (1) and my implementation of positive evidence for (2). My implementation has not been vetted, and if there is an owltools command for this functionality, we should switch.

I created two tsv files: one with all uberon terms and another with only MeSH-mapping uberon terms.

Since I'm primarily concerned with terms in MeSH, I looked through MeSH-mapping uberon terms with no positive evidence (positive_evidence == 0) and also no negative evidence (no_negative_evidence == 1). Looking through this subset, there were a few notable terms where we would like negative evidence to exist:

uberon_id       uberon_name
UBERON:0013196  strand of wool
UBERON:0002415  tail
UBERON:0007113  venom
UBERON:0005079  eggshell
UBERON:0001011  hemolymph
UBERON:0006378  strand of vibrissa hair
UBERON:0004758  salt gland
UBERON:0011123  stifle joint

However, overall the no negative evidence method appeared to have higher accuracy than the positive evidence method. Therefore, we will proceed using no negative evidence, which should improve over time as Uberon matures.
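
For anyone wanting to repeat this kind of check, the subset where the two methods disagree can be pulled out of the comparison tsv with a few lines of Python. The sketch below assumes columns named uberon_id, uberon_name, positive_evidence, and no_negative_evidence; the inline sample is illustrative, and its third row is a made-up placeholder rather than real data.

```python
import csv
import io

# Stand-in for the comparison file; the EX: row is a placeholder.
SAMPLE_TSV = (
    "uberon_id\tuberon_name\tpositive_evidence\tno_negative_evidence\n"
    "UBERON:0002415\ttail\t0\t1\n"
    "UBERON:0007113\tvenom\t0\t1\n"
    "EX:0000001\texample structure\t1\t1\n"
)


def disagreements(tsv_text):
    """Return ids with no positive evidence but also no negative evidence."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row["uberon_id"]
        for row in reader
        if row["positive_evidence"] == "0" and row["no_negative_evidence"] == "1"
    ]


print(disagreements(SAMPLE_TSV))
```

These are the terms worth reviewing by hand, since they are kept by the no negative evidence method but lack a human-specific cross-reference.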

Nomenclature

We have been referring to the metanode (node type) for Uberon as Tissue. Is Tissue a misnomer? Does Tissue encompass all Uberon nodes? If we add CL terms under the same metanode, what term can we use that encompasses both Uberon terms and cell types?

A single word is preferred to a compound term. Some options I can think of are

  • Tissue
  • Anatomy
  • Structure
  • Anatomical Structure

@chrismungall, do you have an inclination?

Tissue is too specific, but people will still know what you mean. Uberon follows the CARO upper-level ontology.

So formally speaking 'anatomical structure' would exclude non-material entities such as lumens, 2D boundaries... but you should never have gene expression in any of these sites

"formally speaking 'anatomical structure' would exclude non-material entities such as lumens, 2D boundaries"

Nodes that don't have expression may still be connected via a disease localization edge. These edges are currently created via MEDLINE cooccurrence and can connect any uberon or CL terms that map to MeSH.

I am leaning towards calling the metanode Anatomy. We would end up with metapaths (path types) like: Compound–target–Gene–expression–Anatomy–localization–Disease (abbreviated as CtGeAlD). This metapath refers to when a compound targets a gene that is expressed in an anatomy/tissue/cell-type where the disease is localized. Does that seem like a misuse of the word Anatomy?
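
The abbreviation scheme in the example above (uppercase initial for metanodes, lowercase initial for metaedges) can be sketched as a small helper. This is only an illustration of the single-letter case shown here; the function name is hypothetical, and a real scheme may need longer abbreviations when two node or edge types share an initial.

```python
def abbreviate_metapath(metapath, sep="–"):
    """Abbreviate a metapath written as Node–edge–Node–...–Node.

    Parts at even positions are metanodes (uppercase initial);
    parts at odd positions are metaedges (lowercase initial).
    """
    parts = [part.strip() for part in metapath.split(sep)]
    if len(parts) % 2 == 0 or any(not part for part in parts):
        raise ValueError("expected alternating nodes and edges, starting and ending with a node")
    letters = []
    for index, part in enumerate(parts):
        initial = part[0]
        letters.append(initial.upper() if index % 2 == 0 else initial.lower())
    return "".join(letters)


print(abbreviate_metapath("Compound–target–Gene–expression–Anatomy–localization–Disease"))
```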

Set of anatomy nodes

We have settled on 402 Uberon terms to use as our anatomy vocabulary (tsv, notebook).

We included terms that met the following conditions:

  • were in the uberon_slim subset.
  • were not in the non_informative, upper_level, or grouping_class subsets. See this related GitHub issue.
  • contained a MeSH cross-reference
  • were human-relevant based on the no negative evidence standard.
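
The four conditions above amount to a simple per-term filter. The sketch below is not the code from the linked notebook: the record layout, the `MESH:` xref prefix, and the toy terms are assumptions made for illustration.

```python
# Subsets whose members are dropped from the vocabulary.
EXCLUDED_SUBSETS = {"non_informative", "upper_level", "grouping_class"}


def keep_term(term):
    """Apply the four inclusion conditions to one term record.

    term is a dict with hypothetical keys:
      subsets              list of subset names
      xrefs                list of cross-references
      no_negative_evidence True if the term passed the human filter
    """
    subsets = set(term["subsets"])
    return bool(
        "uberon_slim" in subsets
        and not (subsets & EXCLUDED_SUBSETS)
        and any(xref.startswith("MESH:") for xref in term["xrefs"])
        and term["no_negative_evidence"]
    )


# Toy records (placeholders, not real Uberon content).
toy_terms = [
    {"id": "T:1", "subsets": ["uberon_slim"], "xrefs": ["MESH:X1"], "no_negative_evidence": True},
    {"id": "T:2", "subsets": ["uberon_slim", "upper_level"], "xrefs": ["MESH:X2"], "no_negative_evidence": True},
    {"id": "T:3", "subsets": ["uberon_slim"], "xrefs": ["FMA:0000"], "no_negative_evidence": True},
    {"id": "T:4", "subsets": ["uberon_slim"], "xrefs": ["MESH:X4"], "no_negative_evidence": False},
]

print([term["id"] for term in toy_terms if keep_term(term)])
```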

We chose a restrictive subset of Uberon terms because the vast extent of tissue-specific gene expression edges can become computationally troubling. We did not include cell types from the Cell Ontology because this ontology lags behind Uberon in terms of subset assignments, cross-references, and documentation.

 
Status: Completed
Cite this as
Venkat Malladi, Daniel Himmelstein, Chris Mungall (2015) Tissue Node. Thinklab. doi:10.15363/thinklab.d41
License

Creative Commons License
