OBO Ontology Editing Guide
This skill provides guidance and tools for editing ontologies in OBO format.
Project Layout Conventions
Most OBO ontologies follow a similar structure:
- Main development file is typically src/ontology/{ontology}-edit.obo
- Individual terms can be checked out to the terms/ directory for editing
- Some projects may have different layouts - check the project's documentation
Querying Ontology Terms
Use the obo-grep.pl script for searching OBO files:
- Look at a specific term by ID:
    obo-grep.pl --noheader -r 'id: ONTO:0004177' src/ontology/{ontology}-edit.obo
- All mentions of an ID:
    obo-grep.pl --noheader -r 'ONTO:0004177' src/ontology/{ontology}-edit.obo
- Search by regex (e.g., all mentions of hand or foot):
    obo-grep.pl --noheader -r '(hand|foot)' src/ontology/{ontology}-edit.obo
- Search is much faster than full file reads
- ONLY search the main edit file (usually src/ontology/{ontology}-edit.obo)
- DO NOT do manual greps or read entire files unless necessary
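To make the idea of stanza-level searching concrete, here is a minimal Python sketch that returns whole stanzas matching a regex. It is an illustration only and not the implementation of obo-grep.pl. The function names and sample text are hypothetical, and it assumes stanzas are separated by blank lines and that skipping the header is what --noheader does.

```python
import re
from typing import List

# Hypothetical sample file; IDs and names are placeholders.
SAMPLE_OBO = """format-version: 1.2
ontology: onto

[Term]
id: ONTO:0004177
name: hand disease

[Term]
id: ONTO:0004178
name: foot disease
relationship: related_to ONTO:0004177 ! hand disease
"""


def split_blocks(obo_text: str) -> List[str]:
    """Split OBO text into blank-line-separated blocks (header first, then stanzas)."""
    return [b.strip() for b in re.split(r"\n\s*\n", obo_text) if b.strip()]


def grep_stanzas(obo_text: str, pattern: str, noheader: bool = True) -> List[str]:
    """Return every whole block whose text matches the regex."""
    regex = re.compile(pattern)
    hits = []
    for block in split_blocks(obo_text):
        # Stanzas start with a bracketed type such as [Term]; the header does not.
        if noheader and not block.startswith("["):
            continue
        if regex.search(block):
            hits.append(block)
    return hits
```

Searching for 'id: ONTO:0004177' returns only the defining stanza, while searching for 'ONTO:0004177' also returns stanzas that merely reference the term.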
Before Making Edits
- Read the request carefully and make a plan, especially if there is nuance
- If a PMID is mentioned, try to read it using:
    aurelian fulltext PMID:NNNNNN
- This also works for DOIs and URLs for scientific papers (if accessible)
- ALWAYS check proposed parent terms for consistency
- Check project-specific guidelines if available
Editing Workflow
IMPORTANT: Use Checkout/Checkin for Large Files
- Do not edit large ontology files directly
- Use the checkout/checkin workflow for individual terms
- Check out a term:
    obo-checkout.pl src/ontology/{ontology}-edit.obo ONTO:1234567 [OTHER_IDS]
- This creates a single stanza file: terms/{ontology}_1234567.obo (note: colon replaced with underscore)
- Edit the small file in the terms/ folder
- Check back in:
    obo-checkin.pl src/ontology/{ontology}-edit.obo ONTO:1234567 [OTHER_IDS]
- Checking in updates the edit file and removes the file from terms/
- You can edit multiple terms in one batch file if needed
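If the checkout/checkin steps are driven from a script, the argument lists and the expected stanza path can be built as sketched below. This is a hedged sketch: the helper names are hypothetical, and the path rule simply follows the terms/{ontology}_1234567.obo template described above, with the ontology name passed in explicitly rather than derived from the ID.

```python
import re
from pathlib import PurePosixPath
from typing import List, Sequence

# Assumed ID shape: an alphanumeric prefix, a colon, then digits.
ID_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9_]*:\d+$")


def stanza_file(ontology: str, term_id: str, terms_dir: str = "terms") -> PurePosixPath:
    """Path where a checked-out term is expected, per the template above."""
    if not ID_PATTERN.match(term_id):
        raise ValueError(f"not a valid term ID: {term_id!r}")
    local_id = term_id.split(":", 1)[1]
    return PurePosixPath(terms_dir) / f"{ontology}_{local_id}.obo"


def checkout_command(edit_file: str, term_ids: Sequence[str]) -> List[str]:
    """Argument list for obo-checkout.pl: edit file first, then one or more IDs."""
    return ["obo-checkout.pl", edit_file, *term_ids]


def checkin_command(edit_file: str, term_ids: Sequence[str]) -> List[str]:
    """Argument list for obo-checkin.pl, using the same argument order."""
    return ["obo-checkin.pl", edit_file, *term_ids]
```

The returned lists can be passed to subprocess.run(cmd, check=True), so a failed checkout or checkin stops the script instead of passing silently.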
Scripts Available
This skill includes three essential scripts:
- obo-grep.pl: Fast searching of OBO files
- obo-checkout.pl: Extract terms to individual files for editing
- obo-checkin.pl: Merge edited terms back into main file
All scripts are available in your PATH when this skill is loaded.
OBO Format Guidelines
Basic Structure
- Term ID format: ONTO:NNNNNNN (check project conventions for number of digits)
- Each term requires:
  - id: unique identifier
  - name: human-readable label
  - namespace: ontology namespace
  - def: definition with references in square brackets
- Use standard relationship types: is_a, part_of, has_part, etc.
- Follow existing term patterns for consistency
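The required-tag list above can be checked mechanically before checking a term back in. The following is a minimal sketch, not a full OBO parser: the function name is hypothetical, the namespace value in the sample is a placeholder, and each tag is assumed to sit on its own line.

```python
from typing import List

# Tags this guide lists as required for every term.
REQUIRED_TAGS = ("id", "name", "namespace", "def")


def missing_required_tags(stanza: str) -> List[str]:
    """Return the required tags that do not appear in a single stanza."""
    present = set()
    for line in stanza.splitlines():
        line = line.strip()
        # Skip blank lines, the [Term] header, and ! comment lines.
        if not line or line.startswith(("[", "!")) or ":" not in line:
            continue
        present.add(line.split(":", 1)[0].strip())
    return [tag for tag in REQUIRED_TAGS if tag not in present]


# Hypothetical stanzas with placeholder IDs and values.
COMPLETE_TERM = """[Term]
id: ONTO:0000715
name: specific disease
namespace: disease_ontology
def: "A general disease that involves a specific location." [PMID:12345678]
is_a: ONTO:0001082 ! general disease"""

INCOMPLETE_TERM = """[Term]
id: ONTO:0000716
name: another disease
is_a: ONTO:0001082 ! general disease"""
```

An empty list means the stanza carries all four tags; it says nothing about whether the values themselves are sensible.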
Handling New Term Requests (NTRs)
- Check project conventions for temporary ID ranges
- Example: Some projects use ranges like ONTO:777xxxx for new terms
- Always check for ID clashes:
    grep 'id: ONTO:777' src/ontology/{ontology}-edit.obo
- NEVER guess ontology IDs - use search tools to find actual terms
- NEVER guess PMIDs for references - do web searches if needed
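The clash check can also be scripted. The sketch below collects the IDs already in use and picks the first unused one in a temporary range. The 7770000 to 7779999 range only mirrors the ONTO:777xxxx example above and is an assumption, as are the function names; substitute whatever range the project actually reserves.

```python
import re
from typing import Set


def used_ids(obo_text: str, prefix: str = "ONTO") -> Set[str]:
    """Collect IDs declared on id: or alt_id: lines for the given prefix."""
    pattern = rf"^(?:alt_)?id:\s*({re.escape(prefix)}:\d+)"
    return set(re.findall(pattern, obo_text, flags=re.MULTILINE))


def next_free_id(
    obo_text: str,
    prefix: str = "ONTO",
    start: int = 7770000,  # assumed temporary range, taken from the example above
    end: int = 7779999,
) -> str:
    """Return the first ID in [start, end] that is not already declared."""
    taken = used_ids(obo_text, prefix)
    width = len(str(end))  # zero-pad to the width of the range's upper bound
    for number in range(start, end + 1):
        candidate = f"{prefix}:{number:0{width}d}"
        if candidate not in taken:
            return candidate
    raise RuntimeError("temporary ID range is exhausted")
```

This only guards against clashes within the file it is given; IDs reserved elsewhere (for example in open pull requests) still need a manual check.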
Citations and References
- Cite publications appropriately:
    def: "..." [PMID:nnnn, doi:mmmm]
- Fetch full text when needed (also works with DOIs and URLs):
    aurelian fulltext <PMID:nnn>
- All synonyms should include proper citations
- Never use empty brackets [] without a source
Synonyms
Synonyms should include proper attribution:
Correct:
synonym: "alternative name" EXACT [PMID:12345678]
synonym: "abbrev" EXACT ABBREVIATION [PMID:12345678]
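The rule against empty brackets on definitions and synonyms is easy to check automatically. Here is a small sketch under simple assumptions: the function name is hypothetical, each tag sits on one line, and the source list follows the last double quote on the line.

```python
import re
from typing import List

EMPTY_SOURCE = re.compile(r"\[\s*\]")


def unsourced_lines(stanza: str) -> List[str]:
    """Return def: and synonym: lines whose source list is empty."""
    flagged = []
    for line in stanza.splitlines():
        if line.startswith(("def:", "synonym:")):
            # Look only after the closing quote so brackets inside the text are ignored.
            after_quote = line.rsplit('"', 1)[-1]
            if EMPTY_SOURCE.search(after_quote):
                flagged.append(line)
    return flagged


# Hypothetical stanza: the first synonym is missing its citation.
EXAMPLE_STANZA = """[Term]
id: ONTO:0000715
name: specific disease
def: "A disease [] of a location." [PMID:12345678]
synonym: "alternative name" EXACT []
synonym: "abbrev" EXACT ABBREVIATION [PMID:12345678]"""
```

Any flagged line needs a verified PMID, DOI, or other source before the term is checked in.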
Relationships and Logical Definitions
- All terms should have at least one is_a parent
- Logical definitions follow genus-differentia form
- Text definitions should mirror logical definitions
- Include source attribution for relationships when based on literature
Logical Definitions (intersection_of)
Example of proper intersection_of usage:
[Term]
id: ONTO:0000715
name: specific disease
def: "A general disease that involves a specific location." [PMID:12345678]
is_a: ONTO:0001082 ! general disease
intersection_of: ONTO:0001082 ! general disease
intersection_of: disease_has_location UBERON:0000029 ! specific location
Note that in OWL this corresponds to: 'specific disease' EquivalentTo 'general disease' and 'disease has location' some 'specific location'
Obsoleting Terms
- Obsolete terms should have NO logical axioms (is_a, relationship, intersection_of)
- Obsolete terms may have one replaced_by tag (exact replacement)
- Or multiple consider tags (suggested alternatives)
- Always include obsolescence reason and tracker reference
Example of simple obsolescence:
[Term]
id: ONTO:0100334
name: obsolete term name
property_value: IAO:0000231 OMO:0001000
property_value: IAO:0000233 "https://github.com/{project}/issues/XXXX" xsd:anyURI
is_obsolete: true
replaced_by: ONTO:0100321
Example with considerations instead of replacement:
[Term]
id: ONTO:0100229
name: obsolete term name
def: "OBSOLETE. Original definition here." [original references]
property_value: IAO:0000231 OMO:0001000
property_value: IAO:0000233 "https://github.com/{project}/issues/XXXX" xsd:anyURI
is_obsolete: true
consider: ONTO:0100259
consider: ONTO:0100260
Important Notes on Obsolescence
- Synonyms and xrefs can be migrated to replacement terms judiciously
- Never do complete merges with alt_id - use obsolescence with replacement instead
- No relationships should point to an obsolete term
- When obsoleting, you may need to rewire other terms to "skip" the obsoleted term
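The obsolescence rules in this section can be turned into a quick self-check on an edited stanza. The sketch below is illustrative only: the function names and message wording are hypothetical, and it relies on the IAO:0000231 (reason) and IAO:0000233 (tracker) property values shown in the examples above.

```python
from typing import List, Tuple

LOGICAL_TAGS = ("is_a", "relationship", "intersection_of")


def parse_tags(stanza: str) -> List[Tuple[str, str]]:
    """Split a stanza into (tag, value) pairs, skipping the [Term] header."""
    pairs = []
    for line in stanza.splitlines():
        if line.startswith("[") or ":" not in line:
            continue
        tag, value = line.split(":", 1)
        pairs.append((tag.strip(), value.strip()))
    return pairs


def obsolete_problems(stanza: str) -> List[str]:
    """List rule violations for an obsolete stanza; empty if none or not obsolete."""
    pairs = parse_tags(stanza)
    if ("is_obsolete", "true") not in pairs:
        return []
    tags = [tag for tag, _ in pairs]
    problems = [f"logical axiom present: {t}" for t in tags if t in LOGICAL_TAGS]
    if tags.count("replaced_by") > 1:
        problems.append("more than one replaced_by tag")
    prop_values = [value for tag, value in pairs if tag == "property_value"]
    if not any(v.startswith("IAO:0000231") for v in prop_values):
        problems.append("missing obsolescence reason (IAO:0000231)")
    if not any(v.startswith("IAO:0000233") for v in prop_values):
        problems.append("missing tracker reference (IAO:0000233)")
    return problems


# The simple obsolescence example from this guide, with its placeholders kept as-is.
OBSOLETE_OK = """[Term]
id: ONTO:0100334
name: obsolete term name
property_value: IAO:0000231 OMO:0001000
property_value: IAO:0000233 "https://github.com/{project}/issues/XXXX" xsd:anyURI
is_obsolete: true
replaced_by: ONTO:0100321"""
```

This checks only the stanza being obsoleted. Finding other terms that still point at it is a separate search over the edit file.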
Metadata Best Practices
- Link to issue trackers:
    property_value: IAO:0000233 "https://github.com/{project}/issues/XXXX" xsd:anyURI
- Sign new terms (don't tag pre-existing terms):
    property_value: http://purl.org/dc/terms/creator https://orcid.org/0000-0001-2345-6789
- All terms should have definitions with at least one reference (preferably PMID)
- Dates are typically auto-generated by build processes
Syntax Checking
Validate OBO syntax using ROBOT:
robot convert --catalog src/ontology/catalog-v001.xml \
-i src/ontology/{ontology}-edit.obo \
-f obo \
-o {ontology}-edit.TMP.obo
Use the -vvv flag to get a full stack trace if there are errors.
Design Patterns
Many OBO ontologies use DOSDP (Dead Simple OWL Design Patterns):
- Check src/patterns/dosdp-patterns/*.yaml for project-specific patterns
- Follow existing patterns when creating similar terms
- Common patterns include:
  - Location-based disease patterns
  - Gene-related disease patterns
  - Part-of hierarchies
  - Abnormality patterns
Important Reminders
- NEVER guess identifiers of any kind
- If you include an identifier not provided by the user, you MUST verify it
- PMIDs can be checked with aurelian or web search
- Always follow project-specific conventions and check existing examples
- When in doubt, ask for clarification rather than making assumptions