AgentSkillsCN

orm-guide

ORM 结构及其对应数据库表的概览

SKILL.md
--- frontmatter
name: orm-guide
description: Overview of ORM structure and corresponding database tables
user-invocable: false

ORM Structure Guide

This skill provides context about the ORM structure in narrativegraphs/db/.

Core Concepts

The data model supports two graph paradigms:

Graph TypePrimary AnnotationsHas Relations/Predicates
NarrativeGraphTriplets (subject-predicate-object)Yes
CooccurrenceGraphTuplets (entity-entity pairs)No

Annotation Types (have doc_id directly)

TripletOrm (triplets.py)

  • Represents a subject-predicate-object extraction from text
  • Has: doc_id, subject_id, predicate_id, object_id, relation_id, cooccurrence_id
  • Stores span positions and text for subject, predicate, and object
  • Mixes in AnnotationMixin (provides doc_id, timestamp, document relationship)

TupletOrm (tuplets.py)

  • Represents an entity-entity cooccurrence extraction
  • Has: doc_id, entity_one_id, entity_two_id, cooccurrence_id
  • Stores span positions and text for both entities
  • Mixes in AnnotationMixin

EntityOccurrenceOrm (entityoccurrences.py)

  • Represents a single entity mention/occurrence in text
  • Has: doc_id, entity_id, span_start, span_end, span_text
  • Relationships: entity (→ EntityOrm), document (→ DocumentOrm)
  • Mixes in AnnotationMixin
  • Used by EntityOrm to derive alt_labels (alternative surface forms)

Higher-Level ORMs (backed by annotations)

All these mix in AnnotationBackedTextStatsMixin which provides:

  • Stats columns: frequency, doc_frequency, spread, adjusted_tf_idf, first_occurrence, last_occurrence
  • _annotations property (abstract, returns backing triplets/tuplets)
  • doc_ids property (derived from _annotations)

EntityOrm (entities.py)

  • Canonical entity (e.g., "Microsoft", "Satya Nadella")
  • Relationships:
    • occurrences → EntityOccurrenceOrm (all mentions of this entity)
    • subject_triplets / object_tripletstriplets property
    • _entity_one_tuplets / _entity_two_tupletstuplets property
    • subject_relations / object_relationsrelations property
    • _entity_one_cooccurrences / _entity_two_cooccurrencescooccurrences property
  • _annotations returns triplets + tuplets (union for both graph types)
  • Has alt_labels hybrid property (derived from occurrences span texts)

PredicateOrm (predicates.py)

  • Canonical predicate/verb (e.g., "acquired", "announced")
  • Relationships: triplets, relations
  • _annotations returns triplets
  • Has alt_labels hybrid property

RelationOrm (relations.py)

  • Canonical relation tuple: (subject_entity, predicate, object_entity)
  • Has: subject_id, predicate_id, object_id, significance
  • Relationships: subject, predicate, object, triplets
  • _annotations returns triplets
  • Has alt_labels hybrid property

CooccurrenceOrm (cooccurrences.py)

  • Canonical cooccurrence: (entity_one, entity_two) where entity_one_id <= entity_two_id
  • Has: entity_one_id, entity_two_id, pmi
  • Relationships: entity_one, entity_two, tuplets
  • _annotations returns tuplets

DocumentOrm (documents.py)

  • Source document with text, str_id, timestamp
  • Relationships: triplets, tuplets, entity_occurrences
  • Has categories via CategorizableMixin

Mixins (common.py, documents.py)

  • CategorizableMixin: Provides category support
  • CategoryMixin: Base for category tables (e.g., EntityCategory)
  • HasAltLabels: For ORMs with alternative surface forms
  • AnnotationMixin: For triplets/tuplets (provides doc_id, document relationship)
  • AnnotationBackedTextStatsMixin: For higher-level ORMs (stats + doc_ids)

Relationship Diagram

code
DocumentOrm
    │
    ├── triplets ──────────► TripletOrm ◄── subject/object ── EntityOrm
    │                            │                              │
    │                            ├── predicate ── PredicateOrm  │
    │                            │                    │         │
    │                            └── relation ─── RelationOrm ◄─┘
    │                                                │
    ├── tuplets ────────────► TupletOrm ◄────────────┼── entity_one/two ── EntityOrm
    │                            │                   │
    │                            └── cooccurrence ── CooccurrenceOrm
    │
    └── entity_occurrences ─► EntityOccurrenceOrm ◄── entity ── EntityOrm