Salmon Data Integration System
A practical workflow that combines ontology standards, data packaging, and tooling for interoperable salmon datasets.
Overview + demo
- Project page: br-johnson.github.io/salmon-data-integration-system
- Overview video: youtu.be/B0Zqac49zng
- Source repository: github.com/Br-Johnson/salmon-data-integration-system
Core repositories and tools
- DFO salmon ontology: github.com/dfo-pacific-science/dfo-salmon-ontology
- Salmon domain ontology (shared domain model): github.com/salmon-data-mobilization/salmon-domain-ontology
- Salmon domain ontology hub/docs: github.com/salmon-data-mobilization/salmon-ontology-hub
- Salmon Data Package (SDP) specification: github.com/dfo-pacific-science/smn-data-pkg
- metasalmon R package: github.com/dfo-pacific-science/metasalmon
- Salmon Data GPT app: chatgpt.com/g/.../salmon-data-standardizer
How the system works together
- Shared domain semantics live in salmon-domain-ontology.
- DFO-specific semantics live in dfo-salmon-ontology.
- Data exchange structure is defined by the SDP spec (smn-data-pkg).
- R-based validation and transformation are handled by metasalmon.
- AI-assisted standardization is accelerated by the Salmon Data GPT app.
Quick how-to
1) Start with ontology alignment
- Identify dataset concepts, entities, and relationships.
- Reuse shared terms from salmon-domain-ontology where possible.
- Use DFO terms from dfo-salmon-ontology for policy/operational context.
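The alignment rule above (shared terms first, DFO terms for policy/operational context) can be sketched as a small lookup. This is an illustration only: the namespace IRIs, the term names, and the helper `align` are hypothetical placeholders, not actual entries from either ontology.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical namespaces; the real IRIs are defined in the ontology repositories.
SHARED_NS = "https://example.org/salmon-domain-ontology#"
DFO_NS = "https://example.org/dfo-salmon-ontology#"


@dataclass(frozen=True)
class TermMapping:
    column: str
    concept: str
    term_iri: Optional[str]  # None means no term has been chosen yet


def align(column: str, concept: str,
          shared_terms: Dict[str, str], dfo_terms: Dict[str, str]) -> TermMapping:
    """Prefer a shared-domain term, fall back to a DFO term, else leave unmapped."""
    if concept in shared_terms:
        return TermMapping(column, concept, SHARED_NS + shared_terms[concept])
    if concept in dfo_terms:
        return TermMapping(column, concept, DFO_NS + dfo_terms[concept])
    return TermMapping(column, concept, None)


# Illustrative term names only, not real ontology terms.
shared_terms = {"spawner abundance": "SpawnerAbundance"}
dfo_terms = {"conservation unit": "ConservationUnit"}

# Dataset columns paired with the concept each one represents.
columns = {
    "n_spawners": "spawner abundance",
    "cu_name": "conservation unit",
    "notes": "free text",
}

mappings = [align(col, concept, shared_terms, dfo_terms)
            for col, concept in columns.items()]
unmapped = [m.column for m in mappings if m.term_iri is None]
```

Columns left in `unmapped` are the ones that still need a manual decision before packaging.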
2) Use the Salmon Data GPT app
- Provide dataset context (columns, units, source notes, known constraints).
- Ask for SDP-aligned field mapping suggestions and ontology term candidates.
- Use the generated plan as a draft, then review it before final packaging.
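One way to prepare the dataset context is to assemble it into a single block of text before pasting it into the app. The sketch below assumes a hypothetical helper, `build_context`, and made-up example columns; it does not call any API of the GPT app.

```python
from typing import Dict, List


def build_context(dataset_name: str, columns: List[Dict[str, str]],
                  source_notes: str, constraints: List[str]) -> str:
    """Assemble columns, units, source notes, and constraints into one text block."""
    lines = [f"Dataset: {dataset_name}", "Columns:"]
    for col in columns:
        unit = col.get("unit") or "unspecified"
        lines.append(f"- {col['name']} (unit: {unit}): {col['description']}")
    lines.append(f"Source notes: {source_notes}")
    lines.append("Known constraints:")
    lines.extend(f"- {c}" for c in constraints)
    lines.append("Request: suggest SDP-aligned field mappings and "
                 "candidate ontology terms for each column.")
    return "\n".join(lines)


# Example values are invented for illustration.
prompt = build_context(
    dataset_name="Example escapement counts",
    columns=[
        {"name": "year", "unit": "", "description": "Return year"},
        {"name": "n_spawners", "unit": "count", "description": "Estimated spawners"},
    ],
    source_notes="Compiled from annual stream surveys.",
    constraints=["n_spawners must be non-negative"],
)
print(prompt)
```

Whatever the app returns is still a draft: keep the suggested mappings separate from the reviewed ones until someone has checked them.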
3) Build an SDP-compliant package
- Follow the specification in smn-data-pkg.
- Create package metadata and required data tables.
- Apply reviewed mappings from the GPT-assisted standardization step.
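As a rough sketch of the packaging step, the code below writes a metadata file, a data table, and the reviewed mappings into one directory. The file names (`metadata.json`, `data.csv`, `column_mappings.csv`) and the helper `write_package` are assumptions made for illustration; the actual required files and fields are defined by the specification in smn-data-pkg.

```python
import csv
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def write_package(root: Path, metadata: Dict[str, str],
                  rows: List[Dict[str, str]], mappings: Dict[str, str]) -> List[str]:
    """Write metadata, data, and column mappings; return the file names created."""
    root.mkdir(parents=True, exist_ok=True)

    # Hypothetical layout: check smn-data-pkg for the real file names.
    (root / "metadata.json").write_text(json.dumps(metadata, indent=2),
                                        encoding="utf-8")

    with open(root / "data.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)

    with open(root / "column_mappings.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["column", "term_iri"])
        for column, term_iri in mappings.items():
            writer.writerow([column, term_iri])

    return sorted(p.name for p in root.iterdir())


package_dir = Path(tempfile.mkdtemp()) / "example-package"
files = write_package(
    package_dir,
    metadata={"title": "Example escapement counts", "description": "Illustrative data"},
    rows=[{"year": "2020", "n_spawners": "1200"}],
    mappings={"n_spawners": "https://example.org/terms#SpawnerAbundance"},
)
```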
4) Validate with metasalmon
- Inspect and validate structure + content in R.
- Fix schema and semantic issues before sharing or publishing.
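In this workflow the checks are run in R with metasalmon. The sketch below does not use or imitate metasalmon's API; it only illustrates, in plain Python, the kinds of schema and semantic issues worth catching. The function `validate`, the required metadata keys, and the example data are all hypothetical.

```python
from typing import Dict, List, Sequence


def validate(metadata: Dict[str, str], rows: List[Dict[str, str]],
             mappings: Dict[str, str],
             required_metadata: Sequence[str] = ("title", "description")) -> List[str]:
    """Return a list of human-readable issues; an empty list means no issues found."""
    issues = []

    # Schema check: required metadata keys must be present and non-empty.
    for key in required_metadata:
        if not metadata.get(key):
            issues.append(f"metadata: missing '{key}'")

    if not rows:
        issues.append("data: no rows")
        return issues

    # Semantic check: every data column should carry an ontology term.
    for column in rows[0]:
        if not mappings.get(column):
            issues.append(f"semantic: column '{column}' has no ontology term")

    # Content check: flag empty cells, numbering rows from 1.
    for i, row in enumerate(rows, start=1):
        for column, value in row.items():
            if value is None or str(value).strip() == "":
                issues.append(f"data: row {i} has empty '{column}'")

    return issues


issues = validate(
    metadata={"title": "Example escapement counts"},
    rows=[{"year": "2020", "n_spawners": "1200"},
          {"year": "2021", "n_spawners": ""}],
    mappings={"year": "https://example.org/terms#Year"},
)
for issue in issues:
    print(issue)
```

Collecting all issues in one pass, rather than stopping at the first, makes it easier to fix a package in a single round before sharing.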
5) Integrate downstream
- Share valid SDPs across teams/systems.
- Leverage ontology mappings to maintain interoperability for analytics and apps.
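To illustrate why the ontology mappings help downstream, the sketch below combines two tables whose columns are named differently but map to the same terms. The term IRIs, the helper `to_term_keyed`, and the example rows are hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical term IRIs standing in for real ontology terms.
YEAR = "https://example.org/terms#Year"
ABUNDANCE = "https://example.org/terms#SpawnerAbundance"


def to_term_keyed(rows: List[Dict[str, Any]],
                  mappings: Dict[str, str]) -> List[Dict[str, Any]]:
    """Re-key each row by ontology term, dropping columns that have no mapping."""
    return [
        {mappings[col]: value for col, value in row.items() if col in mappings}
        for row in rows
    ]


# Two teams use different column names for the same concepts.
team_a_rows = [{"yr": 2020, "spawners": 1200}]
team_a_map = {"yr": YEAR, "spawners": ABUNDANCE}

team_b_rows = [{"year": 2021, "n_spawners": 950, "observer": "JD"}]
team_b_map = {"year": YEAR, "n_spawners": ABUNDANCE}

combined = (to_term_keyed(team_a_rows, team_a_map)
            + to_term_keyed(team_b_rows, team_b_map))
```

Because both tables are keyed by term rather than by local column name, `combined` can feed analytics or apps without per-source renaming logic.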