Appendix 3: Getting Started Checklist
Getting Started Checklist for Salmon Data Stewardship
Use this practical checklist to assess how well your project, program, or organization aligns with the seven Best Practices. Start at the Project level, then scale to Program and Organization. Check off completed items and note gaps to prioritize.
Tip: For each item, capture a link to the living source (e.g., repository, shared drive, policy page) and the responsible owner.
Practice 1 — Make Data Governance Explicit
Project
- [ ] Do you have a Data Management Plan (DMP) covering scope, sensitive data, retention, and sharing? (link)
- [ ] Is there a RACI (Responsible, Accountable, Consulted, Informed) table for key tasks? (owner; see the sketch at the end of this practice)
- [ ] Are Indigenous knowledge holders or community members involved in the project?
- [ ] Are Indigenous Data Sovereignty (IDS) requirements identified and documented (who to consult, approvals needed)?
- [ ] Is a data product charter written for each dataset or analysis product, covering purpose, audience, quality thresholds, and release plan?
Program
- [ ] Are DMP and charter templates standardized across projects and stored centrally?
- [ ] Are role definitions for Data Steward, Product Owner, and Maintainer explicit and assigned for priority datasets?
- [ ] Is this 'community-engaged' research that provides tangible benefit to communities?
- [ ] Are data sharing agreements/MOUs and ethical review pathways documented and reusable?
Organization
- [ ] Does a governance policy exist that sets minimum requirements for DMPs, RACI, retention, IDS, and release reviews?
- [ ] Is there a standing review forum (e.g., monthly data governance check‑in) and a registry of governed data products?
Evidence to collect: DMP link, data product charter(s), RACI, IDS guidance, sharing agreements registry.
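A minimal sketch of how a RACI table could be kept machine-readable alongside the DMP, so basic completeness rules (exactly one Accountable and at least one Responsible per task) can be checked automatically. The tasks, roles, and names below are illustrative placeholders, not a prescribed structure.

```python
# Minimal sketch: a machine-readable RACI table kept next to the DMP.
# Task names, roles, and people below are illustrative placeholders.
RACI = {
    "Draft and maintain the DMP":    {"R": ["data_steward"], "A": ["pi"], "C": ["partners"], "I": ["team"]},
    "Approve public data releases":  {"R": ["product_owner"], "A": ["pi"], "C": ["ids_liaison"], "I": ["team"]},
    "Run QA/QC before each release": {"R": ["analyst"], "A": ["data_steward"], "C": [], "I": ["product_owner"]},
}

def check_raci(table: dict) -> list[str]:
    """Flag tasks missing exactly one Accountable or any Responsible party."""
    problems = []
    for task, roles in table.items():
        if len(roles.get("A", [])) != 1:
            problems.append(f"{task}: needs exactly one Accountable, found {len(roles.get('A', []))}")
        if not roles.get("R"):
            problems.append(f"{task}: no Responsible party assigned")
    return problems

if __name__ == "__main__":
    for line in check_raci(RACI) or ["RACI table passes basic checks"]:
        print(line)
```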
Practice 2 — Reuse Proven Infrastructure
Project
- [ ] Have you researched existing data-sharing infrastructure and data storage options specific to your data and context (e.g., Ocean Biodiversity Information System (OBIS), Global Biodiversity Information Facility (GBIF), Knowledge Network for Biocomplexity (KNB), Zenodo, Dataverse)?
- [ ] Is your code in version control (e.g., Git) with an issue tracker and releases?
- [ ] Are you using an approved repository or data store rather than creating a new silo? (where; see the deposit sketch at the end of this practice)
- [ ] Do you use existing organization authentication/authorization and backup processes?
Program
- [ ] Is there a preferred stack list (storage, metadata catalog, workflow runner, packaging, container base images)?
- [ ] Do projects consistently deposit finalized data in approved repositories with clear intake criteria?
Organization
- [ ] Are enterprise services available and documented (data lake, object store, catalog/portal, archival repository)?
- [ ] Is there a deprecation pathway for legacy systems and a migration plan for priority datasets?
Evidence to collect: repository URLs, infrastructure inventory, intake criteria, backup/DR documentation.
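A sketch of depositing a finalized file in Zenodo rather than standing up a new silo, following Zenodo's documented deposition REST API. The token, file, and metadata are placeholders; verify endpoint details at developers.zenodo.org before relying on this.

```python
# Sketch: deposit a finalized dataset in Zenodo instead of a new silo.
# Token, filename, and metadata are placeholders; see developers.zenodo.org.
import requests
from pathlib import Path

BASE = "https://zenodo.org/api"
PARAMS = {"access_token": "YOUR_ZENODO_TOKEN"}  # placeholder personal access token

def deposit(path: str, title: str, creators: list[dict]) -> str:
    # 1. Create an empty deposition.
    r = requests.post(f"{BASE}/deposit/depositions", params=PARAMS, json={})
    r.raise_for_status()
    dep = r.json()

    # 2. Upload the file to the deposition's file bucket.
    with open(path, "rb") as fh:
        requests.put(f"{dep['links']['bucket']}/{Path(path).name}",
                     params=PARAMS, data=fh).raise_for_status()

    # 3. Attach minimal metadata; Zenodo mints a DOI when the record is published.
    meta = {"metadata": {"upload_type": "dataset", "title": title,
                         "creators": creators,
                         "description": "Deposited by the project release workflow."}}
    requests.put(f"{BASE}/deposit/depositions/{dep['id']}",
                 params=PARAMS, json=meta).raise_for_status()

    # Publishing (POST .../actions/publish) is left as a deliberate manual review step.
    return dep["links"]["html"]

# deposit("catch_2024.csv", "2024 Catch Observations",
#         [{"name": "Doe, Jane", "orcid": "0000-0002-1825-0097"}])
```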
Practice 3 — Use Persistent Identifiers (PIDs) for People, Projects, Data, and Methods
Project
- [ ] Do all contributors have ORCID IDs recorded in metadata?
- [ ] Does the project have a resolvable PID (e.g., DOI for a project page or protocol, internal project ID)?
- [ ] Are datasets assigned DOIs (or other PIDs) at publication, and are versions tracked? (see the PID check sketch at the end of this practice)
- [ ] Are methods/protocols published and citable (e.g., protocol DOI) and linked from dataset metadata?
Program
- [ ] Is there guidance on when to mint PIDs, by whom, and where they resolve?
- [ ] Are projects linked to organizational identifiers (e.g., ROR for institutions) in metadata?
Organization
- [ ] Is there a PID policy and a provider/registrar configured (e.g., DataCite) with a documented workflow?
- [ ] Are PID linkages automated in the catalog (people ↔ projects ↔ datasets ↔ publications)?
Evidence to collect: ORCID list, PID policy, DOI records, resolver links in the catalog.
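A sketch of two lightweight PID sanity checks: validating an ORCID iD's check digit (ISO 7064 MOD 11-2, per ORCID's documentation) and asking the doi.org resolver whether a DOI is registered. The sample ORCID iD is ORCID's own documentation example, and the DOI Handbook's DOI is used as a known-registered example; substitute your own identifiers.

```python
# Sketch: lightweight PID sanity checks for project metadata.
import requests

def orcid_is_valid(orcid: str) -> bool:
    """Validate the trailing ISO 7064 MOD 11-2 check digit of an ORCID iD."""
    digits = orcid.replace("-", "")
    if len(digits) != 16 or not digits[:-1].isdigit():
        return False
    total = 0
    for ch in digits[:-1]:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return digits[-1] == ("X" if result == 10 else str(result))

def doi_resolves(doi: str) -> bool:
    """True if doi.org issues a redirect for the DOI (i.e., it is registered)."""
    resp = requests.head(f"https://doi.org/{doi}", allow_redirects=False, timeout=10)
    return resp.status_code in (301, 302, 303, 307, 308)

print(orcid_is_valid("0000-0002-1825-0097"))  # True -- ORCID's sample iD
print(doi_resolves("10.1000/182"))            # the DOI Handbook, a registered DOI
```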
Practice 5 — Store and Analyze Data for Easy Access, Use, and Trust
Project
- [ ] Is raw data immutable and separated from processed/analysis outputs?
- [ ] Is there a fully reproducible workflow (scripts/notebooks + environment + parameters) that runs end‑to‑end?
- [ ] Is the computational environment captured (lockfile/conda env, container image) and versioned?
- [ ] Are QA/QC checks automated with logs and thresholds documented? (see the sketch after this list)
- [ ] Are access controls and sensitive data handling documented and implemented?
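A sketch of an automated QA/QC gate with logged, documented thresholds. The column names, plausible ranges, and file path are illustrative only; replace them with the variables and limits documented in your DMP.

```python
# Sketch: an automated QA/QC gate with logged, documented thresholds.
import csv, logging, sys

logging.basicConfig(filename="qaqc.log", level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")

THRESHOLDS = {                          # version-controlled, documented limits
    "fork_length_mm": (30.0, 1200.0),   # illustrative plausible range
    "water_temp_c":   (-1.0, 30.0),
}

def run_qaqc(path: str) -> int:
    failures = 0
    with open(path, newline="") as fh:
        for i, row in enumerate(csv.DictReader(fh), start=2):  # row 1 = header
            for col, (lo, hi) in THRESHOLDS.items():
                try:
                    value = float(row[col])
                except (KeyError, ValueError):
                    logging.error("row %d: %s missing or non-numeric", i, col)
                    failures += 1
                    continue
                if not lo <= value <= hi:
                    logging.warning("row %d: %s=%s outside [%s, %s]", i, col, value, lo, hi)
                    failures += 1
    logging.info("QA/QC finished: %d failures", failures)
    return failures

if __name__ == "__main__":
    sys.exit(1 if run_qaqc("observations.csv") else 0)  # nonzero exit blocks release
```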
Program
- [ ] Do projects follow a common repo layout and release process (tags, changelog, signed artifacts)?
- [ ] Are standard storage classes, lifecycle policies, and archival rules applied?
Organization
- [ ] Are security, backup/retention, and audit requirements defined and routinely verified?
- [ ] Is there a trusted long‑term archive with fixity checking and preservation metadata? (see the fixity sketch below)
Evidence to collect: workflow definition, environment files, container references, QA/QC reports, storage/backup settings.
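A sketch of fixity checking for a project archive: write a SHA-256 manifest once at deposit time, then re-verify it on a schedule. The paths are illustrative.

```python
# Sketch: fixity checking -- write a SHA-256 manifest once, re-verify on a schedule.
import hashlib, json
from pathlib import Path

def sha256(path: Path) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):  # 1 MiB chunks
            h.update(chunk)
    return h.hexdigest()

def write_manifest(root: str, manifest: str = "manifest.json") -> None:
    digests = {str(p.relative_to(root)): sha256(p)
               for p in sorted(Path(root).rglob("*")) if p.is_file()}
    Path(manifest).write_text(json.dumps(digests, indent=2))

def verify_manifest(root: str, manifest: str = "manifest.json") -> list[str]:
    """Return paths whose current digest no longer matches the manifest."""
    recorded = json.loads(Path(manifest).read_text())
    return [rel for rel, digest in recorded.items()
            if sha256(Path(root) / rel) != digest]

# write_manifest("archive/raw")          # run once at deposit time
# print(verify_manifest("archive/raw"))  # run on a schedule; [] means fixity holds
```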
Practice 6 — Incentivize and Track Sharing and Reuse
Project
- [ ] Is a clear citation and license statement included in metadata and README? (see the CITATION.cff sketch after this list)
- [ ] Are reuse metrics collected (downloads, citations, API hits) and reviewed?
- [ ] Do release notes document what changed and implications for users?
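A sketch of generating a minimal CITATION.cff so the repository states exactly how to cite the dataset. The title, author, DOI, and license values are placeholders to replace with your own.

```python
# Sketch: write a minimal CITATION.cff; all values below are placeholders.
from pathlib import Path

CITATION_CFF = """\
cff-version: 1.2.0
message: "If you use this dataset, please cite it as below."
title: "Example Salmon Dataset"   # placeholder title
type: dataset
authors:
  - family-names: "Doe"           # placeholder author
    given-names: "Jane"
    orcid: "https://orcid.org/0000-0002-1825-0097"  # ORCID's sample iD
doi: "10.1234/example-doi"        # placeholder: use the DOI minted at publication
license: "CC-BY-4.0"              # match the license named in your LICENSE file
"""

Path("CITATION.cff").write_text(CITATION_CFF)
```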
Program
- [ ] Are common metrics dashboards available for priority datasets and updated automatically?
- [ ] Are data citations tracked in assessments, reports, and staff evaluations?
Organization
- [ ] Do policies require citation guidance and permissive, appropriate licensing where possible?
- [ ] Are automated reports of reuse (e.g., via DOI provider APIs) delivered to product owners and leadership? (see the sketch at the end of this practice)
Evidence to collect: LICENSE, CITATION, reuse dashboard link, policy excerpts, sample citations in reports.
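A sketch of pulling reuse signals for a DOI from the DataCite REST API, assuming your DOIs are registered with DataCite. The attribute names follow DataCite's public API documentation; verify them (and the example DOI placeholder) before relying on automated reports.

```python
# Sketch: pull reuse signals for a DOI from the DataCite REST API.
# Attribute names follow DataCite's public docs -- verify before automating.
import requests

def doi_reuse_report(doi: str) -> dict:
    resp = requests.get(f"https://api.datacite.org/dois/{doi}", timeout=10)
    resp.raise_for_status()
    attrs = resp.json()["data"]["attributes"]
    return {key: attrs.get(key, 0)
            for key in ("viewCount", "downloadCount", "citationCount")}

# for doi in priority_dataset_dois:   # e.g., read from your product registry
#     print(doi, doi_reuse_report(doi))
```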
Practice 7 — Build Community Through Co‑Development and Mutual Benefit
Project
- [ ] Are stakeholders identified, including First Nations/Tribes/Indigenous partners, and engagement needs documented?
- [ ] Have you held at least one co‑design session to validate user needs and success criteria?
- [ ] Is there an open feedback channel (issues form, contact) and a published roadmap?
Program
- [ ] Do cross‑project working groups exist for models, vocabularies, and tooling with regular cadence and notes?
- [ ] Are community contributions recognized (authorship, acknowledgements, meeting time, funding)?
Organization
- [ ] Is there an endorsed governance body or community of practice with decision records?
- [ ] Are procurement/funding mechanisms available to support shared components and Indigenous partnerships?
Evidence to collect: stakeholder map, engagement records, roadmap, working group notes, decision log.
Quick Start: 30/60/90‑Day Plan
- First 30 days
- By 60 days
- By 90 days
Minimal Artifacts Checklist (Project Level)
Maintain this list as a living issue in your repository and review quarterly.