Skip to content

Data Models

This document is the authoritative reference for the Horus database schema. It covers every production table, the multi-tenant isolation strategy, the global soft-delete pattern, and the key indexes and their rationale.


1. Database Overview

Horus runs on Supabase Postgres (Postgres 15+). All application tables live in the public schema. Supabase Auth manages user identity in the auth schema; the application extends it with the profiles table.

Multi-tenancy

Every data table carries an org_id uuid column that references organizations(id). Tenant isolation is enforced at the database level through Row Level Security (RLS). All tables have RLS enabled. The helper function current_org_id() is the single source of truth:

sql
create or replace function current_org_id()
returns uuid as $$
  select org_id from profiles where id = auth.uid()
$$ language sql security definer;

RLS policies take the form using (org_id = current_org_id()). Bearer-token (Supabase JWT) clients automatically satisfy this check. API-key clients use the service-role client, which bypasses RLS; those callers must supply the correct org_id manually, which the backend enforces in every query.

Soft Deletes

No user-managed row is ever physically removed through the API. See Section 4 for the full explanation.


2. Entity Reference

2.1 organizations

Top-level tenant. Created once per customer account.

ColumnTypeNotes
iduuid PKGenerated by gen_random_uuid()
nametext NOT NULLDisplay name
domaintextPrimary email domain; used by HIBP checks
settingsjsonbTenant-level feature flags and preferences
created_attimestamptzDefault now()

RLS: organizations are not themselves row-level-isolated (each user's JWT already scopes them to one org). Physical deletes are intentional: removing an org cascades through every foreign key.


2.2 profiles

Extends Supabase auth.users with org membership and role.

ColumnTypeNotes
iduuid PKFK to auth.users(id), cascades on delete
org_iduuidFK to organizations(id), cascades on delete
roletextadmin, analyst, or viewer
full_nametextDisplay name shown in the UI
created_attimestamptzDefault now()
deleted_attimestamptzSoft-delete timestamp; NULL means active

RLS policy org_read (FOR SELECT): org_id = current_org_id() and deleted_at is null.

One profile per user. The role drives the backend require_role() dependency guard.


2.3 assets

Represents a scannable target. This is the central entity: scans, findings, and Iris agents all reference it.

ColumnTypeNotes
iduuid PK
org_iduuid NOT NULLFK to organizations
nametext NOT NULLHuman label
hosttext NOT NULLHostname, IP, or URL
portintegerOptional; overrides protocol default
typetext NOT NULLweb, ip, api, domain, or cloud
is_internalbooleanDefault false; skips public-IP validation
is_activebooleanDefault true; inactive assets are excluded from scan-all
tagstext[]Free-form tags for grouping
metadatajsonbArbitrary extra fields
created_attimestamptz
deleted_attimestamptzSoft-delete; RLS hides non-null rows

The cloud type was added in migration 20260621120000_cloud_assets to support AWS account assets. Cloud credentials are stored in integrations (type aws), not on the asset itself.

RLS policy org_isolation: org_id = current_org_id() and deleted_at is null.


2.4 scans

A single scan execution against one asset. Created by the API and processed by the Horus pipeline.

ColumnTypeNotes
iduuid PK
org_iduuid NOT NULLFK to organizations
asset_iduuid NOT NULLFK to assets, cascades on delete
schedule_iduuidFK to scan_schedules; null for manual scans
statustext NOT NULLpending, running, completed, failed, or canceled
tools_usedtext[]e.g. {nuclei, nmap}
triggered_bytextuser:<uuid> or schedule
triggered_by_user_iduuidDenormalized user ID for label lookups
raw_outputjsonbRaw scanner output (debug only)
error_messagetextPopulated on failure or cancellation
started_attimestamptzNull until the worker picks it up
completed_attimestamptzNull until terminal state
created_attimestamptz

The triggered_by_label field seen in API responses is a computed Python string, not a database column. The backend derives it by joining triggered_by with profiles.full_name.


2.5 findings

Deduplicated vulnerabilities discovered by scans or imported manually. This is the most queried table in the system.

ColumnTypeNotes
iduuid PK
org_iduuid NOT NULLFK to organizations
scan_iduuidFK to scans; null for manually imported findings
asset_iduuid NOT NULLFK to assets
titletext NOT NULLShort vulnerability description
descriptiontextFull explanation
severitytext NOT NULLcritical, high, medium, low, or info
cvss_scorenumeric(4,1)CVSS v3 base score
cve_idstext[]Associated CVE identifiers
statustext NOT NULLopen, in_progress, resolved, false_positive, or accepted_risk
fingerprinttext NOT NULLSHA-256 of title + host + source (first 32 chars). Unique per org.
is_noisebooleanTrue for absence/informational findings (hidden by default)
raw_datajsonbScanner-specific payload including cvss_v3_score and tool
first_seen_attimestamptzSet on first insert; never overwritten
last_seen_attimestamptzUpdated on every re-import or re-scan
created_attimestamptz
severity_rankint GENERATED STORED0=critical, 1=high, 2=medium, 3=low, 4=info. Added in 20260622090000_findings_severity_rank.

Deduplication: upserts use the unique constraint on (org_id, fingerprint). Re-importing the same finding updates last_seen_at without creating a duplicate.

Noise filter: findings where is_noise = true are hidden from list/detail responses unless the caller passes include_noise=true. The API also returns a noise_count field so the UI can display a "N hidden" banner.

Severity ordering: severity_rank solves the problem that ordering by the severity text column sorts alphabetically (critical, high, info, low, medium) rather than by actual risk. The generated column enables correct risk-order sorts via order("severity_rank").

RLS policy org_isolation: org_id = current_org_id(). No soft-delete on findings; they accumulate over time.


2.6 incidents

Groups related findings under a single owner with a status lifecycle and SLA. Used for case management.

ColumnTypeNotes
iduuid PK
org_iduuid NOT NULLFK to organizations
titletext NOT NULL
descriptiontext
statustext NOT NULLopen, in_progress, resolved, or closed
severitytext NOT NULLcritical, high, medium, or low
assignee_iduuidFK to auth.users; set null on user deletion
sla_deadlinetimestamptzNullable; shown as overdue in the UI when past
created_byuuidFK to auth.users
closed_attimestamptzStamped when status transitions to closed
created_attimestamptz
updated_attimestamptzMaintained by the incidents_updated_at trigger

Incidents do not have a deleted_at column. Deletion through the API sets status = closed.


2.7 incident_findings

Many-to-many join between incidents and findings.

ColumnTypeNotes
incident_iduuidFK to incidents, cascades on delete
finding_iduuidFK to findings, cascades on delete
added_attimestamptz
deleted_attimestamptzSoft-delete; set to unlink a finding without losing the history

Primary key: (incident_id, finding_id). Upsert on this PK allows relinking a previously soft-deleted association.

RLS: inherits org scope through the parent incident. The policy checks exists (select 1 from incidents where id = incident_id and org_id = current_org_id()) and deleted_at is null.


2.8 incident_notes

Append-only activity log per incident.

ColumnTypeNotes
iduuid PK
incident_iduuid NOT NULLFK to incidents, cascades on delete
author_iduuid NOT NULLFK to auth.users, cascades on delete
bodytext NOT NULLMarkdown-safe free text, max 10,000 chars
created_attimestamptz

No updates or deletes. Notes are returned in chronological order.


2.9 phishing_campaigns

Top-level entity for a phishing simulation campaign.

ColumnTypeNotes
iduuid PK
org_iduuid NOT NULLFK to organizations
nametext NOT NULL
objectivetext NOT NULLclick, credentials, or report
statustextdraft, scheduled, running, or completed
context_asset_idsuuid[]Assets whose context informs AI-generated emails
schedule_crontextOptional cron; null for one-shot campaigns
template_iduuidFK to phishing_templates; null uses AI generation
launched_attimestamptzStamped when the campaign transitions to running
completed_attimestamptz
created_byuuidFK to auth.users
created_attimestamptz
deleted_attimestamptzSoft-delete

2.10 phishing_contacts

Reusable recipient list for scheduled phishing campaigns. Distinct from employees: contacts are used by the scheduler without requiring an employee record.

ColumnTypeNotes
iduuid PK
org_iduuid NOT NULLFK to organizations
nametext NOT NULL
emailtext NOT NULL
departmenttext
created_attimestamptz

Unique constraint: (org_id, email).


2.11 phishing_schedules

Recurring phishing schedule (cron-based). Stores a list of contact IDs and context asset IDs as arrays.

ColumnTypeNotes
iduuid PK
org_iduuid NOT NULLFK to organizations
nametext NOT NULL
cron_expressiontext NOT NULLStandard cron syntax
objectivetext NOT NULLclick, credentials, or report
contact_idsuuid[]References to phishing_contacts.id
context_asset_idsuuid[]References to assets.id
enabledbooleanDefault true
created_attimestamptz

2.12 phishing_templates

HTML email templates used in phishing campaigns. Can be org-private or publicly shared in the community library.

ColumnTypeNotes
iduuid PK
org_iduuid NOT NULLFK to organizations
nametext NOT NULL
subjecttextEmail subject line
body_htmltextFull HTML email body with placeholder
is_publicbooleantrue makes the template visible in the community library
created_byuuidFK to auth.users
created_attimestamptz
deleted_attimestamptzSoft-delete

Two RLS policies: org_isolation for private read/write (org_id = current_org_id() and deleted_at is null) and phishing_templates_read_public for the community library (is_public = true and deleted_at is null).


2.13 employees

Org members targeted in phishing campaigns. Also linked to HIBP breach data.

ColumnTypeNotes
iduuid PK
org_iduuid NOT NULLFK to organizations
emailtext NOT NULLLowercased on insert
full_nametext
departmenttext
hibp_checked_attimestamptzLast time HIBP checked this address
created_attimestamptz
deleted_attimestamptzSoft-delete

Unique constraint: (org_id, email). Upsert on this constraint handles bulk CSV import.


2.14 phishing_targets

Per-employee state for one campaign launch. Tracks email delivery and engagement events.

ColumnTypeNotes
iduuid PK
campaign_iduuidFK to phishing_campaigns
org_iduuidFK to organizations
employee_iduuidFK to employees; nullable (scheduler can write name/email directly)
employee_nametextDenormalized; used when employee_id is null
employee_emailtextDenormalized; used when employee_id is null
tracking_tokentextUnique URL-safe token embedded in each email
email_subjecttextRendered subject line
email_body_htmltextRendered HTML body
email_pretexttextAI-generated pretext shown in campaign results
link_clicked_attimestamptzSet by the honeypot endpoint when the employee clicks
creds_entered_attimestamptzSet when credential capture is triggered
reported_attimestamptzSet when the employee reports the email

Unique constraint: (campaign_id, employee_id).


2.15 iris_agents

Registered Iris daemon instances. Each agent runs on a server, monitors filesystem/process/auth events, and POSTs batches to the API.

ColumnTypeNotes
iduuid PK
org_iduuid NOT NULLFK to organizations
nametext NOT NULLHuman label
hostnametextAuto-sent by the daemon
platformtextlinux or darwin
iptextLast known IP
api_key_hashtext NOT NULLSHA-256 of the irs_ prefixed token
key_prefixtext NOT NULLFirst 12 chars; shown in the UI for identification
asset_iduuidOptional FK to assets; set null on asset deletion
last_seen_attimestamptzUpdated on each agent heartbeat
statustext NOT NULLonline, offline, or degraded
configjsonbWatch paths, ignore patterns, polling interval
created_attimestamptz
created_byuuidFK to auth.users
deleted_attimestamptzSoft-delete

Iris agents use their own API key type (irs_ prefix, SHA-256 stored) that is completely separate from user API keys (hrs_ prefix).


2.16 iris_events

Raw events streamed by Iris daemons. Events accumulate here until the batch-processing endpoint converts them into findings via the AI pipeline.

ColumnTypeNotes
iduuid PK
agent_iduuid NOT NULLFK to iris_agents, cascades on delete
org_iduuid NOT NULLFK to organizations
event_typetext NOT NULLfile_change, new_process, new_listener, new_connection, auth_event, or log_anomaly
severitytext NOT NULLinfo, low, medium, high, or critical
titletext NOT NULLShort description
payloadjsonb NOT NULLEvent-specific data (paths, PIDs, ports, etc.)
received_attimestamptz
processedbooleanDefault false; set to true when batched into a scan
scan_iduuidFK to scans; filled when the event is batched

2.17 integrations

External notification and credential storage. One row per integration type per org.

ColumnTypeNotes
iduuid PK
org_iduuid NOT NULLFK to organizations
typetext NOT NULLslack, email, or aws
configjsonbType-specific payload. Slack: {webhook_url, min_severity}. Email: {to[], min_severity, smtp_*}. AWS: {access_key_id, secret_access_key, region}.
enabledbooleanDefault true
created_attimestamptz
deleted_attimestamptzSoft-delete

AWS integrations provide credentials for cloud asset scans. The config JSONB stores credentials at rest; they are not shown in list responses.


2.18 api_keys

Long-lived programmatic credentials for service integrations. The full secret is shown to the user once at creation time; only the SHA-256 hash is stored.

ColumnTypeNotes
iduuid PK
org_iduuid NOT NULLFK to organizations
nametext NOT NULL
key_hashtext NOT NULLSHA-256 hex of the full hrs_ key
key_prefixtext NOT NULLFirst 16 chars; displayed in the UI for identification
roletext NOT NULLanalyst or admin; determines the permission level
created_byuuidFK to profiles; set null on profile deletion
created_attimestamptz
last_used_attimestamptzUpdated on each successful authentication
revoked_attimestamptzSoft revocation; non-null keys are rejected at login

Unique constraint: (org_id, name). Unlike other tables, api_keys uses revoked_at (not deleted_at) for soft revocation. This is intentional: revoked keys must remain queryable for audit purposes.


2.19 scan_schedules

Recurring scan schedules. One schedule can target multiple assets.

ColumnTypeNotes
iduuid PK
org_iduuid NOT NULLFK to organizations
nametext NOT NULL
asset_idsuuid[]Assets to scan on each execution
cron_expressiontext NOT NULLDefault 0 2 * * * (2 AM daily)
toolstext[]Default {nuclei, nmap}
enabledbooleanDefault true
last_run_attimestamptz
next_run_attimestamptz
created_attimestamptz
deleted_attimestamptzSoft-delete

3. Entity Relationship Diagram

mermaid
erDiagram
    organizations ||--o{ profiles : "has members"
    organizations ||--o{ assets : "owns"
    organizations ||--o{ scans : "runs"
    organizations ||--o{ findings : "accumulates"
    organizations ||--o{ incidents : "manages"
    organizations ||--o{ phishing_campaigns : "runs"
    organizations ||--o{ employees : "employs"
    organizations ||--o{ iris_agents : "deploys"
    organizations ||--o{ integrations : "configures"
    organizations ||--o{ api_keys : "issues"
    organizations ||--o{ scan_schedules : "schedules"

    assets ||--o{ scans : "scanned by"
    assets ||--o{ findings : "has"
    assets }o--o| iris_agents : "linked to"

    scans ||--o{ findings : "produces"
    scans ||--o{ iris_events : "batches"

    findings }o--o{ incidents : "grouped in"
    incidents ||--o{ incident_findings : "links via"
    findings ||--o{ incident_findings : "linked via"
    incidents ||--o{ incident_notes : "has"

    phishing_campaigns ||--o{ phishing_targets : "targets"
    employees ||--o{ phishing_targets : "is targeted"
    phishing_campaigns }o--o| phishing_templates : "uses"

    phishing_schedules }o--o{ phishing_contacts : "references"

    iris_agents ||--o{ iris_events : "streams"

4. Soft Delete Pattern

Motivation

Nothing managed through the Horus API is ever physically deleted from the database. Hard deletes are irreversible and complicate audit trails. Instead, a deleted_at timestamptz column is added to every user-managed table.

How It Works

Writing a soft delete: the API sets deleted_at = now() and, where relevant, also sets is_active = false (assets). The Python code uses datetime.now(timezone.utc).isoformat().

Reading: the org_isolation RLS policy for each table is updated to include and deleted_at is null in the USING expression. This means:

  • No SELECT query in the Python codebase needs to filter deleted_at explicitly. RLS handles it transparently.
  • The WITH CHECK clause is intentionally left without deleted_at is null, so the API can still execute the UPDATE … SET deleted_at = now() that performs the soft delete.

Recovery: restoring a soft-deleted row requires either the Supabase service-role key (which bypasses RLS) or direct SQL access. Recovery is intentionally out-of-band and not exposed via the API.

Tables with deleted_at

assets, permission_policies, scan_schedules, integrations, discovery_sources, employees, phishing_campaigns, phishing_templates, red_findings, iris_agents, incident_findings, notifications, profiles, adversarial_schedules.

Tables without deleted_at

  • api_keys: uses revoked_at instead for auditable key revocation.
  • organizations: intentionally supports hard delete (cascades through all children).
  • incidents: deletion through the API closes the incident (status = closed) rather than setting deleted_at.
  • findings, scans, iris_events, incident_notes: append-only; not deletable through the API.

5. Key Indexes

IndexTableColumnsPurpose
findings_severity_rank_idxfindings(org_id, severity_rank, created_at desc)Risk-order finding lists; the generated severity_rank column avoids text-sort issues
incidents_org_idxincidents(org_id, created_at desc)Paginated incident list, the most common read path
incidents_status_idxincidents(org_id, status)Status filter on the incident list (common query parameter)
incidents_assignee_idxincidents(assignee_id)"My incidents" views and assignee filter
incident_findings_finding_idxincident_findings(finding_id)Reverse lookup: "which incidents contain this finding?" Used to compute incident_count on the finding detail endpoint
incident_notes_incident_idxincident_notes(incident_id, created_at)Chronological note list per incident
iris_agents_org_idxiris_agents(org_id)Agent list per org
iris_events_agent_idxiris_events(agent_id, processed)Batch-processing query: fetch unprocessed events per agent
iris_events_org_idxiris_events(org_id, received_at desc)Event feed per org
phishing_contacts_org_idxphishing_contacts(org_id)Contact list per org
phishing_schedules_org_idxphishing_schedules(org_id)Schedule list per org
idx_api_keys_orgapi_keys(org_id) where revoked_at is nullPartial index: only active keys; used to list a tenant's valid keys
idx_api_keys_hashapi_keys(key_hash) where revoked_at is nullAuthentication hot path: hash lookup on every API request with an hrs_ token

Released under the MIT License.