Initial commit: funeral provider discovery pipeline

Python crawlers for VIC Register, Funerals Australia, NFDA
n8n workflows for scheduled discovery and enrichment
SQLite schema and seeded dev database (1,463 providers)
End-to-end process documentation in n8n/PROCESS.md
Richie
2026-04-24 10:27:08 +10:00
commit cc91427789
30 changed files with 4706 additions and 0 deletions

database/IMAGE-MAPPING.md Normal file

@@ -0,0 +1,69 @@
# Image Assets & Verified Provider Mapping
## Image Directory Structure
All images are stored locally under `images/` with the following structure:
```
images/
├── manifest.json # Full index mapping CMS IDs → local paths
├── providers/{slug}/ # 12 verified brands
│ ├── logo.{ext} # Rectangular/stacked logo
│ └── badge.{ext} # Circular/square badge (for cards)
├── funeral-homes/{slug}/ # 7 parent organisations
│ └── logo.{ext}
├── locations/{slug}/ # 20 physical offices
│ └── photo.{ext} # Building/staff hero photo
├── coffins/{category}/ # 201 coffins by range
│ └── {slug}/01.{ext} # 1-4 images per coffin
├── venues/{slug}/ # 1,678 service venues
│ └── 01.{ext}
└── crematoriums/{slug}/ # 38 crematoriums
  └── 01.{ext}
```
## Verified Brand → Image Mapping
These are the 12 existing verified brands from the CMS, with their image paths:
| CMS ID | Brand | Logo | Badge |
|--------|-------|------|-------|
| 1 | H.Parsons Funeral Directors | `providers/hparsons-funeral-directors/logo.png` | `providers/hparsons-funeral-directors/badge.png` |
| 3 | Rankins Funerals | `providers/rankins-funerals/logo.webp` | `providers/rankins-funerals/badge.png` |
| 4 | Parsons Ladies Funeral Directors | `providers/parsons-ladies-funeral-directors/logo.png` | `providers/parsons-ladies-funeral-directors/badge.png` |
| 5 | Wollongong City Funerals | `providers/wollongong-city-funerals/logo.webp` | `providers/wollongong-city-funerals/badge.png` |
| 6 | Easy Funerals | `providers/easy-funerals/logo.webp` | `providers/easy-funerals/badge.png` |
| 7 | Mackay Family Funerals | `providers/mackay-family-funerals/logo.webp` | `providers/mackay-family-funerals/badge.png` |
| 8 | H.Parsons Shoalhaven | `providers/hparsons-funeral-directors-shoalhaven/logo.png` | `providers/hparsons-funeral-directors-shoalhaven/badge.png` |
| 9 | Killick Family Funerals | `providers/killick-family-funerals/logo.webp` | `providers/killick-family-funerals/badge.png` |
| 10 | Kenneally's Funerals | `providers/kenneallys-funerals/logo.webp` | `providers/kenneallys-funerals/badge.png` |
| 11 | Lady Anne Funerals | `providers/lady-anne-funerals/logo.webp` | `providers/lady-anne-funerals/badge.png` |
| 12 | Mannings Funerals | `providers/mannings-funerals/logo.webp` | `providers/mannings-funerals/badge.png` |
| 13 | Botanical Funerals | `providers/botanical-funerals-by-ian-allison/logo.webp` | `providers/botanical-funerals-by-ian-allison/badge.png` |
## How to Use on the Demo Site
### For verified providers:
- Serve images from `images/providers/{slug}/` for logos and badges
- Serve location photos from `images/locations/{slug}/`
- Serve product images from `images/coffins/`, `images/venues/`, `images/crematoriums/`
- The `manifest.json` contains the full mapping from CMS record IDs to local file paths
### For unverified providers:
- **No images** — they have no logo, badge, or photos
- Use a generic placeholder or text-based display (business name initials, etc.)
- Images are only added when a provider signs up to become verified
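
The two display rules above can be sketched as a small helper. This is a minimal sketch, not the demo site's actual code: the function names, the returned dictionary shape, and the two-letter initials rule are assumptions made for illustration. Only the `images/providers/{slug}/logo.{ext}` layout comes from this document.

```python
from pathlib import Path
from typing import Optional

IMAGES_ROOT = Path("images")  # assumed location of the image directory


def provider_logo(slug: str, root: Path = IMAGES_ROOT) -> Optional[Path]:
    """Return the first logo.* file for a provider slug, or None if absent."""
    folder = root / "providers" / slug
    if not folder.is_dir():
        return None
    matches = sorted(folder.glob("logo.*"))
    return matches[0] if matches else None


def initials(business_name: str, max_letters: int = 2) -> str:
    """Build a text placeholder from the first letter of the leading words."""
    words = [w for w in business_name.replace(".", " ").split() if w[0].isalnum()]
    return "".join(w[0].upper() for w in words[:max_letters])


def card_image(slug: str, business_name: str, verified: bool,
               root: Path = IMAGES_ROOT) -> dict:
    """Use the logo for verified providers, otherwise an initials placeholder."""
    logo = provider_logo(slug, root) if verified else None
    if logo is not None:
        return {"kind": "image", "value": logo.as_posix()}
    # Unverified providers (or a missing file) fall back to text initials.
    return {"kind": "initials", "value": initials(business_name)}
```

Falling back to initials when a verified provider's file is missing is a design choice of this sketch; a real site might prefer to log the gap instead.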
### Importing verified brands:
The 12 verified brands need to be imported into the database with their full data from
`schemas/brands-full.json` (brand details, locations, packages, inclusions) and linked
to their images. Some of these brands were also discovered by the crawler and already
exist in `providers.db` as unverified — they should be **upgraded** (set `verified = true`,
add images) rather than duplicated.
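
A minimal sketch of the "upgrade rather than duplicate" rule, assuming the SQLite `funeral_brand` table from this commit. The matching order (ABN first, then title plus postcode) and the function name `upgrade_or_insert` are assumptions for illustration; the real import may match differently.

```python
import sqlite3


def upgrade_or_insert(conn: sqlite3.Connection, brand: dict) -> int:
    """Upgrade a matching unverified row to verified, or insert a new verified row."""
    row = None
    if brand.get("abn"):
        # Assumption: an ABN match is treated as the same business.
        row = conn.execute(
            "SELECT id FROM funeral_brand WHERE abn = ?", (brand["abn"],)
        ).fetchone()
    if row is None:
        # Fallback assumption: same name (case-insensitive) and postcode.
        row = conn.execute(
            "SELECT id FROM funeral_brand "
            "WHERE lower(title) = lower(?) AND business_postcode = ?",
            (brand["title"], brand.get("business_postcode")),
        ).fetchone()
    if row is not None:
        conn.execute(
            "UPDATE funeral_brand "
            "SET verified = 1, hidden = 0, listing_tier = 'verified' "
            "WHERE id = ?",
            (row[0],),
        )
        return row[0]
    cur = conn.execute(
        "INSERT INTO funeral_brand "
        "(title, abn, business_postcode, code, verified, hidden, listing_tier) "
        "VALUES (?, ?, ?, ?, 1, 0, 'verified')",
        (brand["title"], brand.get("abn"), brand.get("business_postcode"),
         brand.get("code")),
    )
    return cur.lastrowid
```

Linking images and importing packages would follow the same pattern, keyed on the returned brand ID.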
### Product images:
- 201 coffins with 1-4 images each, organised by range (solid-timber, custom-board, etc.)
- 1,678 venue photos
- 38 crematorium photos
- These are only relevant for verified provider flows (arrangement booking)
- The `manifest.json` maps each product's CMS ID to its local image path


@@ -0,0 +1,209 @@
# Provider Data Model — Verified & Unverified Providers
This document extends the CMS schema (`schemas/cms-schema-spec.md`) with support for
unverified (auto-discovered) providers alongside the existing verified (signed-up) providers.
---
## Overview
The platform lists funeral directors in two categories:
- **Verified providers** — Signed up to the platform. Full branding (logo, badge, colours),
complete package configuration, and online arrangement booking enabled.
- **Unverified providers** — Auto-discovered from public registries and their own websites.
Listed with whatever public information is available. Can apply to become verified.
All providers share the same `funeral_brand` table and schema. The difference is driven
by data completeness and the `verified` / `listing_tier` fields.
---
## Schema Changes to FuneralBrand
These fields are **added** to the existing FuneralBrand collection from `cms-schema-spec.md`:
| Field | Type | Default | Purpose |
|-------|------|---------|---------|
| `verified` | Boolean | `false` | `true` for signed-up partners, `false` for auto-discovered |
| `listing_tier` | Enum | `'listed'` | Display tier, computed from data quality (see below) |
| `hidden` | Boolean | `true` | Unverified providers start hidden until admin-reviewed |
| `source_key` | String (unique) | `null` | Provenance identifier, e.g. `"nfda:1234"` |
| `source_url` | String (URL) | `null` | Where this record was discovered |
| `last_enriched_at` | DateTime | `null` | When data was last refreshed from provider's website |
| `enrichment_status` | Enum | `'pending'` | `pending` / `partial` / `complete` / `failed` |
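
As an illustration of the `source_key` format (`"{source}:{externalId}"`, for example `"nfda:1234"`), a crawler might build and parse keys as below. The helper names and the lower-casing of the source name are assumptions of this sketch, not something the pipeline is known to do.

```python
from typing import Tuple


def make_source_key(source: str, external_id) -> str:
    """Build a provenance key of the form "{source}:{externalId}"."""
    source = source.strip().lower()
    external = str(external_id).strip()
    if not source or not external:
        raise ValueError("source and external_id must be non-empty")
    if ":" in source:
        raise ValueError("source name must not contain ':'")
    return f"{source}:{external}"


def split_source_key(key: str) -> Tuple[str, str]:
    """Split a key back into (source, external_id) at the first colon."""
    source, _, external = key.partition(":")
    return source, external
```

Because `source_key` is declared unique, building it the same way on every run lets a re-crawl find the existing row instead of inserting a duplicate.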
### Fields that become optional for unverified providers
These fields are **required** for verified providers but **nullable** for unverified:
| Field | Verified | Unverified |
|-------|----------|------------|
| `logo` | Required (brand logo image) | `null` — no images until they sign up |
| `badge` | Required (card badge image) | `null` — no images until they sign up |
| `description` | Required | Optional (extracted from their website if available) |
| `backgroundColour` | Set (brand theme) | `null` — use platform default |
| `foregroundColour` | Set (brand theme) | `null` — use platform default |
| `modalDescription` | Set | `null` |
| `code` | Set (URL slug) | Auto-generated from business name |
### Fields present for both verified and unverified
| Field | Notes |
|-------|-------|
| `title` | Business name (always present) |
| `phone` | Contact phone (present for ~94% of providers) |
| `email` | Contact email (present for ~66%) |
| `website` | External website URL (present for ~68%) |
| `abn` | Australian Business Number (strongest dedup key) |
| `businessAddress/Suburb/State/Postcode` | Business location |
| `availableFuneralTypes` | Comma-separated funeral type IDs |
---
## Listing Tiers
Every provider is assigned a `listing_tier` that determines how they appear on the platform.
The tier is **computed from data quality** — specifically from what package/pricing data exists.
| Tier | Value | Criteria | UI Treatment |
|------|-------|----------|-------------|
| **Verified** | `'verified'` | `verified = true` | Full branding, package selection, online arrangements, custom images |
| **Priced** | `'priced'` | Unverified + 2 or more packages, each with 2 or more itemized inclusion prices | Show packages with line-item breakdowns, no arrangements |
| **Estimated** | `'estimated'` | Unverified + at least 1 package with a total price | Show package prices, "Contact for full details" on breakdowns |
| **Listed** | `'listed'` | Unverified + no pricing data | Show contact info only, "Contact for pricing" CTA |
### Tier computation logic
```
if brand.verified:
    tier = 'verified'
elif brand has 2+ packages, each with 2+ priced inclusions:
    tier = 'priced'
elif brand has 1+ packages with any price:
    tier = 'estimated'
else:
    tier = 'listed'
```
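
The logic above can be written as runnable Python. This is a sketch under stated assumptions: the `Brand`, `Package` and `Inclusion` classes are hypothetical stand-ins for the database rows, a "priced" inclusion is taken to mean `price > 0`, a package "with any price" is taken to mean at least one priced inclusion, and "2+ packages, each with 2+ priced inclusions" is read as at least two such packages.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List


@dataclass
class Inclusion:
    price: Decimal


@dataclass
class Package:
    inclusions: List[Inclusion] = field(default_factory=list)


@dataclass
class Brand:
    verified: bool = False
    packages: List[Package] = field(default_factory=list)


def priced_count(package: Package) -> int:
    """Number of inclusions carrying a positive price (assumed meaning of 'priced')."""
    return sum(1 for inc in package.inclusions if inc.price > 0)


def compute_tier(brand: Brand) -> str:
    """Return 'verified', 'priced', 'estimated' or 'listed' for a brand."""
    if brand.verified:
        return "verified"
    counts = [priced_count(p) for p in brand.packages]
    # At least two packages that each have two or more priced inclusions.
    if sum(1 for c in counts if c >= 2) >= 2:
        return "priced"
    # At least one package with any price at all.
    if any(c >= 1 for c in counts):
        return "estimated"
    return "listed"
```

Checking `verified` first means a signed-up provider keeps the top tier even while its package data is incomplete.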
### Upgrade incentive
Each tier below verified creates a natural CTA for the provider:
- `listed` → "Publish your pricing to help families compare"
- `estimated` → "Add detailed breakdowns to stand out"
- `priced` → "Sign up to enable online arrangements and add your branding"
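
One way to wire these prompts into a listing page is a simple lookup keyed by tier. The dictionary and function names are hypothetical; the CTA strings are the ones listed above.

```python
from typing import Optional

# CTA copy per tier, taken from the list above. Verified providers have no CTA.
UPGRADE_CTA = {
    "listed": "Publish your pricing to help families compare",
    "estimated": "Add detailed breakdowns to stand out",
    "priced": "Sign up to enable online arrangements and add your branding",
}


def upgrade_cta(listing_tier: str) -> Optional[str]:
    """Return the upgrade prompt for a tier, or None when there is nothing to upgrade to."""
    return UPGRADE_CTA.get(listing_tier)
```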
---
## Data Relationships (unchanged from CMS spec, applied to verified and unverified providers)
```
FuneralBrand (verified or unverified)
├── Location[] (physical offices — at least 1 per provider)
├── Package[] (funeral plan bundles — 0 for 'listed' tier)
│ └── PackageInclusion[] (fee line items — 0 for 'estimated' tier)
├── KnownFor[] (feature badges — verified only typically)
└── FuneralArea[] (service regions — M:N)
```
### Package (same schema as CMS spec, with additions)
| Field | Type | Notes |
|-------|------|-------|
| `id` | PK | |
| `title` | String | e.g. "Direct Cremation", "Chapel Service" |
| `description` | Text | What's included |
| `funeral_type` | Enum | `Service & Cremation`, `Service & Burial`, `Cremation Only`, `Graveside Burial`, `Water Cremation` |
| `brand_id` | FK → FuneralBrand | |
| `source_url` | String | Where this pricing was found (provider's website) |
| `extraction_confidence` | Float 0-1 | How reliable the extracted data is (0.7 = HTML, 0.6 = PDF) |
| `sort` | Integer | Display order |
| `hidden` | Boolean | |
### PackageInclusion (same schema as CMS spec)
| Field | Type | Notes |
|-------|------|-------|
| `id` | PK | |
| `price` | Decimal | Dollar amount |
| `optional` | Boolean | User can opt in/out |
| `complimentary` | Boolean | Included free |
| `display` | Boolean | Whether shown to user |
| `inclusion_type_title` | String | Category label (see standard types below) |
| `package_id` | FK → Package | |
### Standard inclusion type names
These are the consistent labels used across all providers:
**Standard fees:** Professional Service Fee, Transportation Service Fee, Professional Mortuary Care, Death Registration Certificate, Cremation Certificate/Permit, Government Levy, Accommodation
**Products:** Coffin, Cremation Fee, Cemetery Fee, Celebrant Fee
**Optional extras:** Saturday Service Fee, Twilight Service Surcharge, Viewing Fee, After Hours Transfer Surcharge, Dressing Fee, Embalming, Digital Recording, Webstreaming, Coffin Bearing by Funeral Directors
---
## Current Data
The database (`database/providers.db`, SQLite) contains:
| Metric | Count |
|--------|-------|
| Total providers | 1,463 |
| With phone | 1,380 (94%) |
| With email | 972 (66%) |
| With website | 994 (68%) |
| With description | 618 (42%) |
| Total packages | 416 |
| Total inclusions | 388 |
### Tier distribution
| Tier | Providers |
|------|-----------|
| Verified | 0 (existing 12 brands not yet imported as verified) |
| Priced | 10 |
| Estimated | 111 |
| Listed | 1,342 |
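
The tier counts can be reproduced from the SQLite database with a grouped query. This is a minimal sketch using Python's standard `sqlite3` module; the function name is hypothetical, and the path in the usage note assumes the repository layout described in this document.

```python
import sqlite3

TIERS = ("verified", "priced", "estimated", "listed")


def tier_distribution(conn: sqlite3.Connection) -> dict:
    """Count providers per listing_tier, reporting zero for tiers with no rows."""
    rows = conn.execute(
        "SELECT listing_tier, COUNT(*) FROM funeral_brand GROUP BY listing_tier"
    ).fetchall()
    counts = {tier: 0 for tier in TIERS}
    counts.update(dict(rows))
    return counts
```

Usage, assuming the seeded database is present: `tier_distribution(sqlite3.connect("database/providers.db"))`.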
### State distribution
| State | Providers | With Pricing |
|-------|-----------|-------------|
| VIC | 701 | 77 |
| NSW | 269 | 8 |
| QLD | 151 | 21 |
| SA | 85 | 1 |
| WA | 79 | 12 |
| TAS | 25 | 0 |
| NT | 7 | 0 |
| ACT | 9 | 0 |

Note: the state rows sum to 1,326 providers and 119 with pricing, fewer than the 1,463 total providers and the 121 priced or estimated providers. The remainder are not attributed to a state in this table.
---
## Database Schema Files
- **`database/schema.sql`** — Full Postgres schema (production-ready)
- **`database/schema_sqlite.sql`** — SQLite schema (dev/demo)
- **`database/providers.db`** — Live SQLite database with 1,463 providers
- **`database/seed_verified.sql`** — Script to mark imported CMS brands as verified
The schema is designed to be **additive** to the existing CMS schema from `schemas/cms-schema-spec.md`.
The original 12 verified brands and their packages/products should be imported first, then
`seed_verified.sql` marks them as `verified = true, listing_tier = 'verified'`.
---
## Verified Provider Upgrade Path
When an unverified provider applies to become verified:
1. They claim their listing (email verification or ABN match)
2. They fill in missing fields: description, logo, badge, brand colours
3. They configure packages with full inclusion breakdowns
4. They enable arrangement booking
5. Admin approves → `verified = true, listing_tier = 'verified'`
The backend should support this flow — updating an existing unverified brand
record rather than creating a new one.
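
A sketch of the approval gate in steps 2 to 5, updating the existing record in place. Brands and packages are modelled as plain dictionaries for illustration; the function names, the dictionary shape, and the rule that every package needs at least one inclusion are assumptions, while the required field names come from the schema tables in this document.

```python
from typing import List

# Fields the schema section marks as required or set for verified providers.
REQUIRED_FIELDS = ("description", "logo", "badge",
                   "backgroundColour", "foregroundColour")


def missing_for_verification(brand: dict, packages: List[dict]) -> List[str]:
    """List what still blocks approval; an empty list means ready for admin review."""
    missing = [name for name in REQUIRED_FIELDS if not brand.get(name)]
    if not packages:
        missing.append("packages")
    elif any(not pkg.get("inclusions") for pkg in packages):
        missing.append("package inclusions")
    return missing


def approve(brand: dict, packages: List[dict]) -> dict:
    """Return the same brand record marked verified; refuse if anything is missing."""
    missing = missing_for_verification(brand, packages)
    if missing:
        raise ValueError("cannot verify, missing: " + ", ".join(missing))
    # Same record (same id), upgraded rather than recreated.
    return {**brand, "verified": True, "listing_tier": "verified"}
```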

BIN
database/providers.db Normal file

Binary file not shown.

database/schema.sql Normal file

@@ -0,0 +1,285 @@
-- Provider Discovery Pipeline - Database Schema
-- Designed for Postgres. Compatible with SilverStripe CMS adaptation.
--
-- This schema covers the provider-facing tables needed for both
-- verified (signed-up) and unverified (auto-discovered) providers.
-- Product catalog tables (coffins, venues, etc.) are NOT included here —
-- those only apply to verified providers and live in the main CMS.
BEGIN;
-- ============================================================
-- ENUMS
-- ============================================================
CREATE TYPE enrichment_status AS ENUM ('pending', 'partial', 'complete', 'failed');
-- Listing tier determines how a provider appears on the platform.
-- Computed from data quality: verified status + packages + inclusions.
CREATE TYPE listing_tier AS ENUM (
'verified', -- Tier 1: Signed up, full branding, arrangements enabled
'priced', -- Tier 2: Unverified, 2+ packages with itemized inclusion prices
'estimated', -- Tier 3: Unverified, at least one total package price
'listed' -- Tier 4: Unverified, contact info only, no pricing
);
CREATE TYPE funeral_type_enum AS ENUM (
'Service & Cremation',
'Service & Burial',
'Cremation Only',
'Graveside Burial',
'Water Cremation'
);
-- ============================================================
-- 1. FUNERAL HOME (parent organisation)
-- ============================================================
CREATE TABLE funeral_home (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
website TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- ============================================================
-- 2. FUNERAL BRAND (customer-facing provider)
-- ============================================================
CREATE TABLE funeral_brand (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
description TEXT,
modal_description TEXT,
email TEXT,
phone TEXT,
website TEXT,
abn TEXT,
code TEXT UNIQUE, -- URL slug (e.g. "hparsons")
sort INTEGER DEFAULT 0,
hidden BOOLEAN NOT NULL DEFAULT TRUE, -- unverified start hidden
-- Address
business_address TEXT,
business_suburb TEXT,
business_state TEXT,
business_postcode TEXT,
-- Branding (nullable — unverified providers have no images)
background_colour TEXT,
foreground_colour TEXT,
-- Organisation
funeral_home_id INTEGER REFERENCES funeral_home(id) ON DELETE SET NULL,
-- Verified vs auto-discovered
verified BOOLEAN NOT NULL DEFAULT FALSE,
-- Provenance tracking
source_key TEXT UNIQUE, -- "{source}:{externalId}" for dedup
source_url TEXT, -- where this record was found
last_enriched_at TIMESTAMPTZ,
enrichment_status enrichment_status NOT NULL DEFAULT 'pending',
-- Listing tier (computed from data quality)
listing_tier listing_tier NOT NULL DEFAULT 'listed',
-- Funeral types offered (comma-separated IDs, same as existing CMS)
available_funeral_types TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Deduplication and lookup indexes
CREATE INDEX idx_brand_abn ON funeral_brand(abn) WHERE abn IS NOT NULL;
CREATE INDEX idx_brand_listing_tier ON funeral_brand(listing_tier);
CREATE INDEX idx_brand_source_key ON funeral_brand(source_key) WHERE source_key IS NOT NULL;
CREATE INDEX idx_brand_name_postcode ON funeral_brand(title, business_postcode);
CREATE INDEX idx_brand_verified ON funeral_brand(verified);
CREATE INDEX idx_brand_hidden ON funeral_brand(hidden);
CREATE INDEX idx_brand_enrichment ON funeral_brand(enrichment_status) WHERE verified = FALSE;
-- ============================================================
-- 3. LOCATION (physical office/chapel)
-- ============================================================
CREATE TABLE location (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL, -- display name (e.g. "Kingaroy, QLD")
address TEXT,
suburb TEXT,
state TEXT,
postcode TEXT,
country TEXT DEFAULT 'Australia',
lat DOUBLE PRECISION,
lng DOUBLE PRECISION,
rating REAL, -- Google rating 0-5
rating_num INTEGER, -- number of Google reviews
google_place_key TEXT, -- Google Places ID
brand_id INTEGER NOT NULL REFERENCES funeral_brand(id) ON DELETE CASCADE,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_location_brand ON location(brand_id);
CREATE INDEX idx_location_state ON location(state);
CREATE INDEX idx_location_postcode ON location(postcode);
CREATE INDEX idx_location_coords ON location(lat, lng);
CREATE INDEX idx_location_google ON location(google_place_key) WHERE google_place_key IS NOT NULL;
-- ============================================================
-- 4. FUNERAL AREA (service region)
-- ============================================================
CREATE TABLE funeral_area (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
code TEXT,
description TEXT,
postcodes TEXT, -- comma-separated postcode list
sort INTEGER DEFAULT 0,
hidden BOOLEAN DEFAULT FALSE,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Junction: brand <-> funeral_area
CREATE TABLE brand_funeral_area (
brand_id INTEGER NOT NULL REFERENCES funeral_brand(id) ON DELETE CASCADE,
funeral_area_id INTEGER NOT NULL REFERENCES funeral_area(id) ON DELETE CASCADE,
PRIMARY KEY (brand_id, funeral_area_id)
);
-- ============================================================
-- 5. PACKAGE (funeral plan bundle)
-- ============================================================
CREATE TABLE package (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
description TEXT,
sort INTEGER DEFAULT 0,
hidden BOOLEAN DEFAULT FALSE,
for_whom TEXT, -- 'myself' / 'someone' / null (both)
religion TEXT, -- comma-separated supported religions
funeral_type funeral_type_enum,
brand_id INTEGER NOT NULL REFERENCES funeral_brand(id) ON DELETE CASCADE,
-- Provenance (for AI-extracted packages)
source_url TEXT, -- page this was extracted from
extraction_confidence REAL, -- 0-1 confidence score from AI
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_package_brand ON package(brand_id);
CREATE INDEX idx_package_type ON package(funeral_type);
-- Junction: package <-> funeral_area
CREATE TABLE package_funeral_area (
package_id INTEGER NOT NULL REFERENCES package(id) ON DELETE CASCADE,
funeral_area_id INTEGER NOT NULL REFERENCES funeral_area(id) ON DELETE CASCADE,
PRIMARY KEY (package_id, funeral_area_id)
);
-- ============================================================
-- 6. PACKAGE INCLUSION (fee line item within a package)
-- ============================================================
CREATE TABLE package_inclusion (
id SERIAL PRIMARY KEY,
price NUMERIC(10,2) NOT NULL,
optional BOOLEAN NOT NULL DEFAULT FALSE,
complimentary BOOLEAN NOT NULL DEFAULT FALSE,
display BOOLEAN NOT NULL DEFAULT TRUE,
description TEXT,
sort INTEGER DEFAULT 0,
inclusion_type_title TEXT NOT NULL, -- category label (e.g. "Professional Service Fee")
package_id INTEGER NOT NULL REFERENCES package(id) ON DELETE CASCADE,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_inclusion_package ON package_inclusion(package_id);
-- ============================================================
-- 7. KNOWN FOR (feature badges on provider cards)
-- ============================================================
CREATE TABLE known_for (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
brand_id INTEGER NOT NULL REFERENCES funeral_brand(id) ON DELETE CASCADE
);
CREATE INDEX idx_known_for_brand ON known_for(brand_id);
-- ============================================================
-- 8. SOURCE LOG (audit trail of scrape runs)
-- ============================================================
CREATE TABLE source_log (
id SERIAL PRIMARY KEY,
source_name TEXT NOT NULL, -- 'vic_register', 'gathered_here', 'nfda', 'funerals_australia'
run_started_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
run_finished_at TIMESTAMPTZ,
records_found INTEGER DEFAULT 0,
records_new INTEGER DEFAULT 0,
records_updated INTEGER DEFAULT 0,
records_skipped INTEGER DEFAULT 0,
status TEXT DEFAULT 'running', -- 'running', 'completed', 'failed'
error_message TEXT,
metadata JSONB -- any extra run info
);
-- ============================================================
-- 9. SOURCE RECORD (raw scraped data, kept for audit)
-- ============================================================
CREATE TABLE source_record (
id SERIAL PRIMARY KEY,
source_name TEXT NOT NULL,
source_id TEXT NOT NULL, -- external ID from the source
source_url TEXT,
raw_data JSONB NOT NULL, -- original scraped data
normalized_data JSONB, -- mapped to intermediate format
matched_brand_id INTEGER REFERENCES funeral_brand(id) ON DELETE SET NULL,
match_type TEXT, -- 'source_key', 'abn', 'name_postcode', 'fuzzy', 'new'
processed_at TIMESTAMPTZ,
log_id INTEGER REFERENCES source_log(id) ON DELETE SET NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE(source_name, source_id)
);
CREATE INDEX idx_source_record_source ON source_record(source_name, source_id);
CREATE INDEX idx_source_record_brand ON source_record(matched_brand_id) WHERE matched_brand_id IS NOT NULL;
-- ============================================================
-- UPDATED_AT TRIGGER
-- ============================================================
CREATE OR REPLACE FUNCTION update_updated_at()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_funeral_home_updated BEFORE UPDATE ON funeral_home FOR EACH ROW EXECUTE FUNCTION update_updated_at();
CREATE TRIGGER trg_funeral_brand_updated BEFORE UPDATE ON funeral_brand FOR EACH ROW EXECUTE FUNCTION update_updated_at();
CREATE TRIGGER trg_location_updated BEFORE UPDATE ON location FOR EACH ROW EXECUTE FUNCTION update_updated_at();
CREATE TRIGGER trg_funeral_area_updated BEFORE UPDATE ON funeral_area FOR EACH ROW EXECUTE FUNCTION update_updated_at();
CREATE TRIGGER trg_package_updated BEFORE UPDATE ON package FOR EACH ROW EXECUTE FUNCTION update_updated_at();
CREATE TRIGGER trg_package_inclusion_updated BEFORE UPDATE ON package_inclusion FOR EACH ROW EXECUTE FUNCTION update_updated_at();
COMMIT;

database/schema_sqlite.sql Normal file

@@ -0,0 +1,221 @@
-- Provider Discovery Pipeline - SQLite Schema (for local dev/testing)
-- Production uses Postgres (see schema.sql)
-- Note: SQLite enforces the REFERENCES ... ON DELETE rules below only when
-- foreign keys are enabled on each connection: PRAGMA foreign_keys = ON;
-- Unlike schema.sql, this file defines no updated_at triggers, so callers
-- must set updated_at themselves when updating rows.
-- ============================================================
-- FUNERAL HOME
-- ============================================================
CREATE TABLE IF NOT EXISTS funeral_home (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
website TEXT,
created_at TEXT NOT NULL DEFAULT (datetime('now')),
updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);
-- ============================================================
-- FUNERAL BRAND
-- ============================================================
CREATE TABLE IF NOT EXISTS funeral_brand (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
description TEXT,
modal_description TEXT,
email TEXT,
phone TEXT,
website TEXT,
abn TEXT,
code TEXT UNIQUE,
sort INTEGER DEFAULT 0,
hidden INTEGER NOT NULL DEFAULT 1,
business_address TEXT,
business_suburb TEXT,
business_state TEXT,
business_postcode TEXT,
background_colour TEXT,
foreground_colour TEXT,
funeral_home_id INTEGER REFERENCES funeral_home(id) ON DELETE SET NULL,
verified INTEGER NOT NULL DEFAULT 0,
source_key TEXT UNIQUE,
source_url TEXT,
last_enriched_at TEXT,
enrichment_status TEXT NOT NULL DEFAULT 'pending' CHECK(enrichment_status IN ('pending','partial','complete','failed')),
-- Listing tier: verified | priced | estimated | listed
listing_tier TEXT NOT NULL DEFAULT 'listed'
CHECK(listing_tier IN ('verified','priced','estimated','listed')),
available_funeral_types TEXT,
created_at TEXT NOT NULL DEFAULT (datetime('now')),
updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE INDEX IF NOT EXISTS idx_brand_abn ON funeral_brand(abn);
CREATE INDEX IF NOT EXISTS idx_brand_source_key ON funeral_brand(source_key);
CREATE INDEX IF NOT EXISTS idx_brand_listing_tier ON funeral_brand(listing_tier);
CREATE INDEX IF NOT EXISTS idx_brand_name_postcode ON funeral_brand(title, business_postcode);
CREATE INDEX IF NOT EXISTS idx_brand_verified ON funeral_brand(verified);
CREATE INDEX IF NOT EXISTS idx_brand_hidden ON funeral_brand(hidden);
-- ============================================================
-- LOCATION
-- ============================================================
CREATE TABLE IF NOT EXISTS location (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
address TEXT,
suburb TEXT,
state TEXT,
postcode TEXT,
country TEXT DEFAULT 'Australia',
lat REAL,
lng REAL,
rating REAL,
rating_num INTEGER,
google_place_key TEXT,
brand_id INTEGER NOT NULL REFERENCES funeral_brand(id) ON DELETE CASCADE,
created_at TEXT NOT NULL DEFAULT (datetime('now')),
updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE INDEX IF NOT EXISTS idx_location_brand ON location(brand_id);
CREATE INDEX IF NOT EXISTS idx_location_postcode ON location(postcode);
-- ============================================================
-- FUNERAL AREA
-- ============================================================
CREATE TABLE IF NOT EXISTS funeral_area (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
code TEXT,
description TEXT,
postcodes TEXT,
sort INTEGER DEFAULT 0,
hidden INTEGER DEFAULT 0,
created_at TEXT NOT NULL DEFAULT (datetime('now')),
updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE TABLE IF NOT EXISTS brand_funeral_area (
brand_id INTEGER NOT NULL REFERENCES funeral_brand(id) ON DELETE CASCADE,
funeral_area_id INTEGER NOT NULL REFERENCES funeral_area(id) ON DELETE CASCADE,
PRIMARY KEY (brand_id, funeral_area_id)
);
-- ============================================================
-- PACKAGE
-- ============================================================
CREATE TABLE IF NOT EXISTS package (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
description TEXT,
sort INTEGER DEFAULT 0,
hidden INTEGER DEFAULT 0,
for_whom TEXT,
religion TEXT,
funeral_type TEXT CHECK(funeral_type IN (
'Service & Cremation','Service & Burial','Cremation Only',
'Graveside Burial','Water Cremation'
)),
brand_id INTEGER NOT NULL REFERENCES funeral_brand(id) ON DELETE CASCADE,
source_url TEXT,
extraction_confidence REAL,
created_at TEXT NOT NULL DEFAULT (datetime('now')),
updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE INDEX IF NOT EXISTS idx_package_brand ON package(brand_id);
CREATE TABLE IF NOT EXISTS package_funeral_area (
package_id INTEGER NOT NULL REFERENCES package(id) ON DELETE CASCADE,
funeral_area_id INTEGER NOT NULL REFERENCES funeral_area(id) ON DELETE CASCADE,
PRIMARY KEY (package_id, funeral_area_id)
);
-- ============================================================
-- PACKAGE INCLUSION
-- ============================================================
CREATE TABLE IF NOT EXISTS package_inclusion (
id INTEGER PRIMARY KEY AUTOINCREMENT,
price REAL NOT NULL,
optional INTEGER NOT NULL DEFAULT 0,
complimentary INTEGER NOT NULL DEFAULT 0,
display INTEGER NOT NULL DEFAULT 1,
description TEXT,
sort INTEGER DEFAULT 0,
inclusion_type_title TEXT NOT NULL,
package_id INTEGER NOT NULL REFERENCES package(id) ON DELETE CASCADE,
created_at TEXT NOT NULL DEFAULT (datetime('now')),
updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE INDEX IF NOT EXISTS idx_inclusion_package ON package_inclusion(package_id);
-- ============================================================
-- KNOWN FOR
-- ============================================================
CREATE TABLE IF NOT EXISTS known_for (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
brand_id INTEGER NOT NULL REFERENCES funeral_brand(id) ON DELETE CASCADE
);
CREATE INDEX IF NOT EXISTS idx_known_for_brand ON known_for(brand_id);
-- ============================================================
-- SOURCE LOG
-- ============================================================
CREATE TABLE IF NOT EXISTS source_log (
id INTEGER PRIMARY KEY AUTOINCREMENT,
source_name TEXT NOT NULL,
run_started_at TEXT NOT NULL DEFAULT (datetime('now')),
run_finished_at TEXT,
records_found INTEGER DEFAULT 0,
records_new INTEGER DEFAULT 0,
records_updated INTEGER DEFAULT 0,
records_skipped INTEGER DEFAULT 0,
status TEXT DEFAULT 'running',
error_message TEXT,
metadata TEXT -- JSON string
);
-- ============================================================
-- SOURCE RECORD
-- ============================================================
CREATE TABLE IF NOT EXISTS source_record (
id INTEGER PRIMARY KEY AUTOINCREMENT,
source_name TEXT NOT NULL,
source_id TEXT NOT NULL,
source_url TEXT,
raw_data TEXT NOT NULL, -- JSON string
normalized_data TEXT, -- JSON string
matched_brand_id INTEGER REFERENCES funeral_brand(id) ON DELETE SET NULL,
match_type TEXT,
processed_at TEXT,
log_id INTEGER REFERENCES source_log(id) ON DELETE SET NULL,
created_at TEXT NOT NULL DEFAULT (datetime('now')),
UNIQUE(source_name, source_id)
);
CREATE INDEX IF NOT EXISTS idx_source_record_source ON source_record(source_name, source_id);


@@ -0,0 +1,24 @@
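-- Seed script: mark existing brands as verified
-- Run after importing existing CMS data into the new schema.
-- Written for Postgres (schema.sql). NOW() is not available in SQLite,
-- where datetime('now') is the equivalent.
--
-- This updates all pre-existing brands (imported from brands-full.json)
-- to verified=true, hidden=false, enrichment_status='complete',
-- listing_tier='verified'.
--
-- Caveat: brands are selected by source_key IS NULL, which assumes every
-- crawler-discovered row has a source_key. A verified brand that the crawler
-- already inserted carries a source_key and will not be matched here, so it
-- must be upgraded separately (for example by code, as in the alternative below).
UPDATE funeral_brand
SET verified = TRUE,
    hidden = FALSE,
    enrichment_status = 'complete',
    listing_tier = 'verified',
    updated_at = NOW()
WHERE id IN (
    -- IDs from the existing 12 brands in brands-full.json
    -- These will be populated during the initial CMS data import.
    -- Update this list to match actual imported IDs.
    SELECT id FROM funeral_brand WHERE source_key IS NULL
);
-- Alternatively, if importing with known codes:
-- UPDATE funeral_brand SET verified = TRUE, hidden = FALSE, enrichment_status = 'complete',
--        listing_tier = 'verified'
-- WHERE code IN ('hparsons', 'parsons-ladies', 'rankins', 'killick', 'botanical',
--                'easy', 'wollongong-city', 'kenneallys', 'lady-anne',
--                'mackay', 'mannings', 'guardian');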