Python crawlers for VIC Register, Funerals Australia, NFDA n8n workflows for scheduled discovery and enrichment SQLite schema and seeded dev database (1,463 providers) End-to-end process documentation in n8n/PROCESS.md
210 lines
8.0 KiB
Markdown
210 lines
8.0 KiB
Markdown
# Provider Data Model — Verified & Unverified Providers
|
|
|
|
This document extends the CMS schema (`schemas/cms-schema-spec.md`) with support for
|
|
unverified (auto-discovered) providers alongside the existing verified (signed-up) providers.
|
|
|
|
---
|
|
|
|
## Overview
|
|
|
|
The platform lists funeral directors in two categories:
|
|
|
|
- **Verified providers** — Signed up to the platform. Full branding (logo, badge, colours),
|
|
complete package configuration, and online arrangement booking enabled.
|
|
- **Unverified providers** — Auto-discovered from public registries and their own websites.
|
|
Listed with whatever public information is available. Can apply to become verified.
|
|
|
|
All providers share the same `funeral_brand` table and schema. The difference is driven
|
|
by data completeness and the `verified` / `listing_tier` fields.
|
|
|
|
---
|
|
|
|
## Schema Changes to FuneralBrand
|
|
|
|
These fields are **added** to the existing FuneralBrand collection from `cms-schema-spec.md`:
|
|
|
|
| Field | Type | Default | Purpose |
|
|
|-------|------|---------|---------|
|
|
| `verified` | Boolean | `false` | `true` for signed-up partners, `false` for auto-discovered |
|
|
| `listing_tier` | Enum | `'listed'` | Display tier, computed from data quality (see below) |
|
|
| `hidden` | Boolean | `true` | Unverified providers start hidden until admin-reviewed |
|
|
| `source_key` | String (unique) | `null` | Provenance identifier, e.g. `"nfda:1234"` |
|
|
| `source_url` | String (URL) | `null` | Where this record was discovered |
|
|
| `last_enriched_at` | DateTime | `null` | When data was last refreshed from provider's website |
|
|
| `enrichment_status` | Enum | `'pending'` | `pending` / `partial` / `complete` / `failed` |
|
|
|
|
### Fields that become optional for unverified providers
|
|
|
|
These fields are **required** for verified providers but **nullable** for unverified:
|
|
|
|
| Field | Verified | Unverified |
|
|
|-------|----------|------------|
|
|
| `logo` | Required (brand logo image) | `null` — no images until they sign up |
|
|
| `badge` | Required (card badge image) | `null` — no images until they sign up |
|
|
| `description` | Required | Optional (extracted from their website if available) |
|
|
| `backgroundColour` | Set (brand theme) | `null` — use platform default |
|
|
| `foregroundColour` | Set (brand theme) | `null` — use platform default |
|
|
| `modalDescription` | Set | `null` |
|
|
| `code` | Set (URL slug) | Auto-generated from business name |
|
|
|
|
### Fields present for both verified and unverified
|
|
|
|
| Field | Notes |
|
|
|-------|-------|
|
|
| `title` | Business name (always present) |
|
|
| `phone` | Contact phone (present for ~94% of providers) |
|
|
| `email` | Contact email (present for ~66%) |
|
|
| `website` | External website URL (present for ~68%) |
|
|
| `abn` | Australian Business Number (strongest dedup key) |
|
|
| `businessAddress/Suburb/State/Postcode` | Business location |
|
|
| `availableFuneralTypes` | Comma-separated funeral type IDs |
|
|
|
|
---
|
|
|
|
## Listing Tiers
|
|
|
|
Every provider is assigned a `listing_tier` that determines how they appear on the platform.
|
|
The tier is **computed from data quality** — specifically from what package/pricing data exists.
|
|
|
|
| Tier | Value | Criteria | UI Treatment |
|
|
|------|-------|----------|-------------|
|
|
| **Verified** | `'verified'` | `verified = true` | Full branding, package selection, online arrangements, custom images |
|
|
| **Priced** | `'priced'` | Unverified + 2 or more packages with itemized inclusion prices | Show packages with line-item breakdowns, no arrangements |
|
|
| **Estimated** | `'estimated'` | Unverified + at least 1 package with a total price | Show package prices, "Contact for full details" on breakdowns |
|
|
| **Listed** | `'listed'` | Unverified + no pricing data | Show contact info only, "Contact for pricing" CTA |
|
|
|
|
### Tier computation logic
|
|
|
|
```
|
|
if brand.verified:
|
|
tier = 'verified'
|
|
elif brand has 2+ packages, each with 2+ priced inclusions:
|
|
tier = 'priced'
|
|
elif brand has 1+ packages with any price:
|
|
tier = 'estimated'
|
|
else:
|
|
tier = 'listed'
|
|
```
|
|
|
|
### Upgrade incentive
|
|
|
|
Each tier below verified creates a natural CTA for the provider:
|
|
- `listed` → "Publish your pricing to help families compare"
|
|
- `estimated` → "Add detailed breakdowns to stand out"
|
|
- `priced` → "Sign up to enable online arrangements and add your branding"
|
|
|
|
---
|
|
|
|
## Data Relationships (unchanged from CMS spec, but applied to both tiers)
|
|
|
|
```
|
|
FuneralBrand (verified or unverified)
|
|
├── Location[] (physical offices — at least 1 per provider)
|
|
├── Package[] (funeral plan bundles — 0 for 'listed' tier)
|
|
│ └── PackageInclusion[] (fee line items — 0 for 'estimated' tier)
|
|
├── KnownFor[] (feature badges — verified only typically)
|
|
└── FuneralArea[] (service regions — M:N)
|
|
```
|
|
|
|
### Package (same schema as CMS spec, with additions)
|
|
|
|
| Field | Type | Notes |
|
|
|-------|------|-------|
|
|
| `id` | PK | |
|
|
| `title` | String | e.g. "Direct Cremation", "Chapel Service" |
|
|
| `description` | Text | What's included |
|
|
| `funeral_type` | Enum | `Service & Cremation`, `Service & Burial`, `Cremation Only`, `Graveside Burial`, `Water Cremation` |
|
|
| `brand_id` | FK → FuneralBrand | |
|
|
| `source_url` | String | Where this pricing was found (provider's website) |
|
|
| `extraction_confidence` | Float 0-1 | How reliable the extracted data is (0.7 = HTML, 0.6 = PDF) |
|
|
| `sort` | Integer | Display order |
|
|
| `hidden` | Boolean | |
|
|
|
|
### PackageInclusion (same schema as CMS spec)
|
|
|
|
| Field | Type | Notes |
|
|
|-------|------|-------|
|
|
| `id` | PK | |
|
|
| `price` | Decimal | Dollar amount |
|
|
| `optional` | Boolean | User can opt in/out |
|
|
| `complimentary` | Boolean | Included free |
|
|
| `display` | Boolean | Whether shown to user |
|
|
| `inclusion_type_title` | String | Category label (see standard types below) |
|
|
| `package_id` | FK → Package | |
|
|
|
|
### Standard inclusion type names
|
|
|
|
These are the consistent labels used across all providers:
|
|
|
|
**Standard fees:** Professional Service Fee, Transportation Service Fee, Professional Mortuary Care, Death Registration Certificate, Cremation Certificate/Permit, Government Levy, Accommodation
|
|
|
|
**Products:** Coffin, Cremation Fee, Cemetery Fee, Celebrant Fee
|
|
|
|
**Optional extras:** Saturday Service Fee, Twilight Service Surcharge, Viewing Fee, After Hours Transfer Surcharge, Dressing Fee, Embalming, Digital Recording, Webstreaming, Coffin Bearing by Funeral Directors
|
|
|
|
---
|
|
|
|
## Current Data
|
|
|
|
The database (`database/providers.db`, SQLite) contains:
|
|
|
|
| Metric | Count |
|
|
|--------|-------|
|
|
| Total providers | 1,463 |
|
|
| With phone | 1,380 (94%) |
|
|
| With email | 972 (66%) |
|
|
| With website | 994 (68%) |
|
|
| With description | 618 (42%) |
|
|
| Total packages | 416 |
|
|
| Total inclusions | 388 |
|
|
|
|
### Tier distribution
|
|
|
|
| Tier | Providers |
|
|
|------|-----------|
|
|
| Verified | 0 (existing 12 brands not yet imported as verified) |
|
|
| Priced | 10 |
|
|
| Estimated | 111 |
|
|
| Listed | 1,342 |
|
|
|
|
### State distribution
|
|
|
|
| State | Providers | With Pricing |
|
|
|-------|-----------|-------------|
|
|
| VIC | 701 | 77 |
|
|
| NSW | 269 | 8 |
|
|
| QLD | 151 | 21 |
|
|
| SA | 85 | 1 |
|
|
| WA | 79 | 12 |
|
|
| TAS | 25 | 0 |
|
|
| NT | 7 | 0 |
|
|
| ACT | 9 | 0 |
|
|
|
|
---
|
|
|
|
## Database Schema Files
|
|
|
|
- **`database/schema.sql`** — Full Postgres schema (production-ready)
|
|
- **`database/schema_sqlite.sql`** — SQLite schema (dev/demo)
|
|
- **`database/providers.db`** — Live SQLite database with 1,463 providers
|
|
- **`database/seed_verified.sql`** — Script to mark imported CMS brands as verified
|
|
|
|
The schema is designed to be **additive** to the existing CMS schema from `schemas/cms-schema-spec.md`.
|
|
The original 12 verified brands and their packages/products should be imported first, then
|
|
`seed_verified.sql` marks them as `verified = true, listing_tier = 'verified'`.
|
|
|
|
---
|
|
|
|
## Verified Provider Upgrade Path
|
|
|
|
When an unverified provider applies to become verified:
|
|
|
|
1. They claim their listing (email verification or ABN match)
|
|
2. They fill in missing fields: description, logo, badge, brand colours
|
|
3. They configure packages with full inclusion breakdowns
|
|
4. They enable arrangement booking
|
|
5. Admin approves → `verified = true, listing_tier = 'verified'`
|
|
|
|
The backend should support this flow — updating an existing unverified brand
|
|
record rather than creating a new one.
|