Deleted
CLAUDE.md
@@ -1,35 +0,0 @@
# Claude Code orientation
You've been handed a funeral-provider discovery pipeline. Before doing anything:
1. Read `README.md` for the repo layout.
2. Read `n8n/PROCESS.md` for the end-to-end flow and how data conforms to the DB schema. **This is the authoritative doc.**
3. Read `crawlers/PIPELINE.md` for Python module internals.
## Project shape
- `crawlers/` — Python modules, one per data source. Invoked either by `run_overnight.sh` (manual) or by n8n workflows via `executeCommand`.
- `n8n/workflows/*.json` — four scheduled workflows that drive the pipeline end-to-end.
- `database/providers.db` — live SQLite snapshot (~1,463 providers, 121 with pricing). Safe to inspect; re-creatable from `schema_sqlite.sql`.
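
Since the snapshot is described as safe to inspect, a read-only connection is a cautious way to look around. The sketch below is only an illustration, not part of the repo: the helper name `summarize_tables` is hypothetical, and it discovers table names from `sqlite_master` rather than assuming any schema.

```python
import sqlite3
from pathlib import Path


def summarize_tables(db_path: str) -> dict[str, int]:
    """Return {table_name: row_count} for every user table, opened read-only."""
    # mode=ro makes SQLite refuse writes, so inspection cannot alter the snapshot.
    uri = f"{Path(db_path).resolve().as_uri()}?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
            )
        ]
        # Names come straight from sqlite_master, so double-quoting them is enough.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()
```

Calling `summarize_tables("database/providers.db")` from the repo root should list each table with its row count, which can be compared against the provider figures quoted above.
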
## Key constraints
- **Never write to `funeral_brand.verified` or `funeral_brand.hidden`** — those are admin-only. The pipeline keeps providers hidden and unverified until a human reviews them.
- **Do not use Gathered Here data as a source of truth.** It's a competitor. `crawl_gathered_here.py` exists as historical tooling but isn't part of the active pipeline — all enrichment comes from providers' own websites or regulatory disclosure PDFs.
- **Listing tier is computed, not stored as the source of truth.** `compute_tiers.py` derives it from package/inclusion data. Don't set it manually.
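
One way to make the first constraint hard to violate by accident is to route `funeral_brand` updates through a small guard. This is a sketch under assumptions, not code from the repo: the helper `build_brand_update`, the `id` key column, and the `website` column used in the usage note are all hypothetical. Only the table name and the two protected column names come from the text above.

```python
# Admin-only columns on funeral_brand, per the constraint above.
PROTECTED_COLUMNS = frozenset({"verified", "hidden"})


def build_brand_update(brand_id: int, changes: dict[str, object]) -> tuple[str, list[object]]:
    """Build a parameterized UPDATE for funeral_brand, refusing admin-only columns.

    Assumption: rows are keyed by an integer `id` column (not confirmed by the docs).
    """
    blocked = PROTECTED_COLUMNS.intersection(changes)
    if blocked:
        raise ValueError(f"refusing to write admin-only columns: {sorted(blocked)}")
    if not changes:
        raise ValueError("no columns to update")
    # Column names are quoted; values always go through placeholders.
    assignments = ", ".join(f'"{column}" = ?' for column in changes)
    sql = f"UPDATE funeral_brand SET {assignments} WHERE id = ?"
    return sql, [*changes.values(), brand_id]
```

For example, `build_brand_update(7, {"website": "https://example.com"})` returns a statement and parameters ready for `cursor.execute`, while passing `{"verified": 1}` raises `ValueError` before any SQL is built.
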
## Running locally
Website discovery needs a Serper API key (free 2,500/mo at serper.dev), and AI pricing extraction in Workflow 3 needs an Anthropic key. Everything else runs without keys.
```
cp crawlers/config.example.json crawlers/config.json
# add keys to config.json
cd crawlers && ./run_overnight.sh
```
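
Before kicking off a long run, it can help to confirm that `config.json` was actually edited after the copy. The sketch below is a hypothetical helper, not part of the repo. It assumes both files are flat JSON objects and makes no assumption about key names: it flags any key that is missing from `config.json` or still identical to its value in `config.example.json`.

```python
import json
from pathlib import Path


def unfilled_keys(example: dict, config: dict) -> list[str]:
    """Return top-level keys missing from config or still equal to the example value."""
    return sorted(
        key
        for key, placeholder in example.items()
        if key not in config or config[key] == placeholder
    )


def check_config(config_dir: str = "crawlers") -> list[str]:
    """Load both JSON files from config_dir and report keys that may need attention."""
    base = Path(config_dir)
    example = json.loads((base / "config.example.json").read_text(encoding="utf-8"))
    config = json.loads((base / "config.json").read_text(encoding="utf-8"))
    return unfilled_keys(example, config)
```

Settings that legitimately keep their example value are listed too, so the result of `check_config()` is a hint about what to review rather than a strict validation.
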
## Things that aren't here
- No live secrets / API keys — `crawlers/config.json` is gitignored, use `config.example.json` as a template.
- No admin review UI — that's a separate frontend project.
- No Postgres migration tooling — `database/schema.sql` is the target, but the repo uses SQLite for dev.