Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough: Adds a new Markdown project plan for an `apps/ensdb-cli` tool.
Sequence Diagram(s):

```mermaid
sequenceDiagram
    participant User
    participant CLI as ENSdb-CLI
    participant DB as Postgres
    participant FS as Local FS
    participant S3 as S3/Bucket
    User->>CLI: run `snapshot create` (--jobs, --bucket)
    CLI->>DB: pg_dump --format=directory (per-schema, parallel)
    DB-->>FS: write schema dump directories
    CLI->>FS: compress per-schema dirs -> `.dump.tar.zst`, write `manifest.json` + `checksums.sha256`
    CLI->>S3: upload manifest + archives + shared dumps
    S3-->>CLI: upload confirmations
    CLI-->>User: return snapshot ID / success
```
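The compress/manifest/checksum step in the diagram can be sketched in shell. This is a minimal sketch, not the CLI's implementation: a fake dump directory stands in for real `pg_dump --format=directory` output (no database is assumed), gzip stands in for zstd in case `zstd` is not installed, and all file names are illustrative.

```shell
# Sketch of the archive + checksum + manifest step from the diagram above.
# A fake dump directory stands in for real pg_dump --format=directory output,
# and gzip stands in for zstd; names are illustrative.
set -eu
WORK="$(mktemp -d)"
cd "$WORK"

# Stand-in for: pg_dump --format=directory --file=mainnetSchema1.9.0.dump ...
mkdir -p mainnetSchema1.9.0.dump
printf 'toc\n' > mainnetSchema1.9.0.dump/toc.dat

# Archive the per-schema dump directory (the plan uses .dump.tar.zst).
tar -czf mainnetSchema1.9.0.dump.tar.gz mainnetSchema1.9.0.dump

# Record checksums so a later pull can verify integrity after download.
sha256sum mainnetSchema1.9.0.dump.tar.gz > checksums.sha256
sha256sum -c checksums.sha256

# Minimal manifest; the real one would carry schema names, sizes, versions.
cat > manifest.json <<EOF
{ "schemas": ["mainnetSchema1.9.0"], "createdAt": "$(date -u +%Y-%m-%dT%H:%M:%SZ)" }
EOF
```

Upload then amounts to pushing `manifest.json`, `checksums.sha256`, and the archives under one snapshot prefix.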
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Pull request overview
Adds a planning document for a proposed apps/ensdb-cli tool to support ENSDb inspection, schema management, and snapshot create/restore/push/pull workflows (including S3-compatible storage) for large PostgreSQL databases.
Changes:
- Introduces a detailed implementation plan covering CLI commands, snapshot format, S3 layout, and phased delivery.
- Documents snapshot composition (indexer schemas + `ponder_sync` + `ensnode.metadata`) and restore safety constraints.
- Proposes manifest structure and how to derive metadata (e.g., `ensindexer_public_config`) from `ensnode.metadata`.
> ## Context
> ENSNode production databases are 50-100GB PostgreSQL instances. Each chain deployment gets its own indexer schema following the naming convention `{deployment}Schema{version}`. Three schema types coexist in one database:
The schema naming convention here is documented as {deployment}Schema{version}, but the blue/green deploy workflow actually sets schema names using the Docker image tag input (e.g. alphaSchema${TAG}), which may not always be a semver version string. Consider updating the wording to {deployment}Schema{tag} (or explicitly “Docker image tag”) to match the workflow behavior.
Suggested change:
> ENSNode production databases are 50-100GB PostgreSQL instances. Each chain deployment gets its own indexer schema following the naming convention `{deployment}Schema{tag}`, where `tag` is the Docker image tag used by the blue-green deploy workflow. Three schema types coexist in one database:
> - `ensnode` -- metadata table (rows scoped by `ens_indexer_schema_name`)
> - `ponder_sync` -- shared RPC cache and sync state (needed by every indexer)
> Schema names are set via `ENSINDEXER_SCHEMA_NAME` env var in the blue-green deploy workflow (`[.github/workflows/deploy_ensnode_blue_green.yml](.github/workflows/deploy_ensnode_blue_green.yml)`). Old schemas are orphaned on redeploy and must be dropped manually to reclaim space.
This markdown link is likely broken because the plan file lives under .cursor/plans/, so ( .github/workflows/... ) will resolve relative to that directory on GitHub. Use a repo-root absolute link (e.g. /.github/workflows/deploy_ensnode_blue_green.yml) or a correct relative path (e.g. ../../.github/workflows/...).
Suggested change:
> Schema names are set via `ENSINDEXER_SCHEMA_NAME` env var in the blue-green deploy workflow (`[.github/workflows/deploy_ensnode_blue_green.yml](/.github/workflows/deploy_ensnode_blue_green.yml)`). Old schemas are orphaned on redeploy and must be dropped manually to reclaim space.
> ### Snapshot Format
> Use `**pg_dump --format=directory`** with `**--jobs=N`** for parallel dump/restore. This is the only format that supports parallelism, which is critical for 50-100GB databases. Each directory-format dump is then archived as a `**.dump.tar.zst`** artifact for storage and transfer, and unpacked to a temporary directory before restore.
Inline code formatting is broken here due to mixing backticks and bold markers (e.g. `**pg_dump ...`**). Prefer either code formatting (backticks) or bold, but not both, so the command renders correctly in markdown.
Suggested change:
> Use `pg_dump --format=directory` with `--jobs=N` for parallel dump/restore. This is the only format that supports parallelism, which is critical for 50-100GB databases. Each directory-format dump is then archived as a `.dump.tar.zst` artifact for storage and transfer, and unpacked to a temporary directory before restore.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.cursor/plans/ensdb_cli_tool_422abf99.plan.md:
- Line 85: The plan doc is failing markdown lint (MD037/MD040): fix spacing
inside emphasis markers (remove spaces so examples like **pg_dump
--format=directory** and **--jobs=N** use contiguous markers) and add language
identifiers to fenced code blocks (e.g., use ```text, ```bash as appropriate)
for the code sections such as the manifest listing, command blocks like
ensdb-cli inspect / schema drop / snapshot create, and the sample
directory/package listings; apply these exact changes throughout the file
(.cursor/plans/ensdb_cli_tool_422abf99.plan.md) where the reviewer noted blocks
(97-105, 123-129, 133-136, 140-163, 226-230, 234-257) so MD037/MD040 are
resolved.
- Line 60: The plan assumes a required schema named "ponder_sync" but the
snapshot metadata only lists "ensnode.metadata" (0000_snapshot.json), so change
snapshot/restore logic to be capability-driven: detect presence of the
"ponder_sync" schema at runtime and only include it in pg_dump/restore and
safety checks when discovered; update the code that builds schema lists (where
it currently hardcodes "ponder_sync") to query the database for existing shared
schemas and merge those into the snapshot manifest (alongside
"ensnode.metadata") and adjust restore constraint checks to validate against the
actual detected shared state rather than a hardcoded set.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: e9582e0e-060a-4409-8165-b63195afb37f
📒 Files selected for processing (1)
.cursor/plans/ensdb_cli_tool_422abf99.plan.md
♻️ Duplicate comments (3)
.cursor/plans/ensdb_cli_tool_422abf99.plan.md (3)
70-75: ⚠️ Potential issue | 🟠 Major — Make shared-schema handling capability-driven instead of hardcoded `ponder_sync`.
The plan currently treats `ponder_sync` as always present/required across create/pull/restore/manifest semantics. That can break on deployments where this schema is absent. Define runtime discovery (`information_schema`/`pg_namespace`) and include shared schemas conditionally in dump, manifest, and restore safety checks.

```bash
#!/bin/bash
set -euo pipefail
# Verify what schemas are defined in the ENSDb migration snapshot metadata.
SNAPSHOT_FILE="$(fd -a '0000_snapshot.json' | head -n 1)"
if [ -z "${SNAPSHOT_FILE}" ]; then
  echo "Could not find 0000_snapshot.json in repository"
  exit 1
fi
echo "Using snapshot file: ${SNAPSHOT_FILE}"
rg -n '"ponder_sync"|ensnode\.metadata|ensnode|schemas' "${SNAPSHOT_FILE}"
```

Also applies to: 108-109, 142-147, 205-208, 295-296
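The capability-driven selection asked for above can be sketched as a pure function over the discovered schema list. This is a sketch under assumptions: in the real CLI the list would come from a catalog query against `pg_namespace`, and the function name and matching patterns here are illustrative, not the plan's actual API.

```shell
# Sketch: build the dump set from discovered schemas, including shared
# schemas like ponder_sync only when actually present. In the CLI the
# discovered list would come from a catalog query such as:
#   SELECT nspname FROM pg_namespace
#   WHERE nspname NOT LIKE 'pg_%' AND nspname <> 'information_schema';
schemas_to_dump() {
  discovered="$1"   # whitespace-separated schema names
  wanted=""
  for s in $discovered; do
    case "$s" in
      ponder_sync|ensnode) wanted="$wanted $s" ;;  # shared: only if discovered
      *Schema*)            wanted="$wanted $s" ;;  # indexer schemas by convention
    esac
  done
  echo "$wanted"
}

schemas_to_dump "mainnetSchema1.9.0 ponder_sync ensnode"
schemas_to_dump "sepoliaSchema1.8.0"   # ponder_sync absent: not included
```

The same discovered set would then feed the manifest and the restore safety checks, so nothing is ever assumed present.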
81-82: ⚠️ Potential issue | 🟠 Major — Specify concrete preflight enforcement for "fresh/isolated DB only" restores.
The policy is documented, but enforcement criteria are not. Add explicit checks (what tables/schemas must be absent, what command fails with which message, and how `--drop-existing` changes behavior) so implementation cannot drift into unsafe partial restores.

Also applies to: 145-149, 296-297
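One possible shape for the preflight enforcement requested above, sketched as a shell function over the list of existing non-system schemas (which the CLI would obtain from a catalog query). The `--drop-existing` semantics and the messages here are assumptions drawn from this comment, not settled behavior.

```shell
# Sketch of a restore preflight: refuse to restore into a non-fresh database
# unless --drop-existing was passed. `existing` would come from a catalog
# query in the real CLI; messages and flag semantics are assumptions.
preflight_restore() {
  existing="$1"        # whitespace-separated non-system schemas in target DB
  drop_existing="$2"   # "yes" or "no"
  if [ -z "$existing" ]; then
    echo "preflight ok: target database is fresh"
    return 0
  fi
  if [ "$drop_existing" = "yes" ]; then
    echo "dropping existing schemas before restore: $existing"
    return 0
  fi
  echo "restore aborted: target DB contains existing schemas ($existing); use --drop-existing to overwrite" >&2
  return 1
}

preflight_restore "" no                                      # fresh DB: proceeds
preflight_restore "ponder_sync ensnode" no || echo "blocked"  # non-fresh: non-zero exit
preflight_restore "ponder_sync" yes                          # explicit overwrite: proceeds
```

Running the check early with a non-zero exit code is what prevents the unsafe partial restores the comment warns about.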
85-85: ⚠️ Potential issue | 🟡 Minor — Resolve markdown lint issues (MD037/MD040) before merge.
There are still spaced emphasis markers and unlabeled fenced code blocks. This is small but should be cleaned to keep docs CI green and readable.

Also applies to: 97-97, 123-123, 133-133, 140-140, 226-226, 234-234
Another related goal: #833

The planning document here needs to include details for how the goal described in #1127 will be supported.
…n processes, including schema discovery, metadata handling, and new snapshot delete functionality.
Actionable comments posted: 2
♻️ Duplicate comments (2)
.cursor/plans/ensdb_cli_tool_422abf99.plan.md (2)
81-83: ⚠️ Potential issue | 🟠 Major — Make `ponder_sync` inclusion capability-driven, not unconditional.
The plan still treats `ponder_sync` as always present/required across create/pull/restore semantics. This can break on databases where that shared schema is absent. Define runtime discovery and include `ponder_sync` only when detected.

Also applies to: 89-94, 394-395
98-98: ⚠️ Potential issue | 🟡 Minor — Resolve remaining markdownlint violations (MD037/MD040).
There are still fenced blocks without language identifiers and emphasis-marker spacing issues that will keep docs lint noisy.

Also applies to: 112-112, 139-139, 149-149, 156-156, 260-260, 268-268, 340-340, 379-379
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.cursor/plans/ensdb_cli_tool_422abf99.plan.md:
- Around line 92-95: Add concrete preflight "fresh/isolated DB" checks before
performing snapshot restore: implement checks in the CLI flow that will run
prior to calling pg_restore (and before any use of --drop-existing) to detect
non-system schemas/tables, presence of rows in the shared-state table
ponder_sync, and existing rows in ensnode (or related metadata), and fail with a
clear error unless a new --force-or-confirm flag is provided; for selective
restores (when restoring indexer schema(s) and upserting from
ensnode_metadata.json) ensure the preflight verifies that only the targeted
indexer schemas are absent or isolated and refuse to proceed if other indexer
metadata rows already exist in ensnode to avoid clobbering. Ensure the checks
are invoked in the CLI command handler that triggers snapshot restore and
surface distinct error codes/messages for: non-empty ponder_sync, existing
ensnode rows, and non-system schema/table presence, plus a documented
interaction with --drop-existing.
- Around line 184-193: The snapshot subcommands are inconsistent: only "snapshot
list" accepts --prefix while "snapshot info" and "snapshot delete" do not,
causing wrong paths; update the CLI handlers for the "snapshot info" and
"snapshot delete" commands to accept and parse the --prefix option and pass it
through to the underlying storage functions (e.g.,
listSnapshots/getSnapshotManifest/deleteSnapshotObjects or their equivalents) so
all three commands compute the snapshot prefix the same way; ensure the option
name is --prefix, default behavior stays the same when omitted, and that the
prefix is used when constructing S3 keys/paths and when
listing/fetching/deleting objects.
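The consistency fix described above can be sketched as a single key builder shared by all snapshot subcommands; the default prefix value (`snapshots`) is an assumption for illustration, not the plan's actual default.

```shell
# Sketch: one S3 key builder used by snapshot list/info/delete/pull so that
# --prefix is honored identically everywhere. Default prefix is illustrative.
snapshot_key() {
  prefix="${1:-snapshots}"   # --prefix value, or the assumed default
  snapshot_id="$2"
  object="$3"
  echo "${prefix}/${snapshot_id}/${object}"
}

snapshot_key "" mainnetSchema1.9.0-2026-04-06-abc123 manifest.json
# -> snapshots/mainnetSchema1.9.0-2026-04-06-abc123/manifest.json
snapshot_key team-a/ensdb mainnetSchema1.9.0-2026-04-06-abc123 manifest.json
# -> team-a/ensdb/mainnetSchema1.9.0-2026-04-06-abc123/manifest.json
```

Routing every subcommand's path construction through one helper is what keeps `info` and `delete` from diverging from `list`.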
…s, detailing implications of `pg_dump` formats, checksum handling, and manifest finalization for stakeholder decision-making.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.cursor/plans/ensdb_cli_tool_422abf99.plan.md:
- Around line 175-181: The snapshot pull CLI is missing the --prefix option,
causing incorrect S3 paths for non-default prefixes; add a --prefix flag to the
ensdb-cli snapshot pull command (alongside existing flags --bucket, --endpoint,
--schemas, --with-ensnode-schema, --ponder-sync-only) and propagate it into the
S3 path resolution logic used by the pull handler so it composes keys the same
way as push/list; update the pull command help text/usage to document --prefix
and ensure the code that constructs the remote snapshot key (the snapshot pull
handler) accepts and applies the prefix when building the S3 object path.
…ing preflight checks to ensure safe restoration, and clarifying metadata handling and schema validation steps.
shrugs left a comment:
can we update the schema name construction to just `${deployment}_${version}`? the Schema in the middle isn't necessary, imo. so like `v2-sepolia_1.9.0` for example (using the `-` to kebab-case the deployment name)

`--database-url` should probably be `--ensdb-url` (defaulting to `process.env.ENSDB_URL`, so the flag is technically optional) to match new terminology? or it could stay as `--database-url` because it's in the context of an ensdb-cli ¯\_(ツ)_/¯

re: autogenerated ids; i agree that we should treat these as more or less static artifacts and not allow overriding the generated ids

re: streaming upload mode; i think simplest is best for v1, let's stick to the two-part operation
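The naming scheme proposed in this comment could be sketched as follows; the exact kebab-casing rule is an assumption about intent, and `schema_name` is a hypothetical helper, not part of the plan.

```shell
# Sketch of the proposed ${deployment}_${version} schema naming: kebab-case
# the deployment name, then join with the version via an underscore.
schema_name() {
  deployment="$(echo "$1" | tr ' _' '--' | tr '[:upper:]' '[:lower:]')"
  echo "${deployment}_$2"
}

schema_name "v2 sepolia" 1.9.0   # -> v2-sepolia_1.9.0
```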
otherwise looks great to me
…database-url` with `--ensdb-url` across various commands, enhancing clarity and consistency in usage.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.
Actionable comments posted: 2
> ### Snapshot Format
> Use `pg_dump` with `--format=directory` and `--jobs=N` for parallel dump/restore. This is the only format that supports parallelism, which is critical for 50-100GB databases. Each directory-format dump is then archived as a `<schema>.dump.tar.zst` file for storage and transfer, and unpacked to a temporary directory before restore.
🧹 Nitpick | 🔵 Trivial
Optional: Add space between numerical values and unit symbols.
Three instances of "50-100GB" lack proper spacing per typographical conventions. Consider "50-100 GB" for consistency with technical writing standards.

📝 Suggested spacing fixes

```diff
-Use `pg_dump` with `--format=directory` and `--jobs=N` for parallel dump/restore. This is the only format that supports parallelism, which is critical for 50-100GB databases. Each directory-format dump is then archived as a `<schema>.dump.tar.zst` file for storage and transfer, and unpacked to a temporary directory before restore.
+Use `pg_dump` with `--format=directory` and `--jobs=N` for parallel dump/restore. This is the only format that supports parallelism, which is critical for 50-100 GB databases. Each directory-format dump is then archived as a `<schema>.dump.tar.zst` file for storage and transfer, and unpacked to a temporary directory before restore.
-- `analyze` performs heavy table scans over potentially millions of domain rows (seconds to minutes at 50-100GB scale). Mixing slow analytical queries into `inspect` would make it unpredictably slow.
+- `analyze` performs heavy table scans over potentially millions of domain rows (seconds to minutes at 50-100 GB scale). Mixing slow analytical queries into `inspect` would make it unpredictably slow.
-  - **Custom** (`pg_dump --format=custom` / `-Fc`): single-file output and can be streamed (e.g. piped into multipart upload). Loses parallel `pg_restore` compared to directory format unless you accept those trade-offs at 50-100GB scale.
+  - **Custom** (`pg_dump --format=custom` / `-Fc`): single-file output and can be streamed (e.g. piped into multipart upload). Loses parallel `pg_restore` compared to directory format unless you accept those trade-offs at 50-100 GB scale.
```

Also applies to: 400-400, 434-434
> Discovery via `ListObjects` on `{prefix}/` -- each snapshot is a prefix containing a `manifest.json` and per-schema dump files:
Fix markdown lint violations (MD040).
Multiple fenced code blocks are missing language identifiers. Add appropriate language tags for proper syntax highlighting and to resolve linting warnings.
🧹 Proposed fixes for code block language identifiers
````diff
 ### S3 Storage Layout
 Discovery via `ListObjects` on `{prefix}/` -- each snapshot is a prefix containing a `manifest.json` and per-schema dump files:
-```
+```text
 {prefix}/
   {snapshot-id}/
     manifest.json  # snapshot metadata (all schemas, sizes, versions)

 ### Inspect
-```
+```bash
 ensdb-cli inspect [--ensdb-url <url>]
 List all schemas with type classification and size info.

 ### Schema Management
-```
+```bash
 ensdb-cli schema drop [--ensdb-url <url>] --schema <name> [--force]
 Drop a schema. Requires --force or interactive confirmation.

 ### Snapshot Operations
-```
+```bash
 ensdb-cli snapshot create [--ensdb-url <url>] --output <path> [--exclude-schemas <name,...>]
 Export all discovered indexer schemas + ponder_sync + full ensnode schema + ensnode.metadata JSON.

 This means `snapshot list` can show rich summaries like:
-```
+```text
 ID                                    Namespace  Plugins   Chains  Created
 mainnetSchema1.9.0-2026-04-06-abc123  mainnet    subgraph  1       2026-04-06

 ## Project Structure
-```
+```text
 apps/ensdb-cli/
   package.json
   tsconfig.json

 **CI workflow pattern:**
-```
+```bash
 # 1. Pull only the schema needed for this matrix entry
 ensdb-cli snapshot pull \

 **Future command:**
-```
+```bash
 ensdb-cli analyze unknown-labels [--ensdb-url <url>] --schema <name> [--top-n 100] [--output-format table|csv|json]
 Count unknown names, unknown labels (distinct and non-distinct),
````

Also applies to: 161-161, 171-171, 178-178, 287-287, 295-295, 369-369, 408-408