Conversation
**Walkthrough**

Adds documentation and an API specification for a new Query feature and Metrics insights. Updates docs navigation (adds "Insights" and "Query API" groups), introduces docs/insights/query.mdx and docs/insights/metrics.mdx, adds management/query/execute.mdx, updates docs/limits.mdx with Query and Metrics sections, and amends docs/docs.json navigation. Expands the OpenAPI spec in docs/v3-openapi.yaml with POST /api/v1/query, request/response schemas, examples, and related schema additions. Changes are documentation and API spec additions; no application logic changes.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
```sql
GROUP BY timeBucket
ORDER BY timeBucket
```
🟡 Inconsistent timeBucket usage in SQL example will cause query errors for users
The "CPU utilization over time" example uses timeBucket() (with parentheses) in the SELECT clause but timeBucket (without parentheses) in GROUP BY and ORDER BY. Since timeBucket() is a function call and is not aliased, referencing it as timeBucket (without parens) in GROUP BY/ORDER BY is inconsistent with the documented pattern.
Detailed explanation and fix
Compare with the correct pattern shown earlier in the same doc at docs/insights/query.mdx:238-240:

```sql
SELECT timeBucket(), count() AS runs
FROM runs
GROUP BY timeBucket()
```

And the "Average compute duration over time" example at docs/insights/query.mdx:461-469, which correctly aliases the function:

```sql
SELECT timeBucket() AS time, ...
GROUP BY time, task_identifier
ORDER BY time ASC
```

But the CPU utilization example does neither: it uses timeBucket() in SELECT but bare timeBucket in GROUP BY/ORDER BY:

```sql
SELECT timeBucket(), avg(value) AS avg_cpu
FROM metrics
GROUP BY timeBucket -- missing parentheses
ORDER BY timeBucket -- missing parentheses
```

Users copying this example will likely get a query error because timeBucket without parentheses refers to a non-existent column, not the timeBucket() function. The fix is to use timeBucket() consistently or alias it.
```diff
-GROUP BY timeBucket
-ORDER BY timeBucket
+GROUP BY timeBucket()
+ORDER BY timeBucket()
```
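If the doc prefers the aliased style it uses elsewhere, a sketch of that alternative fix (column names taken from the example; the `metrics` table schema is assumed):

```sql
SELECT timeBucket() AS time, avg(value) AS avg_cpu
FROM metrics
GROUP BY time
ORDER BY time ASC
```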
```markdown
### Aggregate functions

- `count()` - Count rows
- `countIf(col, cond)` - Count rows matching condition
```
🚩 Incorrect countIf function signature in docs vs actual ClickHouse/TRQL usage
The aggregate function reference documents countIf(col, cond) taking two arguments, but every example in the same file uses the single-argument form countIf(status = 'Completed') (e.g., docs/insights/query.mdx:434). In ClickHouse, countIf takes a single condition expression, not a column and condition separately. Similarly, sumIf is documented as sumIf(col, cond) which is correct for ClickHouse (sumIf(column, condition)), so the inconsistency is specifically with countIf. This could confuse users reading the function reference who then try countIf(status, 'Failed') instead of countIf(status = 'Failed').
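For illustration, a sketch of the corrected forms side by side (the `runs` column names here, including `usage_duration_ms`, are assumed for the example):

```sql
SELECT
  countIf(status = 'Completed') AS completed_runs,                -- -If combinator: single condition argument
  sumIf(usage_duration_ms, status = 'Completed') AS completed_ms  -- sumIf: (column, condition) is correct
FROM runs
```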
| period: "4d", // Last 4 days | ||
| }); | ||
|
|
||
| // Supported periods: "1h", "6h", "12h", "1d", "7d", "30d", "90d", etc. |
🚩 Query lookback limits match Log retention limits — potentially intentional but worth verifying
The new Query lookback period limits (docs/limits.mdx:205-209) are identical to the Log retention limits (docs/limits.mdx:134-138): Free=1 day, Hobby=7 days, Pro=30 days. This appears intentional (query can look back as far as data is retained), but the Query docs at docs/insights/query.mdx:393 list "90d" as a supported period shorthand, which would exceed the Pro plan's 30-day lookback limit. The limits page does say the range gets "automatically clipped" (docs/limits.mdx:211), but the example in the query docs may set incorrect expectations for Pro users who see 90d listed as supported.
```markdown
We have several limits to prevent abuse and ensure performance:

- **Concurrency limit**: We limit the number of concurrent queries per organization.
```
🚩 Concurrency limit inconsistency: Query docs say 'per organization' but limits page says 'per project'
The Query section in docs/insights/query.mdx:540 states "We limit the number of concurrent queries per organization" but the limits table at docs/limits.mdx:199 specifies "3 per project". These are different scoping levels and could confuse users trying to understand their actual constraints. One of these should be corrected to match the actual implementation.
Actionable comments posted: 2
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/insights/query.mdx
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-07-12T18:06:04.133Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 2264
File: apps/webapp/app/services/runsRepository.server.ts:172-174
Timestamp: 2025-07-12T18:06:04.133Z
Learning: In apps/webapp/app/services/runsRepository.server.ts, the in-memory status filtering after fetching runs from Prisma is intentionally used as a workaround for ClickHouse data delays. This approach is acceptable because the result set is limited to a maximum of 100 runs due to pagination, making the performance impact negligible.
Applied to files:
docs/insights/query.mdx
🪛 Gitleaks (8.30.0)
docs/insights/query.mdx
[high] 90-91: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.
(curl-auth-header)
🪛 LanguageTool
docs/insights/query.mdx
[style] ~125-~125: Consider using a synonym to be more concise.
Context: ...ouse database is columnar and selecting lots of columns isn't efficient). You should s...
(A_LOT_OF)
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/insights/query.mdx`:
- Around line 357-365: The SQL example incorrectly uses the expression
"output.error != NULL" which yields NULL; update the SELECT projection in the
example query to use the SQL NULL check operator, e.g. replace "output.error !=
NULL AS has_error" with "output.error IS NOT NULL AS has_error" so the has_error
column returns a proper boolean when querying the runs table.
- Line 207: The docs list countIf as `countIf(col, cond)` which is incorrect;
update the signature and any examples to the ClickHouse single-argument form
`countIf(cond)` (e.g., change `countIf(col, cond)` to `countIf(cond)` and ensure
examples like `countIf(status = 'Completed')` remain consistent); search for
occurrences of the symbol `countIf` in the file and replace the two-argument
signature and any two-argument examples with the single-argument combinator form
used in ClickHouse docs.
---
Duplicate comments:
In `@docs/insights/query.mdx`:
- Around line 540-543: Update the bullet for **Time restrictions** so it matches
the other items by adding a trailing colon; change the list entry from "**Time
restrictions**" to "**Time restrictions**:" in the same Markdown list (the
surrounding items **Concurrency limit**, **Row limit**, and **Time/Memory
limit** show the intended format).
- Around line 490-498: The GROUP BY and ORDER BY use timeBucket without
parentheses while the SELECT uses timeBucket(), causing inconsistent usage;
update the GROUP BY timeBucket and ORDER BY timeBucket to use timeBucket() to
match the SELECT example (referencing the timeBucket() function in the query) so
all occurrences are consistent.
- Around line 347-348: The alias is misleading: the expression has(tags,
'user_12345') is checking for a user tag but is named is_production. Update the
code so the alias matches the check — either change the checked tag to
'production' if you intended a production check, or (preferably) rename the
alias to something like has_user_12345 or is_user_12345; ensure the neighboring
arrayJoin(tags) AS individual_tag remains unchanged.
```markdown
### Aggregate functions

- `count()` - Count rows
- `countIf(col, cond)` - Count rows matching condition
```
🧩 Analysis chain
🌐 Web query:
ClickHouse countIf function official documentation
💡 Result:
In ClickHouse, countIf is count with the -If aggregate-function combinator applied: the combinator adds an extra condition argument (of type Uint8), and the aggregate processes only rows where the condition is true. The official docs list countIf(cond) among the -If examples. [1]
Official documentation: Aggregate function combinators → -If [1]
🏁 Script executed:
```shell
# Check the file content at the specified lines
if [ -f "docs/insights/query.mdx" ]; then
  echo "=== Line 207 (countIf signature) ==="
  sed -n '207p' docs/insights/query.mdx
  echo ""
  echo "=== Lines 434-435 (examples) ==="
  sed -n '434,435p' docs/insights/query.mdx
  echo ""
  echo "=== Context around line 207 (5 lines before/after) ==="
  sed -n '202,212p' docs/insights/query.mdx
fi
```

Repository: triggerdotdev/trigger.dev
Length of output: 688
countIf signature doesn't match ClickHouse documentation or usage examples.
The signature is listed as countIf(col, cond), but ClickHouse's official documentation specifies countIf(cond) — it takes only a condition argument due to the -If aggregate-function combinator. The usage examples on lines 434-435 correctly use the single-argument form: countIf(status = 'Completed').
Proposed fix
```diff
-- `countIf(col, cond)` - Count rows matching condition
+- `countIf(cond)` - Count rows matching condition
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/insights/query.mdx` at line 207, The docs list countIf as `countIf(col,
cond)` which is incorrect; update the signature and any examples to the
ClickHouse single-argument form `countIf(cond)` (e.g., change `countIf(col,
cond)` to `countIf(cond)` and ensure examples like `countIf(status =
'Completed')` remain consistent); search for occurrences of the symbol `countIf`
in the file and replace the two-argument signature and any two-argument examples
with the single-argument combinator form used in ClickHouse docs.
```sql
SELECT
  run_id,
  output.message AS output_message,
  output.count AS count,
  output.error != NULL AS has_error
FROM runs
WHERE output IS NOT NULL
```
!= NULL is incorrect SQL — use IS NOT NULL.
In SQL (including ClickHouse), != NULL evaluates to NULL rather than true/false because any comparison with NULL using standard operators yields NULL. Users copying this example will get unexpected results.
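To illustrate the three-valued logic at play, a minimal sketch using standard SQL/ClickHouse NULL semantics:

```sql
SELECT
  1 != NULL,        -- NULL: any standard comparison with NULL yields NULL, not a boolean
  1 IS NOT NULL,    -- true: IS NOT NULL is the proper null check
  NULL IS NOT NULL  -- false
```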
Proposed fix
```diff
- output.error != NULL AS has_error
+ output.error IS NOT NULL AS has_error
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/insights/query.mdx` around lines 357 - 365, The SQL example incorrectly
uses the expression "output.error != NULL" which yields NULL; update the SELECT
projection in the example query to use the SQL NULL check operator, e.g. replace
"output.error != NULL AS has_error" with "output.error IS NOT NULL AS has_error"
so the has_error column returns a proper boolean when querying the runs table.
No description provided.