@@ -0,0 +1,15 @@
DESCRIPTION >
Precomputed organization-level KPIs for the org page. Rebuilt nightly by org_page_kpis_copy_pipe.
One row per organizationId. Used by org_page_kpis.pipe for cheap request-time lookups.

SCHEMA >
`organizationId` String,
`activeContributors` UInt64,
`activeContributorsPrevious` UInt64,
`maintainerRoles` UInt64,
`criticalProjects` UInt64,
`computedAt` DateTime

ENGINE ReplacingMergeTree
ENGINE_SORTING_KEY organizationId
ENGINE_VER computedAt
@@ -0,0 +1,17 @@
DESCRIPTION >
Precomputed per-org per-project metrics for the org page. Rebuilt nightly by org_page_projects_copy_pipe.
One row per (organizationId, segmentId). Used by org_page_projects.pipe.

SCHEMA >
`organizationId` String,
`segmentId` String,
`projectSlug` String,
`projectName` String,
`projectLogo` String,
`activityCount` UInt64,
`contributorCount` UInt64,
`computedAt` DateTime

ENGINE ReplacingMergeTree
ENGINE_SORTING_KEY organizationId, segmentId
ENGINE_VER computedAt
18 changes: 18 additions & 0 deletions services/libs/tinybird/pipes/org_page_activities_timeseries.pipe
@@ -0,0 +1,18 @@
DESCRIPTION >
Activity timeseries for a given organization, bucketed by year (all-time).

TAGS "Organization page"

NODE org_page_activities_timeseries_data
SQL >
%
SELECT
toStartOfYear(timestamp) AS startDate,
toDate(toStartOfYear(timestamp) + INTERVAL 1 YEAR - INTERVAL 1 DAY) AS endDate,
count() AS activityCount
FROM activityRelations_deduplicated_cleaned_bucket_union
WHERE
organizationId = {{ String(orgId, '', description="Organization ID", required=True) }}
AND timestamp >= '2005-01-01'
GROUP BY startDate, endDate
ORDER BY startDate
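
To make the year bucket boundaries concrete, the query below applies the same startDate and endDate expressions to a single literal timestamp. It is a minimal sketch for an ad hoc ClickHouse session; the sample timestamp is an arbitrary assumption and not data from this PR.

```sql
-- Illustration only: the timestamp literal is a made-up sample value.
WITH toDateTime('2023-06-15 12:00:00') AS ts
SELECT
    -- first day of the year that contains ts
    toStartOfYear(ts) AS startDate,
    -- first day of the next year minus one day, i.e. the last day of the same year
    toDate(toStartOfYear(ts) + INTERVAL 1 YEAR - INTERVAL 1 DAY) AS endDate
```

For this input the pair is 2023-01-01 and 2023-12-31, so each bucket covers a full calendar year with an inclusive end date.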
67 changes: 67 additions & 0 deletions services/libs/tinybird/pipes/org_page_contributors.pipe
@@ -0,0 +1,67 @@
DESCRIPTION >
Top contributors for a given organization leaderboard.
Returns members sorted by contribution count within the specified date range.

TAGS "Organization page"

NODE org_page_contributors_activity_aggregates
SQL >
%
{% if Boolean(count, false) %}
SELECT count(distinct memberId)
FROM activityRelations_deduplicated_cleaned_bucket_union
WHERE
organizationId = {{ String(orgId, '', description="Organization ID", required=True) }}
{% if defined(startDate) %}
AND timestamp
>= {{ DateTime(startDate, description="Filter activity timestamp after") }}
{% end %}
{% if defined(endDate) %}
AND timestamp < {{ DateTime(endDate, description="Filter activity timestamp before") }}
{% end %}
{% else %}
SELECT
memberId,
count() as "contributionCount",
ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (), 2) as "contributionPercentage"
FROM activityRelations_deduplicated_cleaned_bucket_union
WHERE
organizationId = {{ String(orgId, '', description="Organization ID", required=True) }}
{% if defined(startDate) %}
AND timestamp
>= {{ DateTime(startDate, description="Filter activity timestamp after") }}
{% end %}
{% if defined(endDate) %}
AND timestamp < {{ DateTime(endDate, description="Filter activity timestamp before") }}
{% end %}
GROUP BY memberId
ORDER BY contributionCount DESC, memberId DESC
LIMIT {{ Int32(limit, 10) }}
OFFSET {{ Int32(offset, 0) }}
{% end %}

NODE org_page_contributors_leaderboard
SQL >
%
{% if Boolean(count, false) %}
SELECT count(distinct memberId) as count
FROM activityRelations_deduplicated_cleaned_bucket_union
WHERE
organizationId = {{ String(orgId, '') }}
{% if defined(startDate) %} AND timestamp >= {{ DateTime(startDate) }} {% end %}
{% if defined(endDate) %} AND timestamp < {{ DateTime(endDate) }} {% end %}

Count path duplicates expensive cross-segment table scan

Medium Severity

When count=true, the org_page_contributors_leaderboard node independently queries activityRelations_deduplicated_cleaned_bucket_union for the same count that org_page_contributors_activity_aggregates already computes but never uses. Unlike contributors_leaderboard.pipe, which references the project-scoped activities_filtered, this pipe scans the full unscoped cross-segment union (a 10-bucket UNION ALL) twice per request-time count call.
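
One possible way to address this, sketched here as an assumption rather than the change that was actually merged, is to let the count branch of the leaderboard node read the upstream node instead of scanning the union again. The sketch assumes the count branch of org_page_contributors_activity_aggregates is edited to alias its result as count, which the diff above does not do.

```sql
-- Sketch of the count branch of org_page_contributors_leaderboard only.
-- Assumption: the count branch of org_page_contributors_activity_aggregates
-- is changed to: SELECT count(distinct memberId) AS count ...
-- The orgId, startDate and endDate filters then live in one place (the upstream node).
SELECT *
FROM org_page_contributors_activity_aggregates
```

With this shape the union is referenced once on the count path, and the filter logic no longer has to be kept in sync between the two nodes.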

Reviewed by Cursor Bugbot for commit b387268.

{% else %}
SELECT
m.id,
m.avatar,
m.displayName,
m.githubHandleArray,
agg.contributionCount,
agg.contributionPercentage,
mr.roles
FROM members_sorted AS m ANY
INNER JOIN org_page_contributors_activity_aggregates agg ON agg.memberId = m.id
LEFT JOIN member_roles mr ON mr.memberId = m.id
WHERE m.id IN (SELECT memberId FROM org_page_contributors_activity_aggregates)
ORDER BY agg.contributionCount DESC
{% end %}
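
The contributionPercentage expression in the aggregates node divides each member's count by a window sum over all groups. The sketch below reproduces that expression on four inline sample rows (an assumption made for illustration, standing in for the activity table) to show that the window is evaluated before LIMIT.

```sql
-- Illustration only: the inline rows stand in for activityRelations_deduplicated_cleaned_bucket_union.
SELECT
    memberId,
    count() AS contributionCount,
    -- sum(count()) OVER () is the total across every member, not just the returned page
    round(count() * 100.0 / sum(count()) OVER (), 2) AS contributionPercentage
FROM (SELECT arrayJoin(['a', 'a', 'a', 'b']) AS memberId)
GROUP BY memberId
ORDER BY contributionCount DESC, memberId DESC
LIMIT 1
```

Only member a is returned, with a count of 3 and a percentage of 75, because the denominator still includes member b's single row. Percentages on a paginated leaderboard are therefore relative to all contributions in the date range.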
18 changes: 18 additions & 0 deletions services/libs/tinybird/pipes/org_page_contributors_timeseries.pipe
@@ -0,0 +1,18 @@
DESCRIPTION >
Contributor count timeseries for a given organization, bucketed by year (all-time).

TAGS "Organization page"

NODE org_page_contributors_timeseries_data
SQL >
%
SELECT
toStartOfYear(timestamp) AS startDate,
toDate(toStartOfYear(timestamp) + INTERVAL 1 YEAR - INTERVAL 1 DAY) AS endDate,
uniqExact(memberId) AS contributorCount
FROM activityRelations_deduplicated_cleaned_bucket_union
WHERE
organizationId = {{ String(orgId, '', description="Organization ID", required=True) }}
AND timestamp >= '2005-01-01'
GROUP BY startDate, endDate
ORDER BY startDate
28 changes: 28 additions & 0 deletions services/libs/tinybird/pipes/org_page_kpis.pipe
@@ -0,0 +1,28 @@
DESCRIPTION >
Returns KPIs for a given organization from the precomputed org_page_kpis_copy_ds.
Includes trend calculations comparing current to previous 365-day period.

TAGS "Organization page"

NODE org_page_kpis_main
SQL >
%
SELECT
activeContributors,
if(
activeContributorsPrevious = 0,
0,
round(
(toInt64(activeContributors) - toInt64(activeContributorsPrevious))
/ activeContributorsPrevious
* 100,
1
)
) AS activeContributorsTrend,
toInt64(activeContributors)
- toInt64(activeContributorsPrevious) AS activeContributorsTrendAbsolute,
activeContributorsPrevious AS activeContributorsTrendPrevious,
maintainerRoles,
criticalProjects
FROM org_page_kpis_copy_ds FINAL
WHERE organizationId = {{ String(orgId, '', description="Organization ID", required=True) }}
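
As a worked example of the trend columns, the query below evaluates the same expressions on two literal values. The numbers 150 and 120 are sample inputs chosen for illustration, not figures from the datasource.

```sql
-- Illustration only: 150 and 120 are made-up values for the two UInt64 columns.
WITH
    toUInt64(150) AS activeContributors,
    toUInt64(120) AS activeContributorsPrevious
SELECT
    if(
        activeContributorsPrevious = 0,
        0,  -- no previous activity: report 0 instead of dividing by zero
        round(
            (toInt64(activeContributors) - toInt64(activeContributorsPrevious))
            / activeContributorsPrevious
            * 100,
            1
        )
    ) AS activeContributorsTrend,
    toInt64(activeContributors)
    - toInt64(activeContributorsPrevious) AS activeContributorsTrendAbsolute
```

Here the trend is 25 (percent) and the absolute change is 30. The toInt64 casts keep the difference signed, so a shrinking contributor base produces negative values for both columns.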
80 changes: 80 additions & 0 deletions services/libs/tinybird/pipes/org_page_kpis_copy_pipe.pipe
@@ -0,0 +1,80 @@
DESCRIPTION >
Nightly copy pipe that precomputes org-level KPIs for the org page.
Writes one row per organizationId into org_page_kpis_copy_ds.

TAGS "Organization page"

NODE org_page_kpis_current_contributors
DESCRIPTION >
Active contributors per org in the last 365 days

SQL >
SELECT organizationId, uniq(memberId) AS activeContributors
FROM activityRelations_deduplicated_cleaned_bucket_union
WHERE
organizationId != ''
AND timestamp >= toStartOfDay(now() - toIntervalDay(365))
AND timestamp < toStartOfDay(now() + toIntervalDay(1))
GROUP BY organizationId

NODE org_page_kpis_previous_contributors
DESCRIPTION >
Active contributors per org in the prior 365-day window (for trend calc)

SQL >
SELECT organizationId, uniq(memberId) AS activeContributorsPrevious
FROM activityRelations_deduplicated_cleaned_bucket_union
WHERE
organizationId != ''
AND timestamp >= toStartOfDay(now() - toIntervalDay(730))
AND timestamp < toStartOfDay(now() - toIntervalDay(365))
GROUP BY organizationId

NODE org_page_kpis_maintainer_roles
DESCRIPTION >
Count of active maintainer role assignments per org

SQL >
SELECT organizationId, uniq((memberId, insightsProjectId)) AS maintainerRoles
FROM maintainers_roles_copy_ds
WHERE role = 'maintainer' AND toYear(endDate) <= 1970 AND organizationId != ''
GROUP BY organizationId

NODE org_page_kpis_critical_projects
DESCRIPTION >
Count of distinct projects (segmentIds) an org contributed to in the last 365 days.
Serves as the "critical projects" placeholder until a real criticality filter is added.

SQL >
SELECT organizationId, uniq(segmentId) AS criticalProjects
FROM activityRelations_deduplicated_cleaned_bucket_union
WHERE
organizationId != ''
AND timestamp >= toStartOfDay(now() - toIntervalDay(365))
AND timestamp < toStartOfDay(now() + toIntervalDay(1))
GROUP BY organizationId

NODE org_page_kpis_final
DESCRIPTION >
Join all nodes into one row per org

SQL >
SELECT
coalesce(
c.organizationId, p.organizationId, m.organizationId, cp.organizationId
) AS organizationId,
coalesce(c.activeContributors, 0) AS activeContributors,
coalesce(p.activeContributorsPrevious, 0) AS activeContributorsPrevious,
coalesce(m.maintainerRoles, 0) AS maintainerRoles,
coalesce(cp.criticalProjects, 0) AS criticalProjects,
now() AS computedAt
FROM org_page_kpis_current_contributors c
FULL OUTER JOIN org_page_kpis_previous_contributors p ON c.organizationId = p.organizationId
FULL OUTER JOIN org_page_kpis_maintainer_roles m ON c.organizationId = m.organizationId
FULL OUTER JOIN org_page_kpis_critical_projects cp ON c.organizationId = cp.organizationId
WHERE organizationId != ''

TYPE COPY
TARGET_DATASOURCE org_page_kpis_copy_ds
COPY_MODE replace
COPY_SCHEDULE 15 1 * * *
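
The final node chains three FULL OUTER JOINs that are all keyed on c.organizationId. If join_use_nulls is off (the ClickHouse default; whether this workspace overrides it is not shown in the diff), an unmatched c.organizationId comes back as an empty string rather than NULL, so organizations present only in p, m, or cp may not line up as intended. A hedged alternative, not the shape used in this PR, is to stack the four nodes with UNION ALL and collapse them per organization:

```sql
-- Sketch only: an alternative body for org_page_kpis_final that avoids chained joins.
-- Each source node contributes one row per org; the missing metrics are padded with 0.
SELECT
    organizationId,
    max(activeContributors) AS activeContributors,
    max(activeContributorsPrevious) AS activeContributorsPrevious,
    max(maintainerRoles) AS maintainerRoles,
    max(criticalProjects) AS criticalProjects,
    now() AS computedAt
FROM
(
    SELECT
        organizationId,
        activeContributors,
        toUInt64(0) AS activeContributorsPrevious,
        toUInt64(0) AS maintainerRoles,
        toUInt64(0) AS criticalProjects
    FROM org_page_kpis_current_contributors
    UNION ALL
    SELECT organizationId, toUInt64(0), activeContributorsPrevious, toUInt64(0), toUInt64(0)
    FROM org_page_kpis_previous_contributors
    UNION ALL
    SELECT organizationId, toUInt64(0), toUInt64(0), maintainerRoles, toUInt64(0)
    FROM org_page_kpis_maintainer_roles
    UNION ALL
    SELECT organizationId, toUInt64(0), toUInt64(0), toUInt64(0), criticalProjects
    FROM org_page_kpis_critical_projects
)
WHERE organizationId != ''
GROUP BY organizationId
```

Because every organization has at most one row per source node, max() simply picks the non-zero value where one exists, and the result no longer depends on how unmatched join keys are represented.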
37 changes: 37 additions & 0 deletions services/libs/tinybird/pipes/org_page_profile.pipe
@@ -0,0 +1,37 @@
DESCRIPTION >
Organization profile for the org page. Returns one row for a given orgId.
Joins organizationIdentities for website and domain.

TAGS "Organization page"

NODE org_page_profile_base
SQL >
%
SELECT id, displayName, logo, employees AS employeeCount, industry, headline AS description
FROM organizations FINAL
WHERE id = {{ String(orgId, '', description="Organization ID", required=True) }}

NODE org_page_profile_website
SQL >
%
SELECT organizationId, argMax(value, updatedAt) AS website
FROM organizationIdentities FINAL
WHERE
organizationId = {{ String(orgId, '', description="Organization ID", required=True) }}
AND platform = 'website'
AND type = 'primary'
GROUP BY organizationId

NODE org_page_profile_final
SQL >
SELECT
b.id,
b.displayName,
b.logo,
b.employeeCount,
b.industry,
b.description,
w.website,
domain(w.website) AS domain

domain() returns empty string for non-URL input

Medium Severity

The ClickHouse domain() function expects a full URL with a protocol (e.g., https://example.com/path) and returns an empty string for plain domain names. The value from organizationIdentities with type = 'primary-domain' likely stores a bare domain (e.g., example.com) rather than a URL, so domain(w.website) would always produce an empty string instead of the expected domain value.
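
The behaviour described here is easy to confirm before changing the pipe. The ClickHouse documentation appears to describe domain() as accepting URLs with or without a protocol, so the result for a bare domain may differ from what this comment expects and could depend on the ClickHouse version in use. The query below is a minimal check; both literals are made-up sample inputs.

```sql
-- Illustration only: sample inputs, not values from organizationIdentities.
SELECT
    domain('https://example.com/path') AS fromFullUrl,
    domain('example.com') AS fromBareDomain
```

If fromBareDomain comes back empty in the target workspace, the concern applies to any identity value stored without a scheme; if it returns example.com, the existing expression already handles both shapes.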

Reviewed by Cursor Bugbot for commit fc1a7d1.

FROM org_page_profile_base b
LEFT JOIN org_page_profile_website w ON b.id = w.organizationId
14 changes: 14 additions & 0 deletions services/libs/tinybird/pipes/org_page_projects.pipe
@@ -0,0 +1,14 @@
DESCRIPTION >
Returns the list of projects an organization contributed to in the last 365 days,
read from the precomputed org_page_projects_copy_ds.

TAGS "Organization page"

NODE org_page_projects_main
SQL >
%
SELECT projectSlug, projectName, projectLogo, activityCount, contributorCount
FROM org_page_projects_copy_ds FINAL
WHERE organizationId = {{ String(orgId, '', description="Organization ID", required=True) }}
ORDER BY activityCount DESC
LIMIT 20
41 changes: 41 additions & 0 deletions services/libs/tinybird/pipes/org_page_projects_copy_pipe.pipe
@@ -0,0 +1,41 @@
DESCRIPTION >
Nightly copy pipe that precomputes per-org per-project metrics for the org page.
Writes one row per (organizationId, segmentId) into org_page_projects_copy_ds.

TAGS "Organization page"

NODE org_page_projects_org_segment_activity
DESCRIPTION >
Activity and contributor counts per org per project segment in the last 365 days

SQL >
SELECT organizationId, segmentId, count() AS activityCount, uniq(memberId) AS contributorCount
FROM activityRelations_deduplicated_cleaned_bucket_union
WHERE
organizationId != ''
AND timestamp >= toStartOfDay(now() - toIntervalDay(365))
AND timestamp < toStartOfDay(now() + toIntervalDay(1))
GROUP BY organizationId, segmentId

NODE org_page_projects_with_meta
DESCRIPTION >
Enrich with project name, logo and slug from insights_projects_populated_ds

SQL >
SELECT
a.organizationId,
a.segmentId,
p.slug AS projectSlug,
p.name AS projectName,
p.logoUrl AS projectLogo,
a.activityCount,
a.contributorCount,
now() AS computedAt
FROM org_page_projects_org_segment_activity a
LEFT JOIN insights_projects_populated_ds p ON a.segmentId = p.segmentId
WHERE p.slug != ''

TYPE COPY
TARGET_DATASOURCE org_page_projects_copy_ds
COPY_MODE replace
COPY_SCHEDULE 30 1 * * *

Two heavy copy pipes scheduled at identical time

Low Severity

org_page_projects_copy_pipe and org_page_activities_timeseries_copy_pipe both use COPY_SCHEDULE 30 1 * * *. Both perform heavy full scans of activityRelations_deduplicated_cleaned_bucket_union (a 10-bucket UNION ALL). The other two new copy pipes are staggered at 01:15 and 01:45, suggesting the intent was to spread load, but these two ended up at the same time.

Reviewed by Cursor Bugbot for commit b387268.
