@@ -2,6 +2,7 @@ package org.jetbrains.kotlinx.dataframe.api

import org.jetbrains.kotlinx.dataframe.AnyColumnReference
import org.jetbrains.kotlinx.dataframe.ColumnsSelector
+import org.jetbrains.kotlinx.dataframe.DataColumn
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.DataRow
import org.jetbrains.kotlinx.dataframe.RowExpression
@@ -182,11 +183,14 @@ internal interface PivotDocs {
* * [frames][Pivot.frames] — returns this [Pivot] as a [DataRow] with pivot keys as columns
* (or [column groups][ColumnGroup]) and corresponding groups stored as [FrameColumn]s;
* * [values][Pivot.values] — creates a [DataRow] containing values collected into a single [List]
- * from all rows of each group for the selected columns;
+ * from all rows of each group for the selected columns
+ * (values from [column groups][ColumnGroup] are collected into a [DataFrame]);
* * [count][Pivot.count] — creates a [DataRow] containing the pivot key columns and an additional column
* with the number of rows in each corresponding group;
* * [with][Pivot.with] — creates a [DataRow] containing values computed using a [RowExpression]
- * across all rows of each group and collected into a single [List] for every group;
+ * across all rows of each group.
+ * Values of the [DataRow] type are collected into a [DataFrame], and the resulting column is a [FrameColumn].
+ * Values of other types are collected into a [List], and the resulting column is a [DataColumn] of [List];
* * [aggregate][Pivot.aggregate] — performs a set of custom aggregations using [AggregateDsl],
* allowing computation of one or more derived values per group;
* * [Various aggregation statistics][AggregationStatistics] — predefined shortcuts
14 changes: 10 additions & 4 deletions docs/StardustDocs/topics/pivot.md
@@ -219,8 +221,10 @@ Reducing is a specific case of [`aggregation`](pivot.md#aggregation).
### Step 1: use a reducing method
Use the following functions to collapse each group in a [`Pivot`](pivot.md) into a single row:
* [`first`](first.md) / [`last`](last.md) — take the first or last row (optionally, the first or last one that satisfies a predicate) of each group;
- * [`minBy`](minBy.md) / [`maxBy`](maxBy.md) — take the row with the minimum or maximum value of the given `RowExpression` evaluated on rows within each group;
- * [`medianBy`](median.md) / [`percentileBy`](percentile.md) — take the row with the median or a specific percentile value of the given `RowExpression` evaluated on rows within each group.
+ * [`minBy`](minBy.md) / [`maxBy`](maxBy.md) — take the row with the minimum or maximum value of the given
+ [`row expression`](DataRow.md#row-expressions) evaluated on rows within each group;
+ * [`medianBy`](median.md) / [`percentileBy`](percentile.md) — take the row with the median or a specific percentile value
+ of the given [`row expression`](DataRow.md#row-expressions) evaluated on rows within each group.

These functions return an instance of `ReducedPivot`, which is a class serving as a transitional step between performing a reduction on [`Pivot`](pivot.md) groups
and specifying how the resulting reduced rows should be represented in a resulting [`DataRow`](DataRow.md).
@@ -414,7 +416,8 @@ df.pivot("isHappy").percentileBy(25.0) { "weight"<Int>() }

To perform this transformation, use one of the following functions:
* [`values`](values.md) — creates a new [`DataRow`](DataRow.md) containing the values from the reduced rows in the selected columns;
- * `with` — computes a new value for each reduced row using a `RowExpression` and produces a [`DataRow`](DataRow.md) containing these computed values.
+ * `with` — computes a new value for each reduced row using a [`row expression`](DataRow.md#row-expressions)
+ and produces a [`DataRow`](DataRow.md) containing these computed values.

Each of these functions returns a new [`DataRow`](DataRow.md) with [`Pivot`](pivot.md) keys as top-level columns (or as [`column groups`](DataColumn.md#columngroup))
and values composed of the reduced results from each group.
@@ -476,7 +479,10 @@ The following aggregation methods are available:
* [`values`](values.md) — collects values from all rows of each group for the selected columns into a single `List`
(values from [`column groups`](DataColumn.md#columngroup) are collected into a [`FrameColumn`](DataColumn.md#framecolumn));

@Allex-Nik Allex-Nik May 13, 2026

Would "collected into a DataFrame" be more correct?

* [`count`](count.md) — creates a [`DataRow`](DataRow.md) with the `Pivot` key columns containing the number of rows in each corresponding group;
- * `with` — creates a [`DataRow`](DataRow.md) containing values computed using a `RowExpression` across all rows of each group and collected into a single List for every group;
+ * `with` — creates a [`DataRow`](DataRow.md) containing values computed using a [`row expression`](DataRow.md#row-expressions)
+ across all rows of each group:
+     * values of `DataRow<T>` are collected into `DataFrame<T>`, the resulting column is `FrameColumn<T>`;
+     * values of other types `R` are collected into `List<R>`, the resulting column is `DataColumn<List<R>>`;
* `aggregate` — performs a set of custom aggregations using `AggregateDsl`, allowing computation of one or more derived values per group;
* various [`aggregation statistics`](pivot.md#aggregation-statistics).
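
As a rough illustration of the `with` rule for non-row values described in the list above, the following plain-Kotlin sketch models the grouping step using only the standard library. It is not the Kotlin DataFrame API: `Person`, `pivotWith`, and the sample data are hypothetical names introduced here, and the case where `DataRow` values are collected into a `DataFrame` is not modeled.

```kotlin
// Hypothetical row type used only for this illustration.
data class Person(val name: String, val city: String, val age: Int)

// Plain-Kotlin model of `pivot(key).with(expression)` for non-row values:
// rows are grouped by the pivot key, the expression is evaluated on every row
// of a group, and the results of each group are collected into one List.
fun <T, K, R> pivotWith(
    rows: List<T>,
    key: (T) -> K,
    expression: (T) -> R,
): Map<K, List<R>> =
    rows.groupBy(key).mapValues { (_, group) -> group.map(expression) }

fun main() {
    val people = listOf(
        Person("Alice", "London", 15),
        Person("Bob", "Dubai", 45),
        Person("Charlie", "London", 20),
    )
    // One entry per pivot key, each holding a List of computed values.
    println(pivotWith(people, { it.city }) { it.age }) // prints {London=[15, 20], Dubai=[45]}
}
```

In this model each map entry plays the role of one pivot key column in the resulting `DataRow`, and its `List<R>` value corresponds to the `DataColumn<List<R>>` case from the documentation text.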
