GroupedData API
GroupedData objects are returned by a groupby call: Dataset.groupby().
Computations or Descriptive Stats
| GroupedData.count() | Compute count aggregation. |
| GroupedData.max() | Compute grouped max aggregation. |
| GroupedData.mean() | Compute grouped mean aggregation. |
| GroupedData.min() | Compute grouped min aggregation. |
| GroupedData.std() | Compute grouped standard deviation aggregation. |
| GroupedData.sum() | Compute grouped sum aggregation. |
Function Application
| GroupedData.aggregate() | Implements an accumulator-based aggregation. |
| GroupedData.map_groups() | Apply the given function to each group of records of this dataset. |
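As a conceptual illustration of applying a function to each group of records, the following pure-Python sketch groups rows by a key and runs a function over each group. It does not call Ray and is not Ray's implementation; `group_rows`, `apply_to_groups`, `center`, and the sample rows are hypothetical names chosen for this example.

```python
from collections import defaultdict

def group_rows(rows, key):
    """Collect rows into lists that share the same value for `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return groups

def apply_to_groups(rows, key, fn):
    """Call `fn` once per group and concatenate the rows it returns."""
    out = []
    for _, group in sorted(group_rows(rows, key).items()):
        out.extend(fn(group))
    return out

def center(group):
    """Example per-group function: subtract the group mean from column "v"."""
    mean = sum(r["v"] for r in group) / len(group)
    return [{**r, "v": r["v"] - mean} for r in group]

rows = [
    {"k": "a", "v": 1.0},
    {"k": "b", "v": 10.0},
    {"k": "a", "v": 3.0},
]
centered = apply_to_groups(rows, "k", center)
print(centered)
```

The per-group function sees all records of one group at once, which is what distinguishes this style from a row-by-row transformation.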
AggregateFn
class ray.data.aggregate.AggregateFn(init: Callable[[KeyType], AggType], merge: Callable[[AggType, AggType], AggType], name: str, accumulate_row: Callable[[AggType, T], AggType] = None, accumulate_block: Callable[[AggType, pyarrow.Table | pandas.DataFrame], AggType] = None, finalize: Callable[[AggType], U] | None = None)
Defines an aggregate function in the accumulator style.
Aggregates a collection of inputs of type T into a single output value of type U. See https://www.sigops.org/s/conferences/sosp/2009/papers/yu-sosp09.pdf for more details about accumulator-based aggregation.
Parameters:
- init – This is called once for each group to return the empty accumulator. For example, an empty accumulator for a sum would be 0. 
- merge – This may be called multiple times, each time to merge two accumulators into one. 
- name – The name of the aggregation. This will be used as the column name in the output Dataset. 
- accumulate_row – This is called once per row of the same group. It combines the accumulator and the row, and returns the updated accumulator. Exactly one of accumulate_row and accumulate_block must be provided.
- accumulate_block – This is used to calculate the aggregation for a single block, and is a vectorized alternative to accumulate_row. It is given a base accumulator and the entire block, allowing for vectorized accumulation of the block. Exactly one of accumulate_row and accumulate_block must be provided.
- finalize – This is called once to compute the final aggregation result from the fully merged accumulator.
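To make the roles of these callbacks concrete, here is a minimal pure-Python sketch of how an accumulator-style mean could be evaluated for one group whose rows are split across two blocks. It does not call Ray: the driver `run_group`, the column name "v", and the sample rows are hypothetical, and the order in which Ray actually invokes the callbacks may differ.

```python
from functools import reduce

# Callbacks shaped like the AggregateFn parameters, computing a mean of "v".
# The accumulator is a (running_sum, row_count) pair.
def init(key):
    return (0.0, 0)

def accumulate_row(acc, row):
    return (acc[0] + row["v"], acc[1] + 1)

def merge(a, b):
    return (a[0] + b[0], a[1] + b[1])

def finalize(acc):
    total, count = acc
    return total / count if count else None

def run_group(key, blocks):
    """Hypothetical driver: accumulate each block on its own, then merge."""
    partials = []
    for block in blocks:
        acc = init(key)
        for row in block:
            acc = accumulate_row(acc, row)
        partials.append(acc)
    # Merge the per-block accumulators, then compute the final value once.
    return finalize(reduce(merge, partials, init(key)))

blocks = [
    [{"k": "a", "v": 1.0}, {"k": "a", "v": 2.0}],
    [{"k": "a", "v": 6.0}],
]
result = run_group("a", blocks)
print(result)  # 3.0
```

Keeping the sum and the count in the accumulator, rather than a running mean, is what lets merge combine partial results from different blocks exactly; finalize then turns the merged pair into the output value.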