f_summarise {fastplyr} | R Documentation |
Summarise each group down to one row
Description
Like dplyr::summarise() but with some internal optimisations for common statistical functions.
Usage
f_summarise(
  data,
  ...,
  .by = NULL,
  .order = df_group_by_order_default(data),
  .optimise = TRUE
)

f_summarize(
  data,
  ...,
  .by = NULL,
  .order = df_group_by_order_default(data),
  .optimise = TRUE
)
Arguments
data |
A data frame. |
... |
Name-value pairs of summary functions. Expressions with
|
.by |
(Optional). A selection of columns to group by for this operation. Columns are specified using tidy-select. |
.order |
Should the groups be returned in sorted order?
If |
.optimise |
(Optional). Set to FALSE to turn off the optimisations for common statistical functions. |
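As a rough illustration of the .by argument described above, the sketch below summarises the built-in mtcars data twice: once with .by and once on a data frame grouped beforehand with f_group_by(). The dataset and the column name total_hp are chosen only for illustration; the two calls are expected to produce the same per-group totals.

```r
library(fastplyr)

# Grouping for this operation only, via tidy-select in `.by`
by_arg <- f_summarise(mtcars, total_hp = sum(hp), .by = cyl)

# Grouping beforehand with f_group_by(); the result is still un-grouped
grouped <- f_summarise(f_group_by(mtcars, cyl), total_hp = sum(hp))

by_arg
grouped
```

Using .by keeps the grouping local to a single call, which can be convenient when the input should stay un-grouped.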
Details
f_summarise behaves mostly like dplyr::summarise except that expressions supplied to ... are evaluated independently.
Optimised statistical functions
Some functions are internally optimised using 'collapse' fast statistical functions. This makes execution on many groups very fast.
For fast quantiles (percentiles) by group, see tidy_quantiles.
List of currently optimised functions and their equivalent 'collapse' function:
base::sum -> collapse::fsum
base::prod -> collapse::fprod
base::min -> collapse::fmin
base::max -> collapse::fmax
base::mean -> collapse::fmean
stats::median -> collapse::fmedian
stats::sd -> collapse::fsd
stats::var -> collapse::fvar
dplyr::first -> collapse::ffirst
dplyr::last -> collapse::flast
dplyr::n_distinct -> collapse::fndistinct
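One way to check that the optimised path matches the ordinary one is to run the same summary with and without .optimise and compare the results. This is a minimal sketch on the built-in iris data; it assumes that .optimise = FALSE is the value that disables the optimisations, and the column name mean_len is purely illustrative.

```r
library(fastplyr)

# Optimised path: mean() is expected to be handled by collapse::fmean internally
fast <- f_summarise(iris, mean_len = mean(Sepal.Length), .by = Species)

# Assumed way to disable the optimisations: mean() is evaluated per group as-is
plain <- f_summarise(
  iris,
  mean_len = mean(Sepal.Length),
  .by = Species,
  .optimise = FALSE
)

# The two results should agree up to floating-point tolerance
all.equal(fast$mean_len, plain$mean_len)
```

On a small dataset like this the difference in speed is unlikely to be noticeable; the optimisation matters mainly when there are many groups.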
Value
An un-grouped data frame of summaries by group.
See Also
Examples
library(fastplyr)
library(nycflights13)
library(dplyr) # supplies %>%, first(), last(), n(), across() and where()

# Number of flights per month, including first and last day
flights %>%
  f_group_by(year, month) %>%
  f_summarise(
    first_day = first(day),
    last_day = last(day),
    num_flights = n()
  )

# Fast mean summary using `across()`
flights %>%
  f_summarise(
    across(where(is.double), mean),
    .by = tailnum
  )

# To ignore or keep NAs, use collapse::set_collapse(na.rm)
# With na.rm = FALSE, NAs are kept and propagate into the group means
collapse::set_collapse(na.rm = FALSE)
flights %>%
  f_summarise(
    across(where(is.double), mean),
    .by = origin
  )

# Restore NA removal (the 'collapse' default)
collapse::set_collapse(na.rm = TRUE)