summarize operator

summarize operator

Produces a table that aggregates the content of the input table.

Syntax

T | summarize [ SummarizeParameters ] [[Column =] Aggregation [, ...]] [by [Column =] GroupExpression [, ...]]

Parameters

NameTypeRequiredDescription
ColumnstringThe name for the result column. Defaults to a name derived from the expression.
AggregationstringA call to an aggregation function such as count() or avg(), with column names as arguments.
GroupExpressionscalarA scalar expression that can reference the input data. The output will have as many records as there are distinct values of all the group expressions.

[!NOTE] When the input table is empty, the output depends on whether GroupExpression is used:

  • If GroupExpression is not provided, the output will be a single (empty) row.
  • If GroupExpression is provided, the output will have no rows.

Returns

The input rows are arranged into groups having the same values of the by expressions. Then the specified aggregation functions are computed over each group, producing a row for each group. The result contains the by columns and also at least one column for each computed aggregate. (Some aggregation functions return multiple columns.)

The result has as many rows as there are distinct combinations of by values (which may be zero). If there are no group keys provided, the result has a single record.

To summarize over ranges of numeric values, use bin() to reduce ranges to discrete values.

Aggregates default values

The following table summarizes the default values of aggregations:

OperatorDefault value
count(), countif()0

When using these aggregates over entities that includes null values, the null values will be ignored and won't participate in the calculation (see examples below).

Examples

Unique combination

The following query determines what unique combinations of State and EventType there are for storms that resulted in direct injury. There are no aggregation functions, just group-by keys. The output will just show the columns for those results.

StormEvents
| where InjuriesDirect > 0
| summarize by State, EventType

Output

The following table shows only the first 5 rows. To see the full output, run the query.

StateEventType
TEXASThunderstorm Wind
TEXASFlash Flood
TEXASWinter Weather
TEXASHigh Wind
TEXASFlood
......

Minimum and maximum timestamp

Finds the minimum and maximum heavy rain storms in Hawaii. There's no group-by clause, so there's just one row in the output.

StormEvents
| where State == "HAWAII" and EventType == "Heavy Rain"
| project Duration = EndTime - StartTime
| summarize Min = min(Duration), Max = max(Duration)

Output

MinMax
01:08:0011:55:00

Distinct count

Create a row for each continent, showing a count of the cities in which activities occur. Because there are few values for "continent", no grouping function is needed in the 'by' clause:

[!div class="nextstepaction"] Run the query

StormEvents
| summarize TypesOfStorms=dcount(EventType) by State
| sort by TypesOfStorms

Output

The following table shows only the first 5 rows. To see the full output, run the query.

StateTypesOfStorms
TEXAS27
CALIFORNIA26
PENNSYLVANIA25
GEORGIA24
ILLINOIS23
......

Histogram

The following example calculates a histogram storm event types that had storms lasting longer than 1 day. Because Duration has many values, use bin() to group its values into 1-day intervals.

[!div class="nextstepaction"] Run the query

StormEvents
| project EventType, Duration = EndTime - StartTime
| where Duration > 1d
| summarize EventCount=count() by EventType, Length=bin(Duration, 1d)
| sort by Length

Output

EventTypeLengthEventCount
Drought30.00:00:001646
Wildfire30.00:00:0011
Heat30.00:00:0014
Flood30.00:00:0020
Heavy Rain29.00:00:0042
.........