max_by
Finds the row(s) containing the maximum value for a column and returns the value of another column in that row.
For example, if a table contains the columns employee_id
and salary
, MAX_BY(employee_id, salary)
returns the value of the employee_id
column for the row that has the highest value in the salary
column.
If multiple rows contain the specified maximum value, the function is non-deterministic.
Syntax
max_by( <col_to_return>, <col_containing_maximum> )
Arguments
<col_to_return>
Column containing the value to return.
<col_containing_maximum>
Column containing the maximum value.
Returns
The function returns a value of the same type as col_to_return
.
Example
The following examples demonstrate how to use the MAX_BY function.
To run these examples, execute the following statements to set up the table and data for the examples:
CREATE TABLE employees(employee_id int, department_id int, salary int);
VALUES
(1001, 10, 10000),
(1020, 10, 9000),
(1030, 10, 8000),
(900, 20, 15000),
(2000, 20, NULL),
(2010, 20, 15000),
(2020, 20, 8000),
INSERT INTO employees;
The following example returns the ID of the employee with the highest salary:
FROM employees aggregate max_by(employee_id, salary);
+-----------------------------+
| max_by(employee_id, salary) |
+-----------------------------+
| 2010 |
+-----------------------------+
Note the following:
- Because more than one row contains the maximum value for the
salary
column, the function is non-deterministic and might return the employee ID for a different row in subsequent executions. - The function ignores the NULL value in the salary column when determining the rows with the maximum values.