How to filter a ClickHouse table by an array-column?
Introduction
Filtering a ClickHouse table by an array-column is a common task and the product offers a lot of functions to work with array-columns.
In this article, we're going to focus on filtering a table by an array-column, but the video below covers a lot of other array-related functions:
Example
We'll use an example of a table with two columns tags_string
and tags_int
that contain an array of strings and integers respectively.
- Create a sample database and table.
- Create a sample table
- Insert some sample data into the table.
Filter the table using the has(arr, elem)
function to return the rows where the arr
array contains the elem
element.
Filter the table to return the rows where the tags_string
array contains the tag1
element.
Use the hasAll(arr, elems)
function to return the rows where all the elements in the elems
array are present in the arr
array.
Filter the table to return the rows where all the elements in the tags_string
array are present in the ['tag1', 'tag2']
array.
Use the hasAny(arr, elems)
function to return the rows where at least one element in the elems
array is present in the arr
array.
Filter the table to return the rows where at least one element in the tags_string
array is present in the ['tag1', 'tag2']
array.
We can use a lambda function to filter the table using the arrayExists(lambda, arr)
function.
Filter the table to return the rows where at least one element in the tags_int
array is greater than 3.