FLATTEN function for nested data

5 minutes 5 Questions

The FLATTEN function in Snowflake is a powerful table function designed to handle semi-structured data by converting nested arrays and objects into a relational format that can be easily queried and analyzed. When working with JSON, ARRAY, or VARIANT data types, you often encounter hierarchical st…

FLATTEN Function for Nested Data in Snowflake

Why FLATTEN is Important

The FLATTEN function is essential for working with semi-structured data in Snowflake, such as JSON, AVRO, and Parquet files. Modern data pipelines frequently ingest nested and hierarchical data structures, and FLATTEN allows you to transform these complex structures into relational rows that can be queried using standard SQL. This is a critical skill tested on the SnowPro Core exam.

What is FLATTEN?

FLATTEN is a table function that produces a lateral view of a VARIANT, OBJECT, or ARRAY column. It converts nested elements into separate rows, essentially 'exploding' arrays and objects into a tabular format. Each element in the nested structure becomes its own row in the output.

How FLATTEN Works

The basic syntax is:

SELECT * FROM table_name, LATERAL FLATTEN(input => column_name)

Key parameters include:
- INPUT: The VARIANT, OBJECT, or ARRAY column to flatten
- PATH: Optional path to a nested element within the input
- OUTER: When TRUE, returns rows even when input is empty (similar to outer join)
- RECURSIVE: When TRUE, flattens nested structures at all levels
- MODE: Can be OBJECT, ARRAY, or BOTH (default is BOTH)

FLATTEN returns several columns:
- SEQ: A unique sequence number
- KEY: The key for objects, or index for arrays
- PATH: The path to the element
- INDEX: The index for array elements (NULL for objects)
- VALUE: The value of the element
- THIS: The element being flattened

Common Use Cases

1. Flattening JSON arrays: Extract individual items from an array stored in a VARIANT column
2. Accessing nested objects: Navigate through multiple levels of JSON hierarchy
3. Converting semi-structured to structured data: Transform JSON data into standard relational tables

Example

For a table with JSON data containing an array of orders:
SELECT f.value:order_id::STRING as order_id, f.value:amount::NUMBER as amount FROM customers, LATERAL FLATTEN(input => customer_data:orders) f

Exam Tips: Answering Questions on FLATTEN

1. Remember LATERAL: FLATTEN is typically used with LATERAL keyword to correlate with the base table

2. Know the output columns: Exam questions often ask about SEQ, KEY, PATH, INDEX, VALUE, and THIS columns

3. Understand OUTER parameter: When OUTER is TRUE, rows with empty arrays or NULL values are preserved in the output. When FALSE (default), these rows are excluded

4. RECURSIVE vs non-recursive: Know that RECURSIVE TRUE will flatten all nested levels, while the default only flattens one level

5. MODE parameter: Questions may test whether you understand OBJECT mode (only objects), ARRAY mode (only arrays), or BOTH

6. Path notation: Be comfortable with dot notation and bracket notation for accessing nested elements (e.g., data:customer:address vs data['customer']['address'])

7. Casting output: Remember that VALUE returns VARIANT type, so you typically need to cast it (::STRING, ::NUMBER, etc.)

8. Multiple FLATTENs: For deeply nested structures, you may need to chain multiple FLATTEN operations

9. Performance consideration: FLATTEN can generate many rows from a single input row, which impacts query performance

Test mode:

Exam (Timed)

Practice (With explanations)

Start practice test

Unlock Premium Access

SnowPro Core Certification

Access to ALL Certifications: Study for any certification on our platform with one subscription
2935 Superior-grade SnowPro Core Certification practice questions
Unlimited practice tests across all certifications
Detailed explanations for every question
COF-C02: 5 full exams plus all other certification exams
100% Satisfaction Guaranteed: Full refund if unsatisfied
Risk-Free: 7-day free trial with all premium features!

More FLATTEN function for nested data questions

29 questions (total)

Start 29 question test