Snowflake provides robust data export capabilities through the COPY INTO <location> command, supporting multiple file formats and configuration options for unloading data efficiently.
**Supported File Formats:**
1. **CSV (Comma-Separated Values)**: The default format for data export. You can customize delimiters, enclosure characters, escape characters, and null string representations. Options include FIELD_DELIMITER, RECORD_DELIMITER, and FIELD_OPTIONALLY_ENCLOSED_BY.
2. **JSON (JavaScript Object Notation)**: Ideal for semi-structured data export. Snowflake can preserve VARIANT data types and nested structures. (Note that the STRIP_OUTER_ARRAY option applies when loading JSON, not when unloading.)
3. **Parquet**: A columnar storage format offering excellent compression and query performance. Snowflake supports exporting to Parquet with SNAPPY compression by default.
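As a minimal sketch of unloading to each format (the `sales` table and `@my_stage` internal stage are assumed names, used here only for illustration):

```sql
-- CSV with customized delimiter and enclosure character:
COPY INTO @my_stage/csv_out/
FROM sales
FILE_FORMAT = (TYPE = CSV FIELD_DELIMITER = '|' FIELD_OPTIONALLY_ENCLOSED_BY = '"');

-- JSON: rows are first combined into a single VARIANT column, since a
-- JSON unload expects exactly one VARIANT/OBJECT/ARRAY column:
COPY INTO @my_stage/json_out/
FROM (SELECT OBJECT_CONSTRUCT(*) FROM sales)
FILE_FORMAT = (TYPE = JSON);

-- Parquet; SNAPPY compression is applied by default:
COPY INTO @my_stage/parquet_out/
FROM sales
FILE_FORMAT = (TYPE = PARQUET);
```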
**Key Export Options:**
- **COMPRESSION**: Supports GZIP, BZ2, BROTLI, ZSTD, DEFLATE, RAW_DEFLATE, and NONE. This reduces file sizes and transfer costs.
- **SINGLE**: When set to TRUE, produces one output file rather than multiple parallel files. Useful for smaller datasets requiring a single file output.
- **MAX_FILE_SIZE**: Controls the maximum size of each output file (default 16MB for cloud storage). Larger values create fewer files but may impact parallel processing.
- **OVERWRITE**: When TRUE, replaces existing files in the target location with the same names.
- **HEADER**: For CSV exports, setting this to TRUE includes column headers in the output file.
- **FILE_EXTENSION**: Allows custom file extensions for exported files.
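A sketch combining several of these options in one statement (object names are assumptions):

```sql
COPY INTO @my_stage/export/daily_
FROM sales
FILE_FORMAT = (TYPE = CSV COMPRESSION = GZIP)
HEADER = TRUE              -- include column headers (CSV only)
SINGLE = FALSE             -- allow multiple parallel output files
MAX_FILE_SIZE = 67108864   -- ~64 MB per file instead of the ~16 MB default
OVERWRITE = TRUE;          -- replace same-named files in the target path
```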
**Export Destinations:**
Data can be exported to internal stages (user, table, or named stages) or external stages connected to cloud storage services like Amazon S3, Azure Blob Storage, or Google Cloud Storage.
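For illustration, a sketch of an external stage on Amazon S3 and an unload into it; the bucket URL and storage integration name are placeholders, not real objects:

```sql
-- Assumes a pre-existing storage integration named my_s3_integration.
CREATE OR REPLACE STAGE my_s3_stage
  URL = 's3://my-bucket/exports/'
  STORAGE_INTEGRATION = my_s3_integration;

COPY INTO @my_s3_stage/orders/
FROM orders
FILE_FORMAT = (TYPE = PARQUET);

-- An internal user stage works the same way:
COPY INTO @~/unload/orders_
FROM orders
FILE_FORMAT = (TYPE = CSV);
```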
**Best Practices:**
Use appropriate compression for cost optimization, leverage parallel file generation for large datasets, and select file formats compatible with downstream systems. Partitioning output by date or other columns using PARTITION BY improves data organization in cloud storage.
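As an example of the PARTITION BY pattern (assuming a hypothetical orders table with an order_date column):

```sql
COPY INTO @my_stage/orders/
FROM orders
PARTITION BY ('date=' || TO_VARCHAR(order_date, 'YYYY-MM-DD'))
FILE_FORMAT = (TYPE = PARQUET);
-- Produces date-keyed prefixes such as @my_stage/orders/date=2024-01-15/...
```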
**Data Export Formats and Options in Snowflake**

**Why Data Export Formats Matter**
Understanding data export formats is critical for Snowflake professionals because it directly impacts how data moves out of Snowflake to external systems, data lakes, or other applications. Choosing the right format affects storage costs, processing speed, and compatibility with downstream systems.
**What Are Data Export Formats in Snowflake?**
Snowflake supports multiple file formats when unloading (exporting) data from tables to stages. The primary export formats include:
**CSV (Comma-Separated Values)**
- Most common format for data exchange
- Human-readable text format
- Configurable delimiters, enclosures, and escape characters

**JSON (JavaScript Object Notation)**
- Ideal for semi-structured data
- Preserves nested data structures
- Widely used in modern applications and APIs

**Parquet**
- Columnar storage format
- Excellent compression ratios
- Optimized for analytical workloads
- Supports schema preservation
**How Data Export Works**

Data export in Snowflake uses the COPY INTO <location> command:

1. Source: Specify the table or query to export
2. Destination: Define the stage location (internal or external)
3. File Format: Specify format options using the FILE_FORMAT parameter
4. Partitioning: Optionally partition output files
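The steps above map onto a single statement; here is a sketch with assumed object names, exporting a filtered query rather than a whole table:

```sql
COPY INTO @my_stage/eu_sales/                     -- 2. destination stage
FROM (SELECT id, region, amount FROM sales        -- 1. source query
      WHERE region = 'EU')
PARTITION BY ('region=' || region)                -- 4. optional partitioning
FILE_FORMAT = (TYPE = PARQUET);                   -- 3. file format
```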
**Key Export Options**

- SINGLE: Set to TRUE to produce one file, FALSE for multiple parallel files
- MAX_FILE_SIZE: Controls the maximum size of each output file
- OVERWRITE: Determines whether existing files are replaced
- HEADER: Includes column headers in CSV exports
- COMPRESSION: Options include GZIP, BZ2, BROTLI, ZSTD, or NONE
**Compression Options by Format**

- CSV and JSON support: GZIP, BZ2, BROTLI, ZSTD, DEFLATE, RAW_DEFLATE, NONE
- Parquet uses: SNAPPY (default) or LZO compression built into the format
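A brief sketch contrasting the two, with assumed object names:

```sql
-- CSV with an explicit codec chosen from the list above:
COPY INTO @my_stage/csv_zstd/
FROM sales
FILE_FORMAT = (TYPE = CSV COMPRESSION = ZSTD);

-- Parquet carries compression inside the format itself:
COPY INTO @my_stage/pq_out/
FROM sales
FILE_FORMAT = (TYPE = PARQUET COMPRESSION = SNAPPY);  -- SNAPPY is the default
```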
**Exam Tips: Answering Questions on Data Export File Formats and Options**
1. Remember the command syntax: COPY INTO is used for both loading AND unloading - context determines direction
2. Know format defaults: CSV is the default export format when none is specified
3. Understand Parquet benefits: When questions mention analytical workloads, cloud storage optimization, or columnar access patterns, Parquet is typically the answer
4. File splitting behavior: By default, Snowflake creates multiple parallel files for better performance. Use SINGLE=TRUE only when a single file is required (see the sketch after this list)
5. Compression awareness: GZIP is commonly used for CSV exports. Parquet has built-in compression using SNAPPY by default
6. Header option: The HEADER option only applies to CSV format exports
7. Stage requirements: You must have appropriate privileges on the stage to export data
8. Query-based exports: You can export query results, not just entire tables, allowing for filtered or transformed exports
9. Watch for tricky wording: Questions may reference unloading data - this means exporting FROM Snowflake, while loading means importing INTO Snowflake
10. MAX_FILE_SIZE default: The default is approximately 16MB for cloud storage exports - know this for optimization questions
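Tip 4 in practice, as a minimal sketch; the table and stage names are assumptions, and the exact single-file size cap depends on the cloud provider:

```sql
-- Force one output file instead of parallel files.
COPY INTO @my_stage/full_export.csv.gz
FROM sales
FILE_FORMAT = (TYPE = CSV COMPRESSION = GZIP)
HEADER = TRUE
SINGLE = TRUE
MAX_FILE_SIZE = 4900000000;  -- raise the ~16 MB default; single files are
                             -- capped by the cloud provider (e.g. 5 GB on S3)
```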