Learn Data Protection and Data Sharing (COF-C02) with Interactive Flashcards
Master key concepts in Data Protection and Data Sharing through our interactive flashcard system. Click on each card to reveal detailed explanations and enhance your understanding.
Time Travel feature
Time Travel is a powerful data protection feature in Snowflake that enables users to access historical data at any point within a defined retention period. This capability allows organizations to query, clone, and restore data as it existed at specific moments in the past, providing essential protection against accidental or malicious data modifications.
The Time Travel retention period can be configured from 0 to 90 days, depending on the Snowflake edition. Standard Edition supports up to 1 day of Time Travel, while Enterprise Edition and higher support up to 90 days. This retention period can be set at the account, database, schema, or table level using the DATA_RETENTION_TIME_IN_DAYS parameter.
Time Travel supports several key operations. The AT clause allows querying data as it existed at a specific timestamp, while the BEFORE clause retrieves data from just before a specified point. Users can reference data using timestamps, statement IDs, or time offsets. For example, SELECT * FROM orders AT(TIMESTAMP => '2024-01-15 10:00:00'::TIMESTAMP) retrieves the table state at that exact moment.
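The queries below sketch the three ways of pinpointing a historical state. The table name orders is hypothetical and the statement ID is a placeholder; substitute your own object names and a query ID from your query history.

```sql
-- Table state at a specific timestamp (cast the literal to TIMESTAMP)
SELECT * FROM orders AT (TIMESTAMP => '2024-01-15 10:00:00'::TIMESTAMP);

-- Table state as of one hour ago (offset is expressed in seconds)
SELECT * FROM orders AT (OFFSET => -3600);

-- Table state immediately before a given statement ran (placeholder query ID)
SELECT * FROM orders BEFORE (STATEMENT => '01b2c3d4-0000-1234-0000-000000000001');
```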
The UNDROP command leverages Time Travel to restore dropped tables, schemas, or databases within the retention period. This provides a safety net against accidental deletions. Additionally, Time Travel enables zero-copy cloning of historical data states, allowing users to create clones of objects as they existed at previous points.
Time Travel consumes storage for maintaining historical data versions. Once the retention period expires, data transitions to Fail-safe, a seven-day period during which Snowflake can recover data for disaster recovery purposes. Unlike Time Travel, Fail-safe is accessible only by Snowflake support.
Understanding Time Travel is crucial for data governance, compliance requirements, and maintaining data integrity. It provides audit capabilities, supports data recovery scenarios, and enables analytical queries on historical data states, making it an essential feature for enterprise data management in Snowflake.
Fail-safe period
The Fail-safe period is a critical data protection feature in Snowflake that provides an additional layer of recovery beyond Time Travel. It represents a 7-day period during which Snowflake can recover data that would otherwise be lost, but this recovery can only be performed by Snowflake Support personnel.
When you delete or modify data in Snowflake, the data lifecycle follows this sequence: Active Data → Time Travel Period → Fail-safe Period → Permanent Deletion. After your configured Time Travel retention period expires (which can range from 0 to 90 days depending on your Snowflake edition and table type), the data enters the Fail-safe period.
Key characteristics of Fail-safe include:
1. **Duration**: The Fail-safe period is fixed at 7 days and cannot be modified or disabled by users. This is a system-level protection mechanism.
2. **Access Restrictions**: Unlike Time Travel, where users can query historical data using AT or BEFORE clauses, Fail-safe data is not accessible through standard SQL commands. Only Snowflake Support can retrieve this data during disaster recovery scenarios.
3. **Storage Costs**: Data in Fail-safe contributes to your overall storage costs. Organizations should factor this into their storage budget calculations, as all data remains billable during this period (see the query sketch after this list for one way to monitor it).
4. **Table Types**: Fail-safe applies to permanent tables. Temporary and transient tables do not have Fail-safe protection, making them suitable for staging data where extended recovery is unnecessary.
5. **Recovery Process**: To recover data from Fail-safe, you must contact Snowflake Support and provide justification. Recovery is performed on a best-effort basis and may take time.
6. **Purpose**: Fail-safe serves as a last resort for catastrophic data loss scenarios, such as accidental bulk deletions or system failures that occur after Time Travel has expired.
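As a minimal sketch of monitoring that cost, the ACCOUNT_USAGE.TABLE_STORAGE_METRICS view breaks storage down into active, Time Travel, and Fail-safe bytes; the filter and ordering below are purely illustrative.

```sql
-- Tables currently holding data in Fail-safe, largest first
SELECT table_catalog, table_schema, table_name,
       active_bytes, time_travel_bytes, failsafe_bytes
FROM snowflake.account_usage.table_storage_metrics
WHERE failsafe_bytes > 0
ORDER BY failsafe_bytes DESC;
```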
Understanding Fail-safe is essential for designing appropriate data retention strategies and calculating total storage requirements in Snowflake environments.
Data retention settings
Data retention settings in Snowflake are crucial configurations that determine how long historical data and deleted/modified data remain accessible for recovery purposes. These settings are primarily governed by two key features: Time Travel and Fail-safe.
Time Travel allows users to access historical data within a defined retention period. The DATA_RETENTION_TIME_IN_DAYS parameter controls this window, which can be set at the account, database, schema, or table level. For Snowflake Standard Edition, the maximum retention period is 1 day, while Enterprise Edition and higher support up to 90 days. During this period, users can query data as it existed at any point, clone objects from historical states, and restore dropped objects using the UNDROP command.
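A minimal sketch of setting the parameter at different levels follows; the object names are hypothetical, and the account-level change assumes a role with the required privileges (typically ACCOUNTADMIN).

```sql
-- Account-level default retention
ALTER ACCOUNT SET DATA_RETENTION_TIME_IN_DAYS = 30;

-- Override at the database and table level (hypothetical object names)
ALTER DATABASE sales SET DATA_RETENTION_TIME_IN_DAYS = 90;
ALTER TABLE sales.public.orders SET DATA_RETENTION_TIME_IN_DAYS = 7;

-- Inspect the effective setting for a table
SHOW PARAMETERS LIKE 'DATA_RETENTION_TIME_IN_DAYS' IN TABLE sales.public.orders;
```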
Fail-safe provides an additional 7-day period of data protection after the Time Travel retention period expires. This feature is managed exclusively by Snowflake and serves as a disaster recovery mechanism. Unlike Time Travel, Fail-safe data cannot be accessed by users through standard queries; only Snowflake support can retrieve this data in case of system failures or catastrophic events.
Storage costs are directly impacted by retention settings. Longer retention periods mean more historical data is maintained, increasing storage consumption. Organizations must balance data protection requirements against cost considerations when configuring these settings.
Transient and temporary tables have different retention characteristics. Transient tables support Time Travel retention of 0 or 1 day only and have no Fail-safe period. Temporary tables exist only for the session duration and similarly lack Fail-safe protection.
Best practices include setting appropriate retention periods based on compliance requirements, regularly reviewing retention configurations across objects, and understanding that changing a retention period also affects data already in Time Travel: increasing the period keeps existing historical data longer, while decreasing it moves data older than the new period out of Time Travel. Proper configuration of data retention settings ensures optimal balance between data recoverability, compliance adherence, and cost management in your Snowflake environment.
UNDROP command
The UNDROP command in Snowflake is a powerful data recovery feature that allows users to restore objects that have been accidentally or intentionally dropped. This command is part of Snowflake's Time Travel capability, which maintains historical data for a specified retention period.
The UNDROP command can be used to recover several types of objects including databases, schemas, tables, and tags. When an object is dropped in Snowflake, it is not permanently deleted right away. Instead, it moves to a recoverable state where it remains accessible for restoration during the Time Travel retention period.
The syntax for using UNDROP is straightforward. For example, to restore a dropped table, you would use: UNDROP TABLE table_name. Similarly, you can restore schemas with UNDROP SCHEMA schema_name or databases with UNDROP DATABASE database_name.
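The statements below sketch a typical recovery flow, including the rename step described later in this card; all object names are hypothetical.

```sql
-- List dropped tables that are still within the Time Travel retention period
SHOW TABLES HISTORY IN SCHEMA sales.public;

-- If a new table has since taken the original name, rename it first
ALTER TABLE sales.public.orders RENAME TO orders_replacement;

-- Restore the dropped objects
UNDROP TABLE sales.public.orders;
UNDROP SCHEMA sales.staging;
UNDROP DATABASE sales_archive;
```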
There are important considerations when using UNDROP. First, the restoration must occur within the Time Travel retention period, which can range from 0 to 90 days depending on your Snowflake edition and object configuration. Standard edition supports up to 1 day, while Enterprise edition and higher support up to 90 days.
Second, if a new object with the same name has been created after the original was dropped, you must first rename the new object before executing the UNDROP command. This is because Snowflake cannot have two objects with identical names in the same scope.
Third, when you restore a database or schema, all child objects that existed at the time of dropping are also restored, maintaining the hierarchical structure.
The UNDROP feature provides significant protection against accidental data loss and supports business continuity by enabling quick recovery of critical data assets. This capability is essential for maintaining data governance and ensuring that organizations can recover from human errors or unintended deletions efficiently. Understanding UNDROP is crucial for the SnowPro Core Certification as it demonstrates knowledge of Snowflake's data protection mechanisms.
Zero-copy cloning
Zero-copy cloning is a powerful feature in Snowflake that allows you to create instant copies of databases, schemas, and tables at no additional storage cost at the time of cloning. This capability is fundamental to understanding data protection and efficient data management within the Snowflake platform.
When you execute a CLONE command, Snowflake creates a new object that references the same underlying micro-partitions as the source object. No physical data is copied during this operation, which is why it is called 'zero-copy.' The cloning process happens almost instantaneously, regardless of the size of the source object, because only metadata pointers are created.
The key benefits of zero-copy cloning include rapid development and testing environments, point-in-time snapshots for backup purposes, and the ability to experiment with data transformations safely. Since clones share the original data storage, you only incur additional costs when modifications are made to either the source or the cloned object. When changes occur, Snowflake creates new micro-partitions for the modified data while unchanged data continues to be shared.
Zero-copy cloning supports several object types including databases, schemas, tables, and streams. You can clone objects across different schemas or databases within the same account. The cloned objects inherit the structure and data of the source at the moment of cloning but become independent entities afterward.
For data protection purposes, cloning works seamlessly with Time Travel, allowing you to clone objects at specific points in time within the Time Travel retention period. This provides excellent recovery options and enables you to restore data to previous states when needed.
Important considerations include that a clone does not inherit the privileges granted on the source object itself (unless COPY GRANTS is specified for tables), external tables cannot be cloned, and temporary or transient tables can be cloned only to temporary or transient tables. Understanding these nuances is essential for effectively leveraging this feature in your Snowflake environment.
Cloning databases, schemas, and tables
Cloning in Snowflake is a powerful feature that allows you to create instant copies of databases, schemas, and tables using a zero-copy approach. This means that when you clone an object, Snowflake does not physically duplicate the underlying data. Instead, it creates metadata pointers to the existing micro-partitions, making the operation extremely fast and storage-efficient.
To clone objects, you use the CREATE ... CLONE command. For example, CREATE DATABASE new_db CLONE source_db creates a complete copy of a database including all its schemas and tables. Similarly, you can clone individual schemas with CREATE SCHEMA new_schema CLONE source_schema or tables with CREATE TABLE new_table CLONE source_table.
Cloning supports Time Travel, allowing you to clone objects as they existed at a specific point in time using the AT or BEFORE clause. This is invaluable for recovering from accidental data modifications or deletions.
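A minimal sketch combining cloning with Time Travel, using hypothetical object names and a placeholder statement ID:

```sql
-- Clone a table as it existed 24 hours ago (offset in seconds)
CREATE TABLE orders_restored CLONE orders AT (OFFSET => -86400);

-- Clone an entire database as it existed just before a specific statement ran
CREATE DATABASE sales_snapshot CLONE sales
  BEFORE (STATEMENT => '01b2c3d4-0000-1234-0000-000000000001');
```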
When a clone is created, it initially shares the same micro-partitions as the source object. As modifications are made to either the source or the clone, Snowflake creates new micro-partitions for the changed data, following a copy-on-write model. This ensures data independence between the source and clone while optimizing storage usage.
Key considerations include: a clone does not automatically inherit the privileges granted on the source object itself, although child objects within a cloned database or schema retain their privileges; grants made after cloning are managed independently on each side. Database and schema clones include all child objects. Cloning is useful for creating development or testing environments, taking snapshots before major changes, and sharing data across teams.
Cloning works across schemas within the same database and across databases within the same account. However, you cannot clone objects across different Snowflake accounts. External tables, internal stages with files, and some other specialized objects have specific cloning behaviors or limitations. Understanding these nuances is essential for effective data management and protection strategies in Snowflake.
Clone inheritance
Clone inheritance in Snowflake refers to how cloned objects inherit properties and privileges from their source objects. When you create a clone of a database, schema, or table, the clone inherits certain characteristics while maintaining independence from the original object.
Key aspects of clone inheritance include:
1. **Data Inheritance**: Clones initially share the same underlying data storage as the source object through Snowflake's zero-copy cloning feature. This means no additional storage is consumed at clone creation time. Data divergence occurs only when modifications are made to either the source or the clone.
2. **Structural Inheritance**: The clone inherits the complete structure of the source object, including column definitions, clustering keys, constraints, and other metadata properties.
3. **Privilege Inheritance**: When cloning databases or schemas, the clone inherits the privileges granted on contained objects. However, privileges granted on the container itself (the database or schema being cloned) are not inherited by the clone. The user creating the clone becomes the owner of the cloned object.
4. **Child Object Inheritance**: When cloning a database, the schemas and objects within it are cloned; when cloning a schema, the tables, views, and other objects within it are cloned (with a few exceptions, such as external tables and internal named stages). This creates a hierarchical copy of the source.
5. **Time Travel Inheritance**: Clones can be created from historical data using Time Travel, allowing you to clone objects as they existed at a specific point in time.
6. **Data Sharing Considerations**: Clones of shared databases behave as independent objects. The cloned data is no longer tied to the data sharing relationship.
7. **Independence After Creation**: Once created, clones are fully independent objects. Changes to the source do not affect the clone, and changes to the clone do not affect the source.
Understanding clone inheritance is essential for managing data protection strategies, creating development environments, and implementing effective testing scenarios in Snowflake.
Dynamic data masking
Dynamic Data Masking (DDM) in Snowflake is a powerful column-level security feature that protects sensitive data by obscuring it at query runtime while preserving the original data in storage. This capability is essential for organizations handling personally identifiable information (PII), financial data, or other confidential information.
Dynamic Data Masking works by applying masking policies to columns containing sensitive data. When users query these columns, the masking policy evaluates their role and context to determine whether to display the actual data or a masked version. The original data remains unchanged in the database, ensuring data integrity while controlling visibility.
Snowflake implements DDM through masking policies, which are schema-level objects that define the masking logic using SQL expressions. Administrators create these policies specifying conditions under which data should be masked and what the masked output should look like. For example, a social security number might display as XXX-XX-1234, showing only the last four digits to certain users.
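A minimal sketch of that pattern follows; the policy, role, table, and column names are hypothetical, and the masking expression simply keeps the last four characters of the value.

```sql
-- Define a masking policy that reveals full SSNs only to a privileged role
CREATE OR REPLACE MASKING POLICY ssn_mask AS (val STRING) RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() IN ('HR_ADMIN') THEN val
    ELSE 'XXX-XX-' || RIGHT(val, 4)
  END;

-- Attach the policy to a column; masking is applied at query time
ALTER TABLE employees MODIFY COLUMN ssn SET MASKING POLICY ssn_mask;
```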
Key characteristics of Dynamic Data Masking include:
1. Role-based access: Masking decisions are typically based on the executing user's role, allowing granular control over data visibility.
2. Centralized management: Policies are defined once and can be applied to multiple columns across different tables, ensuring consistent protection.
3. Real-time application: Masking occurs during query execution, meaning no duplicate masked tables are needed.
4. Flexibility: Custom masking functions can be created to handle various data types and masking requirements.
5. Audit compliance: Helps organizations meet regulatory requirements like GDPR, HIPAA, and PCI-DSS by limiting exposure of sensitive data.
Masking policies support conditional logic, allowing different masking behaviors based on user context. This enables scenarios where analysts see masked data while authorized personnel view complete information. The feature integrates seamlessly with Snowflake's role-based access control system, providing comprehensive data protection strategies for enterprise environments.
Row access policies
Row Access Policies in Snowflake are a powerful data governance feature that enables fine-grained access control at the row level within tables and views. This functionality allows organizations to restrict which rows of data specific users or roles can see, ensuring sensitive information is only accessible to authorized personnel.
A Row Access Policy is a schema-level object that contains an expression defining the filtering logic. When applied to a table or view, Snowflake evaluates this expression for each row during query execution, determining whether the current user should see that particular row based on predefined conditions.
The policy uses a combination of the CURRENT_ROLE() or CURRENT_USER() functions along with mapping tables to make access decisions. For example, a healthcare organization might create a policy ensuring doctors only see patient records from their assigned department, while administrators have broader access.
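A minimal sketch of such a policy, using hypothetical table, role, and mapping-table names:

```sql
-- Return TRUE for rows the current role is allowed to see
CREATE OR REPLACE ROW ACCESS POLICY dept_policy AS (dept VARCHAR) RETURNS BOOLEAN ->
  CURRENT_ROLE() = 'HOSPITAL_ADMIN'
  OR EXISTS (
    SELECT 1
    FROM security.dept_mapping m
    WHERE m.role_name = CURRENT_ROLE()
      AND m.department = dept
  );

-- Attach the policy to the table, binding it to the department column
ALTER TABLE patient_records ADD ROW ACCESS POLICY dept_policy ON (department);
```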
Key characteristics of Row Access Policies include:
1. **Centralized Management**: Policies are defined once and can be applied to multiple tables, simplifying administration and ensuring consistent security enforcement.
2. **Transparent Operation**: End users experience seamless query execution - the filtering happens automatically in the background with no changes required to their SQL statements.
3. **Dynamic Evaluation**: Access decisions are made at query runtime, allowing policies to adapt based on current user context and role assignments.
4. **Stackable Design**: Multiple policies can be applied to the same table when different security requirements exist.
To implement Row Access Policies, users need the appropriate privileges including CREATE ROW ACCESS POLICY at the schema level and APPLY ROW ACCESS POLICY to attach policies to objects. The SECURITYADMIN or ACCOUNTADMIN roles typically manage these permissions.
Row Access Policies complement other Snowflake security features like Column-level Security and Data Masking, providing comprehensive protection for sensitive data while maintaining usability for authorized users across different business contexts.
Object tagging
Object tagging in Snowflake is a powerful metadata management feature that enables organizations to classify, organize, and govern their data assets effectively. Tags are schema-level objects that can be assigned to various Snowflake objects including databases, schemas, tables, views, columns, warehouses, and more.
Tags consist of a tag name and an associated string value, allowing users to create meaningful classifications for their data. For example, you might create tags like 'sensitivity_level' with values such as 'public', 'confidential', or 'restricted' to categorize data based on its sensitivity.
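As a minimal sketch of that example (the tag, database, and table names are hypothetical):

```sql
-- Create the tag and constrain its allowed values
CREATE TAG governance.tags.sensitivity_level
  ALLOWED_VALUES 'public', 'confidential', 'restricted';

-- Apply it to a table and to an individual column
ALTER TABLE sales.public.customers
  SET TAG governance.tags.sensitivity_level = 'confidential';
ALTER TABLE sales.public.customers MODIFY COLUMN email
  SET TAG governance.tags.sensitivity_level = 'restricted';
```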
One of the most significant benefits of object tagging is its support for data governance and compliance requirements. Organizations can track sensitive data across their Snowflake environment by applying tags that identify PII (Personally Identifiable Information), PHI (Protected Health Information), or other regulated data types. This makes it easier to maintain compliance with regulations like GDPR, HIPAA, and CCPA.
Tag lineage is another valuable feature where tags applied to parent objects automatically propagate to child objects. For instance, a tag applied to a table will be inherited by all columns within that table, ensuring consistent classification throughout the hierarchy.
Tags integrate seamlessly with Snowflake's access control mechanisms. You can create tag-based masking policies that automatically apply data masking based on tag values, providing dynamic data protection. Row access policies can also leverage tags to control which rows users can access.
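For the tag-based masking pattern, a masking policy can be attached to the tag itself so that every column carrying the tag is masked; the tag and policy names below are hypothetical and the policy is assumed to already exist.

```sql
-- Any string column tagged with sensitivity_level inherits this masking policy
ALTER TAG governance.tags.sensitivity_level
  SET MASKING POLICY governance.policies.mask_string_pii;
```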
To create and manage tags, users need appropriate privileges including CREATE TAG and APPLY TAG, and the ability to query tag assignments through the TAG_REFERENCES and TAG_REFERENCES_ALL_COLUMNS table functions in INFORMATION_SCHEMA or the TAG_REFERENCES view in the Account Usage schema.
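Two illustrative ways to look up tag assignments (object names are hypothetical; the Account Usage view has some latency):

```sql
-- INFORMATION_SCHEMA table function: tags on a specific object
SELECT *
FROM TABLE(sales.information_schema.tag_references('sales.public.customers', 'table'));

-- Account-wide view of tag assignments
SELECT object_name, column_name, tag_name, tag_value
FROM snowflake.account_usage.tag_references
WHERE tag_name = 'SENSITIVITY_LEVEL';
```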
Object tagging supports data sharing scenarios by helping organizations understand what data classifications exist before sharing with other accounts. This ensures proper governance when exposing data externally through secure data sharing features.
Column-level security
Column-level security in Snowflake is a powerful data protection feature that allows organizations to control access to sensitive data at the granular column level within tables and views. This capability is essential for maintaining compliance with data privacy regulations and protecting confidential information while still enabling authorized users to query the data they need.
Snowflake implements column-level security primarily through two mechanisms: Dynamic Data Masking and External Tokenization.
Dynamic Data Masking allows you to define masking policies that determine how sensitive column data appears to different users based on their roles. When a user queries masked columns, the policy evaluates their context and either reveals the actual data or returns a masked version (such as replacing characters with asterisks or showing null values). This happens at query runtime, meaning the underlying data remains unchanged while different users see different representations based on their privileges.
Masking policies are schema-level objects that can be applied to multiple columns across different tables, promoting reusability and consistent security enforcement. Administrators create these policies using SQL commands and attach them to specific columns. The policies support conditional logic, allowing complex rules based on the querying user's role, current context, or other factors.
External Tokenization integrates with third-party tokenization providers to replace sensitive data with tokens. This approach is particularly useful when organizations already have tokenization infrastructure in place.
Key benefits of column-level security include: maintaining a single copy of data while serving multiple user groups with different access requirements, simplifying compliance with regulations like GDPR and HIPAA, reducing the need to create multiple views or table copies for different audiences, and enabling centralized policy management.
Column-level security works seamlessly with Snowflake's role-based access control system, allowing organizations to implement defense-in-depth strategies where both row-level and column-level protections can be combined for comprehensive data governance.
Data classification
Data classification in Snowflake is a crucial feature that helps organizations identify, categorize, and protect sensitive data stored within their databases. This capability enables businesses to maintain compliance with regulatory requirements and implement appropriate security measures based on data sensitivity levels.
Snowflake provides built-in data classification functionality that automatically analyzes and categorizes data in your tables. The system examines column metadata and actual data content to assign classification tags. These classifications help identify sensitive information such as personally identifiable information (PII), financial data, healthcare records, and other confidential data types.
The classification process involves two main components: semantic categories and privacy categories. Semantic categories describe what the data represents, such as email addresses, phone numbers, credit card numbers, or social security numbers. Privacy categories indicate the sensitivity level, helping determine how the data should be handled and protected.
Snowflake offers system-defined classification tags that cover common sensitive data types. Organizations can also create custom classification tags to address specific business requirements or industry regulations. Once data is classified, you can view classification results through the EXTRACT_SEMANTIC_CATEGORIES function or access them via the Account Usage views.
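As a hedged sketch of that workflow (the fully qualified table name is a placeholder), classification can be run and its suggestions applied as system tags:

```sql
-- Analyze the table and return suggested semantic/privacy categories as JSON
SELECT EXTRACT_SEMANTIC_CATEGORIES('analytics.public.customers');

-- Apply the suggested system classification tags to the table's columns
CALL ASSOCIATE_SEMANTIC_CATEGORY_TAGS(
  'analytics.public.customers',
  EXTRACT_SEMANTIC_CATEGORIES('analytics.public.customers')
);
```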
Data classification integrates with other Snowflake security features. Organizations can use classification results to implement access policies, masking policies, and row access policies. This integration allows for automated data protection based on classification tags, ensuring sensitive data receives appropriate security controls.
The classification feature supports governance initiatives by providing visibility into where sensitive data resides across your Snowflake environment. Security teams can generate reports showing data distribution and take proactive measures to protect critical information assets.
By leveraging data classification, organizations can better manage their data protection strategies, meet compliance obligations, and reduce the risk of unauthorized access to sensitive information stored in Snowflake.
Secure data sharing between accounts
Secure data sharing between accounts in Snowflake is a powerful feature that enables organizations to share live data across different Snowflake accounts seamlessly and securely. This capability eliminates the need for copying or moving data, ensuring that consumers always access the most current information while providers maintain complete control over their data assets.
Snowflake's data sharing architecture operates through a unique approach where the actual data never leaves the provider's account. Instead, providers create shares - named objects that encapsulate database objects such as tables, secure views, and secure user-defined functions. These shares contain metadata and access privileges that allow consumer accounts to query the underlying data in real-time.
The provider account creates a share and grants privileges on specific database objects to that share. Consumer accounts can then create databases from these shares, enabling their users to query the shared data as if it were stored locally. This process happens through Snowflake's services layer, ensuring data remains in place while access is extended securely.
Security is maintained through several mechanisms. Providers can use secure views to filter and restrict which data consumers can access, protecting sensitive information. Role-based access control allows granular permission management within both provider and consumer accounts. All data remains encrypted and protected by Snowflake's security infrastructure.
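A minimal sketch of using a secure view to narrow what a share exposes; the view, table, share, and filter below are hypothetical.

```sql
-- Expose only selected columns and rows through the share
CREATE OR REPLACE SECURE VIEW sales.public.orders_shared AS
  SELECT order_id, order_date, region, total_amount
  FROM sales.public.orders
  WHERE region = 'EMEA';

GRANT SELECT ON VIEW sales.public.orders_shared TO SHARE partner_share;
```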
Data sharing supports both direct sharing between accounts and sharing through the Snowflake Marketplace, where data providers can publish listings for other organizations to discover and consume. Reader accounts can be created for organizations that do not have their own Snowflake account, enabling broader data distribution.
Key benefits include zero data movement, real-time data access, reduced storage costs for consumers, simplified data governance, and the ability to monetize data assets. This functionality transforms how organizations collaborate and exchange information across business boundaries.
Reader accounts
Reader accounts in Snowflake are a specialized type of account designed to enable data sharing with consumers who do not have their own Snowflake account. This feature is particularly valuable for organizations that want to share data with external parties such as clients, partners, or vendors who may not be existing Snowflake customers.
When a data provider creates a reader account, Snowflake establishes a separate, managed account that the provider controls and pays for. The reader account operates under the provider's billing umbrella, meaning all compute costs incurred by the reader account are charged to the provider organization. This arrangement makes it easy to share data with external entities who lack Snowflake infrastructure.
Reader accounts have several key characteristics. First, they can only consume data that has been shared with them through secure shares - they cannot create their own databases or load their own data. Second, the provider maintains administrative control over the reader account, including the ability to manage users and set resource limits. Third, reader accounts use their own virtual warehouses for query processing, which the provider can configure and monitor.
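A minimal sketch of creating a reader account and granting it access to an existing share; the names, password, and account locator are placeholders (the real locator is returned when CREATE MANAGED ACCOUNT runs).

```sql
-- Provider creates and pays for the managed reader account
CREATE MANAGED ACCOUNT partner_reader
  ADMIN_NAME = 'reader_admin',
  ADMIN_PASSWORD = 'Example-Passw0rd!',  -- placeholder credential
  TYPE = READER;

-- Grant the reader account access to an existing share (placeholder locator)
ALTER SHARE sales_share ADD ACCOUNTS = AB12345;
```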
From a security perspective, reader accounts benefit from the same robust data protection mechanisms as standard Snowflake accounts. Data remains encrypted, and access controls can be applied to determine what data the reader account can access. The shared data never leaves the provider's account - reader accounts simply have read-only access to the shared objects.
For the SnowPro Core Certification exam, understanding that reader accounts are provider-managed accounts specifically created for non-Snowflake consumers is essential. Key points include knowing that providers bear the compute costs, reader accounts have read-only capabilities, and they represent one of several data sharing options alongside standard Snowflake-to-Snowflake sharing through the Data Exchange or private listings.
Shares and share objects
In Snowflake, Shares are a fundamental feature that enables secure data sharing between Snowflake accounts without the need to copy or transfer data. This capability is a cornerstone of Snowflake's Data Sharing functionality, allowing organizations to collaborate efficiently while maintaining data governance.
A Share is a named Snowflake object that encapsulates all the information required to share specific database objects with one or more consumer accounts. When you create a share, you are essentially creating a container that holds references to the data you want to make available to other parties.
Share objects refer to the database objects that can be included within a share. These include databases, schemas, tables, secure views, secure materialized views, and secure user-defined functions (UDFs). The provider account grants privileges on these objects to the share, which then makes them accessible to designated consumer accounts.
The key benefits of using shares include zero data movement, meaning consumers access the same underlying data stored in the provider's account. This ensures data is always current and eliminates storage costs for consumers. Additionally, providers maintain complete control over what data is shared and can revoke access at any time.
To create and manage shares, providers use SQL commands such as CREATE SHARE, GRANT privileges TO SHARE, and ALTER SHARE to add consumer accounts. Consumer accounts then create a database from the share using CREATE DATABASE FROM SHARE to access the shared data.
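A minimal end-to-end sketch of that flow, with hypothetical database, share, and account identifiers:

```sql
-- Provider account: create the share and grant access to specific objects
CREATE SHARE sales_share;
GRANT USAGE ON DATABASE sales TO SHARE sales_share;
GRANT USAGE ON SCHEMA sales.public TO SHARE sales_share;
GRANT SELECT ON TABLE sales.public.orders TO SHARE sales_share;
ALTER SHARE sales_share ADD ACCOUNTS = partner_org.partner_account;

-- Consumer account: create a read-only database from the share
CREATE DATABASE sales_from_partner FROM SHARE provider_org.provider_account.sales_share;
```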
Secure views are particularly important in sharing scenarios because they prevent consumers from seeing the underlying table structures or data beyond what is explicitly shared. This adds an essential layer of security.
Shares support both direct sharing between accounts and listing on the Snowflake Marketplace, enabling broader data monetization opportunities. Reader accounts can also be created for organizations that do not have their own Snowflake account, expanding sharing possibilities to external parties.
Snowflake Marketplace
Snowflake Marketplace is a powerful feature within the Snowflake Data Cloud that enables organizations to discover, access, and share data products and services seamlessly. It serves as a centralized hub where data providers can publish their datasets, data services, and applications, while data consumers can browse and subscribe to these offerings.
Key aspects of Snowflake Marketplace include:
**Data Sharing Capabilities**: The Marketplace leverages Snowflake's secure data sharing technology, allowing providers to share live, ready-to-query data with consumers. This eliminates the need for traditional ETL processes, file transfers, or data copying, ensuring consumers always have access to the most current information.
**Types of Listings**: Providers can offer free or paid listings, including standard datasets, personalized data products, and data services. Listings can be public (available to all Snowflake accounts) or private (restricted to specific consumers).
**Data Products**: These include third-party datasets covering various domains such as financial data, weather information, demographic statistics, geospatial data, and industry-specific datasets from leading data providers.
**Governance and Security**: Data shared through the Marketplace maintains Snowflake's robust security model. Providers retain control over their data, and consumers access it within their own Snowflake environment with appropriate access controls.
**Cross-Cloud Functionality**: Snowflake Marketplace supports data sharing across different cloud platforms and regions, enabling global collaboration while maintaining data governance standards.
**Native Applications**: Beyond raw data, providers can also publish Snowflake Native Apps through the Marketplace, offering packaged solutions that combine data with business logic.
**Benefits**: Organizations can monetize their data assets, reduce time-to-insight by accessing pre-built datasets, and collaborate with partners through secure data exchange. The Marketplace eliminates data silos and accelerates data-driven decision making across organizations.
For the SnowPro Core Certification, understanding how Marketplace facilitates secure, governed data sharing is essential knowledge.
Data Exchange
Data Exchange in Snowflake is a powerful feature that enables organizations to securely share and consume data with external parties, creating a collaborative data ecosystem. It serves as a hub where data providers can publish datasets and data consumers can discover and access them.
A Data Exchange functions as a private marketplace for data sharing between selected participants. Unlike public marketplaces, Data Exchanges are invitation-only environments where organizations control who can participate. This makes them ideal for sharing sensitive information between business partners, subsidiaries, or trusted third parties.
Key characteristics of Data Exchange include:
1. **Governance and Control**: Data providers maintain complete control over their shared data. They can define access policies, track usage, and revoke access at any time. The original data never leaves the provider's account, ensuring data sovereignty.
2. **Live Data Access**: Consumers access real-time data rather than static copies. This eliminates data staleness and reduces storage costs since no physical data movement occurs between accounts.
3. **Secure Sharing**: All data sharing happens within Snowflake's secure infrastructure. Data is encrypted and protected by role-based access controls, ensuring compliance with security requirements.
4. **Listings**: Providers create listings that describe available datasets, including metadata, sample data, and usage terms. Consumers browse these listings to find relevant data for their needs.
5. **Cross-Region and Cross-Cloud Sharing**: Data Exchanges support sharing across different Snowflake regions and cloud platforms, enabling global data collaboration.
6. **Reader Accounts**: Providers can create reader accounts for consumers who do not have their own Snowflake accounts, expanding the potential audience for shared data.
Data Exchange differs from Snowflake Marketplace, which is a public platform open to all Snowflake customers. Organizations use Data Exchanges for controlled, private data sharing scenarios where participant selection and data governance are paramount priorities.
Listings
Listings in Snowflake are a fundamental component of the Data Sharing and Marketplace ecosystem. A listing is essentially a package of data products that a provider creates to share or sell data with consumers through the Snowflake Marketplace or private data exchanges.
Snowflake supports several types of listings:
1. **Standard Listings**: These are free data shares that providers make available to consumers. They enable organizations to share data across different Snowflake accounts seamlessly, allowing consumers to access the shared data in real-time using their own compute resources.
2. **Personalized Listings**: These listings allow providers to offer customized data products tailored to specific consumer needs. Providers can set access controls and customize what data each consumer can view.
3. **Paid Listings**: Through the Snowflake Marketplace, providers can monetize their data by creating paid listings where consumers pay for access to premium datasets.
Key characteristics of Listings include:
- **No Data Movement**: When data is shared through listings, no actual data copying occurs. Consumers query the data from the provider's account, ensuring data remains current and eliminating storage duplication costs.
- **Secure Access**: Providers maintain full control over their data, determining who can access it and what portions are visible. Access can be revoked at any time.
- **Cross-Region and Cross-Cloud Sharing**: Listings support sharing data across different cloud providers and regions, though replication may be required for cross-region scenarios.
- **Discoverability**: Listings in the Snowflake Marketplace are discoverable by potential consumers, making it easier for data providers to reach their target audience.
- **Governance**: Providers can track usage and manage access through Snowflake's built-in governance features.
For the SnowPro Core exam, understanding how listings facilitate secure data sharing between organizations while maintaining data protection and access control is essential knowledge.