
Snowflake Integration Patterns for AI Agents: Customer Warehouses, Secure Sharing, and Real-Time Read Paths
How AI platforms securely query customer Snowflake data in real time without exports, latency issues, or data residency risks

Chris Lopez
Founding GTM
For many enterprise customers, the data that matters lives in Snowflake. They have spent years consolidating CRM data, product analytics, billing, and operational systems into a Snowflake account that is now the canonical source of truth for who their customers are, what those customers do, and what those customers are worth.
For an AI platform hoping to tap into this source of truth, integrating against the customer's Snowflake account is the difference between a generic enrichment service and a customer-grounded one. The generic service can tell you that "Acme Corp is a 500-person SaaS company in San Francisco." The grounded service can tell you that "Acme Corp is one of your customers, currently in the renewal window, with declining product engagement and an open expansion conversation in their account team's HubSpot." The difference is the customer's Snowflake.
But Snowflake integrations are not iPaaS-shaped. They are not CRM-shaped. They are warehouse-shaped, with their own auth model, their own performance characteristics, their own security expectations, and their own scaling questions. This post walks through the architectural patterns that work for AI enrichment platforms reading from customer-owned Snowflake warehouses at scale, the security model that enterprise buyers expect, and the read-path optimizations that determine whether the integration is fast enough to matter.
Why Snowflake is the integration that wins enterprise AI enrichment deals
The enterprise AI enrichment buyer in 2026 has a specific procurement pattern. They evaluate the vendor's external data quality (intent signals, firmographic enrichment, technographics). They evaluate the vendor's CRM integration depth. And then they ask the question that determines whether the deal closes: "Can your platform read from our Snowflake account?"
The reason is that the customer's Snowflake account contains the data the customer cannot afford to expose to a generic enrichment vendor. Account-level revenue. Product usage telemetry. Renewal forecasts. Custom segmentation that took the customer years of data engineering to build. The customer wants the AI enrichment platform to use this data in the agent's reasoning, but they want the data to stay in their own warehouse, queried in real time, not exfiltrated to a vendor's storage.
The platforms that win these deals have all converged on a common architectural shape. The integration vendor's product runs federated queries against the customer's Snowflake account. The customer's data does not leave their warehouse. The query results are streamed to the agent's reasoning layer, used in the moment, and discarded. The vendor's data residency commitments hold. The customer's data engineering team retains control. We have written about BYOC and data residency for product integrations in the broader context, and Snowflake federated reads are one of the cleanest cases.
The platforms that lose these deals are the ones that ask the customer to "export your relevant tables to our system on a schedule." This is unacceptable for any enterprise data engineering team running on a modern cloud architecture. The export model is slow, stale, expensive, and architecturally backwards.
The auth model: connected apps, key pairs, and OAuth flows
Snowflake's auth model offers several options for integration vendors, and the right choice depends on the customer's posture.
The first option is OAuth 2.0 through a Snowflake-issued OAuth integration. The integration vendor registers their application with Snowflake, the customer's admin authorizes the vendor's app, and the integration uses the resulting refresh token to obtain access tokens for queries. This is the cleanest model for vendor-side integration but requires the customer's Snowflake admin to set up the OAuth integration. For customers who want browser-based admin approval workflows, this is the right pattern.
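To make the OAuth pattern concrete, here is a minimal sketch of the vendor-side token refresh, assuming a Snowflake-issued OAuth integration. The account, client, and token values are hypothetical; the endpoint is Snowflake's documented OAuth token-request path.

```python
import requests

# Hypothetical values, issued when the customer's admin creates the
# OAuth integration and authorizes the vendor's app.
ACCOUNT_URL = "https://acme-xy12345.snowflakecomputing.com"
CLIENT_ID = "vendor-client-id"
CLIENT_SECRET = "vendor-client-secret"
REFRESH_TOKEN = "stored-refresh-token"

def get_access_token() -> str:
    """Exchange the long-lived refresh token for a short-lived access token."""
    resp = requests.post(
        f"{ACCOUNT_URL}/oauth/token-request",
        auth=(CLIENT_ID, CLIENT_SECRET),  # client credentials via HTTP Basic
        data={"grant_type": "refresh_token", "refresh_token": REFRESH_TOKEN},
    )
    resp.raise_for_status()
    return resp.json()["access_token"]
```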
The second option is key-pair authentication. The customer creates a service user in their Snowflake account, generates an RSA key pair, and provides the public key to the vendor. The vendor uses the corresponding private key to authenticate. This is the highest-control model: the customer can rotate the key, scope the service user to specific roles and warehouses, and audit every query the vendor's integration runs. For regulated industries and customers with strict data governance, this is often the only acceptable pattern.
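Key-pair auth is similarly compact from the vendor side. A minimal sketch using snowflake-connector-python and the cryptography library; identifiers are hypothetical, and the customer side registers the matching public key on the service user (ALTER USER ... SET RSA_PUBLIC_KEY = '...').

```python
import snowflake.connector
from cryptography.hazmat.primitives import serialization

# Load the vendor-held private key; the customer holds the matching
# public key on the scoped service user.
with open("rsa_key.p8", "rb") as f:
    private_key = serialization.load_pem_private_key(f.read(), password=None)

conn = snowflake.connector.connect(
    account="acme-xy12345",   # hypothetical account identifier
    user="SVC_VENDOR",        # customer-created service user
    private_key=private_key.private_bytes(
        encoding=serialization.Encoding.DER,
        format=serialization.PrivateFormat.PKCS8,
        encryption_algorithm=serialization.NoEncryption(),
    ),
    role="VENDOR_READ_ROLE",  # scoped by the customer to read-only access
    warehouse="VENDOR_WH",    # the warehouse the customer allocated
)
```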
The third option is external OAuth, where the customer's existing identity provider (Okta, Azure AD, Ping) federates authentication into Snowflake. The integration vendor authenticates against the identity provider, gets a token, and uses it for Snowflake queries. This is the right pattern for customers with mature identity infrastructure, where adding a new integration without going through the IdP is a policy violation.
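Once the vendor holds a token from the customer's IdP (the acquisition flow is IdP-specific and out of scope here), handing it to the connector is short. A sketch with hypothetical identifiers:

```python
import snowflake.connector

# access_token was obtained from the customer's IdP (Okta, Azure AD, Ping)
# through a standard OAuth 2.0 flow the IdP governs.
conn = snowflake.connector.connect(
    account="acme-xy12345",   # hypothetical
    user="SVC_VENDOR",
    authenticator="oauth",    # present the externally issued bearer token
    token=access_token,
    warehouse="VENDOR_WH",
)
```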
The integration platform's job is to support all three patterns and let the customer pick the one that fits their posture. We have written about how auth and token management is its own product surface, and Snowflake auth is one of the cleanest illustrations because the customer's preferred pattern is always going to be the one their data security team trusts.
The query model: roles, warehouses, and quota interactions
Snowflake's query execution model differs from a typical SaaS API. Queries run on warehouses (compute resources) that the customer pays for. The customer's warehouse sizing determines query speed. The customer's role-based access control determines which tables and views the query can see. And the customer's overall Snowflake quota determines how much query time the integration can consume.
The architectural implication is that the integration vendor cannot treat Snowflake reads as "free" the way they might treat HubSpot reads. Every query consumes the customer's compute budget. Heavy queries (full table scans, complex joins) can saturate the customer's smaller warehouses or push them onto larger ones, costing the customer more.
The integration vendor's job is to be a respectful tenant on the customer's warehouse. This means using narrow column selection (don't SELECT *), filtering aggressively at the source (push predicates down), respecting the customer's clustering and partitioning, and avoiding unnecessary repeated queries. The integration platform should expose query templates the customer's data engineering team can review and adjust, and the platform should track query cost per customer so the integration vendor can optimize over time.
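What a respectful read looks like in practice, as a sketch: narrow projection, predicates pushed down into Snowflake, and bind variables. The table and column names are hypothetical; the %(name)s binding style is the connector's default.

```python
# Narrow projection and source-side filtering: Snowflake does the work,
# and the vendor never pulls a wide table across the wire.
ACCOUNT_HEALTH_SQL = """
    SELECT account_id, arr, renewal_date, health_score
    FROM analytics.core.dim_account
    WHERE account_domain = %(domain)s
      AND is_active
"""

with conn.cursor() as cur:  # conn: an authenticated connection, as above
    cur.execute(ACCOUNT_HEALTH_SQL, {"domain": "acme.com"})
    row = cur.fetchone()
```

Bind variables keep the query text stable across calls, which matters for the caching patterns discussed next.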
Some sophisticated patterns help further. Materialized views or scheduled exports to a small "integration-friendly" schema can reduce repeated heavy queries to one cheap one. Query result caching, especially for queries that return the same data within a short window, can drastically reduce compute consumption. Snowflake's own query result cache helps here, but only if the integration vendor's query patterns are stable enough to hit it.
The schema discovery problem: customer-specific warehouse layouts
Every customer's Snowflake account is laid out differently. Some customers have a clean, dimensional model (a dim_customer, dim_account, fact_revenue set of tables in a core schema). Others have a raw-and-conformed pattern with dbt-managed transformations. Others have a heterogeneous warehouse with imported tables from many source systems and minimal harmonization.
The integration vendor's read query has to be authored against the customer's specific schema. There is no standard "Customer table" with standard columns the way there is in a CRM. The customer's data engineering team is the one who knows where the relevant data lives, what it is called, and how it joins.
The architectural answer is per-customer query configuration. The integration vendor's CS team works with the customer's data engineering team to author the relevant queries, validates them against the customer's actual data, and stores them per-customer in declarative configuration. This is the same pattern as customer-specific ERP query support (different system, same architecture). The integration platform exposes the SQL interface, scoped to the customer's warehouse, with managed auth and rate-aware execution.
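To make the shape concrete, here is one plausible form for that per-customer configuration. This is an illustrative sketch, not Ampersand's actual configuration format.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CustomerQuery:
    """One named, reviewable query in a customer's integration config."""
    name: str
    sql: str                # authored against this customer's schema
    mode: str               # "sync" (agent critical path) or "async" (scheduled)
    refresh: Optional[str]  # cadence for async queries, e.g. "1h"

# Hypothetical config for one customer, stored declaratively and
# version-controlled so changes go through normal SQL review.
ACME_QUERIES = [
    CustomerQuery(
        name="account_health",
        sql="SELECT account_id, arr, health_score "
            "FROM analytics.core.dim_account WHERE account_domain = %(domain)s",
        mode="sync",
        refresh=None,
    ),
    CustomerQuery(
        name="lookalike_candidates",
        sql="SELECT account_id, fit_score FROM analytics.growth.lookalike_scores",
        mode="async",
        refresh="1h",
    ),
]
```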
The customer's data engineering team can review and edit the queries through a code review process, the same way they would review any other SQL change. This is what makes the integration acceptable to customers with mature data governance: the integration vendor is a query author with reviewable, version-controlled queries, not a black-box exfiltration tool.
Real-time vs scheduled: the latency budget for AI enrichment
A subtle but important question for any AI enrichment platform integrating with Snowflake is whether the queries run synchronously (on the agent's critical path, in the moment) or asynchronously (on a schedule, with results pre-staged for the agent).
Synchronous queries are the right pattern when the agent needs the freshest possible data and the query is fast enough to fit in the agent's latency budget. With a properly sized warehouse and a query that prunes well against the table's clustering (Snowflake has no traditional indexes), sub-second response is achievable. But the latency depends on the customer's warehouse size, the query complexity, and the warehouse's warm-up state. Cold warehouses can take seconds to spin up.
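When a synchronous query does sit on the agent's path, it helps to cap the runtime so a cold warehouse degrades gracefully instead of stalling the agent. A sketch using Snowflake's session-level statement timeout; the fallback helper is hypothetical.

```python
from snowflake.connector.errors import ProgrammingError

with conn.cursor() as cur:  # conn: an authenticated connection, as above
    # Give critical-path queries a hard 2-second budget.
    cur.execute("ALTER SESSION SET STATEMENT_TIMEOUT_IN_SECONDS = 2")
    try:
        cur.execute(
            "SELECT health_score FROM analytics.core.dim_account "  # hypothetical
            "WHERE account_domain = %(domain)s",
            {"domain": "acme.com"},
        )
        row = cur.fetchone()
    except ProgrammingError:
        # Statement was cancelled (e.g. cold warehouse spin-up): serve
        # staged data rather than blocking the agent.
        row = staged_fallback("acme.com")  # hypothetical fallback helper
```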
Asynchronous queries are the right pattern when the data freshness budget is hours or days, and the query is expensive. The integration runs the query on a schedule (every hour, every night, every week), stages the results in a low-latency store the agent can read from, and serves the agent's queries from the staged data. This trades freshness for latency and cost.
Most AI enrichment platforms use a hybrid: a small set of high-value, freshness-sensitive queries run synchronously on the agent's path, while a larger set of bulk enrichment queries run asynchronously and stage results. The integration platform should support both patterns through the same configuration model, with the choice driven by the use case.
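The dispatch logic can hang off the same configuration. A sketch, continuing the hypothetical CustomerQuery shape from earlier:

```python
def resolve(query: CustomerQuery, params: dict) -> list:
    """Route a configured query to the live warehouse or to staged results."""
    if query.mode == "sync":
        # Freshness-sensitive: run against the customer's warehouse now.
        with conn.cursor() as cur:
            cur.execute(query.sql, params)
            return cur.fetchall()
    # Bulk enrichment: serve results the scheduler staged on `query.refresh`
    # cadence. `staged_store` is a hypothetical low-latency store.
    return staged_store.get(query.name, params)
```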
The growth-engineering use case: outbound enrichment at scale
A specific pattern worth flagging for AI enrichment platforms: the growth engineering use case where a customer wants to run prospecting workflows that join external enrichment data with their internal Snowflake data. Examples include identifying lookalike accounts of high-value customers, scoring open opportunities against historical conversion patterns, or routing inbound leads based on warehouse-resident segmentation.
The architectural shape is distinctive. The agent or workflow takes an external signal (a new prospect, a new inbound lead, a target account list) and joins it against the customer's Snowflake-resident data to produce a context-aware action. The query is per-event, latency-sensitive (often sub-second on the agent's path), and depends on the customer's specific table layout.
The naive implementation is to ship the external signal to Snowflake, run a query, and ship the results back. This works for low volume and high latency tolerance. It fails for the high-volume growth engineering use case because every external event triggers a Snowflake query, the customer's compute budget gets consumed faster than expected, and the latency variance pushes the workflow into asynchronous territory.
The architecture that scales is to materialize the relevant Snowflake-resident data into an integration-friendly read schema, refresh it on a cadence appropriate to the freshness budget (every 15 minutes, every hour, every night, depending on use case), and serve the agent's queries from the materialized data. The customer's data engineering team owns the materialization SQL, with the integration platform handling the refresh schedule and the read-path optimization. We have written about why CRM platforms need agent-ready integration infrastructure, and the warehouse-as-source-of-truth pattern is the natural extension.
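One way to wire the materialization inside Snowflake itself is a scheduled task that rebuilds an integration-friendly table. CREATE TASK and its SCHEDULE clause are standard Snowflake DDL; the schema, table, and join below are hypothetical.

```python
# Run once by the customer's data engineering team (or a reviewed migration).
MATERIALIZE_DDL = """
CREATE OR REPLACE TASK integration_reads.refresh_account_scores
  WAREHOUSE = integration_wh
  SCHEDULE = '15 MINUTE'
AS
  CREATE OR REPLACE TABLE integration_reads.account_scores AS
  SELECT a.account_id, a.account_domain, s.fit_score, s.intent_score
  FROM analytics.core.dim_account a
  JOIN analytics.growth.scores s USING (account_id)
"""

with conn.cursor() as cur:
    cur.execute(MATERIALIZE_DDL)
    # Tasks are created suspended; resume to start the 15-minute cadence.
    cur.execute("ALTER TASK integration_reads.refresh_account_scores RESUME")
```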
Industry context: data warehouses as systems of record
The shift from "the CRM is the source of truth" to "the warehouse is the source of truth" has been one of the most consequential architectural shifts in B2B SaaS over the past five years. The 2026 Snowflake Summit data showed that 80%+ of Snowflake's mid-market and enterprise customers now have their CRM data flowing into Snowflake on at least an hourly cadence, and 60% have material customer-decision data (renewals, expansion, churn risk) computed in the warehouse rather than the CRM.
The implication for AI enrichment and GTM platforms is that the warehouse is no longer a downstream analytics destination. It is an upstream system of record that the agent has to read from to give grounded responses. Vendors who treat Snowflake as "a nice-to-have integration" are losing deals to vendors who treat it as a tier-one read source alongside the CRM.
The same shift is happening with BigQuery and Redshift, with Databricks rapidly catching up in the mid-market and lower-enterprise segment. The 2026 Forrester Wave for cloud data warehouses showed that all three have crossed the threshold of being procurement criteria for AI enrichment platforms. The vendors with deep, secure read integrations across all three are the ones expanding upmarket. The vendors with one or two are getting boxed out.
Comparison: data exports, generic connectors, and Ampersand for Snowflake integration
| Dimension | Customer-managed scheduled exports | Generic ELT/iPaaS connector | Ampersand |
|---|---|---|---|
| Data residency (data leaves customer's warehouse?) | Often yes | Often yes | No, federated reads |
| Auth model | Custom per customer | Limited options | OAuth, key-pair, external OAuth |
| Per-customer query authoring | Manual | Recipe-only | Declarative SQL, version-controlled |
| Schema discovery | Manual | Limited | First-class per customer |
| Synchronous vs asynchronous | Async only | Variable | Both, configurable |
| Customer's data engineering team review | Manual | Limited | First-class through SQL config |
| Per-customer query cost telemetry | None | Limited | Built-in |
| Multi-warehouse support (Snowflake, BigQuery, Redshift) | Custom per system | Variable | Native |
The first comparison column deserves emphasis. Many integration vendors have shipped Snowflake "integrations" that consist of asking the customer to export their relevant tables on a schedule. This satisfies a checkbox but fails on data residency, freshness, and customer governance. The architecturally correct answer is federated reads against the customer's warehouse, with the data never leaving their account.
How Ampersand handles Snowflake for AI enrichment platforms
Ampersand is a deep integration platform built for product developers shipping integrations as part of their product. For AI enrichment, GTM, and prospecting platforms reading from customer-owned data warehouses, the load-bearing capabilities are these.
Multi-auth Snowflake connector. OAuth 2.0, key-pair authentication, and external OAuth (Okta, Azure AD, Ping) are all first-class. The customer picks the pattern their data security team approves.
Federated reads, no data exfiltration. The integration runs queries against the customer's warehouse and streams results to your platform. The customer's data does not land in our storage.
Per-customer SQL configuration. Queries live in declarative configuration, scoped per customer, version-controlled, reviewable by the customer's data engineering team, and editable without code deploys.
Synchronous and asynchronous query patterns. The same connector supports both, with configuration determining which queries run on the agent's critical path and which run on a schedule.
Per-customer query cost telemetry. The dashboard surfaces query count, compute time, and approximate cost per customer, so your CS team can flag heavy queries before they become customer escalations.
Multi-warehouse support. Snowflake, BigQuery, Redshift, and Databricks SQL are all supported with the same architectural model. The customer picks the warehouse, the integration adapts.
Schema discovery. The integration enumerates the customer's available databases, schemas, and tables on first connect, so your CS engineer (or the customer's data engineer) can author queries against the actual schema.
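At the SQL level, first-connect discovery is straightforward. A generic sketch (not Ampersand's implementation) that enumerates what the integration's role can actually see:

```python
# Enumerate tables visible to the integration's role, so queries are
# authored against the customer's real layout rather than a guessed one.
with conn.cursor() as cur:
    cur.execute("SHOW TERSE DATABASES")
    databases = [row[1] for row in cur.fetchall()]  # second column is the name

    for db in databases:
        cur.execute(
            f"SELECT table_schema, table_name "
            f"FROM {db}.information_schema.tables "
            f"WHERE table_schema <> 'INFORMATION_SCHEMA' "
            f"ORDER BY table_schema, table_name"
        )
        for schema, table in cur.fetchall():
            print(f"{db}.{schema}.{table}")
```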
The Ampersand sell
If you build an AI enrichment, GTM, or prospecting platform and your enterprise customers run on Snowflake (or BigQuery, Redshift, or Databricks SQL), the integration to the customer's warehouse is the deal-making capability. Asking the customer to export tables on a schedule loses deals. Federated reads that respect the customer's data residency win them.
Ampersand handles the full architecture. Multi-auth Snowflake connector with OAuth, key-pair, and external OAuth. Federated reads with no data exfiltration. Per-customer SQL configuration that the customer's data engineering team can review and edit. Synchronous and asynchronous query patterns. Per-customer query cost telemetry. Multi-warehouse support across the major cloud data platforms.
The Ampersand documentation walks through the Snowflake connector, the auth options, and the SQL configuration model. The how-it-works page shows the architecture end to end. If you want to talk through your specific Snowflake integration scope (auth posture, query patterns, cost considerations) with someone who has shipped this exact pattern, the team is reachable through the main site.
If you would like access to our Snowflake connector, please reach out here.
FAQ
Does Ampersand store the customer's Snowflake data?
No. The integration runs federated reads against the customer's warehouse. Query results stream through the platform to your downstream system. The customer's data does not land in our storage. This is the data residency commitment most enterprise data security teams require.
What auth options does the Snowflake connector support?
OAuth 2.0 through a Snowflake-issued OAuth integration, key-pair authentication with RSA keys, and external OAuth through the customer's existing identity provider (Okta, Azure AD, Ping). The customer chooses the pattern.
How does query authoring work?
Per-customer SQL queries live in declarative configuration. Your CS engineer (or the customer's data engineering team) authors the queries, reviews them through standard PR workflows, and ships them through CI/CD. The customer's data engineering team can audit every query the integration runs.
What's the typical query latency?
Sub-second to a few seconds, depending on the customer's warehouse size, query complexity, and warm-up state. For latency-sensitive agent paths, the typical pattern is a small, properly sized warehouse kept warm by frequent activity.
What about BigQuery, Redshift, and Databricks?
All supported with the same architectural model. The customer picks the warehouse, the integration adapts. Auth and query authoring patterns are analogous.
How do you handle the customer's quota and compute cost?
The dashboard surfaces query count, compute time, and approximate cost per customer. Your CS team can flag heavy queries before they become customer escalations. Query result caching and materialized view patterns are supported for cost-sensitive use cases.
Can the integration read across multiple Snowflake databases or schemas?
Yes. The customer's data engineering team scopes the service user's role to the relevant databases and schemas, and the per-customer SQL configuration can join across them as needed.
Conclusion
The customer's data warehouse (Snowflake for many enterprises; BigQuery, Redshift, or Databricks SQL for the rest) is the system of record AI enrichment and GTM platforms have to integrate against. The integration is not iPaaS-shaped or CRM-shaped. It is warehouse-shaped, with its own auth model, query model, schema discovery problem, and data residency expectations. The platforms that win enterprise deals do federated reads with no exfiltration. The platforms that lose ask the customer to export tables on a schedule.
Ampersand is built for the warehouse integration shape. If you are scoping how your AI enrichment platform reads from customer-owned Snowflake (or BigQuery, Redshift, Databricks), the right path is federated, secure, per-customer query authoring on managed integration infrastructure. Learn more at withampersand.com.