Snowflake Goes GA — Separating Storage from Compute Rewrites the DWH

Name: Snowflake Goes GA — Separating Storage from Compute Rewrites the DWH
Start: 2014-10-01

In October 2014, the San Mateo startup Snowflake Computing reached general availability for its cloud data warehouse service Snowflake, following a closed beta in June.

Two years after Amazon Redshift (announced 2012, GA 2013) had opened the door to the cloud DWH, Snowflake walked through it carrying its own design principle: fully separate storage from compute. Six years later its IPO would value the company at roughly US$70 billion by the first-day close, the largest software IPO in history at the time.

Founders — Veterans of Oracle and Vectorwise

Snowflake's three founders sat at the heart of the recent history of commercial DBMS.

Benoit Dageville, French-born, was an Oracle architect and one of the principal designers of Oracle Database's parallel execution engine and partitioning features. Thierry Cruanes had been Oracle's chief optimiser architect. Marcin Żukowski authored the "vectorised query execution" line of research that produced MonetDB/X100 and its commercial form Vectorwise.

In 2012 the three founded Snowflake Computing in San Mateo. The concept was a clean redesign of the data warehouse for the cloud era: carry forward no legacy RDB architecture, and design from scratch on AWS S3 and EC2. They worked in stealth for two years and shipped Snowflake in 2014.

The Core of the Design — Complete Separation of Storage and Compute

Snowflake's central innovation is the multi-cluster, shared-data architecture, a three-tier structure.

Storage layer. All data is held in Amazon S3 (later Azure Blob and GCS as well), in compressed columnar format. Because compute is detached, storage can grow on its own: it is effectively unlimited, cheap, and inherits S3's design target of eleven nines (99.999999999 per cent) of durability.

Compute layer. Queries run in "virtual warehouses" that can be started and stopped independently of storage. Sizes range from X-Small to 6X-Large, and multiple warehouses can run in parallel against the same data—a BI warehouse, an ETL warehouse, and a data-science warehouse can each have their own compute resources without interfering with each other's performance.

Cloud services layer. A shared layer for authentication, metadata, query optimisation, and transaction management.

The three-tier split was decisive. Early Redshift fixed the ratio of storage to compute per node: add storage and you add compute, and vice versa. Snowflake removed that constraint. Storage scales cheaply and indefinitely on S3, while compute spins up only when needed and is billed per second while it runs (with a 60-second minimum each time a warehouse starts).
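The difference between the two scaling models can be put into rough arithmetic. The sketch below is illustrative only. The credit table follows Snowflake's published sizing for standard warehouses as commonly documented (X-Small costs 1 credit per hour and each size up doubles), and the 60-second minimum is likewise taken from its billing rules; treat both as assumptions to check against current documentation. The `coupled_nodes_needed` helper and its per-node capacities are hypothetical and stand in for any node-based design where storage and compute are bundled.

```python
import math

# Assumed credit rates: X-Small is 1 credit/hour and each larger size doubles.
SIZES = ["X-Small", "Small", "Medium", "Large", "X-Large",
         "2X-Large", "3X-Large", "4X-Large", "5X-Large", "6X-Large"]
CREDITS_PER_HOUR = {size: 2 ** i for i, size in enumerate(SIZES)}

MIN_BILLED_SECONDS = 60  # assumed minimum charge each time a warehouse starts


def credits_used(size: str, running_seconds: int) -> float:
    """Credits for one start-to-suspend run of a warehouse, billed per second."""
    billed = max(running_seconds, MIN_BILLED_SECONDS)
    return CREDITS_PER_HOUR[size] * billed / 3600


def coupled_nodes_needed(storage_tb: float, compute_units: float,
                         tb_per_node: float, units_per_node: float) -> int:
    """Nodes required when each node bundles a fixed amount of storage and compute.

    Hypothetical model: whichever resource needs more nodes sets the cluster size,
    so the other resource is over-provisioned.
    """
    return max(math.ceil(storage_tb / tb_per_node),
               math.ceil(compute_units / units_per_node))


# A Medium warehouse that runs for 90 seconds and is then suspended.
short_run = credits_used("Medium", 90)

# 100 TB of data but only 4 units of compute demand, on nodes of 2 TB + 1 unit each:
# storage forces 50 nodes even though 4 would cover the compute.
nodes = coupled_nodes_needed(storage_tb=100, compute_units=4,
                             tb_per_node=2, units_per_node=1)
```

In the coupled case the cluster is sized by the larger of the two requirements and paid for around the clock. In the decoupled case the 100 TB sits in object storage, and compute is charged only for the seconds a warehouse is actually running.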

Storage, in turn, is shared by every virtual warehouse. Different users, teams, or workloads can query the same physical data with separate compute, which is why the design is called "multi-cluster, shared-data".
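The shape of the three layers can be sketched as a toy model. This is not Snowflake's implementation; every class and method name below is hypothetical, the services layer models only the metadata role, and an in-memory dictionary stands in for S3. The point it illustrates is that warehouses hold no data of their own, so starting or stopping one affects neither the data nor any other warehouse.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectStore:
    """Storage layer: stands in for S3; holds columnar files by key."""
    files: dict = field(default_factory=dict)

    def put(self, key, columns):
        self.files[key] = columns

    def get(self, key):
        return self.files[key]


@dataclass
class CloudServices:
    """Services layer: only the metadata role is modelled (table -> file keys)."""
    catalog: dict = field(default_factory=dict)

    def register(self, table, key):
        self.catalog.setdefault(table, []).append(key)

    def files_for(self, table):
        return list(self.catalog.get(table, []))


@dataclass
class VirtualWarehouse:
    """Compute layer: owns no data, only handles to the two shared layers."""
    name: str
    store: ObjectStore
    services: CloudServices
    running: bool = False

    def resume(self):
        self.running = True

    def suspend(self):
        self.running = False

    def sum_column(self, table, column):
        if not self.running:
            raise RuntimeError(f"warehouse {self.name} is suspended")
        total = 0
        # Ask the services layer which files make up the table, then scan them.
        for key in self.services.files_for(table):
            total += sum(self.store.get(key)[column])
        return total


store, services = ObjectStore(), CloudServices()
store.put("sales/part-0", {"amount": [10, 20, 30]})
store.put("sales/part-1", {"amount": [5, 15]})
services.register("sales", "sales/part-0")
services.register("sales", "sales/part-1")

# Two independent warehouses over the same physical data.
bi = VirtualWarehouse("bi", store, services)
etl = VirtualWarehouse("etl", store, services)
bi.resume()
etl.resume()
bi_total = bi.sum_column("sales", "amount")

etl.suspend()  # stopping one warehouse leaves the data and the other warehouse untouched
bi_total_after = bi.sum_column("sales", "amount")
```

Both warehouses are constructed with the same `store` and `services` objects, which is the toy equivalent of "shared data": adding a third warehouse for data science would require no copy of the `sales` files.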

2014-2020 — Quiet Rapid Growth

At GA in October 2014, Snowflake was still a little-known startup. Marketing was restrained, and the company came across as an engineering shop solving customer pain points one at a time.

The inflection came in 2016, as enterprise customers such as Capital One, Adobe, and Sony Music began to go public with their adoption. Companies frustrated with Redshift's performance collapsing under concurrent workloads, or with BigQuery's Google-only ecosystem, started switching to Snowflake for its ability to run multiple workloads in parallel.

In 2018 Snowflake added Microsoft Azure (previously AWS only) and the following year Google Cloud. Multi-cloud as a design choice became its definitive differentiator: customers had a DWH that was not locked into any single cloud provider.

2020 IPO — The Largest Software IPO in History

On 16 September 2020, Snowflake listed on the NYSE. Priced at US$120, it closed day one at US$253.93, more than double the offer price, with a market capitalisation of roughly US$70 billion. That made it the largest software IPO of its time. Berkshire Hathaway (Warren Buffett) and Salesforce Ventures both bought in through a private placement alongside the IPO, which broke the convention that Buffett does not buy tech stocks.

CEO Frank Slootman, formerly CEO of ServiceNow, ran the business while co-founders Dageville and Cruanes remained in technical leadership. This division of labour between designers and operator worked.

2024 — Over US$3 Billion in Revenue, the DWH for the AI Era

In FY2024 (year ending January 2024) Snowflake reported approximately US$2.8 billion in revenue, up 36 per cent year on year, and revenue passed US$3 billion in the following fiscal year. Customer count was around 10,000, with more than 700 customers spending over US$1 million annually. More than 60 per cent of the Fortune 500 are said to use Snowflake.

Since 2023, Snowflake has been rapidly adding AI and developer features: Snowpark (Python/Scala data processing), Snowpark Container Services (run arbitrary containers), Cortex (call LLM inference from SQL), and Iceberg Tables (open table-format support). Its competition with Databricks, framed as data warehouse versus data lakehouse, became one of the most closely watched contests in the industry in 2024-2025.

What October 2014 Means

Snowflake's GA was the moment the cloud-native DWH design pattern reached its mature form. Where Redshift transposed the on-premises DWH onto the cloud, Snowflake answered the question "what does a DWH designed for the cloud from scratch look like?" That answer—full separation of storage and compute—has propagated to Databricks, ClickHouse Cloud, OSS data platforms built on Apache Iceberg, and even OLTP-side systems like Aurora, Neon, and PlanetScale.

If Redshift began "renting a DWH by the hour", Snowflake completed "renting it by the second". October 2014 marks a turning point in a twelve-year story in which data infrastructure went from a hardware purchase order to a function call.
