Enterprises Now Deploy GPU Servers Thanks to Open-Source AI
Roundup #1
Hello Data folks 👋
Nvidia reported $46.7B Q2 revenue (56% YoY growth) as GPU demand shifts from hyperscalers to enterprise customers deploying on-premises AI infrastructure. CEO Jensen Huang credited open-source AI models for driving enterprise adoption, with companies like Disney, Hyundai, and SAP deploying new RTX Pro servers.
The company launched GPU-powered servers designed for standard IT environments last month. The server market saw a record 134% year-over-year jump to $95.2B in Q1, driven by GPU demand. However, analysts expect AI infrastructure growth to slow from 250% (2022-2024) to 67% in 2025 as enterprises shift toward smaller, specialized models requiring less compute.
Data leaders should evaluate whether current infrastructure plans account for this shift toward distributed enterprise AI deployments versus centralized cloud-only strategies. The cloud giants are no longer the only game in town.
Source: CIO Dive (5 minute article)
Why Your Data Infrastructure Migration Project Will Fail (and How To Succeed)
Seattle Data Guy | 18th April 2025 | 11 minute video
Most data infrastructure migrations fail or remain incomplete, creating costly Frankenstein architectures that double spending and frustrate CFOs.
Key failure patterns: lack of detailed migration plans, unclear ownership across teams, insufficient testing of migrated data, and poor change management. Successful migrations require technical project leadership, comprehensive task tracking, automated testing scripts to verify data consistency, and early stakeholder buy-in from affected teams.
Data leaders should establish central ownership with technical expertise BEFORE announcing any platform migration. Create detailed migration checklists, build testing automation (not negotiable!), and involve downstream teams in tool selection.
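If you want a concrete starting point for that testing automation, here's a minimal sketch in Python. It assumes DB-API connections to both the legacy and target platforms; the connection objects, table names, and the id key column are placeholders for illustration, not a prescription:

```python
# Sketch: verify data consistency between the old and new platform by comparing
# row counts and a cheap aggregate per table. Connection objects, table names,
# and the `id` key column are placeholders for illustration only.
def compare_tables(source_conn, target_conn, tables):
    """Both connections are assumed to be DB-API 2.0 compliant."""
    mismatches = []
    for table in tables:
        stats = []
        for conn in (source_conn, target_conn):
            cur = conn.cursor()
            # Row count plus a key-column sum as a crude consistency proxy;
            # swap in a real checksum strategy for production use.
            cur.execute(f"SELECT COUNT(*), COALESCE(SUM(id), 0) FROM {table}")
            stats.append(cur.fetchone())
        if stats[0] != stats[1]:
            mismatches.append((table, stats[0], stats[1]))
    return mismatches  # empty list means counts and key sums line up

# Example usage (placeholders):
# bad = compare_tables(legacy_conn, new_platform_conn, ["orders", "customers"])
# assert not bad, f"Migration drift detected: {bad}"
```

Wire something like this into CI during the migration window so every cutover milestone gets the same pass/fail check.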
Why Is Everyone Buying Change Data Capture?
Estuary | 18th April 2025 | 20 minute read
Major tech companies are buying Change Data Capture (CDC) instead of building it, despite having thousands of developers. Since 2015, there have been over 15 CDC acquisitions totaling $10B+, including IBM's $2.3B StreamSets deal and Qlik's $2.4B Talend acquisition.
The pattern reveals CDC's deceptive complexity (it’s a trap!). What starts as “just tailing a log file” becomes managing database heterogeneity, schema evolution, exactly-once delivery, and operational nightmares across multiple database types. One FAANG team spent 3 years and millions before abandoning their internal solution.
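To make the "just tailing a log file" starting point concrete, here's a minimal sketch against Postgres logical replication with psycopg2, assuming a wal2json replication slot already exists; the DSN and slot name are made up, and everything that makes CDC hard (heterogeneous sources, schema evolution, exactly-once delivery, backfills) is deliberately left out:

```python
# The deceptively simple starting point: tail Postgres's WAL via a logical
# replication slot. The DSN, slot name, and wal2json output plugin are
# assumptions; everything the article warns about is exactly what this ignores.
import psycopg2
from psycopg2.extras import LogicalReplicationConnection

conn = psycopg2.connect("dbname=app user=cdc",
                        connection_factory=LogicalReplicationConnection)
cur = conn.cursor()
# Assumes the slot was created earlier with output_plugin="wal2json".
cur.start_replication(slot_name="cdc_slot", decode=True)

def handle(msg):
    print(msg.payload)  # JSON describing inserts/updates/deletes
    msg.cursor.send_feedback(flush_lsn=msg.data_start)  # acknowledge progress

cur.consume_stream(handle)  # blocks, invoking handle() for each change
```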
TCO analysis shows building in-house costs $3.6M-$6M upfront plus $600K-$900K annually for maintenance, versus commercial solutions at $50K-$500K per year. Engineering teams are choosing to focus on core differentiators rather than rebuilding infrastructure.
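Back-of-the-envelope, over an assumed five-year horizon (the article only gives the per-year figures), those ranges compound like this:

```python
# Rough 5-year TCO using the article's figures (all amounts in $M).
build_low,  build_high = 3.6 + 5 * 0.6, 6.0 + 5 * 0.9   # initial build + annual maintenance
buy_low,    buy_high   = 5 * 0.05,      5 * 0.5          # commercial license per year
print(f"Build: ${build_low:.1f}M-${build_high:.1f}M   Buy: ${buy_low:.2f}M-${buy_high:.1f}M")
# Build: $6.6M-$10.5M   Buy: $0.25M-$2.5M
```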
Streaming Data Into the Lakehouse With Iceberg and Trino at Going
Data Engineering Podcast | 18th April 2025 | 40 minute podcast
Going is a deal platform for flight tickets. They process 50 petabytes of flight data annually through Confluent Kafka streams into Trino and Iceberg, serving real-time travel deals to consumers. The travel platform migrated from Snowflake to an open lakehouse architecture to handle gigabyte-per-second data ingestion from global distribution systems.
Their 25-person engineering team combines streaming data processing with consumer-facing applications, using Postgres and Elasticsearch for sub-second query responses while maintaining analytical capabilities on massive datasets. The architecture separates storage and compute costs while enabling multiple processing engines.
This demonstrates how streaming lakehouses can power consumer applications beyond traditional analytics use cases. Worth evaluating for teams requiring both real-time processing and cost-effective storage at petabyte scale.
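If you're curious what the query side of a stack like this looks like, here's a minimal sketch using the trino Python client pointed at an Iceberg catalog; the host, catalog, schema, and table names are made up for illustration and not Going's actual setup:

```python
# Sketch: querying an Iceberg table through Trino with the `trino` client.
# Host, credentials, catalog, schema, and table names are illustrative only.
import trino

conn = trino.dbapi.connect(
    host="trino.internal.example.com",  # placeholder coordinator
    port=8080,
    user="analytics",
    catalog="iceberg",
    schema="flights",
)
cur = conn.cursor()
cur.execute("""
    SELECT origin, destination, MIN(price_usd) AS best_price
    FROM fares
    WHERE departure_date BETWEEN DATE '2025-06-01' AND DATE '2025-06-30'
    GROUP BY origin, destination
    ORDER BY best_price
    LIMIT 20
""")
for origin, destination, best_price in cur.fetchall():
    print(origin, destination, best_price)
```

Separating storage (Iceberg on object storage) from compute (Trino, plus Postgres/Elasticsearch for the serving path) is what lets the same data back both analytics and the consumer-facing product.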
One More Thing(s)
X: LLMs are basically Reddit wrappers.
That’s the brief.


