PuppyGraph brings real-time graph analytics to Databricks with Iceberg table integration

PuppyGraph announces real-time graph query integration with Databricks’ Iceberg Tables, eliminating ETL and enabling in-place analytics on massive datasets.

PuppyGraph, a real-time graph query engine, has formally announced its native integration with Managed Iceberg Tables on the Databricks Data Intelligence Platform. The announcement was made on June 13 during the lead-up to the Data + AI Summit 2025, where the integration enters public preview. This development enables customers to perform real-time graph analytics directly on Iceberg-based datasets governed by Unity Catalog, removing the need for extract, transform, and load (ETL) pipelines.

The integration marks a major milestone in unifying graph analytics with open data lakehouse architecture, allowing clients such as Coinbase and CipherOwl to analyze relationship-driven data natively on the Databricks platform. PuppyGraph’s latest move follows a series of integrations in the past year, including its support for Apache Arrow and query federation via Trino and Presto.

How does the Databricks and PuppyGraph integration work?

Databricks’ new Managed Iceberg Tables are designed to support the Apache Iceberg REST Catalog API, enabling interoperability with external engines such as Apache Spark, Apache Flink, and Apache Kafka. Unity Catalog governs these tables with fine-grained access controls, auditability, and data lineage tracking.

PuppyGraph’s integration allows users to run graph queries directly on top of Iceberg Tables. There is no need to move data into specialized graph databases, which typically require ETL and impose operational overhead. Instead, users can run real-time, in-place graph traversals across massive Iceberg datasets, scaling to petabytes, without compromising performance or governance.
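The idea of querying tables as a graph in place can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `accounts` and `transfers` lists stand in for Iceberg tables, and `GRAPH_SCHEMA`, `out_neighbors`, and `two_hop` are hypothetical names, not PuppyGraph’s schema format or API. The point is only that a mapping from tables to vertices and edges lets a traversal read the rows where they already live.

```python
from typing import Dict, Iterator, List

# Hypothetical stand-ins for two Iceberg tables. In a real deployment the
# rows would stay in lakehouse storage and be read through a catalog.
accounts: List[Dict] = [
    {"account_id": "a1", "owner": "alice"},
    {"account_id": "a2", "owner": "bob"},
    {"account_id": "a3", "owner": "carol"},
]
transfers: List[Dict] = [
    {"src": "a1", "dst": "a2", "amount": 250},
    {"src": "a2", "dst": "a3", "amount": 90},
]

# Illustrative mapping from tables to graph roles (an assumption, not
# PuppyGraph's actual schema format): which table holds vertices, which
# holds edges, and which columns carry the ids and endpoints.
GRAPH_SCHEMA = {
    "vertex": {"table": accounts, "id": "account_id"},
    "edge": {"table": transfers, "from": "src", "to": "dst"},
}


def out_neighbors(vertex_id: str) -> Iterator[str]:
    """Yield outgoing neighbors by scanning the edge table directly."""
    edge = GRAPH_SCHEMA["edge"]
    for row in edge["table"]:
        if row[edge["from"]] == vertex_id:
            yield row[edge["to"]]


def two_hop(vertex_id: str) -> List[str]:
    """Return the vertices reachable in exactly two outgoing hops."""
    reached = {
        second
        for first in out_neighbors(vertex_id)
        for second in out_neighbors(first)
    }
    return sorted(reached)
```

In this sketch no rows are copied into a separate store; the traversal only interprets existing columns as edge endpoints, which is the property the zero-ETL claim rests on.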

According to Weimo Liu, CEO of PuppyGraph, the integration “changes how graph analytics fits into the modern data stack” by allowing “complex relationship-driven questions without ever leaving the lakehouse.”

What is the significance of real-time graph analytics on Iceberg?

The ability to perform real-time graph analytics directly on managed lakehouse storage represents a critical inflection point for enterprise data platforms. Graph workloads are central to use cases such as fraud detection, cybersecurity, root cause analysis, service dependency mapping, and recommendation engines. However, traditional approaches often involved siloed graph databases or specialized platforms that could not operate at lakehouse scale.

With Databricks’ public preview of Managed Iceberg Tables, data teams can now use Apache Iceberg’s built-in capabilities, such as schema evolution, snapshot isolation, and partition pruning, while running graph queries without duplicating data.

For example, customers can now detect fraud across financial transactions in real time, analyze security telemetry to trace lateral movement within networks, or model service dependencies across distributed cloud environments using PuppyGraph. These insights are derived directly from source tables without requiring a separate graph pipeline or database.
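To make the fraud-detection case concrete, the following Python sketch shows the kind of multi-hop question a graph query answers: which wallets sit within a given number of transfers of a flagged wallet. The `transactions` data and the `build_adjacency` and `within_hops` helpers are hypothetical and written for illustration only; this is a plain breadth-first search, not PuppyGraph’s query language or execution model.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple

# Hypothetical transfer records as (sender, receiver) pairs.
transactions = [
    ("w1", "w2"),
    ("w2", "w3"),
    ("w3", "w4"),
    ("w5", "w6"),
]


def build_adjacency(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from (sender, receiver) pairs."""
    adjacency: Dict[str, Set[str]] = {}
    for src, dst in edges:
        adjacency.setdefault(src, set()).add(dst)
        adjacency.setdefault(dst, set()).add(src)
    return adjacency


def within_hops(
    adjacency: Dict[str, Set[str]], start: str, max_hops: int
) -> Dict[str, int]:
    """Breadth-first search returning each reachable wallet and its hop
    distance from `start`, limited to `max_hops` hops."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    del distance[start]  # report only the other wallets
    return distance


adjacency = build_adjacency(transactions)
suspects = within_hops(adjacency, "w1", 2)
```

With the sample data, `suspects` holds the wallets one and two transfers away from `w1`, while the unconnected pair `w5` and `w6` is never visited.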

What are customers saying about this integration?

Coinbase and CipherOwl, customers of both Databricks and PuppyGraph, are among the early adopters of the integrated solution. Both firms are scheduled to share implementation details at the Data + AI Summit 2025.

Coinbase, which relies on high-frequency data to secure and analyze digital asset transactions, uses graph-based risk modeling to track abnormal wallet behavior and transaction patterns. CipherOwl, which builds observability platforms for enterprise infrastructure, leverages real-time service dependency graphs to monitor root cause scenarios at scale.

Their deployments reflect broader institutional enthusiasm around unifying graph workloads with governed, scalable storage layers. By removing the need for bespoke graph systems and synchronizing governance via Unity Catalog, data platform teams can now maintain security, compliance, and auditability without bottlenecks.

What are the implications for the enterprise data ecosystem?

This move by PuppyGraph and Databricks mirrors a broader shift toward open, federated architectures in the data engineering space. As enterprises invest in lakehouse platforms for scalable analytics, demand is rising for engines that support domain-specific processing, including graph, vector, and streaming workloads, without ETL friction.

The real-time execution model offered by PuppyGraph makes graph analytics accessible in domains where latency or data freshness previously made off-platform solutions unviable. Further, with the ability to support massive, constantly updated Iceberg datasets, it aligns with the expectations of large-scale operations across fintech, e-commerce, cybersecurity, and IoT.

While the solution is currently in public preview, industry observers suggest it could eventually push native graph capabilities into the mainstream data engineering toolkit. Early indicators from summit attendees and preliminary partner feedback suggest strong momentum, particularly among enterprises already standardized on Databricks and Iceberg.

How are analysts and institutions reacting?

Though no official analyst ratings accompanied the launch, early sentiment among institutional data teams has been largely positive. Data architects see the integration as a critical enabler for unifying domain-specific processing without vendor lock-in. By leveraging open standards like Apache Iceberg and the REST Catalog API, PuppyGraph avoids introducing additional proprietary layers.

Furthermore, the operational cost savings that come from eliminating ETL processes and reducing platform sprawl align with cost-optimization goals in an era of cautious IT spending. Enterprise adoption of Iceberg is rising steadily, and by attaching graph capabilities directly to these datasets, PuppyGraph and Databricks are well-positioned to capture this segment.

What’s next for PuppyGraph and Databricks?

Looking ahead, analysts expect a broader general availability (GA) rollout for PuppyGraph’s Iceberg integration in the second half of 2025, potentially coinciding with additional observability, security, and lineage enhancements. Enhanced support for query federation, materialized relationship views, and performance tuning for specific graph traversal patterns may also follow.

On the Databricks side, the public preview of Managed Iceberg Tables marks the company’s most expansive push yet toward open table formats. As the open lakehouse model gains momentum, future integrations may extend to additional graph engines, real-time AI inference layers, and industry-specific accelerators.

For now, PuppyGraph remains the first real-time graph engine with certified interoperability for Managed Iceberg Tables. This exclusivity may offer a temporary advantage in attracting fintech, cybersecurity, and observability-focused enterprises seeking zero-ETL architecture.

PuppyGraph’s announcement reinforces a growing view that the future of analytics lies in eliminating boundaries between data domains and engines without sacrificing governance or scale.

