Google Pub/Sub to BigQuery the Easy Approach | by Jim Barlow | Sep, 2023


A hands-on guide to implementing BigQuery Subscriptions in Pub/Sub for simple message and streaming ingestion

Google’s latest planet-scale data warehouse subscription-based streaming ingestion water-borne naval functionality: BigSub. In this case, the Pub never made it to General Availability, so you’ll have to get your pints elsewhere. Photo by Thomas Haas on Unsplash

I’ve encountered many situations in the past where I needed to get Pub/Sub messages into a BigQuery table, but I never managed to find a particularly simple way of doing this.

You can set up a Dataflow pipeline, but this requires additional infrastructure to understand, configure, manage and debug. Plus Dataflow (which is a managed Apache Beam service) is designed for high-throughput streaming, so it always seemed like overkill for a simple message logging or monitoring system.

And it’s Java. But Python 😀! And Java… 😫!

public static string args void main... public static string args void main... public static string args void main... public static string args void main... public static string args void main... arrrrrrrrrrrrgh

Sorry, I still get flashbacks from my first attempts to learn to code (last century) in Java. Please don’t attempt to use that code snippet … step away from the code snippet.

I then stumbled upon this, which, although promising simplicity, seems to be even more complicated than the previous method (Debezium wtf?)!

It’s also possible to deploy a lightweight Cloud Function to trigger on receipt of a Pub/Sub message and stream or load this into BigQuery, but this still seemed a bit too complex for something which felt like it should and could have been native functionality.
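To give a feel for what that Cloud Function approach involves, here is a minimal sketch of the message-handling logic, not a complete deployment. It assumes the message arrives as a dictionary with a base64-encoded `data` field and optional `attributes`. The table name `TABLE_ID`, the two-column row layout and the injected `insert_rows` callable are all hypothetical choices made for illustration.

```python
import base64
import json
from typing import Any, Callable, Mapping, Sequence

# Hypothetical destination table, used only for illustration.
TABLE_ID = "my-project.my_dataset.pubsub_messages"


def message_to_row(event: Mapping[str, Any]) -> dict:
    """Turn a Pub/Sub event payload into a flat row for BigQuery."""
    # The message body is assumed to arrive base64-encoded in the "data" field.
    payload = base64.b64decode(event.get("data", "")).decode("utf-8")
    return {
        "data": payload,
        # Attributes are optional key/value pairs; store them as JSON text.
        "attributes": json.dumps(dict(event.get("attributes") or {}), sort_keys=True),
    }


def handle_message(
    event: Mapping[str, Any],
    insert_rows: Callable[[str, Sequence[dict]], Sequence[Any]],
) -> dict:
    """Decode one message and hand it to an injected insert function."""
    row = message_to_row(event)
    # The insert function is expected to return a list of errors (empty on success).
    errors = insert_rows(TABLE_ID, [row])
    if errors:
        # Raising surfaces the failure; whether the message is retried
        # depends on how the function is configured.
        raise RuntimeError(f"BigQuery insert failed: {errors}")
    return row
```

In a real function, `insert_rows` could be the `insert_rows_json` method of a `google.cloud.bigquery.Client`, which takes a table and a list of JSON-compatible rows and returns a list of errors. Even in this stripped-down form there is decoding, error handling and deployment to think about, which is the complexity the paragraph above is complaining about.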

And now it is!

The kind folks at Google Cloud announced a direct connection from Pub/Sub to BigQuery a while ago, awesome! However, having tried (and failed) to quickly set up a test a couple of times, I finally had a real-life use-case which required me to get it working for a client.

It turns out that there are a few nuances, so this article aims to help you get this up and running as quickly as possible.
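As a rough preview of what the direct connection looks like in code, here is a hedged sketch of creating a BigQuery subscription with the `google-cloud-pubsub` Python client. The helper names and every ID in it are hypothetical. It also assumes the topic and the destination table already exist and that the table schema is compatible with the chosen settings.

```python
def build_bigquery_subscription_request(
    project_id: str,
    topic_id: str,
    subscription_id: str,
    table_id: str,
    write_metadata: bool = True,
) -> dict:
    """Build the request for a subscription that writes straight to BigQuery."""
    return {
        "name": f"projects/{project_id}/subscriptions/{subscription_id}",
        "topic": f"projects/{project_id}/topics/{topic_id}",
        # table_id is assumed to be of the form "project.dataset.table".
        "bigquery_config": {"table": table_id, "write_metadata": write_metadata},
    }


def create_bigquery_subscription(request: dict):
    """Send the request using the Pub/Sub client library."""
    # Imported lazily so the request builder works without the library installed.
    from google.cloud import pubsub_v1

    with pubsub_v1.SubscriberClient() as subscriber:
        return subscriber.create_subscription(request=request)
```

A call might look like `create_bigquery_subscription(build_bigquery_subscription_request("my-project", "my-topic", "my-bq-sub", "my-project.my_dataset.my_table"))`, with all four values swapped for your own.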

Pub/Sub is an incredibly useful, powerful and scalable service in the Google Cloud ecosystem, with two core use-cases: streaming…
