Google Pub/Sub to BigQuery the Easy Approach | by Jim Barlow | Sep, 2023
A hands-on guide to implementing BigQuery Subscriptions in Pub/Sub for simple message and streaming ingestion
I’ve encountered many situations in the past where I needed to get Pub/Sub messages into a BigQuery table, but I never managed to find a particularly simple way of doing this.
You can set up a Dataflow pipeline, but this requires additional infrastructure to understand, configure, manage and debug. Plus Dataflow (which is a managed Apache Beam service) is designed for high-throughput streaming, so it always seemed like overkill for a simple message logging or monitoring system.
And it’s Java. But Python 😀! And Java… 😫!
public static string args void main... public static string args void main... public static string args void main... public static string args void main... public static string args void main... arrrrrrrrrrrrgh
Sorry, I still get flashbacks from my first attempts to learn to code (last century) in Java. Please don’t attempt to use that code snippet … step away from the code snippet.
I then stumbled upon this, which, although promising simplicity, seems to be even more complicated than the previous method (Debezium wtf?)!
It’s also possible to deploy a lightweight Cloud Function to trigger on receipt of a Pub/Sub message and stream or load this into BigQuery, but this still seemed a bit too complex for something which felt like it should and could have been native functionality.
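For context, here is a minimal sketch of what that Cloud Function approach might look like in Python, assuming a 1st gen background function with a Pub/Sub trigger. The table ID, the column names (`data`, `attributes`, `received_at`) and the function names are all hypothetical, and the destination table is assumed to exist already with matching columns.

```python
import base64
import json
from datetime import datetime, timezone

# Hypothetical destination table, in "project.dataset.table" form.
TABLE_ID = "my-project.my_dataset.pubsub_messages"


def event_to_row(event):
    """Convert a Pub/Sub trigger event into a single BigQuery row (a dict)."""
    # The message payload arrives base64-encoded in event["data"].
    data = base64.b64decode(event.get("data", "")).decode("utf-8")
    return {
        "data": data,
        # Store message attributes as a JSON string (assumed STRING column).
        "attributes": json.dumps(event.get("attributes") or {}),
        "received_at": datetime.now(timezone.utc).isoformat(),
    }


def pubsub_to_bigquery(event, context, client=None, table_id=TABLE_ID):
    """Entry point: stream one Pub/Sub message into BigQuery."""
    if client is None:
        # Imported lazily so the row-building logic can be tested locally.
        from google.cloud import bigquery

        client = bigquery.Client()
    # insert_rows_json returns a list of per-row errors (empty on success).
    errors = client.insert_rows_json(table_id, [event_to_row(event)])
    if errors:
        raise RuntimeError(f"BigQuery insert failed: {errors}")
```

Even in this stripped-down form you still have a function to deploy, permissions to grant and errors to handle yourself, which is the overhead I was hoping to avoid.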
And now it is!
The kind folks at Google Cloud announced a direct connection from Pub/Sub to BigQuery some time ago, awesome! However, having tried (and failed) to quickly set up a test a couple of times, I finally had a real-life use-case which required me to get it working for a client.
It turns out that there are a few nuances, so this article aims to help you get this up and running as quickly as possible.
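To give a flavour of the end result, here is a hedged sketch of creating a BigQuery subscription with the `google-cloud-pubsub` Python client. All project, topic, subscription and table names are placeholders, and the helper function names are my own. It assumes the topic and table already exist, that the table schema is compatible (for example a `data` column, plus the metadata columns if `write_metadata` is enabled), and that the Pub/Sub service account is allowed to write to the table.

```python
def build_bigquery_subscription_request(
    project_id, topic_id, subscription_id, table_id, write_metadata=True
):
    """Build the request for a subscription that writes straight to BigQuery."""
    return {
        "name": f"projects/{project_id}/subscriptions/{subscription_id}",
        "topic": f"projects/{project_id}/topics/{topic_id}",
        "bigquery_config": {
            # Table is given in "project.dataset.table" form.
            "table": table_id,
            # Also write message_id, publish_time, attributes and
            # subscription_name to extra columns in the table.
            "write_metadata": write_metadata,
        },
    }


def create_bigquery_subscription(request):
    """Create the subscription (needs google-cloud-pubsub and credentials)."""
    # Imported lazily so the request builder works without the library.
    from google.cloud import pubsub_v1

    subscriber = pubsub_v1.SubscriberClient()
    with subscriber:
        return subscriber.create_subscription(request=request)


if __name__ == "__main__":
    # Placeholder values only: replace with your own resources.
    request = build_bigquery_subscription_request(
        project_id="my-project",
        topic_id="my-topic",
        subscription_id="my-bigquery-subscription",
        table_id="my-project.my_dataset.pubsub_messages",
    )
    print(request)
    # create_bigquery_subscription(request)  # uncomment to create it for real
```

No pipeline and no function code: once the subscription exists, Pub/Sub writes each message to the table itself.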
Pub/Sub is an incredibly useful, powerful and scalable service in the Google Cloud ecosystem, with two core use-cases: streaming…