Google Prediction Framework addresses data pipeline drudgery

Prediction Framework weaves together Google Cloud Platform services to simplify the implementation of prediction projects.

Google’s Prediction Framework stitches together Google Cloud Platform services, from Cloud Functions to Pub/Sub to Vertex AutoML to BigQuery, to help users implement data science prediction projects and save time doing so.

Detailed in a December 29 blog post, Prediction Framework was designed to provide the basic scaffolding for prediction solutions while allowing for customization. Built to be hosted on Google Cloud Platform, the framework attempts to generalize all the steps involved in a prediction project: data extraction, data preparation, filtering, prediction, and post-processing. The idea is that, with only a few modifications, the framework can be adapted to any similar use case with a high level of reliability.
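
To make those five stages concrete, here is a minimal sketch in plain Python of how such a pipeline might be chained. It is not taken from the framework's code: the function names, the record fields, and the toy scoring rule are all hypothetical, and everything runs locally rather than on Google Cloud services.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Stage = Callable[[List[Record]], List[Record]]


def extract(_: List[Record]) -> List[Record]:
    # Hypothetical stand-in for pulling first-party data from a source.
    return [
        {"customer_id": "a", "visits": 12, "spend": 240.0},
        {"customer_id": "b", "visits": 0, "spend": 0.0},
        {"customer_id": "c", "visits": 3, "spend": None},
    ]


def prepare(rows: List[Record]) -> List[Record]:
    # Data preparation: replace a missing spend value with 0.0.
    return [
        {**r, "spend": 0.0 if r["spend"] is None else r["spend"]} for r in rows
    ]


def filter_rows(rows: List[Record]) -> List[Record]:
    # Filtering: skip customers with no visits so they are never scored.
    return [r for r in rows if r["visits"] > 0]


def predict(rows: List[Record]) -> List[Record]:
    # Toy rule standing in for a call to a hosted model (an assumption,
    # not the framework's actual prediction step).
    return [{**r, "score": min(1.0, r["visits"] / 10)} for r in rows]


def post_process(rows: List[Record]) -> List[Record]:
    # Post-processing: keep only the fields a downstream platform would need.
    return [
        {"customer_id": r["customer_id"], "score": round(r["score"], 2)}
        for r in rows
    ]


def run_pipeline(stages: List[Stage]) -> List[Record]:
    # Feed the output of each stage into the next one, in order.
    rows: List[Record] = []
    for stage in stages:
        rows = stage(rows)
    return rows


STAGES: List[Stage] = [extract, prepare, filter_rows, predict, post_process]
result = run_pipeline(STAGES)
```

Because each stage shares the same signature, adapting the sketch to another use case means swapping individual functions rather than rewriting the pipeline, which is the kind of customization the framework is described as aiming for.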

Code for the framework can be found on GitHub. Prediction Framework uses Google Cloud Functions for data processing, Vertex AutoML for hosting the model, and BigQuery for the final storage of predictions. Google Cloud Firestore, Pub/Sub, and Schedulers are also used in the pipeline. Users must provide a configuration file with environment variables describing the cloud project, the data sources, the ML model, and the scheduler for the throttling system.
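
As an illustration of what environment-variable configuration of this kind can look like, the sketch below validates four settings that mirror the four areas mentioned above. The variable names (PF_PROJECT_ID and so on), the field names, and the example values are hypothetical; the framework's actual configuration keys are defined in its GitHub repository and may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class PipelineConfig:
    project_id: str     # the cloud project
    source_table: str   # the data source
    model_name: str     # the ML model
    schedule: str       # the scheduler setting


# Hypothetical variable names, chosen only for this example.
REQUIRED_VARS = {
    "project_id": "PF_PROJECT_ID",
    "source_table": "PF_SOURCE_TABLE",
    "model_name": "PF_MODEL_NAME",
    "schedule": "PF_SCHEDULE",
}


def load_config(env: Optional[Mapping[str, str]] = None) -> PipelineConfig:
    # Default to the process environment; accept a mapping to ease testing.
    env = os.environ if env is None else env
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        # Fail early and name every unset variable at once.
        raise ValueError("Missing settings: " + ", ".join(sorted(missing)))
    return PipelineConfig(
        **{field: env[var] for field, var in REQUIRED_VARS.items()}
    )


# Example with an explicit mapping instead of the real environment.
example = load_config({
    "PF_PROJECT_ID": "demo-project",
    "PF_SOURCE_TABLE": "analytics.sessions",
    "PF_MODEL_NAME": "demo-model",
    "PF_SCHEDULE": "0 6 * * *",
})
```

Checking all required values up front means a misconfigured deployment fails at startup with a clear message rather than partway through a scheduled run.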

In explaining the framework’s usefulness, Google noted that many marketing scenarios require analyzing first-party data, running predictions on that data, and using the results in marketing platforms such as Google Ads. Feeding these platforms on a regular basis requires a report-oriented, low-cost ETL and prediction pipeline. Prediction Framework helps with implementing such projects by providing the backbone elements of the predictive process.

Copyright © 2022 IDG Communications, Inc.