GCP Vertex AI
Vertex AI is GCP's unified platform for building, training, deploying, and managing machine learning (ML) models. It brings together all the tools needed in the ML lifecycle — data preparation, model training, evaluation, deployment, and monitoring — into one managed service. Data scientists and ML engineers use Vertex AI to go from raw data to a production-serving model without managing infrastructure.
Building an ML model without Vertex AI is like constructing a house by sourcing every material separately — concrete, steel, wiring, plumbing — from different suppliers with no contractor to coordinate. Vertex AI is the general contractor that organizes every phase, provides the tools, and manages the resources.
Machine Learning Lifecycle on Vertex AI
Step 1: Data Preparation
│ Cloud Storage / BigQuery / Vertex AI Datasets
▼
Step 2: Model Training
│ Custom training jobs / AutoML / Pre-built models
▼
Step 3: Model Evaluation
│ Metrics: accuracy, precision, recall, AUC
▼
Step 4: Model Deployment
│ Endpoint (real-time) / Batch prediction
▼
Step 5: Model Monitoring
│ Detect data drift, prediction drift
▼
Step 6: MLOps (Pipeline automation)
│ Vertex AI Pipelines
Vertex AI Core Components
| Component | Purpose |
|---|---|
| Vertex AI Workbench | Managed Jupyter notebooks for data exploration and model development |
| Vertex AI Datasets | Managed datasets for images, text, tabular data, and video |
| AutoML | Train high-quality models without writing ML code |
| Custom Training | Train models using custom code (TensorFlow, PyTorch, scikit-learn) |
| Model Registry | Central repository to store and manage trained model versions |
| Endpoints | Deploy models for real-time online predictions |
| Batch Predictions | Run predictions on large datasets asynchronously |
| Vertex AI Pipelines | Automate and orchestrate the full ML workflow |
| Model Monitoring | Detect when prediction quality degrades over time |
| Vertex AI Model Garden | Access pre-trained foundation models (Gemini, Llama, etc.) |
AutoML – Train Models Without ML Expertise
AutoML trains high-quality models automatically from labeled data. No knowledge of neural network architectures or hyperparameter tuning is required. The user provides data and labels; AutoML handles the rest.
AutoML Supported Data Types
| Data Type | Task | Example |
|---|---|---|
| Tabular (CSV) | Classification, Regression, Forecasting | Predict if a customer will churn (Yes/No) |
| Image | Classification, Object Detection | Classify product images into categories |
| Text | Classification, Sentiment, Extraction | Classify customer reviews as positive/negative |
| Video | Classification, Object Tracking | Detect anomalies in factory surveillance video |
AutoML Tabular Training Workflow
- Upload a CSV dataset to Vertex AI Datasets (or link a BigQuery table)
- Select the target column to predict (example: churned)
- Configure the training budget (node hours)
- Click Train — AutoML evaluates dozens of models and selects the best one
- Review evaluation metrics (accuracy, AUC-ROC, confusion matrix)
- Deploy the model to an Endpoint for predictions
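The metrics reviewed in step 5 can be reproduced locally with scikit-learn. The labels and scores below are toy values invented for illustration, not output from a real AutoML run:

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score

# Toy ground-truth labels and model scores (invented for illustration)
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_score = np.array([0.9, 0.2, 0.8, 0.4, 0.3, 0.6, 0.7, 0.1])
y_pred = (y_score >= 0.5).astype(int)  # threshold scores at 0.5

print("Accuracy:", accuracy_score(y_true, y_pred))   # 0.75
print("AUC-ROC :", roc_auc_score(y_true, y_score))   # 0.9375
print("Confusion matrix:")
print(confusion_matrix(y_true, y_pred))
# [[3 1]
#  [1 3]]
```

Note that AUC-ROC is computed from the raw scores, while accuracy and the confusion matrix depend on the chosen classification threshold.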
Custom Training with Python
For full control, custom training jobs run any Python code on managed infrastructure (CPU, GPU, or TPU).
# trainer/task.py — scikit-learn example
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import joblib
import os
# Load data from Cloud Storage (pandas reads gs:// URIs when gcsfs is installed)
df = pd.read_csv("gs://my-bucket/data/customers.csv")
# Features and target
X = df.drop("churned", axis=1)
y = df["churned"]
# Split and train
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)
# Evaluate
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test Accuracy: {accuracy:.4f}")
# Save model to the Cloud Storage output directory.
# AIP_MODEL_DIR is a gs:// URI; joblib cannot write to it directly, but the
# bucket is also FUSE-mounted at /gcs/ inside Vertex AI training containers,
# so convert the URI to a mounted path first.
output_dir = os.environ.get("AIP_MODEL_DIR", "gs://my-bucket/models/")
local_dir = output_dir.replace("gs://", "/gcs/", 1)
joblib.dump(model, os.path.join(local_dir, "model.joblib"))
print(f"Model saved to {output_dir}")
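Before packaging the script for submission, its training logic can be smoke-tested locally on synthetic data. The schema below mirrors the churn example, but every value is invented:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for gs://my-bucket/data/customers.csv
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "tenure": rng.integers(1, 72, 200),
    "monthly_charges": rng.uniform(20.0, 120.0, 200),
    "contract_type": rng.integers(0, 3, 200),
})
# Fabricated label: short-tenure customers churn
df["churned"] = (df["tenure"] < 12).astype(int)

X = df.drop("churned", axis=1)
y = df["churned"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test Accuracy: {accuracy:.4f}")
```

Because the synthetic label is a simple function of tenure, the model should score very high here; the point is only to verify the code path end to end before paying for a managed training job.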
# Submit a custom training job
gcloud ai custom-jobs create \
  --region=us-central1 \
  --display-name=churn-model-training \
  --python-package-uris=gs://my-bucket/trainer-0.1.tar.gz \
  --python-module=trainer.task \
  --worker-pool-spec=machine-type=n1-standard-4,replica-count=1,executor-image-uri=us-docker.pkg.dev/vertex-ai/training/scikit-learn-cpu.1-0:latest
Deploying a Model for Online Predictions
Model Registry → Upload Model → Create Endpoint → Deploy Model → Get Predictions

# Step 1: Upload the trained model to Model Registry
gcloud ai models upload \
  --region=us-central1 \
  --display-name=churn-model \
  --artifact-uri=gs://my-bucket/models/ \
  --container-image-uri=us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-3:latest

# Step 2: Create an endpoint
gcloud ai endpoints create \
  --region=us-central1 \
  --display-name=churn-endpoint

# Step 3: Deploy the model to the endpoint
gcloud ai endpoints deploy-model ENDPOINT_ID \
  --region=us-central1 \
  --model=MODEL_ID \
  --display-name=churn-model-v1 \
  --machine-type=n1-standard-2 \
  --min-replica-count=1 \
  --max-replica-count=3
Making an Online Prediction
# Python — call the deployed endpoint
from google.cloud import aiplatform
aiplatform.init(project="my-project", location="us-central1")
endpoint = aiplatform.Endpoint("projects/my-project/locations/us-central1/endpoints/ENDPOINT_ID")
# Send a prediction request
instances = [
{"tenure": 24, "monthly_charges": 75.5, "contract_type": 1, "tech_support": 0}
]
prediction = endpoint.predict(instances=instances)
print(prediction.predictions)
# Example output: [1.0] → predicted churn
# (the prebuilt sklearn serving container returns model.predict() class
#  labels; probabilities require a custom container or predict_proba wrapper)
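The same request can be made over REST without the Python SDK. The request body is just the instances wrapped in a JSON object; the sketch below only builds that payload (ENDPOINT_ID stays a placeholder, as above):

```python
import json

# Body for POST https://us-central1-aiplatform.googleapis.com/v1/
#   projects/my-project/locations/us-central1/endpoints/ENDPOINT_ID:predict
# Vertex AI's :predict method expects {"instances": [...]}.
body = {
    "instances": [
        {"tenure": 24, "monthly_charges": 75.5,
         "contract_type": 1, "tech_support": 0}
    ]
}
payload = json.dumps(body, indent=2)
print(payload)
```

The request must carry an OAuth 2.0 bearer token, e.g. from `gcloud auth print-access-token`.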
Vertex AI Model Garden
Model Garden provides access to hundreds of pre-trained and foundation models — including Google's Gemini models, open-source models like Llama and Mistral, and specialized models for vision, speech, and translation. These models can be used directly via API or fine-tuned on custom data.
# Call Gemini 1.5 Pro via Vertex AI
from vertexai.generative_models import GenerativeModel
model = GenerativeModel("gemini-1.5-pro")
response = model.generate_content("Explain Cloud Spanner in simple terms.")
print(response.text)
Vertex AI Pipelines
A pipeline automates the full ML workflow as a series of connected steps. If the data changes or the model needs retraining, one pipeline run updates everything automatically.
Vertex AI Pipeline: churn-model-pipeline
│
▼
Step 1: Data Validation (check for missing values, schema)
│
▼
Step 2: Feature Engineering (normalize, encode categories)
│
▼
Step 3: Model Training (RandomForest with hyperparameter tuning)
│
▼
Step 4: Model Evaluation (if accuracy > 0.85, proceed)
│
▼
Step 5: Model Registration (save to Model Registry)
│
▼
Step 6: Model Deployment (update production endpoint)
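The accuracy gate in step 4 is the pipeline's key control point: a weaker model is never promoted. The plain-Python sketch below shows only that control flow; real Vertex AI Pipelines are written with the Kubeflow Pipelines (KFP) SDK, and the component functions here are trivial hypothetical stand-ins:

```python
# Plain-Python sketch of the pipeline's control flow.
# All component functions are hypothetical stubs, not KFP components.

def validate_data(rows):
    # Step 1: drop rows with missing values
    return [r for r in rows if None not in r.values()]

def engineer_features(rows):
    # Step 2: encode the contract type as an integer
    mapping = {"monthly": 0, "one_year": 1, "two_year": 2}
    return [{**r, "contract_type": mapping[r["contract_type"]]} for r in rows]

def train_model(rows):
    # Step 3: stand-in model and a fake held-out accuracy
    return {"kind": "stub-model"}, 0.91

def run_churn_pipeline(rows, accuracy_threshold=0.85):
    validated = validate_data(rows)
    features = engineer_features(validated)
    model, accuracy = train_model(features)
    if accuracy <= accuracy_threshold:   # Step 4: evaluation gate
        return None                      # do not promote a weaker model
    # Steps 5-6 (register in Model Registry, update endpoint) would run here
    return model

rows = [
    {"tenure": 24, "contract_type": "monthly"},
    {"tenure": None, "contract_type": "one_year"},  # dropped by validation
]
model = run_churn_pipeline(rows)
print(model)
```

With the stub accuracy of 0.91 the gate passes; raising `accuracy_threshold` above 0.91 would make the run stop before registration and deployment.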
Key Takeaways
- Vertex AI unifies the entire ML lifecycle — data, training, deployment, and monitoring — on one platform.
- AutoML trains models from labeled data without requiring ML coding expertise.
- Custom Training supports any Python ML framework (TensorFlow, PyTorch, scikit-learn) on managed compute.
- Deployed models are served via Endpoints for real-time online predictions.
- Model Garden provides access to Gemini and hundreds of other pre-trained models.
- Vertex AI Pipelines automate the full retraining and redeployment workflow.
