GCP Vertex AI
Vertex AI is GCP's unified platform for building, training, deploying, and managing machine learning (ML) models. It brings together all the tools needed in the ML lifecycle — data preparation, model training, evaluation, deployment, and monitoring — into one managed service. Data scientists and ML engineers use Vertex AI to go from raw data to a production-serving model without managing infrastructure.
Building an ML model without Vertex AI is like constructing a house by sourcing every material separately — concrete, steel, wiring, plumbing — from different suppliers with no contractor to coordinate. Vertex AI is the general contractor that organizes every phase, provides the tools, and manages the resources.
Machine Learning Lifecycle on Vertex AI
Step 1: Data Preparation
│ Cloud Storage / BigQuery / Vertex AI Datasets
▼
Step 2: Model Training
│ Custom training jobs / AutoML / Pre-built models
▼
Step 3: Model Evaluation
│ Metrics: accuracy, precision, recall, AUC
▼
Step 4: Model Deployment
│ Endpoint (real-time) / Batch prediction
▼
Step 5: Model Monitoring
│ Detect data drift, prediction drift
▼
Step 6: MLOps (Pipeline automation)
│ Vertex AI Pipelines
Vertex AI Core Components
| Component | Purpose |
|---|---|
| Vertex AI Workbench | Managed Jupyter notebooks for data exploration and model development |
| Vertex AI Datasets | Managed datasets for images, text, tabular data, and video |
| AutoML | Train high-quality models without writing ML code |
| Custom Training | Train models using custom code (TensorFlow, PyTorch, scikit-learn) |
| Model Registry | Central repository to store and manage trained model versions |
| Endpoints | Deploy models for real-time online predictions |
| Batch Predictions | Run predictions on large datasets asynchronously |
| Vertex AI Pipelines | Automate and orchestrate the full ML workflow |
| Model Monitoring | Detect when prediction quality degrades over time |
| Vertex AI Model Garden | Access pre-trained foundation models (Gemini, Llama, etc.) |
AutoML – Train Models Without ML Expertise
AutoML trains high-quality models automatically from labeled data. No knowledge of neural network architectures or hyperparameter tuning is required. The user provides data and labels; AutoML handles the rest.
AutoML Supported Data Types
| Data Type | Task | Example |
|---|---|---|
| Tabular (CSV) | Classification, Regression, Forecasting | Predict if a customer will churn (Yes/No) |
| Image | Classification, Object Detection | Classify product images into categories |
| Text | Classification, Sentiment, Extraction | Classify customer reviews as positive/negative |
| Video | Classification, Object Tracking | Detect anomalies in factory surveillance video |
AutoML Tabular Training Workflow
- Upload a CSV dataset to Vertex AI Datasets (or link a BigQuery table)
- Select the target column to predict (example: churned)
- Configure the training budget (node hours)
- Click Train — AutoML evaluates dozens of models and selects the best one
- Review evaluation metrics (accuracy, AUC-ROC, confusion matrix)
- Deploy the model to an Endpoint for predictions
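The metrics reviewed in step 5 can be reproduced locally with scikit-learn. The labels and scores below are toy values invented for illustration, not output from a real AutoML run:

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score

# Toy ground-truth labels and model scores (invented for illustration)
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_score = np.array([0.9, 0.2, 0.8, 0.4, 0.3, 0.6, 0.7, 0.1])
y_pred = (y_score >= 0.5).astype(int)  # threshold scores at 0.5

print("Accuracy:", accuracy_score(y_true, y_pred))   # 0.75
print("AUC-ROC :", roc_auc_score(y_true, y_score))   # 0.9375
print("Confusion matrix:")
print(confusion_matrix(y_true, y_pred))
# [[3 1]
#  [1 3]]
```

Note that AUC-ROC is computed from the raw scores, while accuracy and the confusion matrix depend on the chosen classification threshold.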
Custom Training with Python
For full control, custom training jobs run any Python code on managed infrastructure (CPU, GPU, or TPU).
# trainer/task.py — scikit-learn example
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import joblib
import os
# Load data from Cloud Storage (pandas reads gs:// URIs when gcsfs is installed)
df = pd.read_csv("gs://my-bucket/data/customers.csv")
# Features and target
X = df.drop("churned", axis=1)
y = df["churned"]
# Split and train
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)
# Evaluate
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test Accuracy: {accuracy:.4f}")
# Save model to the Cloud Storage output directory.
# AIP_MODEL_DIR is a gs:// URI; joblib cannot write to it directly, but the
# bucket is also FUSE-mounted at /gcs/ inside Vertex AI training containers,
# so convert the URI to a mounted path first.
output_dir = os.environ.get("AIP_MODEL_DIR", "gs://my-bucket/models/")
local_dir = output_dir.replace("gs://", "/gcs/", 1)
joblib.dump(model, os.path.join(local_dir, "model.joblib"))
print(f"Model saved to {output_dir}")
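Before packaging the script for submission, its training logic can be smoke-tested locally on synthetic data. The schema below mirrors the churn example, but every value is invented:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for gs://my-bucket/data/customers.csv
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "tenure": rng.integers(1, 72, 200),
    "monthly_charges": rng.uniform(20.0, 120.0, 200),
    "contract_type": rng.integers(0, 3, 200),
})
# Fabricated label: short-tenure customers churn
df["churned"] = (df["tenure"] < 12).astype(int)

X = df.drop("churned", axis=1)
y = df["churned"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test Accuracy: {accuracy:.4f}")
```

Because the synthetic label is a simple function of tenure, the model should score very high here; the point is only to verify the code path end to end before paying for a managed training job.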
# Submit a custom training job
gcloud ai custom-jobs create \
  --region=us-central1 \
  --display-name=churn-model-training \
  --python-package-uris=gs://my-bucket/trainer-0.1.tar.gz \
  --python-module=trainer.task \
  --worker-pool-spec=machine-type=n1-standard-4,replica-count=1,executor-image-uri=us-docker.pkg.dev/vertex-ai/training/scikit-learn-cpu.1-0:latest
Deploying a Model for Online Predictions
Model Registry → Upload Model → Create Endpoint → Deploy Model → Get Predictions

# Step 1: Upload the trained model to Model Registry
gcloud ai models upload \
  --region=us-central1 \
  --display-name=churn-model \
  --artifact-uri=gs://my-bucket/models/ \
  --container-image-uri=us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-3:latest

# Step 2: Create an endpoint
gcloud ai endpoints create \
  --region=us-central1 \
  --display-name=churn-endpoint

# Step 3: Deploy the model to the endpoint
gcloud ai endpoints deploy-model ENDPOINT_ID \
  --region=us-central1 \
  --model=MODEL_ID \
  --display-name=churn-model-v1 \
  --machine-type=n1-standard-2 \
  --min-replica-count=1 \
  --max-replica-count=3
Making an Online Prediction
# Python — call the deployed endpoint
from google.cloud import aiplatform
aiplatform.init(project="my-project", location="us-central1")
endpoint = aiplatform.Endpoint("projects/my-project/locations/us-central1/endpoints/ENDPOINT_ID")
# Send a prediction request
instances = [
{"tenure": 24, "monthly_charges": 75.5, "contract_type": 1, "tech_support": 0}
]
prediction = endpoint.predict(instances=instances)
print(prediction.predictions)
# Example output: [1.0] → predicted churn
# (the prebuilt sklearn serving container returns model.predict() class
#  labels; probabilities require a custom container or predict_proba wrapper)
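The same request can be made over REST without the Python SDK. The request body is just the instances wrapped in a JSON object; the sketch below only builds that payload (ENDPOINT_ID stays a placeholder, as above):

```python
import json

# Body for POST https://us-central1-aiplatform.googleapis.com/v1/
#   projects/my-project/locations/us-central1/endpoints/ENDPOINT_ID:predict
# Vertex AI's :predict method expects {"instances": [...]}.
body = {
    "instances": [
        {"tenure": 24, "monthly_charges": 75.5,
         "contract_type": 1, "tech_support": 0}
    ]
}
payload = json.dumps(body, indent=2)
print(payload)
```

The request must carry an OAuth 2.0 bearer token, e.g. from `gcloud auth print-access-token`.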
Vertex AI Model Garden
Model Garden provides access to hundreds of pre-trained and foundation models — including Google's Gemini models, open-source models like Llama and Mistral, and specialized models for vision, speech, and translation. These models can be used directly via API or fine-tuned on custom data.
# Call Gemini 1.5 Pro via Vertex AI
from vertexai.generative_models import GenerativeModel
model = GenerativeModel("gemini-1.5-pro")
response = model.generate_content("Explain Cloud Spanner in simple terms.")
print(response.text)
Vertex AI Pipelines
A pipeline automates the full ML workflow as a series of connected steps. If the data changes or the model needs retraining, one pipeline run updates everything automatically.
Vertex AI Pipeline: churn-model-pipeline
│
▼
Step 1: Data Validation (check for missing values, schema)
│
▼
Step 2: Feature Engineering (normalize, encode categories)
│
▼
Step 3: Model Training (RandomForest with hyperparameter tuning)
│
▼
Step 4: Model Evaluation (if accuracy > 0.85, proceed)
│
▼
Step 5: Model Registration (save to Model Registry)
│
▼
Step 6: Model Deployment (update production endpoint)
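The accuracy gate in step 4 is the pipeline's key control point: a weaker model is never promoted. The plain-Python sketch below shows only that control flow; real Vertex AI Pipelines are written with the Kubeflow Pipelines (KFP) SDK, and the component functions here are trivial hypothetical stand-ins:

```python
# Plain-Python sketch of the pipeline's control flow.
# All component functions are hypothetical stubs, not KFP components.

def validate_data(rows):
    # Step 1: drop rows with missing values
    return [r for r in rows if None not in r.values()]

def engineer_features(rows):
    # Step 2: encode the contract type as an integer
    mapping = {"monthly": 0, "one_year": 1, "two_year": 2}
    return [{**r, "contract_type": mapping[r["contract_type"]]} for r in rows]

def train_model(rows):
    # Step 3: stand-in model and a fake held-out accuracy
    return {"kind": "stub-model"}, 0.91

def run_churn_pipeline(rows, accuracy_threshold=0.85):
    validated = validate_data(rows)
    features = engineer_features(validated)
    model, accuracy = train_model(features)
    if accuracy <= accuracy_threshold:   # Step 4: evaluation gate
        return None                      # do not promote a weaker model
    # Steps 5-6 (register in Model Registry, update endpoint) would run here
    return model

rows = [
    {"tenure": 24, "contract_type": "monthly"},
    {"tenure": None, "contract_type": "one_year"},  # dropped by validation
]
model = run_churn_pipeline(rows)
print(model)
```

With the stub accuracy of 0.91 the gate passes; raising `accuracy_threshold` above 0.91 would make the run stop before registration and deployment.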
Key Takeaways
- Vertex AI unifies the entire ML lifecycle — data, training, deployment, and monitoring — on one platform.
- AutoML trains models from labeled data without requiring ML coding expertise.
- Custom Training supports any Python ML framework (TensorFlow, PyTorch, scikit-learn) on managed compute.
- Deployed models are served via Endpoints for real-time online predictions.
- Model Garden provides access to Gemini and hundreds of other pre-trained models.
- Vertex AI Pipelines automate the full retraining and redeployment workflow.
