Databricks Workspace
The Databricks Workspace is the place where all your work lives. It is the first thing you see after logging in, and understanding it well saves you time every day. This topic walks through every major section of the workspace, explains what each part does, and shows you how to set up your personal environment so you can start working immediately.
Logging Into Databricks
Your organization's Databricks workspace has a unique URL, typically in the format https://[your-org].azuredatabricks.net (for Azure), https://[your-org].cloud.databricks.com (for AWS), or https://[your-org].gcp.databricks.com (for GCP). You log in using your email and password, or through your company's single sign-on (SSO) system such as Microsoft Entra ID (formerly Azure AD), Okta, or Google Identity.
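If you work with several workspaces, it can help to tell from the URL alone which cloud a workspace runs on. The following is a minimal sketch based only on the host suffixes listed above; the function name detect_cloud is made up for this example, and real workspace URLs may carry extra identifiers in front of the suffix.

```python
from urllib.parse import urlparse

# Host suffixes taken from the URL formats listed in this topic.
CLOUD_BY_SUFFIX = {
    ".azuredatabricks.net": "Azure",
    ".gcp.databricks.com": "GCP",
    ".cloud.databricks.com": "AWS",
}


def detect_cloud(workspace_url: str) -> str:
    """Return the cloud implied by a workspace URL, or 'unknown'."""
    host = (urlparse(workspace_url).hostname or "").lower()
    for suffix, cloud in CLOUD_BY_SUFFIX.items():
        if host.endswith(suffix):
            return cloud
    return "unknown"


print(detect_cloud("https://your-org.azuredatabricks.net"))  # Azure
```

A URL without the https:// scheme has no parsable host, so the sketch reports it as unknown rather than guessing.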
Once logged in, you land on the Databricks Home page. The left sidebar is your primary navigation tool.
The Left Sidebar – Your Navigation Menu
DATABRICKS LEFT SIDEBAR
─────────────────────────────
🏠 Home             → Your personal folder and recent items
🔍 Search           → Find notebooks, tables, clusters quickly
📓 Workspace        → All notebooks and folders
📂 Catalog          → Browse databases, tables, and volumes
⚡ Compute          → Manage clusters and SQL warehouses
⚙️ Workflows        → Create and monitor scheduled jobs
💬 SQL Editor       → Write and run SQL queries
📊 Dashboards       → View and build visual reports
🤖 Machine Learning → Experiments, models, feature store
🔔 Alerts           → Set up notifications for metrics
Each section of the sidebar opens a different part of the workspace. You can think of the sidebar as a remote control for the entire Databricks platform.
The Home Section
The Home section is your personal landing area. It shows recently visited notebooks, recently modified files, and quick links to your most-used resources. It also displays your personal folder — a private space where you store notebooks and files that only you can see by default.
The Home section also shows Recent Items, which lists the last notebooks you opened, clusters you used, and jobs you ran. This makes it easy to pick up where you left off after a break.
The Workspace Section – Organizing Your Notebooks
The Workspace section is a file system for all your notebooks, libraries, and folders. It works like Google Drive or a shared network folder, but instead of documents, it holds Databricks notebooks and code files.
WORKSPACE FILE STRUCTURE EXAMPLE
──────────────────────────────────
📁 Workspace
├── 📁 Users
│   ├── 📁 priya@company.com          ← Personal folder
│   │   ├── 📓 sales_analysis
│   │   └── 📓 customer_segmentation
│   └── 📁 rahul@company.com
│       └── 📓 inventory_pipeline
└── 📁 Shared                         ← Team-shared folder
    ├── 📁 Data Engineering
    │   └── 📓 etl_master_notebook
    └── 📁 Analytics
        └── 📓 monthly_report
Files inside a user's personal folder are private by default. Files inside the Shared folder are visible to all workspace members. Administrators can create additional shared folders with custom access permissions.
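The folder layout above follows a simple path convention: personal folders sit under /Users/<email> and team content sits under /Shared. The sketch below only manipulates path strings to illustrate that convention. The helper names personal_folder and is_shared are invented for this example and do not call Databricks or check real permissions.

```python
from pathlib import PurePosixPath


def personal_folder(email: str) -> str:
    """Build the workspace path of a user's personal folder."""
    return str(PurePosixPath("/Users") / email)


def is_shared(path: str) -> bool:
    """True if the path lives under the top-level Shared folder."""
    parts = PurePosixPath(path).parts
    return parts[:2] == ("/", "Shared")


print(personal_folder("priya@company.com"))
print(is_shared("/Shared/Analytics/monthly_report"))
print(is_shared("/Users/rahul@company.com/inventory_pipeline"))
```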
Creating a Notebook
To create a new notebook, right-click any folder in the Workspace browser and select Create > Notebook. A dialog box asks you for the notebook name, the default programming language (Python, SQL, Scala, or R), and the cluster to attach it to. You can always change these settings later.
Importing and Exporting Notebooks
Databricks notebooks can be exported as .ipynb (Jupyter format), .py (Python scripts), .sql (SQL files), or .dbc (Databricks archive format). To import a notebook from your local computer, right-click a folder and choose Import. This is useful when you want to bring a Jupyter notebook from your laptop into Databricks.
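When you script imports through the Workspace REST API rather than the right-click menu, each upload has to name a format. The sketch below maps the file extensions mentioned above to the format names that API is understood to use (SOURCE, JUPYTER, DBC). Treat the mapping and the helper name import_format as assumptions to verify against your workspace's API documentation.

```python
from pathlib import PurePosixPath

# Assumed mapping from file extension to Workspace API format name.
FORMAT_BY_EXTENSION = {
    ".ipynb": "JUPYTER",  # Jupyter notebook
    ".dbc": "DBC",        # Databricks archive
    ".py": "SOURCE",      # plain source file
    ".sql": "SOURCE",     # plain source file
}


def import_format(filename: str) -> str:
    """Return the import format for a notebook file, based on its extension."""
    ext = PurePosixPath(filename).suffix.lower()
    if ext not in FORMAT_BY_EXTENSION:
        raise ValueError(f"Unsupported notebook file type: {filename!r}")
    return FORMAT_BY_EXTENSION[ext]


print(import_format("sales_analysis.ipynb"))
```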
The Catalog Section – Browsing Your Data
The Catalog section (Catalog Explorer, formerly called Data Explorer) is where you browse all your databases, tables, views, and volumes. Think of it as a library catalog: it shows you every dataset that exists in your workspace and how it is organized.
CATALOG BROWSER STRUCTURE
──────────────────────────────
📦 Catalog: main
└── 📁 Schema: retail_data
    ├── 🗂 Table: customers
    │   ├── customer_id (INT)
    │   ├── name (STRING)
    │   ├── city (STRING)
    │   └── signup_date (DATE)
    ├── 🗂 Table: transactions
    │   ├── txn_id (INT)
    │   ├── customer_id (INT)
    │   ├── amount (DOUBLE)
    │   └── txn_date (TIMESTAMP)
    └── 👁 View: high_value_customers
Clicking on a table shows you its column names, data types, sample rows, and statistics like minimum value, maximum value, and the number of null entries per column. This preview feature is extremely useful for understanding a new dataset before writing any code.
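The catalog tree above maps directly onto the three-part names you use in queries, such as main.retail_data.customers (catalog, then schema, then table). Here is a small sketch of that naming rule. The TableName type and parse_table_name helper are invented for illustration, and the parser deliberately ignores quoted identifiers that contain dots.

```python
from typing import NamedTuple


class TableName(NamedTuple):
    catalog: str
    schema: str
    table: str

    def qualified(self) -> str:
        """Join the three levels into the name used in SQL."""
        return f"{self.catalog}.{self.schema}.{self.table}"


def parse_table_name(name: str) -> TableName:
    """Split 'catalog.schema.table' into its three parts."""
    parts = name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"Expected catalog.schema.table, got {name!r}")
    return TableName(*parts)


customers = parse_table_name("main.retail_data.customers")
print(customers.schema)       # retail_data
print(customers.qualified())  # main.retail_data.customers
```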
The Compute Section – Managing Your Processing Power
The Compute section shows all clusters and SQL Warehouses in your workspace. This is where you create new clusters, check the status of running clusters, and adjust cluster configurations.
Types of Compute in Databricks
COMPUTE TYPES IN DATABRICKS
─────────────────────────────────────────────────────
All-Purpose Cluster → Used for interactive notebooks
                      Stays running until you stop it
                      Best for development and exploration
Job Cluster         → Created automatically when a job runs
                      Shuts down as soon as the job finishes
                      Best for production pipelines (cheaper)
SQL Warehouse       → Optimized for SQL queries only
                      Serverless option available
                      Best for analysts using the SQL Editor
For day-to-day exploration, use an All-Purpose Cluster. For scheduled production jobs, use Job Clusters — they spin up fresh for each run and shut down automatically, saving money.
Creating Your First Cluster
Go to Compute, click Create Cluster, and fill in these key fields:
- Cluster Name: A descriptive name like "dev-exploration-cluster"
- Cluster Mode: Single Node (for small jobs) or Standard (for distributed jobs)
- Databricks Runtime Version: Choose the latest LTS (Long Term Support) version
- Node Type: The type of virtual machine. Larger machines cost more but process data faster.
- Autoscaling: Check this box to let the cluster grow and shrink automatically based on workload
- Terminate after inactivity: Set this to 30–60 minutes to avoid paying for idle clusters
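The same fields can be written down as a cluster definition, which is handy when you want to review or reuse a configuration. The sketch below assembles a dictionary using field names from the Clusters REST API (cluster_name, spark_version, node_type_id, autoscale, autotermination_minutes). The function build_cluster_spec is invented for this example, and the runtime version and node type are left as placeholders because valid values depend on your cloud and workspace.

```python
import json


def build_cluster_spec(name, spark_version, node_type_id,
                       min_workers=1, max_workers=4, idle_minutes=30):
    """Assemble a cluster definition that mirrors the form fields above."""
    if min_workers > max_workers:
        raise ValueError("min_workers must not exceed max_workers")
    if idle_minutes <= 0:
        raise ValueError("idle_minutes must be positive")
    return {
        "cluster_name": name,
        "spark_version": spark_version,   # Databricks Runtime Version
        "node_type_id": node_type_id,     # Node Type
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        "autotermination_minutes": idle_minutes,
    }


spec = build_cluster_spec(
    "dev-exploration-cluster",
    spark_version="<lts-runtime-version>",  # placeholder, not a real value
    node_type_id="<node-type>",             # placeholder, not a real value
    idle_minutes=45,
)
print(json.dumps(spec, indent=2))
```

Nothing here creates a cluster. The dictionary is only the shape of the request you would send or paste into the cluster's JSON view.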
The SQL Editor – Writing Queries Like a Spreadsheet
The SQL Editor is a dedicated interface for analysts who prefer writing SQL queries rather than Python code. It connects to a SQL Warehouse (not a regular cluster) and provides a clean environment for writing queries, viewing results, and saving frequently used queries.
SQL EDITOR LAYOUT
──────────────────────────────
┌─────────────────────────────────────┐
│ SQL EDITOR                          │
│─────────────────────────────────────│
│ [Warehouse: Analytics-WH ▼]         │ ← Select SQL Warehouse
│─────────────────────────────────────│
│ SELECT city,                        │
│   COUNT(*) AS total_customers       │
│ FROM main.retail_data.customers     │
│ GROUP BY city                       │
│ ORDER BY total_customers DESC       │
│                                     │
│ [▶ Run Query] [Save Query]          │
│─────────────────────────────────────│
│ RESULTS                             │
│ city   | total_customers            │
│ Mumbai | 4523                       │
│ Delhi  | 3892                       │
│ Pune   | 2110                       │
└─────────────────────────────────────┘
The SQL Editor also supports query history, so you can see every query you ran in the past and re-run them with one click.
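If you want to try the shape of the query from the layout above before you have a SQL Warehouse, you can run it against a throwaway local table. The sketch below uses Python's built-in sqlite3 module as a stand-in, not Databricks. The six sample rows are invented for illustration, and the table is named customers without the main.retail_data prefix.

```python
import sqlite3

# Invented sample rows: (customer_id, name, city)
rows = [
    (1, "Customer A", "Mumbai"),
    (2, "Customer B", "Mumbai"),
    (3, "Customer C", "Mumbai"),
    (4, "Customer D", "Delhi"),
    (5, "Customer E", "Delhi"),
    (6, "Customer F", "Pune"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT, city TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)

# Same GROUP BY / ORDER BY shape as the query in the SQL Editor layout.
result = conn.execute(
    """
    SELECT city, COUNT(*) AS total_customers
    FROM customers
    GROUP BY city
    ORDER BY total_customers DESC
    """
).fetchall()
conn.close()

print(result)  # [('Mumbai', 3), ('Delhi', 2), ('Pune', 1)]
```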
Dashboards – Turning Data Into Charts
The Dashboards section lets you build visual reports from your SQL query results. After running a query in the SQL Editor, you can turn the result into a bar chart, line chart, pie chart, or data table, then pin it to a dashboard.
Dashboards in Databricks are shareable. You can schedule them to refresh automatically — for example, every morning at 7 AM — and send a link to your manager or client so they always see the latest data without needing to log into Databricks themselves.
Personal Settings and User Preferences
Click your username in the top-right corner to open User Settings. Here you can manage several important preferences.
Access Tokens
Access tokens let you connect to Databricks from external tools like Python scripts, BI tools (Tableau, Power BI), or command-line interfaces. To generate a token, go to User Settings > Access Tokens > Generate New Token. Keep the token secret — it acts like a password for programmatic access.
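In a script, the token is usually sent as a Bearer token in the Authorization header of each REST request. The sketch below shows one cautious way to do that: read the token from an environment variable instead of hard-coding it. The variable name DATABRICKS_TOKEN is a common convention, and the helper auth_headers is invented for this example.

```python
import os
from typing import Mapping


def auth_headers(env: Mapping[str, str] = os.environ) -> dict:
    """Build request headers from a token stored in the environment."""
    token = env.get("DATABRICKS_TOKEN", "").strip()
    if not token:
        raise RuntimeError("Set DATABRICKS_TOKEN before calling the API")
    return {"Authorization": f"Bearer {token}"}


# Example with an obviously fake token, passed in explicitly:
print(auth_headers({"DATABRICKS_TOKEN": "example-token-not-real"}))
```

Keeping the token out of the notebook or script source means it does not end up in Git history or in shared folders.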
Git Integration
Databricks supports integration with GitHub, GitLab, and Azure DevOps. Once connected, you can commit notebook changes directly to a Git repository, create branches, and collaborate using pull requests — the same way software developers work. This keeps your data code versioned and auditable.
GIT INTEGRATION WORKFLOW
──────────────────────────────────────
Local Git Branch              Databricks Notebook
       │                               │
       │   git clone → pull into       │
       │       Databricks Repo         │
       ▼                               ▼
Work on code locally      Work on notebook in browser
       │                               │
       └────────── git push ───────────┘
                       │
                       ▼
            Pull Request on GitHub
                       │
                       ▼
              Code Review + Merge
Notification Preferences
You can configure email notifications for job failures, job completions, and alert thresholds. Setting up failure notifications ensures you know immediately when a critical pipeline breaks, rather than discovering it hours later when someone notices that a report is empty.
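When jobs are defined through the Jobs REST API instead of the UI, the same preferences appear as an email_notifications block with on_failure and on_success lists. The sketch below builds that block. The helper name job_email_notifications is invented, and the addresses are the example users from earlier in this topic.

```python
def job_email_notifications(on_failure, on_success=()):
    """Build the email notification settings for a job definition."""
    failure = sorted(set(on_failure))
    if not failure:
        raise ValueError("List at least one address for failure emails")
    settings = {"on_failure": failure}
    success = sorted(set(on_success))
    if success:
        settings["on_success"] = success
    return {"email_notifications": settings}


print(job_email_notifications(
    on_failure=["priya@company.com", "rahul@company.com"],
    on_success=["priya@company.com"],
))
```

The sketch insists on a failure address because that is the notification this section treats as essential. Success emails stay optional.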
Admin Settings – For Workspace Administrators
If you have administrator access, you see an additional Admin Console option in your user menu. The Admin Console lets you manage users and groups, set workspace-level permissions, configure Single Sign-On (SSO), view audit logs, and manage service principals (automated accounts used by applications).
Administrators also control which cloud storage buckets the workspace can access and set up security configurations like IP access lists (allowing only office IP addresses to log in).
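An IP access list is essentially a membership test: a login is accepted only if the client address falls inside one of the allowed ranges. The sketch below illustrates that idea with Python's ipaddress module. It is a conceptual model, not the Databricks implementation, and the address ranges are reserved documentation ranges standing in for real office networks.

```python
import ipaddress


def is_allowed(client_ip: str, allowed_ranges) -> bool:
    """True if client_ip falls inside any of the allowed CIDR ranges."""
    ip = ipaddress.ip_address(client_ip)
    return any(ip in ipaddress.ip_network(cidr) for cidr in allowed_ranges)


# Placeholder office networks (reserved documentation ranges).
OFFICE_RANGES = ["203.0.113.0/24", "198.51.100.0/24"]

print(is_allowed("203.0.113.25", OFFICE_RANGES))  # True
print(is_allowed("192.0.2.10", OFFICE_RANGES))    # False
```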
The Databricks CLI – Working From the Command Line
The Databricks CLI (Command Line Interface) lets you interact with your workspace from a terminal. It is useful for automating tasks like deploying notebooks, managing clusters, or uploading files.
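Install it using pip (this PyPI package is the legacy, Python-based CLI; newer CLI releases are distributed separately, and some command names differ):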
pip install databricks-cli
Configure it with your workspace URL and access token:
databricks configure --token
Host: https://your-workspace.azuredatabricks.net
Token: [paste your token here]
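The configure command saves the host and token in a small INI-style profile file (with the legacy CLI, a .databrickscfg file in your home directory with a [DEFAULT] section). The sketch below parses such a profile from a string so you can see its shape. The sample content and the helper load_profile are illustrative, and the token is fake.

```python
import configparser

# Illustrative profile content; the token is not real.
SAMPLE_PROFILE = """\
[DEFAULT]
host = https://your-workspace.azuredatabricks.net
token = example-token-not-real
"""


def load_profile(text: str, profile: str = "DEFAULT") -> dict:
    """Read host and token for one profile from INI-style text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if profile != "DEFAULT" and not parser.has_section(profile):
        raise KeyError(f"No profile named {profile!r}")
    section = parser[profile]
    return {"host": section["host"].rstrip("/"), "token": section["token"]}


print(load_profile(SAMPLE_PROFILE)["host"])
```

Because this file holds a token in plain text, treat it like a password file and keep it out of shared folders and Git repositories.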
After configuration, you can run commands like:
databricks fs ls dbfs:/          → List files in Databricks File System
databricks clusters list         → Show all clusters
databricks jobs list             → Show all scheduled jobs
databricks workspace ls /Users   → List notebooks in workspace
Keyboard Shortcuts in Notebooks
Knowing keyboard shortcuts speeds up your notebook workflow significantly.
DATABRICKS NOTEBOOK SHORTCUTS
──────────────────────────────────
Ctrl + Enter     → Run current cell
Shift + Enter    → Run cell and move to next
Ctrl + Shift + P → Command palette
Esc + A          → Insert cell above
Esc + B          → Insert cell below
Esc + D + D      → Delete current cell
Ctrl + Z         → Undo
Ctrl + /         → Comment / Uncomment code
Key Points
- The Databricks Workspace is the browser-based interface where all your data work happens.
- The left sidebar provides navigation to notebooks, data catalog, compute, SQL editor, dashboards, and machine learning tools.
- All-Purpose Clusters suit interactive work; Job Clusters suit automated production pipelines; SQL Warehouses suit SQL-only analytics.
- The Catalog browser lets you explore all tables, schemas, and data assets without writing any code.
- Git integration keeps your notebook code versioned and enables collaboration through pull requests.
- Access Tokens allow external tools and scripts to connect to your Databricks workspace programmatically.
- The Databricks CLI automates workspace management tasks from the command line.
