Skip to main content

Wanderbricks dataset

The wanderbricks schema in the samples catalog contains a simulated travel booking platform dataset. It models a vacation rental marketplace, with tables covering users and hosts, property listings and destinations, bookings and payments, reviews and support logs, and clickstream activity.

Use this dataset to explore data engineering pipelines, analytics dashboards, and machine learning workflows without loading your own data.

Access the dataset

The Wanderbricks dataset is preloaded in the samples catalog and is available in Unity Catalog-enabled workspaces.

To list all tables in the schema:

SQL
SHOW TABLES IN samples.wanderbricks;

To preview data in a specific table:

SQL
SELECT *
FROM samples.wanderbricks.<table-name>
LIMIT 10;

Replace <table-name> with the table you want to explore, for example bookings or reviews.

Tables

The wanderbricks schema includes the following tables. Run SHOW TABLES IN samples.wanderbricks for the complete list.

Table

Description

users

User profiles including name, email, country, and user type.

hosts

Host profiles linked to property listings, including account and contact details.

properties

Property listings with details such as title, type, price, and destination.

bookings

Booking records with check-in/check-out dates, guest counts, total amounts, and status.

payments

Payment records with method, amount, status, and booking references.

booking_updates

Booking state change records for change data capture (CDC) pipelines.

reviews

User reviews for properties, including ratings, comments, and an is_deleted flag for soft deletes.

clickstream

User activity events (views, clicks, searches, filters) with nested metadata for device and referrer.

page_views

Page view events tied to users and properties.

customer_support_logs

Support ticket logs with nested arrays of messages, including sender and sentiment.

destinations

Destination locations with names and descriptions, referenced by property listings.

Primary table relationships

The following diagram shows the connections between users, bookings, properties, and related tables.

Wanderbricks primary table relationships diagram showing connections between users, bookings, properties, and related tables.

  • Users and hosts
    • users represents travelers and business customers.
    • hosts represents property owners and operators.
  • Properties and destinations
    • Each row in properties is a listing owned by a host.
    • properties links to destinations to model where the listing is located.
  • Bookings and payments
    • bookings connects travelers (user_id) to properties (property_id).
    • payments and booking_updates reference booking_id to capture financial transactions and status changes.
  • Behavior and experience
    • clickstream and page_views track how users browse and interact with listings.
    • reviews captures post-stay feedback by user and property.
    • customer_support_logs records support interactions tied back to a user.

Example queries

The following examples show how to query the Wanderbricks dataset.

Join users, bookings, and properties

Return recent bookings with guest and property details:

SQL
SELECT
u.name AS guest_name,
p.title AS property_title,
b.check_in,
b.check_out,
b.total_amount,
b.status
FROM samples.wanderbricks.bookings AS b
JOIN samples.wanderbricks.users AS u
ON b.user_id = u.user_id
JOIN samples.wanderbricks.properties AS p
ON b.property_id = p.property_id
ORDER BY b.check_in DESC
LIMIT 10;

Analyze clickstream events by device

Summarize engagement by event type and device:

SQL
SELECT
metadata.device AS device_type,
event,
COUNT(*) AS event_count
FROM samples.wanderbricks.clickstream
GROUP BY metadata.device, event
ORDER BY event_count DESC;

Compute average ratings per property

Find the highest-rated properties with enough review volume:

SQL
SELECT
p.title AS property_title,
p.property_type,
ROUND(AVG(r.rating), 2) AS avg_rating,
COUNT(r.rating) AS review_count
FROM samples.wanderbricks.properties AS p
JOIN samples.wanderbricks.reviews AS r
ON p.property_id = r.property_id
WHERE r.is_deleted = false
GROUP BY p.title, p.property_type
HAVING COUNT(r.rating) >= 5
ORDER BY avg_rating DESC
LIMIT 10;