SQL or Custom Expressions?¶
SLayer draws a clean line between SQL and its DSL:
- Model definitions (dimensions, measures, filters) use raw SQL — they reference the underlying table columns directly.
- Queries use the DSL — they reference dimension/measure names, apply transforms like
cumsumandchange, and filter on named entities.
The model is the abstraction boundary between the two. This notebook demonstrates the distinction with working examples.
See also: SQL vs DSL | Formulas | Models | Queries
Prerequisites: pip install motley-slayer
import os
import sys
sys.path.insert(0, os.path.join(os.getcwd(), "..", "..", ".."))
sys.path.insert(0, os.path.join(os.getcwd(), "..", "jaffle_data"))
from setup_jaffle import ensure_jaffle_shop
engine, storage, models = ensure_jaffle_shop()
Model Definitions Use SQL¶
A model's dimensions and measures define the mapping from semantic names to SQL expressions. The sql field is a raw SQL column reference or expression — it's what ends up in the generated SELECT and GROUP BY.
For example, the orders model maps the dimension name order_total to the SQL column order_total, and the measure name order_total to an aggregation over order_total:
orders_model = next(m for m in models if m.name == "orders")
print(f"orders model has {len(orders_model.columns)} columns:")
print()
print("Columns (name -> SQL expression):")
for col in orders_model.columns[:8]:
sql_str = col.sql if col.sql else col.name
pk = " [PK]" if col.primary_key else ""
print(f" {col.name:<20} -> sql: {sql_str!r:<20} type: {col.type}{pk}")
orders model has 7 columns: Columns (name -> SQL expression): id -> sql: 'id' type: string [PK] customer_id -> sql: 'customer_id' type: string ordered_at -> sql: 'ordered_at' type: date store_id -> sql: 'store_id' type: string subtotal -> sql: 'subtotal' type: number tax_paid -> sql: 'tax_paid' type: number order_total -> sql: 'order_total' type: number
Notice that the dimension/measure name is a semantic label, while sql is the raw SQL expression. They often coincide (e.g., name: "order_total", sql: "order_total"), but they don't have to — a dimension could be named revenue_with_tax with sql: "subtotal + tax_paid".
Model-level filters are also raw SQL:
filters:
- "deleted_at IS NULL"
- "status <> 'test'"
Queries Use the DSL¶
When you write a query, you reference dimension and measure names — the semantic side. You never write SQL in a query.
Transform functions like cumsum, change, and time_shift are part of the DSL. They compile to complex SQL (CTEs, window functions, self-joins), but you just name them:
result = engine.execute_sync(
query={
"source_model": "orders",
"time_dimensions": [{"dimension": "ordered_at", "granularity": "month"}],
"measures": [
"order_total:sum",
{"formula": "change(order_total:sum)", "name": "mom_change"},
],
"order": [{"column": "ordered_at", "direction": "asc"}],
"limit": 6,
"offset": 12,
}
)
print(f"{'Month':<12} {'Revenue':>12} {'MoM Change':>12}")
print("-" * 38)
for row in result.data:
month = str(row["orders.ordered_at"])[:7]
rev = row["orders.order_total_sum"]
chg = row["orders.mom_change"]
chg_str = f"${chg:+,.0f}" if chg is not None else "-"
print(f"{month:<12} ${rev:>11,.2f} {chg_str:>12}")
print("\n SQL of the query", result.sql)
Month Revenue MoM Change
--------------------------------------
2024-05 $ 118,106.62 $+17,512
2024-06 $ 154,699.94 $+36,593
2024-07 $ 168,960.69 $+14,261
2024-08 $ 180,589.12 $+11,628
2024-09 $ 177,773.88 $-2,815
2024-10 $ 187,103.17 $+9,329
SQL of the query WITH base AS (
SELECT
DATE_TRUNC('MONTH', orders.ordered_at) AS "orders.ordered_at",
SUM(orders.order_total) AS "orders.order_total_sum"
FROM orders AS orders
GROUP BY
DATE_TRUNC('MONTH', orders.ordered_at)
),
shifted__ts_mom_change AS (
SELECT
DATE_TRUNC('MONTH', orders.ordered_at + INTERVAL '1' MONTH) AS "orders.ordered_at",
SUM(orders.order_total) AS "orders.order_total_sum"
FROM orders AS orders
GROUP BY
DATE_TRUNC('MONTH', orders.ordered_at + INTERVAL '1' MONTH)
),
sjoin__ts_mom_change AS (
SELECT base."orders.order_total_sum", base."orders.ordered_at", shifted__ts_mom_change."orders.order_total_sum" AS "orders._ts_mom_change"
FROM base
LEFT JOIN shifted__ts_mom_change
ON base."orders.ordered_at" = shifted__ts_mom_change."orders.ordered_at"
),
step2 AS (
SELECT
"orders._ts_mom_change",
"orders.order_total_sum",
"orders.ordered_at",
"orders.order_total_sum" - "orders._ts_mom_change" AS "orders.mom_change"
FROM sjoin__ts_mom_change
)
SELECT
"orders._ts_mom_change",
"orders.mom_change",
"orders.order_total_sum",
"orders.ordered_at"
FROM step2
ORDER BY "orders.ordered_at" ASC
LIMIT 6
OFFSET 12
The change() function compiled to a self-join CTE under the hood — but the query only referenced the measure name order_total:sum and the transform name change. No SQL needed to be written by the caller.
Filters in Queries Reference Model Entities¶
Query-level filters use dimension and measure names — including dotted names for joined dimensions. SLayer resolves them to the correct SQL automatically:
# Filter on a joined dimension name — not a SQL column reference
result = engine.execute_sync(
query={
"source_model": "orders",
"measures": ["*:count", "order_total:sum"],
"dimensions": ["stores.name"],
"filters": ["stores.name in ('Brooklyn', 'Philadelphia')"],
}
)
for row in result.data:
print(f" {row['orders.stores.name']}: {row['orders._count']:,} orders, ${row['orders.order_total_sum']:,.2f}")
print("\nThe filter 'stores.name' resolved to SQL:")
# Show just the WHERE clause from the generated SQL
where_line = [line for line in result.sql.split("\n") if "WHERE" in line or " IN" in line]
for line in where_line:
print(f" {line.strip()}")
Brooklyn: 256,532 orders, $2,793,658.90
Philadelphia: 192,450 orders, $2,099,292.44
The filter 'stores.name' resolved to SQL:
WHERE
stores.name IN ('Brooklyn', 'Philadelphia')
Adding SQL-Level Filters via ModelExtension¶
What if you need a filter that references raw SQL — an expression over underlying columns that can't be reduced to a combination of dimension/measure filters?
Use query-time model extension semantics to add the filter to the model definition inline, right inside the query. Model-level filters are SQL, so any valid SQL expression works:
# "subtotal > tax_paid * 5" is raw SQL — it references underlying table columns,
# not dimension names. So it goes on the model (via ModelExtension), not the query.
result = engine.execute_sync(
query={
"source_model": {
"source_name": "orders",
"filters": ["subtotal > tax_paid * 5"],
},
"measures": ["*:count", "order_total:sum"],
"dimensions": ["stores.name"],
"order": [{"column": "order_total_sum", "direction": "desc"}],
}
)
print("Orders where subtotal > 5x tax (raw SQL filter via ModelExtension):")
for row in result.data:
print(f" {row['orders.stores.name']}: {row['orders._count']:,} orders, ${row['orders.order_total_sum']:,.2f}")
Orders where subtotal > 5x tax (raw SQL filter via ModelExtension): Brooklyn: 256,532 orders, $2,793,658.90 Philadelphia: 192,450 orders, $2,099,292.44 Chicago: 106,290 orders, $1,183,765.61 San Francisco: 90,401 orders, $1,059,640.66 New Orleans: 15,473 orders, $180,782.69
The key insight: the raw SQL filter subtotal > tax_paid * 5 is on the model (via ModelExtension), while the query's own measures, dimensions, and any query-level filters stay in DSL territory.
Query Result as Model¶
What if you want to use DSL transforms like time_shift or change to define derived measures or dimensions?
Use create_model() with a query parameter to save a query's result as a permanent model. The derived columns become dimensions and measures on the new model, which can then be queried like any other - or just use a query inside another query, as covered in multistage queries
See Creating Models from Queries for details.
Summary¶
| Layer | Language | Examples |
|---|---|---|
| Model definitions (dimensions, measures, filters) | Raw SQL | sql: "amount", filters: ["deleted_at IS NULL"] |
| Queries (measures, dimensions, filters) | DSL names | {"formula": "change(revenue)"}, filters: ["status = 'active'"] |
| SQL in queries | ModelExtension | Add raw SQL filters/dimensions to the model inline |
| DSL in models | Query-as-model | Save a DSL query result as a permanent model |