Werner Daehn
2 min readMay 9, 2021

--

Let me see if I got it right. The Puppini bridge essentially contains all allowed joins. My first problem with that is we have such a thing already, it is called semantic layer. And it works on query level and does not require to create such a monster table.

When I change your query to group by Product (the Forecast is not on product level), it will still return correct data, wouldn't it? I would see the forecast sum on the product NULL and the actuals per product. And that is the most correct result for such "wrong" user request we can produce.

Issue #2 is that many fact tables have implicit dimensions, that are foreign keys which are dimensions by themselves. Say order-completed Yes/No. As the bridge works with foreign key relationships there must be a separate dimension table. Not nice but okay.

Data Loading should be fine. One new order record requires one new record in the bridge table. In a Star schema as opposed to a Snowflake schema there are not even derived foreign keys.

If I am not mistaken, I see the potential but for a real life use case with millions of orders, shippings, invoices, payments, ... and 100 dimensions this Pippini table simply gets too large and wide.

Much easier and faster to use a semantic layer on query level:

select category, sum(sales), sum(forecast) from

(

select category, sum(sales), null from orders

union all

select category, null, sum(forecast) from forecast

)

Two simple queries with a single join between a large and a small table, taking full advantage of all database optimization and quick secondary step to combine the two into one resultset.

--

--

Werner Daehn
Werner Daehn

Written by Werner Daehn

Data Integration expert for Big Data and SAP

Responses (1)