oracle aide

December 30, 2014

Slow changing dimensions types 0-4

Filed under: Uncategorized — oracleaide @ 1:02 am

The common enumeration of  SCDs using types 0-7 encodes a single attribute — where the history is stored.

I think there is an obvious pattern for types 0-4 (and emerging mnemonics):

Type 0: history is stored nowhere.

Type 1: history is stored in the dimension itself, in a single current row (history with length of 1 means – no history).

Type 2: history is stored in the dimension itself, in extra rows (history length is 2+)

Type 3: history is stored in the dimension itself, in extra columns (something about 3rd dimension?)

Type 4: history is stored in a separate table.

December 28, 2014

Analysis of time series in RDBMS

Filed under: Uncategorized — oracleaide @ 6:33 pm

Summary: pivoted time series allow trade-offs between speed, space, convenience, scalability.

Many scientists and analysts (a.k.a. humans) visualize time series horizontally, where the time axis goes from left to right and values — parallel to it. The series array is often sparse, e.g. we have no data point for January 2, but have to allocate an array element, so January 1 and January 3 are two days apart. In terms of RDBMS: the time information is stored in columns (even column names), values are stored in rows, which is worse since columns are static and defined via DDL.

time_series

This is approach is intuitive and friendly to humans, but not to databases.
Sparse data do waste space. In the world of databases wasted space = wasted time.

Dates as columns are rigid and require developers to hard-code dates in analytic SQL.

An alternative approach is to pivot the model: the time axis goes to a single column, values – to separate columns too.

Every row corresponds to a single point in time and contains values from all the column (e.g. forecast at P50, P80)

The data become dense. Since dates are not a part of the model (no fixed columns) – there is no need to keep empty rows for missing data. The absence of hard-coded dates make SQL simple, compact and makes it easier to run analysis in a moving window of data range.

Create a free website or blog at WordPress.com.