Commit b916816

Added tsql_utils.surrogate_key() macro (#32)
* Added tsql_utils.surrogate_key() macro
* contrib

Co-authored-by: Kim Streich <[email protected]>
Co-authored-by: Anders Swanson <[email protected]>
1 parent d1c741e commit b916816

2 files changed: +110 -0 lines changed


CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -4,6 +4,7 @@
 
 ## New features
 
+- add support for `dbt_utils.surrogate_key()` [#32](https://github.com/dbt-msft/tsql-utils/pull/32) thanks [@infused-kim](https://github.com/infused-kim)
 - shim dbt-date, currently passing all tests! [#36](https://github.com/dbt-msft/tsql-utils/pull/36)
 - support for `dbt_utils.generate_series()` [#36](https://github.com/dbt-msft/tsql-utils/pull/36)
 - support on synapse for `dbt_utils.dateadd()` and `dbt_utils.datediff()` [#36](https://github.com/dbt-msft/tsql-utils/pull/36)
Lines changed: 109 additions & 0 deletions
@@ -0,0 +1,109 @@
+{#
+    Generates a surrogate key hash like dbt_utils.surrogate_key(), but
+    provides additional, T-SQL-specific parameters and config.
+
+    Example usage:
+    ```sql
+    select
+        {{ tsql_utils.surrogate_key(["id"]) }} as test_key
+    from src_test
+    ```
+
+    Args:
+        field_list (list): A list of columns or values that should be used to
+                           generate the surrogate key.
+
+        col_type (str): The column type field values will be cast to before
+                        hashing. Useful for when the underlying columns are
+                        nvarchar, for example.
+
+        use_binary_hash (bool): By default the hash is converted to a varchar
+                                string that uses 32 bytes of data. Setting
+                                this parameter to True will keep the key as
+                                varbinary that only uses 16 bytes of data.
+
+                                This will reduce space in the database and can
+                                potentially increase join performance, but the
+                                column has to be converted into varchar before
+                                it can be used in Power BI for relationships.
+
+    Returns:
+        str: SQL code that generates a hashed surrogate key.
+
+    DBT Project Variables:
+        You can also adjust default settings through variables in your
+        dbt_project.yml:
+
+        ```yml
+        vars:
+            dbt_utils_dispatch_list: ['tsql_utils']
+            tsql_utils_surrogate_key_col_type: 'nvarchar(1234)'
+            tsql_utils_surrogate_key_use_binary_hash: True
+        ```
+#}
+
+
+{%- macro surrogate_key(field_list, col_type=None, use_binary_hash=None) -%}
+
+    {%- if col_type == None -%}
+        {%- set col_type = var(
+            "tsql_utils_surrogate_key_col_type",
+            "varchar(8000)"
+        ) -%}
+    {%- endif -%}
+
+    {%- if use_binary_hash == None -%}
+        {%- set use_binary_hash = var(
+            "tsql_utils_surrogate_key_use_binary_hash",
+            False
+        ) -%}
+    {%- endif -%}
+
+    {%- if field_list is string -%}
+        {%- set field_list = [field_list] -%}
+    {%- endif -%}
+
+    {%- set fields = [] -%}
+
+    {%- for field in field_list -%}
+
+        {%- set _ = fields.append(
+            "coalesce(cast(" ~ field ~ " as " ~ col_type ~ "), '')"
+        ) -%}
+
+        {%- if not loop.last %}
+            {%- set _ = fields.append("'-'") -%}
+        {%- endif -%}
+
+    {%- endfor -%}
+
+    {%- if use_binary_hash == True -%}
+        {%- set key = "hashbytes('md5', " ~ dbt_utils.concat(fields) ~ ")" -%}
+    {%- else -%}
+        {%- set key = dbt_utils.hash(dbt_utils.concat(fields)) -%}
+    {%- endif -%}
+
+    {{ key }}
+
+{%- endmacro -%}
+
+{#
+    Converts a value from a binary surrogate key hash into varchar.
+
+    This is useful if you are using `use_binary_hash=True` for your surrogate keys. Binary columns cannot be used for relationships in Power BI.
+
+    This macro allows you to convert them to varchar inside your report views
+    before importing them into Power BI to allow relationships on your
+    surrogate key columns.
+
+    Args:
+        col (str): The column or value that should be converted from binary
+                   hash to varchar hash.
+
+    Returns:
+        str: SQL code that converts a varbinary hash to a varchar hash.
+
+#}
+{%- macro cast_hash_to_str(col) -%}
+    convert(varchar(32), {{ col }}, 2)
+{%- endmacro -%}
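
The per-row computation that `surrogate_key()` renders can be sketched outside the database. The Python below is a rough illustration and not part of the commit. It assumes that the non-binary branch (`dbt_utils.hash`) yields the same uppercase hex form that `cast_hash_to_str` produces, that `str()` stands in for `cast(... as col_type)` (realistic only for simple values such as integers and short strings), and that the hypothetical `encoding` parameter stands in for how SQL Server stores the cast value (`latin-1` as an approximation of `varchar` for ASCII data, `utf-16-le` for `nvarchar`).

```python
import hashlib


def surrogate_key(values, use_binary_hash=False, encoding="latin-1"):
    """Emulate, for a single row, the expression the macro generates.

    values: the row's field values in field_list order; None stands for NULL.
    encoding: hypothetical stand-in for col_type (an assumption, see lead-in).
    """
    # coalesce(cast(field as col_type), '') turns NULL into an empty string.
    parts = ["" if v is None else str(v) for v in values]
    # The macro places the literal '-' between fields before concatenating.
    digest = hashlib.md5("-".join(parts).encode(encoding)).digest()
    if use_binary_hash:
        return digest  # varbinary form: 16 bytes
    # Assumed equivalent of convert(varchar(32), <hash>, 2): 32 hex characters.
    return digest.hex().upper()


print(surrogate_key(["a"]))                               # 0CC175B9C0F1B6A831C399E269772661
print(len(surrogate_key(["a"], use_binary_hash=True)))    # 16
print(surrogate_key(["a", None]) == surrogate_key(["a", ""]))  # True
```

Two consequences of the macro's design show up here. Because NULL is coalesced to `''`, a NULL field and an empty-string field produce the same key. And because the hash is computed over the bytes of the cast value, changing `col_type` between `varchar` and `nvarchar` changes the resulting keys, so the setting should stay consistent across models that are joined on the key.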
