|
| 1 | +{# |
| 2 | + Generates a surrogate key hash like dbt_utils.surrogate_key(), but |
| 3 | + provides additional, T-SQL specific parameters and config. |
| 4 | + |
| 5 | + Example usage: |
| 6 | + ```sql |
| 7 | + select |
| 8 | + {{ tsql_utils.surrogate_key(["id"]) }} as test_key |
| 9 | + from src_test |
| 10 | + ``` |
| 11 | + |
| 12 | + Args: |
| 13 | + field_list (list): A list of columns or values that should be used to |
| 14 | + generate the surrogate key. |
| 15 | + |
| 16 | + col_type (str): The column type that field values are cast to before |
| 17 | + hashing. Useful for when the underlying columns are |
| 18 | + nvarchar, for example. |
| 19 | + |
| 20 | + use_binary_hash (bool): By default the hash is converted to a varchar |
| 21 | + string that uses 32 bytes of data. Setting |
| 22 | + this parameter to True will keep the key as |
| 23 | + varbinary that only uses 16 bytes of data. |
| 24 | + |
| 25 | + This will reduce space in the database and can |
| 26 | + potentially increase join performance, but the |
| 27 | + column has to be converted into varchar before |
| 28 | + it can be used in Power BI for relationships. |
| 29 | + |
| 30 | + Returns: |
| 31 | + str: SQL code that generates a hashed surrogate key. |
| 32 | + |
| 33 | + DBT Project Variables: |
| 34 | + You can also adjust default settings through variables in your |
| 35 | + dbt_project.yml: |
| 36 | + |
| 37 | + ```yml |
| 38 | + vars: |
| 39 | + dbt_utils_dispatch_list: ['tsql_utils'] |
| 40 | + tsql_utils_surrogate_key_col_type: 'nvarchar(1234)' |
| 41 | + tsql_utils_surrogate_key_use_binary_hash: True |
| 42 | + ``` |
| 43 | +#} |
| 44 | + |
| 45 | + |
| 46 | +{%- macro surrogate_key(field_list, col_type=None, use_binary_hash=None) -%} |
| 47 | + |
| 48 | + {%- if col_type is none -%} |
| 49 | + {%- set col_type = var( |
| 50 | + "tsql_utils_surrogate_key_col_type", |
| 51 | + "varchar(8000)" |
| 52 | + ) -%} |
| 53 | + {%- endif -%} |
| 54 | + |
| 55 | + {%- if use_binary_hash is none -%} |
| 56 | + {%- set use_binary_hash = var( |
| 57 | + "tsql_utils_surrogate_key_use_binary_hash", |
| 58 | + False |
| 59 | + ) -%} |
| 60 | + {%- endif -%} |
| 61 | + |
| 62 | + {%- if field_list is string -%} |
| 63 | + {%- set field_list = [field_list] -%} |
| 64 | + {%- endif -%} |
| 65 | + |
| 66 | + {%- set fields = [] -%} |
| 67 | + |
| 68 | + {%- for field in field_list -%} |
| 69 | + |
| 70 | + {%- set _ = fields.append( |
| 71 | + "coalesce(cast(" ~ field ~ " as " ~ col_type ~ "), '')" |
| 72 | + ) -%} |
| 73 | + |
| 74 | + {%- if not loop.last -%} |
| 75 | + {%- set _ = fields.append("'-'") -%} |
| 76 | + {%- endif -%} |
| 77 | + |
| 78 | + {%- endfor -%} |
| 79 | + |
| 80 | + {%- if use_binary_hash == True -%} |
| 81 | + {%- set key = "hashbytes('md5', " ~ dbt_utils.concat(fields) ~ ")" -%} |
| 82 | + {%- else -%} |
| 83 | + {%- set key = dbt_utils.hash(dbt_utils.concat(fields)) -%} |
| 84 | + {%- endif -%} |
| 85 | + |
| 86 | + {{ key }} |
| 87 | + |
| 88 | +{%- endmacro -%} |
| 89 | + |
| 90 | +{# |
| 91 | + Converts a value from a binary surrogate key hash into varchar. |
| 92 | + |
| 93 | + This is useful if you are using `use_binary_hash=True` for your surrogate keys. Binary columns cannot be used for relationships in Power BI. |
| 94 | + |
| 95 | + This macro allows you to convert them to varchar inside your report views |
| 96 | + before importing them into Power BI to allow relationships on your |
| 97 | + surrogate key columns. |
| 98 | + |
| 99 | + Args: |
| 100 | + col (str): The column or value that should be converted from binary |
| 101 | + hash to varchar hash. |
| 102 | + |
| 103 | + Returns: |
| 104 | + str: SQL code that converts a varbinary hash to varchar. |
| 105 | + |
| 106 | +#} |
| 107 | +{%- macro cast_hash_to_str(col) -%} |
| 108 | + convert(varchar(32), {{ col }}, 2) |
| 109 | +{%- endmacro -%} |
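
A minimal sketch of how the two macros in this diff might be combined in one model, assuming the package is installed and callable as `tsql_utils`. The model file name and the columns `customer_id`, `source_system`, and `customer_name` are hypothetical; `src_test` is borrowed from the docstring example.

```sql
-- models/rpt_customer.sql (hypothetical report view)
with keyed as (

    select
        -- Override the project default for this call only, so the key
        -- stays a 16-byte varbinary instead of a 32-character varchar.
        {{ tsql_utils.surrogate_key(
            ["customer_id", "source_system"],
            use_binary_hash=True
        ) }} as customer_key,
        customer_name
    from src_test

)

select
    -- Convert the varbinary key to a varchar(32) hex string so it can
    -- be used for relationships in Power BI.
    {{ tsql_utils.cast_hash_to_str("customer_key") }} as customer_key,
    customer_name
from keyed
```

In practice the binary key would more likely be stored in a dimension or fact model and converted only in the report-facing views, so joins inside the warehouse still use the smaller varbinary column.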