Generate data analysis SQL using code_gen from data transform expression in SQLFlow. #1667

Open
@brightcoder01

The following transform functions each depend on an analysis of the data. The analysis must run first so that the transform logic becomes concrete.

| Name | Feature Column Template | Analyzer |
| --- | --- | --- |
| STANDARDIZE(x) | numeric_column({var_name}, normalizer_fn=lambda x: (x - {mean}) / {std}) | MEAN, STDDEV |
| NORMALIZE(x) | numeric_column({var_name}, normalizer_fn=lambda x: (x - {min}) / ({max} - {min})) | MAX, MIN |
| BUCKETIZE(x, bucket_num=y) | bucketized_column({var_name}, boundaries={percentiles}) | PERCENTILE |
| APPLY_VOCAB(x) | categorical_column_with_vocabulary({var_name}, vocabulary_list={vocabulary_list}) | DISTINCT |
| HASH(x, hash_bucket_size) | categorical_column_with_hash_bucket | COUNT(DISTINCT) |
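
To make the "analysis first, then transform" flow concrete, here is a minimal Python sketch of what the code generation could look like for the scalar analyzers. It is not SQLFlow's actual code_gen: the names `ANALYZERS`, `TEMPLATES`, `gen_analysis_sql`, and `fill_template` are hypothetical, and the aggregate spellings (`AVG`, `STDDEV`, `MIN`, `MAX`, `COUNT(DISTINCT ...)`) are assumptions that vary by SQL dialect. BUCKETIZE and APPLY_VOCAB are left out because their analyzers return multiple values and the percentile syntax is dialect-specific.

```python
# Hypothetical mapping: transform name -> {statistic name: aggregate expression}.
# The aggregate function names are assumptions; real SQL engines may differ.
ANALYZERS = {
    "STANDARDIZE": {"mean": "AVG({col})", "std": "STDDEV({col})"},
    "NORMALIZE": {"min": "MIN({col})", "max": "MAX({col})"},
    "HASH": {"distinct_count": "COUNT(DISTINCT {col})"},
}

# Hypothetical feature column templates, parenthesized so that the
# subtraction happens before the division.
TEMPLATES = {
    "STANDARDIZE": 'numeric_column("{var_name}", '
                   "normalizer_fn=lambda x: (x - {mean}) / {std})",
    "NORMALIZE": 'numeric_column("{var_name}", '
                 "normalizer_fn=lambda x: (x - {min}) / ({max} - {min}))",
}


def gen_analysis_sql(transforms, table):
    """Build one SELECT that computes every statistic the transforms need.

    transforms: list of (transform_name, column_name) pairs.
    table: name of the source table.
    """
    select_items = []
    for name, col in transforms:
        for stat, expr in ANALYZERS[name].items():
            # Alias each aggregate as <column>_<statistic> so results can be
            # mapped back to the template placeholders.
            select_items.append(f"{expr.format(col=col)} AS {col}_{stat}")
    return f"SELECT {', '.join(select_items)} FROM {table}"


def fill_template(name, var_name, stats):
    """Substitute analysis results into the feature column template."""
    return TEMPLATES[name].format(var_name=var_name, **stats)
```

For example, `gen_analysis_sql([("STANDARDIZE", "age")], "titanic")` produces a single query selecting `AVG(age) AS age_mean` and `STDDEV(age) AS age_std`. Once the query has run, `fill_template("STANDARDIZE", "age", {"mean": 29.5, "std": 14.0})` yields the concrete feature column expression. The table name and the numbers here are made up for illustration.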

The SQLFlow syntax for data transform and analysis is discussed in #1664.
