Skip to content

collection data types (arrays, maps, structs) and the lambda functions (reduce/transform/filter) #962

Open
@ybyzek

Description

@ybyzek

Show advanced workflow with collection data types (arrays, maps, structs) and the lambda functions (reduce/transform/filter)

Examples from @blueedgenick (thank you!)

  1. A dating site : full 3-stage sequence to identify "activated matches": User A messages user B, User B responds to user A, User A responds to user B

This produces the bare minimum in the output (the id-pairs which meet the 'activated' criteria) but you can copy the reduce block from the having clause into the projection as well if you want to see what it's doing. This version should even be resilient to out-of-order data due to the sort=true flag passed to the as_map function.

select 
    array_sort(array [send_id, recv_id]) as sorted_send_recv_pair
from message_stream
group by array_sort(array [send_id, recv_id])
having reduce(
        entries(
            as_map(
                collect_list(cast(rowtime as string)),
                collect_list(send_id)
            ),
            true
        ),
        struct(state := 'none', sender := -1),
        (s, e) = > case
            when s->state = 'none' 
                then struct(state := 'initial', sender := e->v)
            when s->state = 'initial' and s->sender != e->v 
                then struct(state := 'replied', sender := e->v)
            when s->state = 'replied' and s->sender != e->v 
                then struct(state := 'activated', sender := e->v)
            else s
        end
    )->state = 'activated' emit changes;
  1. To get the pageid with the top number of views per 60s window.:
create table foo as
select 'dummy',
  reduce(
    histogram(cast(USER_ID as string)),
     struct(user := 'no result', views := cast(0 as bigint)),
      (s, k, v) => case
        when v > s->views 
           then struct(user := k, views := v)
           else s
         end) as top_user
from pageviews
window tumbling (size 60 seconds)
group by 'dummy'
emit changes;

Metadata

Metadata

Assignees

No one assigned

    Labels

    ksqlDBksqlDB related

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions