Verifying results #4

@IG16

Hey,
I was using your awesome clickstream algorithm engine when I noticed something interesting.

Here is what I did:
I am trying to verify results of the algorithm, so the check I do is the following:

  1. After running the algorithm, open the result.json file.
  2. For each leaf node in result.json, find its list of exclusions, for example:
    ["t", [["l", [48, 167, 201, 283, 434, 468, 672, 883, 916, 970, 1015, 1271],
    {"exclusionsScore": [1285.0, 336.0208333333333, 0.0, 0.0], "exclusions": ["S2319", "S674", "S3690", "S3361"]}],
  3. To verify the results, I look up every user in this cluster (for instance userId 48) in their respective input file (the input file contains the actual log of actions performed by users and is used as input to the algorithm) and check that they have actually performed at least one of the ["S2319", "S674", "S3690", "S3361"] sequences.
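
In case it helps to reproduce, here is a rough Python sketch of the check described in the steps above. It is not taken from the engine's code. It assumes, based only on the fragment shown, that a node in result.json is either `["t", [child, ...]]` or `["l", [user ids], {"exclusions": [...]}]`; the real layout may differ. The names `iter_leaves`, `users_without_match`, and `user_sequences` are hypothetical, and the toy data is made up for illustration.

```python
from typing import Dict, Iterator, List, Set, Tuple


def iter_leaves(node) -> Iterator[Tuple[List[int], List[str]]]:
    """Yield (user_ids, exclusions) for every leaf under `node`.

    ASSUMPTION: nodes look like ["t", [child, ...]] for inner nodes and
    ["l", [user_id, ...], {"exclusions": [...]}] for leaves.
    """
    kind = node[0]
    if kind == "l":
        yield node[1], node[2]["exclusions"]
    elif kind == "t":
        for child in node[1]:
            yield from iter_leaves(child)


def users_without_match(tree, user_sequences: Dict[int, Set[str]]) -> List[int]:
    """Return users who performed none of their own cluster's sequences.

    `user_sequences` maps a user id to the set of sequence ids found in
    that user's input log. A user absent from the mapping counts as
    having performed nothing.
    """
    missing = []
    for user_ids, exclusions in iter_leaves(tree):
        wanted = set(exclusions)
        for uid in user_ids:
            if not (user_sequences.get(uid, set()) & wanted):
                missing.append(uid)
    return missing


# Toy data (invented): two leaves, three users.
tree = ["t", [
    ["l", [48, 167], {"exclusions": ["S2319", "S674"]}],
    ["l", [201], {"exclusions": ["S3690"]}],
]]
user_sequences = {
    48: {"S2319", "S10"},   # matches its cluster
    167: {"S99"},           # matches nothing in its cluster
    201: {"S3690"},         # matches its cluster
}

missing = users_without_match(tree, user_sequences)
total_users = sum(len(ids) for ids, _ in iter_leaves(tree))
fraction_missing = len(missing) / total_users
```

With the real result.json loaded via `json.load` and `user_sequences` built from the input logs, `fraction_missing` is the share of users who match none of their cluster's sequences.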

Here are the results:
When I verify the results, I find that about 20% of users do not have any of their cluster's sequences in the input file, meaning they did not perform any of the sequences of actions of the cluster they belong to.

Here is what I expected:
Does this result make sense? Shouldn't every user have performed at least one sequence that appears in the cluster they belong to?
Thank you very much
