@@ -346,6 +346,50 @@ Each row in the result shows:
SELECT approx_topk(clientip, 10) FROM "default"
```
It returns the `10` most frequently occurring client IP addresses from the `default` stream.
+
+ **Function result (returns an object with an array):**
+
+ ```json
+ {
+   "item": [
+     {"clientip": "192.168.1.100", "request_count": 2650},
+     {"clientip": "10.0.0.5", "request_count": 2230},
+     {"clientip": "203.0.113.50", "request_count": 2210},
+     {"clientip": "198.51.100.75", "request_count": 1970},
+     {"clientip": "172.16.0.10", "request_count": 1930},
+     {"clientip": "192.168.1.200", "request_count": 1830},
+     {"clientip": "203.0.113.80", "request_count": 1630},
+     {"clientip": "10.0.0.25", "request_count": 1590},
+     {"clientip": "172.16.0.30", "request_count": 1550},
+     {"clientip": "192.168.1.150", "request_count": 1410}
+   ]
+ }
+ ```
+ **Use `unnest()` to extract usable results:**
+
+ ```sql
+ SELECT item.clientip as clientip, item.request_count as request_count
+ FROM (
+     SELECT unnest(approx_topk(clientip, 10)) as item
+     FROM "default"
+ )
+ ORDER BY request_count DESC
+ ```
+ **Final output (individual rows):**
+
+ ```json
+ {"clientip":"192.168.1.100","request_count":2650}
+ {"clientip":"10.0.0.5","request_count":2230}
+ {"clientip":"203.0.113.50","request_count":2210}
+ {"clientip":"198.51.100.75","request_count":1970}
+ {"clientip":"172.16.0.10","request_count":1930}
+ {"clientip":"192.168.1.200","request_count":1830}
+ {"clientip":"203.0.113.80","request_count":1630}
+ {"clientip":"10.0.0.25","request_count":1590}
+ {"clientip":"172.16.0.30","request_count":1550}
+ {"clientip":"192.168.1.150","request_count":1410}
+ ```
+
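To make the reshaping step concrete, here is a minimal Python sketch of what the `unnest()` query does to the result object: it turns one object holding an `item` array into one flat row per entry. The helper name `unnest_items` and the truncated three-entry sample are illustrative assumptions, not part of the product.

```python
import json

# Sample shaped like the approx_topk result object, truncated to three entries.
result = json.loads("""
{
  "item": [
    {"clientip": "192.168.1.100", "request_count": 2650},
    {"clientip": "10.0.0.5", "request_count": 2230},
    {"clientip": "203.0.113.50", "request_count": 2210}
  ]
}
""")


def unnest_items(obj):
    """Yield one flat row per entry of the nested "item" array (hypothetical helper)."""
    for entry in obj["item"]:
        yield {"clientip": entry["clientip"], "request_count": entry["request_count"]}


# Equivalent of ORDER BY request_count DESC on the flattened rows.
rows = sorted(unnest_items(result), key=lambda r: r["request_count"], reverse=True)
for row in rows:
    print(json.dumps(row))
```

Each printed line corresponds to one row of the final output, which is why the unnested form is easier to chart or join than the single nested object.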
??? info "The Space-Saving Algorithm Explained:"
The Space-Saving algorithm enables efficient top-K queries on high-cardinality data by limiting memory usage during distributed query execution. This approach trades exact precision for system stability and performance. <br>
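As a rough illustration of the memory bound, the following is a sketch of the textbook Space-Saving algorithm on a single node. It is not the engine's actual implementation, which this page does not show; the function names `space_saving` and `top_k` and the `capacity` parameter are assumptions made for the example.

```python
def space_saving(stream, capacity):
    """Approximate item counts using at most `capacity` counters."""
    counters = {}  # item -> [estimated_count, max_overestimate]
    for item in stream:
        if item in counters:
            counters[item][0] += 1
        elif len(counters) < capacity:
            counters[item] = [1, 0]
        else:
            # Evict the item with the smallest count. The newcomer inherits
            # that count plus one, so its estimate may exceed its true count
            # by at most the evicted count.
            victim = min(counters, key=lambda k: counters[k][0])
            floor = counters.pop(victim)[0]
            counters[item] = [floor + 1, floor]
    return counters


def top_k(counters, k):
    """Return the k items with the largest estimated counts."""
    ranked = sorted(counters.items(), key=lambda kv: kv[1][0], reverse=True)
    return [(item, count) for item, (count, _err) in ranked[:k]]


stream = ["a"] * 5 + ["b"] * 3 + ["c", "d", "a", "b"]
summary = space_saving(stream, capacity=3)
print(top_k(summary, k=2))
```

Memory stays fixed at `capacity` counters no matter how many distinct values appear, and the price is that rarely seen items can be overestimated. That is the precision-for-stability trade described above.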
**Problem Statement** <br>
@@ -481,6 +525,49 @@ SELECT approx_topk_distinct(clientip, clientas, 3) FROM "default" ORDER BY _time
```
It returns the top 3 client IP addresses that have the most unique user agents.

+ **Function result (returns an object with an array; sample shown for a top-10 query):**
+
+ ```json
+ {
+   "item": [
+     {"clientip": "192.168.1.100", "distinct_count": 1450},
+     {"clientip": "203.0.113.50", "distinct_count": 1170},
+     {"clientip": "10.0.0.5", "distinct_count": 1160},
+     {"clientip": "198.51.100.75", "distinct_count": 1040},
+     {"clientip": "172.16.0.10", "distinct_count": 1010},
+     {"clientip": "192.168.1.200", "distinct_count": 950},
+     {"clientip": "203.0.113.80", "distinct_count": 830},
+     {"clientip": "10.0.0.25", "distinct_count": 810},
+     {"clientip": "172.16.0.30", "distinct_count": 790},
+     {"clientip": "192.168.1.150", "distinct_count": 690}
+   ]
+ }
+ ```
+
+ **Use `unnest()` to extract usable results:**
+
+ ```sql
+ SELECT item.clientip as clientip, item.distinct_count as distinct_count
+ FROM (
+     SELECT unnest(approx_topk_distinct(clientip, clientas, 10)) as item
+     FROM "default"
+ )
+ ORDER BY distinct_count DESC
+ ```
+ **Final output (individual rows):**
+
+ ```json
+ {"clientip":"192.168.1.100","distinct_count":1450}
+ {"clientip":"203.0.113.50","distinct_count":1170}
+ {"clientip":"10.0.0.5","distinct_count":1160}
+ {"clientip":"198.51.100.75","distinct_count":1040}
+ {"clientip":"172.16.0.10","distinct_count":1010}
+ {"clientip":"192.168.1.200","distinct_count":950}
+ {"clientip":"203.0.113.80","distinct_count":830}
+ {"clientip":"10.0.0.25","distinct_count":810}
+ {"clientip":"172.16.0.30","distinct_count":790}
+ {"clientip":"192.168.1.150","distinct_count":690}
+ ```
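For intuition about what `approx_topk_distinct` estimates, the sketch below computes the exact answer on a tiny in-memory sample: group by one field, count distinct values of another, and keep the top K. The function `exact_topk_distinct` and the sample records are hypothetical. The field names `clientip` and `clientas` are taken from the query above.

```python
from collections import defaultdict


def exact_topk_distinct(records, key_field, distinct_field, k):
    """Exact baseline: top-k keys ranked by number of distinct values seen."""
    seen = defaultdict(set)  # key -> set of distinct values
    for rec in records:
        seen[rec[key_field]].add(rec[distinct_field])
    ranked = sorted(seen.items(), key=lambda kv: len(kv[1]), reverse=True)
    return [{key_field: key, "distinct_count": len(vals)} for key, vals in ranked[:k]]


records = [
    {"clientip": "10.0.0.5", "clientas": "curl"},
    {"clientip": "10.0.0.5", "clientas": "curl"},  # duplicate, counted once
    {"clientip": "10.0.0.5", "clientas": "firefox"},
    {"clientip": "192.168.1.100", "clientas": "chrome"},
    {"clientip": "192.168.1.100", "clientas": "safari"},
    {"clientip": "192.168.1.100", "clientas": "edge"},
    {"clientip": "172.16.0.10", "clientas": "curl"},
]
top2 = exact_topk_distinct(records, "clientip", "clientas", k=2)
print(top2)
```

The exact version keeps a full set per key, which is the cost that grows with cardinality. An approximate function gives up that exactness in exchange for bounded memory.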
??? info "The HyperLogLog Algorithm Explained:"
**Problem Statement**