Examples
Let's say we have a dataset that represents client page loads, with the fields session_id, page, browser, country, and load_time. With sybil, you could run all of the following queries (a sketch of the ingest step follows the list):
- make a time series of the avg (or median) page load time (grouped by browser or country)
- show a table of visitor counts by browser and country
- show the distribution (and percentiles) of page load times grouped by country
- show how many unique visitors visit our site per hour, day, week, etc.
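As a concrete starting point, here is a minimal sketch of the ingest side, assuming (as in sybil's README) that `sybil ingest -table NAME` reads newline-delimited JSON records from stdin. The field values and the `time` column name are illustrative:

```python
import json
import random
import subprocess
import time

def page_load_event():
    # One JSON object per page load; fields match the example schema.
    return {
        "session_id": f"s-{random.randrange(10**6)}",
        "page": random.choice(["/", "/pricing", "/docs"]),
        "browser": random.choice(["chrome", "firefox", "safari"]),
        "country": random.choice(["US", "DE", "IN"]),
        "load_time": round(random.lognormvariate(5.5, 0.6)),  # ms, fake data
        "time": int(time.time()),  # integer timestamp column
    }

records = "\n".join(json.dumps(page_load_event()) for _ in range(1000))

# sybil ingest reads newline-delimited JSON from stdin.
subprocess.run(["sybil", "ingest", "-table", "page_loads"],
               input=records.encode(), check=True)
```

A time series of average load time by browser would then be a query along the lines of `sybil query -table page_loads -group browser -int load_time -time` (exact flags are from memory of sybil's CLI; check the `sybil query` help for your build).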
We can also log every website our browser visits into sybil, giving us a more detailed and searchable browser history. By importing data out of Chrome (see the extraction sketch after the list), we can learn things like:
- how many sites do we visit per day?
- what is our frequency and usage of a particular site (by visits or time spent)?
- how long do we spend in our browser per day?
- what are our most visited websites?
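Here is a minimal extraction sketch for the Chrome case. The profile path is the Linux default and is an assumption (adjust for your OS and profile); the schema details are Chrome's actual history format: a `urls` table joined to a `visits` table, with visit times stored as microseconds since 1601-01-01:

```python
import json
import os
import shutil
import sqlite3
import subprocess
import tempfile

# Chrome keeps its History database locked while running, so work on a copy.
# Default Linux profile path; adjust for your OS and profile.
src = os.path.expanduser("~/.config/google-chrome/Default/History")
copy = os.path.join(tempfile.mkdtemp(), "History")
shutil.copy(src, copy)

conn = sqlite3.connect(copy)
rows = conn.execute("""
    SELECT urls.url, urls.title, visits.visit_time
    FROM visits JOIN urls ON visits.url = urls.id
""")

def to_unix(webkit_us):
    # Chrome stores visit times as microseconds since 1601-01-01;
    # 11644473600 seconds separate that epoch from the Unix epoch.
    return webkit_us // 1_000_000 - 11_644_473_600

records = "\n".join(
    json.dumps({"url": url, "title": title or "", "time": to_unix(ts)})
    for url, title, ts in rows)

subprocess.run(["sybil", "ingest", "-table", "browser_history"],
               input=records.encode(), check=True)
```

With one row per visit in place, visits per day, per-site frequency, and time-of-day patterns fall out of ordinary grouped queries.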
Aside from instrumenting our own web browsing, it's possible to instrument all (or some) of the actions in a browser on a web property we own. For example, if we own foo.com, it's trivial to add a tracking script that sends back the following (a collector sketch follows the list):
- arbitrary user actions, including all clicks and their related metadata
- page performance stats (load time, etc)
- time spent, including focused vs. idle time
- navigation between pages
- etc
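The tracking script itself is ordinary browser JavaScript (e.g. posting events with `navigator.sendBeacon`); on the receiving end, a collector can be as small as the following sketch, which accepts one JSON event per POST and forwards it to sybil. The endpoint, port, and table name are all hypothetical:

```python
import json
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

class Collector(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON event sent by the tracking script.
        body = self.rfile.read(int(self.headers["Content-Length"]))
        event = json.loads(body)
        # Forward it to sybil as a single newline-delimited JSON record.
        subprocess.run(["sybil", "ingest", "-table", "user_actions"],
                       input=(json.dumps(event) + "\n").encode(), check=True)
        self.send_response(204)
        self.end_headers()

HTTPServer(("", 8080), Collector).serve_forever()
```

Spawning one ingest per event is only fine for a toy; a real collector would batch events and flush them to sybil periodically.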
Most web properties do this (via GA or others) and perhaps more, like collecting your demographics, the other websites you visit, and other advertiser-relevant information. It is unfortunate and privacy-invading, but it's the world we live in. A Chrome extension will hopefully block third-party trackers, but it's unlikely to block the website owner from tracking you (if they write their own custom scripts) without disabling JavaScript.
Another fun thing to log is process stats for long-term analysis; this is the historical equivalent of running something like top. Using pidstat (which scans the /proc/ dir), we can log the pid, args, command, memory size, CPU %, etc. This lets us run queries like the following (a logging sketch follows the list):
- how much memory do all my node server processes use over time?
- which process consumes the most CPU over time?
- which processes fight for CPU?
- is there any process that spins up overnight and does lots of work when I'm not looking?
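A minimal snapshot logger, reading /proc directly rather than shelling out to pidstat, might look like the sketch below. The table name and the choice to log cumulative CPU ticks (rather than an instantaneous CPU %) are assumptions; run it from cron or a loop to build up history:

```python
import json
import os
import subprocess
import time

def snapshot():
    """One record per live process: pid, args, RSS in kB, cumulative CPU ticks."""
    now = int(time.time())
    records = []
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/stat") as f:
                # Split off "pid (comm)" so fields index cleanly even when
                # the command name itself contains spaces or parentheses.
                fields = f.read().rsplit(")", 1)[1].split()
            with open(f"/proc/{pid}/cmdline", "rb") as f:
                args = f.read().replace(b"\0", b" ").decode().strip()
            with open(f"/proc/{pid}/status") as f:
                rss_kb = next((int(line.split()[1]) for line in f
                               if line.startswith("VmRSS:")), 0)
        except OSError:
            continue  # process exited mid-scan
        # utime and stime are stat fields 14 and 15; after dropping
        # "pid (comm)" they sit at indexes 11 and 12.
        cpu_ticks = int(fields[11]) + int(fields[12])
        records.append({"pid": int(pid), "args": args, "rss_kb": rss_kb,
                        "cpu_ticks": cpu_ticks, "time": now})
    return records

data = "\n".join(json.dumps(r) for r in snapshot())
subprocess.run(["sybil", "ingest", "-table", "procstats"],
               input=data.encode(), check=True)
```

Since cpu_ticks is cumulative, CPU usage over an interval is the delta between consecutive snapshots for the same pid.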
Logging all this info and running dynamic queries over it gives us the ability to zoom into our CPU and memory graphs on a per-process basis. It's amazing what you can see when you start digging through these graphs.
Sybil is not limited to web datasets: it's useful for modeling and understanding any scenario that involves multi-dimensional data, for example advertising impressions and funnels, content popularity, ops monitoring, site performance, bug tracking, IoT, hardware sensors, and more. As long as your data can be modeled as events in time, it can be stored and analyzed with sybil.