diff --git a/docs/grammar/parameters/index.html b/docs/grammar/parameters/index.html index 1e878b7..3cb67fc 100644 --- a/docs/grammar/parameters/index.html +++ b/docs/grammar/parameters/index.html @@ -1850,7 +1850,7 @@

Selection parametersexample gallery with slight modifications (GenomeSpy provides no "bar" mark). The diff --git a/docs/search/search_index.json b/docs/search/search_index.json index 906f36d..6b828de 100644 --- a/docs/search/search_index.json +++ b/docs/search/search_index.json @@ -1 +1 @@ -{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Introduction","text":"

GenomeSpy is a toolkit for interactive visualization of genomic and other data. It enables tailored visualizations by providing a declarative grammar, which allows for mapping data to visual channels (position, color, etc.) and composing complex visualizations from primitive graphical marks (points, rectangles, etc.). The grammar is heavily inspired by Vega-Lite, providing partial compatibility and extending it with features essential in genome visualization.

The visualizations are rendered using a carefully crafted WebGL-based engine, enabling fluid interaction and smooth animation for datasets comprising several million data points. The high interactive performance is achieved using GPU shader programs for all scale transformations and rendering of marks. However, shaders are an implementation detail hidden from the end users.

The toolkit comprises two JavaScript packages:

  1. The core library implements the visualization grammar and rendering engine and can be embedded in web pages or applications.
  2. The app extends the core library with support for interactive analysis of large sample collections. It broadens the grammar by introducing a facet operator that repeats a visualization for multiple samples. The app also provides interactions for filtering, sorting, and grouping these samples.

Check the Getting Started page to get started with GenomeSpy and make your own tailored visualizations.

"},{"location":"#an-interactive-example","title":"An interactive example","text":"

The example below is interactive. You can zoom in using the mouse wheel.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 200000, \"as\": \"x\" }\n  },\n  \"transform\": [\n    { \"type\": \"formula\", \"expr\": \"random() * 0.682\", \"as\": \"u\" },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"((datum.u % 1e-8 > 5e-9 ? 1 : -1) * (sqrt(-log(max(1e-9, datum.u))) - 0.618)) * 1.618 + sin(datum.x / 10000)\",\n      \"as\": \"y\"\n    }\n  ],\n  \"mark\": {\n    \"type\": \"point\",\n    \"geometricZoomBound\": 10.5\n  },\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\", \"scale\": { \"zoom\": true } },\n    \"y\": { \"field\": \"y\", \"type\": \"quantitative\" },\n    \"size\": { \"value\": 200 },\n    \"opacity\": { \"value\": 0.6 }\n  }\n}\n
"},{"location":"#about","title":"About","text":"

GenomeSpy is developed by Kari Lavikka in The Systems Biology of Drug Resistance in Cancer group at the University of Helsinki.

This project has received funding from the European Union's Horizon 2020 Research and Innovation Programme under Grant agreement No. 667403 (HERCULES) and No. 965193 (DECIDER).

"},{"location":"api/","title":"JavaScript API","text":"

The public JavaScript API is currently quite minimal.

"},{"location":"api/#embedding","title":"Embedding","text":"

See the getting started page.

"},{"location":"api/#the-api","title":"The API","text":"

The embed function returns a promise that resolves into an object that provides the current public API. The API is documented in the interface definition.

For practical examples on using the API, check the embed-examples package.

"},{"location":"api/#embed-options","title":"Embed options","text":"

The embed function accepts an optional options object.

"},{"location":"api/#named-data-provider","title":"Named data provider","text":"

See the API definition.

"},{"location":"api/#custom-tooltip-handlers","title":"Custom tooltip handlers","text":"

GenomeSpy provides two built-in tooltip handlers.

The default handler displays the underlying datum's properties in a table. Property names starting with an underscore are omitted. The values are formatted nicely.

The refseqgene handler fetches a summary description for a gene symbol using the Entrez API. For an example, check the RefSeq gene track in this notebook.

Handlers are functions that receive the hovered mark's underlying datum and return a promise that resolves into a string, HTMLElement, or lit-html TemplateResult.

The function signature:

export type TooltipHandler = (\n  datum: Record<string, any>,\n  mark: Mark,\n  /** Optional parameters from the view specification */\n  params?: Record<string, any>\n) => Promise<string | TemplateResult | HTMLElement>;\n

Use the tooltipHandlers option to register custom handlers or override the default. See the example below.

"},{"location":"api/#examples","title":"Examples","text":"

Overriding the default handler:

import { html } from \"lit-html\";\n\nconst options = {\n  tooltipHandlers: {\n    default: async (datum, mark, params) =>\n      html`\n        The datum has\n        <strong>${Object.keys(datum).length}</strong> attributes!\n      `,\n  },\n};\n\nembed(container, spec, options);\n

To use a specific (custom) handler in a view specification:

{\n  \"mark\": {\n    \"type\": \"point\",\n    \"tooltip\": {\n      \"handler\": \"myhandler\",\n      \"params\": {\n        \"custom\": \"param\"\n      }\n    }\n  },\n  ...\n}\n
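
As a minimal sketch of the registration side, the handler referenced as myhandler above could be defined as follows. The formatDatum helper and the output format are hypothetical and chosen only for illustration; the third argument carries the params object from the view specification.

```javascript
// Hypothetical helper: builds the tooltip text from the datum and the
// params given in the view specification.
function formatDatum(datum, params = {}) {
  const fields = Object.keys(datum)
    // Like the default handler, skip names starting with an underscore.
    .filter((key) => !key.startsWith('_'))
    .map((key) => `${key}: ${datum[key]}`)
    .join(', ');
  return params.custom ? `[${params.custom}] ${fields}` : fields;
}

// Handlers must return a promise; an async function takes care of that.
const options = {
  tooltipHandlers: {
    myhandler: async (datum, mark, params) => formatDatum(datum, params),
  },
};

// The options object is then passed as the third argument:
// embed(container, spec, options);
```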
"},{"location":"getting-started/","title":"Getting Started","text":"

GenomeSpy is a visualization toolkit for genomic data. More specifically, it is a JavaScript library that can be used to create interactive visualizations of genomic data in web browsers. To visualize data with GenomeSpy, you need to:

  1. Have some data to be visualized
  2. Write or find a visualization specification that describes how the data should be visualized
  3. Embed GenomeSpy into a web page and initialize it with the specification and the data
  4. Open the web page with your web browser

However, there are three ways to get started quickly with GenomeSpy visualizations: the Playground app, Observable notebooks, and embedding GenomeSpy on HTML pages. More advanced users can use GenomeSpy as a visualization library in web applications.

"},{"location":"getting-started/#playground","title":"Playground","text":"

The easiest way to try out GenomeSpy is the Playground app, which allows you to experiment with different visualization specifications directly in your web browser. You can load data from publicly accessible web servers or from your computer. The app is still rudimentary and does not support saving or sharing visualizations.

"},{"location":"getting-started/#observable-notebooks","title":"Observable notebooks","text":"

You can embed GenomeSpy into an Observable notebook. Please check the GenomeSpy collection for usage examples.

"},{"location":"getting-started/#local-or-remote-web-server","title":"Local or remote web server","text":"

For more serious work, you should use the GenomeSpy JavaScript library to create a web page for the visualization:

  1. Create an HTML document (web page) by using the example below
  2. Place the visualization spec and your data files into the same directory as the HTML document
  3. Copy them onto a remote web server or start a local web server in the directory
"},{"location":"getting-started/#local-web-server","title":"Local web server","text":"

Python comes with an HTTP server module that can be started from the command line:

python3 -m http.server --bind 127.0.0.1\n

By default, it serves files from the current working directory. See Python's documentation for details.

"},{"location":"getting-started/#html-template","title":"HTML template","text":"

The templates below load the GenomeSpy JavaScript library from a content delivery network. Because the specification schema and the JavaScript API are not yet 100% stable, it is recommended to use a specific version.

The embed function initializes a visualization into the HTML element given as the first parameter using the specification given as the second parameter. The function returns a promise that resolves into an object that provides the current public API. For details, see the API Documentation.

Check the latest version!

The versions in the examples below may be slightly out of date. The current version is:

"},{"location":"getting-started/#load-the-spec-from-a-file","title":"Load the spec from a file","text":"

This template loads the spec from a separate spec.json file.

<!DOCTYPE html>\n<html>\n  <head>\n    <title>GenomeSpy</title>\n  </head>\n  <body>\n    <script\n      type=\"text/javascript\"\n      src=\"https://cdn.jsdelivr.net/npm/@genome-spy/core@0.37.x\"\n    ></script>\n\n    <script>\n      genomeSpyEmbed.embed(document.body, \"spec.json\", {});\n    </script>\n  </body>\n</html>\n
"},{"location":"getting-started/#embed-the-spec-in-the-html-document","title":"Embed the spec in the HTML document","text":"

You can alternatively provide the specification as a JavaScript object.

<!DOCTYPE html>\n<html>\n  <head>\n    <title>GenomeSpy</title>\n  </head>\n  <body>\n    <script\n      type=\"text/javascript\"\n      src=\"https://cdn.jsdelivr.net/npm/@genome-spy/core@0.37.x\"\n    ></script>\n\n    <script>\n      const spec = {\n        data: {\n          sequence: { start: 0, stop: 6.284, step: 0.39269908169, as: \"x\" },\n        },\n        transform: [{ type: \"formula\", expr: \"sin(datum.x)\", as: \"sin\" }],\n        mark: \"point\",\n        encoding: {\n          x: { field: \"x\", type: \"quantitative\" },\n          y: { field: \"sin\", type: \"quantitative\" },\n        },\n      };\n\n      genomeSpyEmbed.embed(document.body, spec, {});\n    </script>\n  </body>\n</html>\n
"},{"location":"getting-started/#genomespyapp-website-examples","title":"Genomespy.app website examples","text":"

The examples on the genomespy.app main page are stored in the website-examples GitHub repository. You can clone the repository and launch the examples locally for further experimentation.

"},{"location":"getting-started/#using-genomespy-as-a-visualization-library-in-web-applications","title":"Using GenomeSpy as a visualization library in web applications","text":"

The @genome-spy/core NPM package contains a bundled library that can be used on web pages as shown in the examples above. In addition, it contains the source code in ESM format, allowing use with bundlers such as Vite and Webpack. For examples of such use, see:

"},{"location":"license/","title":"License","text":"

MIT License

Copyright (c) 2018-2023 Kari Lavikka

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

"},{"location":"license/#contains-code-from","title":"Contains Code From","text":""},{"location":"license/#vega-and-vega-lite","title":"Vega and Vega-Lite","text":"

Copyright (c) 2015, University of Washington Interactive Data Lab. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

"},{"location":"genomic-data/","title":"Working with Genomic Data","text":"

GenomeSpy provides various features that are specifically designed for working with genomic data.

"},{"location":"genomic-data/#loading-genomic-data","title":"Loading Genomic Data","text":"

While GenomeSpy can load data from various sources, such as CSV and JSON files, genomic data is often stored in specialized file formats, such as Indexed FASTA, BigWig, and BigBed. GenomeSpy provides built-in support for these formats, allowing you to load and visualize genomic data without the need for additional tools or libraries.

"},{"location":"genomic-data/#handling-genomic-coordinates","title":"Handling Genomic Coordinates","text":"

Genomic data is typically associated with genomic coordinates comprising chromosome names and positions within the chromosomes. GenomeSpy provides various techniques for working with such coordinates, such as transforming between different coordinate systems and visualizing data in the context of a reference genome.

"},{"location":"genomic-data/#data-transformations","title":"Data Transformations","text":"

Specialized transformations, such as folding tabular data, calculating coverage, and computing a piled-up layout, allow GenomeSpy to be adapted for many genomic data visualization and analysis tasks.

"},{"location":"genomic-data/#gpu-accelerated-rendering","title":"GPU-accelerated Rendering","text":"

As genomic data can be large and complex, GenomeSpy's GPU-accelerated rendering allows you to visualize, navigate, and explore large datasets with high performance.

"},{"location":"genomic-data/examples/","title":"Practical Genomic Data Examples","text":""},{"location":"genomic-data/examples/#observable-notebooks","title":"Observable notebooks","text":"

The ASCAT Copy-Number Segmentation notebook provides a comprehensive and fully documented example of using GenomeSpy with genomic data.

The Annotation Tracks notebook explains how to implement a chromosome ideogram and a fancy gene annotation track.

"},{"location":"genomic-data/examples/#website-examples","title":"Website examples","text":"

The genomespy.app main page showcases several examples, some of which focus on genomic data.

"},{"location":"genomic-data/genomic-coordinates/","title":"Genomic Coordinates","text":"

To allow easy visualization of coordinate-based genomic data, GenomeSpy can concatenate the discrete chromosomes onto a single continuous linear axis. Concatenation needs the sizes and preferred order for the contigs or chromosomes. These are usually provided with a genome assembly.

To activate support for genomic coordinates, add the genome property with the name of the assembly to the top level view specification:

{\n  \"genome\": {\n    \"name\": \"hg38\"\n  },\n  ...\n}\n

Only a single genome assembly

Currently, a visualization may have only a single globally configured genome assembly. Different assemblies for different scales (for x and y axes, for example) will be supported in the future.

"},{"location":"genomic-data/genomic-coordinates/#supported-genomes","title":"Supported genomes","text":"

GenomeSpy bundles a few common built-in genome assemblies: \"hg38\", \"hg19\", \"hg18\", \"mm10\", \"mm9\", and \"dm6\".

"},{"location":"genomic-data/genomic-coordinates/#custom-genomes","title":"Custom genomes","text":"

Custom genome assemblies can be provided in two ways: as a chrom.sizes file or within the specification.

"},{"location":"genomic-data/genomic-coordinates/#as-a-chromsizes-file","title":"As a chrom.sizes file","text":"

The chrom.sizes file is a two-column text file with the chromosome names and their sizes. You may want to use the UCSC Genome Browser's fetchChromSizes script to download the sizes for a genome assembly. GenomeSpy does not filter out any alternative contigs or haplotypes, so you may want to preprocess the file before using it.

Example:

{\n  \"genome\": {\n    \"name\": \"hg19\",\n    \"url\": \"https://genomespy.app/data/genomes/hg19/chrom.sizes\"\n  },\n  ...\n}\n
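
If the file needs preprocessing, the two-column format described above is straightforward to handle in a few lines. The sketch below is not part of GenomeSpy; parseChromSizes and dropAlternativeContigs are hypothetical helpers, and the underscore test is only a heuristic that assumes UCSC-style names for alternative contigs.

```javascript
// Parses the text of a chrom.sizes file into an array of { name, size }
// objects. Assumes whitespace-separated columns and ignores blank lines.
function parseChromSizes(text) {
  return text
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const [name, size] = line.split(/\s+/);
      return { name, size: Number(size) };
    });
}

// Heuristic filter (an assumption): keep only the contigs whose names
// contain no underscore, dropping names such as chr1_gl000191_random.
function dropAlternativeContigs(contigs) {
  return contigs.filter((contig) => !contig.name.includes('_'));
}
```

The resulting array has the same shape as the contigs property of the genome object.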
"},{"location":"genomic-data/genomic-coordinates/#within-the-specification","title":"Within the specification","text":"

You can provide the genome assembly directly in the specification using the contigs property. The contigs are an array of objects with the name and size properties.

Example:

{\n  \"genome\": {\n    \"name\": \"dm6\",\n    \"contigs\": [\n      {\"name\": \"chr3R\", \"size\": 32079331 },\n      {\"name\": \"chr3L\", \"size\": 28110227 },\n      {\"name\": \"chr2R\", \"size\": 25286936 },\n      {\"name\": \"chrX\",  \"size\": 23542271 },\n      {\"name\": \"chr2L\", \"size\": 23513712 },\n      {\"name\": \"chrY\",  \"size\": 3667352 },\n      {\"name\": \"chr4\",  \"size\": 1348131 }\n    ]\n  },\n  ...\n}\n
"},{"location":"genomic-data/genomic-coordinates/#encoding-genomic-coordinates","title":"Encoding genomic coordinates","text":"

When a genome assembly has been specified, you can encode the genomic coordinates conveniently by specifying the chromosome (chrom) and position (pos) fields as follows:

{\n  ...,\n  \"encoding\": {\n    \"x\": {\n      \"chrom\": \"Chr\",\n      \"pos\": \"Pos\",\n      \"offset\": -1.0,\n      \"type\": \"locus\"\n    },\n    ...\n  }\n}\n

The example above specifies that the chromosome is read from the \"Chr\" field and the intra-chromosomal position from the \"Pos\" field. The \"locus\" data type pairs the channel with a \"locus\" scale, which provides a chromosome-aware axis. However, you can also use the field property with the locus data type if the coordinate has already been linearized. The offset property is explained below.

What happens under the hood

When the chrom and pos properties are used in channel definitions, GenomeSpy inserts an implicit linearizeGenomicCoordinate transformation into the data flow. The transformation introduces a new field with the linearized coordinate for the (chromosome, position) pair. The channel definition is modified to use the new field.

In some cases, you may want to insert an explicit transformation into the data flow to have better control over its behavior.
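
To make the idea of linearization concrete, the sketch below concatenates contigs onto a single axis and maps a (chromosome, position) pair to one number. It is an illustration under simple assumptions, not GenomeSpy's actual implementation; buildContigStarts and linearize are hypothetical names.

```javascript
// Contig sizes in the preferred order (a subset of the dm6 assembly).
const contigs = [
  { name: 'chr3R', size: 32079331 },
  { name: 'chr3L', size: 28110227 },
  { name: 'chr2R', size: 25286936 },
];

// Builds a lookup from contig name to its start on the concatenated axis.
function buildContigStarts(contigs) {
  const starts = new Map();
  let cumulative = 0;
  for (const { name, size } of contigs) {
    starts.set(name, cumulative);
    cumulative += size;
  }
  return starts;
}

// Maps a (chromosome, position) pair to a single linear coordinate.
// The optional offset is added to the final coordinate.
function linearize(starts, chrom, pos, offset = 0) {
  const start = starts.get(chrom);
  if (start === undefined) {
    throw new Error(`Unknown contig: ${chrom}`);
  }
  return start + pos + offset;
}

const starts = buildContigStarts(contigs);
// chr3L begins where chr3R ends, so this is 32079331 + 100.
const linear = linearize(starts, 'chr3L', 100);
```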

"},{"location":"genomic-data/genomic-coordinates/#coordinate-counting","title":"Coordinate counting","text":"

The offset property allows for aligning and adjusting for different coordinate notations: zero or one based, closed or half-open. The offset is added to the final coordinate.

GenomeSpy's \"locus\" scale expects half-open, zero-based coordinates.

Read more about coordinates at the UCSC Genome Browser Blog.
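
As a worked example of the conversion, consider data that uses one-based, closed coordinates. Only the start position needs adjusting to reach the zero-based, half-open form. The helper and the field names (startpos, endpos) below are illustrative assumptions.

```javascript
// Converts a one-based, closed interval [start, end] into the zero-based,
// half-open form [start - 1, end). The end coordinate stays unchanged.
function toZeroBasedHalfOpen(start, end) {
  return { start: start - 1, end };
}

// The first 100 bases of a chromosome: 1..100 becomes 0..100, and
// end - start now equals the length of the segment.
const segment = toZeroBasedHalfOpen(1, 100);

// The same adjustment expressed in an encoding: an offset of -1 on the
// start coordinate and no offset on the end coordinate.
const encoding = {
  x: { chrom: 'chrom', pos: 'startpos', offset: -1, type: 'locus' },
  x2: { chrom: 'chrom', pos: 'endpos' },
};
```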

"},{"location":"genomic-data/genomic-coordinates/#examples","title":"Examples","text":""},{"location":"genomic-data/genomic-coordinates/#point-features","title":"Point features","text":"

Point features cover a single position on a chromosome. An example of a point feature is a single nucleotide variant (SNV), where a nucleotide has been replaced by another.

{\n  \"genome\": { \"name\": \"hg38\" },\n  \"data\": {\n    \"values\": [\n      { \"chrom\": \"chr3\", \"pos\": 134567890 },\n      { \"chrom\": \"chr4\", \"pos\": 123456789 },\n      { \"chrom\": \"chr9\", \"pos\": 34567890 }\n    ]\n  },\n  \"mark\": \"point\",\n  \"encoding\": {\n    \"x\": {\n      \"chrom\": \"chrom\",\n      \"pos\": \"pos\",\n      \"type\": \"locus\"\n    }\n  }\n}\n
"},{"location":"genomic-data/genomic-coordinates/#segment-features","title":"Segment features","text":"

Segment features cover a range of positions on a chromosome. They are defined by their two end positions. An example of a segment feature is a copy number variant (CNV), where a region of the genome has been duplicated or deleted.

{\n  \"genome\": { \"name\": \"hg38\" },\n  \"data\": {\n    \"values\": [\n      { \"chrom\": \"chr3\", \"startpos\": 100000000, \"endpos\": 140000000 },\n      { \"chrom\": \"chr4\", \"startpos\": 70000000, \"endpos\": 170000000 },\n      { \"chrom\": \"chr9\", \"startpos\": 50000000, \"endpos\": 70000000 }\n    ]\n  },\n  \"mark\": \"rect\",\n  \"encoding\": {\n    \"x\": {\n      \"chrom\": \"chrom\",\n      \"pos\": \"startpos\",\n      \"type\": \"locus\"\n    },\n    \"x2\": {\n      \"chrom\": \"chrom\",\n      \"pos\": \"endpos\"\n    }\n  }\n}\n
"},{"location":"grammar/","title":"Visualization Grammar","text":"

Genome browser applications typically couple the visual representations to specific file formats and provide few customization options. GenomeSpy has a more abstract approach to visualization, providing combinatorial building blocks such as marks, transformations, and scales. As a result, users can author tailored visualizations that display the underlying data more effectively.

The concept was first introduced in The Grammar of Graphics and developed further in ggplot2 and Vega-Lite.

A dialect of Vega-Lite

The visualization grammar of GenomeSpy is a dialect of Vega-Lite, providing partial compatibility. However, the goals of GenomeSpy and Vega-Lite are different \u2013 GenomeSpy is more domain-specific and primarily intended for the visualization and analysis of large datasets containing genomic coordinates. Nevertheless, GenomeSpy tries to follow Vega-Lite's grammar where practical, and thus, this documentation has several references to its documentation.

"},{"location":"grammar/#a-single-view-specification","title":"A single view specification","text":"

Each view specification must have at least the data to be visualized, the mark that will represent the data items, and an encoding that specifies how the fields of data are mapped to the visual channels of the mark. In addition, optional transform steps allow for modifying the data before they are encoded into mark instances.

{\n  \"data\": { \"url\": \"sincos.csv\" },\n  \"transform\": [\n    { \"type\": \"formula\", \"expr\": \"abs(datum.sin)\", \"as\": \"abs(sin)\" }\n  ],\n  \"mark\": \"point\",\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n    \"y\": { \"field\": \"abs(sin)\", \"type\": \"quantitative\" },\n    \"size\": { \"field\": \"x\", \"type\": \"quantitative\" }\n  }\n}\n
"},{"location":"grammar/#properties","title":"Properties","text":"aggregateSamples

Type: array

Specifies views that aggregate multiple samples within the GenomeSpy App.

baseUrl

Type: string

The base URL for relative URL data sources and URL imports. The base URLs are inherited in the view hierarchy unless overridden with this property. By default, the top-level view's base URL equals the visualization specification's base URL.

configurableVisibility

Type: boolean

Whether the visibility is configurable interactively from the GenomeSpy App. Configurability requires that the view has an explicitly specified name that is unique within the view hierarchy.

Default: false for children of layer, true for others.

data

Type: UrlData | InlineData | NamedData | DynamicCallbackData | LazyData | Generator

Specifies a data source. If omitted, the data source is inherited from the parent view.

description

Type: string | string[]

A description of the view. Can be used for documentation. The description of the top-level view is shown in the toolbar of the GenomeSpy App.

encoding

Type: Encoding

Specifies how data are encoded using the visual channels.

height

Type: SizeDef | number | Step | \"container\"

Height of the view. If a number, it is interpreted as pixels. Check child sizing for details.

Default value: \"container\"

mark Required

Type: \"rect\" | \"point\" | \"rule\" | \"text\" | \"link\" | RectProps | TextProps | RuleProps | LinkProps | PointProps

The graphical mark presenting the data objects.

name

Type: string

An internal name that can be used for referring to the view. For referencing purposes, the name should be unique within the view hierarchy.

opacity

Type: number | DynamicOpacity | ExprRef

Opacity of the view and all its children. Allows implementing semantic zooming where the layers are faded in and out as the user zooms in and out.

TODO: Write proper documentation with examples.

Default: 1.0

padding

Type: Paddings | number

Padding applied to the view. Accepts either a number representing pixels or an object specifying separate paddings for each edge.

Examples: - padding: 10 - padding: { top: 10, right: 20, bottom: 10, left: 20 }

Default value: 0

params

Type: array

Dynamic variables that parameterize a visualization.

resolve

Type: object

Specifies how scales and axes are resolved in the view hierarchy.

templates

Type: object

Templates that can be reused within the view specification by importing them with the template key.

title

Type: string | Title

View title. N.B.: Currently, GenomeSpy doesn't do bound calculation, and you need to manually specify proper padding for the view to ensure that the title is visible.

transform

Type: array

An array of transformations applied to the data before visual encoding.

view

Type: ViewBackground

The background of the view, including fill, stroke, and stroke width.

viewportHeight

Type: SizeDef | number | \"container\"

Optional viewport height of the view. If the view size exceeds the viewport height, it will be shown with scrollbars. This property implicitly enables clipping.

Default: null (same as height)

viewportWidth

Type: SizeDef | number | \"container\"

Optional viewport width of the view. If the view size exceeds the viewport width, it will be shown with scrollbars. This property implicitly enables clipping.

Default: null (same as width)

visible

Type: boolean

The default visibility of the view. An invisible view is removed from the layout and not rendered. For context, see toggleable view visibility.

Default: true

width

Type: SizeDef | number | Step | \"container\"

Width of the view. If a number, it is interpreted as pixels. Check child sizing for details.

Default: \"container\"

"},{"location":"grammar/#view-composition-for-more-complex-visualizations","title":"View composition for more complex visualizations","text":"

View composition allows for building more complex visualizations from multiple single-view specifications. For example, the layer operator allows creation of custom glyphs and the concatenation operators enable stacked layouts resembling genome browsers with multiple tracks.

"},{"location":"grammar/expressions/","title":"Expressions","text":"

Expressions allow for defining predicates or computing new variables based on existing data. The expression language is based on JavaScript, but provides only a limited set of features, guaranteeing secure execution.

Expressions can be used with the \"filter\" and \"formula\" transforms, in encoding, and in expression references for dynamic properties in marks, transforms, and data sources.

"},{"location":"grammar/expressions/#usage","title":"Usage","text":"

All basic arithmetic operators are supported:

(1 + 2) * 3 / 4\n

When using expressions within the data transformation pipeline, the current data object is available in the datum variable. Its properties (fields) can be accessed by using the dot or bracket notation:

datum.foo + 2\n

If the name of the property contains special characters such as \".\", \"!\", or \" \" (a space) the bracket notation must be used:

datum['A very *special* name!'] > 100\n
"},{"location":"grammar/expressions/#conditional-operators","title":"Conditional operators","text":"

Ternary operator:

datum.foo < 5 ? 'small' : 'large'\n

And an equivalent if construct:

if(datum.foo < 5, 'small', 'large')\n
"},{"location":"grammar/expressions/#provided-constants-and-functions","title":"Provided constants and functions","text":"

Common mathematical functions are supported:

(datum.u % 1e-8 > 5e-9 ? 1 : -1) *\n  (sqrt(-log(max(1e-9, datum.u))) - 0.618) *\n  1.618\n
"},{"location":"grammar/expressions/#constants-and-functions-from-vega","title":"Constants and functions from Vega","text":"

The following constants and functions are provided by the vega-expression package.

"},{"location":"grammar/expressions/#constants","title":"Constants","text":"

NaN, E, LN2, LN10, LOG2E, LOG10E, PI, SQRT1_2, SQRT2, MIN_VALUE, MAX_VALUE

"},{"location":"grammar/expressions/#type-checking-functions","title":"Type Checking Functions","text":"

isArray, isBoolean, isNumber, isObject, isRegExp, isString

"},{"location":"grammar/expressions/#math-functions","title":"Math Functions","text":"

isNaN, isFinite, abs, acos, asin, atan, atan2, ceil, cos, exp, floor, hypot, log, max, min, pow, random, round, sin, sqrt, tan, clamp

"},{"location":"grammar/expressions/#sequence-array-or-string-functions","title":"Sequence (Array or String) Functions","text":"

length, join, indexof, lastindexof, reverse, slice

"},{"location":"grammar/expressions/#string-functions","title":"String Functions","text":"

parseFloat, parseInt, upper, lower, replace, split, substring, trim

"},{"location":"grammar/expressions/#regexp-functions","title":"RegExp Functions","text":"

regexp, test

"},{"location":"grammar/expressions/#other-functions","title":"Other functions","text":"

# lerp(array, fraction) Provides a linearly interpolated value from the first to the last element in the given array based on the specified interpolation fraction, usually ranging from 0 to 1. For instance, lerp([0, 50], 0.5) yields 25.

# linearstep(edge0, edge1, x) Calculates a linear interpolation between 0 and 1 for a value x within the range defined by edge0 and edge1. It applies a clamp to ensure the result stays within the 0.0 to 1.0 range.

# smoothstep(edge0, edge1, x) Performs smooth Hermite interpolation between 0 and 1 for values of x that lie between edge0 and edge1. This function is particularly useful for scenarios requiring a threshold function with a smooth transition, offering a gradual rather than an abrupt change between states.
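
The three functions can be sketched in plain JavaScript as follows. This follows the descriptions above and the conventional definitions of these functions; it is not copied from GenomeSpy's source, and the clamp helper is included only to keep the block self-contained.

```javascript
// Restricts a value to the range [min, max].
function clamp(value, min, max) {
  return Math.min(Math.max(value, min), max);
}

// Interpolates linearly from the first to the last element of the array.
function lerp(array, fraction) {
  const first = array[0];
  const last = array[array.length - 1];
  return first + (last - first) * fraction;
}

// Maps x linearly from [edge0, edge1] to [0, 1], clamping the result.
function linearstep(edge0, edge1, x) {
  return clamp((x - edge0) / (edge1 - edge0), 0, 1);
}

// Hermite interpolation: like linearstep, but with zero slope at both
// edges, which gives a smooth transition.
function smoothstep(edge0, edge1, x) {
  const t = linearstep(edge0, edge1, x);
  return t * t * (3 - 2 * t);
}
```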

"},{"location":"grammar/import/","title":"Importing Views","text":"

GenomeSpy facilitates reusing views by allowing them to be imported from the same specification by name or from external specification files by a URL. The files can be placed flexibly \u2013 it may be practical to split large specifications into multiple files and place them in the same directory. On the other hand, if you have created, for example, an annotation track that you would like to share with the research community, you can upload the specification file and the associated data to a publicly accessible web server. Imported views, whether referenced by name or by URL, can be parameterized to allow for customization.

"},{"location":"grammar/import/#properties","title":"Properties","text":"import Required

Type: UrlImport | TemplateImport

The method to import a specification.

name

Type: string

The name given to the imported view. This property overrides the name specified in the imported specification.

params

Type: (VariableParameter | SelectionParameter)[] | object

Dynamic variables that parameterize a visualization. Parameters defined here override the parameters defined in the imported specification.

"},{"location":"grammar/import/#urlimport","title":"UrlImport","text":"url Required

Type: string

Imports a specification from the specified URL.

"},{"location":"grammar/import/#templateimport","title":"TemplateImport","text":"template Required

Type: string

Imports a specification from the current view hierarchy, searching first in the current view, then ascending through ancestors.

"},{"location":"grammar/import/#importing-from-a-url","title":"Importing from a URL","text":"

Views can be imported from relative and absolute URLs. Relative URLs are imported with respect to the current baseUrl.

The imported specification may contain a single, concatenated, or layered view. The baseUrl of the imported specification is updated to match the directory of the imported specification. Thus, you can publish a view (or a track, as it is known in genome browsers) by making its specification and data available in the same directory on a web server.

The URL import supports parameters, which are described below within the named templates.

Example
{\n  ...,\n  \"vconcat\": [\n    ...,\n    { \"import\": { \"url\": \"includes/annotations.json\" } },\n    { \"import\": { \"url\": \"https://example.site/tracks/annotations.json\" } }\n  ]\n}\n
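
Because the params and name properties sit next to import, a URL import can be parameterized and renamed in the same way as a named template. The sketch below assumes that the imported file declares a size parameter. That parameter and the view name smallAnnotations are hypothetical.

{\n  ...,\n  \"vconcat\": [\n    {\n      \"import\": { \"url\": \"includes/annotations.json\" },\n      \"name\": \"smallAnnotations\",\n      \"params\": { \"size\": 5 }\n    }\n  ]\n}\n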
"},{"location":"grammar/import/#repeating-with-named-templates","title":"Repeating with named templates","text":"

Instead of importing from external files, views can offer named templates for reuse by their descendants. In the example below, the provided specification features a template called \"myTrack,\" which is applied twice, each instance with a unique set of parameters. The imported view can access the parameters using expressions. This approach enables the modification of visual elements through parameter changes, streamlining the creation of varied visualizations from a single template without the need to duplicate the base specification fragment.

{\n  \"vconcat\": [\n    {\n      \"import\": {\n        \"template\": \"myTrack\"\n      },\n      \"params\": [{ \"name\": \"size\", \"value\": 5 }]\n    },\n    {\n      \"import\": {\n        \"template\": \"myTrack\"\n      },\n      \"params\": { \"offset\": 3.141, \"size\": 20 }\n    }\n  ],\n  \"templates\": {\n    \"myTrack\": {\n      \"params\": [\n        { \"name\": \"offset\", \"value\": 0 },\n        { \"name\": \"size\", \"value\": 10 }\n      ],\n      \"data\": {\n        \"sequence\": { \"start\": 0, \"stop\": 20, \"step\": 0.2, \"as\": \"x\" }\n      },\n      \"transform\": [\n        { \"type\": \"formula\", \"expr\": \"sin(datum.x + offset)\", \"as\": \"y\" }\n      ],\n      \"mark\": \"point\",\n      \"encoding\": {\n        \"size\": { \"value\": { \"expr\": \"size\" } },\n        \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n        \"y\": { \"field\": \"y\", \"type\": \"quantitative\" }\n      }\n    }\n  }\n}\n
"},{"location":"grammar/parameters/","title":"Parameters","text":"

Work in progress

This page is a work in progress and is incomplete.

Parameters enable various dynamic behaviors in GenomeSpy visualizations, such as interactive selections, conditional encoding, and data filtering with expressions. They also enable parameterization when importing specification fragments from external files or named templates. Parameters in GenomeSpy are heavily inspired by the parameters concept of Vega-Lite.

"},{"location":"grammar/parameters/#examples","title":"Examples","text":""},{"location":"grammar/parameters/#using-input-bindings","title":"Using Input Bindings","text":"

Parameters can be bound to input elements, such as sliders, dropdowns, and checkboxes. The GenomeSpy Core library shows the input elements below the visualization. In the GenomeSpy App, the input elements are shown in the View visibility menu, allowing the visualization author to provide sophisticated configuration options to the end user.

The following example shows how to bind parameters to input elements and use them to control the size, angle, and text of a text mark.

{\n  \"padding\": 0,\n  \"view\": { \"fill\": \"#cbeef3\" },\n  \"params\": [\n    {\n      \"name\": \"size\",\n      \"value\": 80,\n      \"bind\": { \"input\": \"range\", \"min\": 1, \"max\": 300 }\n    },\n    {\n      \"name\": \"angle\",\n      \"value\": 0,\n      \"bind\": { \"input\": \"range\", \"min\": 0, \"max\": 360 }\n    },\n    {\n      \"name\": \"text\",\n      \"value\": \"Params are cool!\",\n      \"bind\": {\n        \"input\": \"select\",\n        \"options\": [\"Params are cool!\", \"GenomeSpy\", \"Hello\", \"World\"]\n      }\n    }\n  ],\n\n  \"data\": { \"values\": [{}] },\n\n  \"mark\": {\n    \"type\": \"text\",\n    \"font\": \"Lobster\",\n    \"text\": { \"expr\": \"text\" },\n    \"size\": { \"expr\": \"size\" },\n    \"angle\": { \"expr\": \"angle\" }\n  }\n}\n
"},{"location":"grammar/parameters/#expressions","title":"Expressions","text":"

Parameters can be based on expressions, which can depend on other parameters. They are automatically re-evaluated when the dependent parameters change.

{\n  \"view\": { \"stroke\": \"lightgray\" },\n  \"params\": [\n    {\n      \"name\": \"A\",\n      \"value\": 2,\n      \"bind\": { \"input\": \"range\", \"min\": 0, \"max\": 10, \"step\": 1 }\n    },\n    {\n      \"name\": \"B\",\n      \"value\": 3,\n      \"bind\": { \"input\": \"range\", \"min\": 0, \"max\": 10, \"step\": 1 }\n    },\n    {\n      \"name\": \"C\",\n      \"expr\": \"A * B\"\n    }\n  ],\n\n  \"data\": { \"values\": [{}] },\n\n  \"mark\": {\n    \"type\": \"text\",\n    \"size\": 30,\n    \"text\": { \"expr\": \"'' + A + ' * ' + B + ' = ' + C\" }\n  }\n}\n
"},{"location":"grammar/parameters/#selection-parameters","title":"Selection parameters","text":"

Parameters allow for defining interactive selections, which can be used in conditional encodings. GenomeSpy compiles the conditional encoding rules into efficient GPU shader code, enabling fast interactions in very large data sets. However, currently only single-point selections are supported.

The following example has been adapted from Vega-Lite's example gallery with slight modifications (GenomeSpy provides no \"bar\" mark). The specification below is fully compatible with Vega-Lite. You can select multiple bars by holding down the Shift key.

{\n  \"description\": \"A bar chart with highlighting on hover and selecting on click. (Inspired by Tableau's interaction style.)\",\n\n  \"data\": {\n    \"values\": [\n      { \"a\": \"A\", \"b\": 28 },\n      { \"a\": \"B\", \"b\": 55 },\n      { \"a\": \"C\", \"b\": 43 },\n      { \"a\": \"D\", \"b\": 91 },\n      { \"a\": \"E\", \"b\": 81 },\n      { \"a\": \"F\", \"b\": 53 },\n      { \"a\": \"G\", \"b\": 19 },\n      { \"a\": \"H\", \"b\": 87 },\n      { \"a\": \"I\", \"b\": 52 }\n    ]\n  },\n  \"params\": [\n    {\n      \"name\": \"highlight\",\n      \"select\": { \"type\": \"point\", \"on\": \"pointerover\" }\n    },\n    { \"name\": \"select\", \"select\": \"point\" }\n  ],\n  \"mark\": {\n    \"type\": \"rect\",\n    \"fill\": \"#4C78A8\",\n    \"stroke\": \"black\"\n  },\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"a\",\n      \"type\": \"ordinal\",\n      \"scale\": { \"type\": \"band\", \"padding\": 0.2 }\n    },\n    \"y\": { \"field\": \"b\", \"type\": \"quantitative\" },\n    \"fillOpacity\": {\n      \"value\": 0.3,\n      \"condition\": { \"param\": \"select\", \"value\": 1 }\n    },\n    \"strokeWidth\": {\n      \"value\": 0,\n      \"condition\": [\n        { \"param\": \"select\", \"value\": 2, \"empty\": false },\n        { \"param\": \"highlight\", \"value\": 1, \"empty\": false }\n      ]\n    }\n  }\n}\n
"},{"location":"grammar/scale/","title":"Scale","text":"

Scales are functions that map abstract data values (e.g., a type of a point mutation) to visual values (e.g., colors that indicate the type).

By default, GenomeSpy configures scales automatically based on the data type (e.g., \"ordinal\"), the visual channel, and the data domain. As the defaults may not always be optimal, the scales can be configured explicitly.

Specifying a scale for a channel
{\n  \"encoding\": {\n    \"y\": {\n      \"field\": \"impact\",\n      \"type\": \"quantitative\",\n      \"scale\": {\n        \"type\": \"linear\",\n        \"domain\": [0, 1]\n      }\n    }\n  },\n  ...\n}\n
"},{"location":"grammar/scale/#vega-lite-scales","title":"Vega-Lite scales","text":"

GenomeSpy implements most of the scale types of Vega-Lite. The aim is to replicate their behavior identically (unless stated otherwise) in GenomeSpy. Although that has yet to fully materialize, Vega-Lite's scale documentation generally applies to GenomeSpy as well.

The supported scales are: \"linear\", \"pow\", \"sqrt\", \"symlog\", \"log\", \"ordinal\", \"band\", \"point\", \"quantize\", and \"threshold\". Disabled scale is supported on quantitative channels such as x and opacity.

Currently, the following scales are not supported: \"time\", \"utc\", \"quantile\", \"bin-linear\", \"bin-ordinal\".

Relation to Vega scales

In fact, GenomeSpy uses Vega scales, which are based on d3-scale. However, GenomeSpy has GPU-based implementations for the actual scale transformations, ensuring high rendering performance.

"},{"location":"grammar/scale/#genomespy-specific-scales","title":"GenomeSpy-specific scales","text":"

GenomeSpy provides two additional scales that are designed for molecular sequence data.

"},{"location":"grammar/scale/#index-scale","title":"Index scale","text":"

The \"index\" scale allows mapping index-based values such as nucleotide or amino-acid locations to positional visual channels. It has traits from both the continuous \"linear\" and the discrete \"band\" scale. It is linear and zoomable but maps indices to the range like the band scale does \u2013 each index has its own band. Properties such as padding work just as in the band scale.

The indices must be zero-based, i.e., the counting must start from zero. The numbering of the axis labels can be adjusted to give an impression of, for example, one-based indexing.

The index scale is used by default when the field type is \"index\".

"},{"location":"grammar/scale/#point-indices","title":"Point indices","text":"

When only the primary positional channel is defined, marks such as \"rect\" fill the whole band.

{\n  \"data\": {\n    \"values\": [0, 2, 4, 7, 8, 10, 12]\n  },\n  \"encoding\": {\n    \"x\": { \"field\": \"data\", \"type\": \"index\" }\n  },\n  \"layer\": [\n    {\n      \"mark\": \"rect\",\n      \"encoding\": {\n        \"color\": { \"field\": \"data\", \"type\": \"nominal\" }\n      }\n    },\n    {\n      \"mark\": \"text\",\n      \"encoding\": {\n        \"text\": {\n          \"field\": \"data\"\n        }\n      }\n    }\n  ]\n}\n

Marks such as \"point\" that do not support the secondary positional channel are centered.

{\n  \"data\": {\n    \"values\": [0, 2, 4, 7, 8, 10, 12]\n  },\n  \"mark\": \"point\",\n  \"encoding\": {\n    \"x\": { \"field\": \"data\", \"type\": \"index\" },\n    \"color\": { \"field\": \"data\", \"type\": \"nominal\" },\n    \"size\": { \"value\": 300 }\n  }\n}\n
"},{"location":"grammar/scale/#range-indices","title":"Range indices","text":"

When the index scale is used with ranges, e.g., a \"rect\" mark that has both the x and x2 channels defined, the ranges must be half open. For example, if a segment should cover the indices 2, 3, and 4, a half-open range would be defined as: x = 2 (inclusive), x2 = 5 (exclusive).

{\n  \"data\": {\n    \"values\": [\n      { \"from\": 0, \"to\": 2 },\n      { \"from\": 2, \"to\": 5 },\n      { \"from\": 8, \"to\": 9 },\n      { \"from\": 10, \"to\": 13 }\n    ]\n  },\n  \"encoding\": {\n    \"x\": { \"field\": \"from\", \"type\": \"index\" },\n    \"x2\": { \"field\": \"to\" }\n  },\n  \"layer\": [\n    {\n      \"mark\": \"rect\",\n      \"encoding\": {\n        \"color\": { \"field\": \"from\", \"type\": \"nominal\" }\n      }\n    },\n    {\n      \"mark\": \"text\",\n      \"encoding\": {\n        \"text\": {\n          \"expr\": \"'[' + datum.from + ', ' + datum.to + ')'\"\n        }\n      }\n    }\n  ]\n}\n
"},{"location":"grammar/scale/#adjusting-the-indexing-of-axis-labels","title":"Adjusting the indexing of axis labels","text":"

The index scale expects zero-based indexing. However, it may be desirable to display the axis labels using one-based indexing. Use the numberingOffset property to adjust the label indices.

{\n  \"data\": {\n    \"values\": [0, 2, 4, 7, 8, 10, 12]\n  },\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"data\",\n      \"type\": \"index\",\n      \"scale\": {\n        \"numberingOffset\": 1\n      }\n    }\n  },\n  \"layer\": [\n    {\n      \"mark\": \"rect\",\n      \"encoding\": {\n        \"color\": { \"field\": \"data\", \"type\": \"nominal\" }\n      }\n    },\n    {\n      \"mark\": \"text\",\n      \"encoding\": {\n        \"text\": {\n          \"field\": \"data\"\n        }\n      }\n    }\n  ]\n}\n
"},{"location":"grammar/scale/#locus-scale","title":"Locus scale","text":"

The \"locus\" scale is similar to the \"index\" scale, but provides a genome-aware axis with concatenated chromosomes. To use the locus scale, a genome must be specified.

The locus scale is used by default when the field type is \"locus\".

Note

The locus scale does not map the discrete chromosomes onto the concatenated axis. It's done by the linearizeGenomicCoordinate transform.

"},{"location":"grammar/scale/#specifying-the-domain","title":"Specifying the domain","text":"

By default, the domain of the locus scale consists of the whole genome. However, you can specify a custom domain using either linearized or genomic coordinates. A genomic coordinate consists of a chromosome (chrom) and an optional position (pos). The left bound's position defaults to zero, whereas the right bound's position defaults to the size of the chromosome. Thus, the chromosomes are inclusive.

For example, chromosomes 3, 4, and 5:

[{ \"chrom\": \"chr3\" }, { \"chrom\": \"chr5\" }]\n

Only the chromosome 3:

[{ \"chrom\": \"chr3\" }]\n

A specific region inside the chromosome 3:

[\n  { \"chrom\": \"chr3\", \"pos\": 1000000 },\n  { \"chrom\": \"chr3\", \"pos\": 2000000 }\n]\n

Somewhere inside the chromosome 1:

[1000000, 2000000]\n
"},{"location":"grammar/scale/#example","title":"Example","text":"
{\n  \"genome\": { \"name\": \"hg38\" },\n  \"data\": {\n    \"values\": [\n      { \"chrom\": \"chr3\", \"pos\": 134567890 },\n      { \"chrom\": \"chr4\", \"pos\": 123456789 },\n      { \"chrom\": \"chr9\", \"pos\": 34567890 }\n    ]\n  },\n  \"mark\": \"point\",\n  \"encoding\": {\n    \"x\": {\n      \"chrom\": \"chrom\",\n      \"pos\": \"pos\",\n      \"type\": \"locus\",\n      \"scale\": {\n        \"domain\": [{ \"chrom\": \"chr3\" }, { \"chrom\": \"chr9\" }]\n      }\n    },\n    \"size\": { \"value\": 200 }\n  }\n}\n
"},{"location":"grammar/scale/#zooming-and-panning","title":"Zooming and panning","text":"

To enable zooming and panning of continuous scales on positional channels, set the zoom scale property to true. Example:

{\n  \"x\": {\n    \"field\": \"foo\",\n    \"type\": \"quantitative\",\n    \"scale\": {\n      \"zoom\": true\n    }\n  }\n}\n

Both \"index\" and \"locus\" scales are zoomable by default.

"},{"location":"grammar/scale/#zoom-extent","title":"Zoom extent","text":"

The zoom extent allows you to control how far the scale can be zoomed out or panned (translated). Zoom extent equals the scale domain by default, except for the \"locus\" scale, where it includes the whole genome. Example:

{\n  ...,\n  \"scale\": {\n    \"domain\": [10, 20],\n    \"zoom\": {\n      \"extent\": [0, 30]\n    }\n  }\n}\n
"},{"location":"grammar/scale/#named-scales","title":"Named scales","text":"

By giving the scale a name, it can be accessed through the API.

{\n  ...,\n  \"scale\": {\n    \"name\": \"myScale\"\n  }\n}\n
"},{"location":"grammar/scale/#axes","title":"Axes","text":"

Positional channels are usually annotated with axes, which are automatically generated based on the scale type. However, you can customize the axis by specifying the axis property in the encoding block.

{\n  ...,\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"foo\",\n      \"type\": \"quantitative\",\n      \"axis\": {\n        \"title\": \"My axis title\"\n      }\n    }\n  }\n}\n

GenomeSpy implements most of Vega-Lite's axis properties. See the interface definition for supported properties. TODO: Write a proper documentation.

Grid lines

Grid lines are hidden by default in GenomeSpy and can be enabled for each view using the grid property. The default behavior will be configurable once GenomeSpy supports themes.
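
For instance, a minimal sketch that turns on grid lines for the x channel of a single view could look like the following. The field name foo is a placeholder.

{\n  ...,\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"foo\",\n      \"type\": \"quantitative\",\n      \"axis\": { \"grid\": true }\n    }\n  }\n}\n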

"},{"location":"grammar/scale/#genome-axis-for-loci","title":"Genome axis for loci","text":"

The genome axis is a special axis for the \"locus\" scale. It displays chromosome names and the intra-chromosomal coordinates. You can adjust the style of the chromosome axis and grid using various parameters.

{\n  \"genome\": { \"name\": \"hg38\" },\n  \"data\": { \"values\": [{}] },\n  \"mark\": \"point\",\n\n  \"encoding\": {\n    \"x\": {\n      \"chrom\": \"a\",\n      \"pos\": \"b\",\n      \"type\": \"locus\",\n\n      \"axis\": {\n        \"chromTickColor\": \"#5F87F5\",\n        \"chromLabelColor\": \"#E16B67\",\n\n        \"grid\": true,\n        \"gridColor\": \"gray\",\n        \"gridOpacity\": 0.5,\n        \"gridDash\": [1, 11],\n\n        \"chromGrid\": true,\n        \"chromGridDash\": [3, 3],\n        \"chromGridColor\": \"#5F87F5\",\n        \"chromGridOpacity\": 0.7,\n        \"chromGridFillEven\": \"#BEFACC\",\n        \"chromGridFillOdd\": \"#FDFCE8\"\n      }\n    }\n  }\n}\n
"},{"location":"grammar/scale/#fully-customized-axes","title":"Fully customized axes","text":"

You can also disable the genome axis and grid and specify a custom axis instead. The \"axisGenome\" data source provides the chromosomes and their sizes, which can be used to create custom axes or grids for a view.

"},{"location":"grammar/types/","title":"Types Used in the Grammar","text":"

Note

The list is still incomplete.

"},{"location":"grammar/types/#compareparams","title":"CompareParams","text":"field Required

Type: string (field name)[] | string (field name)

The field(s) to sort by

order

Type: (\"ascending\" | \"descending\")[] | \"ascending\" | \"descending\"

The order(s) to use: \"ascending\" (default), \"descending\".

"},{"location":"grammar/types/#dynamicopacity","title":"DynamicOpacity","text":"channel

Type: \"x\" | \"y\"

TODO

unitsPerPixel Required

Type: array

Stops expressed as units (base pairs, for example) per pixel.

values Required

Type: array

Opacity values that match the given stops.
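
As an illustration, the hypothetical object below pairs two stops with two opacity values, so a view would presumably be fully transparent at 100000 units per pixel and fully opaque at 40000, fading in as the user zooms in. The stop values are arbitrary, and attaching the object to a view's opacity property is an assumption based on the ViewOpacityDef type.

{\n  \"opacity\": {\n    \"channel\": \"x\",\n    \"unitsPerPixel\": [100000, 40000],\n    \"values\": [0, 1]\n  },\n  ...\n}\n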

"},{"location":"grammar/types/#exprref","title":"ExprRef","text":"expr Required

Type: string

The expression string.

"},{"location":"grammar/types/#paddings","title":"Paddings","text":"bottom

Type: number

TODO

left

Type: number

TODO

right

Type: number

TODO

top

Type: number

TODO

"},{"location":"grammar/types/#step","title":"Step","text":"step Required

Type: number

TODO

"},{"location":"grammar/types/#sizedef","title":"SizeDef","text":"grow

Type: number

Share of the remaining space. See child sizing for details.

px

Type: number

Size in pixels

"},{"location":"grammar/types/#title","title":"Title","text":"align

Type: \"left\" | \"center\" | \"right\"

Horizontal text alignment for title text. One of \"left\", \"center\", or \"right\".

anchor

Type: None | start | middle | end

The anchor position for placing the title and subtitle text. One of \"start\", \"middle\", or \"end\". For example, with an orientation of top these anchor positions map to a left-, center-, or right-aligned title.

angle

Type: number | ExprRef

Angle in degrees of title and subtitle text.

baseline

Type: \"top\" | \"middle\" | \"bottom\" | \"alphabetic\"

Vertical text baseline for title and subtitle text. One of \"alphabetic\" (default), \"top\", \"middle\", or \"bottom\".

color

Type: string | ExprRef

Text color for title text.

dx

Type: number

Delta offset for title and subtitle text x-coordinate.

dy

Type: number

Delta offset for title and subtitle text y-coordinate.

font

Type: string

Font name for title text.

fontSize

Type: number | ExprRef

Font size in pixels for title text.

fontStyle

Type: \"normal\" | \"italic\"

Font style for title text.

fontWeight

Type: number | \"thin\" | \"light\" | \"regular\" | \"normal\" | \"medium\" | \"bold\" | \"black\"

Font weight for title text. This can be either a string (e.g., \"bold\", \"normal\") or a number (100, 200, 300, ..., 900 where \"normal\" = 400 and \"bold\" = 700).

frame

Type: \"bounds\" | \"group\"

The reference frame for the anchor position, one of \"bounds\" (to anchor relative to the full bounding box) or \"group\" (to anchor relative to the group width or height).

offset

Type: number

The orthogonal offset in pixels by which to displace the title group from its position along the edge of the chart.

orient

Type: \"none\" | \"left\" | \"right\" | \"top\" | \"bottom\"

Default title orientation (\"none\", \"top\", \"bottom\", \"left\", or \"right\")

style

Type: string

A mark style property to apply to the title text mark. If not specified, a default style of \"group-title\" is applied.

text Required

Type: string | ExprRef

The title text.
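
A sketch of a Title object that combines several of the properties above is shown below. It assumes that the object is given as the title property of a view, in place of a plain string. The title text and the numeric values are illustrative.

{\n  \"title\": {\n    \"text\": \"Copy number\",\n    \"anchor\": \"start\",\n    \"orient\": \"top\",\n    \"fontSize\": 14,\n    \"fontWeight\": \"bold\",\n    \"offset\": 5\n  },\n  ...\n}\n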

"},{"location":"grammar/types/#viewbackground","title":"ViewBackground","text":"fill

Type: string | ExprRef

The fill color.

fillOpacity

Type: number | ExprRef

The fill opacity. Value between 0 and 1.

stroke

Type: string | ExprRef

The stroke color

strokeOpacity

Type: number | ExprRef

The stroke opacity. Value between 0 and 1.

strokeWidth

Type: number

The stroke width in pixels.
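
For example, a view background that uses all of the properties above could be sketched as follows. The colors and numeric values are arbitrary.

{\n  \"view\": {\n    \"fill\": \"#cbeef3\",\n    \"fillOpacity\": 0.5,\n    \"stroke\": \"lightgray\",\n    \"strokeOpacity\": 1,\n    \"strokeWidth\": 2\n  },\n  ...\n}\n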

"},{"location":"grammar/types/#viewopacitydef","title":"ViewOpacityDef","text":"

Type: number | DynamicOpacity | ExprRef

"},{"location":"grammar/composition/","title":"View Composition","text":"

GenomeSpy replicates the hierarchical composition model of Vega-Lite, and currently provides the concatenation and layer composition operators in the core library. In addition, the GenomeSpy app provides a facet operator for visualizing sample collections using a track-based layout.

The hierarchical model allows for nesting composition operators. For instance, you could have a visualization with two views side by side, and those views could contain multiple layered views. The views in the hierarchy inherit (transformed) data and encoding from their parents, and in some cases, the views may also share scales and axes with their siblings and parents. The data and encoding inherited from ancestors can always be overridden by the descendants.

"},{"location":"grammar/composition/#scale-and-axis-resolution","title":"Scale and axis resolution","text":"

Each visual channel of a view has a scale, which is either \"independent\" or \"shared\" with other views. For example, sharing the scale on the positional x channel links the zooming interactions of the participating views through the shared scale domain. The axes of positional channels can be configured similarly.

The resolve property configures the scale and axis resolutions for the view's children.

An example of a resolution configuration
{\n  \"resolve\": {\n    \"scale\": {\n      \"x\": \"shared\",\n      \"y\": \"independent\",\n      \"color\": \"independent\"\n    },\n    \"axis\": {\n      \"x\": \"shared\",\n      \"y\": \"independent\"\n    }\n  },\n  ...\n}\n
"},{"location":"grammar/composition/#shared","title":"Shared","text":"

The example below shows an excerpt of segmented copy number data layered on raw SNP logR values. The scale of the y channel is shared by default and the domain is unioned. As the x channel's scale is also shared, the zooming interaction affects both views.

{\n  \"layer\": [\n    {\n      \"data\": { \"url\": \"../data/cnv_chr19_raw.tsv\" },\n      \"title\": \"Single probe\",\n\n      \"mark\": {\n        \"type\": \"point\",\n        \"geometricZoomBound\": 9.5\n      },\n\n      \"encoding\": {\n        \"x\": { \"field\": \"Position\", \"type\": \"index\" },\n        \"y\": { \"field\": \"logR\", \"type\": \"quantitative\" },\n        \"size\": { \"value\": 225 },\n        \"opacity\": { \"value\": 0.15 }\n      }\n    },\n    {\n      \"data\": {\n        \"url\": \"../data/cnv_chr19_segs.tsv\"\n      },\n      \"title\": \"Segment mean\",\n      \"mark\": {\n        \"type\": \"rule\",\n        \"size\": 3.0,\n        \"minLength\": 3.0,\n        \"color\": \"black\"\n      },\n      \"encoding\": {\n        \"x\": { \"field\": \"startpos\", \"type\": \"index\" },\n        \"x2\": { \"field\": \"endpos\" },\n        \"y\": { \"field\": \"segMean\", \"type\": \"quantitative\" }\n      }\n    }\n  ]\n}\n
"},{"location":"grammar/composition/#independent","title":"Independent","text":"

By specifying that the scales of the y channel should remain \"independent\", both layers get their own scales and axes. Obviously, such a configuration makes no sense with these data.

{\n  \"resolve\": {\n    \"scale\": { \"y\": \"independent\" },\n    \"axis\": { \"y\": \"independent\" }\n  },\n  \"layer\": [\n    {\n      \"data\": { \"url\": \"../data/cnv_chr19_raw.tsv\" },\n      \"title\": \"Single probe\",\n\n      \"mark\": {\n        \"type\": \"point\",\n        \"geometricZoomBound\": 9.5\n      },\n\n      \"encoding\": {\n        \"x\": { \"field\": \"Position\", \"type\": \"index\" },\n        \"y\": { \"field\": \"logR\", \"type\": \"quantitative\" },\n        \"size\": { \"value\": 225 },\n        \"opacity\": { \"value\": 0.15 }\n      }\n    },\n    {\n      \"data\": {\n        \"url\": \"../data/cnv_chr19_segs.tsv\"\n      },\n      \"title\": \"Segment mean\",\n      \"mark\": {\n        \"type\": \"rule\",\n        \"size\": 3.0,\n        \"minLength\": 3.0,\n        \"color\": \"black\"\n      },\n      \"encoding\": {\n        \"x\": { \"field\": \"startpos\", \"type\": \"index\" },\n        \"x2\": { \"field\": \"endpos\" },\n        \"y\": { \"field\": \"segMean\", \"type\": \"quantitative\" }\n      }\n    }\n  ]\n}\n
"},{"location":"grammar/composition/concat/","title":"View Concatenation","text":"

The vconcat and hconcat composition operators place views side-by-side either vertically or horizontally. The vconcat is practical for building genomic visualizations with multiple tracks. The concat operator with the columns property produces a wrapping grid layout.

The spacing (in pixels) between concatenated views can be adjusted using the spacing property (Default: 10).

"},{"location":"grammar/composition/concat/#example","title":"Example","text":""},{"location":"grammar/composition/concat/#vertical","title":"Vertical","text":"

Using vconcat for a vertical layout.

{\n  \"data\": { \"url\": \"sincos.csv\" },\n\n  \"spacing\": 20,\n\n  \"vconcat\": [\n    {\n      \"mark\": \"point\",\n      \"encoding\": {\n        \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n        \"y\": { \"field\": \"sin\", \"type\": \"quantitative\" }\n      }\n    },\n    {\n      \"mark\": \"point\",\n      \"encoding\": {\n        \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n        \"y\": { \"field\": \"cos\", \"type\": \"quantitative\" }\n      }\n    }\n  ]\n}\n
"},{"location":"grammar/composition/concat/#horizontal","title":"Horizontal","text":"

Using hconcat for a horizontal layout.

{\n  \"data\": { \"url\": \"sincos.csv\" },\n\n  \"hconcat\": [\n    {\n      \"mark\": \"point\",\n      \"encoding\": {\n        \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n        \"y\": { \"field\": \"sin\", \"type\": \"quantitative\" }\n      }\n    },\n    {\n      \"mark\": \"point\",\n      \"encoding\": {\n        \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n        \"y\": { \"field\": \"cos\", \"type\": \"quantitative\" }\n      }\n    }\n  ]\n}\n
"},{"location":"grammar/composition/concat/#grid","title":"Grid","text":"

Using concat and columns for a grid layout. For simplicity, the same visualization is used for all panels in the grid.

{\n  \"data\": { \"url\": \"sincos.csv\" },\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n    \"y\": { \"field\": \"sin\", \"type\": \"quantitative\" }\n  },\n\n  \"columns\": 3,\n  \"concat\": [\n    { \"mark\": \"point\" },\n    { \"mark\": \"point\" },\n    { \"mark\": \"point\" },\n    { \"mark\": \"point\" },\n    { \"mark\": \"point\" },\n    { \"mark\": \"point\" },\n    { \"mark\": \"point\" },\n    { \"mark\": \"point\" },\n    { \"mark\": \"point\" }\n  ]\n}\n
"},{"location":"grammar/composition/concat/#child-sizing","title":"Child sizing","text":"

The concatenation operators mimic the behavior of the CSS flexbox. The child views have an absolute minimum size (px) in pixels and a unitless grow value that specifies in what proportion the possible remaining space should be distributed. The remaining space depends on the parent view's size.

In the following example, the left view has a width of 20 px, the center view has a grow of 1, and the right view has a grow of 2. If you resize the web browser, you can observe that the width of the left view stays constant while the remaining space is distributed in proportions of 1:2.

{\n  \"data\": { \"values\": [{}] },\n\n  \"spacing\": 10,\n\n  \"hconcat\": [\n    {\n      \"width\": { \"px\": 20 },\n      \"mark\": \"rect\"\n    },\n    {\n      \"width\": { \"grow\": 1 },\n      \"mark\": \"rect\"\n    },\n    {\n      \"width\": { \"grow\": 2 },\n      \"mark\": \"rect\"\n    }\n  ]\n}\n
"},{"location":"grammar/composition/concat/#sizedef","title":"SizeDef","text":"grow

Type: number

Share of the remaining space. See child sizing for details.

px

Type: number

Size in pixels

The size may have both absolute (px) and proportional (grow) components. When views are nested, both the absolute and proportional sizes are added up. Thus, the width of the above example is { \"px\": 40, \"grow\": 3 }. The spacing between the child views is added to the total absolute width.

Views' size properties (width and height) accept both SizeDef objects and shorthands. The SizeDef objects contain either or both of px and grow properties. Numbers are interpreted as absolute sizes, and \"container\" is the same as { grow: 1 }. Undefined sizes generally default to \"container\".
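
The three ways of specifying a size can be sketched side by side as below. The first child has a fixed width of 100 px, the second takes one share of the remaining space, and the third has a minimum of 50 px plus two shares. The values are illustrative. Following the summation rule described above and the default spacing of 10, the total width should come to { \"px\": 170, \"grow\": 3 }.

{\n  \"data\": { \"values\": [{}] },\n\n  \"hconcat\": [\n    { \"width\": 100, \"mark\": \"rect\" },\n    { \"width\": \"container\", \"mark\": \"rect\" },\n    { \"width\": { \"px\": 50, \"grow\": 2 }, \"mark\": \"rect\" }\n  ]\n}\n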

Concatenation operators can be nested flexibly to build complex layouts, as in the following example.

{\n  \"data\": { \"values\": [{}] },\n\n  \"hconcat\": [\n    { \"mark\": \"rect\" },\n    {\n      \"vconcat\": [{ \"mark\": \"rect\" }, { \"mark\": \"rect\" }]\n    }\n  ]\n}\n
"},{"location":"grammar/composition/concat/#scrollable-viewports","title":"Scrollable viewports","text":"

Sometimes the contents of a view are so large that they do not fit into the available space. In such cases, the view can be made scrollable by setting an explicit size for the view using the viewportWidth and viewportHeight properties. They accept the same values as width and height properties except for the step size. Scrollable viewports are particularly useful for categorical data types (\"ordinal\" and \"nominal\") and respective scales and axes that do not support zooming and panning.

{\n  \"height\": { \"step\": 20 },\n  \"viewportHeight\": \"container\",\n\n  \"view\": { \"stroke\": \"lightgray\" },\n\n  \"data\": { \"sequence\": { \"start\": 0, \"stop\": 31, \"step\": 1 } },\n\n  \"encoding\": {\n    \"x\": { \"field\": \"data\", \"type\": \"quantitative\" },\n    \"y\": { \"field\": \"data\", \"type\": \"ordinal\" }\n  },\n\n  \"mark\": { \"type\": \"point\" }\n}\n
"},{"location":"grammar/composition/concat/#resolve","title":"Resolve","text":"

By default, all channels have \"independent\" scales and axes. However, because track-based layouts that resemble genome browsers are such a common use case, vconcat defaults to \"shared\" resolution for x channel and hconcat defaults to \"shared\" resolution for y channel.
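
If the shared default is not wanted, it can presumably be overridden with the resolve property. The following sketch gives each track of a vconcat its own x scale and axis. The child views are elided.

{\n  \"resolve\": {\n    \"scale\": { \"x\": \"independent\" },\n    \"axis\": { \"x\": \"independent\" }\n  },\n  \"vconcat\": [\n    ...\n  ]\n}\n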

"},{"location":"grammar/composition/concat/#shared-axes","title":"Shared axes","text":"

Concatenation operators support shared axes on channels that also have shared scales. Axis domain line, ticks, and labels are drawn only once for each row or column. Grid lines are drawn for all participating views.

{\n  \"data\": { \"url\": \"sincos.csv\" },\n\n  \"resolve\": {\n    \"scale\": { \"x\": \"shared\", \"y\": \"shared\" },\n    \"axis\": { \"x\": \"shared\", \"y\": \"shared\" }\n  },\n\n  \"spacing\": 20,\n\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\", \"axis\": { \"grid\": true } },\n    \"y\": { \"field\": \"sin\", \"type\": \"quantitative\", \"axis\": { \"grid\": true } }\n  },\n\n  \"columns\": 2,\n\n  \"concat\": [\n    { \"mark\": \"point\", \"view\": { \"stroke\": \"lightgray\" } },\n    { \"mark\": \"point\", \"view\": { \"stroke\": \"lightgray\" } },\n    { \"mark\": \"point\", \"view\": { \"stroke\": \"lightgray\" } },\n    { \"mark\": \"point\", \"view\": { \"stroke\": \"lightgray\" } }\n  ]\n}\n
"},{"location":"grammar/composition/layer/","title":"Layering Views","text":"

The layer operator superimposes multiple views over each other.

"},{"location":"grammar/composition/layer/#example","title":"Example","text":"
{\n  \"data\": {\n    \"values\": [\n      { \"a\": \"A\", \"b\": 28 },\n      { \"a\": \"B\", \"b\": 55 },\n      { \"a\": \"C\", \"b\": 43 },\n      { \"a\": \"D\", \"b\": 91 },\n      { \"a\": \"E\", \"b\": 81 },\n      { \"a\": \"F\", \"b\": 53 },\n      { \"a\": \"G\", \"b\": 19 },\n      { \"a\": \"H\", \"b\": 87 },\n      { \"a\": \"I\", \"b\": 52 }\n    ]\n  },\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"a\",\n      \"type\": \"nominal\",\n      \"scale\": { \"padding\": 0.1 },\n      \"axis\": { \"labelAngle\": 0 }\n    },\n    \"y\": { \"field\": \"b\", \"type\": \"quantitative\" }\n  },\n  \"layer\": [\n    {\n      \"name\": \"Bar\",\n      \"mark\": \"rect\"\n    },\n    {\n      \"name\": \"Label\",\n      \"mark\": { \"type\": \"text\", \"dy\": -9 },\n      \"encoding\": {\n        \"text\": { \"field\": \"b\" }\n      }\n    }\n  ]\n}\n

To specify multiple layers, use the layer property:

{\n  \"layer\": [\n    ...  // Single or layered view specifications\n  ]\n}\n

The provided array may contain both single view specifications and layer specifications. The encodings and data that are specified in a layer view propagate to its descendants. For example, in the above example, the \"Bar\" and \"Label\" views inherit the data and encodings for the x and y channels from their parent, the layer view.

"},{"location":"grammar/composition/layer/#resolve","title":"Resolve","text":"

By default, layers share their scales and axes, unioning the data domains.

"},{"location":"grammar/composition/layer/#more-examples","title":"More examples","text":""},{"location":"grammar/composition/layer/#lollipop-plot","title":"Lollipop plot","text":"

This example layers two marks to create a composite mark, a lollipop. Yet another layer is used for the baseline.

{\n  \"name\": \"The Root\",\n  \"description\": \"Lollipop plot example\",\n\n  \"layer\": [\n    {\n      \"name\": \"Baseline\",\n      \"data\": { \"values\": [0] },\n      \"mark\": \"rule\",\n      \"encoding\": {\n        \"y\": { \"field\": \"data\", \"type\": \"quantitative\", \"title\": null },\n        \"color\": { \"value\": \"lightgray\" }\n      }\n    },\n    {\n      \"name\": \"Arrows\",\n\n      \"data\": {\n        \"sequence\": {\n          \"start\": 0,\n          \"stop\": 6.284,\n          \"step\": 0.39269908169,\n          \"as\": \"x\"\n        }\n      },\n\n      \"transform\": [\n        { \"type\": \"formula\", \"expr\": \"sin(datum.x)\", \"as\": \"sin(x)\" }\n      ],\n\n      \"encoding\": {\n        \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n        \"y\": {\n          \"field\": \"sin(x)\",\n          \"type\": \"quantitative\",\n          \"scale\": { \"padding\": 0.1 }\n        },\n        \"color\": { \"field\": \"sin(x)\", \"type\": \"quantitative\" }\n      },\n\n      \"layer\": [\n        {\n          \"name\": \"Arrow shafts\",\n\n          \"mark\": {\n            \"type\": \"rule\",\n            \"size\": 3\n          }\n        },\n        {\n          \"name\": \"Arrowheads\",\n\n          \"mark\": {\n            \"type\": \"point\",\n            \"size\": 500,\n            \"filled\": true\n          },\n\n          \"encoding\": {\n            \"shape\": {\n              \"field\": \"sin(x)\",\n              \"type\": \"nominal\",\n              \"scale\": {\n                \"type\": \"threshold\",\n                \"domain\": [-0.01, 0.01],\n                \"range\": [\"triangle-down\", \"diamond\", \"triangle-up\"]\n              }\n            }\n          }\n        }\n      ]\n    }\n  ]\n}\n
"},{"location":"grammar/data/","title":"Data Input","text":"

Like Vega-Lite's data model, GenomeSpy utilizes a tabular data structure as its fundamental data model, resembling a spreadsheet or database table. Each data set in GenomeSpy is considered to consist of a set of records, each containing various named data fields.

In GenomeSpy, the data property within a view specification describes the data source. In a hierarchically composed view specification, the views inherit the data, which may be further transformed, from their parent views. However, each view can also override the inherited data.

Non-indexed eager data, which is fully loaded during the visualization initialization stage, can be provided as inline data (values) or by specifying a URL from which the data can be loaded (url). Additionally, you can use a sequence generator for generating sequences of numbers.

GenomeSpy provides several lazy data sources that load data on-demand in response to user interactions to support large genomic data sets comprising millions of records. These data sources enable easy handling of standard bioinformatic data formats such as indexed FASTA and BigWig.

Furthermore, GenomeSpy enables the creation of an empty data source with a given name. This data source can be dynamically updated using the API, making it particularly useful when embedding GenomeSpy in web applications.

"},{"location":"grammar/data/eager/","title":"Eager Data Sources","text":"

Eager data sources load and process all available data during the initialization stage. They are suitable for small data sets as they do not support partial loading or loading in response to user interactions. However, eager data sources are often more flexible and straightforward than lazy ones.

GenomeSpy inputs eager data as tabular \"csv\", \"tsv\", and \"json\" files or as non-indexed \"fasta\" files. Data can be loaded from URLs or provided inline. You can also use generators to generate data on the fly and further modify them using transforms.

The data property of the view specification describes a data source. The following example loads a tab-delimited file. By default, GenomeSpy infers the format from the file extension. However, in bioinformatics, CSV files are often actually tab-delimited, and you must then specify the \"tsv\" format explicitly:

Example: Eagerly loading data from a URL
{\n  \"data\": {\n    \"url\": \"fileWithTabs.csv\",\n    \"format\": { \"type\": \"tsv\" }\n  },\n  ...\n}\n

With the exception of the unsupported geographical formats, the data property of GenomeSpy is identical to Vega-Lite's data property.

Type inference

GenomeSpy uses vega-loader to parse tabular data and infer its data types. Vega-loader is sometimes overly eager to interpret strings as dates. In such cases, the field types need to be specified explicitly. Explicit type specification also speeds up parsing significantly.
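
Because the data property follows Vega-Lite, explicit field types can presumably be given with the parse property of format, as in Vega-Lite. The sketch below writes such a specification as a JavaScript object; the file name and the field names are hypothetical, and the supported type names should be checked against the schema.

```javascript
// Assumption: format.parse works as in Vega-Lite. File and field names are made up.
const spec = {
  data: {
    url: 'samples.tsv',
    format: {
      type: 'tsv',
      // Explicit types keep date-like identifiers as strings and skip inference.
      parse: { sampleId: 'string', purity: 'number' },
    },
  },
  mark: 'point',
  encoding: {
    x: { field: 'sampleId', type: 'nominal' },
    y: { field: 'purity', type: 'quantitative' },
  },
};

console.log(JSON.stringify(spec.data.format.parse));
```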

Handling empty (NA) values

Empty or missing values must be represented as empty strings instead of the NA that R writes by default. Otherwise, type inference fails for numeric fields.

"},{"location":"grammar/data/eager/#named-data","title":"Named Data","text":"

When embedding GenomeSpy in a web application or page, data can be added or updated at runtime using the API. Data sources are referenced by a name, which is passed to the updateNamedData method:

{\n    \"data\": {\n        \"name\": \"myResults\"\n    }\n    ...\n}\n
const api = await embed(\"#container\", spec);\napi.updateNamedData(\"myResults\", [\n  { x: 1, y: 2 },\n  { x: 2, y: 3 },\n]);\n

Although named data can be updated dynamically, it does not automatically respond to user interactions. For practical examples of dynamically updated named data, check the embed-examples package.

"},{"location":"grammar/data/eager/#bioinformatic-formats","title":"Bioinformatic Formats","text":"

Most bioinformatic data formats are supported through lazy data. The following formats are supported as eager data with the url source.

"},{"location":"grammar/data/eager/#fasta","title":"FASTA","text":"

The type of FASTA format is \"fasta\" as shown in the example below:

{\n  \"data\": {\n    \"url\": \"16SRNA_Deino_87seq_copy.aln\",\n    \"format\": {\n      \"type\": \"fasta\"\n    }\n  },\n  ...\n}\n

The FASTA loader produces data objects with two fields: identifier and sequence. With the \"flattenSequence\" transform you can split the sequences into individual bases (one object per base) for easier visualization.
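
To illustrate what such flattening amounts to, the sketch below splits one FASTA record into per-base objects in plain JavaScript. It is not the implementation of the transform; the function name and the output field names pos and base are assumptions chosen for the illustration, and the record is made up.

```javascript
// Hypothetical illustration of per-base flattening; not GenomeSpy code.
function flattenRecord(record) {
  // One output object per base, keeping the identifier and a zero-based position.
  return Array.from(record.sequence, (base, pos) => ({
    identifier: record.identifier,
    pos,
    base,
  }));
}

const flattened = flattenRecord({ identifier: 'seq1', sequence: 'ACGT' });
console.log(flattened.length); // 4
console.log(flattened[2]); // { identifier: 'seq1', pos: 2, base: 'G' }
```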

"},{"location":"grammar/data/lazy/","title":"Lazy Data Sources","text":"

Lazy data sources load data on-demand in response to user interactions. Unlike eager sources, most lazy data sources support indexing, which offers the capability to retrieve and load data partially and incrementally, as users navigate the genome. This is especially useful for very large datasets that are infeasible to load in their entirety.

How it works

Lazy data sources observe the scale domains of the view where the data source is specified. When the domain changes as a result of a user interaction, the data source issues a request to fetch a new subset of the data. Lazy sources need the visual channel to be specified, which is used to determine the scale to observe. For genomic data sources, the channel defaults to \"x\".

Lazy data sources are specified using the lazy property of the data object. Unlike in eager data, the type of the data source must be specified explicitly:

Example: Specifying a lazy data source
{\n  \"data\": {\n    \"lazy\": {\n      \"type\": \"bigwig\",\n      \"url\": \"https://data.genomespy.app/genomes/hg38/hg38.gc5Base.bw\"\n    }\n  },\n  ...\n}\n
"},{"location":"grammar/data/lazy/#indexed-fasta","title":"Indexed FASTA","text":"

The \"indexedFasta\" source enables fast random access to a reference sequence. It loads the sequence as three consecutive chunks that cover and flank the currently visible region (domain), allowing the user to rapidly pan the view. The chunks are provided as data objects with the following fields: chrom (string), start (integer), and sequence (a string of bases).
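
The chunking idea can be sketched as follows. This is a simplified illustration, not the code of the data source: the function name is hypothetical, and aligning the chunks to multiples of the window size is an assumption. It returns the chunk that contains the midpoint of the visible domain together with its two neighbours, or nothing when the domain is not shorter than the window size.

```javascript
// Hypothetical sketch; chunk alignment to multiples of windowSize is an assumption.
function chunksFor(domain, windowSize) {
  const [start, end] = domain;
  if (end - start >= windowSize) {
    return []; // zoomed out too far: nothing is fetched
  }
  const mid = (start + end) / 2;
  const center = Math.floor(mid / windowSize) * windowSize;
  // The chunk containing the midpoint plus one flanking chunk on each side.
  return [center - windowSize, center, center + windowSize]
    .map((chunkStart) => [chunkStart, chunkStart + windowSize])
    .filter(([, chunkEnd]) => chunkEnd > 0); // drop chunks left of the origin
}

console.log(chunksFor([20003500, 20003540], 7000));
console.log(chunksFor([0, 10000], 7000)); // []
```

For a 40 bp domain starting at position 20003500 and a window size of 7000, the three chunks span positions 19992000 to 20013000, which leaves room for panning before new data are needed.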

"},{"location":"grammar/data/lazy/#parameters","title":"Parameters","text":"channel

Type: \"x\" | \"y\"

Which channel's scale domain to monitor.

Default value: \"x\"

debounce

Type: number | ExprRef

Debounce time for data updates, in milliseconds. Debouncing prevents excessive data updates when the user is zooming or panning around.

Default value: 200

debounceMode

Type: string

The debounce mode for data updates. If set to \"domain\", domain change events (panning and zooming) will be debounced. If set to \"window\", the data fetches initiated by the changes to the visible window (or tile) will be debounced. If your data is small, the \"window\" mode is better as it will start fetching data while the user is still panning around, resulting in a shorter perceived latency.

Default value: \"window\"

indexUrl

Type: string

URL of the index file.

Default value: url + \".fai\".

url Required

Type: string

URL of the fasta file.

windowSize

Type: number

Size of each chunk when fetching the FASTA file. Data is only fetched when the length of the visible domain is smaller than the window size.

Default value: 7000

"},{"location":"grammar/data/lazy/#example","title":"Example","text":"

The example below shows how to specify a sequence track using an indexed FASTA file. The sequence chunks are split into separate data objects using the \"flattenSequence\" transform, and the final position of each nucleotide is computed using the \"formula\" transform. Please note that new data are fetched only when the user zooms into a region smaller than the window size (default: 7000 bp).

{\n  \"genome\": { \"name\": \"hg38\" },\n\n  \"data\": {\n    \"lazy\": {\n      \"type\": \"indexedFasta\",\n      \"url\": \"https://data.genomespy.app/genomes/hg38/hg38.fa\"\n    }\n  },\n\n  \"transform\": [\n    {\n      \"type\": \"flattenSequence\",\n      \"field\": \"sequence\",\n      \"as\": [\"rawPos\", \"base\"]\n    },\n    { \"type\": \"formula\", \"expr\": \"datum.rawPos + datum.start\", \"as\": \"pos\" }\n  ],\n\n  \"encoding\": {\n    \"x\": {\n      \"chrom\": \"chrom\",\n      \"pos\": \"pos\",\n      \"type\": \"locus\",\n      \"scale\": {\n        \"domain\": [\n          { \"chrom\": \"chr7\", \"pos\": 20003500 },\n          { \"chrom\": \"chr7\", \"pos\": 20003540 }\n        ]\n      }\n    },\n    \"color\": {\n      \"field\": \"base\",\n      \"type\": \"nominal\",\n      \"scale\": {\n        \"domain\": [\"A\", \"C\", \"T\", \"G\", \"a\", \"c\", \"t\", \"g\", \"N\"],\n        \"range\": [\n          \"#7BD56C\",\n          \"#FF9B9B\",\n          \"#86BBF1\",\n          \"#FFC56C\",\n          \"#7BD56C\",\n          \"#FF9B9B\",\n          \"#86BBF1\",\n          \"#FFC56C\",\n          \"#E0E0E0\"\n        ]\n      }\n    }\n  },\n  \"layer\": [\n    {\n      \"mark\": \"rect\"\n    },\n    {\n      \"mark\": {\n        \"type\": \"text\",\n        \"size\": 13,\n        \"fitToBand\": true,\n        \"paddingX\": 1.5,\n        \"paddingY\": 1,\n        \"opacity\": 0.7,\n        \"flushX\": false,\n        \"tooltip\": null\n      },\n      \"encoding\": {\n        \"color\": { \"value\": \"black\" },\n        \"text\": { \"field\": \"base\" }\n      }\n    }\n  ]\n}\n

The data source is based on GMOD's indexedfasta-js library.

"},{"location":"grammar/data/lazy/#bigwig","title":"BigWig","text":"

The \"bigwig\" source enables the retrieval of dense, continuous data, such as coverage or other signal data stored in BigWig files. It behaves similarly to the indexed FASTA source, loading the data in chunks that cover and flank the currently visible region. However, the window size automatically adapts to the zoom level, and data are fetched in higher resolution when zooming in. The data source provides data objects with the following fields: chrom (string), start (integer), end (integer), and score (number).

"},{"location":"grammar/data/lazy/#parameters_1","title":"Parameters","text":"channel

Type: \"x\" | \"y\"

Which channel's scale domain to monitor.

Default value: \"x\"

debounce

Type: number | ExprRef

Debounce time for data updates, in milliseconds. Debouncing prevents excessive data updates when the user is zooming or panning around.

Default value: 200

debounceMode

Type: string

The debounce mode for data updates. If set to \"domain\", domain change events (panning and zooming) will be debounced. If set to \"window\", the data fetches initiated by the changes to the visible window (or tile) will be debounced. If your data is small, the \"window\" mode is better as it will start fetching data while the user is still panning around, resulting in a shorter perceived latency.

Default value: \"window\"

pixelsPerBin

Type: number | ExprRef

The approximate minimum width of each data bin, in pixels.

Default value: 2

url Required

Type: string | ExprRef

URL of the BigWig file.

"},{"location":"grammar/data/lazy/#example_1","title":"Example","text":"

The example below shows the GC content of the human genome in 5-base windows. When you zoom in, the resolution of the data automatically increases.

{\n  \"genome\": { \"name\": \"hg38\" },\n  \"view\": { \"stroke\": \"lightgray\" },\n\n  \"data\": {\n    \"lazy\": {\n      \"type\": \"bigwig\",\n      \"url\": \"https://data.genomespy.app/genomes/hg38/hg38.gc5Base.bw\"\n    }\n  },\n\n  \"encoding\": {\n    \"y\": {\n      \"field\": \"score\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"domain\": [0, 100] },\n      \"axis\": { \"title\": \"GC (%)\", \"grid\": true, \"gridDash\": [2, 2] }\n    },\n    \"x\": { \"chrom\": \"chrom\", \"pos\": \"start\", \"type\": \"locus\" },\n    \"x2\": { \"chrom\": \"chrom\", \"pos\": \"end\" }\n  },\n\n  \"mark\": \"rect\"\n}\n

The data source is based on GMOD's bbi-js library.

"},{"location":"grammar/data/lazy/#bigbed","title":"BigBed","text":"

The \"bigbed\" source enables the retrieval of segmented data, such as annotated genomic regions stored in BigBed files.

"},{"location":"grammar/data/lazy/#parameters_2","title":"Parameters","text":"channel

Type: \"x\" | \"y\"

Which channel's scale domain to monitor.

Default value: \"x\"

debounce

Type: number | ExprRef

Debounce time for data updates, in milliseconds. Debouncing prevents excessive data updates when the user is zooming or panning around.

Default value: 200

debounceMode

Type: string

The debounce mode for data updates. If set to \"domain\", domain change events (panning and zooming) will be debounced. If set to \"window\", the data fetches initiated by the changes to the visible window (or tile) will be debounced. If your data is small, the \"window\" mode is better as it will start fetching data while the user is still panning around, resulting in a shorter perceived latency.

Default value: \"window\"

url Required

Type: string | ExprRef

URL of the BigBed file.

windowSize

Type: number | ExprRef

Size of each chunk when fetching the BigBed file. Data is only fetched when the length of the visible domain is smaller than the window size.

Default value: 1000000

"},{"location":"grammar/data/lazy/#example_2","title":"Example","text":"

The example below displays \"ENCODE Candidate Cis-Regulatory Elements (cCREs) combined from all cell types\" dataset for the hg38 genome.

{\n  \"genome\": { \"name\": \"hg38\" },\n  \"view\": { \"stroke\": \"lightgray\" },\n\n  \"data\": {\n    \"lazy\": {\n      \"type\": \"bigbed\",\n      \"url\": \"https://data.genomespy.app/sample-data/encodeCcreCombined.hg38.bb\"\n    }\n  },\n\n  \"encoding\": {\n    \"x\": {\n      \"chrom\": \"chrom\",\n      \"pos\": \"chromStart\",\n      \"type\": \"locus\",\n      \"scale\": {\n        \"domain\": [\n          { \"chrom\": \"chr7\", \"pos\": 66600000 },\n          { \"chrom\": \"chr7\", \"pos\": 66800000 }\n        ]\n      }\n    },\n    \"x2\": {\n      \"chrom\": \"chrom\",\n      \"pos\": \"chromEnd\"\n    },\n    \"color\": {\n      \"field\": \"ucscLabel\",\n      \"type\": \"nominal\",\n      \"scale\": {\n        \"domain\": [\"prom\", \"enhP\", \"enhD\", \"K4m3\", \"CTCF\"],\n        \"range\": [\"#FF0000\", \"#FFA700\", \"#FFCD00\", \"#FFAAAA\", \"#00B0F0\"]\n      }\n    }\n  },\n\n  \"mark\": \"rect\"\n}\n

The data source is based on GMOD's bbi-js library.

"},{"location":"grammar/data/lazy/#gff3","title":"GFF3","text":"

The tabix-based \"gff3\" source enables the retrieval of hierarchical data, such as genomic annotations stored in GFF3 files. The object format GenomeSpy uses is described in gff-js's documentation. The flatten and project transforms are useful when extracting the child features and attributes from the hierarchical data structure. See the example below.

"},{"location":"grammar/data/lazy/#parameters_3","title":"Parameters","text":"channel

Type: \"x\" | \"y\"

Which channel's scale domain to monitor.

Default value: \"x\"

debounce

Type: number | ExprRef

Debounce time for data updates, in milliseconds. Debouncing prevents excessive data updates when the user is zooming or panning around.

Default value: 200

debounceMode

Type: string

The debounce mode for data updates. If set to \"domain\", domain change events (panning and zooming) will be debounced. If set to \"window\", the data fetches initiated by the changes to the visible window (or tile) will be debounced. If your data is small, the \"window\" mode is better as it will start fetching data while the user is still panning around, resulting in a shorter perceived latency.

Default value: \"window\"

indexUrl

Type: string

URL of the tabix index file.

Default value: url + \".tbi\".

url Required

Type: string

URL of the bgzip-compressed file.

windowSize

Type: number

Size of each chunk when fetching the Tabix file. Data is only fetched when the length of the visible domain is smaller than the window size.

Default value: 30000000

"},{"location":"grammar/data/lazy/#example_3","title":"Example","text":"

The example below displays the human (GRCh38.p13) GENCODE v43 annotation dataset. Please note that the example shows a maximum of ten overlapping features per locus as vertical scrolling is currently not supported properly.

{\n  \"$schema\": \"https://unpkg.com/@genome-spy/core/dist/schema.json\",\n\n  \"genome\": { \"name\": \"hg38\" },\n\n  \"height\": { \"step\": 28 },\n  \"viewportHeight\": \"container\",\n\n  \"view\": { \"stroke\": \"lightgray\" },\n\n  \"data\": {\n    \"lazy\": {\n      \"type\": \"gff3\",\n      \"url\": \"https://data.genomespy.app/sample-data/gencode.v43.annotation.sorted.gff3.gz\",\n      \"windowSize\": 2000000,\n      \"debounceDomainChange\": 300\n    }\n  },\n\n  \"transform\": [\n    {\n      \"type\": \"flatten\"\n    },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"datum.attributes.gene_name\",\n      \"as\": \"gene_name\"\n    },\n    {\n      \"type\": \"flatten\",\n      \"fields\": [\"child_features\"]\n    },\n    {\n      \"type\": \"flatten\",\n      \"fields\": [\"child_features\"],\n      \"as\": [\"child_feature\"]\n    },\n    {\n      \"type\": \"project\",\n      \"fields\": [\n        \"gene_name\",\n        \"child_feature.type\",\n        \"child_feature.strand\",\n        \"child_feature.seq_id\",\n        \"child_feature.start\",\n        \"child_feature.end\",\n        \"child_feature.attributes.gene_type\",\n        \"child_feature.attributes.transcript_type\",\n        \"child_feature.attributes.gene_id\",\n        \"child_feature.attributes.transcript_id\",\n        \"child_feature.attributes.transcript_name\",\n        \"child_feature.attributes.tag\",\n        \"source\",\n        \"child_feature.child_features\"\n      ],\n      \"as\": [\n        \"gene_name\",\n        \"type\",\n        \"strand\",\n        \"seq_id\",\n        \"start\",\n        \"end\",\n        \"gene_type\",\n        \"transcript_type\",\n        \"gene_id\",\n        \"transcript_id\",\n        \"transcript_name\",\n        \"tag\",\n        \"source\",\n        \"_child_features\"\n      ]\n    },\n    {\n      \"type\": \"collect\",\n      \"sort\": {\n        \"field\": [\"seq_id\", \"start\", \"transcript_id\"]\n      }\n    },\n    
{\n      \"type\": \"pileup\",\n      \"start\": \"start\",\n      \"end\": \"end\",\n      \"as\": \"_lane\"\n    }\n  ],\n\n  \"encoding\": {\n    \"x\": {\n      \"chrom\": \"seq_id\",\n      \"pos\": \"start\",\n      \"offset\": 1,\n      \"type\": \"locus\",\n      \"scale\": {\n        \"domain\": [\n          { \"chrom\": \"chr5\", \"pos\": 177482500 },\n          { \"chrom\": \"chr5\", \"pos\": 177518000 }\n        ]\n      }\n    },\n    \"x2\": {\n      \"chrom\": \"seq_id\",\n      \"pos\": \"end\"\n    },\n    \"y\": {\n      \"field\": \"_lane\",\n      \"type\": \"index\",\n      \"scale\": {\n        \"zoom\": false,\n        \"reverse\": true,\n        \"domain\": [0, 40],\n        \"padding\": 0.5\n      },\n      \"axis\": null\n    }\n  },\n\n  \"layer\": [\n    {\n      \"name\": \"gencode-transcript\",\n\n      \"layer\": [\n        {\n          \"name\": \"gencode-tooltip-trap\",\n          \"title\": \"GENCODE transcript\",\n          \"mark\": {\n            \"type\": \"rule\",\n            \"color\": \"#b0b0b0\",\n            \"opacity\": 0,\n            \"size\": 7\n          }\n        },\n        {\n          \"name\": \"gencode-transcript-body\",\n          \"mark\": {\n            \"type\": \"rule\",\n            \"color\": \"#b0b0b0\",\n            \"tooltip\": null\n          }\n        }\n      ]\n    },\n    {\n      \"name\": \"gencode-exons\",\n\n      \"transform\": [\n        {\n          \"type\": \"flatten\",\n          \"fields\": [\"_child_features\"]\n        },\n        {\n          \"type\": \"flatten\",\n          \"fields\": [\"_child_features\"],\n          \"as\": [\"child_feature\"]\n        },\n        {\n          \"type\": \"project\",\n          \"fields\": [\n            \"gene_name\",\n            \"_lane\",\n            \"child_feature.type\",\n            \"child_feature.seq_id\",\n            \"child_feature.start\",\n            \"child_feature.end\",\n            
\"child_feature.attributes.exon_number\",\n            \"child_feature.attributes.exon_id\"\n          ],\n          \"as\": [\n            \"gene_name\",\n            \"_lane\",\n            \"type\",\n            \"seq_id\",\n            \"start\",\n            \"end\",\n            \"exon_number\",\n            \"exon_id\"\n          ]\n        }\n      ],\n\n      \"layer\": [\n        {\n          \"title\": \"GENCODE exon\",\n\n          \"transform\": [{ \"type\": \"filter\", \"expr\": \"datum.type == 'exon'\" }],\n\n          \"mark\": {\n            \"type\": \"rect\",\n            \"minWidth\": 0.5,\n            \"minOpacity\": 0.5,\n            \"stroke\": \"#505050\",\n            \"fill\": \"#fafafa\",\n            \"strokeWidth\": 1.0\n          }\n        },\n        {\n          \"title\": \"GENCODE exon\",\n\n          \"transform\": [\n            {\n              \"type\": \"filter\",\n              \"expr\": \"datum.type != 'exon' && datum.type != 'start_codon' && datum.type != 'stop_codon'\"\n            }\n          ],\n\n          \"mark\": {\n            \"type\": \"rect\",\n            \"minWidth\": 0.5,\n            \"minOpacity\": 0,\n            \"strokeWidth\": 1.0,\n            \"strokeOpacity\": 0.0,\n            \"stroke\": \"gray\"\n          },\n          \"encoding\": {\n            \"fill\": {\n              \"field\": \"type\",\n              \"type\": \"nominal\",\n              \"scale\": {\n                \"domain\": [\"five_prime_UTR\", \"CDS\", \"three_prime_UTR\"],\n                \"range\": [\"#83bcb6\", \"#ffbf79\", \"#d6a5c9\"]\n              }\n            }\n          }\n        },\n        {\n          \"transform\": [\n            {\n              \"type\": \"filter\",\n              \"expr\": \"datum.type == 'three_prime_UTR' || datum.type == 'five_prime_UTR'\"\n            },\n            {\n              \"type\": \"formula\",\n              \"expr\": \"datum.type == 'three_prime_UTR' ? 
\\\"3'\\\" : \\\"5'\\\"\",\n              \"as\": \"label\"\n            }\n          ],\n\n          \"mark\": {\n            \"type\": \"text\",\n            \"color\": \"black\",\n            \"size\": 11,\n            \"opacity\": 0.7,\n            \"paddingX\": 2,\n            \"paddingY\": 1.5,\n            \"tooltip\": null\n          },\n\n          \"encoding\": {\n            \"text\": {\n              \"field\": \"label\"\n            }\n          }\n        }\n      ]\n    },\n    {\n      \"name\": \"gencode-transcript-labels\",\n\n      \"transform\": [\n        {\n          \"type\": \"formula\",\n          \"expr\": \"(datum.strand == '-' ? '< ' : '') + datum.transcript_name + ' - ' + datum.transcript_id + (datum.strand == '+' ? ' >' : '')\",\n          \"as\": \"label\"\n        }\n      ],\n\n      \"mark\": {\n        \"type\": \"text\",\n        \"size\": 10,\n        \"yOffset\": 12,\n        \"tooltip\": null,\n        \"color\": \"#505050\"\n      },\n\n      \"encoding\": {\n        \"text\": {\n          \"field\": \"label\"\n        }\n      }\n    }\n  ]\n}\n

The data source is based on GMOD's tabix-js and gff-js libraries.

"},{"location":"grammar/data/lazy/#bam","title":"BAM","text":"

The \"bam\" source is very much a work in progress but has a low priority. It currently exposes the reads but provides no handling for variant alleles, CIGARs, etc. Please send a message to GitHub Discussions if you are interested in this feature.

"},{"location":"grammar/data/lazy/#parameters_4","title":"Parameters","text":"channel

Type: \"x\" | \"y\"

Which channel's scale domain to monitor.

Default value: \"x\"

debounce

Type: number | ExprRef

Debounce time for data updates, in milliseconds. Debouncing prevents excessive data updates when the user is zooming or panning around.

Default value: 200

debounceMode

Type: string

The debounce mode for data updates. If set to \"domain\", domain change events (panning and zooming) will be debounced. If set to \"window\", the data fetches initiated by the changes to the visible window (or tile) will be debounced. If your data is small, the \"window\" mode is better as it will start fetching data while the user is still panning around, resulting in a shorter perceived latency.

Default value: \"window\"

indexUrl

Type: string

URL of the index file.

Default value: url + \".bai\".

url Required

Type: string

URL of the BAM file.

windowSize

Type: number

Size of each chunk when fetching the BAM file. Data is only fetched when the length of the visible domain is smaller than the window size.

Default value: 10000

"},{"location":"grammar/data/lazy/#example_4","title":"Example","text":"
{\n  \"genome\": { \"name\": \"hg18\" },\n\n  \"data\": {\n    \"lazy\": {\n      \"type\": \"bam\",\n      \"url\": \"https://data.genomespy.app/sample-data/bamExample.bam\",\n      \"windowSize\": 30000\n    }\n  },\n\n  \"resolve\": { \"scale\": { \"x\": \"shared\" } },\n\n  \"spacing\": 5,\n\n  \"vconcat\": [\n    {\n      \"view\": { \"stroke\": \"lightgray\" },\n      \"height\": 40,\n\n      \"transform\": [\n        {\n          \"type\": \"coverage\",\n          \"start\": \"start\",\n          \"end\": \"end\",\n          \"as\": \"coverage\",\n          \"chrom\": \"chrom\"\n        }\n      ],\n      \"mark\": \"rect\",\n      \"encoding\": {\n        \"x\": {\n          \"chrom\": \"chrom\",\n          \"pos\": \"start\",\n          \"type\": \"locus\",\n          \"axis\": null\n        },\n        \"x2\": { \"chrom\": \"chrom\", \"pos\": \"end\" },\n        \"y\": { \"field\": \"coverage\", \"type\": \"quantitative\" }\n      }\n    },\n    {\n      \"view\": { \"stroke\": \"lightgray\" },\n\n      \"transform\": [\n        {\n          \"type\": \"pileup\",\n          \"start\": \"start\",\n          \"end\": \"end\",\n          \"as\": \"_lane\"\n        }\n      ],\n\n      \"encoding\": {\n        \"x\": {\n          \"chrom\": \"chrom\",\n          \"pos\": \"start\",\n          \"type\": \"locus\",\n          \"axis\": {},\n          \"scale\": {\n            \"domain\": [\n              { \"chrom\": \"chr21\", \"pos\": 33037317 },\n              { \"chrom\": \"chr21\", \"pos\": 33039137 }\n            ]\n          }\n        },\n        \"x2\": {\n          \"chrom\": \"chrom\",\n          \"pos\": \"end\"\n        },\n        \"y\": {\n          \"field\": \"_lane\",\n          \"type\": \"index\",\n          \"scale\": {\n            \"domain\": [0, 60],\n            \"padding\": 0.3,\n            \"reverse\": true,\n            \"zoom\": false\n          }\n        },\n        \"color\": {\n          \"field\": \"strand\",\n          
\"type\": \"nominal\",\n          \"scale\": {\n            \"domain\": [\"+\", \"-\"],\n            \"range\": [\"crimson\", \"orange\"]\n          }\n        }\n      },\n\n      \"mark\": \"rect\"\n    }\n  ]\n}\n

The data source is based on GMOD's bam-js library.

"},{"location":"grammar/data/lazy/#axis-ticks","title":"Axis ticks","text":"

The \"axisTicks\" data source generates a set of ticks for the specified channel. GenomeSpy uses this data source internally for generating axis ticks, but you can also use it to create fully customized axes. The data source generates data objects with value and label fields.

"},{"location":"grammar/data/lazy/#parameters_5","title":"Parameters","text":"axis

Type: Axis

Optional axis properties

channel Required

Type: \"x\" | \"y\"

Which channel's scale domain to listen to

"},{"location":"grammar/data/lazy/#example_5","title":"Example","text":"

The example below generates approximately three ticks for the x axis.

{\n  \"data\": {\n    \"lazy\": {\n      \"type\": \"axisTicks\",\n      \"channel\": \"x\",\n      \"axis\": {\n        \"tickCount\": 3\n      }\n    }\n  },\n\n  \"mark\": {\n    \"type\": \"text\",\n    \"size\": 20,\n    \"clip\": false\n  },\n\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"value\",\n      \"type\": \"quantitative\",\n      \"scale\": {\n        \"domain\": [0, 10],\n        \"zoom\": true\n      }\n    },\n    \"text\": {\n      \"field\": \"label\"\n    }\n  }\n}\n
"},{"location":"grammar/data/lazy/#axis-genome","title":"Axis genome","text":"

The axisGenome data source does not, in fact, update data dynamically. However, it provides convenient access to the genome (chromosomes) of the given channel, allowing the creation of customized chromosome ticks or annotations. The data source generates data objects with the following fields: name, size (in bp), continuousStart (linearized coordinate), continuousEnd, odd (boolean), and number (1-based index).

"},{"location":"grammar/data/lazy/#parameters_6","title":"Parameters","text":"channel Required

Type: \"x\" | \"y\"

Which channel's scale domain to use

"},{"location":"grammar/data/lazy/#example_6","title":"Example","text":"
{\n  \"genome\": { \"name\": \"hg38\" },\n\n  \"data\": {\n    \"lazy\": {\n      \"type\": \"axisGenome\",\n      \"channel\": \"x\"\n    }\n  },\n\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"continuousStart\",\n      \"type\": \"locus\"\n    },\n    \"x2\": {\n      \"field\": \"continuousEnd\"\n    },\n    \"text\": {\n      \"field\": \"name\"\n    }\n  },\n\n  \"layer\": [\n    {\n      \"transform\": [\n        {\n          \"type\": \"filter\",\n          \"expr\": \"datum.odd\"\n        }\n      ],\n      \"mark\": {\n        \"type\": \"rect\",\n        \"fill\": \"#f0f0f0\"\n      }\n    },\n    {\n      \"mark\": {\n        \"type\": \"text\",\n        \"size\": 16,\n        \"angle\": -90,\n        \"align\": \"right\",\n        \"baseline\": \"top\",\n        \"paddingX\": 3,\n        \"paddingY\": 5,\n        \"y\": 1\n      }\n    }\n  ]\n}\n
"},{"location":"grammar/mark/","title":"Marks","text":"

In GenomeSpy, visualizations are built from marks, which are geometric shapes, such as points, rectangles, and lines, that represent data objects (or rows in tabular data). These marks are mapped to the data using the encoding property, which specifies which visual channels, such as x, color, and size, should be used to encode the data fields. By adjusting the encodings, you can present the same data in a wide range of visual forms, such as scatterplots, bar charts, and heatmaps.

Example: Specifying the mark type
{\n  ...,\n  \"mark\": \"rect\",\n  ...\n}\n
"},{"location":"grammar/mark/#properties","title":"Properties","text":"

Marks also support various properties for controlling their appearance or behavior. The properties can be specified with an object that contains at least the type property:

Example: Specifying the mark type and additional properties
{\n  ...,\n  \"mark\": {\n    \"type\": \"rect\",\n    \"cornerRadius\": 5\n  },\n  ...\n}\n
"},{"location":"grammar/mark/#encoding","title":"Encoding","text":"

While mark properties are static, i.e., the same for all mark instances, encoding allows for mapping data to visual channels and using data-driven visual encoding.

It's worth noting that while all visual encoding channels are also available as static properties, not all properties can be used for encoding. Only certain properties are suitable for encoding data in a meaningful way.

Example: Specifying visual channels with the encoding property
{\n  ...,\n  \"mark\": \"rect\",\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"from\", \"type\": \"index\"\n    },\n    \"x2\": {\n      \"field\": \"to\"\n    },\n    \"color\": {\n      \"field\": \"category\", \"type\": \"nominal\"\n    }\n  },\n  ...\n}\n

The schematic example above uses the \"rect\" mark to represent the data objects. The \"from\" field is mapped to the positional \"x\" channel, and so on. You can adjust the mapping by specifying a scale for the channel.
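
For contrast, here is a minimal sketch in which color is given as a static mark property, so all rectangles share one color, while the positional channels remain data-driven. The \"from\" and \"to\" fields are carried over from the schematic example above, and the color value is an arbitrary choice.

{\n  ...,\n  \"mark\": {\n    \"type\": \"rect\",\n    \"color\": \"crimson\"\n  },\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"from\", \"type\": \"index\"\n    },\n    \"x2\": {\n      \"field\": \"to\"\n    }\n  },\n  ...\n}\n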

"},{"location":"grammar/mark/#channels","title":"Channels","text":""},{"location":"grammar/mark/#position-channels","title":"Position channels","text":"

All marks support the two position channels, which define the mark instance's placement in the visualization. If a positional channel is left unspecified, the mark instance is placed at the center of the respective axis.
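
As a minimal sketch, the specification below encodes only the x channel, so the points should line up at the vertical center of the view. The generated sequence data is an assumption made for illustration.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 10, \"as\": \"x\" }\n  },\n  \"mark\": \"point\",\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\" }\n  }\n}\n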

"},{"location":"grammar/mark/#primary-channels","title":"Primary channels","text":"x The position on the x axis y The position on the y axis"},{"location":"grammar/mark/#secondary-channels","title":"Secondary channels","text":"

Some marks, such as \"rect\" and \"rule\", also support secondary positional channels, which allow specifying an interval that the mark should cover in the visualization.

x2: The secondary position on the x axis

y2: The secondary position on the y axis

"},{"location":"grammar/mark/#other-channels","title":"Other channels","text":"

color: Color of the mark. Affects fill or stroke, depending on the filled property.

fill: Fill color

stroke: Stroke color

opacity: Opacity of the mark. Affects fillOpacity or strokeOpacity, depending on the filled property.

fillOpacity: Fill opacity

strokeOpacity: Stroke opacity

strokeWidth: Stroke width in pixels

size: Depends on the mark. \"point\": the area of the rectangle that encloses the mark instance. \"rule\" and \"link\": stroke width. \"text\": font size.

shape: Shape of \"point\" marks.

angle: Rotational angle of \"point\" and \"text\" marks.

text: Text that the \"text\" mark should render for a mark instance.

"},{"location":"grammar/mark/#channels-for-sample-collections","title":"Channels for sample collections","text":"

The GenomeSpy app supports an additional channel.

sample Defines the track (or facet) for the sample"},{"location":"grammar/mark/#visual-encoding","title":"Visual Encoding","text":"

GenomeSpy provides several methods for controlling how data is mapped to visual channels. The most common method is to map a field of the data to a channel, but you can also use expressions, values, or data values belonging to the data domain.

Except for the value method, all methods require specifying the data type using the type property, which must be one of \"quantitative\", \"nominal\", \"ordinal\", \"index\", or \"locus\". The first three types are equivalent to the Vega-Lite types of the same name.

"},{"location":"grammar/mark/#field","title":"Field","text":"

field maps a field (or column) of the data to a visual channel.

{\n  \"encoding\": {\n    \"color\": { \"field\": \"significance\", \"type\": \"ordinal\" }\n  },\n  ...\n}\n
"},{"location":"grammar/mark/#expression","title":"Expression","text":"

expr applies an expression before passing the value for a scale transformation.

{\n  \"encoding\": {\n    \"color\": { \"expr\": \"datum.score > 10\", \"type\": \"nominal\" }\n  },\n  ...\n}\n
"},{"location":"grammar/mark/#value","title":"Value","text":"

value defines a value on the channel's range, skipping the scale transformation.

{\n  \"encoding\": {\n    \"color\": { \"value\": \"red\" }\n  },\n  ...\n}\n
"},{"location":"grammar/mark/#datum","title":"Datum","text":"

datum defines a value on the domain of the scale used on the channel. Thus, the scale transformation will be applied.

{\n  \"encoding\": {\n    \"color\": { \"datum\": \"important\", \"type\": \"ordinal\" }\n  },\n  ...\n}\n
"},{"location":"grammar/mark/#chrom-and-pos","title":"Chrom and Pos","text":"

See Working with Genomic Data.

"},{"location":"grammar/mark/link/","title":"Link","text":"

The \"link\" mark displays each data item as a curve that connects two points. The mark can be used to display structural variation and interactions, for example. The mark has several different linkShapes that control how the curve is drawn.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 30, \"as\": \"z\" }\n  },\n  \"transform\": [\n    { \"type\": \"formula\", \"expr\": \"round(random() * 800)\", \"as\": \"x\" },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"round(datum.x + pow(2, random() * 10))\",\n      \"as\": \"x2\"\n    }\n  ],\n  \"mark\": \"link\",\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"index\" },\n    \"x2\": { \"field\": \"x2\" }\n  }\n}\n
"},{"location":"grammar/mark/link/#channels","title":"Channels","text":"

In addition to the primary and secondary position channels and the color and opacity channels, the link mark supports the size channel.

"},{"location":"grammar/mark/link/#properties","title":"Properties","text":"arcFadingDistance

Type: [number, number] | boolean | ExprRef

The range of the \"arc\" shape's fading distance in pixels. This property allows for making the arc's opacity fade out as it extends away from the chord. The fading distance is interpolated from one to zero between the interval defined by this property. Both false and [0, 0] disable fading.

Default value: false

arcHeightFactor

Type: number | ExprRef

Scaling factor for the \"arc\" shape's height. The default value 1.0 produces roughly circular arcs.

Default value: 1.0

clampApex

Type: boolean | ExprRef

Whether the apex of the \"dome\" shape is clamped to the viewport edge. When more than half of the dome is located outside the viewport, clamping allows for more accurate reading of the value encoded by the apex's position.

Default value: false

clip

Type: boolean | \"never\"

If true, the mark is clipped to the UnitView's rectangle. By default, clipping is enabled for marks that have zoomable positional scales.

color

Type: string | ExprRef

Color of the mark. Affects either fill or stroke, depending on the filled property.

linkShape

Type: \"arc\" | \"diagonal\" | \"line\" | \"dome\" | ExprRef

The shape of the link path.

The \"arc\" shape draws a circular arc between the two points. The apex of the arc resides on the left side of the line that connects the two points. The \"dome\" shape draws a vertical or horizontal arc with a specific height. The primary positional channel determines the apex of the arc and the secondary determines the endpoint placement. The \"diagonal\" shape draws an \"S\"-shaped curve between the two points. The \"line\" shape draws a straight line between the two points. See an example of the different shapes below.

Default value: \"arc\"

maxChordLength

Type: number | ExprRef

The maximum length of \"arc\" shape's chord in pixels. The chord is the line segment between the two points that define the arc. Limiting the chord length serves two purposes when zooming in close enough: 1) it prevents the arc from becoming a straight line and 2) it mitigates the limited precision of floating point numbers in arc rendering.

Default value: 50000

minArcHeight

Type: number | ExprRef

The minimum height of an \"arc\" shape. Makes very short links more clearly visible.

Default value: 1.5

minBufferSize

Type: number

Minimum size for WebGL buffers (number of data items). Allows for using bufferSubData() to update graphics.

This property is intended for internal use.

minPickingSize

Type: number | ExprRef

The minimum picking size invisibly increases the stroke width or point diameter of marks when pointing at them with the mouse cursor, making it easier to select them. The value is the minimum size in pixels.

Default value: 3.0 for \"link\" and 2.0 for \"point\"

noFadingOnPointSelection

Type: boolean | ExprRef

Disables fading of the link when a mark instance is subject to any point selection. As the fading distance is unavailable as a visual channel, this property allows for enhancing the visibility of the selected links.

Default value: true

opacity

Type: number | ExprRef

Opacity of the mark. Affects fillOpacity or strokeOpacity, depending on the filled property.

orient

Type: \"vertical\" | \"horizontal\" | ExprRef

The orientation of the link path. Either \"vertical\" or \"horizontal\". Only applies to diagonal links.

Default value: \"vertical\"

segments

Type: number | ExprRef

The number of segments in the b\u00e9zier curve. Affects the rendering quality and performance. Use a higher value for a smoother curve.

Default value: 101

size

Type: number | ExprRef

Stroke width of \"link\" and \"rule\" marks in pixels, the area of the bounding square of \"point\" mark, or the font size of \"text\" mark.

tooltip

Type: HandledTooltip | null

Tooltip handler. If null, no tooltip is shown. If string, specifies the tooltip handler to use.

x

Type: number | ExprRef

Position on the x axis.

x2

Type: number | ExprRef

The secondary position on the x axis.

xOffset

Type: number

Offsets of the x and x2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0

y

Type: number | ExprRef

Position on the y axis.

y2

Type: number | ExprRef

The secondary position on the y axis.

yOffset

Type: number

Offsets of the y and y2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0
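
Several of the properties above can be combined to tune the \"arc\" shape. The sketch below reuses the random data from the example at the top of this page; the property values are arbitrary and only illustrate the syntax. Going by the descriptions above, the arcs should be flatter than the default, fade out between 50 and 450 pixels from the chord, and be easier to pick with the mouse cursor.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 30, \"as\": \"z\" }\n  },\n  \"transform\": [\n    { \"type\": \"formula\", \"expr\": \"round(random() * 800)\", \"as\": \"x\" },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"round(datum.x + pow(2, random() * 10))\",\n      \"as\": \"x2\"\n    }\n  ],\n  \"mark\": {\n    \"type\": \"link\",\n    \"linkShape\": \"arc\",\n    \"arcHeightFactor\": 0.5,\n    \"arcFadingDistance\": [50, 450],\n    \"minArcHeight\": 3,\n    \"minPickingSize\": 5\n  },\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"index\" },\n    \"x2\": { \"field\": \"x2\" }\n  }\n}\n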

"},{"location":"grammar/mark/link/#examples","title":"Examples","text":""},{"location":"grammar/mark/link/#different-link-shapes-and-orientations","title":"Different link shapes and orientations","text":"

This example shows the different link shapes and orientations. All links have the same coordinates: { x: 2, y: 2, x2: 8, y2: 8 }. The links are arranged in a grid, with linkShape as columns (\"arc\", \"dome\", \"diagonal\", \"line\") and orient as rows (\"vertical\", \"horizontal\").

{\n  \"data\": { \"values\": [{ \"x\": 2, \"x2\": 8 }] },\n  \"resolve\": {\n    \"scale\": { \"x\": \"shared\", \"y\": \"shared\" },\n    \"axis\": { \"x\": \"shared\", \"y\": \"shared\" }\n  },\n\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"x\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"domain\": [0, 10] },\n      \"axis\": { \"grid\": true }\n    },\n    \"x2\": { \"field\": \"x2\" },\n    \"y\": {\n      \"field\": \"x\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"domain\": [0, 10] },\n      \"axis\": { \"grid\": true }\n    },\n    \"y2\": { \"field\": \"x2\" },\n    \"size\": { \"value\": 2 }\n  },\n\n  \"columns\": 4,\n  \"spacing\": 20,\n\n  \"concat\": [\n    { \"mark\": { \"type\": \"link\", \"linkShape\": \"arc\", \"orient\": \"vertical\" } },\n    { \"mark\": { \"type\": \"link\", \"linkShape\": \"dome\", \"orient\": \"vertical\" } },\n    {\n      \"mark\": { \"type\": \"link\", \"linkShape\": \"diagonal\", \"orient\": \"vertical\" }\n    },\n    { \"mark\": { \"type\": \"link\", \"linkShape\": \"line\", \"orient\": \"vertical\" } },\n    { \"mark\": { \"type\": \"link\", \"linkShape\": \"arc\", \"orient\": \"horizontal\" } },\n    { \"mark\": { \"type\": \"link\", \"linkShape\": \"dome\", \"orient\": \"horizontal\" } },\n    {\n      \"mark\": {\n        \"type\": \"link\",\n        \"linkShape\": \"diagonal\",\n        \"orient\": \"horizontal\"\n      }\n    },\n    { \"mark\": { \"type\": \"link\", \"linkShape\": \"line\", \"orient\": \"horizontal\" } }\n  ]\n}\n
"},{"location":"grammar/mark/link/#varying-the-dome-height","title":"Varying the dome height","text":"

This example uses the \"dome\" shape to draw links with varying heights. The height is determined by the y channel. The clampApex property is set to true to ensure that the apex of the dome is always visible. Try to zoom in and pan around to see it in action.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 20, \"as\": \"z\" }\n  },\n\n  \"transform\": [\n    { \"type\": \"formula\", \"expr\": \"round(random() * 1000)\", \"as\": \"x\" },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"round(datum.x + random() * 500)\",\n      \"as\": \"x2\"\n    },\n    { \"type\": \"formula\", \"expr\": \"random() * 1000 - 500\", \"as\": \"y\" }\n  ],\n\n  \"mark\": {\n    \"type\": \"link\",\n    \"linkShape\": \"dome\",\n    \"orient\": \"vertical\",\n    \"clampApex\": true,\n    \"color\": \"gray\"\n  },\n\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"index\" },\n    \"x2\": { \"field\": \"x2\" },\n    \"y\": {\n      \"field\": \"y\",\n      \"type\": \"quantitative\",\n      \"axis\": { \"grid\": true }\n    }\n  }\n}\n
"},{"location":"grammar/mark/point/","title":"Point","text":"

Point mark displays each data item as a symbol. Points are often used to create a scatter plot. In the genomic context, they could represent, for example, point mutations at genomic loci.

{\n  \"data\": { \"url\": \"sincos.csv\" },\n  \"mark\": \"point\",\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n    \"y\": { \"field\": \"sin\", \"type\": \"quantitative\" },\n    \"size\": { \"field\": \"x\", \"type\": \"quantitative\" }\n  }\n}\n
"},{"location":"grammar/mark/point/#channels","title":"Channels","text":"

In addition to standard position channels and color, opacity, and strokeWidth channels, point mark has the following channels: size, shape, dx, and dy.

"},{"location":"grammar/mark/point/#properties","title":"Properties","text":"angle

Type: number | ExprRef

The rotation angle in degrees.

Default value: 0

clip

Type: boolean | \"never\"

If true, the mark is clipped to the UnitView's rectangle. By default, clipping is enabled for marks that have zoomable positional scales.

color

Type: string | ExprRef

Color of the mark. Affects either fill or stroke, depending on the filled property.

fill

Type: string | ExprRef

The fill color.

fillGradientStrength

Type: number | ExprRef

Gradient strength controls the amount of the gradient eye-candy effect in the fill color. Valid values are between 0 and 1.

Default value: 0

fillOpacity

Type: number | ExprRef

The fill opacity. Value between 0 and 1.

filled

Type: boolean

Whether the color represents the fill color (true) or the stroke color (false).

geometricZoomBound

Type: number

Enables geometric zooming. The value is the base two logarithmic zoom level where the maximum point size is reached.

Default value: 0

inwardStroke

Type: boolean | ExprRef

Whether the stroke should grow only inwards, i.e., the diameter of the outline is not affected by the stroke width. Thus, a point that has a zero size has no visible stroke. This allows strokes to be used with geometric zoom, etc.

Default value: false

minBufferSize

Type: number

Minimum size for WebGL buffers (number of data items). Allows for using bufferSubData() to update graphics.

This property is intended for internal use.

minPickingSize

Type: number | ExprRef

The minimum picking size invisibly increases the stroke width or point diameter of marks when pointing at them with the mouse cursor, making it easier to select them. The value is the minimum size in pixels.

Default value: 3.0 for \"link\" and 2.0 for \"point\"

opacity

Type: number | ExprRef

Opacity of the mark. Affects fillOpacity or strokeOpacity, depending on the filled property.

semanticZoomFraction

Type: number | ExprRef

TODO

Default value: 0.02

shape

Type: string | ExprRef

One of \"circle\", \"square\", \"cross\", \"diamond\", \"triangle-up\", \"triangle-down\", \"triangle-right\", \"triangle-left\", \"tick-up\", \"tick-down\", \"tick-right\", or \"tick-left\"

Default value: \"circle\"

size

Type: number | ExprRef

Stroke width of \"link\" and \"rule\" marks in pixels, the area of the bounding square of \"point\" mark, or the font size of \"text\" mark.

stroke

Type: string | ExprRef

The stroke color

strokeOpacity

Type: number | ExprRef

The stroke opacity. Value between 0 and 1.

strokeWidth

Type: number | ExprRef

The stroke width in pixels.

tooltip

Type: HandledTooltip | null

Tooltip handler. If null, no tooltip is shown. If string, specifies the tooltip handler to use.

x

Type: number | ExprRef

Position on the x axis.

xOffset

Type: number

Offsets of the x and x2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0

y

Type: number | ExprRef

Position on the y axis.

yOffset

Type: number

Offsets of the y and y2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0
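
The following sketch combines several of the properties above into a single mark definition. It reuses the sincos.csv file from the example at the top of this page; the property values are arbitrary and only illustrate the syntax.

{\n  \"data\": { \"url\": \"sincos.csv\" },\n  \"mark\": {\n    \"type\": \"point\",\n    \"shape\": \"diamond\",\n    \"size\": 300,\n    \"fill\": \"orange\",\n    \"stroke\": \"black\",\n    \"strokeWidth\": 2,\n    \"inwardStroke\": true,\n    \"fillGradientStrength\": 0.5\n  },\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n    \"y\": { \"field\": \"sin\", \"type\": \"quantitative\" }\n  }\n}\n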

"},{"location":"grammar/mark/point/#examples","title":"Examples","text":""},{"location":"grammar/mark/point/#plenty-of-points","title":"Plenty of points","text":"

The example below demonstrates how points can be varied by using shape, fill, size, strokeWidth, and angle channels.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 160, \"as\": \"z\" }\n  },\n\n  \"transform\": [\n    { \"type\": \"formula\", \"expr\": \"datum.z % 20\", \"as\": \"x\" },\n    { \"type\": \"formula\", \"expr\": \"floor(datum.z / 20)\", \"as\": \"y\" }\n  ],\n\n  \"mark\": {\n    \"type\": \"point\",\n    \"stroke\": \"black\"\n  },\n\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"ordinal\", \"axis\": null },\n    \"y\": { \"field\": \"y\", \"type\": \"ordinal\", \"axis\": null },\n    \"shape\": { \"field\": \"x\", \"type\": \"nominal\" },\n    \"fill\": { \"field\": \"x\", \"type\": \"nominal\" },\n    \"size\": {\n      \"field\": \"x\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"type\": \"pow\", \"exponent\": 2, \"range\": [0, 900] }\n    },\n    \"strokeWidth\": {\n      \"field\": \"y\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"range\": [0, 4] }\n    },\n    \"angle\": {\n      \"field\": \"y\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"range\": [0, 45] }\n    }\n  }\n}\n
"},{"location":"grammar/mark/point/#zoom-behavior","title":"Zoom behavior","text":"

Although points are infinitely small on the real number line, they have a specific diameter on the screen. Thus, closely located points tend to overlap each other. Decreasing the point size reduces the probability of overlap, but in a zoomed-in view, the plot may become overly sparse.

To control overplotting, the point mark provides two zooming behaviors that adjust the point size and visibility based on the zoom level.

"},{"location":"grammar/mark/point/#geometric-zoom","title":"Geometric zoom","text":"

Geometric zoom scales the point size down if the current zoom level is lower than the specified level (bound). The geometricZoomBound mark property enables geometric zooming. Its value is the negative base two logarithm of the relative width of the visible domain at which the points reach their full size. For example, with 0 (the default), full-size points are always shown; with 1, they are shown when half of the domain is visible; with 2, when a quarter is visible; and so on.

The example below displays 200 000 semi-randomly generated points. The points reach their full size when 1 / 2^10.5 of the domain is visible, which corresponds to a zoom factor of about 1450 (2^10.5 is roughly 1448).

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 200000, \"as\": \"x\" }\n  },\n  \"transform\": [\n    { \"type\": \"formula\", \"expr\": \"random() * 0.682\", \"as\": \"u\" },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"((datum.u % 1e-8 > 5e-9 ? 1 : -1) * (sqrt(-log(max(1e-9, datum.u))) - 0.618)) * 1.618 + sin(datum.x / 10000)\",\n      \"as\": \"y\"\n    }\n  ],\n  \"mark\": {\n    \"type\": \"point\",\n    \"geometricZoomBound\": 10.5\n  },\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\", \"scale\": { \"zoom\": true } },\n    \"y\": { \"field\": \"y\", \"type\": \"quantitative\" },\n    \"size\": { \"value\": 200 },\n    \"opacity\": { \"value\": 0.6 }\n  }\n}\n

Tip

You can use geometric zoom to improve rendering performance. Smaller points are faster to render than large points.

"},{"location":"grammar/mark/point/#semantic-zoom","title":"Semantic zoom","text":"

The score-based semantic zoom adjusts the point visibility by coupling a score threshold to the current zoom level. The semanticScore channel enables the semantic zoom and specifies the score field. The semanticZoomFraction property controls the fraction of data items to show in the fully zoomed-out view, i.e., it specifies the threshold score. The fraction is scaled as the viewport is zoomed. Thus, if the data is distributed roughly uniformly along the zoomed axis, a roughly constant number of points is visible at all zoom levels. The score can be arbitrarily distributed, as the threshold is computed using p-quantiles.

The example below has 200 000 semi-randomly generated points with an exponentially distributed score. As the view is zoomed in, new points appear. Their number in the viewport stays approximately constant until the lowest possible score has been reached.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 200000, \"as\": \"x\" }\n  },\n  \"transform\": [\n    { \"type\": \"formula\", \"expr\": \"random() * 0.682\", \"as\": \"u\" },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"((datum.u % 1e-8 > 5e-9 ? 1 : -1) * (sqrt(-log(max(1e-9, datum.u))) - 0.618)) * 1.618\",\n      \"as\": \"y\"\n    },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"-log(random())\",\n      \"as\": \"score\"\n    }\n  ],\n  \"mark\": {\n    \"type\": \"point\",\n    \"semanticZoomFraction\": 0.002\n  },\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\", \"scale\": { \"zoom\": true } },\n    \"y\": { \"field\": \"y\", \"type\": \"quantitative\" },\n    \"opacity\": {\n      \"field\": \"score\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"range\": [0.1, 1] }\n    },\n    \"semanticScore\": { \"field\": \"score\", \"type\": \"quantitative\" },\n    \"size\": { \"value\": 100 }\n  }\n}\n

Tip

The score-based semantic zoom is great for filtering point mutations and indels that are scored using CADD, for example.

"},{"location":"grammar/mark/rect/","title":"Rect","text":"

Rect mark displays each data item as a rectangle.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 20, \"as\": \"z\" }\n  },\n  \"transform\": [\n    { \"type\": \"formula\", \"as\": \"x\", \"expr\": \"random()\" },\n    { \"type\": \"formula\", \"as\": \"x2\", \"expr\": \"datum.x + random() * 0.3\" },\n    { \"type\": \"formula\", \"as\": \"y\", \"expr\": \"random()\" },\n    { \"type\": \"formula\", \"as\": \"y2\", \"expr\": \"datum.y + random() * 0.4\" }\n  ],\n  \"mark\": {\n    \"type\": \"rect\",\n    \"strokeWidth\": 2,\n    \"stroke\": \"#404040\",\n    \"cornerRadius\": 5\n  },\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n    \"x2\": { \"field\": \"x2\" },\n    \"y\": { \"field\": \"y\", \"type\": \"quantitative\" },\n    \"y2\": { \"field\": \"y2\" },\n    \"color\": { \"field\": \"z\", \"type\": \"quantitative\" }\n  }\n}\n
"},{"location":"grammar/mark/rect/#channels","title":"Channels","text":"

Rect mark supports the primary and secondary position channels and the color, stroke, fill, opacity, strokeOpacity, fillOpacity, and strokeWidth channels.

"},{"location":"grammar/mark/rect/#properties","title":"Properties","text":"clip

Type: boolean | \"never\"

If true, the mark is clipped to the UnitView's rectangle. By default, clipping is enabled for marks that have zoomable positional scales.

color

Type: string | ExprRef

Color of the mark. Affects either fill or stroke, depending on the filled property.

cornerRadius

Type: number | ExprRef

Radius of the rounded corners.

Default value: 0

cornerRadiusBottomLeft

Type: number | ExprRef

Radius of the bottom left rounded corner. Has higher precedence than cornerRadius.

Default value: (None)

cornerRadiusBottomRight

Type: number | ExprRef

Radius of the bottom right rounded corner. Has higher precedence than cornerRadius.

Default value: (None)

cornerRadiusTopLeft

Type: number | ExprRef

Radius of the top left rounded corner. Has higher precedence than cornerRadius.

Default value: (None)

cornerRadiusTopRight

Type: number | ExprRef

Radius of the top right rounded corner. Has higher precedence than cornerRadius.

Default value: (None)

fill

Type: string | ExprRef

The fill color.

fillOpacity

Type: number | ExprRef

The fill opacity. Value between 0 and 1.

filled

Type: boolean

Whether the color represents the fill color (true) or the stroke color (false).

minBufferSize

Type: number

Minimum size for WebGL buffers (number of data items). Allows for using bufferSubData() to update graphics.

This property is intended for internal use.

minHeight

Type: number | ExprRef

The minimum height of a rectangle in pixels. The property clamps rectangles' heights.

Default value: 0

minOpacity

Type: number | ExprRef

Clamps the minimum size-dependent opacity. The property does not affect the opacity channel. Valid values are between 0 and 1.

When a rectangle would be smaller than what is specified in minHeight and minWidth, it is faded out proportionally. Example: a rectangle would be rendered as one pixel wide, but minWidth clamps it to five pixels. The rectangle is actually rendered as five pixels wide, but its opacity is multiplied by 0.2. With this setting, you can limit the factor to, for example, 0.5 to keep the rectangles more clearly visible.

Default value: 0

minWidth

Type: number | ExprRef

The minimum width of a rectangle in pixels. The property clamps rectangles' widths when the viewport is zoomed out.

This property also reduces flickering of very narrow rectangles when zooming. The value should generally be at least one.

Default value: 1

opacity

Type: number | ExprRef

Opacity of the mark. Affects fillOpacity or strokeOpacity, depending on the filled property.

stroke

Type: string | ExprRef

The stroke color

strokeOpacity

Type: number | ExprRef

The stroke opacity. Value between 0 and 1.

strokeWidth

Type: number | ExprRef

The stroke width in pixels.

tooltip

Type: HandledTooltip | null

Tooltip handler. If null, no tooltip is shown. If string, specifies the tooltip handler to use.

x

Type: number | ExprRef

Position on the x axis.

x2

Type: number | ExprRef

The secondary position on the x axis.

xOffset

Type: number

Offsets of the x and x2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0

y

Type: number | ExprRef

Position on the y axis.

y2

Type: number | ExprRef

The secondary position on the y axis.

yOffset

Type: number

Offsets of the y and y2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0
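
As a sketch of how the size-clamping properties work together, the specification below draws very narrow rectangles, clamps their width to at least five pixels, and limits the resulting fade-out to an opacity factor of 0.5. The generated data and the property values are assumptions made for illustration.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 20, \"as\": \"z\" }\n  },\n  \"transform\": [\n    { \"type\": \"formula\", \"as\": \"x\", \"expr\": \"random()\" },\n    { \"type\": \"formula\", \"as\": \"x2\", \"expr\": \"datum.x + random() * 0.001\" }\n  ],\n  \"mark\": {\n    \"type\": \"rect\",\n    \"minWidth\": 5,\n    \"minOpacity\": 0.5\n  },\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\", \"scale\": { \"zoom\": true } },\n    \"x2\": { \"field\": \"x2\" }\n  }\n}\n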

"},{"location":"grammar/mark/rect/#examples","title":"Examples","text":""},{"location":"grammar/mark/rect/#heatmap","title":"Heatmap","text":"

When used with \"band\" or \"index\" scales, the rectangles fill the whole bands when only the primary positional channel is defined.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 800, \"as\": \"z\" }\n  },\n  \"transform\": [\n    { \"type\": \"formula\", \"as\": \"y\", \"expr\": \"floor(datum.z / 40)\" },\n    { \"type\": \"formula\", \"as\": \"x\", \"expr\": \"datum.z % 40\" },\n    {\n      \"type\": \"formula\",\n      \"as\": \"z\",\n      \"expr\": \"sin(datum.x / 8) + cos(datum.y / 10 - 0.5 + sin(datum.x / 20) * 2)\"\n    }\n  ],\n  \"mark\": \"rect\",\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"index\" },\n    \"y\": { \"field\": \"y\", \"type\": \"index\" },\n    \"color\": {\n      \"field\": \"z\",\n      \"type\": \"quantitative\",\n      \"scale\": {\n        \"scheme\": \"magma\"\n      }\n    }\n  }\n}\n
"},{"location":"grammar/mark/rect/#bars","title":"Bars","text":"
{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 60, \"as\": \"x\" }\n  },\n  \"transform\": [\n    {\n      \"type\": \"formula\",\n      \"expr\": \"sin((datum.x - 30) / 4) + (datum.x - 30) / 30\",\n      \"as\": \"y\"\n    }\n  ],\n  \"mark\": \"rect\",\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"index\", \"scale\": { \"padding\": 0.1 } },\n    \"y\": { \"field\": \"y\", \"type\": \"quantitative\" },\n    \"y2\": { \"datum\": 0 },\n    \"color\": {\n      \"field\": \"y\",\n      \"type\": \"quantitative\",\n      \"scale\": {\n        \"type\": \"threshold\",\n        \"domain\": [0],\n        \"range\": [\"#ed553b\", \"#20639b\"]\n      }\n    }\n  }\n}\n
"},{"location":"grammar/mark/rule/","title":"Rule","text":"

Rule mark displays each data item as a line segment. Rules can span the whole width or height of the view. Alternatively, they may have specific endpoints.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 15, \"as\": \"y\" }\n  },\n  \"mark\": {\n    \"type\": \"rule\",\n    \"strokeDash\": [6, 3]\n  },\n  \"encoding\": {\n    \"x\": { \"field\": \"y\", \"type\": \"quantitative\" },\n    \"color\": { \"field\": \"y\", \"type\": \"nominal\" }\n  }\n}\n
"},{"location":"grammar/mark/rule/#channels","title":"Channels","text":"

Rule mark supports the primary and secondary position channels and the color, opacity, and size channels.

"},{"location":"grammar/mark/rule/#properties","title":"Properties","text":"clip

Type: boolean | \"never\"

If true, the mark is clipped to the UnitView's rectangle. By default, clipping is enabled for marks that have zoomable positional scales.

color

Type: string | ExprRef

Color of the mark. Affects either fill or stroke, depending on the filled property.

minBufferSize

Type: number

Minimum size for WebGL buffers (number of data items). Allows for using bufferSubData() to update graphics.

This property is intended for internal use.

minLength

Type: number | ExprRef

The minimum length of the rule in pixels. Use this property to ensure that very short ranged rules remain visible even when the user zooms out.

Default value: 0

opacity

Type: number | ExprRef

Opacity of the mark. Affects fillOpacity or strokeOpacity, depending on the filled property.

size

Type: number | ExprRef

Stroke width of \"link\" and \"rule\" marks in pixels, the area of the bounding square of \"point\" mark, or the font size of \"text\" mark.

strokeCap

Type: \"butt\" | \"square\" | \"round\" | ExprRef

The style of stroke ends. Available choices: \"butt\", \"round\", and \"square\".

Default value: \"butt\"

strokeDash

Type: array

An array of alternating stroke and gap lengths, or null for solid strokes.

Default value: null

strokeDashOffset

Type: number

An offset for the stroke dash pattern.

Default value: 0

tooltip

Type: HandledTooltip | null

Tooltip handler. If null, no tooltip is shown. If string, specifies the tooltip handler to use.

x

Type: number | ExprRef

Position on the x axis.

x2

Type: number | ExprRef

The secondary position on the x axis.

xOffset

Type: number

Offsets of the x and x2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0

y

Type: number | ExprRef

Position on the y axis.

y2

Type: number | ExprRef

The secondary position on the y axis.

yOffset

Type: number

Offsets of the y and y2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0

"},{"location":"grammar/mark/rule/#examples","title":"Examples","text":""},{"location":"grammar/mark/rule/#ranged-rules","title":"Ranged rules","text":"
{\n  \"data\": {\n    \"values\": [\n      { \"y\": \"A\", \"x\": 2, \"x2\": 7 },\n      { \"y\": \"B\", \"x\": 0, \"x2\": 3 },\n      { \"y\": \"B\", \"x\": 5, \"x2\": 6 },\n      { \"y\": \"C\", \"x\": 4, \"x2\": 8 },\n      { \"y\": \"D\", \"x\": 1, \"x2\": 5 }\n    ]\n  },\n  \"mark\": {\n    \"type\": \"rule\",\n    \"size\": 10,\n    \"strokeCap\": \"round\"\n  },\n  \"encoding\": {\n    \"y\": { \"field\": \"y\", \"type\": \"nominal\" },\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n    \"x2\": { \"field\": \"x2\" }\n  }\n}\n
"},{"location":"grammar/mark/rule/#plenty-of-diagonal-rules","title":"Plenty of diagonal rules","text":"
{\n  \"width\": 300,\n  \"height\": 300,\n\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 50 }\n  },\n\n  \"transform\": [\n    {\n      \"type\": \"formula\",\n      \"expr\": \"random()\",\n      \"as\": \"x\"\n    },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"datum.x + random() * 0.5 - 0.25\",\n      \"as\": \"x2\"\n    },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"random()\",\n      \"as\": \"y\"\n    },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"datum.y + random() * 0.5 - 0.25\",\n      \"as\": \"y2\"\n    },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"random()\",\n      \"as\": \"size\"\n    }\n  ],\n\n  \"mark\": {\n    \"type\": \"rule\",\n    \"strokeCap\": \"round\"\n  },\n\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"x\",\n      \"type\": \"quantitative\"\n    },\n    \"x2\": { \"field\": \"x2\" },\n    \"y\": {\n      \"field\": \"y\",\n      \"type\": \"quantitative\"\n    },\n    \"y2\": { \"field\": \"y2\" },\n    \"size\": {\n      \"field\": \"size\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"type\": \"pow\", \"range\": [0, 10] }\n    },\n    \"color\": {\n      \"field\": \"x\",\n      \"type\": \"nominal\",\n      \"scale\": { \"scheme\": \"category20\" }\n    }\n  }\n}\n
"},{"location":"grammar/mark/text/","title":"Text","text":"

Text mark displays each data item as text.

{\n  \"data\": {\n    \"values\": [\n      { \"x\": 1, \"text\": \"Hello\" },\n      { \"x\": 2, \"text\": \"world!\" }\n    ]\n  },\n  \"mark\": \"text\",\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"ordinal\" },\n    \"color\": { \"field\": \"x\", \"type\": \"nominal\" },\n    \"text\": { \"field\": \"text\" },\n    \"size\": { \"value\": 100 }\n  }\n}\n
"},{"location":"grammar/mark/text/#channels","title":"Channels","text":"

In addition to the primary and secondary position channels and the color and opacity channels, text mark has the following channels: text, size, and angle.

"},{"location":"grammar/mark/text/#properties","title":"Properties","text":"align

Type: \"left\" | \"center\" | \"right\"

The horizontal alignment of the text. One of \"left\", \"center\", or \"right\".

Default value: \"left\"

angle

Type: number | ExprRef

The rotation angle in degrees.

Default value: 0

baseline

Type: \"top\" | \"middle\" | \"bottom\" | \"alphabetic\"

The vertical alignment of the text. One of \"top\", \"middle\", \"bottom\", or \"alphabetic\".

Default value: \"bottom\"

clip

Type: boolean | \"never\"

If true, the mark is clipped to the UnitView's rectangle. By default, clipping is enabled for marks that have zoomable positional scales.

color

Type: string | ExprRef

Color of the mark. Affects either fill or stroke, depending on the filled property.

dx

Type: number

The horizontal offset between the text and its anchor point, in pixels. Applied after the rotation by angle.

dy

Type: number

The vertical offset between the text and its anchor point, in pixels. Applied after the rotation by angle.

fitToBand

Type: boolean | ExprRef

If true, sets the secondary positional channel that allows the text to be squeezed (see the squeeze property). Can be used when: 1) \"band\", \"index\", or \"locus\" scale is being used and 2) only the primary positional channel is specified.

Default value: false

flushX

Type: boolean | ExprRef

If true, the text is kept inside the viewport when the range of x and x2 intersect the viewport.

flushY

Type: boolean | ExprRef

If true, the text is kept inside the viewport when the range of y and y2 intersect the viewport.

font

Type: string

The font typeface. GenomeSpy uses SDF versions of Google Fonts. Check their availability at the A-Frame Fonts repository. System fonts are not supported.

Default value: \"Lato\"

fontStyle

Type: \"normal\" | \"italic\"

The font style. Valid values: \"normal\" and \"italic\".

Default value: \"normal\"

fontWeight

Type: number | \"thin\" | \"light\" | \"regular\" | \"normal\" | \"medium\" | \"bold\" | \"black\"

The font weight. The following strings and numbers are valid values: \"thin\" (100), \"light\" (300), \"regular\" (400), \"normal\" (400), \"medium\" (500), \"bold\" (700), \"black\" (900)

Default value: \"regular\"

logoLetters

Type: boolean | ExprRef

Stretch letters so that they can be used with sequence logos, etc...

minBufferSize

Type: number

Minimum size for WebGL buffers (number of data items). Allows for using bufferSubData() to update graphics.

This property is intended for internal use.

opacity

Type: number | ExprRef

Opacity of the mark. Affects fillOpacity or strokeOpacity, depending on the filled property.

paddingX

Type: number | ExprRef

The horizontal padding, in pixels, when the x2 channel is used for ranged text.

Default value: 0

paddingY

Type: number | ExprRef

The vertical padding, in pixels, when the y2 channel is used for ranged text.

Default value: 0

size

Type: number | ExprRef

The font size in pixels.

Default value: 11

squeeze

Type: boolean | ExprRef

If the squeeze property is true and secondary positional channels (x2 and/or y2) are used, the text is scaled to fit the mark's width and/or height.

Default value: true

text

Type: Scalar | ExprRef

The text to display. The format of numeric data can be customized by setting a format specifier to channel definition's format property.

Default value: \"\"

tooltip

Type: HandledTooltip | null

Tooltip handler. If null, no tooltip is shown. If string, specifies the tooltip handler to use.

viewportEdgeFadeDistanceBottom

Type: number

TODO

viewportEdgeFadeDistanceLeft

Type: number

TODO

viewportEdgeFadeDistanceRight

Type: number

TODO

viewportEdgeFadeDistanceTop

Type: number

TODO

viewportEdgeFadeWidthBottom

Type: number

TODO

viewportEdgeFadeWidthLeft

Type: number

TODO

viewportEdgeFadeWidthRight

Type: number

TODO

viewportEdgeFadeWidthTop

Type: number

TODO

x

Type: number | ExprRef

Position on the x axis.

x2

Type: number | ExprRef

The secondary position on the x axis.

xOffset

Type: number

Offsets of the x and x2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0

y

Type: number | ExprRef

Position on the y axis.

y2

Type: number | ExprRef

The secondary position on the y axis.

yOffset

Type: number

Offsets of the y and y2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0

"},{"location":"grammar/mark/text/#examples","title":"Examples","text":"

GenomeSpy's text mark provides several tricks useful with segmented data and zoomable visualizations.

"},{"location":"grammar/mark/text/#ranged-text","title":"Ranged text","text":"

The x2 and y2 channels allow for positioning the text inside a segment. The text is either squeezed (default) or hidden when it does not fit in the segment. The squeeze property controls the behavior.

The example below has two layers: gray rectangles at the bottom and ranged text on the top. Try to zoom and pan to see how they behave!

{\n  \"data\": {\n    \"values\": [\"A\", \"B\", \"C\", \"D\", \"E\", \"F\", \"G\"]\n  },\n  \"transform\": [\n    { \"type\": \"formula\", \"expr\": \"round(random() * 100)\", \"as\": \"a\" },\n    { \"type\": \"formula\", \"expr\": \"datum.a + round(random() * 60)\", \"as\": \"b\" }\n  ],\n  \"encoding\": {\n    \"x\": { \"field\": \"a\", \"type\": \"quantitative\", \"scale\": { \"zoom\": true } },\n    \"x2\": { \"field\": \"b\" },\n    \"y\": {\n      \"field\": \"data\",\n      \"type\": \"nominal\",\n      \"scale\": {\n        \"padding\": 0.3\n      }\n    }\n  },\n  \"layer\": [\n    {\n      \"mark\": \"rect\",\n      \"encoding\": { \"color\": { \"value\": \"#eaeaea\" } }\n    },\n    {\n      \"mark\": {\n        \"type\": \"text\",\n        \"align\": \"center\",\n        \"baseline\": \"middle\",\n        \"paddingX\": 5\n      },\n      \"encoding\": {\n        \"text\": {\n          \"expr\": \"'Hello ' + floor(datum.a)\"\n        },\n        \"size\": { \"value\": 12 }\n      }\n    }\n  ]\n}\n
"},{"location":"grammar/mark/text/#sequence-logo","title":"Sequence logo","text":"

The example below demonstrates the use of the logoLetters, squeeze, and fitToBand properties to ensure that the letters fully cover the rectangles defined by the primary and secondary positional channels. Not all fonts look good in sequence logos, but Source Sans Pro seems decent.

{\n  \"data\": {\n    \"values\": [\n      { \"pos\": 1, \"base\": \"A\", \"count\": 2 },\n      { \"pos\": 1, \"base\": \"C\", \"count\": 3 },\n      { \"pos\": 1, \"base\": \"T\", \"count\": 5 },\n      { \"pos\": 2, \"base\": \"A\", \"count\": 7 },\n      { \"pos\": 2, \"base\": \"C\", \"count\": 3 },\n      { \"pos\": 3, \"base\": \"A\", \"count\": 10 },\n      { \"pos\": 4, \"base\": \"T\", \"count\": 9 },\n      { \"pos\": 4, \"base\": \"G\", \"count\": 1 },\n      { \"pos\": 5, \"base\": \"G\", \"count\": 8 },\n      { \"pos\": 6, \"base\": \"G\", \"count\": 7 }\n    ]\n  },\n  \"transform\": [\n    {\n      \"type\": \"stack\",\n      \"field\": \"count\",\n      \"groupby\": [\"pos\"],\n      \"offset\": \"information\",\n      \"as\": [\"_y0\", \"_y1\"],\n      \"baseField\": \"base\",\n      \"sort\": { \"field\": \"count\", \"order\": \"ascending\" }\n    }\n  ],\n  \"encoding\": {\n    \"x\": { \"field\": \"pos\", \"type\": \"index\" },\n    \"y\": {\n      \"field\": \"_y0\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"domain\": [0, 2] },\n      \"title\": \"Information\"\n    },\n    \"y2\": { \"field\": \"_y1\" },\n    \"text\": { \"field\": \"base\", \"type\": \"nominal\" },\n    \"color\": {\n      \"field\": \"base\",\n      \"type\": \"nominal\",\n      \"scale\": {\n        \"type\": \"ordinal\",\n        \"domain\": [\"A\", \"C\", \"T\", \"G\"],\n        \"range\": [\"#7BD56C\", \"#FF9B9B\", \"#86BBF1\", \"#FFC56C\"]\n      }\n    }\n  },\n  \"mark\": {\n    \"type\": \"text\",\n    \"font\": \"Source Sans Pro\",\n    \"fontWeight\": 700,\n    \"size\": 100,\n    \"squeeze\": true,\n    \"fitToBand\": true,\n\n    \"paddingX\": 0,\n    \"paddingY\": 0,\n\n    \"logoLetters\": true\n  }\n}\n
"},{"location":"grammar/transform/","title":"Data transformation","text":"

With transforms, you can build a pipeline that modifies the data before the data objects are mapped to mark instances. In an abstract sense, a transformation inputs a list of data items and outputs a list of new items that may be filtered, modified, or generated from the original items.

The data flow is a forest of data sources and subsequent transformations, which may form trees. For instance, a layer view might have a data source, which is then filtered and mutated in a different way for each child layer.

Departure from Vega-Lite

The notation of transforms differs from Vega-Lite to make adding new operations more straightforward. Each transform has to be specified using an explicit type property, as in the lower-level Vega visualization grammar. Thus, the transform type is not inferred from the presence of transform-specific properties.

"},{"location":"grammar/transform/#example","title":"Example","text":"

The following example uses the \"filter\" transform to retain only the rows that match the predicate expression.

{\n  ...,\n  \"data\": { ... },\n  \"transform\": [\n    {\n      \"type\": \"filter\",\n      \"expr\": \"datum.end - datum.start < 5000\"\n    }\n  ],\n  ...\n}\n
"},{"location":"grammar/transform/#debugging-the-data-flow","title":"Debugging the Data Flow","text":"

Complex visualizations may involve multiple data sources and transformations, which can make it difficult to understand the data flow, particularly when no data objects appear to pass through the flow. The Dataflow Inspector shows the structure of the data flow and allows you to inspect the parameters of each node, the number of propagated data objects, and a recorded copy of the first data object that passes through the node. The Inspector is currently available in the toolbar of the GenomeSpy App.

"},{"location":"grammar/transform/aggregate/","title":"Aggregate","text":"

The \"aggregate\" transform summarizes data fields using aggregate functions, such as \"sum\" or \"max\". The data can be grouped by one or more fields, which results in a list of objects with the grouped fields and the aggregate values.

"},{"location":"grammar/transform/aggregate/#parameters","title":"Parameters","text":"as

Type: array

The names for the output fields corresponding to each aggregated field. If not provided, names will be automatically created using the operation and field names (e.g., sum_field, average_field).

fields

Type: array

The data fields to apply aggregate functions to. This array should correspond with the ops and as arrays. If no fields or operations are specified, a count aggregation will be applied by default.

groupby

Type: array

The fields by which to group the data. If these are not defined, all data objects will be grouped into a single category.

ops

Type: array

The aggregation operations to be performed on the fields, such as \"sum\", \"average\", or \"count\".

"},{"location":"grammar/transform/aggregate/#available-aggregate-functions","title":"Available aggregate functions","text":"

Aggregate functions are applied to the data fields in each group.
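
The grouping semantics described above can be sketched in plain Python. This is only an illustration of the documented behavior (grouping by fields, default count, auto-generated output names), not GenomeSpy's implementation; the function name `aggregate` and its argument layout are assumptions made for the example.

```python
from collections import defaultdict

def aggregate(rows, groupby=(), fields=(), ops=(), as_=None):
    """Group rows and apply one aggregate op per field; count rows by default."""
    funcs = {
        "count": len,
        "sum": sum,
        "min": min,
        "max": max,
        "average": lambda values: sum(values) / len(values),
    }
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[g] for g in groupby)].append(row)

    out = []
    for key, members in groups.items():
        result = dict(zip(groupby, key))
        if not fields and not ops:
            # No fields or ops given: fall back to counting the group members.
            result["count"] = len(members)
        for i, (field, op) in enumerate(zip(fields, ops)):
            # Output names default to "<op>_<field>", e.g. "sum_field".
            name = as_[i] if as_ else f"{op}_{field}"
            result[name] = funcs[op]([m[field] for m in members])
        out.append(result)
    return out

rows = [
    {"x": "first", "y": 123},
    {"x": "first", "y": 456},
    {"x": "second", "y": 789},
]
counts = aggregate(rows, groupby=["x"])
```

Calling `aggregate(rows, ["x"], ["y", "y"], ["min", "max"], ["lo", "hi"])` would instead yield one object per group with the `lo` and `hi` fields.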

"},{"location":"grammar/transform/aggregate/#example","title":"Example","text":"

Given the following data:

x       y
first   123
first   456
second  789

... and configuration:

{\n  \"type\": \"aggregate\",\n  \"groupby\": [\"x\"]\n}\n

A new list of data objects is created:

x       count
first   2
second  1

"},{"location":"grammar/transform/aggregate/#calculating-min-and-max","title":"Calculating min and max","text":"
{\n  \"data\": {\n    \"values\": [\n      { \"Category\": \"A\", \"Value\": 5 },\n      { \"Category\": \"A\", \"Value\": 9 },\n      { \"Category\": \"A\", \"Value\": 9.5 },\n      { \"Category\": \"B\", \"Value\": 3 },\n      { \"Category\": \"B\", \"Value\": 5 },\n      { \"Category\": \"B\", \"Value\": 7.5 },\n      { \"Category\": \"B\", \"Value\": 8 }\n    ]\n  },\n\n  \"encoding\": {\n    \"y\": { \"field\": \"Category\", \"type\": \"nominal\" }\n  },\n\n  \"layer\": [\n    {\n      \"encoding\": {\n        \"x\": { \"field\": \"Value\", \"type\": \"quantitative\" }\n      },\n      \"mark\": \"point\"\n    },\n    {\n      \"transform\": [\n        {\n          \"type\": \"aggregate\",\n          \"groupby\": [\"Category\"],\n          \"fields\": [\"Value\", \"Value\"],\n          \"ops\": [\"min\", \"max\"],\n          \"as\": [\"minValue\", \"maxValue\"]\n        }\n      ],\n      \"encoding\": {\n        \"x\": { \"field\": \"minValue\", \"type\": \"quantitative\" },\n        \"x2\": { \"field\": \"maxValue\" }\n      },\n      \"mark\": \"rule\"\n    }\n  ]\n}\n
"},{"location":"grammar/transform/collect/","title":"Collect","text":"

The \"collect\" transform collects (buffers) the data items from the data flow into an internal array and optionally sorts them.

"},{"location":"grammar/transform/collect/#parameters","title":"Parameters","text":"groupby

Type: array

Arranges the data into consecutive batches based on the groups. This is mainly intended for internal use so that faceted data can be handled as batches.

sort

Type: CompareParams

The sort order.
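
As a rough illustration of what buffering and sorting by several fields means, the following Python sketch mimics the behavior. It is not GenomeSpy's code; the function name `collect` and the separate `fields`/`orders` arguments are assumptions that mirror the `field` and `order` arrays of the sort parameter.

```python
def collect(rows, fields=(), orders=()):
    """Buffer all rows into a list and optionally sort them by several fields."""
    buffered = list(rows)
    # Sort by the least significant field first. Python's sort is stable,
    # so the first field in the list ends up taking precedence.
    for field, order in reversed(list(zip(fields, orders))):
        buffered.sort(key=lambda row: row[field], reverse=(order == "descending"))
    return buffered

rows = [
    {"name": "a", "score": 1},
    {"name": "b", "score": 3},
    {"name": "c", "score": 2},
]
ranked = collect(rows, fields=["score"], orders=["descending"])
```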

"},{"location":"grammar/transform/collect/#example","title":"Example","text":"
{\n  \"type\": \"collect\",\n  \"sort\": {\n    \"field\": [\"score\"],\n    \"order\": [\"descending\"]\n  }\n}\n
"},{"location":"grammar/transform/coverage/","title":"Coverage","text":"

The \"coverage\" transform computes coverage for overlapping segments. The result is a new list of non-overlapping segments with the coverage values. The segments must be sorted by their start coordinates before passing them to the coverage transform.
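
To make the idea concrete, here is a minimal Python sketch of a coverage computation over half-open segments. It illustrates the concept only and is not the algorithm GenomeSpy uses: it sorts the breakpoints itself, so unlike the real transform it does not rely on presorted input. The function name `coverage` and the tuple-based input are assumptions.

```python
def coverage(segments, weights=None):
    """Turn overlapping (start, end) segments into non-overlapping
    (start, end, coverage) tuples. End coordinates are exclusive."""
    events = {}
    for i, (start, end) in enumerate(segments):
        weight = weights[i] if weights else 1
        events[start] = events.get(start, 0) + weight
        events[end] = events.get(end, 0) - weight

    result = []
    depth = 0
    prev = None
    for pos in sorted(events):
        if prev is not None and depth != 0:
            last = result[-1] if result else None
            if last and last[1] == prev and last[2] == depth:
                # Adjacent piece with the same coverage: extend it.
                result[-1] = (last[0], pos, depth)
            else:
                result.append((prev, pos, depth))
        depth += events[pos]
        prev = pos
    return result

pieces = coverage([(0, 4), (1, 3)])
```

Regions with zero coverage are simply omitted from the output in this sketch.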

"},{"location":"grammar/transform/coverage/#parameters","title":"Parameters","text":"as

Type: string

The output field for the computed coverage.

asChrom

Type: string

The output field for the chromosome.

Default: Same as chrom

asEnd

Type: string

The output field for the end coordinate.

Default: Same as end

asStart

Type: string

The output field for the start coordinate.

Default: Same as start

chrom

Type: string (field name)

An optional chromosome field that is passed through. TODO: groupby

end Required

Type: string (field name)

The field representing the end coordinate of the segment (exclusive).

start Required

Type: string (field name)

The field representing the start coordinate of the segment (inclusive).

weight

Type: string (field name)

A field representing an optional weight for the segment. Can be used with copy ratios, for example.

"},{"location":"grammar/transform/coverage/#example","title":"Example","text":"

Given the following data:

start  end
0      4
1      3

... and configuration:

{\n  \"type\": \"coverage\",\n  \"start\": \"start\",\n  \"end\": \"end\"\n}\n

A new list of segments is produced:

start  end  coverage
0      1    1
1      3    2
3      4    1

"},{"location":"grammar/transform/coverage/#interactive-example","title":"Interactive example","text":"

The following example demonstrates both \"coverage\" and \"pileup\" transforms.

{\n  \"data\": {\n    \"sequence\": {\n      \"start\": 1,\n      \"stop\": 100,\n      \"as\": \"start\"\n    }\n  },\n  \"transform\": [\n    {\n      \"type\": \"formula\",\n      \"expr\": \"datum.start + ceil(random() * 20)\",\n      \"as\": \"end\"\n    }\n  ],\n  \"resolve\": { \"scale\": { \"x\": \"shared\" } },\n  \"vconcat\": [\n    {\n      \"transform\": [\n        {\n          \"type\": \"coverage\",\n          \"start\": \"start\",\n          \"end\": \"end\",\n          \"as\": \"coverage\"\n        }\n      ],\n      \"mark\": \"rect\",\n      \"encoding\": {\n        \"x\": { \"field\": \"start\", \"type\": \"index\" },\n        \"x2\": { \"field\": \"end\" },\n        \"y\": { \"field\": \"coverage\", \"type\": \"quantitative\" }\n      }\n    },\n    {\n      \"transform\": [\n        {\n          \"type\": \"pileup\",\n          \"start\": \"start\",\n          \"end\": \"end\",\n          \"as\": \"lane\"\n        }\n      ],\n      \"mark\": \"rect\",\n      \"encoding\": {\n        \"x\": { \"field\": \"start\", \"type\": \"index\" },\n        \"x2\": { \"field\": \"end\" },\n        \"y\": {\n          \"field\": \"lane\",\n          \"type\": \"index\",\n          \"scale\": {\n            \"padding\": 0.2,\n            \"reverse\": true,\n            \"zoom\": false\n          }\n        }\n      }\n    }\n  ]\n}\n
"},{"location":"grammar/transform/filter-scored-labels/","title":"Filter Scored Lables","text":"

The \"filterScoredLabels\" transform fits prioritized labels into the available space, and dynamically reflows the data when the scale domain is adjusted (i.e., zoomed).

For a usage example, check the Annotation Tracks notebook.

"},{"location":"grammar/transform/filter-scored-labels/#parameters","title":"Parameters","text":"channel

Type: string

Default: \"x\"

lane

Type: string (field name)

An optional field representing element's lane, e.g., if transcripts are shown using a piled up layout.

padding

Type: number

Padding (in pixels) around the element.

Default: 0

pos Required

Type: string (field name)

The field representing element's position on the domain.

score Required

Type: string (field name)

The field representing the score used for prioritization.

width Required

Type: string (field name)

The field representing element's width in pixels

"},{"location":"grammar/transform/filter/","title":"Filter","text":"

The \"filter\" transform removes data objects based on a predicate expression.

"},{"location":"grammar/transform/filter/#parameters","title":"Parameters","text":"expr Required

Type: string

An expression string. The data object is removed if the expression evaluates to false.

"},{"location":"grammar/transform/filter/#example","title":"Example","text":"
{\n  \"type\": \"filter\",\n  \"expr\": \"datum.p <= 0.05\"\n}\n

The example above retains all rows for which the field p is less than or equal to 0.05.

"},{"location":"grammar/transform/flatten-compressed-exons/","title":"Flatten Compressed Exons","text":"

The \"flattenCompressedExons\" transform flattens \"delta encoded\" exons. The transform inputs the start coordinate of the gene body and a comma-delimited string of alternating intron and exon lengths. A new data object is created for each exon.

This transform is mainly intended to be used with an optimized gene annotation track. Read more in the Annotation Tracks notebook.
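
The delta decoding can be sketched in Python as follows. This is a hedged illustration rather than GenomeSpy's implementation: it assumes that the first length in the string belongs to an exon and that lengths then alternate between introns and exons. The function name is hypothetical; the default field names follow the parameter defaults listed below.

```python
def flatten_compressed_exons(row, exons="exons", start="start",
                             as_=("exonStart", "exonEnd")):
    """Expand a comma-delimited string of alternating lengths into exon rows."""
    pos = row[start]
    in_exon = True  # Assumption: the encoded string starts with an exon length.
    out = []
    for length in map(int, row[exons].split(",")):
        if in_exon:
            # Each exon becomes a copy of the input row with two extra fields.
            out.append({**row, as_[0]: pos, as_[1]: pos + length})
        pos += length
        in_exon = not in_exon
    return out

gene = {"symbol": "GENE1", "start": 100, "exons": "10,20,30"}
exon_rows = flatten_compressed_exons(gene)
```

Under these assumptions, the hypothetical gene above yields two exons: one spanning 100 to 110 and another spanning 130 to 160.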

"},{"location":"grammar/transform/flatten-compressed-exons/#parameters","title":"Parameters","text":"as

Type: array

Field names for the flattened exons.

Default: [\"exonStart\", \"exonEnd\"]

exons

Type: string (field name)

The field containing the exons.

Default: \"exons\"

start

Type: string (field name)

Start coordinate of the gene body.

Default: \"start\"

"},{"location":"grammar/transform/flatten-delimited/","title":"Flatten Delimited","text":"

The \"flattenDelimited\" transform flattens (or normalizes) a field or a set of fields that contain delimited values. In other words, each delimited value is written into a new data object that contains a single value from the delimited field. All other fields are copied as such.
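
A minimal Python sketch of this behavior is shown below. It is an illustration under stated assumptions, not the library's code: the function name is made up, the delimited fields of a row are assumed to contain the same number of values, and the split values are kept as strings.

```python
def flatten_delimited(rows, fields, separators, as_=None):
    """Write each delimited value into its own copy of the data object."""
    as_ = as_ or fields  # Output names default to the input field names.
    out = []
    for row in rows:
        parts = [row[f].split(sep) for f, sep in zip(fields, separators)]
        # Walk the split fields in parallel, one output row per position.
        for values in zip(*parts):
            new_row = dict(row)
            for name, value in zip(as_, values):
                new_row[name] = value
            out.append(new_row)
    return out

rows = [
    {"patient": "A", "tissue": "Ova,Asc", "value": "4,2"},
    {"patient": "B", "tissue": "Adn,Asc,Ute", "value": "6,3,4"},
]
flat = flatten_delimited(rows, ["tissue", "value"], [",", ","])
```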

"},{"location":"grammar/transform/flatten-delimited/#parameters","title":"Parameters","text":"as

Type: string[] | string

The output field name(s) for the flattened field.

Default: the input fields.

field Required

Type: string (field name)[] | string (field name)

The field(s) to split and flatten

separator Required

Type: string[] | string

Separator(s) used on the field(s) TODO: Rename to delimiter

"},{"location":"grammar/transform/flatten-delimited/#example","title":"Example","text":"

Given the following data:

patient  tissue       value
A        Ova,Asc      4,2
B        Adn,Asc,Ute  6,3,4

... and configuration:

{\n  \"type\": \"flattenDelimited\",\n  \"field\": [\"tissue\", \"value\"],\n  \"separator\": [\",\", \",\"]\n}\n

TODO: Rename separator to delimiter

Flattened data is produced:

patient  tissue  value
A        Ova     4
A        Asc     2
B        Adn     6
B        Asc     3
B        Ute     4

"},{"location":"grammar/transform/flatten-sequence/","title":"Flatten Sequence","text":"

The \"flattenSequence\" transform flattens strings such as FASTA sequences into data objects with position and character fields.
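
The effect can be sketched in a few lines of Python. The function and its `char_as` and `pos_as` arguments are hypothetical names chosen for this illustration; they stand in for the two output field names and do not reflect how the transform's `as` parameter is ordered.

```python
def flatten_sequence(rows, field="sequence", char_as="base", pos_as="pos"):
    """Emit one data object per character, with a zero-based position."""
    return [
        {**row, char_as: char, pos_as: pos}
        for row in rows
        for pos, char in enumerate(row[field])
    ]

rows = [
    {"identifier": "X", "sequence": "AC"},
    {"identifier": "Y", "sequence": "ACTG"},
]
letters = flatten_sequence(rows)
```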

"},{"location":"grammar/transform/flatten-sequence/#parameters","title":"Parameters","text":"as

Type: array

Name of the fields where the zero-based index number and flattened sequence letter are written to.

Default: [\"pos\", \"sequence\"]

field

Type: string (field name)

The field to flatten.

Default: \"sequence\"

"},{"location":"grammar/transform/flatten-sequence/#example","title":"Example","text":"

Given the following data:

identifier  sequence
X           AC
Y           ACTG

... and parameters:

{\n  \"type\": \"flattenSequence\",\n  \"field\": \"sequence\",\n  \"as\": [\"base\", \"pos\"]\n}\n

The sequences are flattened into:

identifier  sequence  base  pos
X           AC        A     0
X           AC        C     1
Y           ACTG      A     0
Y           ACTG      C     1
Y           ACTG      T     2
Y           ACTG      G     3

"},{"location":"grammar/transform/flatten/","title":"Flatten","text":"

The \"flatten\" transform converts fields that hold arrays into distinct, individual data objects. This creates a new sequence of data, where each element encompasses both an extracted array component and all the original fields from the corresponding input object.

"},{"location":"grammar/transform/flatten/#parameters","title":"Parameters","text":"as

Type: string[] | string

The output field name(s) for the flattened field.

Default: the input fields.

fields

Type: string (field name)[] | string (field name)

The field(s) to flatten. If no field is defined, the data object itself is treated as an array to be flattened.

index

Type: string

The output field name for the zero-based index of the array values. If unspecified, an index field is not added.

"},{"location":"grammar/transform/flatten/#example","title":"Example","text":""},{"location":"grammar/transform/flatten/#single-field-flattening","title":"Single-Field Flattening","text":"

This example flattens the array-valued field named foo. Note that all fields except foo are repeated in every output datum.

{ \"type\": \"flatten\", \"fields\": [\"foo\"] }\n

Input data:

[\n  { \"name\": \"alpha\", \"data\": 123, \"foo\": [1, 2] },\n  { \"name\": \"beta\", \"data\": 456, \"foo\": [3, 4, 5] }\n]\n

Result:

[\n  { \"name\": \"alpha\", \"data\": 123, \"foo\": 1 },\n  { \"name\": \"alpha\", \"data\": 123, \"foo\": 2 },\n  { \"name\": \"beta\", \"data\": 456, \"foo\": 3 },\n  { \"name\": \"beta\", \"data\": 456, \"foo\": 4 },\n  { \"name\": \"beta\", \"data\": 456, \"foo\": 5 }\n]\n
"},{"location":"grammar/transform/flatten/#adding-an-index-field","title":"Adding an Index Field","text":"
{ \"type\": \"flatten\", \"fields\": [\"foo\"], \"index\": \"idx\" }\n

This example adds a field containing the array index that each item originated from.

[\n  { \"name\": \"alpha\", \"data\": 123, \"foo\": [1, 2] },\n  { \"name\": \"beta\", \"data\": 456, \"foo\": [3, 4, 5] }\n]\n

Result:

[\n  { \"name\": \"alpha\", \"data\": 123, \"foo\": 1, \"idx\": 0 },\n  { \"name\": \"alpha\", \"data\": 123, \"foo\": 2, \"idx\": 1 },\n  { \"name\": \"beta\", \"data\": 456, \"foo\": 3, \"idx\": 0 },\n  { \"name\": \"beta\", \"data\": 456, \"foo\": 4, \"idx\": 1 },\n  { \"name\": \"beta\", \"data\": 456, \"foo\": 5, \"idx\": 2 }\n]\n
"},{"location":"grammar/transform/flatten/#multi-field-flattening","title":"Multi-Field Flattening","text":"
{ \"type\": \"flatten\", \"fields\": [\"foo\", \"bar\"] }\n

This example simultaneously flattens the array-valued fields foo and bar. Given the input data

[\n  { \"key\": \"alpha\", \"foo\": [1, 2], \"bar\": [\"A\", \"B\"] },\n  { \"key\": \"beta\", \"foo\": [3, 4, 5], \"bar\": [\"C\", \"D\"] }\n]\n

this example produces the output:

[\n  { \"key\": \"alpha\", \"foo\": 1, \"bar\": \"A\" },\n  { \"key\": \"alpha\", \"foo\": 2, \"bar\": \"B\" },\n  { \"key\": \"beta\", \"foo\": 3, \"bar\": \"C\" },\n  { \"key\": \"beta\", \"foo\": 4, \"bar\": \"D\" },\n  { \"key\": \"beta\", \"foo\": 5, \"bar\": null }\n]\n
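
The padding behavior seen in the last output row (a shorter array is filled with null) can be mimicked with the Python sketch below. It only illustrates the semantics shown in the examples and is not GenomeSpy's implementation; the function name and signature are assumptions, and Python's `None` plays the role of JSON `null`.

```python
from itertools import zip_longest

def flatten(rows, fields, index=None):
    """Flatten array-valued fields in parallel, padding short arrays with None."""
    out = []
    for row in rows:
        columns = [row[f] for f in fields]
        for i, values in enumerate(zip_longest(*columns, fillvalue=None)):
            # Copy the original fields, then overwrite the flattened ones.
            new_row = {**row, **dict(zip(fields, values))}
            if index is not None:
                new_row[index] = i
            out.append(new_row)
    return out

rows = [
    {"key": "alpha", "foo": [1, 2], "bar": ["A", "B"]},
    {"key": "beta", "foo": [3, 4, 5], "bar": ["C", "D"]},
]
flat = flatten(rows, ["foo", "bar"])
```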
"},{"location":"grammar/transform/flatten/#flattening-array-objects","title":"Flattening Array Objects","text":"
{ \"type\": \"flatten\" }\n

This example treats the data objects as arrays that should be flattened. Given the input data

[[{ \"foo\": 1 }], [{ \"foo\": 2 }, { \"foo\": 3 }]]\n

this example produces the output:

[{ \"foo\": 1 }, { \"foo\": 2 }, { \"foo\": 3 }]\n
"},{"location":"grammar/transform/formula/","title":"Formula","text":"

The \"formula\" transform uses an expression to calculate and add a new field to the data objects.

"},{"location":"grammar/transform/formula/#parameters","title":"Parameters","text":"as Required

Type: string

The (new) field where the computed value is written to

expr Required

Type: string

An expression string

"},{"location":"grammar/transform/formula/#example","title":"Example","text":"

Given the following data:

x  y
1  2
3  4

... and configuration:

{\n  \"type\": \"formula\",\n  \"expr\": \"datum.x + datum.y\",\n  \"as\": \"z\"\n}\n

A new field is added:

x  y  z
1  2  3
3  4  7

"},{"location":"grammar/transform/formula/#using-with-parameters","title":"Using with Parameters","text":"

As expressions have access to parameters, they can be used to create dynamic visualizations. The following example uses a formula to calculate the sum of two sine waves with different wave lengths. The wave lengths are controlled by the a and b parameters.

Under the hood, when any of the parameters change, the formula transform finds the closest collector or data source in the data pipeline and triggers a re-propagation of the data, resulting in a re-evaluation of the formula expression.

{\n  \"params\": [\n    {\n      \"name\": \"a\",\n      \"value\": 200,\n      \"bind\": { \"input\": \"range\", \"min\": 10, \"max\": 2000, \"step\": 1 }\n    },\n    {\n      \"name\": \"b\",\n      \"value\": 270,\n      \"bind\": { \"input\": \"range\", \"min\": 10, \"max\": 2000, \"step\": 1 }\n    }\n  ],\n\n  \"data\": { \"sequence\": { \"start\": 0, \"stop\": 1000, \"as\": \"x\" } },\n\n  \"transform\": [\n    {\n      \"type\": \"formula\",\n      \"expr\": \"sin(datum.x * 2 * PI / a) + sin(datum.x * 2 * PI / b)\",\n      \"as\": \"y\"\n    }\n  ],\n\n  \"mark\": \"point\",\n\n  \"encoding\": {\n    \"size\": { \"value\": 4 },\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n    \"y\": { \"field\": \"y\", \"type\": \"quantitative\" }\n  }\n}\n
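To make the re-evaluation concrete, the formula in the spec above can be mimicked in plain JavaScript. This is a rough sketch, not the expression engine GenomeSpy uses; applyFormula is a hypothetical name, and the sequence is assumed to exclude the stop value.

```javascript
// Hypothetical stand-in for the formula expression in the spec above.
function applyFormula(data, params) {
  const { a, b } = params;
  return data.map((datum) => ({
    ...datum,
    y:
      Math.sin((datum.x * 2 * Math.PI) / a) +
      Math.sin((datum.x * 2 * Math.PI) / b),
  }));
}

// Assumed to match the sequence data source: x = 0, 1, ..., 999.
const sequence = Array.from({ length: 1000 }, (_, x) => ({ x }));

// Changing a parameter means that every datum is evaluated again.
const before = applyFormula(sequence, { a: 200, b: 270 });
const after = applyFormula(sequence, { a: 400, b: 270 });
console.log(before[50].y !== after[50].y); // true
```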
"},{"location":"grammar/transform/linearize-genomic-coordinate/","title":"Linearize Genomic Coordinate","text":"

The linearizeGenomicCoordinate transform maps the (chromosome, position) pairs into a linear coordinate space using the chromosome sizes of the current genome assembly.

"},{"location":"grammar/transform/linearize-genomic-coordinate/#parameters","title":"Parameters","text":"as Required

Type: string | string[]

The output field or fields for linearized coordinates.

channel

Type: string

Get the genome assembly from the scale of the channel.

Default: \"x\"

chrom Required

Type: string (field name)

The chromosome/contig field

offset

Type: number | number[]

An offset or offsets that allow for adjusting the numbering base. The offset is subtracted from the positions.

GenomeSpy internally uses zero-based indexing with half-open intervals. UCSC-based formats (BED, etc.) generally use this scheme. However, some formats, such as VCF, use one-based indexing and must be adjusted by setting the offset to 1.

Default: 0

pos Required

Type: string (field name) | string (field name)[]

The field or fields that contain intra-chromosomal positions

"},{"location":"grammar/transform/linearize-genomic-coordinate/#example","title":"Example","text":"
{\n  \"type\": \"linearizeGenomicCoordinate\",\n  \"chrom\": \"chrom\",\n  \"pos\": \"start\",\n  \"as\": \"_start\"\n}\n
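The mapping itself amounts to adding a cumulative chromosome offset to each position. The sketch below illustrates the idea with made-up chromosome sizes; the linearize helper and the chromSizes table are hypothetical, and a real genome assembly would supply the actual sizes.

```javascript
// Toy chromosome sizes; a real genome assembly provides the actual values.
const chromSizes = [
  ['chr1', 1000],
  ['chr2', 800],
  ['chr3', 500],
];

// Cumulative start offset of each chromosome in the linear space.
const chromStarts = new Map();
let cumulative = 0;
for (const [name, size] of chromSizes) {
  chromStarts.set(name, cumulative);
  cumulative += size;
}

// Hypothetical helper: the offset is subtracted from the position, as the
// offset parameter describes (use 1 for one-based input such as VCF).
function linearize(chrom, pos, offset = 0) {
  const start = chromStarts.get(chrom);
  if (start === undefined) {
    throw new Error('Unknown chromosome: ' + chrom);
  }
  return start + pos - offset;
}

console.log(linearize('chr2', 100)); // 1100
console.log(linearize('chr2', 100, 1)); // 1099
```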
"},{"location":"grammar/transform/measure-text/","title":"Measure Text","text":"

The \"measureText\" transform measures the length of a string in pixels. The measurement can be used in downstream layout computations.

For a usage example, check the Annotation Tracks notebook.

"},{"location":"grammar/transform/measure-text/#parameters","title":"Parameters","text":"as Required

Type: string

TODO

field Required

Type: string (field name)

TODO

fontSize Required

Type: number

TODO

"},{"location":"grammar/transform/measure-text/#example","title":"Example","text":"
{\n  \"type\": \"measureText\",\n  \"fontSize\": 11,\n  \"field\": \"symbol\",\n  \"as\": \"_textWidth\"\n}\n
"},{"location":"grammar/transform/pileup/","title":"Pileup","text":"

The \"pileup\" transform computes a piled up layout for overlapping segments. The computed lane can be used to position the segments in a visualization. The segments must be sorted by their start coordinates before passing them to the pileup transform.

"},{"location":"grammar/transform/pileup/#parameters","title":"Parameters","text":"as

Type: string

The output field name for the computed lane.

Default: \"lane\".

end Required

Type: string (field name)

The field representing the end coordinate of the segment (exclusive).

preference

Type: string (field name)

An optional field indicating the preferred lane. Use together with the preferredOrder property.

preferredOrder

Type: string[] | number[] | boolean[]

The order of the lane preferences. The first element contains the value that should place the segment on the first lane and so forth. If the preferred lane is occupied, the first available lane is taken.

spacing

Type: number

The spacing between adjacent segments on the same lane in coordinate units.

Default: 1.

start Required

Type: string (field name)

The field representing the start coordinate of the segment (inclusive).

"},{"location":"grammar/transform/pileup/#example","title":"Example","text":"

Given the following data:

start | end
0 | 4
1 | 3
2 | 6
4 | 8

... and configuration:

{\n  \"type\": \"pileup\",\n  \"start\": \"start\",\n  \"end\": \"end\",\n  \"as\": \"lane\"\n}\n

A new field is added:

start | end | lane
0 | 4 | 0
1 | 3 | 1
2 | 6 | 2
4 | 8 | 1"},{"location":"grammar/transform/pileup/#interactive-example","title":"Interactive example","text":"

The following example demonstrates both \"coverage\" and \"pileup\" transforms.

{\n  \"data\": {\n    \"sequence\": {\n      \"start\": 1,\n      \"stop\": 100,\n      \"as\": \"start\"\n    }\n  },\n  \"transform\": [\n    {\n      \"type\": \"formula\",\n      \"expr\": \"datum.start + ceil(random() * 20)\",\n      \"as\": \"end\"\n    }\n  ],\n  \"resolve\": { \"scale\": { \"x\": \"shared\" } },\n  \"vconcat\": [\n    {\n      \"transform\": [\n        {\n          \"type\": \"coverage\",\n          \"start\": \"start\",\n          \"end\": \"end\",\n          \"as\": \"coverage\"\n        }\n      ],\n      \"mark\": \"rect\",\n      \"encoding\": {\n        \"x\": { \"field\": \"start\", \"type\": \"index\" },\n        \"x2\": { \"field\": \"end\" },\n        \"y\": { \"field\": \"coverage\", \"type\": \"quantitative\" }\n      }\n    },\n    {\n      \"transform\": [\n        {\n          \"type\": \"pileup\",\n          \"start\": \"start\",\n          \"end\": \"end\",\n          \"as\": \"lane\"\n        }\n      ],\n      \"mark\": \"rect\",\n      \"encoding\": {\n        \"x\": { \"field\": \"start\", \"type\": \"index\" },\n        \"x2\": { \"field\": \"end\" },\n        \"y\": {\n          \"field\": \"lane\",\n          \"type\": \"index\",\n          \"scale\": {\n            \"padding\": 0.2,\n            \"reverse\": true,\n            \"zoom\": false\n          }\n        }\n      }\n    }\n  ]\n}\n
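The lane assignment can be approximated with a greedy pass over the sorted segments. The sketch below is an assumption about how such a layout can be computed, not GenomeSpy's actual code; it ignores the preference and preferredOrder parameters, and the comparison rule was chosen because it reproduces the lanes of the four-segment example on this page.

```javascript
// Hypothetical greedy lane assignment. Segments must be sorted by start.
function pileup(segments, spacing = 1) {
  const laneEnds = []; // end coordinate of the last segment on each lane
  return segments.map((segment) => {
    // First lane whose last segment ends at least spacing units before this start.
    let lane = laneEnds.findIndex((end) => end + spacing <= segment.start);
    if (lane < 0) {
      lane = laneEnds.length; // open a new lane
    }
    laneEnds[lane] = segment.end;
    return { ...segment, lane };
  });
}

const piled = pileup([
  { start: 0, end: 4 },
  { start: 1, end: 3 },
  { start: 2, end: 6 },
  { start: 4, end: 8 },
]);
console.log(piled.map((d) => d.lane)); // [0, 1, 2, 1]
```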
"},{"location":"grammar/transform/project/","title":"Project","text":"

The \"project\" transform retains the specified fields of the data objects, optionally renaming them. All other fields are removed.

"},{"location":"grammar/transform/project/#parameters","title":"Parameters","text":"as

Type: array

New names for the projected fields. If omitted, the names of the source fields are used.

fields Required

Type: array

The fields to be projected.

"},{"location":"grammar/transform/project/#example","title":"Example","text":"
{\n  \"type\": \"project\",\n  \"fields\": [\"lane\", \"start\", \"exons\"]\n}\n
"},{"location":"grammar/transform/regex-extract/","title":"Regex Extract","text":"

The \"regexExtract\" transform extracts groups from a string field and adds them to the data objects as new fields.

"},{"location":"grammar/transform/regex-extract/#parameters","title":"Parameters","text":"as Required

Type: string | string[]

The new field or an array of fields where the extracted values are written.

field Required

Type: string (field name)

The source field

regex Required

Type: string

A valid JavaScript regular expression with at least one group. For example: \"^Sample(\\\\d+)$\".

Read more at: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions

skipInvalidInput

Type: boolean

Do not complain about invalid input. Just skip it and leave the new fields undefined on the affected datum.

Default: false
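How these parameters interact can be sketched in plain JavaScript. The regexExtract function below is a hypothetical stand-in, not the library's implementation; in particular, throwing an error on invalid input is an assumption about what complaining means here. The regex uses [0-9] instead of the digit shorthand so that no escaping is needed.

```javascript
// Hypothetical helper mirroring the documented parameters.
function regexExtract(datum, { field, regex, as, skipInvalidInput = false }) {
  const names = Array.isArray(as) ? as : [as];
  const match = String(datum[field]).match(new RegExp(regex));
  if (!match) {
    if (skipInvalidInput) {
      return { ...datum }; // the new fields stay undefined
    }
    throw new Error('Value does not match the regex: ' + datum[field]);
  }
  const out = { ...datum };
  names.forEach((name, i) => {
    out[name] = match[i + 1]; // capture groups start at index 1
  });
  return out;
}

const extracted = regexExtract(
  { Gene: 'AKT1', 'Genome Location': '14:104770341-104792643' },
  {
    field: 'Genome Location',
    regex: '^(X|Y|[0-9]+):([0-9]+)-([0-9]+)$',
    as: ['Chrom', 'Start', 'End'],
  }
);
console.log(extracted.Chrom, extracted.Start, extracted.End);
// 14 104770341 104792643
```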

"},{"location":"grammar/transform/regex-extract/#example","title":"Example","text":"

Given the following data:

Gene | Genome Location
AKT1 | 14:104770341-104792643

... and configuration:

{\n  \"type\": \"regexExtract\",\n  \"field\": \"Genome Location\",\n  \"regex\": \"^(X|Y|\\\\d+):(\\\\d+)-(\\\\d+)$\",\n  \"as\": [\"Chrom\", \"Start\", \"End\"]\n}\n

Three new fields are added to the data:

Gene | Genome Location | Chrom | Start | End
AKT1 | 14:104770341-104792643 | 14 | 104770341 | 104792643"},{"location":"grammar/transform/regex-fold/","title":"Regex Fold","text":"

The \"regexFold\" transform gathers columns into key-value pairs using a regular expression.

"},{"location":"grammar/transform/regex-fold/#parameters","title":"Parameters","text":"asKey

Type: string

Default: \"sample\"

asValue Required

Type: string[] | string

A new column name for the extracted values.

columnRegex Required

Type: string[] | string

A regular expression that matches column names. The regex must have one capturing group that is used for extracting the key (e.g., a sample id) from the column name.

skipRegex

Type: string

An optional regex that matches fields that should not be included in the new folded data objects.
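The folding step can be sketched for the simple case of a single columnRegex and asValue pair. The regexFold function below is hypothetical and omits skipRegex and the array forms of the parameters; it is meant only to show how the capturing group becomes the key.

```javascript
// Hypothetical helper for a single columnRegex / asValue pair.
function regexFold(data, { columnRegex, asValue, asKey = 'sample' }) {
  const re = new RegExp(columnRegex);
  const out = [];
  for (const datum of data) {
    const columns = Object.keys(datum);
    const matched = columns.filter((c) => re.test(c));
    // Columns that do not match are copied to every folded object.
    const rest = {};
    for (const c of columns) {
      if (!matched.includes(c)) rest[c] = datum[c];
    }
    for (const c of matched) {
      const key = c.match(re)[1]; // the capturing group holds the key
      out.push({ ...rest, [asKey]: key, [asValue]: datum[c] });
    }
  }
  return out;
}

const folded = regexFold(
  [{ SNP: 'rs99924582', 'foo.AF': 0.3, 'bar.AF': 0.24, 'baz.AF': 0.94 }],
  { columnRegex: '^(.*)[.]AF$', asValue: 'VAF' }
);
console.log(folded.length); // 3
```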

"},{"location":"grammar/transform/regex-fold/#example","title":"Example","text":"

Given the following data:

SNP | foo.AF | bar.AF | baz.AF
rs99924582 | 0.3 | 0.24 | 0.94
rs22238423 | 0.92 | 0.21 | 0.42

... and configuration:

{\n  \"type\": \"regexFold\",\n  \"columnRegex\": [\"^(.*)\\\\.AF$\"],\n  \"asValue\": [\"VAF\"],\n  \"asKey\": \"sample\"\n}\n

The matched columns are folded into new data objects. All others are left intact:

SNP | sample | VAF
rs99924582 | foo | 0.3
rs99924582 | bar | 0.24
rs99924582 | baz | 0.94
rs22238423 | foo | 0.92
rs22238423 | bar | 0.21
rs22238423 | baz | 0.42"},{"location":"grammar/transform/sample/","title":"Sample","text":"

The \"sample\" transform takes a random sample of the data objects.

"},{"location":"grammar/transform/sample/#parameters","title":"Parameters","text":"size

Type: number

The maximum sample size.

Default: 500

"},{"location":"grammar/transform/sample/#example","title":"Example","text":"
{\n  \"type\": \"sample\",\n  \"size\": 100\n}\n
"},{"location":"grammar/transform/stack/","title":"Stack","text":"

The \"stack\" transform computes a stacked layout. Stacked bar plots and sequence logos are some of its applications.

"},{"location":"grammar/transform/stack/#parameters","title":"Parameters","text":"as Required

Type: array

Fields to write the stacked values.

Default: [\"y0\", \"y1\"]

baseField

Type: string (field name)

The field that contains the base or amino acid. Used for information content calculation when the offset is \"information\". The data objects that have null in the baseField are considered gaps and they are taken into account when scaling the locus' information content.

cardinality

Type: number

Cardinality, e.g., the number of distinct bases or amino acids. Used for information content calculation when the offset is \"information\".

Default: 4

field

Type: string (field name)

The field to stack. If no field is defined, a constant value of one is assumed.

groupby Required

Type: array

The fields to be used for forming groups for different stacks.

offset

Type: string

How to offset the values in a stack. \"zero\" (default) starts stacking at 0. \"center\" centers the values around zero. \"normalize\" computes intra-stack percentages and normalizes the values to the range of [0, 1]. \"information\" computes a layout for a sequence logo. The total height of the stack reflects the group's information content.

sort

Type: CompareParams

The sort order of data in each stack.

"},{"location":"grammar/transform/stack/#examples","title":"Examples","text":""},{"location":"grammar/transform/stack/#stacked-bar-plot","title":"Stacked bar plot","text":"
{\n  \"data\": {\n    \"values\": [\n      { \"x\": 1, \"q\": \"A\", \"z\": 7 },\n      { \"x\": 1, \"q\": \"B\", \"z\": 3 },\n      { \"x\": 1, \"q\": \"C\", \"z\": 10 },\n      { \"x\": 2, \"q\": \"A\", \"z\": 8 },\n      { \"x\": 2, \"q\": \"B\", \"z\": 5 },\n      { \"x\": 3, \"q\": \"B\", \"z\": 10 }\n    ]\n  },\n  \"transform\": [\n    {\n      \"type\": \"stack\",\n      \"field\": \"z\",\n      \"groupby\": [\"x\"]\n    }\n  ],\n  \"mark\": \"rect\",\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"nominal\", \"band\": 0.8 },\n    \"y\": { \"field\": \"y0\", \"type\": \"quantitative\" },\n    \"y2\": { \"field\": \"y1\" },\n    \"color\": { \"field\": \"q\", \"type\": \"nominal\" }\n  }\n}\n
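For the default zero offset, the values written to y0 and y1 in the spec above are running totals within each group. The following sketch shows that computation; the stack function is a hypothetical illustration that ignores the sort parameter and the other offset modes.

```javascript
// Hypothetical helper: zero-offset stacking of a numeric field within groups.
function stack(data, { field, groupby, as = ['y0', 'y1'] }) {
  const totals = new Map(); // running total per group
  return data.map((datum) => {
    const groupKey = JSON.stringify(groupby.map((g) => datum[g]));
    const y0 = totals.get(groupKey) ?? 0;
    const y1 = y0 + datum[field];
    totals.set(groupKey, y1);
    return { ...datum, [as[0]]: y0, [as[1]]: y1 };
  });
}

const stacked = stack(
  [
    { x: 1, q: 'A', z: 7 },
    { x: 1, q: 'B', z: 3 },
    { x: 1, q: 'C', z: 10 },
    { x: 2, q: 'A', z: 8 },
    { x: 2, q: 'B', z: 5 },
    { x: 3, q: 'B', z: 10 },
  ],
  { field: 'z', groupby: ['x'] }
);
console.log(stacked.map((d) => [d.y0, d.y1]));
// [[0, 7], [7, 10], [10, 20], [0, 8], [8, 13], [0, 10]]
```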
"},{"location":"grammar/transform/stack/#sequence-logo","title":"Sequence logo","text":"
{\n  \"data\": {\n    \"values\": [\n      { \"pos\": 1, \"base\": \"A\", \"count\": 2 },\n      { \"pos\": 1, \"base\": \"C\", \"count\": 3 },\n      { \"pos\": 1, \"base\": \"T\", \"count\": 5 },\n      { \"pos\": 2, \"base\": \"A\", \"count\": 7 },\n      { \"pos\": 2, \"base\": \"C\", \"count\": 3 },\n      { \"pos\": 3, \"base\": \"A\", \"count\": 10 },\n      { \"pos\": 4, \"base\": \"T\", \"count\": 9 },\n      { \"pos\": 4, \"base\": \"G\", \"count\": 1 },\n      { \"pos\": 5, \"base\": \"G\", \"count\": 8 },\n      { \"pos\": 6, \"base\": \"G\", \"count\": 7 }\n    ]\n  },\n  \"transform\": [\n    {\n      \"type\": \"stack\",\n      \"field\": \"count\",\n      \"groupby\": [\"pos\"],\n      \"offset\": \"information\",\n      \"as\": [\"_y0\", \"_y1\"],\n      \"baseField\": \"base\",\n      \"sort\": { \"field\": \"count\", \"order\": \"ascending\" }\n    }\n  ],\n  \"encoding\": {\n    \"x\": { \"field\": \"pos\", \"type\": \"index\" },\n    \"y\": {\n      \"field\": \"_y0\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"domain\": [0, 2] },\n      \"title\": \"Information\"\n    },\n    \"y2\": { \"field\": \"_y1\" },\n    \"text\": { \"field\": \"base\", \"type\": \"nominal\" },\n    \"color\": {\n      \"field\": \"base\",\n      \"type\": \"nominal\",\n      \"scale\": {\n        \"type\": \"ordinal\",\n        \"domain\": [\"A\", \"C\", \"T\", \"G\"],\n        \"range\": [\"#7BD56C\", \"#FF9B9B\", \"#86BBF1\", \"#FFC56C\"]\n      }\n    }\n  },\n  \"mark\": {\n    \"type\": \"text\",\n    \"font\": \"Source Sans Pro\",\n    \"fontWeight\": 700,\n    \"size\": 100,\n    \"squeeze\": true,\n    \"fitToBand\": true,\n\n    \"paddingX\": 0,\n    \"paddingY\": 0,\n\n    \"logoLetters\": true\n  }\n}\n
"},{"location":"sample-collections/","title":"Working with Sample Collections","text":"

The app package of the GenomeSpy toolkit enables an interactive analysis of large sample collections. It builds upon the core package, which allows developers to build tailored visualizations using the visualization grammar and GPU-accelerated rendering engine. The app extends the grammar with a facet operator that makes it possible to repeat a single visualization for thousands of samples. The end users of the visualization have access to several interactions that facilitate the exploration of such sample collections.

The documentation of the app package is split into two parts serving different audiences:

  1. Visualizing Sample Collections (for method developers)
  2. Analyzing Sample Collections (for end users)
"},{"location":"sample-collections/analyzing/","title":"Analyzing Sample Collections","text":"

End-User Documentation

This page is mainly intended for end users who analyze sample collections interactively using the GenomeSpy app.

"},{"location":"sample-collections/analyzing/#elements-of-the-user-interface","title":"Elements of the user interface","text":"

Because GenomeSpy visualizations are highly customizable, the actual visualization and the available user-interface elements may differ significantly from what is shown below.

  1. Location / search field shows the genomic coordinates of the current viewport in a UCSC-style format. You can look up features such as gene symbols using the field. In addition, you can filter the sample collection by categorical metadata attributes by typing a categorical value into this field.
  2. Undo history and provenance allows you to undo and redo actions performed on the sample collection. The provenance button shows all performed actions, allowing you to better understand how the current visualization state was constructed.
  3. View visibility menu allows for toggling the visibility of elements such as metadata attributes or annotation tracks.
  4. Bookmark menu shows a list of pre-defined bookmarks and allows you to save the visualization state as a local bookmark into your web browser. The adjacent Share button constructs a shareable URL, which captures the visualization state and optional notes related to the current visualization state.
  5. Fullscreen toggle opens the visualization in fullscreen mode.
  6. Group markers become visible when the sample collection has been stratified using some attribute.
  7. Sample names identify the samples.
  8. Metadata such as clinical attributes or computed variables shown as a heatmap.
  9. Genomic data is shown here.
"},{"location":"sample-collections/analyzing/#navigation-interactions","title":"Navigation interactions","text":""},{"location":"sample-collections/analyzing/#navigating-around-the-genome","title":"Navigating around the genome","text":"

To navigate around the genome in GenomeSpy, you can use either a mouse or a touchpad. If you're using a mouse, you can zoom the genome axis in and out using the scroll wheel. To pan the view, click with the left mouse button and start dragging.

If you're using a touchpad, you can zoom the genome axis by performing a vertical two-finger gesture. Similarly, you can pan the view by performing a horizontal gesture.

"},{"location":"sample-collections/analyzing/#peeking-samples","title":"Peeking samples","text":"

The GenomeSpy app is designed for the exploration of large datasets containing hundreds or thousands of samples. To provide a better overview of patterns across the entire sample collection, GenomeSpy displays the samples as a bird's eye view that fits them into the available vertical space. If you discover interesting patterns or outliers in the dataset, you can peek at individual samples by activating a close-up view from the context menu or by pressing the E key on the keyboard.

Once the close-up view is activated, the zooming interaction will change to vertical scrolling. However, you can still zoom in and out by holding down the Ctrl key while operating the mouse wheel or touchpad.

"},{"location":"sample-collections/analyzing/#manipulating-the-sample-collection","title":"Manipulating the sample collection","text":"

Sorting, filtering, and stratifying a large sample collection can provide valuable insights into the data by helping to identify patterns and outliers. By sorting samples based on a particular attribute or filtering out irrelevant samples, you can more easily identify patterns or trends in the data that might be difficult to see otherwise. Stratifying the sample collection by grouping samples into distinct categories can also help to identify meaningful differences between groups and reveal new insights into the data.

The GenomeSpy app enables users to manipulate the sample collection using incremental actions that operate on abstract attributes, such as metadata variables or measured values at specific genomic loci. By applying a series of these stepwise actions, users can gradually shape the sample collection to their needs, enabling complex analyses. The applied actions are saved in an undo history, which also serves as provenance information for bookmarks and shared links.

An example scenario

Suppose a user has a sample collection that includes multiple tumor samples from each patient and wants to keep a single representative sample from each patient. The user defines a representative sample as having a tumor purity greater than or equal to 15% and the highest copy number at the MYC locus. To form a sample collection with only the representative samples, the user performs the following actions:

  1. Retains samples with purity greater than or equal to 15%
  2. Sorts the samples in descending order by the copy number at the MYC locus
  3. Retains only the top sample from each patient, based on the sorting in Step 2

Following these steps, the user is left with the representative samples.
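Expressed as code, the three steps of the scenario correspond to a filter, a sort, and a first-per-group selection. The sketch below uses made-up sample records and field names purely for illustration; in the app, the same result is reached through the context-menu actions rather than code.

```javascript
// Hypothetical sample records; names and values are made up for illustration.
const samples = [
  { sample: 'P1_a', patient: 'P1', purity: 0.1, mycCopyNumber: 9 },
  { sample: 'P1_b', patient: 'P1', purity: 0.4, mycCopyNumber: 5 },
  { sample: 'P1_c', patient: 'P1', purity: 0.2, mycCopyNumber: 7 },
  { sample: 'P2_a', patient: 'P2', purity: 0.15, mycCopyNumber: 3 },
  { sample: 'P2_b', patient: 'P2', purity: 0.6, mycCopyNumber: 2 },
];

// Step 1: retain samples with purity greater than or equal to 15%.
const pure = samples.filter((s) => s.purity >= 0.15);

// Step 2: sort in descending order by the copy number at the MYC locus.
const sorted = [...pure].sort((a, b) => b.mycCopyNumber - a.mycCopyNumber);

// Step 3: retain only the first (topmost) sample of each patient.
const seen = new Set();
const representatives = sorted.filter((s) => {
  if (seen.has(s.patient)) return false;
  seen.add(s.patient);
  return true;
});

console.log(representatives.map((s) => s.sample)); // [ 'P1_c', 'P2_a' ]
```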

"},{"location":"sample-collections/analyzing/#accessing-the-actions","title":"Accessing the actions","text":"

You can access the actions via a context menu, which appears when you right-click on a metadata attribute in the heatmap or a location in the genomic data panel.

There are two types of attributes: quantitative and categorical. Each type has a different set of supported actions. For example, quantitative attributes can be filtered using a threshold, while categorical attributes support retention or removal of selected categories.

The context menu also provides shortcuts to some actions based on the value under the cursor. For example, a context menu opened on a categorical attribute will give you actions for retaining or removing samples with the selected categorical value.

"},{"location":"sample-collections/analyzing/#undo-history-and-provenance","title":"Undo history and provenance","text":"

GenomeSpy stores the applied actions in an undo history, allowing you to easily experiment with different analyses and revert back to previous states if needed. The provenance button reveals a menu that shows the applied actions together with the used attributes and parameters. You can jump to different states in the undo history by clicking the menu items or the adjacent previous/next buttons.

"},{"location":"sample-collections/analyzing/#the-actions","title":"The actions","text":""},{"location":"sample-collections/analyzing/#sort","title":"Sort","text":"

The Sort by action arranges the samples in descending order based on the chosen quantitative attribute.

"},{"location":"sample-collections/analyzing/#filter-by-a-categorical-attribute","title":"Filter by a categorical attribute","text":"

The context menu provides two shortcut actions for retaining and removing samples having the chosen value in the selected attribute. The Advanced filter... option allows you to choose multiple categories to be retained or removed.

"},{"location":"sample-collections/analyzing/#filter-by-a-quantitative-attribute","title":"Filter by a quantitative attribute","text":"

For quantitative attributes, the menu offers shortcut actions for retaining samples with a value greater than or equal to, or less than or equal to, the chosen value. For more precise thresholding, you can use the Advanced filter... option, which opens a dialog with a histogram and options for choosing open or closed thresholds.

"},{"location":"sample-collections/analyzing/#retain-the-first-of-each","title":"Retain the first of each","text":"

In many analyses, it is necessary to select a single, representative sample from each category. This action retains the first, topmost sample from each category. It is not necessary to sort the samples by the categorical attribute, but rather they should be sorted according to the attributes used to select the representative samples. For a usage example, refer to the example scenario provided in the box above.

"},{"location":"sample-collections/analyzing/#retain-first-n-categories","title":"Retain first n categories","text":"

Sometimes you might be interested in a small number of categories that contain samples with the most extreme values in another attribute. For example, if each patient (the category) has multiple samples, this action allows you to retain all samples from the top-5 patients based on the highest number of mutations (the other attribute) in any of their samples.

"},{"location":"sample-collections/analyzing/#create-custom-groups","title":"Create custom groups","text":"

Use this action to manually select and group multiple categories together according to your specific requirements. This feature allows you to create new groups that contain any combination of categories from your data, giving you the flexibility to organize and view your data in customized groupings.

"},{"location":"sample-collections/analyzing/#group-by-categorical-attribute","title":"Group by categorical attribute","text":"

This action stratifies the data based on the selected categorical attribute. The groups will be shown to the left of the sample names, as shown above.

"},{"location":"sample-collections/analyzing/#group-by-quartiles","title":"Group by quartiles","text":"

This action enables rapid stratification into four groups using a quantitative attribute. The implementation uses the R-7 method, the default in the R programming language and Excel.
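For reference, the R-7 method places the quantile at position h = (n - 1) * p in the sorted values and interpolates linearly between the neighboring values. The sketch below computes the three quartile thresholds this way; quantileR7 is a hypothetical helper, and how the app assigns samples that fall exactly on a threshold is not covered here.

```javascript
// Type 7 quantile: linear interpolation at position h = (n - 1) * p.
function quantileR7(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const h = (sorted.length - 1) * p;
  const lower = Math.floor(h);
  const upper = Math.min(lower + 1, sorted.length - 1);
  return sorted[lower] + (h - lower) * (sorted[upper] - sorted[lower]);
}

const values = [7, 1, 3, 9, 5, 2, 8, 4, 6];
const quartiles = [0.25, 0.5, 0.75].map((p) => quantileR7(values, p));
console.log(quartiles); // [ 3, 5, 7 ]
```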

"},{"location":"sample-collections/analyzing/#group-by-thresholds","title":"Group by thresholds","text":"

The group by thresholds action allows stratifying the samples using custom thresholds on a quantitative attribute. Upon selecting this action, you are shown a dialog with a histogram, where you can add any number of thresholds and specify which side of the threshold should be open or closed.

"},{"location":"sample-collections/analyzing/#retain-matched","title":"Retain matched","text":"

This action retains categories that are common to all of the current groups. For example, suppose you are working with a sample collection with multiple samples from each patient. You have grouped the samples into two groups based on the anatomical site of the sample. By applying this action to the categorical patient attribute, you can ensure that your sample collection comprises only those patients with samples from both anatomical sites. In other words, the patients with only a single anatomical site are removed.
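In set terms, the action keeps the intersection of the categories found in each group. The following sketch illustrates this with two made-up groups; the data, the group names, and the retainMatched helper are all hypothetical.

```javascript
// Hypothetical grouped sample collection: two groups by anatomical site.
const groups = {
  ovary: [
    { sample: 'P1_ova', patient: 'P1' },
    { sample: 'P2_ova', patient: 'P2' },
  ],
  omentum: [
    { sample: 'P1_ome', patient: 'P1' },
    { sample: 'P3_ome', patient: 'P3' },
  ],
};

function retainMatched(groups, attribute) {
  const categorySets = Object.values(groups).map(
    (members) => new Set(members.map((s) => s[attribute]))
  );
  // Keep the categories that appear in every group.
  const common = [...(categorySets[0] ?? [])].filter((c) =>
    categorySets.every((set) => set.has(c))
  );
  const result = {};
  for (const [name, members] of Object.entries(groups)) {
    result[name] = members.filter((s) => common.includes(s[attribute]));
  }
  return result;
}

const matched = retainMatched(groups, 'patient');
console.log(Object.values(matched).flat().map((s) => s.sample));
// [ 'P1_ova', 'P1_ome' ]
```

Patients P2 and P3 each have samples from only one site, so they are removed from both groups.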

"},{"location":"sample-collections/analyzing/#bookmarking-and-sharing","title":"Bookmarking and sharing","text":"

Saving a visualization state together with provenance as a bookmark is a practical way to revisit a particular visualization later or share it with others. Bookmarks store the entire state of the visualization, including the actions taken to arrive at that state. This allows for easy and reproducible sharing of findings from the data. Moreover, bookmarks support optional Markdown-formatted notes that allow communicating essential background information and possible implications related to the discovery.

"},{"location":"sample-collections/analyzing/#bookmarks","title":"Bookmarks","text":"

GenomeSpy supports two types of bookmarks: pre-defined bookmarks that the visualization author may have included with the visualization and local bookmarks that you can save in your web browser. You can access both types of bookmarks from the bookmark menu. Additionally, you can remove or edit existing bookmarks through a submenu that appears when you click the ellipsis button.

"},{"location":"sample-collections/analyzing/#sharing","title":"Sharing","text":"

The current visualization state is constantly updated to the web browser's address bar, allowing you to quickly share the state with others. However, for better context, GenomeSpy's sharing function provides the option to include a name and notes with the shared state. Additionally, recipients can conveniently import the shared link into their local GenomeSpy bookmarks. You can share the current state by clicking on the Share button, or share an existing bookmark by selecting the Share option from the bookmark's submenu.

"},{"location":"sample-collections/visualizing/","title":"Visualizing Sample Collections","text":"

Developer Documentation

This page is intended for users who develop tailored visualizations using the GenomeSpy app.

"},{"location":"sample-collections/visualizing/#getting-started","title":"Getting started","text":"

You can use the following HTML template to create a web page for your visualization. The template loads the app from a content delivery network and the visualization specification from a separate spec.json file placed in the same directory. See the getting started page for more information.

<!DOCTYPE html>\n<html>\n  <head>\n    <title>GenomeSpy</title>\n    <link\n      rel=\"stylesheet\"\n      type=\"text/css\"\n      href=\"https://cdn.jsdelivr.net/npm/@genome-spy/app@0.51.x/dist/style.css\"\n    />\n  </head>\n  <body>\n    <script\n      type=\"text/javascript\"\n      src=\"https://cdn.jsdelivr.net/npm/@genome-spy/app@0.51.x\"\n    ></script>\n\n    <script>\n      genomeSpyApp.embed(document.body, \"spec.json\", {\n        // Show the dataflow inspector button in the toolbar (default: true)\n        showInspectorButton: true,\n      });\n    </script>\n  </body>\n</html>\n

For a complete example, check the website-examples repository on GitHub.

"},{"location":"sample-collections/visualizing/#specifying-a-sample-view","title":"Specifying a Sample View","text":"

The GenomeSpy app extends the core library with a new view composition operator that allows visualization of multiple samples. In this context, a sample means a set of data objects representing an organism, a piece of tissue, a cell line, a single cell, etc. Each sample gets its own track in the visualization, and the behavior resembles the facet operator of Vega-Lite. However, there are subtle differences in the behavior.

A sample view is defined by the samples and spec properties. To assign a track for a data object, define a sample-identifier field using the sample channel. More complex visualizations can be created using the layer operator. Each composed view may have a different data source, enabling concurrent visualization of multiple data types. For instance, the bottom layer could display segmented copy-number data, while the top layer might show single-nucleotide variants.

{\n  \"samples\": {\n    // Optional sample identifiers and metadata\n    ...\n  },\n  \"spec\": {\n    // A single or layer specification\n    ...,\n    \"encoding\": {\n      ...,\n      // The sample channel identifies the track\n      \"sample\": {\n        \"field\": \"sampleId\"\n      }\n    }\n  }\n}\n

Y axis ticks

Y axis ticks are currently not available in sample views. This will be fixed at a later time. However, they would not be particularly practical with a high number of samples.

But we have Band scale?

Superficially similar results can be achieved by using the \"band\" scale on the y channel. However, you cannot adjust the intra-band y-position, as the y channel is already reserved for assigning a band for a datum. On the other hand, with the band scale, the graphical marks can span multiple bands. You could, for example, draw lines between the bands.

"},{"location":"sample-collections/visualizing/#implicit-sample-identifiers","title":"Implicit sample identifiers","text":"

By default, the identifiers of the samples are extracted from the data, and each sample gets its own track.

"},{"location":"sample-collections/visualizing/#explicit-sample-identifiers-and-metadata-attributes","title":"Explicit sample identifiers and metadata attributes","text":"

Genomic data is commonly supplemented with metadata that contains various clinical and computational annotations. To show such metadata alongside the genomic data as a color-coded heat map, you can provide a data source with sample identifiers and metadata columns.

Explicit sample identifiers
{\n  \"samples\": {\n    \"data\": { \"url\": \"samples.tsv\" }\n  },\n  \"spec\": {\n    ...\n  }\n}\n

The data source must have a sample field matching the sample identifiers used in the genomic data. In addition, an optional displayName field can be provided if the sample names should be shown, for example, in a shortened form. All other fields are shown as metadata attributes, and their data types are inferred automatically from the data: numeric attributes are interpreted as \"quantitative\" data, all others as \"nominal\".

An example of a metadata file (samples.tsv):

sample | displayName | treatment | ploidy | purity
EOC52_pPer_DNA4 | EOC52_pPer | NACT | 3.37 | 0.29
EOC702_pOme1_DNA1 | EOC702_pOme1 | PDS | 3.74 | 0.155
EOC912_p2Bow2_DNA1 | EOC912_p2Bow2 | PDS | 3.29 | 0.53"},{"location":"sample-collections/visualizing/#specifying-data-types-of-metadata-attributes","title":"Specifying data types of metadata attributes","text":"

To adjust the data types, scales, and default visibility of the attributes, they can be specified explicitly using the attributes object, as shown in the example below:

Specifying a purity attribute
{\n  \"samples\": {\n    \"data\": { \"url\": \"samples.tsv\" },\n    \"attributes\": {\n      \"purity\": {\n        \"type\": \"quantitative\",\n        \"scale\": {\n          \"domain\": [0, 1],\n          \"scheme\": \"yellowgreenblue\"\n        },\n        \"barScale\": { },\n        \"visible\": false\n      },\n      ...\n    }\n  },\n  ...\n}\n

The scale property specifies a scale for the color channel used to encode the values on the metadata heatmap. The optional barScale property enables positional encoding, changing the heatmap cells into a horizontal bar chart. The visible property configures the default visibility for the attribute.

"},{"location":"sample-collections/visualizing/#adjusting-font-sizes-etc","title":"Adjusting font sizes, etc.","text":"

The samples object can also be used to adjust the font sizes, etc. of the metadata attributes. For example, to increase the font sizes of the sample and attribute labels, use the following configuration:

Adjusting font sizes
{\n  \"samples\": {\n    ...,\n    \"labelFontSize\": 12,\n    \"attributeLabelFontSize\": 10\n  },\n  ...\n}\n

The following properties allow for fine-grained control of the font styles: labelFont, labelFontSize, labelFontWeight, labelFontStyle, labelAlign, attributeLabelFont, attributeLabelFontSize, attributeLabelFontWeight, attributeLabelFontStyle.

In addition, the following properties are supported:

labelTitleText

The title of the sample labels.

Default value: \"Sample name\"

labelLength

The space allocated for the sample labels in pixels.

Default value: 140

labelAlign

The horizontal alignment of the text. One of \"left\", \"center\", or \"right\".

Default value: \"left\"

attributeSize

Default size (width) of the metadata attribute columns. Can be configured per attribute using the attributes property.

Default value: 10

attributeLabelAngle

Angle to be added to the default label angle (-90).

Default value: 0

attributeSpacing

Spacing between attribute columns in pixels.

Default value: 1

"},{"location":"sample-collections/visualizing/#handling-variable-sample-heights","title":"Handling variable sample heights","text":"

The height of a single sample depends on the number of samples and the height of the sample view. Moreover, the end user can toggle between a bird's eye view and a closeup view, making the height very dynamic.

To adapt the maximum size of \"point\" marks to the height of the samples, you need to specify a dynamic scale range for the size channel. The following example demonstrates how to use expressions and the height parameter to adjust the point size:

Dynamic point sizes
\"encoding\": {\n  \"size\": {\n    \"field\": \"VAF\",\n    \"type\": \"quantitative\",\n    \"scale\": {\n      \"domain\": [0, 1],\n      \"range\": [\n        { \"expr\": \"0\" },\n        { \"expr\": \"pow(clamp(height * 0.65, 2, 18), 2)\" }\n      ]\n    }\n  },\n  ...\n}\n

In this example, the height parameter, provided by the sample view, contains the height of a single sample. By multiplying it with 0.65, the points get some padding at the top and bottom. To prevent the points from becoming too small or excessively large, the clamp function is used to limit the point's diameter to a minimum of 2 and a maximum of 18 pixels. As the size channel encodes the area, not the diameter of the points, the pow function is used to square the value. The technique shown here is used in the PARPiCL example.
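
To see how the expression behaves, the same arithmetic can be reproduced in plain JavaScript. This is only a sketch for illustration: the clamp and maxPointSize functions below are local stand-ins, not GenomeSpy APIs, and the sample heights are made-up values.

```javascript
// Local stand-in for the clamp() function of the expression language
const clamp = (value, min, max) => Math.min(Math.max(value, min), max);

// Upper bound of the size range: the clamped diameter, squared,
// because the size channel encodes area rather than diameter
function maxPointSize(sampleHeight) {
  const diameter = clamp(sampleHeight * 0.65, 2, 18);
  return Math.pow(diameter, 2);
}

// Made-up sample heights in pixels: a bird's eye view, two mid-range
// heights, and a closeup view
const sizes = [1, 10, 20, 100].map(maxPointSize);
```

For very small samples the diameter stays at 2 pixels, and for very tall samples it is capped at 18 pixels, so the points neither vanish nor overflow the sample.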

"},{"location":"sample-collections/visualizing/#aggregation","title":"Aggregation","text":"

TODO

"},{"location":"sample-collections/visualizing/#bookmarking","title":"Bookmarking","text":"

With the GenomeSpy app, users can save the current visualization state, including scale domains and view visibilities, as bookmarks. These bookmarks are stored in the IndexedDB of the user's web browser. Each database is unique to an origin, which typically refers to the hostname and domain of the web server hosting the visualization. Since the server may host multiple visualizations, each visualization must have a unique ID assigned to it. To enable bookmarking, simply add the specId property with an arbitrary but unique string value to the top-level view. Example:

{\n  \"specId\": \"My example visualization\",\n\n  \"vconcat\": { ... },\n  ...\n}\n
"},{"location":"sample-collections/visualizing/#pre-defined-bookmarks-and-bookmark-tour","title":"Pre-defined bookmarks and bookmark tour","text":"

You may want to provide users with a few pre-defined bookmarks that showcase interesting findings from the data. Since bookmarks support Markdown-formatted notes, you can also explain the implications of the findings and present essential background information.

The remote bookmarks feature allows for storing bookmarks in a JSON file on a web server and provides them to users through the bookmark menu. In addition, you can optionally enable the tour function, which automatically opens the first bookmark in the file and allows the user to navigate the tour using previous/next buttons.

"},{"location":"sample-collections/visualizing/#enabling-remote-bookmarks","title":"Enabling remote bookmarks","text":"View specification
{\n  \"bookmarks\": {\n    \"remote\": {\n      \"url\": \"tour.json\",\n      \"tour\": true\n    }\n  },\n\n  \"vconcat\": { ... },\n  ...\n}\n

The remote object accepts the following properties:

url (string)

A URL to the remote bookmark file.

initialBookmark (string)

Name of the bookmark that should be loaded as the initial state. The bookmark description dialog is shown only if the tour property is set to true.

tour (boolean, optional)

Should the user be shown a tour of the remote bookmarks when the visualization is launched? If the initialBookmark property is not defined, the tour starts from the first bookmark.

Default: false

afterTourBookmark (string, optional)

Name of the bookmark that should be loaded when the user ends the tour. If null, the dialog will be closed and the current state is retained. If undefined, the default state without any performed actions will be loaded.
"},{"location":"sample-collections/visualizing/#the-bookmark-file","title":"The bookmark file","text":"

The remote bookmark file consists of an array of bookmark objects. The easiest way to create such bookmark objects is to create a bookmark in the app and choose Share from the submenu of the bookmark item. The sharing dialog provides the bookmark in a URL-encoded format and as a JSON object. Just copy-paste the JSON object into the bookmark file to make it available to all users. A simplified example:

Bookmark file (tour.json)
[\n  {\n    \"name\": \"First bookmark\",\n    \"actions\": [ ... ],\n    ...\n  },\n  {\n    \"name\": \"Second bookmark\",\n    \"actions\": [ ... ],\n    ...\n  }\n]\n

Providing the user with an initial state

If you want to provide the user with an initial state comprising specific actions performed on the samples, a particular visible genomic region, etc., you can create a bookmark with the desired settings and set the initialBookmark property to the bookmark's name. See the documentation above for details.
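
As a rough sketch, such a configuration could look like the following, written here as a JavaScript object. The file name and the bookmark name are taken from the examples above; leaving tour as false is an assumption made so that the initial state is loaded without showing the bookmark description dialog.

```javascript
// View specification fragment written as a JavaScript object.
// The url and the bookmark name reuse the tour.json example above.
const spec = {
  bookmarks: {
    remote: {
      url: 'tour.json',
      // Must match the name of a bookmark in the remote file
      initialBookmark: 'First bookmark',
      // No tour: the initial state loads without the description dialog
      tour: false,
    },
  },
  // ... the rest of the view specification
};
```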

"},{"location":"sample-collections/visualizing/#toggleable-view-visibility","title":"Toggleable View Visibility","text":"

When working with a complex visualization that includes multiple tracks and extensive metadata, it may not always be necessary to display all views simultaneously. The GenomeSpy app offers users the ability to toggle the visibility of nodes within the view hierarchy. This visibility state is also included in shareable links and bookmarks, allowing users to easily access their preferred configurations.

Views have two properties for controlling the visibility:

visible (boolean)

If true, the view is visible. This property can be used to set the default visibility.

Default: true

configurableVisibility (boolean)

If true, the visibility is configurable from a menu in the app.

Configurability requires that the view has an explicitly specified name that is unique within the view specification.

Default: false for children of layer, true for others

"},{"location":"sample-collections/visualizing/#search","title":"Search","text":"

The location/search field in the toolbar allows users to quickly navigate to features in the data. To make features searchable, use the search channel on marks that represent the searchable data objects. Example:

{\n  ...,\n  \"mark\": \"rect\",\n  \"encoding\": {\n    \"search\": {\n      \"field\": \"geneSymbol\"\n    },\n    ...\n  },\n  ...\n}\n
"},{"location":"sample-collections/visualizing/#a-practical-example","title":"A practical example","text":"

Work in progress

This part of the documentation is still under construction. For a live example, check the PARPiCL visualization, which is also available for interactive exploration.

"}]} \ No newline at end of file +{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Introduction","text":"

GenomeSpy is a toolkit for interactive visualization of genomic and other data. It enables tailored visualizations by providing a declarative grammar, which allows for mapping data to visual channels (position, color, etc.) and composing complex visualization from primitive graphical marks (points, rectangles, etc.). The grammar is heavily inspired by Vega-Lite, providing partial compatibility and extending it with features essential in genome visualization.

The visualizations are rendered using a carefully crafted WebGL-based engine, enabling fluid interaction and smooth animation for datasets comprising several million data points. The high interactive performance is achieved using GPU shader programs for all scale transformations and rendering of marks. However, shaders are an implementation detail hidden from the end users.

The toolkit comprises two JavaScript packages:

  1. The core library implements the visualization grammar and rendering engine and can be embedded in web pages or applications.
  2. The app extends the core library with support for interactive analysis of large sample collections. It broadens the grammar by introducing a facet operator that repeats a visualization for multiple samples. The app also provides interactions for filtering, sorting, and grouping these samples.

Check the Getting Started page to get started with GenomeSpy and make your own tailored visualizations.

"},{"location":"#an-interactive-example","title":"An interactive example","text":"

The example below is interactive. You can zoom in using the mouse wheel.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 200000, \"as\": \"x\" }\n  },\n  \"transform\": [\n    { \"type\": \"formula\", \"expr\": \"random() * 0.682\", \"as\": \"u\" },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"((datum.u % 1e-8 > 5e-9 ? 1 : -1) * (sqrt(-log(max(1e-9, datum.u))) - 0.618)) * 1.618 + sin(datum.x / 10000)\",\n      \"as\": \"y\"\n    }\n  ],\n  \"mark\": {\n    \"type\": \"point\",\n    \"geometricZoomBound\": 10.5\n  },\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\", \"scale\": { \"zoom\": true } },\n    \"y\": { \"field\": \"y\", \"type\": \"quantitative\" },\n    \"size\": { \"value\": 200 },\n    \"opacity\": { \"value\": 0.6 }\n  }\n}\n
"},{"location":"#about","title":"About","text":"

GenomeSpy is developed by Kari Lavikka in The Systems Biology of Drug Resistance in Cancer group at the University of Helsinki.

This project has received funding from the European Union's Horizon 2020 Research and Innovation Programme under Grant agreement No. 667403 (HERCULES) and No. 965193 (DECIDER)

"},{"location":"api/","title":"JavaScript API","text":"

The public JavaScript API is currently quite minimal.

"},{"location":"api/#embedding","title":"Embedding","text":"

See the getting started page.

"},{"location":"api/#the-api","title":"The API","text":"

The embed function returns a promise that resolves into an object that provides the current public API. The API is documented in the interface definition.

For practical examples on using the API, check the embed-examples package.

"},{"location":"api/#embed-options","title":"Embed options","text":"

The embed function accepts an optional options object.

"},{"location":"api/#named-data-provider","title":"Named data provider","text":"

See the API definition.

"},{"location":"api/#custom-tooltip-handlers","title":"Custom tooltip handlers","text":"

GenomeSpy provides two built-in tooltip handlers.

The default handler displays the underlying datum's properties in a table. Property names starting with an underscore are omitted. The values are formatted nicely.

The refseqgene handler fetches a summary description for a gene symbol using the Entrez API. For an example, check the RefSeq gene track in this notebook.

Handlers are functions that receive the hovered mark's underlying datum and return a promise that resolves into a string, HTMLElement, or lit-html TemplateResult.

The function signature:

export type TooltipHandler = (\n  datum: Record<string, any>,\n  mark: Mark,\n  /** Optional parameters from the view specification */\n  params?: Record<string, any>\n) => Promise<string | TemplateResult | HTMLElement>;\n

Use the tooltipHandlers option to register custom handlers or override the default. See the example below.

"},{"location":"api/#examples","title":"Examples","text":"

Overriding the default handler:

import { html } from \"lit-html\";\n\nconst options = {\n  tooltipHandlers: {\n    default: async (datum, mark, props) =>\n      html`\n        The datum has\n        <strong>${Object.keys(datum).length}</strong> attributes!\n      `,\n  },\n};\n\nembed(container, spec, options);\n

To use a specific (custom) handler in a view specification:

{\n  \"mark\": {\n    \"type\": \"point\",\n    \"tooltip\": {\n      \"handler\": \"myhandler\",\n      \"params\": {\n        \"custom\": \"param\"\n      }\n    }\n  },\n  ...\n}\n
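
A handler registered under the name myhandler could be sketched as follows. The describeDatum formatter is a hypothetical helper written for this illustration; only the tooltipHandlers option and the handler signature come from the documentation above. The sketch returns a plain string, which is one of the accepted result types.

```javascript
// Hypothetical formatter: lists the public properties of a datum.
// Like the default handler, it skips names that start with an underscore.
function describeDatum(datum, params = {}) {
  const prefix = params.custom ? `[${params.custom}] ` : '';
  const fields = Object.entries(datum)
    .filter(([key]) => !key.startsWith('_'))
    .map(([key, value]) => `${key}: ${value}`);
  return prefix + fields.join(', ');
}

// A handler must return a promise, which an async function does implicitly
const myhandler = async (datum, mark, params) => describeDatum(datum, params);

// Register the handler under the name used in the view specification
const options = {
  tooltipHandlers: { myhandler },
};

// embed(container, spec, options);
```

Keeping the formatting logic in a separate synchronous function makes it easy to test without a running visualization.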
"},{"location":"getting-started/","title":"Getting Started","text":"

GenomeSpy is a visualization toolkit for genomic data. More specifically, it is a JavaScript library that can be used to create interactive visualizations of genomic data in web browsers. To visualize data with GenomeSpy, you need to:

  1. Have some data to be visualized
  2. Write or find a visualization specification that describes how the data should be visualized
  3. Embed GenomeSpy into a web page and initialize it with the specification and the data
  4. Open the web page with your web browser

However, there are three ways to get started quickly with GenomeSpy visualizations: the Playground app, Observable notebooks, and embedding GenomeSpy on HTML pages. More advanced users can use GenomeSpy as a visualization library in web applications.

"},{"location":"getting-started/#playground","title":"Playground","text":"

The easiest way to try out GenomeSpy is the Playground app, which allows you to experiment with different visualization specifications directly in your web browser. You can load data from publicly accessible web servers or from your computer. The app is still rudimentary and does not support saving or sharing visualizations.

"},{"location":"getting-started/#observable-notebooks","title":"Observable notebooks","text":"

You can embed GenomeSpy into an Observable notebook. Please check the GenomeSpy collection for usage examples.

"},{"location":"getting-started/#local-or-remote-web-server","title":"Local or remote web server","text":"

For more serious work, you should use the GenomeSpy JavaScript library to create a web page for the visualization:

  1. Create an HTML document (web page) by using the example below
  2. Place the visualization spec and your data files into the same directory as the HTML document
  3. Copy them onto a remote web server or start a local web server in the directory
"},{"location":"getting-started/#local-web-server","title":"Local web server","text":"

Python comes with an HTTP server module that can be started from command line:

python3 -m http.server --bind 127.0.0.1\n

By default, it serves files from the current working directory. See Python's documentation for details.

"},{"location":"getting-started/#html-template","title":"HTML template","text":"

The templates below load the GenomeSpy JavaScript library from a content delivery network. Because the specification schema and the JavaScript API are not yet 100% stable, it is recommended to use a specific version.

The embed function initializes a visualization into the HTML element given as the first parameter using the specification given as the second parameter. The function returns a promise that resolves into an object that provides the current public API. For details, see the API Documentation.

Check the latest version!

The versions in the examples below may be slightly out of date. The current version is:

"},{"location":"getting-started/#load-the-spec-from-a-file","title":"Load the spec from a file","text":"

This template loads the spec from a separate spec.json file.

<!DOCTYPE html>\n<html>\n  <head>\n    <title>GenomeSpy</title>\n  </head>\n  <body>\n    <script\n      type=\"text/javascript\"\n      src=\"https://cdn.jsdelivr.net/npm/@genome-spy/core@0.37.x\"\n    ></script>\n\n    <script>\n      genomeSpyEmbed.embed(document.body, \"spec.json\", {});\n    </script>\n  </body>\n</html>\n
"},{"location":"getting-started/#embed-the-spec-in-the-html-document","title":"Embed the spec in the HTML document","text":"

You can alternatively provide the specification as a JavaScript object.

<!DOCTYPE html>\n<html>\n  <head>\n    <title>GenomeSpy</title>\n  </head>\n  <body>\n    <script\n      type=\"text/javascript\"\n      src=\"https://cdn.jsdelivr.net/npm/@genome-spy/core@0.37.x\"\n    ></script>\n\n    <script>\n      const spec = {\n        data: {\n          sequence: { start: 0, stop: 6.284, step: 0.39269908169, as: \"x\" },\n        },\n        transform: [{ type: \"formula\", expr: \"sin(datum.x)\", as: \"sin\" }],\n        mark: \"point\",\n        encoding: {\n          x: { field: \"x\", type: \"quantitative\" },\n          y: { field: \"sin\", type: \"quantitative\" },\n        },\n      };\n\n      genomeSpyEmbed.embed(document.body, spec, {});\n    </script>\n  </body>\n</html>\n
"},{"location":"getting-started/#genomespyapp-website-examples","title":"Genomespy.app website examples","text":"

The examples on the genomespy.app main page are stored in the website-examples GitHub repository. You can clone the repository and launch the examples locally for further experimentation.

"},{"location":"getting-started/#using-genomespy-as-a-visualization-library-in-web-applications","title":"Using GenomeSpy as a visualization library in web applications","text":"

The @genome-spy/core NPM package contains a bundled library that can be used on web pages as shown in the examples above. In addition, it contains the source code in ESM format, allowing use with bundlers such as Vite and Webpack. For examples of such use, see:

"},{"location":"license/","title":"License","text":"

MIT License

Copyright (c) 2018-2023 Kari Lavikka

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

"},{"location":"license/#contains-code-from","title":"Contains Code From","text":""},{"location":"license/#vega-and-vega-lite","title":"Vega and Vega-Lite","text":"

Copyright (c) 2015, University of Washington Interactive Data Lab. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

"},{"location":"genomic-data/","title":"Working with Genomic Data","text":"

GenomeSpy provides various features that are specifically designed for working with genomic data.

"},{"location":"genomic-data/#loading-genomic-data","title":"Loading Genomic Data","text":"

While GenomeSpy can load data from various sources, such as CSV and JSON files, genomic data is often stored in specialized file formats, such as Indexed FASTA, BigWig, and BigBed. GenomeSpy provides built-in support for these formats, allowing you to load and visualize genomic data without the need for additional tools or libraries.

"},{"location":"genomic-data/#handling-genomic-coordinates","title":"Handling Genomic Coordinates","text":"

Genomic data is typically associated with genomic coordinates comprising chromosome names and positions within the chromosomes. GenomeSpy provides various techniques for working with such coordinates, such as transforming between different coordinate systems and visualizing data in the context of a reference genome.

"},{"location":"genomic-data/#data-transformations","title":"Data Transformations","text":"

Specialized transformations, such as folding tabular data, calculating coverage, and computing a piled up layout, allow GenomeSpy to be adapted for many genomic data visualization and analysis tasks.

"},{"location":"genomic-data/#gpu-accelerated-rendering","title":"GPU-accelerated Rendering","text":"

As genomic data can be large and complex, GenomeSpy's GPU-accelerated rendering allows you to visualize, navigate, and explore large datasets with high performance.

"},{"location":"genomic-data/examples/","title":"Practical Genomic Data Examples","text":""},{"location":"genomic-data/examples/#observable-notebooks","title":"Observable notebooks","text":"

The ASCAT Copy-Number Segmentation notebook provides a comprehensive and fully documented example of using GenomeSpy with genomic data.

The Annotation Tracks notebook explains how to implement a chromosome ideogram and a fancy gene annotation track.

"},{"location":"genomic-data/examples/#website-examples","title":"Website examples","text":"

The genomespy.app main page showcases several examples, some of which focus on genomic data.

"},{"location":"genomic-data/genomic-coordinates/","title":"Genomic Coordinates","text":"

To allow easy visualization of coordinate-based genomic data, GenomeSpy can concatenate the discrete chromosomes onto a single continuous linear axis. Concatenation needs the sizes and preferred order for the contigs or chromosomes. These are usually provided with a genome assembly.

To activate support for genomic coordinates, add the genome property with the name of the assembly to the top level view specification:

{\n  \"genome\": {\n    \"name\": \"hg38\"\n  },\n  ...\n}\n

Only a single genome assembly

Currently, a visualization may have only a single globally configured genome assembly. Different assemblies for different scales (for x and y axes, for example) will be supported in the future.

"},{"location":"genomic-data/genomic-coordinates/#supported-genomes","title":"Supported genomes","text":"

GenomeSpy bundles a few common built-in genome assemblies: \"hg38\", \"hg19\", \"hg18\", \"mm10\", \"mm9\", and \"dm6\".

"},{"location":"genomic-data/genomic-coordinates/#custom-genomes","title":"Custom genomes","text":"

Custom genome assemblies can be provided in two ways: as a chrom.sizes file or within the specification.

"},{"location":"genomic-data/genomic-coordinates/#as-a-chromsizes-file","title":"As a chrom.sizes file","text":"

The chrom.sizes file is a two-column text file with the chromosome names and their sizes. You may want to use the UCSC Genome Browser's fetchChromSizes script to download the sizes for a genome assembly. GenomeSpy does not filter out any alternative contigs or haplotypes, so you may want to preprocess the file before using it.

Example:

{\n  \"genome\": {\n    \"name\": \"hg19\",\n    \"url\": \"https://genomespy.app/data/genomes/hg19/chrom.sizes\"\n  },\n  ...\n}\n
"},{"location":"genomic-data/genomic-coordinates/#within-the-specification","title":"Within the specification","text":"

You can provide the genome assembly directly in the specification using the contigs property. The contigs are an array of objects with the name and size properties.

Example:

{\n  \"genome\": {\n    \"name\": \"dm6\",\n    \"contigs\": [\n      {\"name\": \"chr3R\", \"size\": 32079331 },\n      {\"name\": \"chr3L\", \"size\": 28110227 },\n      {\"name\": \"chr2R\", \"size\": 25286936 },\n      {\"name\": \"chrX\",  \"size\": 23542271 },\n      {\"name\": \"chr2L\", \"size\": 23513712 },\n      {\"name\": \"chrY\",  \"size\": 3667352 },\n      {\"name\": \"chr4\",  \"size\": 1348131 }\n    ]\n  },\n  ...\n}\n
"},{"location":"genomic-data/genomic-coordinates/#encoding-genomic-coordinates","title":"Encoding genomic coordinates","text":"

When a genome assembly has been specified, you can encode the genomic coordinates conveniently by specifying the chromosome (chrom) and position (pos) fields as follows:

{\n  ...,\n  \"encoding\": {\n    \"x\": {\n      \"chrom\": \"Chr\",\n      \"pos\": \"Pos\",\n      \"offset\": -1.0,\n      \"type\": \"locus\"\n    },\n    ...\n  }\n}\n

The example above specifies that the chromosome is read from the \"Chr\" field and the intra-chromosomal position from the \"Pos\" field. The \"locus\" data type pairs the channel with a \"locus\" scale, which provides a chromosome-aware axis. However, you can also use the field property with the locus data type if the coordinate has already been linearized. The offset property is explained below.

What happens under the hood

When the chrom and pos properties are used in channel definitions, GenomeSpy inserts an implicit linearizeGenomicCoordinate transformation into the data flow. The transformation introduces a new field with the linearized coordinate for the (chromosome, position) pair. The channel definition is modified to use the new field.

In some cases you may want to insert an explicit transformation to the data flow to have better control on its behavior.
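
Conceptually, linearization adds the cumulative size of all preceding contigs to the intra-chromosomal position. The sketch below illustrates the idea with the first three contigs of the dm6 example above. The linearize helper and the contigStarts map are hypothetical names for this illustration and do not reflect the actual GenomeSpy implementation.

```javascript
// Contig sizes from the dm6 example, in the order they are concatenated
const contigs = [
  { name: 'chr3R', size: 32079331 },
  { name: 'chr3L', size: 28110227 },
  { name: 'chr2R', size: 25286936 },
];

// Precompute where each contig starts on the concatenated axis
const contigStarts = new Map();
let cumulative = 0;
for (const { name, size } of contigs) {
  contigStarts.set(name, cumulative);
  cumulative += size;
}

// Hypothetical helper: maps a (chromosome, position) pair to a single
// coordinate on the continuous axis, optionally applying an offset
function linearize(chrom, pos, offset = 0) {
  const start = contigStarts.get(chrom);
  if (start === undefined) {
    throw new Error(`Unknown contig: ${chrom}`);
  }
  return start + pos + offset;
}
```

With these sizes, position 0 of chr3L maps to 32079331, which is the length of chr3R.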

"},{"location":"genomic-data/genomic-coordinates/#coordinate-counting","title":"Coordinate counting","text":"

The offset property allows for aligning and adjusting for different coordinate notations: zero or one based, closed or half-open. The offset is added to the final coordinate.

GenomeSpy's \"locus\" scale expects half-open, zero-based coordinates.

Read more about coordinates at the UCSC Genome Browser Blog.
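
As a worked example of the arithmetic, the sketch below converts a one-based, closed interval into the zero-based, half-open form expected by the locus scale. The toZeroBasedHalfOpen function is a hypothetical helper for illustration. In an encoding, the same effect would presumably be achieved with an offset of -1 on the start coordinate and no offset on the end coordinate.

```javascript
// Converts a one-based closed interval [start, end] into the
// zero-based half-open form [start - 1, end)
function toZeroBasedHalfOpen(start, end) {
  return { start: start - 1, end };
}

// Positions 101..200 in one-based closed notation cover 100 bases
const segment = toZeroBasedHalfOpen(101, 200);

// In half-open notation the length is simply end minus start
const segmentLength = segment.end - segment.start;
```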

"},{"location":"genomic-data/genomic-coordinates/#examples","title":"Examples","text":""},{"location":"genomic-data/genomic-coordinates/#point-features","title":"Point features","text":"

Point features cover a single position on a chromosome. An example of a point feature is a single nucleotide variant (SNV), where a nucleotide has been replaced by another.

{\n  \"genome\": { \"name\": \"hg38\" },\n  \"data\": {\n    \"values\": [\n      { \"chrom\": \"chr3\", \"pos\": 134567890 },\n      { \"chrom\": \"chr4\", \"pos\": 123456789 },\n      { \"chrom\": \"chr9\", \"pos\": 34567890 }\n    ]\n  },\n  \"mark\": \"point\",\n  \"encoding\": {\n    \"x\": {\n      \"chrom\": \"chrom\",\n      \"pos\": \"pos\",\n      \"type\": \"locus\"\n    }\n  }\n}\n
"},{"location":"genomic-data/genomic-coordinates/#segment-features","title":"Segment features","text":"

Segment features cover a range of positions on a chromosome. They are defined by their two end positions. An example of a segment feature is a copy number variant (CNV), where a region of the genome has been duplicated or deleted.

{\n  \"genome\": { \"name\": \"hg38\" },\n  \"data\": {\n    \"values\": [\n      { \"chrom\": \"chr3\", \"startpos\": 100000000, \"endpos\": 140000000 },\n      { \"chrom\": \"chr4\", \"startpos\": 70000000, \"endpos\": 170000000 },\n      { \"chrom\": \"chr9\", \"startpos\": 50000000, \"endpos\": 70000000 }\n    ]\n  },\n  \"mark\": \"rect\",\n  \"encoding\": {\n    \"x\": {\n      \"chrom\": \"chrom\",\n      \"pos\": \"startpos\",\n      \"type\": \"locus\"\n    },\n    \"x2\": {\n      \"chrom\": \"chrom\",\n      \"pos\": \"endpos\"\n    }\n  }\n}\n
"},{"location":"grammar/","title":"Visualization Grammar","text":"

Genome browser applications typically couple the visual representations to specific file formats and provide few customization options. GenomeSpy has a more abstract approach to visualization, providing combinatorial building blocks such as marks, transformations, and scales. As a result, users can author tailored visualizations that display the underlying data more effectively.

The concept was first introduced in The Grammar of Graphics and developed further in ggplot2 and Vega-Lite.

A dialect of Vega-Lite

The visualization grammar of GenomeSpy is a dialect of Vega-Lite, providing partial compatibility. However, the goals of GenomeSpy and Vega-Lite are different \u2013 GenomeSpy is more domain-specific and primarily intended for the visualization and analysis of large datasets containing genomic coordinates. Nevertheless, GenomeSpy tries to follow Vega-Lite's grammar where practical, and thus, this documentation has several references to its documentation.

"},{"location":"grammar/#a-single-view-specification","title":"A single view specification","text":"

Each view specification must have at least the data to be visualized, the mark that will represent the data items, and an encoding that specifies how the fields of data are mapped to the visual channels of the mark. In addition, optional transform steps allow for modifying the data before they are encoded into mark instances.

{\n  \"data\": { \"url\": \"sincos.csv\" },\n  \"transform\": [\n    { \"type\": \"formula\", \"expr\": \"abs(datum.sin)\", \"as\": \"abs(sin)\" }\n  ],\n  \"mark\": \"point\",\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n    \"y\": { \"field\": \"abs(sin)\", \"type\": \"quantitative\" },\n    \"size\": { \"field\": \"x\", \"type\": \"quantitative\" }\n  }\n}\n
"},{"location":"grammar/#properties","title":"Properties","text":"aggregateSamples

Type: array

Specifies views that aggregate multiple samples within the GenomeSpy App.

baseUrl

Type: string

The base URL for relative URL data sources and URL imports. The base URLs are inherited in the view hierarchy unless overridden with this property. By default, the top-level view's base URL equals the visualization specification's base URL.

configurableVisibility

Type: boolean

Whether the visibility of the view can be configured interactively from the GenomeSpy App. Configurability requires that the view has an explicitly specified name that is unique within the view hierarchy.

Default: false for children of layer, true for others.

data

Type: UrlData | InlineData | NamedData | DynamicCallbackData | LazyData | Generator

Specifies a data source. If omitted, the data source is inherited from the parent view.

description

Type: string | string[]

A description of the view. Can be used for documentation. The description of the top-level view is shown in the toolbar of the GenomeSpy App.

encoding

Type: Encoding

Specifies how data are encoded using the visual channels.

height

Type: SizeDef | number | Step | \"container\"

Height of the view. If a number, it is interpreted as pixels. Check child sizing for details.

Default value: \"container\"

mark Required

Type: \"rect\" | \"point\" | \"rule\" | \"text\" | \"link\" | RectProps | TextProps | RuleProps | LinkProps | PointProps

The graphical mark presenting the data objects.

name

Type: string

An internal name that can be used for referring to the view. For referencing purposes, the name should be unique within the view hierarchy.

opacity

Type: number | DynamicOpacity | ExprRef

Opacity of the view and all its children. Allows implementing semantic zooming where the layers are faded in and out as the user zooms in and out.

TODO: Write proper documentation with examples.

Default: 1.0

padding

Type: Paddings | number

Padding applied to the view. Accepts either a number representing pixels or an object specifying separate paddings for each edge.

Examples: - padding: 10 - padding: { top: 10, right: 20, bottom: 10, left: 20 }

Default value: 0

params

Type: array

Dynamic variables that parameterize a visualization.

resolve

Type: object

Specifies how scales and axes are resolved in the view hierarchy.

templates

Type: object

Templates that can be reused within the view specification by importing them with the template key.

title

Type: string | Title

View title. N.B.: Currently, GenomeSpy doesn't do bound calculation, and you need to manually specify proper padding for the view to ensure that the title is visible.

transform

Type: array

An array of transformations applied to the data before visual encoding.

view

Type: ViewBackground

The background of the view, including fill, stroke, and stroke width.

viewportHeight

Type: SizeDef | number | \"container\"

Optional viewport height of the view. If the view size exceeds the viewport height, it will be shown with scrollbars. This property implicitly enables clipping.

Default: null (same as height)

viewportWidth

Type: SizeDef | number | \"container\"

Optional viewport width of the view. If the view size exceeds the viewport width, it will be shown with scrollbars. This property implicitly enables clipping.

Default: null (same as width)

visible

Type: boolean

The default visibility of the view. An invisible view is removed from the layout and not rendered. For context, see toggleable view visibility.

Default: true

width

Type: SizeDef | number | Step | \"container\"

Width of the view. If a number, it is interpreted as pixels. Check child sizing for details.

Default: \"container\"

"},{"location":"grammar/#view-composition-for-more-complex-visualizations","title":"View composition for more complex visualizations","text":"

View composition allows for building more complex visualizations from multiple single-view specifications. For example, the layer operator allows the creation of custom glyphs, and the concatenation operators enable stacked layouts resembling genome browsers with multiple tracks.

"},{"location":"grammar/expressions/","title":"Expressions","text":"

Expressions allow for defining predicates or computing new variables based on existing data. The expression language is based on JavaScript, but provides only a limited set of features, guaranteeing secure execution.

Expressions can be used with the \"filter\" and \"formula\" transforms, in encoding, and in expression references for dynamic properties in marks, transforms, and data sources.

"},{"location":"grammar/expressions/#usage","title":"Usage","text":"

All basic arithmetic operators are supported:

(1 + 2) * 3 / 4\n

When using expressions within the data transformation pipeline, the current data object is available in the datum variable. Its properties (fields) can be accessed by using the dot or bracket notation:

datum.foo + 2\n

If the name of the property contains special characters such as \".\", \"!\", or \" \" (a space), the bracket notation must be used:

datum['A very *special* name!'] > 100\n
"},{"location":"grammar/expressions/#conditional-operators","title":"Conditional operators","text":"

Ternary operator:

datum.foo < 5 ? 'small' : 'large'\n

And an equivalent if construct:

if(datum.foo < 5, 'small', 'large')\n
"},{"location":"grammar/expressions/#provided-constants-and-functions","title":"Provided constants and functions","text":"

Common mathematical functions are supported:

(datum.u % 1e-8 > 5e-9 ? 1 : -1) *\n  (sqrt(-log(max(1e-9, datum.u))) - 0.618) *\n  1.618\n
"},{"location":"grammar/expressions/#constants-and-functions-from-vega","title":"Constants and functions from Vega","text":"

The following constants and functions are provided by the vega-expression package.

"},{"location":"grammar/expressions/#constants","title":"Constants","text":"

NaN, E, LN2, LN10, LOG2E, LOG10E, PI, SQRT1_2, SQRT2, MIN_VALUE, MAX_VALUE

"},{"location":"grammar/expressions/#type-checking-functions","title":"Type Checking Functions","text":"

isArray, isBoolean, isNumber, isObject, isRegExp, isString

"},{"location":"grammar/expressions/#math-functions","title":"Math Functions","text":"

isNaN, isFinite, abs, acos, asin, atan, atan2, ceil, cos, exp, floor, hypot, log, max, min, pow, random, round, sin, sqrt, tan, clamp

"},{"location":"grammar/expressions/#sequence-array-or-string-functions","title":"Sequence (Array or String) Functions","text":"

length, join, indexof, lastindexof, reverse, slice

"},{"location":"grammar/expressions/#string-functions","title":"String Functions","text":"

parseFloat, parseInt, upper, lower, replace, split, substring, trim

"},{"location":"grammar/expressions/#regexp-functions","title":"RegExp Functions","text":"

regexp, test

"},{"location":"grammar/expressions/#other-functions","title":"Other functions","text":"

# lerp(array, fraction) Provides a linearly interpolated value from the first to the last element in the given array based on the specified interpolation fraction, usually ranging from 0 to 1. For instance, lerp([0, 50], 0.5) yields 25.

# linearstep(edge0, edge1, x) Calculates a linear interpolation between 0 and 1 for a value x within the range defined by edge0 and edge1. It applies a clamp to ensure the result stays within the 0.0 to 1.0 range.

# smoothstep(edge0, edge1, x) Performs smooth Hermite interpolation between 0 and 1 for values of x that lie between edge0 and edge1. This function is particularly useful for scenarios requiring a threshold function with a smooth transition, offering a gradual rather than an abrupt change between states.
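
The three interpolation functions above can be sketched in plain JavaScript as follows. This is only an illustration of the described behavior, not GenomeSpy's actual implementation; the clamp helper and the cubic Hermite polynomial (the one used by the GLSL function of the same name) are assumptions.

```javascript
// Illustrative re-implementations of the functions described above.
// They are sketches, not GenomeSpy's source code.

// Helper (assumption): restrict x to the closed interval [lo, hi].
function clamp(x, lo, hi) {
  return Math.min(Math.max(x, lo), hi);
}

// Interpolates linearly from the first to the last element of the array.
function lerp(array, fraction) {
  const first = array[0];
  const last = array[array.length - 1];
  return first + (last - first) * fraction;
}

// Maps x linearly from [edge0, edge1] to [0, 1] and clamps the result.
function linearstep(edge0, edge1, x) {
  return clamp((x - edge0) / (edge1 - edge0), 0, 1);
}

// Same mapping, but eased with the cubic Hermite polynomial 3t^2 - 2t^3.
function smoothstep(edge0, edge1, x) {
  const t = linearstep(edge0, edge1, x);
  return t * t * (3 - 2 * t);
}
```

With these definitions, lerp([0, 50], 0.5) evaluates to 25, matching the example above, and both step functions return 0 below edge0 and 1 above edge1.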

"},{"location":"grammar/import/","title":"Importing Views","text":"

GenomeSpy facilitates reusing views by allowing them to be imported from the same specification by name or from external specification files by a URL. The files can be placed flexibly \u2013 it may be practical to split large specifications into multiple files and place them in the same directory. On the other hand, if you have created, for example, an annotation track that you would like to share with the research community, you can upload the specification file and the associated data to a publicly accessible web server. Imported views, whether referenced by name or by URL, can be parameterized to allow for customization.

"},{"location":"grammar/import/#properties","title":"Properties","text":"import Required

Type: UrlImport | TemplateImport

The method to import a specification.

name

Type: string

The name given to the imported view. This property overrides the name specified in the imported specification.

params

Type: (VariableParameter | SelectionParameter)[] | object

Dynamic variables that parameterize a visualization. Parameters defined here override the parameters defined in the imported specification.

"},{"location":"grammar/import/#urlimport","title":"UrlImport","text":"url Required

Type: string

Imports a specification from the specified URL.

"},{"location":"grammar/import/#templateimport","title":"TemplateImport","text":"template Required

Type: string

Imports a specification from the current view hierarchy, searching first in the current view, then ascending through ancestors.
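
The lookup order described above can be sketched as a walk up the view hierarchy. The node shape used here (templates and parent properties) and the function name findTemplate are hypothetical and chosen only for illustration; GenomeSpy's internal data structures may differ.

```javascript
// Hypothetical view nodes: { name, templates, parent }.
// Looks for the template in the given view first, then in each ancestor.
function findTemplate(view, templateName) {
  for (let v = view; v; v = v.parent) {
    const templates = v.templates;
    if (templates && Object.prototype.hasOwnProperty.call(templates, templateName)) {
      return templates[templateName];
    }
  }
  throw new Error('Template not found: ' + templateName);
}

const root = { name: 'root', templates: { myTrack: { mark: 'point' } }, parent: null };
const child = { name: 'child', templates: { myTrack: { mark: 'rect' } }, parent: root };
const grandchild = { name: 'grandchild', parent: child };

// The nearest definition wins: the grandchild sees the child's template,
// which shadows the one defined at the root.
const resolved = findTemplate(grandchild, 'myTrack');
```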

"},{"location":"grammar/import/#importing-from-a-url","title":"Importing from a URL","text":"

Views can be imported from relative and absolute URLs. Relative URLs are imported with respect to the current baseUrl.

The imported specification may contain a single, concatenated, or layered view. The baseUrl of the imported specification is updated to match the directory of the imported specification. Thus, you can publish a view (or a track as known in genome browsers) by making its specification and data available in the same directory on a web server.

The URL import supports parameters, which are described below within the named templates.

Example
{\n  ...,\n  \"vconcat\": [\n    ...,\n    { \"import\": { \"url\": \"includes/annotations.json\" } },\n    { \"import\": { \"url\": \"https://example.site/tracks/annotations.json\" } }\n  ]\n}\n
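
The URL handling described in this section can be reproduced with the standard URL class of JavaScript. The sketch below only illustrates the resolution rules; the base URL and the data file name genes.tsv are made up, and GenomeSpy itself may resolve URLs by other means.

```javascript
// Hypothetical base URL of the importing specification.
const baseUrl = 'https://example.site/vis/';

// A relative import is resolved with respect to the current base URL.
const importedUrl = new URL('includes/annotations.json', baseUrl).href;

// The directory of the imported file becomes its new base URL ...
const importedBaseUrl = new URL('.', importedUrl).href;

// ... so relative data URLs inside it point next to the specification.
const dataUrl = new URL('genes.tsv', importedBaseUrl).href;

// Absolute URLs are unaffected by the base URL.
const absoluteUrl = new URL('https://example.site/tracks/annotations.json', baseUrl).href;
```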
"},{"location":"grammar/import/#repeating-with-named-templates","title":"Repeating with named templates","text":"

Instead of importing from external files, views can offer named templates for reuse by their descendants. In the example below, the provided specification features a template called \"myTrack,\" which is applied twice, each instance with a unique set of parameters. The imported view can access the parameters using expressions. This approach enables the modification of visual elements through parameter changes, streamlining the creation of varied visualizations from a single template without the need to duplicate the base specification fragment.

{\n  \"vconcat\": [\n    {\n      \"import\": {\n        \"template\": \"myTrack\"\n      },\n      \"params\": [{ \"name\": \"size\", \"value\": 5 }]\n    },\n    {\n      \"import\": {\n        \"template\": \"myTrack\"\n      },\n      \"params\": { \"offset\": 3.141, \"size\": 20 }\n    }\n  ],\n  \"templates\": {\n    \"myTrack\": {\n      \"params\": [\n        { \"name\": \"offset\", \"value\": 0 },\n        { \"name\": \"size\", \"value\": 10 }\n      ],\n      \"data\": {\n        \"sequence\": { \"start\": 0, \"stop\": 20, \"step\": 0.2, \"as\": \"x\" }\n      },\n      \"transform\": [\n        { \"type\": \"formula\", \"expr\": \"sin(datum.x + offset)\", \"as\": \"y\" }\n      ],\n      \"mark\": \"point\",\n      \"encoding\": {\n        \"size\": { \"value\": { \"expr\": \"size\" } },\n        \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n        \"y\": { \"field\": \"y\", \"type\": \"quantitative\" }\n      }\n    }\n  }\n}\n
"},{"location":"grammar/parameters/","title":"Parameters","text":"

Work in progress

This page is a work in progress and is incomplete.

Parameters enable various dynamic behaviors in GenomeSpy visualizations, such as interactive selections, conditional encoding, and data filtering with expressions. They also enable parameterization when importing specification fragments from external files or named templates. Parameters in GenomeSpy are heavily inspired by the parameters concept of Vega-Lite.

"},{"location":"grammar/parameters/#examples","title":"Examples","text":""},{"location":"grammar/parameters/#using-input-bindings","title":"Using Input Bindings","text":"

Parameters can be bound to input elements, such as sliders, dropdowns, and checkboxes. The GenomeSpy Core library shows the input elements below the visualization. In the GenomeSpy App, the input elements are shown in the View visibility menu, allowing the visualization author to provide sophisticated configuration options to the end user.

The following example shows how to bind parameters to input elements and use them to control the size, angle, and text of a text mark.

{\n  \"padding\": 0,\n  \"view\": { \"fill\": \"#cbeef3\" },\n  \"params\": [\n    {\n      \"name\": \"size\",\n      \"value\": 80,\n      \"bind\": { \"input\": \"range\", \"min\": 1, \"max\": 300 }\n    },\n    {\n      \"name\": \"angle\",\n      \"value\": 0,\n      \"bind\": { \"input\": \"range\", \"min\": 0, \"max\": 360 }\n    },\n    {\n      \"name\": \"text\",\n      \"value\": \"Params are cool!\",\n      \"bind\": {\n        \"input\": \"select\",\n        \"options\": [\"Params are cool!\", \"GenomeSpy\", \"Hello\", \"World\"]\n      }\n    }\n  ],\n\n  \"data\": { \"values\": [{}] },\n\n  \"mark\": {\n    \"type\": \"text\",\n    \"font\": \"Lobster\",\n    \"text\": { \"expr\": \"text\" },\n    \"size\": { \"expr\": \"size\" },\n    \"angle\": { \"expr\": \"angle\" }\n  }\n}\n
"},{"location":"grammar/parameters/#expressions","title":"Expressions","text":"

Parameters can be based on expressions, which can depend on other parameters. They are automatically re-evaluated when the dependent parameters change.

{\n  \"view\": { \"stroke\": \"lightgray\" },\n  \"params\": [\n    {\n      \"name\": \"A\",\n      \"value\": 2,\n      \"bind\": { \"input\": \"range\", \"min\": 0, \"max\": 10, \"step\": 1 }\n    },\n    {\n      \"name\": \"B\",\n      \"value\": 3,\n      \"bind\": { \"input\": \"range\", \"min\": 0, \"max\": 10, \"step\": 1 }\n    },\n    {\n      \"name\": \"C\",\n      \"expr\": \"A * B\"\n    }\n  ],\n\n  \"data\": { \"values\": [{}] },\n\n  \"mark\": {\n    \"type\": \"text\",\n    \"size\": 30,\n    \"text\": { \"expr\": \"'' + A + ' * ' + B + ' = ' + C\" }\n  }\n}\n
"},{"location":"grammar/parameters/#selection-parameters","title":"Selection parameters","text":"

Parameters allow for defining interactive selections, which can be used in conditional encodings. GenomeSpy compiles the conditional encoding rules into efficient GPU shader code, enabling fast interactions in very large data sets. However, only point selections are currently supported.

The following example has been adapted from Vega-Lite's example gallery with slight modifications (GenomeSpy provides no \"bar\" mark). The specification below is fully compatible with Vega-Lite. You can select multiple bars by holding down the Shift key.

{\n  \"description\": \"A bar chart with highlighting on hover and selecting on click. (Inspired by Tableau's interaction style.)\",\n\n  \"data\": {\n    \"values\": [\n      { \"a\": \"A\", \"b\": 28 },\n      { \"a\": \"B\", \"b\": 55 },\n      { \"a\": \"C\", \"b\": 43 },\n      { \"a\": \"D\", \"b\": 91 },\n      { \"a\": \"E\", \"b\": 81 },\n      { \"a\": \"F\", \"b\": 53 },\n      { \"a\": \"G\", \"b\": 19 },\n      { \"a\": \"H\", \"b\": 87 },\n      { \"a\": \"I\", \"b\": 52 }\n    ]\n  },\n  \"params\": [\n    {\n      \"name\": \"highlight\",\n      \"select\": { \"type\": \"point\", \"on\": \"pointerover\" }\n    },\n    { \"name\": \"select\", \"select\": \"point\" }\n  ],\n  \"mark\": {\n    \"type\": \"rect\",\n    \"fill\": \"#4C78A8\",\n    \"stroke\": \"black\"\n  },\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"a\",\n      \"type\": \"ordinal\",\n      \"scale\": { \"type\": \"band\", \"padding\": 0.2 }\n    },\n    \"y\": { \"field\": \"b\", \"type\": \"quantitative\" },\n    \"fillOpacity\": {\n      \"value\": 0.3,\n      \"condition\": { \"param\": \"select\", \"value\": 1 }\n    },\n    \"strokeWidth\": {\n      \"value\": 0,\n      \"condition\": [\n        { \"param\": \"select\", \"value\": 2, \"empty\": false },\n        { \"param\": \"highlight\", \"value\": 1, \"empty\": false }\n      ]\n    }\n  }\n}\n
"},{"location":"grammar/scale/","title":"Scale","text":"

Scales are functions that map abstract data values (e.g., a type of a point mutation) to visual values (e.g., colors that indicate the type).

By default, GenomeSpy configures scales automatically based on the data type (e.g., \"ordinal\"), the visual channel, and the data domain. As the defaults may not always be optimal, the scales can be configured explicitly.

Specifying a scale for a channel
{\n  \"encoding\": {\n    \"y\": {\n      \"field\": \"impact\",\n      \"type\": \"quantitative\",\n      \"scale\": {\n        \"type\": \"linear\",\n        \"domain\": [0, 1]\n      }\n    }\n  },\n  ...\n}\n
"},{"location":"grammar/scale/#vega-lite-scales","title":"Vega-Lite scales","text":"

GenomeSpy implements most of the scale types of Vega-Lite. The aim is to replicate their behavior identically (unless stated otherwise) in GenomeSpy. Although that has yet to fully materialize, Vega-Lite's scale documentation generally applies to GenomeSpy as well.

The supported scales are: \"linear\", \"pow\", \"sqrt\", \"symlog\", \"log\", \"ordinal\", \"band\", \"point\", \"quantize\", and \"threshold\". Disabled scale is supported on quantitative channels such as x and opacity.

Currently, the following scales are not supported: \"time\", \"utc\", \"quantile\", \"bin-linear\", \"bin-ordinal\".

Relation to Vega scales

In fact, GenomeSpy uses Vega scales, which are based on d3-scale. However, GenomeSpy has GPU-based implementations for the actual scale transformations, ensuring high rendering performance.

"},{"location":"grammar/scale/#genomespy-specific-scales","title":"GenomeSpy-specific scales","text":"

GenomeSpy provides two additional scales that are designed for molecular sequence data.

"},{"location":"grammar/scale/#index-scale","title":"Index scale","text":"

The \"index\" scale allows mapping index-based values such as nucleotide or amino-acid locations to positional visual channels. It has traits from both the continuous \"linear\" and the discrete \"band\" scale. It is linear and zoomable but maps indices to the range like the band scale does \u2013 each index has its own band. Properties such as padding work just as in the band scale.

The indices must be zero-based, i.e., the counting must start from zero. The numbering of the axis labels can be adjusted to give an impression of, for example, one-based indexing.

The index scale is used by default when the field type is \"index\".

"},{"location":"grammar/scale/#point-indices","title":"Point indices","text":"

When only the primary positional channel is defined, marks such as \"rect\" fill the whole band.

{\n  \"data\": {\n    \"values\": [0, 2, 4, 7, 8, 10, 12]\n  },\n  \"encoding\": {\n    \"x\": { \"field\": \"data\", \"type\": \"index\" }\n  },\n  \"layer\": [\n    {\n      \"mark\": \"rect\",\n      \"encoding\": {\n        \"color\": { \"field\": \"data\", \"type\": \"nominal\" }\n      }\n    },\n    {\n      \"mark\": \"text\",\n      \"encoding\": {\n        \"text\": {\n          \"field\": \"data\"\n        }\n      }\n    }\n  ]\n}\n

Marks such as \"point\" that do not support the secondary positional channel are centered.

{\n  \"data\": {\n    \"values\": [0, 2, 4, 7, 8, 10, 12]\n  },\n  \"mark\": \"point\",\n  \"encoding\": {\n    \"x\": { \"field\": \"data\", \"type\": \"index\" },\n    \"color\": { \"field\": \"data\", \"type\": \"nominal\" },\n    \"size\": { \"value\": 300 }\n  }\n}\n
"},{"location":"grammar/scale/#range-indices","title":"Range indices","text":"

When the index scale is used with ranges, e.g., a \"rect\" mark that has both the x and x2 channels defined, the ranges must be half open. For example, if a segment should cover the indices 2, 3, and 4, a half-open range would be defined as: x = 2 (inclusive), x2 = 5 (exclusive).

{\n  \"data\": {\n    \"values\": [\n      { \"from\": 0, \"to\": 2 },\n      { \"from\": 2, \"to\": 5 },\n      { \"from\": 8, \"to\": 9 },\n      { \"from\": 10, \"to\": 13 }\n    ]\n  },\n  \"encoding\": {\n    \"x\": { \"field\": \"from\", \"type\": \"index\" },\n    \"x2\": { \"field\": \"to\" }\n  },\n  \"layer\": [\n    {\n      \"mark\": \"rect\",\n      \"encoding\": {\n        \"color\": { \"field\": \"from\", \"type\": \"nominal\" }\n      }\n    },\n    {\n      \"mark\": \"text\",\n      \"encoding\": {\n        \"text\": {\n          \"expr\": \"'[' + datum.from + ', ' + datum.to + ')'\"\n        }\n      }\n    }\n  ]\n}\n
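
If the source data store ranges with both ends inclusive, they need to be converted before encoding. A minimal sketch of the conversion, with hypothetical helper names:

```javascript
// Converts a closed range of zero-based indices (both ends inclusive)
// to the half-open form expected by the x and x2 channels.
function toHalfOpen(first, last) {
  return { from: first, to: last + 1 };
}

// Lists the indices covered by a half-open range [from, to).
function coveredIndices(range) {
  const indices = [];
  for (let i = range.from; i < range.to; i++) {
    indices.push(i);
  }
  return indices;
}

// A segment covering the indices 2, 3, and 4.
const segment = toHalfOpen(2, 4);
const covered = coveredIndices(segment);
```

Here segment has from = 2 and to = 5, which corresponds to the second data object of the example above.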
"},{"location":"grammar/scale/#adjusting-the-indexing-of-axis-labels","title":"Adjusting the indexing of axis labels","text":"

The index scale expects zero-based indexing. However, it may be desirable to display the axis labels using one-based indexing. Use the numberingOffset property to adjust the label indices.

{\n  \"data\": {\n    \"values\": [0, 2, 4, 7, 8, 10, 12]\n  },\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"data\",\n      \"type\": \"index\",\n      \"scale\": {\n        \"numberingOffset\": 1\n      }\n    }\n  },\n  \"layer\": [\n    {\n      \"mark\": \"rect\",\n      \"encoding\": {\n        \"color\": { \"field\": \"data\", \"type\": \"nominal\" }\n      }\n    },\n    {\n      \"mark\": \"text\",\n      \"encoding\": {\n        \"text\": {\n          \"field\": \"data\"\n        }\n      }\n    }\n  ]\n}\n
"},{"location":"grammar/scale/#locus-scale","title":"Locus scale","text":"

The \"locus\" scale is similar to the \"index\" scale, but provides a genome-aware axis with concatenated chromosomes. To use the locus scale, a genome must be specified.

The locus scale is used by default when the field type is \"locus\".

Note

The locus scale does not map the discrete chromosomes onto the concatenated axis. It's done by the linearizeGenomicCoordinate transform.

"},{"location":"grammar/scale/#specifying-the-domain","title":"Specifying the domain","text":"

By default, the domain of the locus scale consists of the whole genome. However, you can specify a custom domain using either linearized or genomic coordinates. A genomic coordinate consists of a chromosome (chrom) and an optional position (pos). The left bound's position defaults to zero, whereas the right bound's position defaults to the size of the chromosome. Thus, the chromosomes are inclusive.

For example, chromosomes 3, 4, and 5:

[{ \"chrom\": \"chr3\" }, { \"chrom\": \"chr5\" }]\n

Only the chromosome 3:

[{ \"chrom\": \"chr3\" }]\n

A specific region inside the chromosome 3:

[\n  { \"chrom\": \"chr3\", \"pos\": 1000000 },\n  { \"chrom\": \"chr3\", \"pos\": 2000000 }\n]\n

Somewhere inside the chromosome 1:

[1000000, 2000000]\n
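
The defaulting rules above can be sketched as a function that turns a domain into a pair of linearized coordinates. The chromosome sizes are toy values invented for the example, positions are treated as plain offsets from the start of the chromosome, and the function name linearizeDomain is hypothetical; the sketch does not reproduce GenomeSpy's genome handling.

```javascript
// Toy chromosome sizes, invented for illustration (not a real assembly).
const chromSizes = [
  ['chr1', 1000],
  ['chr2', 800],
  ['chr3', 600],
];

// Start offset of each chromosome on the concatenated axis.
const chroms = new Map();
let offset = 0;
for (const [name, size] of chromSizes) {
  chroms.set(name, { start: offset, size: size });
  offset += size;
}

// Resolves a domain given as genomic or linearized coordinates
// to a [start, end] pair of linearized coordinates.
function linearizeDomain(domain) {
  const resolve = (bound, isRight) => {
    if (typeof bound === 'number') {
      return bound; // already linearized
    }
    const chrom = chroms.get(bound.chrom);
    if (!chrom) {
      throw new Error('Unknown chromosome: ' + bound.chrom);
    }
    // Left bound defaults to zero, right bound to the chromosome size.
    const pos = bound.pos ?? (isRight ? chrom.size : 0);
    return chrom.start + pos;
  };
  const left = domain[0];
  const right = domain[domain.length - 1]; // a single element serves as both bounds
  return [resolve(left, false), resolve(right, true)];
}
```

With these toy sizes, a domain containing only chr3 resolves to [1800, 2400], that is, the whole chromosome.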
"},{"location":"grammar/scale/#example","title":"Example","text":"
{\n  \"genome\": { \"name\": \"hg38\" },\n  \"data\": {\n    \"values\": [\n      { \"chrom\": \"chr3\", \"pos\": 134567890 },\n      { \"chrom\": \"chr4\", \"pos\": 123456789 },\n      { \"chrom\": \"chr9\", \"pos\": 34567890 }\n    ]\n  },\n  \"mark\": \"point\",\n  \"encoding\": {\n    \"x\": {\n      \"chrom\": \"chrom\",\n      \"pos\": \"pos\",\n      \"type\": \"locus\",\n      \"scale\": {\n        \"domain\": [{ \"chrom\": \"chr3\" }, { \"chrom\": \"chr9\" }]\n      }\n    },\n    \"size\": { \"value\": 200 }\n  }\n}\n
"},{"location":"grammar/scale/#zooming-and-panning","title":"Zooming and panning","text":"

To enable zooming and panning of continuous scales on positional channels, set the zoom scale property to true. Example:

{\n  \"x\": {\n    \"field\": \"foo\",\n    \"type\": \"quantitative\",\n    \"scale\": {\n      \"zoom\": true\n    }\n  }\n}\n

Both \"index\" and \"locus\" scales are zoomable by default.

"},{"location":"grammar/scale/#zoom-extent","title":"Zoom extent","text":"

The zoom extent allows you to control how far the scale can be zoomed out or panned (translated). Zoom extent equals the scale domain by default, except for the \"locus\" scale, where it includes the whole genome. Example:

{\n  ...,\n  \"scale\": {\n    \"domain\": [10, 20],\n    \"zoom\": {\n      \"extent\": [0, 30]\n    }\n  }\n}\n
"},{"location":"grammar/scale/#named-scales","title":"Named scales","text":"

By giving the scale a name, it can be accessed through the API.

{\n  ...,\n  \"scale\": {\n    \"name\": \"myScale\"\n  }\n}\n
"},{"location":"grammar/scale/#axes","title":"Axes","text":"

Positional channels are usually annotated with axes, which are automatically generated based on the scale type. However, you can customize the axis by specifying the axis property in the encoding block.

{\n  ...,\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"foo\",\n      \"type\": \"quantitative\",\n      \"axis\": {\n        \"title\": \"My axis title\"\n      }\n    }\n  }\n}\n

GenomeSpy implements most of Vega-Lite's axis properties. See the interface definition for supported properties. TODO: Write proper documentation.

Grid lines

Grid lines are hidden by default in GenomeSpy and can be enabled for each view using the grid property. The default behavior will be configurable once GenomeSpy supports themes.

"},{"location":"grammar/scale/#genome-axis-for-loci","title":"Genome axis for loci","text":"

The genome axis is a special axis for the \"locus\" scale. It displays chromosome names and the intra-chromosomal coordinates. You can adjust the style of the chromosome axis and grid using various parameters.

{\n  \"genome\": { \"name\": \"hg38\" },\n  \"data\": { \"values\": [{}] },\n  \"mark\": \"point\",\n\n  \"encoding\": {\n    \"x\": {\n      \"chrom\": \"a\",\n      \"pos\": \"b\",\n      \"type\": \"locus\",\n\n      \"axis\": {\n        \"chromTickColor\": \"#5F87F5\",\n        \"chromLabelColor\": \"#E16B67\",\n\n        \"grid\": true,\n        \"gridColor\": \"gray\",\n        \"gridOpacity\": 0.5,\n        \"gridDash\": [1, 11],\n\n        \"chromGrid\": true,\n        \"chromGridDash\": [3, 3],\n        \"chromGridColor\": \"#5F87F5\",\n        \"chromGridOpacity\": 0.7,\n        \"chromGridFillEven\": \"#BEFACC\",\n        \"chromGridFillOdd\": \"#FDFCE8\"\n      }\n    }\n  }\n}\n
"},{"location":"grammar/scale/#fully-customized-axes","title":"Fully customized axes","text":"

You can also disable the genome axis and grid and specify a custom axis instead. The \"axisGenome\" data source provides the chromosomes and their sizes, which can be used to create custom axes or grids for a view.

"},{"location":"grammar/types/","title":"Types Used in the Grammar","text":"

Note

The list is still incomplete.

"},{"location":"grammar/types/#compareparams","title":"CompareParams","text":"field Required

Type: string (field name)[] | string (field name)

The field(s) to sort by

order

Type: (\"ascending\" | \"descending\")[] | \"ascending\" | \"descending\"

The order(s) to use: \"ascending\" (default), \"descending\".

"},{"location":"grammar/types/#dynamicopacity","title":"DynamicOpacity","text":"channel

Type: \"x\" | \"y\"

TODO

unitsPerPixel Required

Type: array

Stops expressed as units (base pairs, for example) per pixel.

values Required

Type: array

Opacity values that match the given stops.
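
As a hedged illustration, a DynamicOpacity definition for a detail layer that fades in while the user zooms in could look like the following, written here as a JavaScript object literal. The stop values are arbitrary, and how the opacity behaves between the stops is not specified on this page.

```javascript
// A hypothetical view fragment; the numbers are illustrative only.
const detailLayer = {
  opacity: {
    channel: 'x',
    // Stops as units (e.g., base pairs) per pixel, from zoomed out to zoomed in.
    unitsPerPixel: [100000, 40000],
    // Opacities matching the stops: invisible when zoomed out, opaque when zoomed in.
    values: [0, 1],
  },
  mark: 'point',
};
```

Each entry of values pairs with the stop at the same position in unitsPerPixel, so the two arrays should have equal lengths.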

"},{"location":"grammar/types/#exprref","title":"ExprRef","text":"expr Required

Type: string

The expression string.

"},{"location":"grammar/types/#paddings","title":"Paddings","text":"bottom

Type: number

TODO

left

Type: number

TODO

right

Type: number

TODO

top

Type: number

TODO

"},{"location":"grammar/types/#step","title":"Step","text":"step Required

Type: number

TODO

"},{"location":"grammar/types/#sizedef","title":"SizeDef","text":"grow

Type: number

Share of the remaining space. See child sizing for details.

px

Type: number

Size in pixels
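
One plausible reading of these two properties is a flexbox-like allocation: every child first receives its px size, and the space left over is divided in proportion to the grow values. The sketch below rests on that assumption and uses a hypothetical function name; the child sizing documentation remains the authoritative description.

```javascript
// Distributes the available pixels among children described by SizeDefs.
// Assumption: px is a fixed part and grow is a share of what remains.
function allocateSizes(sizeDefs, available) {
  const fixed = sizeDefs.reduce((sum, s) => sum + (s.px ?? 0), 0);
  const totalGrow = sizeDefs.reduce((sum, s) => sum + (s.grow ?? 0), 0);
  const remaining = Math.max(0, available - fixed);

  return sizeDefs.map((s) => {
    const share = totalGrow > 0 ? ((s.grow ?? 0) / totalGrow) * remaining : 0;
    return (s.px ?? 0) + share;
  });
}

// Three children in a 500 px container.
const sizes = allocateSizes([{ px: 100 }, { grow: 1 }, { grow: 3 }], 500);
```

In this example the fixed child keeps its 100 pixels and the remaining 400 pixels are split 1:3 between the other two.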

"},{"location":"grammar/types/#title","title":"Title","text":"align

Type: \"left\" | \"center\" | \"right\"

Horizontal text alignment for title text. One of \"left\", \"center\", or \"right\".

anchor

Type: None | start | middle | end

The anchor position for placing the title and subtitle text. One of \"start\", \"middle\", or \"end\". For example, with an orientation of top these anchor positions map to a left-, center-, or right-aligned title.

angle

Type: number | ExprRef

Angle in degrees of title and subtitle text.

baseline

Type: \"top\" | \"middle\" | \"bottom\" | \"alphabetic\"

Vertical text baseline for title and subtitle text. One of \"alphabetic\" (default), \"top\", \"middle\", or \"bottom\".

color

Type: string | ExprRef

Text color for title text.

dx

Type: number

Delta offset for title and subtitle text x-coordinate.

dy

Type: number

Delta offset for title and subtitle text y-coordinate.

font

Type: string

Font name for title text.

fontSize

Type: number | ExprRef

Font size in pixels for title text.

fontStyle

Type: \"normal\" | \"italic\"

Font style for title text.

fontWeight

Type: number | \"thin\" | \"light\" | \"regular\" | \"normal\" | \"medium\" | \"bold\" | \"black\"

Font weight for title text. This can be either a string (e.g., \"bold\", \"normal\") or a number (100, 200, 300, ..., 900, where \"normal\" = 400 and \"bold\" = 700).

frame

Type: \"bounds\" | \"group\"

The reference frame for the anchor position, one of \"bounds\" (to anchor relative to the full bounding box) or \"group\" (to anchor relative to the group width or height).

offset

Type: number

The orthogonal offset in pixels by which to displace the title group from its position along the edge of the chart.

orient

Type: \"none\" | \"left\" | \"right\" | \"top\" | \"bottom\"

Default title orientation (\"none\", \"top\", \"bottom\", \"left\", or \"right\")

style

Type: string

A mark style property to apply to the title text mark. If not specified, a default style of \"group-title\" is applied.

text Required

Type: string | ExprRef

The title text.

"},{"location":"grammar/types/#viewbackground","title":"ViewBackground","text":"fill

Type: string | ExprRef

The fill color.

fillOpacity

Type: number | ExprRef

The fill opacity. Value between 0 and 1.

stroke

Type: string | ExprRef

The stroke color

strokeOpacity

Type: number | ExprRef

The stroke opacity. Value between 0 and 1.

strokeWidth

Type: number

The stroke width in pixels.

"},{"location":"grammar/types/#viewopacitydef","title":"ViewOpacityDef","text":"

Type: number | DynamicOpacity | ExprRef

"},{"location":"grammar/composition/","title":"View Composition","text":"

GenomeSpy replicates the hierarchical composition model of Vega-Lite, and currently provides the concatenation and layer composition operators in the core library. In addition, the GenomeSpy app provides a facet operator for visualizing sample collections using a track-based layout.

The hierarchical model allows for nesting composition operators. For instance, you could have a visualization with two views side by side, and those views could contain multiple layered views. The views in the hierarchy inherit (transformed) data and encoding from their parents, and in some cases, the views may also share scales and axes with their siblings and parents. The data and encoding inherited from ancestors can always be overridden by the descendants.
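
The nesting described above could be sketched as the following specification, written as a JavaScript object literal. The data values and field names are invented; the point is only that the two layered views sit side by side inside hconcat, inherit data and encoding from the root, and that a descendant overrides part of the inherited encoding.

```javascript
// A hypothetical specification: two side-by-side views, each with layers.
const spec = {
  // Data and encoding defined at the root are inherited by all descendants.
  data: { values: [{ a: 1, b: 2 }, { a: 2, b: 5 }, { a: 3, b: 3 }] },
  encoding: {
    x: { field: 'a', type: 'quantitative' },
    y: { field: 'b', type: 'quantitative' },
  },
  hconcat: [
    {
      layer: [
        { mark: 'point' },
        // Adds a text channel on top of the inherited x and y.
        { mark: 'text', encoding: { text: { field: 'b' } } },
      ],
    },
    {
      // This branch overrides the inherited y encoding.
      encoding: { y: { field: 'a', type: 'quantitative' } },
      layer: [{ mark: 'point' }, { mark: 'text', encoding: { text: { field: 'a' } } }],
    },
  ],
};
```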

"},{"location":"grammar/composition/#scale-and-axis-resolution","title":"Scale and axis resolution","text":"

Each visual channel of a view has a scale, which is either \"independent\" or \"shared\" with other views. For example, sharing the scale on the positional x channel links the zooming interactions of the participating views through the shared scale domain. The axes of positional channels can be configured similarly.

The resolve property configures the scale and axis resolutions for the view's children.

An example of a resolution configuration
{\n  \"resolve\": {\n    \"scale\": {\n      \"x\": \"shared\",\n      \"y\": \"independent\",\n      \"color\": \"independent\"\n    },\n    \"axis\": {\n      \"x\": \"shared\",\n      \"y\": \"independent\"\n    }\n  },\n  ...\n}\n
"},{"location":"grammar/composition/#shared","title":"Shared","text":"

The example below shows an excerpt of segmented copy number data layered on raw SNP logR values. The scale of the y channel is shared by default and the domain is unioned. As the x channel's scale is also shared, the zooming interaction affects both views.

{\n  \"layer\": [\n    {\n      \"data\": { \"url\": \"../data/cnv_chr19_raw.tsv\" },\n      \"title\": \"Single probe\",\n\n      \"mark\": {\n        \"type\": \"point\",\n        \"geometricZoomBound\": 9.5\n      },\n\n      \"encoding\": {\n        \"x\": { \"field\": \"Position\", \"type\": \"index\" },\n        \"y\": { \"field\": \"logR\", \"type\": \"quantitative\" },\n        \"size\": { \"value\": 225 },\n        \"opacity\": { \"value\": 0.15 }\n      }\n    },\n    {\n      \"data\": {\n        \"url\": \"../data/cnv_chr19_segs.tsv\"\n      },\n      \"title\": \"Segment mean\",\n      \"mark\": {\n        \"type\": \"rule\",\n        \"size\": 3.0,\n        \"minLength\": 3.0,\n        \"color\": \"black\"\n      },\n      \"encoding\": {\n        \"x\": { \"field\": \"startpos\", \"type\": \"index\" },\n        \"x2\": { \"field\": \"endpos\" },\n        \"y\": { \"field\": \"segMean\", \"type\": \"quantitative\" }\n      }\n    }\n  ]\n}\n
"},{"location":"grammar/composition/#independent","title":"Independent","text":"

By specifying that the scales of the y channel should remain \"independent\", both layers get their own scales and axes. Obviously, such a configuration makes no sense with these data.

{\n  \"resolve\": {\n    \"scale\": { \"y\": \"independent\" },\n    \"axis\": { \"y\": \"independent\" }\n  },\n  \"layer\": [\n    {\n      \"data\": { \"url\": \"../data/cnv_chr19_raw.tsv\" },\n      \"title\": \"Single probe\",\n\n      \"mark\": {\n        \"type\": \"point\",\n        \"geometricZoomBound\": 9.5\n      },\n\n      \"encoding\": {\n        \"x\": { \"field\": \"Position\", \"type\": \"index\" },\n        \"y\": { \"field\": \"logR\", \"type\": \"quantitative\" },\n        \"size\": { \"value\": 225 },\n        \"opacity\": { \"value\": 0.15 }\n      }\n    },\n    {\n      \"data\": {\n        \"url\": \"../data/cnv_chr19_segs.tsv\"\n      },\n      \"title\": \"Segment mean\",\n      \"mark\": {\n        \"type\": \"rule\",\n        \"size\": 3.0,\n        \"minLength\": 3.0,\n        \"color\": \"black\"\n      },\n      \"encoding\": {\n        \"x\": { \"field\": \"startpos\", \"type\": \"index\" },\n        \"x2\": { \"field\": \"endpos\" },\n        \"y\": { \"field\": \"segMean\", \"type\": \"quantitative\" }\n      }\n    }\n  ]\n}\n
"},{"location":"grammar/composition/concat/","title":"View Concatenation","text":"

The vconcat and hconcat composition operators place views side-by-side either vertically or horizontally. The vconcat is practical for building genomic visualizations with multiple tracks. The concat operator with the columns property produces a wrapping grid layout.

The spacing (in pixels) between concatenated views can be adjusted using the spacing property (Default: 10).

"},{"location":"grammar/composition/concat/#example","title":"Example","text":""},{"location":"grammar/composition/concat/#vertical","title":"Vertical","text":"

Using vconcat for a vertical layout.

{\n  \"data\": { \"url\": \"sincos.csv\" },\n\n  \"spacing\": 20,\n\n  \"vconcat\": [\n    {\n      \"mark\": \"point\",\n      \"encoding\": {\n        \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n        \"y\": { \"field\": \"sin\", \"type\": \"quantitative\" }\n      }\n    },\n    {\n      \"mark\": \"point\",\n      \"encoding\": {\n        \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n        \"y\": { \"field\": \"cos\", \"type\": \"quantitative\" }\n      }\n    }\n  ]\n}\n
"},{"location":"grammar/composition/concat/#horizontal","title":"Horizontal","text":"

Using hconcat for a horizontal layout.

{\n  \"data\": { \"url\": \"sincos.csv\" },\n\n  \"hconcat\": [\n    {\n      \"mark\": \"point\",\n      \"encoding\": {\n        \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n        \"y\": { \"field\": \"sin\", \"type\": \"quantitative\" }\n      }\n    },\n    {\n      \"mark\": \"point\",\n      \"encoding\": {\n        \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n        \"y\": { \"field\": \"cos\", \"type\": \"quantitative\" }\n      }\n    }\n  ]\n}\n
"},{"location":"grammar/composition/concat/#grid","title":"Grid","text":"

Using concat and columns for a grid layout. For simplicity, the same visualization is used for all panels in the grid.

{\n  \"data\": { \"url\": \"sincos.csv\" },\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n    \"y\": { \"field\": \"sin\", \"type\": \"quantitative\" }\n  },\n\n  \"columns\": 3,\n  \"concat\": [\n    { \"mark\": \"point\" },\n    { \"mark\": \"point\" },\n    { \"mark\": \"point\" },\n    { \"mark\": \"point\" },\n    { \"mark\": \"point\" },\n    { \"mark\": \"point\" },\n    { \"mark\": \"point\" },\n    { \"mark\": \"point\" },\n    { \"mark\": \"point\" }\n  ]\n}\n
"},{"location":"grammar/composition/concat/#child-sizing","title":"Child sizing","text":"

The concatenation operators mimic the behavior of the CSS flexbox. The child views have an absolute minimum size in pixels (px) and a unitless grow value that specifies the proportion in which any remaining space is distributed. The remaining space depends on the parent view's size.

In the following example, the left view has a width of 20 px, the center view has a grow of 1, and the right view has a grow of 2. If you resize the web browser, you can observe that the width of the left view stays constant while the remaining space is distributed in proportions of 1:2.

{\n  \"data\": { \"values\": [{}] },\n\n  \"spacing\": 10,\n\n  \"hconcat\": [\n    {\n      \"width\": { \"px\": 20 },\n      \"mark\": \"rect\"\n    },\n    {\n      \"width\": { \"grow\": 1 },\n      \"mark\": \"rect\"\n    },\n    {\n      \"width\": { \"grow\": 2 },\n      \"mark\": \"rect\"\n    }\n  ]\n}\n
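
The distribution rule described above can be sketched in JavaScript. This is a minimal illustration, not GenomeSpy's layout code: the function name distributeSizes is hypothetical, and the sketch assumes that the spacing between views is reserved before the remaining space is shared according to the grow values.

```javascript
// Hypothetical helper, not part of the GenomeSpy API.
// Each child is { px, grow }; a missing component counts as zero.
function distributeSizes(children, containerSize, spacing = 0) {
  const totalPx = children.reduce((sum, c) => sum + (c.px ?? 0), 0);
  const totalGrow = children.reduce((sum, c) => sum + (c.grow ?? 0), 0);
  const totalSpacing = spacing * Math.max(0, children.length - 1);

  // Space left after the absolute sizes and the gaps have been reserved.
  const remaining = Math.max(0, containerSize - totalPx - totalSpacing);

  return children.map(
    (c) =>
      (c.px ?? 0) +
      (totalGrow > 0 ? (remaining * (c.grow ?? 0)) / totalGrow : 0)
  );
}

// The three views of the example in a 340 px wide container.
const widths = distributeSizes([{ px: 20 }, { grow: 1 }, { grow: 2 }], 340, 10);
console.log(widths); // [ 20, 100, 200 ]
```

With a 340 px container, 20 px goes to the fixed view and 20 px to the two gaps, so the remaining 300 px is split 1:2 between the growing views.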
"},{"location":"grammar/composition/concat/#sizedef","title":"SizeDef","text":"grow

Type: number

Share of the remaining space. See child sizing for details.

px

Type: number

Size in pixels.

The size may have both absolute (px) and proportional (grow) components. When views are nested, both the absolute and proportional sizes are added up. Thus, the width of the above example is { \"px\": 40, \"grow\": 3 }. The spacing between the child views is added to the total absolute width.

Views' size properties (width and height) accept both SizeDef objects and shorthands. The SizeDef objects contain either or both of the px and grow properties. Numbers are interpreted as absolute sizes, and \"container\" is the same as { grow: 1 }. Undefined sizes generally default to \"container\".
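
The shorthand rules and the summation of nested sizes can be sketched as follows. The helper names toSizeDef and sumSizes are hypothetical and not part of the GenomeSpy API. The sketch covers only the px and grow components described here and ignores step-based sizes.

```javascript
// Hypothetical helpers, not part of the GenomeSpy API.
// Expand a shorthand into a { px, grow } object.
function toSizeDef(size) {
  if (size === undefined || size === 'container') {
    return { px: 0, grow: 1 };
  }
  if (typeof size === 'number') {
    return { px: size, grow: 0 };
  }
  return { px: size.px ?? 0, grow: size.grow ?? 0 };
}

// Total size of concatenated children along the concatenation direction.
// The spacing between the children adds to the absolute component.
function sumSizes(sizes, spacing = 0) {
  const total = { px: spacing * Math.max(0, sizes.length - 1), grow: 0 };
  for (const size of sizes.map((s) => toSizeDef(s))) {
    total.px += size.px;
    total.grow += size.grow;
  }
  return total;
}

// The child sizing example: 20 px + grow 1 + grow 2, with a spacing of 10.
console.log(sumSizes([{ px: 20 }, { grow: 1 }, { grow: 2 }], 10));
// { px: 40, grow: 3 }
```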

Concatenation operators can be nested flexibly to build complex layouts, as in the following example.

{\n  \"data\": { \"values\": [{}] },\n\n  \"hconcat\": [\n    { \"mark\": \"rect\" },\n    {\n      \"vconcat\": [{ \"mark\": \"rect\" }, { \"mark\": \"rect\" }]\n    }\n  ]\n}\n
"},{"location":"grammar/composition/concat/#scrollable-viewports","title":"Scrollable viewports","text":"

Sometimes the contents of a view are so large that they do not fit into the available space. In such cases, the view can be made scrollable by setting an explicit size for the view using the viewportWidth and viewportHeight properties. They accept the same values as the width and height properties except for the step size. Scrollable viewports are particularly useful for categorical data types (\"ordinal\" and \"nominal\") and respective scales and axes that do not support zooming and panning.

{\n  \"height\": { \"step\": 20 },\n  \"viewportHeight\": \"container\",\n\n  \"view\": { \"stroke\": \"lightgray\" },\n\n  \"data\": { \"sequence\": { \"start\": 0, \"stop\": 31, \"step\": 1 } },\n\n  \"encoding\": {\n    \"x\": { \"field\": \"data\", \"type\": \"quantitative\" },\n    \"y\": { \"field\": \"data\", \"type\": \"ordinal\" }\n  },\n\n  \"mark\": { \"type\": \"point\" }\n}\n
"},{"location":"grammar/composition/concat/#resolve","title":"Resolve","text":"

By default, all channels have \"independent\" scales and axes. However, because track-based layouts that resemble genome browsers are such a common use case, vconcat defaults to \"shared\" resolution for x channel and hconcat defaults to \"shared\" resolution for y channel.
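
The defaults can be restated as a small lookup. The function below is a hypothetical illustration of the rule described above, not GenomeSpy's resolution logic, and it only covers the concatenation operators.

```javascript
// Hypothetical helper that restates the documented defaults
// for the concatenation operators.
function defaultResolution(operator, channel) {
  if (operator === 'vconcat' && channel === 'x') return 'shared';
  if (operator === 'hconcat' && channel === 'y') return 'shared';
  return 'independent';
}

console.log(defaultResolution('vconcat', 'x')); // shared
console.log(defaultResolution('vconcat', 'y')); // independent
console.log(defaultResolution('concat', 'x')); // independent
```

Any of these defaults can be overridden with the resolve property of the view.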

"},{"location":"grammar/composition/concat/#shared-axes","title":"Shared axes","text":"

Concatenation operators support shared axes on channels that also have shared scales. Axis domain line, ticks, and labels are drawn only once for each row or column. Grid lines are drawn for all participating views.

{\n  \"data\": { \"url\": \"sincos.csv\" },\n\n  \"resolve\": {\n    \"scale\": { \"x\": \"shared\", \"y\": \"shared\" },\n    \"axis\": { \"x\": \"shared\", \"y\": \"shared\" }\n  },\n\n  \"spacing\": 20,\n\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\", \"axis\": { \"grid\": true } },\n    \"y\": { \"field\": \"sin\", \"type\": \"quantitative\", \"axis\": { \"grid\": true } }\n  },\n\n  \"columns\": 2,\n\n  \"concat\": [\n    { \"mark\": \"point\", \"view\": { \"stroke\": \"lightgray\" } },\n    { \"mark\": \"point\", \"view\": { \"stroke\": \"lightgray\" } },\n    { \"mark\": \"point\", \"view\": { \"stroke\": \"lightgray\" } },\n    { \"mark\": \"point\", \"view\": { \"stroke\": \"lightgray\" } }\n  ]\n}\n
"},{"location":"grammar/composition/layer/","title":"Layering Views","text":"

The layer operator superimposes multiple views over each other.

"},{"location":"grammar/composition/layer/#example","title":"Example","text":"
{\n  \"data\": {\n    \"values\": [\n      { \"a\": \"A\", \"b\": 28 },\n      { \"a\": \"B\", \"b\": 55 },\n      { \"a\": \"C\", \"b\": 43 },\n      { \"a\": \"D\", \"b\": 91 },\n      { \"a\": \"E\", \"b\": 81 },\n      { \"a\": \"F\", \"b\": 53 },\n      { \"a\": \"G\", \"b\": 19 },\n      { \"a\": \"H\", \"b\": 87 },\n      { \"a\": \"I\", \"b\": 52 }\n    ]\n  },\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"a\",\n      \"type\": \"nominal\",\n      \"scale\": { \"padding\": 0.1 },\n      \"axis\": { \"labelAngle\": 0 }\n    },\n    \"y\": { \"field\": \"b\", \"type\": \"quantitative\" }\n  },\n  \"layer\": [\n    {\n      \"name\": \"Bar\",\n      \"mark\": \"rect\"\n    },\n    {\n      \"name\": \"Label\",\n      \"mark\": { \"type\": \"text\", \"dy\": -9 },\n      \"encoding\": {\n        \"text\": { \"field\": \"b\" }\n      }\n    }\n  ]\n}\n

To specify multiple layers, use the layer property:

{\n  \"layer\": [\n    ...  // Single or layered view specifications\n  ]\n}\n

The provided array may contain both single view specifications and layer specifications. The encodings and data that are specified in a layer view propagate to its descendants. For example, in the above example, the \"Bar\" and \"Label\" views inherit the data and encodings for the x and y channels from their parent, the layer view.

"},{"location":"grammar/composition/layer/#resolve","title":"Resolve","text":"

By default, layers share their scales and axes, unioning the data domains.

"},{"location":"grammar/composition/layer/#more-examples","title":"More examples","text":""},{"location":"grammar/composition/layer/#lollipop-plot","title":"Lollipop plot","text":"

This example layers two marks to create a composite mark, a lollipop. Yet another layer is used for the baseline.

{\n  \"name\": \"The Root\",\n  \"description\": \"Lollipop plot example\",\n\n  \"layer\": [\n    {\n      \"name\": \"Baseline\",\n      \"data\": { \"values\": [0] },\n      \"mark\": \"rule\",\n      \"encoding\": {\n        \"y\": { \"field\": \"data\", \"type\": \"quantitative\", \"title\": null },\n        \"color\": { \"value\": \"lightgray\" }\n      }\n    },\n    {\n      \"name\": \"Arrows\",\n\n      \"data\": {\n        \"sequence\": {\n          \"start\": 0,\n          \"stop\": 6.284,\n          \"step\": 0.39269908169,\n          \"as\": \"x\"\n        }\n      },\n\n      \"transform\": [\n        { \"type\": \"formula\", \"expr\": \"sin(datum.x)\", \"as\": \"sin(x)\" }\n      ],\n\n      \"encoding\": {\n        \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n        \"y\": {\n          \"field\": \"sin(x)\",\n          \"type\": \"quantitative\",\n          \"scale\": { \"padding\": 0.1 }\n        },\n        \"color\": { \"field\": \"sin(x)\", \"type\": \"quantitative\" }\n      },\n\n      \"layer\": [\n        {\n          \"name\": \"Arrow shafts\",\n\n          \"mark\": {\n            \"type\": \"rule\",\n            \"size\": 3\n          }\n        },\n        {\n          \"name\": \"Arrowheads\",\n\n          \"mark\": {\n            \"type\": \"point\",\n            \"size\": 500,\n            \"filled\": true\n          },\n\n          \"encoding\": {\n            \"shape\": {\n              \"field\": \"sin(x)\",\n              \"type\": \"nominal\",\n              \"scale\": {\n                \"type\": \"threshold\",\n                \"domain\": [-0.01, 0.01],\n                \"range\": [\"triangle-down\", \"diamond\", \"triangle-up\"]\n              }\n            }\n          }\n        }\n      ]\n    }\n  ]\n}\n
"},{"location":"grammar/data/","title":"Data Input","text":"

Like Vega-Lite's data model, GenomeSpy utilizes a tabular data structure as its fundamental data model, resembling a spreadsheet or database table. Each data set in GenomeSpy is considered to consist of a set of records, each containing various named data fields.

In GenomeSpy, the data property within a view specification describes the data source. In a hierarchically composed view specification, the views inherit the data, which may be further transformed, from their parent views. However, each view can also override the inherited data.

Non-indexed eager data, which is fully loaded during the visualization initialization stage, can be provided as inline data (values) or by specifying a URL from which the data can be loaded (url). Additionally, you can use a sequence generator for generating sequences of numbers.

GenomeSpy provides several lazy data sources that load data on-demand in response to user interactions to support large genomic data sets comprising millions of records. These data sources enable easy handling of standard bioinformatic data formats such as indexed FASTA and BigWig.

Furthermore, GenomeSpy enables the creation of an empty data source with a given name. This data source can be dynamically updated using the API, making it particularly useful when embedding GenomeSpy in web applications.

"},{"location":"grammar/data/eager/","title":"Eager Data Sources","text":"

Eager data sources load and process all available data during the initialization stage. They are suitable for small data sets as they do not support partial loading or loading in response to user interactions. However, eager data sources are often more flexible and straightforward than lazy ones.

GenomeSpy inputs eager data as tabular \"csv\", \"tsv\", and \"json\" files or as non-indexed \"fasta\" files. Data can be loaded from URLs or provided inline. You can also use generators to generate data on the fly and further modify them using transforms.

The data property of the view specification describes a data source. The following example loads a tab-delimited file. By default, GenomeSpy infers the format from the file extension. However, in bioinformatics, CSV files are often actually tab-delimited, and you must specify the \"tsv\" format explicitly:

Example: Eagerly loading data from a URL
{\n  \"data\": {\n    \"url\": \"fileWithTabs.csv\",\n    \"format\": { \"type\": \"tsv\" }\n  },\n  ...\n}\n

With the exception of the unsupported geographical formats, the data property of GenomeSpy is identical to Vega-Lite's data property.

Type inference

GenomeSpy uses vega-loader to parse tabular data and infer its data types. Vega-loader is sometimes overly eager to interpret strings as dates. In such cases, the field types need to be specified explicitly. Explicit type specification also speeds up parsing significantly.
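
A sketch of an explicit type specification, written as a JavaScript object of the kind that is passed to embed. It assumes that the parse property of Vega-Lite's format object is available here, given that the data property is described as identical to Vega-Lite's. The file name and the field names (sample, logR) are made up for illustration.

```javascript
// The file name and the field names are hypothetical.
const spec = {
  data: {
    url: 'samples.tsv',
    format: {
      type: 'tsv',
      // Assumes Vega-Lite-style parse directives: field name to data type.
      // Declaring sample as a string keeps it from being read as a date.
      parse: { sample: 'string', logR: 'number' },
    },
  },
  mark: 'point',
  encoding: {
    x: { field: 'sample', type: 'nominal' },
    y: { field: 'logR', type: 'quantitative' },
  },
};
```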

Handling empty (NA) values

Empty or missing values must be presented as empty strings instead of the NA that R writes by default. Otherwise, type inference fails for numeric fields.

"},{"location":"grammar/data/eager/#named-data","title":"Named Data","text":"

When embedding GenomeSpy in a web application or page, data can be added or updated at runtime using the API. Data sources are referenced by a name, which is passed to the updateNamedData method:

{\n    \"data\": {\n        \"name\": \"myResults\"\n    }\n    ...\n}\n
const api = await embed(\"#container\", spec);\napi.updateNamedData(\"myResults\", [\n  { x: 1, y: 2 },\n  { x: 2, y: 3 },\n]);\n

Although named data can be updated dynamically, it does not automatically respond to user interactions. For practical examples of dynamically updated named data, check the embed-examples package.

"},{"location":"grammar/data/eager/#bioinformatic-formats","title":"Bioinformatic Formats","text":"

Most bioinformatic data formats are supported through lazy data. The following formats are supported as eager data with the url source.

"},{"location":"grammar/data/eager/#fasta","title":"FASTA","text":"

The type of FASTA format is \"fasta\" as shown in the example below:

{\n  \"data\": {\n    \"url\": \"16SRNA_Deino_87seq_copy.aln\",\n    \"format\": {\n      \"type\": \"fasta\"\n    }\n  },\n  ...\n}\n

The FASTA loader produces data objects with two fields: identifier and sequence. With the \"flattenSequence\" transform you can split the sequences into individual bases (one object per base) for easier visualization.

"},{"location":"grammar/data/lazy/","title":"Lazy Data Sources","text":"

Lazy data sources load data on-demand in response to user interactions. Unlike eager sources, most lazy data sources support indexing, which offers the capability to retrieve and load data partially and incrementally, as users navigate the genome. This is especially useful for very large datasets that are infeasible to load in their entirety.

How it works

Lazy data sources observe the scale domains of the view where the data source is specified. When the domain changes as a result of a user interaction, the data source invokes a request to fetch a new subset of the data. Lazy sources need the visual channel to be specified, which is used to determine the scale to observe. For genomic data sources, the channel defaults to \"x\".

Lazy data sources are specified using the lazy property of the data object. Unlike in eager data, the type of the data source must be specified explicitly:

Example: Specifying a lazy data source
{\n  \"data\": {\n    \"lazy\": {\n      \"type\": \"bigwig\",\n      \"url\": \"https://data.genomespy.app/genomes/hg38/hg38.gc5Base.bw\"\n    }\n  },\n  ...\n}\n
"},{"location":"grammar/data/lazy/#indexed-fasta","title":"Indexed FASTA","text":"

The \"indexedFasta\" source enables fast random access to a reference sequence. It loads the sequence as three consecutive chunks that cover and flank the currently visible region (domain), allowing the user to rapidly pan the view. The chunks are provided as data objects with the following fields: chrom (string), start (integer), and sequence (a string of bases).
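
The chunking can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: it assumes that chunks are aligned to multiples of the window size and that the middle chunk is the one containing the center of the visible domain. The function name chunksForDomain is hypothetical.

```javascript
// Hypothetical helper, not part of the GenomeSpy API.
// Returns the [start, end) intervals of the three chunks that cover and
// flank the visible domain, or an empty array if the domain is too long.
function chunksForDomain(domain, windowSize) {
  const [start, end] = domain;

  // Data is fetched only when the domain is shorter than the window size.
  if (end - start >= windowSize) {
    return [];
  }

  // Assumption: chunks are aligned to multiples of windowSize.
  const center = Math.floor((start + end) / 2 / windowSize);

  return [center - 1, center, center + 1]
    .filter((i) => i >= 0)
    .map((i) => [i * windowSize, (i + 1) * windowSize]);
}

console.log(chunksForDomain([20003500, 20003540], 7000));
// [ [ 19992000, 19999000 ], [ 19999000, 20006000 ], [ 20006000, 20013000 ] ]
```

Because the flanking chunks are already loaded, the user can pan by up to a full window in either direction before a new fetch is needed.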

"},{"location":"grammar/data/lazy/#parameters","title":"Parameters","text":"channel

Type: \"x\" | \"y\"

Which channel's scale domain to monitor.

Default value: \"x\"

debounce

Type: number | ExprRef

Debounce time for data updates, in milliseconds. Debouncing prevents excessive data updates when the user is zooming or panning around.

Default value: 200

debounceMode

Type: string

The debounce mode for data updates. If set to \"domain\", domain change events (panning and zooming) will be debounced. If set to \"window\", the data fetches initiated by the changes to the visible window (or tile) will be debounced. If your data is small, the \"window\" mode is better as it will start fetching data while the user is still panning around, resulting in a shorter perceived latency.

Default value: \"window\"

indexUrl

Type: string

URL of the index file.

Default value: url + \".fai\".

url Required

Type: string

URL of the fasta file.

windowSize

Type: number

Size of each chunk when fetching the fasta file. Data is only fetched when the length of the visible domain is smaller than the window size.

Default value: 7000

"},{"location":"grammar/data/lazy/#example","title":"Example","text":"

The example below shows how to specify a sequence track using an indexed FASTA file. The sequence chunks are split into separate data objects using the \"flattenSequence\" transform, and the final position of each nucleotide is computed using the \"formula\" transform. Please note that new data are fetched only when the user zooms into a region smaller than the window size (default: 7000 bp).

{\n  \"genome\": { \"name\": \"hg38\" },\n\n  \"data\": {\n    \"lazy\": {\n      \"type\": \"indexedFasta\",\n      \"url\": \"https://data.genomespy.app/genomes/hg38/hg38.fa\"\n    }\n  },\n\n  \"transform\": [\n    {\n      \"type\": \"flattenSequence\",\n      \"field\": \"sequence\",\n      \"as\": [\"rawPos\", \"base\"]\n    },\n    { \"type\": \"formula\", \"expr\": \"datum.rawPos + datum.start\", \"as\": \"pos\" }\n  ],\n\n  \"encoding\": {\n    \"x\": {\n      \"chrom\": \"chrom\",\n      \"pos\": \"pos\",\n      \"type\": \"locus\",\n      \"scale\": {\n        \"domain\": [\n          { \"chrom\": \"chr7\", \"pos\": 20003500 },\n          { \"chrom\": \"chr7\", \"pos\": 20003540 }\n        ]\n      }\n    },\n    \"color\": {\n      \"field\": \"base\",\n      \"type\": \"nominal\",\n      \"scale\": {\n        \"domain\": [\"A\", \"C\", \"T\", \"G\", \"a\", \"c\", \"t\", \"g\", \"N\"],\n        \"range\": [\n          \"#7BD56C\",\n          \"#FF9B9B\",\n          \"#86BBF1\",\n          \"#FFC56C\",\n          \"#7BD56C\",\n          \"#FF9B9B\",\n          \"#86BBF1\",\n          \"#FFC56C\",\n          \"#E0E0E0\"\n        ]\n      }\n    }\n  },\n  \"layer\": [\n    {\n      \"mark\": \"rect\"\n    },\n    {\n      \"mark\": {\n        \"type\": \"text\",\n        \"size\": 13,\n        \"fitToBand\": true,\n        \"paddingX\": 1.5,\n        \"paddingY\": 1,\n        \"opacity\": 0.7,\n        \"flushX\": false,\n        \"tooltip\": null\n      },\n      \"encoding\": {\n        \"color\": { \"value\": \"black\" },\n        \"text\": { \"field\": \"base\" }\n      }\n    }\n  ]\n}\n

The data source is based on GMOD's indexedfasta-js library.

"},{"location":"grammar/data/lazy/#bigwig","title":"BigWig","text":"

The \"bigwig\" source enables the retrieval of dense, continuous data, such as coverage or other signal data stored in BigWig files. It behaves similarly to the indexed FASTA source, loading the data in chunks that cover and flank the currently visible region. However, the window size automatically adapts to the zoom level, and data are fetched in higher resolution when zooming in. The data source provides data objects with the following fields: chrom (string), start (integer), end (integer), and score (number).

"},{"location":"grammar/data/lazy/#parameters_1","title":"Parameters","text":"channel

Type: \"x\" | \"y\"

Which channel's scale domain to monitor.

Default value: \"x\"

debounce

Type: number | ExprRef

Debounce time for data updates, in milliseconds. Debouncing prevents excessive data updates when the user is zooming or panning around.

Default value: 200

debounceMode

Type: string

The debounce mode for data updates. If set to \"domain\", domain change events (panning and zooming) will be debounced. If set to \"window\", the data fetches initiated by the changes to the visible window (or tile) will be debounced. If your data is small, the \"window\" mode is better as it will start fetching data while the user is still panning around, resulting in a shorter perceived latency.

Default value: \"window\"

pixelsPerBin

Type: number | ExprRef

The approximate minimum width of each data bin, in pixels.

Default value: 2

url Required

Type: string | ExprRef

URL of the BigWig file.

"},{"location":"grammar/data/lazy/#example_1","title":"Example","text":"

The example below shows the GC content of the human genome in 5-base windows. When you zoom in, the resolution of the data automatically increases.

{\n  \"genome\": { \"name\": \"hg38\" },\n  \"view\": { \"stroke\": \"lightgray\" },\n\n  \"data\": {\n    \"lazy\": {\n      \"type\": \"bigwig\",\n      \"url\": \"https://data.genomespy.app/genomes/hg38/hg38.gc5Base.bw\"\n    }\n  },\n\n  \"encoding\": {\n    \"y\": {\n      \"field\": \"score\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"domain\": [0, 100] },\n      \"axis\": { \"title\": \"GC (%)\", \"grid\": true, \"gridDash\": [2, 2] }\n    },\n    \"x\": { \"chrom\": \"chrom\", \"pos\": \"start\", \"type\": \"locus\" },\n    \"x2\": { \"chrom\": \"chrom\", \"pos\": \"end\" }\n  },\n\n  \"mark\": \"rect\"\n}\n

The data source is based on GMOD's bbi-js library.

"},{"location":"grammar/data/lazy/#bigbed","title":"BigBed","text":"

The \"bigbed\" source enables the retrieval of segmented data, such as annotated genomic regions stored in BigBed files.

"},{"location":"grammar/data/lazy/#parameters_2","title":"Parameters","text":"channel

Type: \"x\" | \"y\"

Which channel's scale domain to monitor.

Default value: \"x\"

debounce

Type: number | ExprRef

Debounce time for data updates, in milliseconds. Debouncing prevents excessive data updates when the user is zooming or panning around.

Default value: 200

debounceMode

Type: string

The debounce mode for data updates. If set to \"domain\", domain change events (panning and zooming) will be debounced. If set to \"window\", the data fetches initiated by the changes to the visible window (or tile) will be debounced. If your data is small, the \"window\" mode is better as it will start fetching data while the user is still panning around, resulting in a shorter perceived latency.

Default value: \"window\"

url Required

Type: string | ExprRef

URL of the BigBed file.

windowSize

Type: number | ExprRef

Size of each chunk when fetching the BigBed file. Data is only fetched when the length of the visible domain is smaller than the window size.

Default value: 1000000

"},{"location":"grammar/data/lazy/#example_2","title":"Example","text":"

The example below displays the \"ENCODE Candidate Cis-Regulatory Elements (cCREs) combined from all cell types\" dataset for the hg38 genome.

{\n  \"genome\": { \"name\": \"hg38\" },\n  \"view\": { \"stroke\": \"lightgray\" },\n\n  \"data\": {\n    \"lazy\": {\n      \"type\": \"bigbed\",\n      \"url\": \"https://data.genomespy.app/sample-data/encodeCcreCombined.hg38.bb\"\n    }\n  },\n\n  \"encoding\": {\n    \"x\": {\n      \"chrom\": \"chrom\",\n      \"pos\": \"chromStart\",\n      \"type\": \"locus\",\n      \"scale\": {\n        \"domain\": [\n          { \"chrom\": \"chr7\", \"pos\": 66600000 },\n          { \"chrom\": \"chr7\", \"pos\": 66800000 }\n        ]\n      }\n    },\n    \"x2\": {\n      \"chrom\": \"chrom\",\n      \"pos\": \"chromEnd\"\n    },\n    \"color\": {\n      \"field\": \"ucscLabel\",\n      \"type\": \"nominal\",\n      \"scale\": {\n        \"domain\": [\"prom\", \"enhP\", \"enhD\", \"K4m3\", \"CTCF\"],\n        \"range\": [\"#FF0000\", \"#FFA700\", \"#FFCD00\", \"#FFAAAA\", \"#00B0F0\"]\n      }\n    }\n  },\n\n  \"mark\": \"rect\"\n}\n

The data source is based on GMOD's bbi-js library.

"},{"location":"grammar/data/lazy/#gff3","title":"GFF3","text":"

The tabix-based \"gff3\" source enables the retrieval of hierarchical data, such as genomic annotations stored in GFF3 files. The object format GenomeSpy uses is described in gff-js's documentation. The flatten and project transforms are useful when extracting the child features and attributes from the hierarchical data structure. See the example below.

"},{"location":"grammar/data/lazy/#parameters_3","title":"Parameters","text":"channel

Type: \"x\" | \"y\"

Which channel's scale domain to monitor.

Default value: \"x\"

debounce

Type: number | ExprRef

Debounce time for data updates, in milliseconds. Debouncing prevents excessive data updates when the user is zooming or panning around.

Default value: 200

debounceMode

Type: string

The debounce mode for data updates. If set to \"domain\", domain change events (panning and zooming) will be debounced. If set to \"window\", the data fetches initiated by the changes to the visible window (or tile) will be debounced. If your data is small, the \"window\" mode is better as it will start fetching data while the user is still panning around, resulting in a shorter perceived latency.

Default value: \"window\"

indexUrl

Type: string

URL of the tabix index file.

Default value: url + \".tbi\".

url Required

Type: string

URL of the bgzip-compressed file.

windowSize

Type: number

Size of each chunk when fetching the Tabix file. Data is only fetched when the length of the visible domain is smaller than the window size.

Default value: 30000000

"},{"location":"grammar/data/lazy/#example_3","title":"Example","text":"

The example below displays the human (GRCh38.p13) GENCODE v43 annotation dataset. Please note that the example shows a maximum of ten overlapping features per locus as vertical scrolling is currently not supported properly.

{\n  \"$schema\": \"https://unpkg.com/@genome-spy/core/dist/schema.json\",\n\n  \"genome\": { \"name\": \"hg38\" },\n\n  \"height\": { \"step\": 28 },\n  \"viewportHeight\": \"container\",\n\n  \"view\": { \"stroke\": \"lightgray\" },\n\n  \"data\": {\n    \"lazy\": {\n      \"type\": \"gff3\",\n      \"url\": \"https://data.genomespy.app/sample-data/gencode.v43.annotation.sorted.gff3.gz\",\n      \"windowSize\": 2000000,\n      \"debounceDomainChange\": 300\n    }\n  },\n\n  \"transform\": [\n    {\n      \"type\": \"flatten\"\n    },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"datum.attributes.gene_name\",\n      \"as\": \"gene_name\"\n    },\n    {\n      \"type\": \"flatten\",\n      \"fields\": [\"child_features\"]\n    },\n    {\n      \"type\": \"flatten\",\n      \"fields\": [\"child_features\"],\n      \"as\": [\"child_feature\"]\n    },\n    {\n      \"type\": \"project\",\n      \"fields\": [\n        \"gene_name\",\n        \"child_feature.type\",\n        \"child_feature.strand\",\n        \"child_feature.seq_id\",\n        \"child_feature.start\",\n        \"child_feature.end\",\n        \"child_feature.attributes.gene_type\",\n        \"child_feature.attributes.transcript_type\",\n        \"child_feature.attributes.gene_id\",\n        \"child_feature.attributes.transcript_id\",\n        \"child_feature.attributes.transcript_name\",\n        \"child_feature.attributes.tag\",\n        \"source\",\n        \"child_feature.child_features\"\n      ],\n      \"as\": [\n        \"gene_name\",\n        \"type\",\n        \"strand\",\n        \"seq_id\",\n        \"start\",\n        \"end\",\n        \"gene_type\",\n        \"transcript_type\",\n        \"gene_id\",\n        \"transcript_id\",\n        \"transcript_name\",\n        \"tag\",\n        \"source\",\n        \"_child_features\"\n      ]\n    },\n    {\n      \"type\": \"collect\",\n      \"sort\": {\n        \"field\": [\"seq_id\", \"start\", \"transcript_id\"]\n      }\n    },\n    
{\n      \"type\": \"pileup\",\n      \"start\": \"start\",\n      \"end\": \"end\",\n      \"as\": \"_lane\"\n    }\n  ],\n\n  \"encoding\": {\n    \"x\": {\n      \"chrom\": \"seq_id\",\n      \"pos\": \"start\",\n      \"offset\": 1,\n      \"type\": \"locus\",\n      \"scale\": {\n        \"domain\": [\n          { \"chrom\": \"chr5\", \"pos\": 177482500 },\n          { \"chrom\": \"chr5\", \"pos\": 177518000 }\n        ]\n      }\n    },\n    \"x2\": {\n      \"chrom\": \"seq_id\",\n      \"pos\": \"end\"\n    },\n    \"y\": {\n      \"field\": \"_lane\",\n      \"type\": \"index\",\n      \"scale\": {\n        \"zoom\": false,\n        \"reverse\": true,\n        \"domain\": [0, 40],\n        \"padding\": 0.5\n      },\n      \"axis\": null\n    }\n  },\n\n  \"layer\": [\n    {\n      \"name\": \"gencode-transcript\",\n\n      \"layer\": [\n        {\n          \"name\": \"gencode-tooltip-trap\",\n          \"title\": \"GENCODE transcript\",\n          \"mark\": {\n            \"type\": \"rule\",\n            \"color\": \"#b0b0b0\",\n            \"opacity\": 0,\n            \"size\": 7\n          }\n        },\n        {\n          \"name\": \"gencode-transcript-body\",\n          \"mark\": {\n            \"type\": \"rule\",\n            \"color\": \"#b0b0b0\",\n            \"tooltip\": null\n          }\n        }\n      ]\n    },\n    {\n      \"name\": \"gencode-exons\",\n\n      \"transform\": [\n        {\n          \"type\": \"flatten\",\n          \"fields\": [\"_child_features\"]\n        },\n        {\n          \"type\": \"flatten\",\n          \"fields\": [\"_child_features\"],\n          \"as\": [\"child_feature\"]\n        },\n        {\n          \"type\": \"project\",\n          \"fields\": [\n            \"gene_name\",\n            \"_lane\",\n            \"child_feature.type\",\n            \"child_feature.seq_id\",\n            \"child_feature.start\",\n            \"child_feature.end\",\n            
\"child_feature.attributes.exon_number\",\n            \"child_feature.attributes.exon_id\"\n          ],\n          \"as\": [\n            \"gene_name\",\n            \"_lane\",\n            \"type\",\n            \"seq_id\",\n            \"start\",\n            \"end\",\n            \"exon_number\",\n            \"exon_id\"\n          ]\n        }\n      ],\n\n      \"layer\": [\n        {\n          \"title\": \"GENCODE exon\",\n\n          \"transform\": [{ \"type\": \"filter\", \"expr\": \"datum.type == 'exon'\" }],\n\n          \"mark\": {\n            \"type\": \"rect\",\n            \"minWidth\": 0.5,\n            \"minOpacity\": 0.5,\n            \"stroke\": \"#505050\",\n            \"fill\": \"#fafafa\",\n            \"strokeWidth\": 1.0\n          }\n        },\n        {\n          \"title\": \"GENCODE exon\",\n\n          \"transform\": [\n            {\n              \"type\": \"filter\",\n              \"expr\": \"datum.type != 'exon' && datum.type != 'start_codon' && datum.type != 'stop_codon'\"\n            }\n          ],\n\n          \"mark\": {\n            \"type\": \"rect\",\n            \"minWidth\": 0.5,\n            \"minOpacity\": 0,\n            \"strokeWidth\": 1.0,\n            \"strokeOpacity\": 0.0,\n            \"stroke\": \"gray\"\n          },\n          \"encoding\": {\n            \"fill\": {\n              \"field\": \"type\",\n              \"type\": \"nominal\",\n              \"scale\": {\n                \"domain\": [\"five_prime_UTR\", \"CDS\", \"three_prime_UTR\"],\n                \"range\": [\"#83bcb6\", \"#ffbf79\", \"#d6a5c9\"]\n              }\n            }\n          }\n        },\n        {\n          \"transform\": [\n            {\n              \"type\": \"filter\",\n              \"expr\": \"datum.type == 'three_prime_UTR' || datum.type == 'five_prime_UTR'\"\n            },\n            {\n              \"type\": \"formula\",\n              \"expr\": \"datum.type == 'three_prime_UTR' ? 
\\\"3'\\\" : \\\"5'\\\"\",\n              \"as\": \"label\"\n            }\n          ],\n\n          \"mark\": {\n            \"type\": \"text\",\n            \"color\": \"black\",\n            \"size\": 11,\n            \"opacity\": 0.7,\n            \"paddingX\": 2,\n            \"paddingY\": 1.5,\n            \"tooltip\": null\n          },\n\n          \"encoding\": {\n            \"text\": {\n              \"field\": \"label\"\n            }\n          }\n        }\n      ]\n    },\n    {\n      \"name\": \"gencode-transcript-labels\",\n\n      \"transform\": [\n        {\n          \"type\": \"formula\",\n          \"expr\": \"(datum.strand == '-' ? '< ' : '') + datum.transcript_name + ' - ' + datum.transcript_id + (datum.strand == '+' ? ' >' : '')\",\n          \"as\": \"label\"\n        }\n      ],\n\n      \"mark\": {\n        \"type\": \"text\",\n        \"size\": 10,\n        \"yOffset\": 12,\n        \"tooltip\": null,\n        \"color\": \"#505050\"\n      },\n\n      \"encoding\": {\n        \"text\": {\n          \"field\": \"label\"\n        }\n      }\n    }\n  ]\n}\n

The data source is based on GMOD's tabix-js and gff-js libraries.

"},{"location":"grammar/data/lazy/#bam","title":"BAM","text":"

The \"bam\" source is very much a work in progress but has a low priority. It currently exposes the reads but provides no handling for variant alleles, CIGARs, etc. Please send a message to GitHub Discussions if you are interested in this feature.

"},{"location":"grammar/data/lazy/#parameters_4","title":"Parameters","text":"channel

Type: \"x\" | \"y\"

Which channel's scale domain to monitor.

Default value: \"x\"

debounce

Type: number | ExprRef

Debounce time for data updates, in milliseconds. Debouncing prevents excessive data updates when the user is zooming or panning around.

Default value: 200

debounceMode

Type: string

The debounce mode for data updates. If set to \"domain\", domain change events (panning and zooming) will be debounced. If set to \"window\", the data fetches initiated by changes to the visible window (or tile) will be debounced. If your data is small, the \"window\" mode is better, as it starts fetching data while the user is still panning around, resulting in a shorter perceived latency.

Default value: \"window\"

indexUrl

Type: string

URL of the index file.

Default value: url + \".bai\".

url Required

Type: string

URL of the BAM file.

windowSize

Type: number

Size of each chunk when fetching the BAM file. Data is only fetched when the length of the visible domain is smaller than the window size.

Default value: 10000
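
As a rough illustration of the windowSize rule above, the sketch below decides whether anything is fetched for a visible domain and, if so, which chunks cover it. It is not GenomeSpy's implementation: the function name windows_to_fetch and the assumption that chunks are aligned to multiples of windowSize are hypothetical.

```python
def windows_to_fetch(domain_start, domain_end, window_size):
    # Hypothetical helper, not part of GenomeSpy.
    # Nothing is fetched unless the visible domain is shorter than one window.
    if domain_end - domain_start >= window_size:
        return []
    # Assumption: windows are aligned to multiples of window_size.
    first = domain_start // window_size
    last = (domain_end - 1) // window_size
    return [(i * window_size, (i + 1) * window_size) for i in range(first, last + 1)]


# Domain and window size borrowed from the BAM example on this page.
print(windows_to_fetch(33037317, 33039137, 30000))
# A domain wider than the window yields no fetches at all.
print(windows_to_fetch(0, 50000, 30000))
```

Under these assumptions, a domain that straddles a window boundary maps to two chunks, which is why a larger windowSize tends to mean fewer but heavier requests.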

"},{"location":"grammar/data/lazy/#example_4","title":"Example","text":"
{\n  \"genome\": { \"name\": \"hg18\" },\n\n  \"data\": {\n    \"lazy\": {\n      \"type\": \"bam\",\n      \"url\": \"https://data.genomespy.app/sample-data/bamExample.bam\",\n      \"windowSize\": 30000\n    }\n  },\n\n  \"resolve\": { \"scale\": { \"x\": \"shared\" } },\n\n  \"spacing\": 5,\n\n  \"vconcat\": [\n    {\n      \"view\": { \"stroke\": \"lightgray\" },\n      \"height\": 40,\n\n      \"transform\": [\n        {\n          \"type\": \"coverage\",\n          \"start\": \"start\",\n          \"end\": \"end\",\n          \"as\": \"coverage\",\n          \"chrom\": \"chrom\"\n        }\n      ],\n      \"mark\": \"rect\",\n      \"encoding\": {\n        \"x\": {\n          \"chrom\": \"chrom\",\n          \"pos\": \"start\",\n          \"type\": \"locus\",\n          \"axis\": null\n        },\n        \"x2\": { \"chrom\": \"chrom\", \"pos\": \"end\" },\n        \"y\": { \"field\": \"coverage\", \"type\": \"quantitative\" }\n      }\n    },\n    {\n      \"view\": { \"stroke\": \"lightgray\" },\n\n      \"transform\": [\n        {\n          \"type\": \"pileup\",\n          \"start\": \"start\",\n          \"end\": \"end\",\n          \"as\": \"_lane\"\n        }\n      ],\n\n      \"encoding\": {\n        \"x\": {\n          \"chrom\": \"chrom\",\n          \"pos\": \"start\",\n          \"type\": \"locus\",\n          \"axis\": {},\n          \"scale\": {\n            \"domain\": [\n              { \"chrom\": \"chr21\", \"pos\": 33037317 },\n              { \"chrom\": \"chr21\", \"pos\": 33039137 }\n            ]\n          }\n        },\n        \"x2\": {\n          \"chrom\": \"chrom\",\n          \"pos\": \"end\"\n        },\n        \"y\": {\n          \"field\": \"_lane\",\n          \"type\": \"index\",\n          \"scale\": {\n            \"domain\": [0, 60],\n            \"padding\": 0.3,\n            \"reverse\": true,\n            \"zoom\": false\n          }\n        },\n        \"color\": {\n          \"field\": \"strand\",\n          
\"type\": \"nominal\",\n          \"scale\": {\n            \"domain\": [\"+\", \"-\"],\n            \"range\": [\"crimson\", \"orange\"]\n          }\n        }\n      },\n\n      \"mark\": \"rect\"\n    }\n  ]\n}\n

The data source is based on GMOD's bam-js library.

"},{"location":"grammar/data/lazy/#axis-ticks","title":"Axis ticks","text":"

The \"axisTicks\" data source generates a set of ticks for the specified channel. While GenomeSpy uses this data source internally for generating axis ticks, you can also use it to create fully customized axes. The data source generates data objects with value and label fields.

"},{"location":"grammar/data/lazy/#parameters_5","title":"Parameters","text":"axis

Type: Axis

Optional axis properties

channel Required

Type: \"x\" | \"y\"

Which channel's scale domain to listen to

"},{"location":"grammar/data/lazy/#example_5","title":"Example","text":"

The example below generates approximately three ticks for the x axis.

{\n  \"data\": {\n    \"lazy\": {\n      \"type\": \"axisTicks\",\n      \"channel\": \"x\",\n      \"axis\": {\n        \"tickCount\": 3\n      }\n    }\n  },\n\n  \"mark\": {\n    \"type\": \"text\",\n    \"size\": 20,\n    \"clip\": false\n  },\n\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"value\",\n      \"type\": \"quantitative\",\n      \"scale\": {\n        \"domain\": [0, 10],\n        \"zoom\": true\n      }\n    },\n    \"text\": {\n      \"field\": \"label\"\n    }\n  }\n}\n
"},{"location":"grammar/data/lazy/#axis-genome","title":"Axis genome","text":"

The axisGenome data source, in fact, does not dynamically update data. However, it provides convenient access to the genome (chromosomes) of the given channel, allowing the creation of customized chromosome ticks or annotations. The data source generates data objects with the following fields: name, size (in bp), continuousStart (linearized coordinate), continuousEnd, odd (boolean), and number (1-based index).
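
To make the listed fields concrete, here is a minimal sketch that derives them from a list of chromosome sizes. It is an illustration only: genome_rows is a hypothetical helper, the chromosome names and sizes are toy values, and the meaning of odd (assumed here to flag odd-numbered chromosomes) is an assumption rather than something stated above.

```python
def genome_rows(chromosomes):
    # chromosomes: list of (name, size_in_bp) tuples in genome order.
    rows = []
    offset = 0
    for number, (name, size) in enumerate(chromosomes, start=1):
        rows.append({
            'name': name,
            'size': size,
            # Linearized coordinates: each chromosome starts where the previous ended.
            'continuousStart': offset,
            'continuousEnd': offset + size,
            # Assumption: odd flags odd-numbered chromosomes.
            'odd': number % 2 == 1,
            'number': number,
        })
        offset += size
    return rows


# Toy sizes, not a real assembly.
rows = genome_rows([('chrA', 1000), ('chrB', 500), ('chrC', 250)])
for row in rows:
    print(row)
```

The continuousStart and continuousEnd pair is what makes the fields usable on x and x2, since both live on the same linearized axis.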

"},{"location":"grammar/data/lazy/#parameters_6","title":"Parameters","text":"channel Required

Type: \"x\" | \"y\"

Which channel's scale domain to use

"},{"location":"grammar/data/lazy/#example_6","title":"Example","text":"
{\n  \"genome\": { \"name\": \"hg38\" },\n\n  \"data\": {\n    \"lazy\": {\n      \"type\": \"axisGenome\",\n      \"channel\": \"x\"\n    }\n  },\n\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"continuousStart\",\n      \"type\": \"locus\"\n    },\n    \"x2\": {\n      \"field\": \"continuousEnd\"\n    },\n    \"text\": {\n      \"field\": \"name\"\n    }\n  },\n\n  \"layer\": [\n    {\n      \"transform\": [\n        {\n          \"type\": \"filter\",\n          \"expr\": \"datum.odd\"\n        }\n      ],\n      \"mark\": {\n        \"type\": \"rect\",\n        \"fill\": \"#f0f0f0\"\n      }\n    },\n    {\n      \"mark\": {\n        \"type\": \"text\",\n        \"size\": 16,\n        \"angle\": -90,\n        \"align\": \"right\",\n        \"baseline\": \"top\",\n        \"paddingX\": 3,\n        \"paddingY\": 5,\n        \"y\": 1\n      }\n    }\n  ]\n}\n
"},{"location":"grammar/mark/","title":"Marks","text":"

In GenomeSpy, visualizations are built from marks, which are geometric shapes, such as points, rectangles, and lines, that represent data objects (or rows in tabular data). These marks are mapped to the data using the encoding property, which specifies which visual channels, such as x, color, and size, should be used to encode the data fields. By adjusting the encodings, you can present the same data in a wide range of visual forms, such as scatterplots, bar charts, and heatmaps.

Example: Specifying the mark type
{\n  ...,\n  \"mark\": \"rect\",\n  ...\n}\n
"},{"location":"grammar/mark/#properties","title":"Properties","text":"

Marks also support various properties for controlling their appearance or behavior. The properties can be specified with an object that contains at least the type property:

Example: Specifying the mark type and additional properties
{\n  ...,\n  \"mark\": {\n    \"type\": \"rect\",\n    \"cornerRadius\": 5\n  },\n  ...\n}\n
"},{"location":"grammar/mark/#encoding","title":"Encoding","text":"

While mark properties are static, i.e., the same for all mark instances, encoding allows for mapping data to visual channels and using data-driven visual encoding.

It's worth noting that while all visual encoding channels are also available as static properties, not all properties can be used for encoding. Only certain properties are suitable for encoding data in a meaningful way.

Example: Specifying visual channels with the encoding property
{\n  ...,\n  \"mark\": \"rect\",\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"from\", \"type\": \"index\"\n    },\n    \"x2\": {\n      \"field\": \"to\"\n    },\n    \"color\": {\n      \"field\": \"category\", \"type\": \"nominal\"\n    }\n  },\n  ...\n}\n

The schematic example above uses the \"rect\" mark to represent the data objects. The \"from\" field is mapped to the positional \"x\" channel, and so on. You can adjust the mapping by specifying a scale for the channel.

"},{"location":"grammar/mark/#channels","title":"Channels","text":""},{"location":"grammar/mark/#position-channels","title":"Position channels","text":"

All marks support the two position channels, which define the mark instance's placement in the visualization. If a positional channel is left unspecified, the mark instance is placed at the center of the respective axis.

"},{"location":"grammar/mark/#primary-channels","title":"Primary channels","text":"x The position on the x axis y The position on the y axis"},{"location":"grammar/mark/#secondary-channels","title":"Secondary channels","text":"

Some marks, such as \"rect\" and \"rule\", also support secondary positional channels, which allow specifying an interval that the mark should cover in the visualization.

x2 The secondary position on the x axis y2 The secondary position on the y axis"},{"location":"grammar/mark/#other-channels","title":"Other channels","text":"color Color of the mark. Affects fill or stroke, depending on the filled property. fill Fill color stroke Stroke color opacity Opacity of the mark. Affects fillOpacity or strokeOpacity, depending on the filled property. fillOpacity Fill opacity strokeOpacity Stroke opacity strokeWidth Stroke width in pixels size Depends on the mark. \"point\": the area of the rectangle that encloses the mark instance. \"rule\" and \"link\": stroke width. \"text\": font size. shape Shape of \"point\" marks. angle Rotational angle of \"point\" and \"text\" marks. text Text that the \"text\" mark should render for a mark instance."},{"location":"grammar/mark/#channels-for-sample-collections","title":"Channels for sample collections","text":"

The GenomeSpy app supports an additional channel.

sample Defines the track (or facet) for the sample"},{"location":"grammar/mark/#visual-encoding","title":"Visual Encoding","text":"

GenomeSpy provides several methods for controlling how data is mapped to visual channels. The most common method is to map a field of the data to a channel, but you can also use expressions, values, or data values belonging to the data domain.

Except for the value method, all methods require specifying the data type using the type property, which must be one of \"quantitative\", \"nominal\", \"ordinal\", \"index\", or \"locus\". The first three types are equivalent to the Vega-Lite types of the same name.

"},{"location":"grammar/mark/#field","title":"Field","text":"

field maps a field (or column) of the data to a visual channel.

{\n  \"encoding\": {\n    \"color\": { \"field\": \"significance\", \"type\": \"ordinal\" }\n  },\n  ...\n}\n
"},{"location":"grammar/mark/#expression","title":"Expression","text":"

expr evaluates an expression and passes the result to the scale transformation.

{\n  \"encoding\": {\n    \"color\": { \"expr\": \"datum.score > 10\", \"type\": \"nominal\" }\n  },\n  ...\n}\n
"},{"location":"grammar/mark/#value","title":"Value","text":"

value defines a value on the channel's range, skipping the scale transformation.

{\n  \"encoding\": {\n    \"color\": { \"value\": \"red\" }\n  },\n  ...\n}\n
"},{"location":"grammar/mark/#datum","title":"Datum","text":"

datum defines a value on the domain of the scale used on the channel. Thus, the scale transformation will be applied.

{\n  \"encoding\": {\n    \"color\": { \"datum\": \"important\", \"type\": \"ordinal\" }\n  },\n  ...\n}\n
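
The difference between the four methods can be summarized with a small sketch. This is a conceptual model under stated assumptions, not GenomeSpy code: resolve_channel is a hypothetical function, the scale is a toy lookup table, and an expression is modeled as a Python callable.

```python
def resolve_channel(channel_def, row, scale):
    # Hypothetical resolver; expr is modeled as a Python callable.
    if 'value' in channel_def:
        # value: already on the range, so the scale is skipped.
        return channel_def['value']
    if 'datum' in channel_def:
        # datum: a constant on the domain, so the scale is applied.
        return scale(channel_def['datum'])
    if 'field' in channel_def:
        return scale(row[channel_def['field']])
    if 'expr' in channel_def:
        return scale(channel_def['expr'](row))
    raise ValueError('channel definition needs value, datum, field, or expr')


# Toy color scale: domain value -> range value.
palette = {'low': 'gray', 'important': 'red', True: 'orange', False: 'blue'}
scale = palette.get
row = {'significance': 'low', 'score': 12}

print(resolve_channel({'field': 'significance'}, row, scale))
print(resolve_channel({'expr': lambda d: d['score'] > 10}, row, scale))
print(resolve_channel({'value': 'red'}, row, scale))
print(resolve_channel({'datum': 'important'}, row, scale))
```

Only the value branch bypasses the scale, which is why it is the one method that needs no type.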
"},{"location":"grammar/mark/#chrom-and-pos","title":"Chrom and Pos","text":"

See Working with Genomic Data.

"},{"location":"grammar/mark/link/","title":"Link","text":"

The \"link\" mark displays each data item as a curve that connects two points. The mark can be used to display structural variation and interactions, for example. The mark has several different linkShapes that control how the curve is drawn.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 30, \"as\": \"z\" }\n  },\n  \"transform\": [\n    { \"type\": \"formula\", \"expr\": \"round(random() * 800)\", \"as\": \"x\" },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"round(datum.x + pow(2, random() * 10))\",\n      \"as\": \"x2\"\n    }\n  ],\n  \"mark\": \"link\",\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"index\" },\n    \"x2\": { \"field\": \"x2\" }\n  }\n}\n
"},{"location":"grammar/mark/link/#channels","title":"Channels","text":"

In addition to the primary and secondary position channels and the color and opacity channels, the link mark supports the following channel: size.

"},{"location":"grammar/mark/link/#properties","title":"Properties","text":"arcFadingDistance

Type: [number, number] | boolean | ExprRef

The range of the \"arc\" shape's fading distance in pixels. This property allows for making the arc's opacity fade out as it extends away from the chord. The fading distance is interpolated from one to zero between the interval defined by this property. Both false and [0, 0] disable fading.

Default value: false

arcHeightFactor

Type: number | ExprRef

Scaling factor for the \"arc\" shape's height. The default value 1.0 produces roughly circular arcs.

Default value: 1.0

clampApex

Type: boolean | ExprRef

Whether the apex of the \"dome\" shape is clamped to the viewport edge. When more than half of the dome is located outside the viewport, clamping allows for more accurate reading of the value encoded by the apex's position.

Default value: false

clip

Type: boolean | \"never\"

If true, the mark is clipped to the UnitView's rectangle. By default, clipping is enabled for marks that have zoomable positional scales.

color

Type: string | ExprRef

Color of the mark. Affects either fill or stroke, depending on the filled property.

linkShape

Type: \"arc\" | \"diagonal\" | \"line\" | \"dome\" | ExprRef

The shape of the link path.

The \"arc\" shape draws a circular arc between the two points. The apex of the arc resides on the left side of the line that connects the two points. The \"dome\" shape draws a vertical or horizontal arc with a specific height. The primary positional channel determines the apex of the arc and the secondary determines the endpoint placement. The \"diagonal\" shape draws an \"S\"-shaped curve between the two points. The \"line\" shape draws a straight line between the two points. See an example of the different shapes below.

Default value: \"arc\"

maxChordLength

Type: number | ExprRef

The maximum length of \"arc\" shape's chord in pixels. The chord is the line segment between the two points that define the arc. Limiting the chord length serves two purposes when zooming in close enough: 1) it prevents the arc from becoming a straight line and 2) it mitigates the limited precision of floating point numbers in arc rendering.

Default value: 50000

minArcHeight

Type: number | ExprRef

The minimum height of an \"arc\" shape. Makes very short links more clearly visible.

Default value: 1.5

minBufferSize

Type: number

Minimum size for WebGL buffers (number of data items). Allows for using bufferSubData() to update graphics.

This property is intended for internal use.

minPickingSize

Type: number | ExprRef

The minimum picking size invisibly increases the stroke width or point diameter of marks when pointing at them with the mouse cursor, making it easier to select them. The value is the minimum size in pixels.

Default value: 3.0 for \"link\" and 2.0 for \"point\"

noFadingOnPointSelection

Type: boolean | ExprRef

Disables fading of the link when a mark instance is subject to any point selection. As the fading distance is unavailable as a visual channel, this property allows for enhancing the visibility of the selected links.

Default value: true

opacity

Type: number | ExprRef

Opacity of the mark. Affects fillOpacity or strokeOpacity, depending on the filled property.

orient

Type: \"vertical\" | \"horizontal\" | ExprRef

The orientation of the link path. Either \"vertical\" or \"horizontal\". Only applies to diagonal links.

Default value: \"vertical\"

segments

Type: number | ExprRef

The number of segments in the B\u00e9zier curve. Affects the rendering quality and performance. Use a higher value for a smoother curve.

Default value: 101

size

Type: number | ExprRef

Stroke width of \"link\" and \"rule\" marks in pixels, the area of the bounding square of \"point\" mark, or the font size of \"text\" mark.

tooltip

Type: HandledTooltip | null

Tooltip handler. If null, no tooltip is shown. If string, specifies the tooltip handler to use.

x

Type: number | ExprRef

Position on the x axis.

x2

Type: number | ExprRef

The secondary position on the x axis.

xOffset

Type: number

Offsets of the x and x2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0

y

Type: number | ExprRef

Position on the y axis.

y2

Type: number | ExprRef

The secondary position on the y axis.

yOffset

Type: number

Offsets of the y and y2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0

"},{"location":"grammar/mark/link/#examples","title":"Examples","text":""},{"location":"grammar/mark/link/#different-link-shapes-and-orientations","title":"Different link shapes and orientations","text":"

This example shows the different link shapes and orientations. All links have the same coordinates: { x: 2, y: 2, x2: 8, y2: 8 }. The links are arranged in a grid with linkShape as columns (\"arc\", \"dome\", \"diagonal\", \"line\") and orient as rows (\"vertical\", \"horizontal\").

{\n  \"data\": { \"values\": [{ \"x\": 2, \"x2\": 8 }] },\n  \"resolve\": {\n    \"scale\": { \"x\": \"shared\", \"y\": \"shared\" },\n    \"axis\": { \"x\": \"shared\", \"y\": \"shared\" }\n  },\n\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"x\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"domain\": [0, 10] },\n      \"axis\": { \"grid\": true }\n    },\n    \"x2\": { \"field\": \"x2\" },\n    \"y\": {\n      \"field\": \"x\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"domain\": [0, 10] },\n      \"axis\": { \"grid\": true }\n    },\n    \"y2\": { \"field\": \"x2\" },\n    \"size\": { \"value\": 2 }\n  },\n\n  \"columns\": 4,\n  \"spacing\": 20,\n\n  \"concat\": [\n    { \"mark\": { \"type\": \"link\", \"linkShape\": \"arc\", \"orient\": \"vertical\" } },\n    { \"mark\": { \"type\": \"link\", \"linkShape\": \"dome\", \"orient\": \"vertical\" } },\n    {\n      \"mark\": { \"type\": \"link\", \"linkShape\": \"diagonal\", \"orient\": \"vertical\" }\n    },\n    { \"mark\": { \"type\": \"link\", \"linkShape\": \"line\", \"orient\": \"vertical\" } },\n    { \"mark\": { \"type\": \"link\", \"linkShape\": \"arc\", \"orient\": \"horizontal\" } },\n    { \"mark\": { \"type\": \"link\", \"linkShape\": \"dome\", \"orient\": \"horizontal\" } },\n    {\n      \"mark\": {\n        \"type\": \"link\",\n        \"linkShape\": \"diagonal\",\n        \"orient\": \"horizontal\"\n      }\n    },\n    { \"mark\": { \"type\": \"link\", \"linkShape\": \"line\", \"orient\": \"horizontal\" } }\n  ]\n}\n
"},{"location":"grammar/mark/link/#varying-the-dome-height","title":"Varying the dome height","text":"

This example uses the \"dome\" shape to draw links with varying heights. The height is determined by the y channel. The clampApex property is set to true to ensure that the apex of the dome is always visible. Try to zoom in and pan around to see it in action.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 20, \"as\": \"z\" }\n  },\n\n  \"transform\": [\n    { \"type\": \"formula\", \"expr\": \"round(random() * 1000)\", \"as\": \"x\" },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"round(datum.x + random() * 500)\",\n      \"as\": \"x2\"\n    },\n    { \"type\": \"formula\", \"expr\": \"random() * 1000 - 500\", \"as\": \"y\" }\n  ],\n\n  \"mark\": {\n    \"type\": \"link\",\n    \"linkShape\": \"dome\",\n    \"orient\": \"vertical\",\n    \"clampApex\": true,\n    \"color\": \"gray\"\n  },\n\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"index\" },\n    \"x2\": { \"field\": \"x2\" },\n    \"y\": {\n      \"field\": \"y\",\n      \"type\": \"quantitative\",\n      \"axis\": { \"grid\": true }\n    }\n  }\n}\n
"},{"location":"grammar/mark/point/","title":"Point","text":"

Point mark displays each data item as a symbol. Points are often used to create a scatter plot. In the genomic context, they could represent, for example, point mutations at genomic loci.

{\n  \"data\": { \"url\": \"sincos.csv\" },\n  \"mark\": \"point\",\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n    \"y\": { \"field\": \"sin\", \"type\": \"quantitative\" },\n    \"size\": { \"field\": \"x\", \"type\": \"quantitative\" }\n  }\n}\n
"},{"location":"grammar/mark/point/#channels","title":"Channels","text":"

In addition to standard position channels and color, opacity, and strokeWidth channels, point mark has the following channels: size, shape, dx, and dy.

"},{"location":"grammar/mark/point/#properties","title":"Properties","text":"angle

Type: number | ExprRef

The rotation angle in degrees.

Default value: 0

clip

Type: boolean | \"never\"

If true, the mark is clipped to the UnitView's rectangle. By default, clipping is enabled for marks that have zoomable positional scales.

color

Type: string | ExprRef

Color of the mark. Affects either fill or stroke, depending on the filled property.

fill

Type: string | ExprRef

The fill color.

fillGradientStrength

Type: number | ExprRef

Gradient strength controls the amount of the gradient eye-candy effect in the fill color. Valid values are between 0 and 1.

Default value: 0

fillOpacity

Type: number | ExprRef

The fill opacity. Value between 0 and 1.

filled

Type: boolean

Whether the color represents the fill color (true) or the stroke color (false).

geometricZoomBound

Type: number

Enables geometric zooming. The value is the base two logarithmic zoom level where the maximum point size is reached.

Default value: 0

inwardStroke

Type: boolean | ExprRef

Whether the stroke should only grow inwards, i.e., the diameter/outline is not affected by the stroke width. Thus, a point that has a zero size has no visible stroke. This allows strokes to be used with geometric zoom, etc.

Default value: false

minBufferSize

Type: number

Minimum size for WebGL buffers (number of data items). Allows for using bufferSubData() to update graphics.

This property is intended for internal use.

minPickingSize

Type: number | ExprRef

The minimum picking size invisibly increases the stroke width or point diameter of marks when pointing at them with the mouse cursor, making it easier to select them. The value is the minimum size in pixels.

Default value: 3.0 for \"link\" and 2.0 for \"point\"

opacity

Type: number | ExprRef

Opacity of the mark. Affects fillOpacity or strokeOpacity, depending on the filled property.

semanticZoomFraction

Type: number | ExprRef

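The fraction of data items to show in the fully zoomed-out view when semantic zoom is enabled with the semanticScore channel.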

Default value: 0.02

shape

Type: string | ExprRef

One of \"circle\", \"square\", \"cross\", \"diamond\", \"triangle-up\", \"triangle-down\", \"triangle-right\", \"triangle-left\", \"tick-up\", \"tick-down\", \"tick-right\", or \"tick-left\"

Default value: \"circle\"

size

Type: number | ExprRef

Stroke width of \"link\" and \"rule\" marks in pixels, the area of the bounding square of \"point\" mark, or the font size of \"text\" mark.

stroke

Type: string | ExprRef

The stroke color

strokeOpacity

Type: number | ExprRef

The stroke opacity. Value between 0 and 1.

strokeWidth

Type: number | ExprRef

The stroke width in pixels.

tooltip

Type: HandledTooltip | null

Tooltip handler. If null, no tooltip is shown. If string, specifies the tooltip handler to use.

x

Type: number | ExprRef

Position on the x axis.

xOffset

Type: number

Offsets of the x and x2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0

y

Type: number | ExprRef

Position on the y axis.

yOffset

Type: number

Offsets of the y and y2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0

"},{"location":"grammar/mark/point/#examples","title":"Examples","text":""},{"location":"grammar/mark/point/#plenty-of-points","title":"Plenty of points","text":"

The example below demonstrates how points can be varied by using shape, fill, size, strokeWidth, and angle channels.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 160, \"as\": \"z\" }\n  },\n\n  \"transform\": [\n    { \"type\": \"formula\", \"expr\": \"datum.z % 20\", \"as\": \"x\" },\n    { \"type\": \"formula\", \"expr\": \"floor(datum.z / 20)\", \"as\": \"y\" }\n  ],\n\n  \"mark\": {\n    \"type\": \"point\",\n    \"stroke\": \"black\"\n  },\n\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"ordinal\", \"axis\": null },\n    \"y\": { \"field\": \"y\", \"type\": \"ordinal\", \"axis\": null },\n    \"shape\": { \"field\": \"x\", \"type\": \"nominal\" },\n    \"fill\": { \"field\": \"x\", \"type\": \"nominal\" },\n    \"size\": {\n      \"field\": \"x\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"type\": \"pow\", \"exponent\": 2, \"range\": [0, 900] }\n    },\n    \"strokeWidth\": {\n      \"field\": \"y\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"range\": [0, 4] }\n    },\n    \"angle\": {\n      \"field\": \"y\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"range\": [0, 45] }\n    }\n  }\n}\n
"},{"location":"grammar/mark/point/#zoom-behavior","title":"Zoom behavior","text":"

Although points are infinitely small on the real number line, they have a specific diameter on the screen. Thus, closely located points tend to overlap each other. Decreasing the point size reduces the probability of overlap, but in a zoomed-in view, the plot may become overly sparse.

To control overplotting, the point mark provides two zooming behaviors that adjust the point size and visibility based on the zoom level.

"},{"location":"grammar/mark/point/#geometric-zoom","title":"Geometric zoom","text":"

Geometric zoom scales the point size down if the current zoom level is lower than the specified level (bound). The geometricZoomBound mark property enables geometric zooming. The value is the negative base two logarithm of the relative width of the visible domain at which the points reach their full size. For example, with 0 (the default), full-size points are always shown; with 1, they are shown when half of the domain is visible; with 2, when a quarter is visible; and so on.

The example below displays 200 000 semi-randomly generated points. The points reach their full size when 1 / 2^10.5 of the domain is visible, which equals about 1450X zoom.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 200000, \"as\": \"x\" }\n  },\n  \"transform\": [\n    { \"type\": \"formula\", \"expr\": \"random() * 0.682\", \"as\": \"u\" },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"((datum.u % 1e-8 > 5e-9 ? 1 : -1) * (sqrt(-log(max(1e-9, datum.u))) - 0.618)) * 1.618 + sin(datum.x / 10000)\",\n      \"as\": \"y\"\n    }\n  ],\n  \"mark\": {\n    \"type\": \"point\",\n    \"geometricZoomBound\": 10.5\n  },\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\", \"scale\": { \"zoom\": true } },\n    \"y\": { \"field\": \"y\", \"type\": \"quantitative\" },\n    \"size\": { \"value\": 200 },\n    \"opacity\": { \"value\": 0.6 }\n  }\n}\n
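
The relationship between the bound, the visible fraction of the domain, and the zoom factor is plain base-two arithmetic. The sketch below spells it out; the function names are hypothetical and it says nothing about how GenomeSpy scales the point size below the bound.

```python
import math


def visible_fraction(bound):
    # Fraction of the domain that is visible when full-size points are reached.
    return 2.0 ** -bound


def zoom_factor(bound):
    # Magnification relative to the fully zoomed-out view.
    return 2.0 ** bound


def bound_for_fraction(fraction):
    # Negative base-two logarithm of the relative width of the visible domain.
    return -math.log2(fraction)


for bound in (0, 1, 2, 10.5):
    print(bound, visible_fraction(bound), round(zoom_factor(bound)))
```

Going the other way, bound_for_fraction gives the value to use when you know how much of the domain should be visible before points reach their full size.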

Tip

You can use geometric zoom to improve rendering performance. Smaller points are faster to render than large points.

"},{"location":"grammar/mark/point/#semantic-zoom","title":"Semantic zoom","text":"

The score-based semantic zoom adjusts the point visibility by coupling a score threshold to the current zoom level. The semanticScore channel enables the semantic zoom and specifies the score field. The semanticZoomFraction property controls the fraction of data items to show in the fully zoomed-out view, i.e., it specifies the threshold score. The fraction is scaled as the viewport is zoomed. Thus, if the data is distributed roughly uniformly along the zoomed axis, a roughly constant number of points is visible at all zoom levels. The score can be arbitrarily distributed, as the threshold is computed using p-quantiles.

The example below has 200 000 semi-randomly generated points with an exponentially distributed score. As the view is zoomed in, new points appear. Their number in the viewport stays approximately constant until the lowest possible score has been reached.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 200000, \"as\": \"x\" }\n  },\n  \"transform\": [\n    { \"type\": \"formula\", \"expr\": \"random() * 0.682\", \"as\": \"u\" },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"((datum.u % 1e-8 > 5e-9 ? 1 : -1) * (sqrt(-log(max(1e-9, datum.u))) - 0.618)) * 1.618\",\n      \"as\": \"y\"\n    },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"-log(random())\",\n      \"as\": \"score\"\n    }\n  ],\n  \"mark\": {\n    \"type\": \"point\",\n    \"semanticZoomFraction\": 0.002\n  },\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\", \"scale\": { \"zoom\": true } },\n    \"y\": { \"field\": \"y\", \"type\": \"quantitative\" },\n    \"opacity\": {\n      \"field\": \"score\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"range\": [0.1, 1] }\n    },\n    \"semanticScore\": { \"field\": \"score\", \"type\": \"quantitative\" },\n    \"size\": { \"value\": 100 }\n  }\n}\n

Tip

The score-based semantic zoom is great for filtering point mutations and indels that are scored using CADD, for example.

"},{"location":"grammar/mark/rect/","title":"Rect","text":"

Rect mark displays each data item as a rectangle.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 20, \"as\": \"z\" }\n  },\n  \"transform\": [\n    { \"type\": \"formula\", \"as\": \"x\", \"expr\": \"random()\" },\n    { \"type\": \"formula\", \"as\": \"x2\", \"expr\": \"datum.x + random() * 0.3\" },\n    { \"type\": \"formula\", \"as\": \"y\", \"expr\": \"random()\" },\n    { \"type\": \"formula\", \"as\": \"y2\", \"expr\": \"datum.y + random() * 0.4\" }\n  ],\n  \"mark\": {\n    \"type\": \"rect\",\n    \"strokeWidth\": 2,\n    \"stroke\": \"#404040\",\n    \"cornerRadius\": 5\n  },\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n    \"x2\": { \"field\": \"x2\" },\n    \"y\": { \"field\": \"y\", \"type\": \"quantitative\" },\n    \"y2\": { \"field\": \"y2\" },\n    \"color\": { \"field\": \"z\", \"type\": \"quantitative\" }\n  }\n}\n
"},{"location":"grammar/mark/rect/#channels","title":"Channels","text":"

Rect mark supports the primary and secondary position channels and the color, stroke, fill, opacity, strokeOpacity, fillOpacity, and strokeWidth channels.

"},{"location":"grammar/mark/rect/#properties","title":"Properties","text":"clip

Type: boolean | \"never\"

If true, the mark is clipped to the UnitView's rectangle. By default, clipping is enabled for marks that have zoomable positional scales.

color

Type: string | ExprRef

Color of the mark. Affects either fill or stroke, depending on the filled property.

cornerRadius

Type: number | ExprRef

Radius of the rounded corners.

Default value: 0

cornerRadiusBottomLeft

Type: number | ExprRef

Radius of the bottom left rounded corner. Has higher precedence than cornerRadius.

Default value: (None)

cornerRadiusBottomRight

Type: number | ExprRef

Radius of the bottom right rounded corner. Has higher precedence than cornerRadius.

Default value: (None)

cornerRadiusTopLeft

Type: number | ExprRef

Radius of the top left rounded corner. Has higher precedence than cornerRadius.

Default value: (None)

cornerRadiusTopRight

Type: number | ExprRef

Radius of the top right rounded corner. Has higher precedence than cornerRadius.

Default value: (None)

fill

Type: string | ExprRef

The fill color.

fillOpacity

Type: number | ExprRef

The fill opacity. Value between 0 and 1.

filled

Type: boolean

Whether the color represents the fill color (true) or the stroke color (false).

minBufferSize

Type: number

Minimum size for WebGL buffers (number of data items). Allows for using bufferSubData() to update graphics.

This property is intended for internal use.

minHeight

Type: number | ExprRef

The minimum height of a rectangle in pixels. The property clamps rectangles' heights.

Default value: 0

minOpacity

Type: number | ExprRef

Clamps the minimum size-dependent opacity. The property does not affect the opacity channel. Valid values are between 0 and 1.

When a rectangle would be smaller than what is specified in minHeight and minWidth, it is faded out proportionally. Example: a rectangle would be rendered as one pixel wide, but minWidth clamps it to five pixels. The rectangle is actually rendered as five pixels wide, but its opacity is multiplied by 0.2. With this setting, you can limit the factor to, for example, 0.5 to keep the rectangles more clearly visible.

Default value: 0

minWidth

Type: number | ExprRef

The minimum width of a rectangle in pixels. The property clamps rectangles' widths when the viewport is zoomed out.

This property also reduces flickering of very narrow rectangles when zooming. The value should generally be at least one.

Default value: 1

opacity

Type: number | ExprRef

Opacity of the mark. Affects fillOpacity or strokeOpacity, depending on the filled property.

stroke

Type: string | ExprRef

The stroke color.

strokeOpacity

Type: number | ExprRef

The stroke opacity. Value between 0 and 1.

strokeWidth

Type: number | ExprRef

The stroke width in pixels.

tooltip

Type: HandledTooltip | null

Tooltip handler. If null, no tooltip is shown. If string, specifies the tooltip handler to use.

x

Type: number | ExprRef

Position on the x axis.

x2

Type: number | ExprRef

The secondary position on the x axis.

xOffset

Type: number

Offsets of the x and x2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0

y

Type: number | ExprRef

Position on the y axis.

y2

Type: number | ExprRef

The secondary position on the y axis.

yOffset

Type: number

Offsets of the y and y2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0
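
The size-dependent fading described for the minOpacity property can be sketched as below. The function is hypothetical and handles only the width; it assumes that the fading factor is the ratio of the unclamped width to the clamped width, as in the one-pixel example given for minOpacity.

```python
# Hypothetical sketch of width clamping and size-dependent opacity.


def clamp_width(width, min_width=1.0, min_opacity=0.0):
    # Returns the rendered width and the opacity multiplier.
    if width >= min_width:
        return width, 1.0
    factor = width / min_width
    return min_width, max(factor, min_opacity)


print(clamp_width(1.0, min_width=5.0))
print(clamp_width(1.0, min_width=5.0, min_opacity=0.5))
print(clamp_width(8.0, min_width=5.0))
```

The first call fades the clamped rectangle to an opacity factor of 0.2, the second limits the factor to 0.5, and the third leaves a sufficiently wide rectangle untouched.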

"},{"location":"grammar/mark/rect/#examples","title":"Examples","text":""},{"location":"grammar/mark/rect/#heatmap","title":"Heatmap","text":"

When used with \"band\" or \"index\" scales, the rectangles fill the whole bands when only the primary positional channel is defined.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 800, \"as\": \"z\" }\n  },\n  \"transform\": [\n    { \"type\": \"formula\", \"as\": \"y\", \"expr\": \"floor(datum.z / 40)\" },\n    { \"type\": \"formula\", \"as\": \"x\", \"expr\": \"datum.z % 40\" },\n    {\n      \"type\": \"formula\",\n      \"as\": \"z\",\n      \"expr\": \"sin(datum.x / 8) + cos(datum.y / 10 - 0.5 + sin(datum.x / 20) * 2)\"\n    }\n  ],\n  \"mark\": \"rect\",\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"index\" },\n    \"y\": { \"field\": \"y\", \"type\": \"index\" },\n    \"color\": {\n      \"field\": \"z\",\n      \"type\": \"quantitative\",\n      \"scale\": {\n        \"scheme\": \"magma\"\n      }\n    }\n  }\n}\n
"},{"location":"grammar/mark/rect/#bars","title":"Bars","text":"
{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 60, \"as\": \"x\" }\n  },\n  \"transform\": [\n    {\n      \"type\": \"formula\",\n      \"expr\": \"sin((datum.x - 30) / 4) + (datum.x - 30) / 30\",\n      \"as\": \"y\"\n    }\n  ],\n  \"mark\": \"rect\",\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"index\", \"scale\": { \"padding\": 0.1 } },\n    \"y\": { \"field\": \"y\", \"type\": \"quantitative\" },\n    \"y2\": { \"datum\": 0 },\n    \"color\": {\n      \"field\": \"y\",\n      \"type\": \"quantitative\",\n      \"scale\": {\n        \"type\": \"threshold\",\n        \"domain\": [0],\n        \"range\": [\"#ed553b\", \"#20639b\"]\n      }\n    }\n  }\n}\n
"},{"location":"grammar/mark/rule/","title":"Rule","text":"

Rule mark displays each data item as a line segment. Rules can span the whole width or height of the view. Alternatively, they may have specific endpoints.

{\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 15, \"as\": \"y\" }\n  },\n  \"mark\": {\n    \"type\": \"rule\",\n    \"strokeDash\": [6, 3]\n  },\n  \"encoding\": {\n    \"x\": { \"field\": \"y\", \"type\": \"quantitative\" },\n    \"color\": { \"field\": \"y\", \"type\": \"nominal\" }\n  }\n}\n
"},{"location":"grammar/mark/rule/#channels","title":"Channels","text":"

Rule mark supports the primary and secondary position channels and the color, opacity, and size channels.

"},{"location":"grammar/mark/rule/#properties","title":"Properties","text":"clip

Type: boolean | \"never\"

If true, the mark is clipped to the UnitView's rectangle. By default, clipping is enabled for marks that have zoomable positional scales.

color

Type: string | ExprRef

Color of the mark. Affects either fill or stroke, depending on the filled property.

minBufferSize

Type: number

Minimum size for WebGL buffers (number of data items). Allows for using bufferSubData() to update graphics.

This property is intended for internal use.

minLength

Type: number | ExprRef

The minimum length of the rule in pixels. Use this property to ensure that very short ranged rules remain visible even when the user zooms out.

Default value: 0

opacity

Type: number | ExprRef

Opacity of the mark. Affects fillOpacity or strokeOpacity, depending on the filled property.

size

Type: number | ExprRef

Stroke width of \"link\" and \"rule\" marks in pixels, the area of the bounding square of \"point\" mark, or the font size of \"text\" mark.

strokeCap

Type: \"butt\" | \"square\" | \"round\" | ExprRef

The style of stroke ends. Available choices: \"butt\", \"round\", and \"square\".

Default value: \"butt\"

strokeDash

Type: array

An array of alternating stroke and gap lengths, or null for solid strokes.

Default value: null

strokeDashOffset

Type: number

An offset for the stroke dash pattern.

Default value: 0

tooltip

Type: HandledTooltip | null

Tooltip handler. If null, no tooltip is shown. If string, specifies the tooltip handler to use.

x

Type: number | ExprRef

Position on the x axis.

x2

Type: number | ExprRef

The secondary position on the x axis.

xOffset

Type: number

Offsets of the x and x2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0

y

Type: number | ExprRef

Position on the y axis.

y2

Type: number | ExprRef

The secondary position on the y axis.

yOffset

Type: number

Offsets of the y and y2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0

"},{"location":"grammar/mark/rule/#examples","title":"Examples","text":""},{"location":"grammar/mark/rule/#ranged-rules","title":"Ranged rules","text":"
{\n  \"data\": {\n    \"values\": [\n      { \"y\": \"A\", \"x\": 2, \"x2\": 7 },\n      { \"y\": \"B\", \"x\": 0, \"x2\": 3 },\n      { \"y\": \"B\", \"x\": 5, \"x2\": 6 },\n      { \"y\": \"C\", \"x\": 4, \"x2\": 8 },\n      { \"y\": \"D\", \"x\": 1, \"x2\": 5 }\n    ]\n  },\n  \"mark\": {\n    \"type\": \"rule\",\n    \"size\": 10,\n    \"strokeCap\": \"round\"\n  },\n  \"encoding\": {\n    \"y\": { \"field\": \"y\", \"type\": \"nominal\" },\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n    \"x2\": { \"field\": \"x2\" }\n  }\n}\n
"},{"location":"grammar/mark/rule/#plenty-of-diagonal-rules","title":"Plenty of diagonal rules","text":"
{\n  \"width\": 300,\n  \"height\": 300,\n\n  \"data\": {\n    \"sequence\": { \"start\": 0, \"stop\": 50 }\n  },\n\n  \"transform\": [\n    {\n      \"type\": \"formula\",\n      \"expr\": \"random()\",\n      \"as\": \"x\"\n    },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"datum.x + random() * 0.5 - 0.25\",\n      \"as\": \"x2\"\n    },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"random()\",\n      \"as\": \"y\"\n    },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"datum.y + random() * 0.5 - 0.25\",\n      \"as\": \"y2\"\n    },\n    {\n      \"type\": \"formula\",\n      \"expr\": \"random()\",\n      \"as\": \"size\"\n    }\n  ],\n\n  \"mark\": {\n    \"type\": \"rule\",\n    \"strokeCap\": \"round\"\n  },\n\n  \"encoding\": {\n    \"x\": {\n      \"field\": \"x\",\n      \"type\": \"quantitative\"\n    },\n    \"x2\": { \"field\": \"x2\" },\n    \"y\": {\n      \"field\": \"y\",\n      \"type\": \"quantitative\"\n    },\n    \"y2\": { \"field\": \"y2\" },\n    \"size\": {\n      \"field\": \"size\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"type\": \"pow\", \"range\": [0, 10] }\n    },\n    \"color\": {\n      \"field\": \"x\",\n      \"type\": \"nominal\",\n      \"scale\": { \"scheme\": \"category20\" }\n    }\n  }\n}\n
"},{"location":"grammar/mark/text/","title":"Text","text":"

Text mark displays each data item as text.

{\n  \"data\": {\n    \"values\": [\n      { \"x\": 1, \"text\": \"Hello\" },\n      { \"x\": 2, \"text\": \"world!\" }\n    ]\n  },\n  \"mark\": \"text\",\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"ordinal\" },\n    \"color\": { \"field\": \"x\", \"type\": \"nominal\" },\n    \"text\": { \"field\": \"text\" },\n    \"size\": { \"value\": 100 }\n  }\n}\n
"},{"location":"grammar/mark/text/#channels","title":"Channels","text":"

In addition to the primary and secondary position channels and the color and opacity channels, text mark has the following channels: text, size, and angle.

"},{"location":"grammar/mark/text/#properties","title":"Properties","text":"align

Type: \"left\" | \"center\" | \"right\"

The horizontal alignment of the text. One of \"left\", \"center\", or \"right\".

Default value: \"left\"

angle

Type: number | ExprRef

The rotation angle in degrees.

Default value: 0

baseline

Type: \"top\" | \"middle\" | \"bottom\" | \"alphabetic\"

The vertical alignment of the text. One of \"top\", \"middle\", \"bottom\", or \"alphabetic\".

Default value: \"bottom\"

clip

Type: boolean | \"never\"

If true, the mark is clipped to the UnitView's rectangle. By default, clipping is enabled for marks that have zoomable positional scales.

color

Type: string | ExprRef

Color of the mark. Affects either fill or stroke, depending on the filled property.

dx

Type: number

The horizontal offset between the text and its anchor point, in pixels. Applied after the rotation by angle.

dy

Type: number

The vertical offset between the text and its anchor point, in pixels. Applied after the rotation by angle.

fitToBand

Type: boolean | ExprRef

If true, sets the secondary positional channel that allows the text to be squeezed (see the squeeze property). Can be used when: 1) \"band\", \"index\", or \"locus\" scale is being used and 2) only the primary positional channel is specified.

Default value: false

flushX

Type: boolean | ExprRef

If true, the text is kept inside the viewport when the range of x and x2 intersect the viewport.

flushY

Type: boolean | ExprRef

If true, the text is kept inside the viewport when the range of y and y2 intersect the viewport.

font

Type: string

The font typeface. GenomeSpy uses SDF versions of Google Fonts. Check their availability at the A-Frame Fonts repository. System fonts are not supported.

Default value: \"Lato\"

fontStyle

Type: \"normal\" | \"italic\"

The font style. Valid values: \"normal\" and \"italic\".

Default value: \"normal\"

fontWeight

Type: number | \"thin\" | \"light\" | \"regular\" | \"normal\" | \"medium\" | \"bold\" | \"black\"

The font weight. The following strings and numbers are valid values: \"thin\" (100), \"light\" (300), \"regular\" (400), \"normal\" (400), \"medium\" (500), \"bold\" (700), \"black\" (900)

Default value: \"regular\"

logoLetters

Type: boolean | ExprRef

Stretches letters so that they can be used in sequence logos and similar visualizations.

minBufferSize

Type: number

Minimum size for WebGL buffers (number of data items). Allows for using bufferSubData() to update graphics.

This property is intended for internal use.

opacity

Type: number | ExprRef

Opacity of the mark. Affects fillOpacity or strokeOpacity, depending on the filled property.

paddingX

Type: number | ExprRef

The horizontal padding, in pixels, when the x2 channel is used for ranged text.

Default value: 0

paddingY

Type: number | ExprRef

The vertical padding, in pixels, when the y2 channel is used for ranged text.

Default value: 0

size

Type: number | ExprRef

The font size in pixels.

Default value: 11

squeeze

Type: boolean | ExprRef

If the squeeze property is true and secondary positional channels (x2 and/or y2) are used, the text is scaled to fit the mark's width and/or height.

Default value: true

text

Type: Scalar | ExprRef

The text to display. The format of numeric data can be customized by setting a format specifier to channel definition's format property.

Default value: \"\"

tooltip

Type: HandledTooltip | null

Tooltip handler. If null, no tooltip is shown. If string, specifies the tooltip handler to use.

viewportEdgeFadeDistanceBottom

Type: number

TODO

viewportEdgeFadeDistanceLeft

Type: number

TODO

viewportEdgeFadeDistanceRight

Type: number

TODO

viewportEdgeFadeDistanceTop

Type: number

TODO

viewportEdgeFadeWidthBottom

Type: number

TODO

viewportEdgeFadeWidthLeft

Type: number

TODO

viewportEdgeFadeWidthRight

Type: number

TODO

viewportEdgeFadeWidthTop

Type: number

TODO

x

Type: number | ExprRef

Position on the x axis.

x2

Type: number | ExprRef

The secondary position on the x axis.

xOffset

Type: number

Offsets of the x and x2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0

y

Type: number | ExprRef

Position on the y axis.

y2

Type: number | ExprRef

The secondary position on the y axis.

yOffset

Type: number

Offsets of the y and y2 coordinates in pixels. The offset is applied after the viewport scaling and translation.

Default value: 0

"},{"location":"grammar/mark/text/#examples","title":"Examples","text":"

GenomeSpy's text mark provides several tricks useful with segmented data and zoomable visualizations.

"},{"location":"grammar/mark/text/#ranged-text","title":"Ranged text","text":"

The x2 and y2 channels allow for positioning the text inside a segment. The text is either squeezed (default) or hidden when it does not fit in the segment. The squeeze property controls the behavior.

The example below has two layers: gray rectangles at the bottom and ranged text on the top. Try to zoom and pan to see how they behave!

{\n  \"data\": {\n    \"values\": [\"A\", \"B\", \"C\", \"D\", \"E\", \"F\", \"G\"]\n  },\n  \"transform\": [\n    { \"type\": \"formula\", \"expr\": \"round(random() * 100)\", \"as\": \"a\" },\n    { \"type\": \"formula\", \"expr\": \"datum.a + round(random() * 60)\", \"as\": \"b\" }\n  ],\n  \"encoding\": {\n    \"x\": { \"field\": \"a\", \"type\": \"quantitative\", \"scale\": { \"zoom\": true } },\n    \"x2\": { \"field\": \"b\" },\n    \"y\": {\n      \"field\": \"data\",\n      \"type\": \"nominal\",\n      \"scale\": {\n        \"padding\": 0.3\n      }\n    }\n  },\n  \"layer\": [\n    {\n      \"mark\": \"rect\",\n      \"encoding\": { \"color\": { \"value\": \"#eaeaea\" } }\n    },\n    {\n      \"mark\": {\n        \"type\": \"text\",\n        \"align\": \"center\",\n        \"baseline\": \"middle\",\n        \"paddingX\": 5\n      },\n      \"encoding\": {\n        \"text\": {\n          \"expr\": \"'Hello ' + floor(datum.a)\"\n        },\n        \"size\": { \"value\": 12 }\n      }\n    }\n  ]\n}\n
"},{"location":"grammar/mark/text/#sequence-logo","title":"Sequence logo","text":"

The example below demonstrates the use of the logoLetters, squeeze, and fitToBand properties to ensure that the letters fully cover the rectangles defined by the primary and secondary positional channels. Not all fonts look good in sequence logos, but Source Sans Pro seems decent.

{\n  \"data\": {\n    \"values\": [\n      { \"pos\": 1, \"base\": \"A\", \"count\": 2 },\n      { \"pos\": 1, \"base\": \"C\", \"count\": 3 },\n      { \"pos\": 1, \"base\": \"T\", \"count\": 5 },\n      { \"pos\": 2, \"base\": \"A\", \"count\": 7 },\n      { \"pos\": 2, \"base\": \"C\", \"count\": 3 },\n      { \"pos\": 3, \"base\": \"A\", \"count\": 10 },\n      { \"pos\": 4, \"base\": \"T\", \"count\": 9 },\n      { \"pos\": 4, \"base\": \"G\", \"count\": 1 },\n      { \"pos\": 5, \"base\": \"G\", \"count\": 8 },\n      { \"pos\": 6, \"base\": \"G\", \"count\": 7 }\n    ]\n  },\n  \"transform\": [\n    {\n      \"type\": \"stack\",\n      \"field\": \"count\",\n      \"groupby\": [\"pos\"],\n      \"offset\": \"information\",\n      \"as\": [\"_y0\", \"_y1\"],\n      \"baseField\": \"base\",\n      \"sort\": { \"field\": \"count\", \"order\": \"ascending\" }\n    }\n  ],\n  \"encoding\": {\n    \"x\": { \"field\": \"pos\", \"type\": \"index\" },\n    \"y\": {\n      \"field\": \"_y0\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"domain\": [0, 2] },\n      \"title\": \"Information\"\n    },\n    \"y2\": { \"field\": \"_y1\" },\n    \"text\": { \"field\": \"base\", \"type\": \"nominal\" },\n    \"color\": {\n      \"field\": \"base\",\n      \"type\": \"nominal\",\n      \"scale\": {\n        \"type\": \"ordinal\",\n        \"domain\": [\"A\", \"C\", \"T\", \"G\"],\n        \"range\": [\"#7BD56C\", \"#FF9B9B\", \"#86BBF1\", \"#FFC56C\"]\n      }\n    }\n  },\n  \"mark\": {\n    \"type\": \"text\",\n    \"font\": \"Source Sans Pro\",\n    \"fontWeight\": 700,\n    \"size\": 100,\n    \"squeeze\": true,\n    \"fitToBand\": true,\n\n    \"paddingX\": 0,\n    \"paddingY\": 0,\n\n    \"logoLetters\": true\n  }\n}\n
"},{"location":"grammar/transform/","title":"Data transformation","text":"

With transforms, you can build a pipeline that modifies the data before the data objects are mapped to mark instances. In an abstract sense, a transformation inputs a list of data items and outputs a list of new items that may be filtered, modified, or generated from the original items.

The data flow is a forest of data sources and subsequent transformations, which may form trees. For instance, a layer view might have a data source, which is then filtered and mutated in a different way for each child layer.

Departure from Vega-Lite

The notation of transforms differs from Vega-Lite to enable more straightforward addition of new operations. Each transform has to be specified using an explicit type property, as in the lower-level Vega visualization grammar. Thus, the transform type is not inferred from the presence of transform-specific properties.

"},{"location":"grammar/transform/#example","title":"Example","text":"

The following example uses the \"filter\" transform to retain only the rows that match the predicate expression.

{\n  ...,\n  \"data\": { ... },\n  \"transform\": [\n    {\n      \"type\": \"filter\",\n      \"expr\": \"datum.end - datum.start < 5000\"\n    }\n  ],\n  ...\n}\n
"},{"location":"grammar/transform/#debugging-the-data-flow","title":"Debugging the Data Flow","text":"

Complex visualizations may involve multiple data sources and transformations, which can make it difficult to understand the data flow, particularly when no data objects appear to pass through the flow. The Dataflow Inspector shows the structure of the data flow and allows you to inspect the parameters of each node, the number of propagated data objects, and a recorded copy of the first data object that passes through the node. The Inspector is currently available in the toolbar of the GenomeSpy App.

"},{"location":"grammar/transform/aggregate/","title":"Aggregate","text":"

The \"aggregate\" transform summarizes data fields using aggregate functions, such as \"sum\" or \"max\". The data can be grouped by one or more fields, which results in a list of objects with the grouped fields and the aggregate values.

"},{"location":"grammar/transform/aggregate/#parameters","title":"Parameters","text":"as

Type: array

The names for the output fields corresponding to each aggregated field. If not provided, names will be automatically created using the operation and field names (e.g., sum_field, average_field).

fields

Type: array

The data fields to apply aggregate functions to. This array should correspond with the ops and as arrays. If no fields or operations are specified, a count aggregation will be applied by default.

groupby

Type: array

The fields by which to group the data. If these are not defined, all data objects will be grouped into a single category.

ops

Type: array

The aggregation operations to be performed on the fields, such as \"sum\", \"average\", or \"count\".

"},{"location":"grammar/transform/aggregate/#available-aggregate-functions","title":"Available aggregate functions","text":"

Aggregate functions are applied to the data fields in each group.

"},{"location":"grammar/transform/aggregate/#example","title":"Example","text":"

Given the following data:

x       y
first   123
first   456
second  789

... and configuration:

{\n  \"type\": \"aggregate\",\n  \"groupby\": [\"x\"]\n}\n

A new list of data objects is created:

x       count
first   2
second  1"},{"location":"grammar/transform/aggregate/#calculating-min-and-max","title":"Calculating min and max","text":"
{\n  \"data\": {\n    \"values\": [\n      { \"Category\": \"A\", \"Value\": 5 },\n      { \"Category\": \"A\", \"Value\": 9 },\n      { \"Category\": \"A\", \"Value\": 9.5 },\n      { \"Category\": \"B\", \"Value\": 3 },\n      { \"Category\": \"B\", \"Value\": 5 },\n      { \"Category\": \"B\", \"Value\": 7.5 },\n      { \"Category\": \"B\", \"Value\": 8 }\n    ]\n  },\n\n  \"encoding\": {\n    \"y\": { \"field\": \"Category\", \"type\": \"nominal\" }\n  },\n\n  \"layer\": [\n    {\n      \"encoding\": {\n        \"x\": { \"field\": \"Value\", \"type\": \"quantitative\" }\n      },\n      \"mark\": \"point\"\n    },\n    {\n      \"transform\": [\n        {\n          \"type\": \"aggregate\",\n          \"groupby\": [\"Category\"],\n          \"fields\": [\"Value\", \"Value\"],\n          \"ops\": [\"min\", \"max\"],\n          \"as\": [\"minValue\", \"maxValue\"]\n        }\n      ],\n      \"encoding\": {\n        \"x\": { \"field\": \"minValue\", \"type\": \"quantitative\" },\n        \"x2\": { \"field\": \"maxValue\" }\n      },\n      \"mark\": \"rule\"\n    }\n  ]\n}\n
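
For readers who want to verify the expected result, the sketch below reproduces the grouping and the min and max operations of the specification above in plain Python. It is only an illustration of the semantics; aggregate_min_max is a hypothetical helper, not a GenomeSpy function.

```python
from collections import defaultdict

rows = [
    {'Category': 'A', 'Value': 5},
    {'Category': 'A', 'Value': 9},
    {'Category': 'A', 'Value': 9.5},
    {'Category': 'B', 'Value': 3},
    {'Category': 'B', 'Value': 5},
    {'Category': 'B', 'Value': 7.5},
    {'Category': 'B', 'Value': 8},
]


def aggregate_min_max(rows, groupby, field):
    # Collect the field values of each group, keeping first-seen group order.
    groups = defaultdict(list)
    for row in rows:
        groups[row[groupby]].append(row[field])
    # Emit one object per group with the grouped field and the aggregates.
    return [
        {groupby: key, 'minValue': min(values), 'maxValue': max(values)}
        for key, values in groups.items()
    ]


result = aggregate_min_max(rows, 'Category', 'Value')
print(result)
```

Each output object carries the grouping field and the two aggregates, which is what the rule layer encodes on the x and x2 channels.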
"},{"location":"grammar/transform/collect/","title":"Collect","text":"

The \"collect\" transform collects (buffers) the data items from the data flow into an internal array and optionally sorts them.

"},{"location":"grammar/transform/collect/#parameters","title":"Parameters","text":"groupby

Type: array

Arranges the data into consecutive batches based on the groups. This is mainly intended for internal use so that faceted data can be handled as batches.

sort

Type: CompareParams

The sort order.

"},{"location":"grammar/transform/collect/#example","title":"Example","text":"
{\n  \"type\": \"collect\",\n  \"sort\": {\n    \"field\": [\"score\"],\n    \"order\": [\"descending\"]\n  }\n}\n
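
In plain Python, the effect of the configuration above corresponds roughly to buffering the rows and sorting them by score in descending order. The collect function below is a hypothetical stand-in used only for illustration.

```python
def collect(rows, field, descending=False):
    # Buffer all rows and return them sorted by the given field.
    return sorted(rows, key=lambda row: row[field], reverse=descending)


rows = [{'score': 2}, {'score': 9}, {'score': 5}]
ordered = collect(rows, 'score', descending=True)
print(ordered)
```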
"},{"location":"grammar/transform/coverage/","title":"Coverage","text":"

The \"coverage\" transform computes coverage for overlapping segments. The result is a new list of non-overlapping segments with the coverage values. The segments must be sorted by their start coordinates before passing them to the coverage transform.
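
One common way to compute such coverage is a sweep over the segment boundaries, sketched below. This is an illustrative implementation under simple assumptions (unweighted segments, half-open intervals, a single chromosome), not necessarily the algorithm that GenomeSpy uses, and it does not merge adjacent segments that happen to have equal coverage.

```python
def coverage(segments):
    # Net change in coverage at each boundary position.
    events = {}
    for start, end in segments:
        events[start] = events.get(start, 0) + 1
        events[end] = events.get(end, 0) - 1

    result = []
    depth = 0
    prev = None
    for pos in sorted(events):
        # Emit the stretch since the previous boundary if it is covered.
        if prev is not None and depth > 0:
            result.append((prev, pos, depth))
        depth += events[pos]
        prev = pos
    return result


print(coverage([(0, 4), (1, 3)]))
```

For the two overlapping segments 0 to 4 and 1 to 3, the sketch yields three non-overlapping segments with coverages 1, 2, and 1.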

"},{"location":"grammar/transform/coverage/#parameters","title":"Parameters","text":"as

Type: string

The output field for the computed coverage.

asChrom

Type: string

The output field for the chromosome.

Default: Same as chrom

asEnd

Type: string

The output field for the end coordinate.

Default: Same as end

asStart

Type: string

The output field for the start coordinate.

Default: Same as start

chrom

Type: string (field name)

An optional chromosome field that is passed through. TODO: groupby

end Required

Type: string (field name)

The field representing the end coordinate of the segment (exclusive).

start Required

Type: string (field name)

The field representing the start coordinate of the segment (inclusive).

weight

Type: string (field name)

A field representing an optional weight for the segment. Can be used with copy ratios, for example.

"},{"location":"grammar/transform/coverage/#example","title":"Example","text":"

Given the following data:

start  end
0      4
1      3

... and configuration:

{\n  \"type\": \"coverage\",\n  \"start\": \"start\",\n  \"end\": \"end\"\n}\n

A new list of segments is produced:

start  end  coverage
0      1    1
1      3    2
3      4    1"},{"location":"grammar/transform/coverage/#interactive-example","title":"Interactive example","text":"

The following example demonstrates both \"coverage\" and \"pileup\" transforms.

{\n  \"data\": {\n    \"sequence\": {\n      \"start\": 1,\n      \"stop\": 100,\n      \"as\": \"start\"\n    }\n  },\n  \"transform\": [\n    {\n      \"type\": \"formula\",\n      \"expr\": \"datum.start + ceil(random() * 20)\",\n      \"as\": \"end\"\n    }\n  ],\n  \"resolve\": { \"scale\": { \"x\": \"shared\" } },\n  \"vconcat\": [\n    {\n      \"transform\": [\n        {\n          \"type\": \"coverage\",\n          \"start\": \"start\",\n          \"end\": \"end\",\n          \"as\": \"coverage\"\n        }\n      ],\n      \"mark\": \"rect\",\n      \"encoding\": {\n        \"x\": { \"field\": \"start\", \"type\": \"index\" },\n        \"x2\": { \"field\": \"end\" },\n        \"y\": { \"field\": \"coverage\", \"type\": \"quantitative\" }\n      }\n    },\n    {\n      \"transform\": [\n        {\n          \"type\": \"pileup\",\n          \"start\": \"start\",\n          \"end\": \"end\",\n          \"as\": \"lane\"\n        }\n      ],\n      \"mark\": \"rect\",\n      \"encoding\": {\n        \"x\": { \"field\": \"start\", \"type\": \"index\" },\n        \"x2\": { \"field\": \"end\" },\n        \"y\": {\n          \"field\": \"lane\",\n          \"type\": \"index\",\n          \"scale\": {\n            \"padding\": 0.2,\n            \"reverse\": true,\n            \"zoom\": false\n          }\n        }\n      }\n    }\n  ]\n}\n
"},{"location":"grammar/transform/filter-scored-labels/","title":"Filter Scored Lables","text":"

The \"filterScoredLabels\" transform fits prioritized labels into the available space and dynamically reflows the data when the scale domain is adjusted (i.e., zoomed).

For a usage example, check the Annotation Tracks notebook.

"},{"location":"grammar/transform/filter-scored-labels/#parameters","title":"Parameters","text":"channel

Type: string

Default: \"x\"

lane

Type: string (field name)

An optional field representing element's lane, e.g., if transcripts are shown using a piled up layout.

padding

Type: number

Padding (in pixels) around the element.

Default: 0

pos Required

Type: string (field name)

The field representing element's position on the domain.

score Required

Type: string (field name)

The field representing the score used for prioritization.

width Required

Type: string (field name)

The field representing the element's width in pixels.

"},{"location":"grammar/transform/filter/","title":"Filter","text":"

The \"filter\" transform removes data objects based on a predicate expression.

"},{"location":"grammar/transform/filter/#parameters","title":"Parameters","text":"expr Required

Type: string

An expression string. The data object is removed if the expression evaluates to false.

"},{"location":"grammar/transform/filter/#example","title":"Example","text":"
{\n  \"type\": \"filter\",\n  \"expr\": \"datum.p <= 0.05\"\n}\n

The example above retains all rows for which the field p is less than or equal to 0.05.
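
The same predicate can be written in plain Python to see which rows survive. The rows and the filter_rows helper are made up for illustration; GenomeSpy evaluates the expression string itself.

```python
rows = [{'p': 0.01}, {'p': 0.05}, {'p': 0.2}]


def filter_rows(rows, predicate):
    # Keep only the rows for which the predicate is true.
    return [row for row in rows if predicate(row)]


kept = filter_rows(rows, lambda datum: datum['p'] <= 0.05)
print(kept)
```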

"},{"location":"grammar/transform/flatten-compressed-exons/","title":"Flatten Compressed Exons","text":"

The \"flattenCompressedExons\" transform flattens \"delta encoded\" exons. The transform inputs the start coordinate of the gene body and a comma-delimited string of alternating intron and exon lengths. A new data object is created for each exon.

This transform is mainly intended to be used with an optimized gene annotation track. Read more in the Annotation Tracks notebook.

"},{"location":"grammar/transform/flatten-compressed-exons/#parameters","title":"Parameters","text":"as

Type: array

Field names for the flattened exons.

Default: [\"exonStart\", \"exonEnd\"]

exons

Type: string (field name)

The field containing the exons.

Default: \"exons\"

start

Type: string (field name)

Start coordinate of the gene body.

Default: \"start\"

"},{"location":"grammar/transform/flatten-delimited/","title":"Flatten Delimited","text":"

The \"flattenDelimited\" transform flattens (or normalizes) a field or a set of fields that contain delimited values. In other words, each delimited value is written into a new data object that contains a single value from the delimited field. All other fields are copied as such.
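
The behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not GenomeSpy's implementation: the function name flatten_delimited and the representation of data objects as dicts are assumptions made for the example, and the split values are left as strings.

```python
def flatten_delimited(rows, fields, separators):
    '''Split each listed field by its separator and emit one row per value.'''
    out = []
    for row in rows:
        # Split every delimited field of this row into its parts.
        parts = [str(row[f]).split(sep) for f, sep in zip(fields, separators)]
        # Assumption: all delimited fields of a row hold the same number of values.
        for values in zip(*parts):
            new_row = dict(row)                  # all other fields are copied as such
            new_row.update(zip(fields, values))  # overwrite with the single values
            out.append(new_row)
    return out


rows = [
    {'patient': 'A', 'tissue': 'Ova,Asc', 'value': '4,2'},
    {'patient': 'B', 'tissue': 'Adn,Asc,Ute', 'value': '6,3,4'},
]
flat = flatten_delimited(rows, ['tissue', 'value'], [',', ','])
```

Here flat contains five data objects, one for each tissue and value pair.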

"},{"location":"grammar/transform/flatten-delimited/#parameters","title":"Parameters","text":"as

Type: string[] | string

The output field name(s) for the flattened field.

Default: the input fields.

field Required

Type: string (field name)[] | string (field name)

The field(s) to split and flatten

separator Required

Type: string[] | string

Separator(s) used on the field(s). TODO: Rename to delimiter.

"},{"location":"grammar/transform/flatten-delimited/#example","title":"Example","text":"

Given the following data:

| patient | tissue | value |
| ------- | ------ | ----- |
| A | Ova,Asc | 4,2 |
| B | Adn,Asc,Ute | 6,3,4 |

... and configuration:

{\n  \"type\": \"flattenDelimited\",\n  \"field\": [\"tissue\", \"value\"],\n  \"separator\": [\",\", \",\"]\n}\n

TODO: Rename separator to delimiter

Flattened data is produced:

| patient | tissue | value |
| ------- | ------ | ----- |
| A | Ova | 4 |
| A | Asc | 2 |
| B | Adn | 6 |
| B | Asc | 3 |
| B | Ute | 4 |
"},{"location":"grammar/transform/flatten-sequence/","title":"Flatten Sequence","text":"

The \"flattenSequence\" transform flattens strings such as FASTA sequences into data objects with position and character fields.

"},{"location":"grammar/transform/flatten-sequence/#parameters","title":"Parameters","text":"as

Type: array

Names of the fields to which the zero-based index number and the flattened sequence letter are written.

Default: [\"pos\", \"sequence\"]

field

Type: string (field name)

The field to flatten.

Default: \"sequence\"

"},{"location":"grammar/transform/flatten-sequence/#example","title":"Example","text":"

Given the following data:

| identifier | sequence |
| ---------- | -------- |
| X | AC |
| Y | ACTG |

... and parameters:

{\n  \"type\": \"flattenSequence\",\n  \"field\": \"sequence\",\n  \"as\": [\"base\", \"pos\"]\n}\n

The sequences are flattened into:

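| identifier | sequence | base | pos |
| ---------- | -------- | ---- | --- |
| X | AC | A | 0 |
| X | AC | C | 1 |
| Y | ACTG | A | 0 |
| Y | ACTG | C | 1 |
| Y | ACTG | T | 2 |
| Y | ACTG | G | 3 |
"},{"location":"grammar/transform/flatten/","title":"Flatten","text":"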

The \"flatten\" transform converts fields that hold arrays into distinct, individual data objects. This creates a new sequence of data, where each element encompasses both an extracted array component and all the original fields from the corresponding input object.

"},{"location":"grammar/transform/flatten/#parameters","title":"Parameters","text":"as

Type: string[] | string

The output field name(s) for the flattened field.

Default: the input fields.

fields

Type: string (field name)[] | string (field name)

The field(s) to flatten. If no field is defined, the data object itself is treated as an array to be flattened.

index

Type: string

The output field name for the zero-based index of the array values. If unspecified, an index field is not added.

"},{"location":"grammar/transform/flatten/#example","title":"Example","text":""},{"location":"grammar/transform/flatten/#single-field-flattening","title":"Single-Field Flattening","text":"

This example flattens the array-valued field named foo. Note that all fields except foo are repeated in every output datum.

{ \"type\": \"flatten\", \"fields\": [\"foo\"] }\n

Input data:

[\n  { \"name\": \"alpha\", \"data\": 123, \"foo\": [1, 2] },\n  { \"name\": \"beta\", \"data\": 456, \"foo\": [3, 4, 5] }\n]\n

Result:

[\n  { \"name\": \"alpha\", \"data\": 123, \"foo\": 1 },\n  { \"name\": \"alpha\", \"data\": 123, \"foo\": 2 },\n  { \"name\": \"beta\", \"data\": 456, \"foo\": 3 },\n  { \"name\": \"beta\", \"data\": 456, \"foo\": 4 },\n  { \"name\": \"beta\", \"data\": 456, \"foo\": 5 }\n]\n
"},{"location":"grammar/transform/flatten/#adding-an-index-field","title":"Adding an Index Field","text":"
{ \"type\": \"flatten\", \"fields\": [\"foo\"], \"index\": \"idx\" }\n

This example adds a field containing the array index that each item originated from.

[\n  { \"name\": \"alpha\", \"data\": 123, \"foo\": [1, 2] },\n  { \"name\": \"beta\", \"data\": 456, \"foo\": [3, 4, 5] }\n]\n

Result:

[\n  { \"name\": \"alpha\", \"data\": 123, \"foo\": 1, \"idx\": 0 },\n  { \"name\": \"alpha\", \"data\": 123, \"foo\": 2, \"idx\": 1 },\n  { \"name\": \"beta\", \"data\": 456, \"foo\": 3, \"idx\": 0 },\n  { \"name\": \"beta\", \"data\": 456, \"foo\": 4, \"idx\": 1 },\n  { \"name\": \"beta\", \"data\": 456, \"foo\": 5, \"idx\": 2 }\n]\n
"},{"location":"grammar/transform/flatten/#multi-field-flattening","title":"Multi-Field Flattening","text":"
{ \"type\": \"flatten\", \"fields\": [\"foo\", \"bar\"] }\n

This example simultaneously flattens the array-valued fields foo and bar. Given the input data

[\n  { \"key\": \"alpha\", \"foo\": [1, 2], \"bar\": [\"A\", \"B\"] },\n  { \"key\": \"beta\", \"foo\": [3, 4, 5], \"bar\": [\"C\", \"D\"] }\n]\n

this example produces the output:

[\n  { \"key\": \"alpha\", \"foo\": 1, \"bar\": \"A\" },\n  { \"key\": \"alpha\", \"foo\": 2, \"bar\": \"B\" },\n  { \"key\": \"beta\", \"foo\": 3, \"bar\": \"C\" },\n  { \"key\": \"beta\", \"foo\": 4, \"bar\": \"D\" },\n  { \"key\": \"beta\", \"foo\": 5, \"bar\": null }\n]\n
"},{"location":"grammar/transform/flatten/#flattening-array-objects","title":"Flattening Array Objects","text":"
{ \"type\": \"flatten\" }\n

This example treats the data objects as arrays that should be flattened. Given the input data

[[{ \"foo\": 1 }], [{ \"foo\": 2 }, { \"foo\": 3 }]]\n

this example produces the output:

[{ \"foo\": 1 }, { \"foo\": 2 }, { \"foo\": 3 }]\n
"},{"location":"grammar/transform/formula/","title":"Formula","text":"

The \"formula\" transform uses an expression to calculate and add a new field to the data objects.

"},{"location":"grammar/transform/formula/#parameters","title":"Parameters","text":"as Required

Type: string

The (new) field where the computed value is written to

expr Required

Type: string

An expression string

"},{"location":"grammar/transform/formula/#example","title":"Example","text":"

Given the following data:

| x | y |
| - | - |
| 1 | 2 |
| 3 | 4 |

... and configuration:

{\n  \"type\": \"formula\",\n  \"expr\": \"datum.x + datum.y\",\n  \"as\": \"z\"\n}\n

A new field is added:

| x | y | z |
| - | - | - |
| 1 | 2 | 3 |
| 3 | 4 | 7 |
"},{"location":"grammar/transform/formula/#using-with-parameters","title":"Using with Parameters","text":"

As expressions have access to parameters, they can be used to create dynamic visualizations. The following example uses a formula to calculate the sum of two sine waves with different wavelengths. The wavelengths are controlled by the a and b parameters.

Under the hood, when any of the parameters change, the formula transform finds the closest collector or data source in the data pipeline and triggers a re-propagation of the data, resulting in a re-evaluation of the formula expression.

{\n  \"params\": [\n    {\n      \"name\": \"a\",\n      \"value\": 200,\n      \"bind\": { \"input\": \"range\", \"min\": 10, \"max\": 2000, \"step\": 1 }\n    },\n    {\n      \"name\": \"b\",\n      \"value\": 270,\n      \"bind\": { \"input\": \"range\", \"min\": 10, \"max\": 2000, \"step\": 1 }\n    }\n  ],\n\n  \"data\": { \"sequence\": { \"start\": 0, \"stop\": 1000, \"as\": \"x\" } },\n\n  \"transform\": [\n    {\n      \"type\": \"formula\",\n      \"expr\": \"sin(datum.x * 2 * PI / a) + sin(datum.x * 2 * PI / b)\",\n      \"as\": \"y\"\n    }\n  ],\n\n  \"mark\": \"point\",\n\n  \"encoding\": {\n    \"size\": { \"value\": 4 },\n    \"x\": { \"field\": \"x\", \"type\": \"quantitative\" },\n    \"y\": { \"field\": \"y\", \"type\": \"quantitative\" }\n  }\n}\n
"},{"location":"grammar/transform/linearize-genomic-coordinate/","title":"Linearize Genomic Coordinate","text":"

The linearizeGenomicCoordinate transform maps the (chromosome, position) pairs into a linear coordinate space using the chromosome sizes of the current genome assembly.

"},{"location":"grammar/transform/linearize-genomic-coordinate/#parameters","title":"Parameters","text":"as Required

Type: string | string[]

The output field or fields for linearized coordinates.

channel

Type: string

Get the genome assembly from the scale of the channel.

Default: \"x\"

chrom Required

Type: string (field name)

The chromosome/contig field

offset

Type: number | number[]

An offset or offsets that allow for adjusting the numbering base. The offset is subtracted from the positions.

GenomeSpy internally uses zero-based indexing with half-open intervals. UCSC-based formats (BED, etc.) generally use this scheme. However, VCF files, for example, use one-based indexing and must be adjusted by setting the offset to 1.

Default: 0

pos Required

Type: string (field name) | string (field name)[]

The field or fields that contain intra-chromosomal positions
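
As a rough sketch of what linearization and the offset parameter mean, the following Python snippet concatenates chromosomes end to end. It is an illustration only: the toy assembly, its chromosome sizes, and the function name linearize are hypothetical and do not correspond to any real genome assembly or to GenomeSpy's code.

```python
from itertools import accumulate

# Hypothetical toy assembly: (name, size) pairs, NOT real chromosome sizes.
TOY_ASSEMBLY = [('chr1', 1000), ('chr2', 800), ('chr3', 500)]

# Cumulative start of every chromosome in the linear coordinate space.
names = [name for name, _ in TOY_ASSEMBLY]
sizes = [size for _, size in TOY_ASSEMBLY]
CHROM_STARTS = dict(zip(names, [0] + list(accumulate(sizes))[:-1]))


def linearize(chrom, pos, offset=0):
    '''Map (chrom, pos) to one linear coordinate; offset is subtracted from pos.'''
    return CHROM_STARTS[chrom] + pos - offset


zero_based = linearize('chr2', 10)            # BED-like, zero-based input
one_based = linearize('chr2', 11, offset=1)   # VCF-like, one-based input, same locus
```

With the offset applied, the one-based position 11 and the zero-based position 10 map to the same linear coordinate.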

"},{"location":"grammar/transform/linearize-genomic-coordinate/#example","title":"Example","text":"
{\n  \"type\": \"linearizeGenomicCoordinate\",\n  \"chrom\": \"chrom\",\n  \"pos\": \"start\",\n  \"as\": \"_start\"\n}\n
"},{"location":"grammar/transform/measure-text/","title":"Measure Text","text":"

The \"measureText\" transform measures the length of a string in pixels. The measurement can be used in downstream layout computations.

For a usage example, check the Annotation Tracks notebook.

"},{"location":"grammar/transform/measure-text/#parameters","title":"Parameters","text":"as Required

Type: string

TODO

field Required

Type: string (field name)

TODO

fontSize Required

Type: number

TODO

"},{"location":"grammar/transform/measure-text/#example","title":"Example","text":"
{\n  \"type\": \"measureText\",\n  \"fontSize\": 11,\n  \"field\": \"symbol\",\n  \"as\": \"_textWidth\"\n}\n
"},{"location":"grammar/transform/pileup/","title":"Pileup","text":"

The \"pileup\" transform computes a piled up layout for overlapping segments. The computed lane can be used to position the segments in a visualization. The segments must be sorted by their start coordinates before passing them to the pileup transform.
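
One way to compute such a layout is a greedy first-fit scan over the sorted segments, sketched below in Python. This is a simplified illustration, not GenomeSpy's implementation: the function name pileup is made up, lane preferences are ignored, and the rule for when a lane counts as free is an assumption.

```python
def pileup(segments, spacing=1):
    '''Assign a lane to each (start, end) segment; input must be sorted by start.'''
    lane_ends = []   # end coordinate of the last segment placed on each lane
    lanes = []
    for start, end in segments:
        for lane, lane_end in enumerate(lane_ends):
            # Assumption: a lane is free when its last segment ended at least
            # `spacing` units before this segment starts.
            if lane_end + spacing <= start:
                lane_ends[lane] = end
                lanes.append(lane)
                break
        else:
            # No free lane was found, so open a new one.
            lane_ends.append(end)
            lanes.append(len(lane_ends) - 1)
    return lanes


lanes = pileup([(0, 4), (1, 3), (2, 6), (4, 8)])
```

For these four segments the sketch assigns lanes 0, 1, 2, and 1: the last segment reuses lane 1 because the segment on that lane ended at 3, which leaves the required spacing before coordinate 4.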

"},{"location":"grammar/transform/pileup/#parameters","title":"Parameters","text":"as

Type: string

The output field name for the computed lane.

Default: \"lane\".

end Required

Type: string (field name)

The field representing the end coordinate of the segment (exclusive).

preference

Type: string (field name)

An optional field indicating the preferred lane. Use together with the preferredOrder property.

preferredOrder

Type: string[] | number[] | boolean[]

The order of the lane preferences. The first element contains the value that should place the segment on the first lane and so forth. If the preferred lane is occupied, the first available lane is taken.

spacing

Type: number

The spacing between adjacent segments on the same lane in coordinate units.

Default: 1.

start Required

Type: string (field name)

The field representing the start coordinate of the segment (inclusive).

"},{"location":"grammar/transform/pileup/#example","title":"Example","text":"

Given the following data:

| start | end |
| ----- | --- |
| 0 | 4 |
| 1 | 3 |
| 2 | 6 |
| 4 | 8 |

... and configuration:

{\n  \"type\": \"pileup\",\n  \"start\": \"start\",\n  \"end\": \"end\",\n  \"as\": \"lane\"\n}\n

A new field is added:

| start | end | lane |
| ----- | --- | ---- |
| 0 | 4 | 0 |
| 1 | 3 | 1 |
| 2 | 6 | 2 |
| 4 | 8 | 1 |
"},{"location":"grammar/transform/pileup/#interactive-example","title":"Interactive example","text":"

The following example demonstrates both \"coverage\" and \"pileup\" transforms.

{\n  \"data\": {\n    \"sequence\": {\n      \"start\": 1,\n      \"stop\": 100,\n      \"as\": \"start\"\n    }\n  },\n  \"transform\": [\n    {\n      \"type\": \"formula\",\n      \"expr\": \"datum.start + ceil(random() * 20)\",\n      \"as\": \"end\"\n    }\n  ],\n  \"resolve\": { \"scale\": { \"x\": \"shared\" } },\n  \"vconcat\": [\n    {\n      \"transform\": [\n        {\n          \"type\": \"coverage\",\n          \"start\": \"start\",\n          \"end\": \"end\",\n          \"as\": \"coverage\"\n        }\n      ],\n      \"mark\": \"rect\",\n      \"encoding\": {\n        \"x\": { \"field\": \"start\", \"type\": \"index\" },\n        \"x2\": { \"field\": \"end\" },\n        \"y\": { \"field\": \"coverage\", \"type\": \"quantitative\" }\n      }\n    },\n    {\n      \"transform\": [\n        {\n          \"type\": \"pileup\",\n          \"start\": \"start\",\n          \"end\": \"end\",\n          \"as\": \"lane\"\n        }\n      ],\n      \"mark\": \"rect\",\n      \"encoding\": {\n        \"x\": { \"field\": \"start\", \"type\": \"index\" },\n        \"x2\": { \"field\": \"end\" },\n        \"y\": {\n          \"field\": \"lane\",\n          \"type\": \"index\",\n          \"scale\": {\n            \"padding\": 0.2,\n            \"reverse\": true,\n            \"zoom\": false\n          }\n        }\n      }\n    }\n  ]\n}\n
"},{"location":"grammar/transform/project/","title":"Project","text":"

The \"project\" transform retains the specified fields of the data objects, optionally renaming them. All other fields are removed.

"},{"location":"grammar/transform/project/#parameters","title":"Parameters","text":"as

Type: array

New names for the projected fields. If omitted, the names of the source fields are used.

fields Required

Type: array

The fields to be projected.

"},{"location":"grammar/transform/project/#example","title":"Example","text":"
{\n  \"type\": \"project\",\n  \"fields\": [\"lane\", \"start\", \"exons\"]\n}\n
"},{"location":"grammar/transform/regex-extract/","title":"Regex Extract","text":"

The \"regexExtract\" transform extracts groups from a string field and adds them to the data objects as new fields.

"},{"location":"grammar/transform/regex-extract/#parameters","title":"Parameters","text":"as Required

Type: string | string[]

The new field or an array of fields where the extracted values are written.

field Required

Type: string (field name)

The source field

regex Required

Type: string

A valid JavaScript regular expression with at least one group. For example: \"^Sample(\\\\d+)$\".

Read more at: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions

skipInvalidInput

Type: boolean

Do not complain about invalid input. Just skip it and leave the new fields undefined on the affected datum.

Default: false

"},{"location":"grammar/transform/regex-extract/#example","title":"Example","text":"

Given the following data:

| Gene | Genome Location |
| ---- | --------------- |
| AKT1 | 14:104770341-104792643 |

... and configuration:

{\n  \"type\": \"regexExtract\",\n  \"field\": \"Genome Location\",\n  \"regex\": \"^(X|Y|\\\\d+):(\\\\d+)-(\\\\d+)$\",\n  \"as\": [\"Chrom\", \"Start\", \"End\"]\n}\n

Three new fields are added to the data:

| Gene | Genome Location | Chrom | Start | End |
| ---- | --------------- | ----- | ----- | --- |
| AKT1 | 14:104770341-104792643 | 14 | 104770341 | 104792643 |
"},{"location":"grammar/transform/regex-fold/","title":"Regex Fold","text":"

The \"regexFold\" transform gathers columns into key-value pairs using a regular expression.
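
The gathering step can be sketched in Python as follows. This is a simplified illustration rather than GenomeSpy's implementation: the function name regex_fold is hypothetical, only a single regular expression is supported, and there is no counterpart for skipRegex.

```python
import re


def regex_fold(rows, column_regex, as_value, as_key='sample'):
    '''Gather columns whose names match column_regex into key-value rows.'''
    pattern = re.compile(column_regex)
    out = []
    for row in rows:
        # Columns that do not match are copied to every folded row.
        kept = {k: v for k, v in row.items() if not pattern.match(k)}
        for column, value in row.items():
            match = pattern.match(column)
            if match:
                # The first capturing group provides the key, e.g., a sample id.
                out.append({**kept, as_key: match.group(1), as_value: value})
    return out


rows = [
    {'SNP': 'rs99924582', 'foo.AF': 0.3, 'bar.AF': 0.24, 'baz.AF': 0.94},
    {'SNP': 'rs22238423', 'foo.AF': 0.92, 'bar.AF': 0.21, 'baz.AF': 0.42},
]
# [.] matches a literal dot without needing a backslash escape.
folded = regex_fold(rows, '^(.*)[.]AF$', 'VAF')
```

Each input row yields one output row per matched column, so folded holds six data objects.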

"},{"location":"grammar/transform/regex-fold/#parameters","title":"Parameters","text":"asKey

Type: string

Default: \"sample\"

asValue Required

Type: string[] | string

A new column name for the extracted values.

columnRegex Required

Type: string[] | string

A regular expression that matches column names. The regex must have one capturing group that is used for extracting the key (e.g., a sample id) from the column name.

skipRegex

Type: string

An optional regex that matches fields that should not be included in the new folded data objects.

"},{"location":"grammar/transform/regex-fold/#example","title":"Example","text":"

Given the following data:

| SNP | foo.AF | bar.AF | baz.AF |
| --- | ------ | ------ | ------ |
| rs99924582 | 0.3 | 0.24 | 0.94 |
| rs22238423 | 0.92 | 0.21 | 0.42 |

... and configuration:

{\n  \"type\": \"regexFold\",\n  \"columnRegex\": [\"^(.*)\\\\.AF$\"],\n  \"asValue\": [\"VAF\"],\n  \"asKey\": \"sample\"\n}\n

The matched columns are folded into new data objects. All others are left intact:

| SNP | sample | VAF |
| --- | ------ | --- |
| rs99924582 | foo | 0.3 |
| rs99924582 | bar | 0.24 |
| rs99924582 | baz | 0.94 |
| rs22238423 | foo | 0.92 |
| rs22238423 | bar | 0.21 |
| rs22238423 | baz | 0.42 |
"},{"location":"grammar/transform/sample/","title":"Sample","text":"

The \"sample\" transform takes a random sample of the data objects.

"},{"location":"grammar/transform/sample/#parameters","title":"Parameters","text":"size

Type: number

The maximum sample size.

Default: 500

"},{"location":"grammar/transform/sample/#example","title":"Example","text":"
{\n  \"type\": \"sample\",\n  \"size\": 100\n}\n
"},{"location":"grammar/transform/stack/","title":"Stack","text":"

The \"stack\" transform computes a stacked layout. Stacked bar plots and sequence logos are some of its applications.

"},{"location":"grammar/transform/stack/#parameters","title":"Parameters","text":"as Required

Type: array

Fields to write the stacked values.

Default: [\"y0\", \"y1\"]

baseField

Type: string (field name)

The field that contains the base or amino acid. Used for information content calculation when the offset is \"information\". The data objects that have null in the baseField are considered gaps, and they are taken into account when scaling the locus' information content.

cardinality

Type: number

Cardinality, e.g., the number of distinct bases or amino acids. Used for information content calculation when the offset is \"information\".

Default: 4

field

Type: string (field name)

The field to stack. If no field is defined, a constant value of one is assumed.

groupby Required

Type: array

The fields to be used for forming groups for different stacks.

offset

Type: string

How to offset the values in a stack. \"zero\" (default) starts stacking at 0. \"center\" centers the values around zero. \"normalize\" computes intra-stack percentages and normalizes the values to the range of [0, 1]. \"information\" computes a layout for a sequence logo. The total height of the stack reflects the group's information content.

sort

Type: CompareParams

The sort order of data in each stack.
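
To make the zero and normalize offsets concrete, here is a minimal Python sketch of a stacked layout. It is an illustration under simplifying assumptions, not GenomeSpy's implementation: the function name stack and its signature are hypothetical, only one grouping field is supported, and the center and information offsets as well as sorting are left out.

```python
from itertools import groupby


def stack(rows, field, group_field, offset='zero'):
    '''Add y0/y1 to each row so that the rows of a group pile on top of each other.'''
    out = []
    # groupby needs its input sorted by the grouping key; sorted() is stable,
    # so the original order within each group is preserved.
    ordered = sorted(rows, key=lambda r: r[group_field])
    for _, group in groupby(ordered, key=lambda r: r[group_field]):
        group = list(group)
        total = sum(r[field] for r in group)
        # 'normalize' divides by the stack total; 'zero' leaves the values as-is.
        scale = total if offset == 'normalize' and total else 1
        running = 0
        for r in group:
            out.append({**r, 'y0': running / scale, 'y1': (running + r[field]) / scale})
            running += r[field]
    return out


values = [
    {'x': 1, 'q': 'A', 'z': 7},
    {'x': 1, 'q': 'B', 'z': 3},
    {'x': 1, 'q': 'C', 'z': 10},
    {'x': 2, 'q': 'A', 'z': 8},
    {'x': 2, 'q': 'B', 'z': 5},
]
stacked = stack(values, 'z', 'x')
normalized = stack(values, 'z', 'x', offset='normalize')
```

In the zero-offset result, the three rows with x equal to 1 span 0 to 7, 7 to 10, and 10 to 20. With normalization, every stack ends at 1.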

"},{"location":"grammar/transform/stack/#examples","title":"Examples","text":""},{"location":"grammar/transform/stack/#stacked-bar-plot","title":"Stacked bar plot","text":"
{\n  \"data\": {\n    \"values\": [\n      { \"x\": 1, \"q\": \"A\", \"z\": 7 },\n      { \"x\": 1, \"q\": \"B\", \"z\": 3 },\n      { \"x\": 1, \"q\": \"C\", \"z\": 10 },\n      { \"x\": 2, \"q\": \"A\", \"z\": 8 },\n      { \"x\": 2, \"q\": \"B\", \"z\": 5 },\n      { \"x\": 3, \"q\": \"B\", \"z\": 10 }\n    ]\n  },\n  \"transform\": [\n    {\n      \"type\": \"stack\",\n      \"field\": \"z\",\n      \"groupby\": [\"x\"]\n    }\n  ],\n  \"mark\": \"rect\",\n  \"encoding\": {\n    \"x\": { \"field\": \"x\", \"type\": \"nominal\", \"band\": 0.8 },\n    \"y\": { \"field\": \"y0\", \"type\": \"quantitative\" },\n    \"y2\": { \"field\": \"y1\" },\n    \"color\": { \"field\": \"q\", \"type\": \"nominal\" }\n  }\n}\n
"},{"location":"grammar/transform/stack/#sequence-logo","title":"Sequence logo","text":"
{\n  \"data\": {\n    \"values\": [\n      { \"pos\": 1, \"base\": \"A\", \"count\": 2 },\n      { \"pos\": 1, \"base\": \"C\", \"count\": 3 },\n      { \"pos\": 1, \"base\": \"T\", \"count\": 5 },\n      { \"pos\": 2, \"base\": \"A\", \"count\": 7 },\n      { \"pos\": 2, \"base\": \"C\", \"count\": 3 },\n      { \"pos\": 3, \"base\": \"A\", \"count\": 10 },\n      { \"pos\": 4, \"base\": \"T\", \"count\": 9 },\n      { \"pos\": 4, \"base\": \"G\", \"count\": 1 },\n      { \"pos\": 5, \"base\": \"G\", \"count\": 8 },\n      { \"pos\": 6, \"base\": \"G\", \"count\": 7 }\n    ]\n  },\n  \"transform\": [\n    {\n      \"type\": \"stack\",\n      \"field\": \"count\",\n      \"groupby\": [\"pos\"],\n      \"offset\": \"information\",\n      \"as\": [\"_y0\", \"_y1\"],\n      \"baseField\": \"base\",\n      \"sort\": { \"field\": \"count\", \"order\": \"ascending\" }\n    }\n  ],\n  \"encoding\": {\n    \"x\": { \"field\": \"pos\", \"type\": \"index\" },\n    \"y\": {\n      \"field\": \"_y0\",\n      \"type\": \"quantitative\",\n      \"scale\": { \"domain\": [0, 2] },\n      \"title\": \"Information\"\n    },\n    \"y2\": { \"field\": \"_y1\" },\n    \"text\": { \"field\": \"base\", \"type\": \"nominal\" },\n    \"color\": {\n      \"field\": \"base\",\n      \"type\": \"nominal\",\n      \"scale\": {\n        \"type\": \"ordinal\",\n        \"domain\": [\"A\", \"C\", \"T\", \"G\"],\n        \"range\": [\"#7BD56C\", \"#FF9B9B\", \"#86BBF1\", \"#FFC56C\"]\n      }\n    }\n  },\n  \"mark\": {\n    \"type\": \"text\",\n    \"font\": \"Source Sans Pro\",\n    \"fontWeight\": 700,\n    \"size\": 100,\n    \"squeeze\": true,\n    \"fitToBand\": true,\n\n    \"paddingX\": 0,\n    \"paddingY\": 0,\n\n    \"logoLetters\": true\n  }\n}\n
"},{"location":"sample-collections/","title":"Working with Sample Collections","text":"

The app package of the GenomeSpy toolkit enables an interactive analysis of large sample collections. It builds upon the core package, which allows developers to build tailored visualizations using the visualization grammar and GPU-accelerated rendering engine. The app extends the grammar with a facet operator that makes it possible to repeat a single visualization for thousands of samples. The end users of the visualization have access to several interactions that facilitate the exploration of such sample collections.

The documentation of the app package is split into two parts serving different audiences:

  1. Visualizing Sample Collections (for method developers)
  2. Analyzing Sample Collections (for end users)
"},{"location":"sample-collections/analyzing/","title":"Analyzing Sample Collections","text":"

End-User Documentation

This page is mainly intended for end users who analyze sample collections interactively using the GenomeSpy app.

"},{"location":"sample-collections/analyzing/#elements-of-the-user-interface","title":"Elements of the user interface","text":"

Because GenomeSpy visualizations are highly customizable, the actual visualization and the available user-interface elements may differ significantly from what is shown below.

  1. Location / search field shows the genomic coordinates of the current viewport in a UCSC-style format. You can look up features such as gene symbols using the field. In addition, you can filter the sample collection by categorical metadata attributes by typing a categorical value into this field.
  2. Undo history and provenance allows you to undo and redo actions performed on the sample collection. The provenance button shows all performed actions, allowing you to better understand how the current visualization state was constructed.
  3. View visibility menu allows for toggling the visibility of elements such as metadata attributes or annotation tracks.
  4. Bookmark menu shows a list of pre-defined bookmarks and allows you to save the visualization state as a local bookmark into your web browser. The adjacent Share button constructs a shareable URL, which captures the visualization state and optional notes related to the current visualization state.
  5. Fullscreen toggle opens the visualization in fullscreen mode.
  6. Group markers become visible when the sample collection has been stratified using some attribute.
  7. Sample names identify the samples.
  8. Metadata such as clinical attributes or computed variables are shown as a heatmap.
  9. Genomic data is shown here.
"},{"location":"sample-collections/analyzing/#navigation-interactions","title":"Navigation interactions","text":""},{"location":"sample-collections/analyzing/#navigating-around-the-genome","title":"Navigating around the genome","text":"

To navigate around the genome in GenomeSpy, you can use either a mouse or a touchpad. If you're using a mouse, you can zoom the genome axis in and out using the scroll wheel. To pan the view, click with the left mouse button and start dragging.

If you're using a touchpad, you can zoom the genome axis by performing a vertical two-finger gesture. Similarly, you can pan the view by performing a horizontal gesture.

"},{"location":"sample-collections/analyzing/#peeking-samples","title":"Peeking samples","text":"

The GenomeSpy app is designed for the exploration of large datasets containing hundreds or thousands of samples. To provide a better overview of patterns across the entire sample collection, GenomeSpy displays the samples as a bird's eye view that fits them into the available vertical space. If you discover interesting patterns or outliers in the dataset, you can peek at individual samples by activating a close-up view from the context menu or by pressing the E key on the keyboard.

Once the close-up view is activated, the zooming interaction will change to vertical scrolling. However, you can still zoom in and out by holding down the Ctrl key while operating the mouse wheel or touchpad.

"},{"location":"sample-collections/analyzing/#manipulating-the-sample-collection","title":"Manipulating the sample collection","text":"

Sorting, filtering, and stratifying a large sample collection can provide valuable insights into the data by helping to identify patterns and outliers. By sorting samples based on a particular attribute or filtering out irrelevant samples, you can more easily identify patterns or trends in the data that might be difficult to see otherwise. Stratifying the sample collection by grouping samples into distinct categories can also help to identify meaningful differences between groups and reveal new insights into the data.

The GenomeSpy app enables users to manipulate the sample collection using incremental actions that operate on abstract attributes, such as metadata variables or measured values at specific genomic loci. By applying a series of these stepwise actions, users can gradually shape the sample collection to their needs, enabling complex analyses. The applied actions are saved in an undo history, which also serves as provenance information for bookmarks and shared links.

An example scenario

Suppose a user has a sample collection that includes multiple tumor samples from each patient and wants to keep a single representative sample from each patient. The user defines a representative sample as having a tumor purity greater than or equal to 15% and the highest copy number at the MYC locus. To form a sample collection with only the representative samples, the user performs the following actions:

  1. Retains samples with purity greater than or equal to 15%
  2. Sorts the samples in descending order by the copy number at the MYC locus
  3. Retains only the top sample from each patient, based on the sorting in Step 2

Following these steps, the user is left with the representative samples.
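
The same three steps can be written out as a short Python sketch, which may help in reasoning about how the actions compose. The sample records, field names, and values below are entirely hypothetical. In the app, these steps are performed interactively rather than in code.

```python
# Hypothetical samples: the field names and values are made up for illustration.
samples = [
    {'id': 's1', 'patient': 'P1', 'purity': 0.10, 'myc_cn': 9.0},
    {'id': 's2', 'patient': 'P1', 'purity': 0.40, 'myc_cn': 4.0},
    {'id': 's3', 'patient': 'P1', 'purity': 0.25, 'myc_cn': 6.5},
    {'id': 's4', 'patient': 'P2', 'purity': 0.30, 'myc_cn': 2.0},
    {'id': 's5', 'patient': 'P2', 'purity': 0.12, 'myc_cn': 7.0},
]

# Step 1: retain samples with purity greater than or equal to 15%.
retained = [s for s in samples if s['purity'] >= 0.15]

# Step 2: sort in descending order by the copy number at the MYC locus.
retained.sort(key=lambda s: s['myc_cn'], reverse=True)

# Step 3: retain only the first (topmost) sample of each patient.
seen = set()
representatives = []
for s in retained:
    if s['patient'] not in seen:
        seen.add(s['patient'])
        representatives.append(s)

representative_ids = [s['id'] for s in representatives]
```

Note that s1 has the highest copy number for patient P1 but is removed in the first step because of its low purity, so s3 becomes the representative.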

"},{"location":"sample-collections/analyzing/#accessing-the-actions","title":"Accessing the actions","text":"

You can access the actions via a context menu, which appears when you right-click on a metadata attribute in the heatmap or a location in the genomic data panel.

There are two types of attributes: quantitative and categorical. Each type has a different set of supported actions. For example, quantitative attributes can be filtered using a threshold, while categorical attributes support retention or removal of selected categories.

The context menu also provides shortcuts to some actions based on the value under the cursor. For example, a context menu opened on a categorical attribute will give you actions for retaining or removing samples with the selected categorical value.

"},{"location":"sample-collections/analyzing/#undo-history-and-provenance","title":"Undo history and provenance","text":"

GenomeSpy stores the applied actions in an undo history, allowing you to easily experiment with different analyses and revert to previous states if needed. The provenance button reveals a menu that shows the applied actions together with the used attributes and parameters. You can jump to different states in the undo history by clicking the menu items or the adjacent previous/next buttons.

"},{"location":"sample-collections/analyzing/#the-actions","title":"The actions","text":""},{"location":"sample-collections/analyzing/#sort","title":"Sort","text":"

The Sort by action arranges the samples in a descending order based on the chosen quantitative attribute.

"},{"location":"sample-collections/analyzing/#filter-by-a-categorical-attribute","title":"Filter by a categorical attribute","text":"

The context menu provides two shortcut actions for retaining and removing samples having the chosen value in the selected attribute. The Advanced filter... option allows you to choose multiple categories to be retained or removed.

"},{"location":"sample-collections/analyzing/#filter-by-a-quantitative-attribute","title":"Filter by a quantitative attribute","text":"

For quantitative attributes, the menu offers shortcut actions for retaining samples with a value greater than or equal to, or less than or equal to, the chosen value. For more precise thresholding, you can use the Advanced filter... option, which opens a dialog with a histogram and options for choosing open or closed thresholds.

"},{"location":"sample-collections/analyzing/#retain-the-first-of-each","title":"Retain the first of each","text":"

In many analyses, it is necessary to select a single, representative sample from each category. This action retains the first, topmost sample from each category. It is not necessary to sort the samples by the categorical attribute, but rather they should be sorted according to the attributes used to select the representative samples. For a usage example, refer to the example scenario provided in the box above.

"},{"location":"sample-collections/analyzing/#retain-first-n-categories","title":"Retain first n categories","text":"

Sometimes you might be interested in a small number of categories that contain samples with the most extreme values in another attribute. For example, if each patient (the category) has multiple samples, this action allows you to retain all samples from the top-5 patients based on the highest number of mutations (the other attribute) in any of their samples.

"},{"location":"sample-collections/analyzing/#create-custom-groups","title":"Create custom groups","text":"

Use this action to manually select and group multiple categories together according to your specific requirements. This feature allows you to create new groups that contain any combination of categories from your data, giving you the flexibility to organize and view your data in customized groupings.

"},{"location":"sample-collections/analyzing/#group-by-categorical-attribute","title":"Group by categorical attribute","text":"

This action stratifies the data based on the selected categorical attribute. The groups will be shown to the left of the sample names, as shown above.

"},{"location":"sample-collections/analyzing/#group-by-quartiles","title":"Group by quartiles","text":"

This action enables rapid stratification into four groups using a quantitative attribute. The implementation uses the R-7 method, the default in the R programming language and Excel.
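
For reference, the R-7 rule interpolates linearly between the two order statistics closest to the fractional index (n - 1) * p. The Python sketch below computes the three quartile cut points this way. The function name quantile_r7 is made up for the example, and how the app assigns samples that fall exactly on a cut point is not covered here.

```python
import math


def quantile_r7(values, p):
    '''R-7 quantile: linear interpolation between the two closest order statistics.'''
    xs = sorted(values)
    h = (len(xs) - 1) * p          # fractional zero-based index
    lo = math.floor(h)
    hi = min(lo + 1, len(xs) - 1)  # guard against running past the last value
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])


data = [1, 2, 3, 4]
cut_points = [quantile_r7(data, p) for p in (0.25, 0.5, 0.75)]
```

For the values 1, 2, 3, 4 the cut points are 1.75, 2.5, and 3.25.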

"},{"location":"sample-collections/analyzing/#group-by-thresholds","title":"Group by thresholds","text":"

The group by thresholds action allows stratifying the samples using custom thresholds on a quantitative attribute. Upon selecting this action, you are shown a dialog with a histogram, where you can add any number of thresholds and specify which side of the threshold should be open or closed.

"},{"location":"sample-collections/analyzing/#retain-matched","title":"Retain matched","text":"

This action retains categories that are common to all of the current groups. For example, suppose you are working with a sample collection with multiple samples from each patient. You have grouped the samples into two groups based on the anatomical site of the sample. By applying this action to the categorical patient attribute, you can ensure that your sample collection comprises only those patients with samples from both anatomical sites. In other words, the patients with only a single anatomical site are removed.
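
Conceptually, the action computes the intersection of the category sets of all groups and keeps only the samples whose category is in that intersection. The following sketch illustrates the idea with hypothetical names (retainMatched, patient); it is not the app's actual implementation.

```javascript
// Hypothetical helper: groups is an array of sample arrays.
// Keeps only samples whose category occurs in every group.
function retainMatched(groups, categoryOf) {
  const categorySets = groups.map((group) => new Set(group.map(categoryOf)));
  const isCommon = (category) => categorySets.every((set) => set.has(category));
  return groups.map((group) =>
    group.filter((sample) => isCommon(categoryOf(sample)))
  );
}

// Illustrative data: samples grouped by anatomical site.
const groupedBySite = [
  [
    { id: 'P1_ovary', patient: 'P1' },
    { id: 'P2_ovary', patient: 'P2' },
  ],
  [
    { id: 'P1_omentum', patient: 'P1' },
    { id: 'P3_omentum', patient: 'P3' },
  ],
];

const matched = retainMatched(groupedBySite, (s) => s.patient);
console.log(matched.map((group) => group.map((s) => s.id)));
```

Only patient P1 has samples from both sites, so the samples of patients P2 and P3 are removed.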

"},{"location":"sample-collections/analyzing/#bookmarking-and-sharing","title":"Bookmarking and sharing","text":"

Saving a visualization state together with provenance as a bookmark is a practical way to revisit a particular visualization later or share it with others. Bookmarks store the entire state of the visualization, including the actions taken to arrive at that state. This allows for easy and reproducible sharing of findings from the data. Moreover, bookmarks support optional Markdown-formatted notes that allow communicating essential background information and possible implications related to the discovery.

"},{"location":"sample-collections/analyzing/#bookmarks","title":"Bookmarks","text":"

GenomeSpy supports two types of bookmarks: pre-defined bookmarks that the visualization author may have included with the visualization and local bookmarks that you can save in your web browser. You can access both types of bookmarks from the bookmark menu. Additionally, you can remove or edit existing bookmarks through a submenu that appears when you click the ellipsis button.

"},{"location":"sample-collections/analyzing/#sharing","title":"Sharing","text":"

The current visualization state is continuously written to the web browser's address bar, allowing you to quickly share the state with others. However, for better context, GenomeSpy's sharing function provides the option to include a name and notes with the shared state. Additionally, recipients can conveniently import the shared link into their local GenomeSpy bookmarks. You can share the current state by clicking the Share button, or share an existing bookmark by selecting the Share option from the bookmark's submenu.

"},{"location":"sample-collections/visualizing/","title":"Visualizing Sample Collections","text":"

Developer Documentation

This page is intended for users who develop tailored visualizations using the GenomeSpy app.

"},{"location":"sample-collections/visualizing/#getting-started","title":"Getting started","text":"

You can use the following HTML template to create a web page for your visualization. The template loads the app from a content delivery network and the visualization specification from a separate spec.json file placed in the same directory. See the getting started page for more information.

<!DOCTYPE html>\n<html>\n  <head>\n    <title>GenomeSpy</title>\n    <link\n      rel=\"stylesheet\"\n      type=\"text/css\"\n      href=\"https://cdn.jsdelivr.net/npm/@genome-spy/app@0.51.x/dist/style.css\"\n    />\n  </head>\n  <body>\n    <script\n      type=\"text/javascript\"\n      src=\"https://cdn.jsdelivr.net/npm/@genome-spy/app@0.51.x\"\n    ></script>\n\n    <script>\n      genomeSpyApp.embed(document.body, \"spec.json\", {\n        // Show the dataflow inspector button in the toolbar (default: true)\n        showInspectorButton: true,\n      });\n    </script>\n  </body>\n</html>\n

For a complete example, check the website-examples repository on GitHub.

"},{"location":"sample-collections/visualizing/#specifying-a-sample-view","title":"Specifying a Sample View","text":"

The GenomeSpy app extends the core library with a new view composition operator that allows visualization of multiple samples. In this context, a sample means a set of data objects representing an organism, a piece of tissue, a cell line, a single cell, etc. Each sample gets its own track in the visualization, and the behavior resembles the facet operator of Vega-Lite. However, there are subtle differences in the behavior.

A sample view is defined by the samples and spec properties. To assign a track for a data object, define a sample-identifier field using the sample channel. More complex visualizations can be created using the layer operator. Each composed view may have a different data source, enabling concurrent visualization of multiple data types. For instance, the bottom layer could display segmented copy-number data, while the top layer might show single-nucleotide variants.

{\n  \"samples\": {\n    // Optional sample identifiers and metadata\n    ...\n  },\n  \"spec\": {\n    // A single or layer specification\n    ...,\n    \"encoding\": {\n      ...,\n      // The sample channel identifies the track\n      \"sample\": {\n        \"field\": \"sampleId\"\n      }\n    }\n  }\n}\n

Y axis ticks

The Y axis ticks are currently not available in sample views; this will be fixed later. However, they would not be particularly practical with a high number of samples.

But we have Band scale?

Superficially similar results can be achieved by using the \"band\" scale on the y channel. However, you cannot adjust the intra-band y-position, as the y channel is already reserved for assigning a band for a datum. On the other hand, with the band scale, the graphical marks can span multiple bands. You could, for example, draw lines between the bands.

"},{"location":"sample-collections/visualizing/#implicit-sample-identifiers","title":"Implicit sample identifiers","text":"

By default, the identifiers of the samples are extracted from the data, and each sample gets its own track.

"},{"location":"sample-collections/visualizing/#explicit-sample-identifiers-and-metadata-attributes","title":"Explicit sample identifiers and metadata attributes","text":"

Genomic data is commonly supplemented with metadata that contains various clinical and computational annotations. To show such metadata alongside the genomic data as a color-coded heat map, you can provide a data source with sample identifiers and metadata columns.

Explicit sample identifiers
{\n  \"samples\": {\n    \"data\": { \"url\": \"samples.tsv\" }\n  },\n  \"spec\": {\n    ...\n  }\n}\n

The data source must have a sample field matching the sample identifiers used in the genomic data. In addition, an optional displayName field can be provided if the sample names should be shown, for example, in a shortened form. All other fields are shown as metadata attributes, and their data types are inferred automatically from the data: numeric attributes are interpreted as \"quantitative\" data, all others as \"nominal\".

An example of a metadata file (samples.tsv):

sample              displayName    treatment  ploidy  purity
EOC52_pPer_DNA4     EOC52_pPer     NACT       3.37    0.29
EOC702_pOme1_DNA1   EOC702_pOme1   PDS        3.74    0.155
EOC912_p2Bow2_DNA1  EOC912_p2Bow2  PDS        3.29    0.53

"},{"location":"sample-collections/visualizing/#specifying-data-types-of-metadata-attributes","title":"Specifying data types of metadata attributes","text":"

To adjust the data types, scales, and default visibility of the attributes, they can be specified explicitly using the attributes object, as shown in the example below:

Specifying a purity attribute
{\n  \"samples\": {\n    \"data\": { \"url\": \"samples.tsv\" },\n    \"attributes\": {\n      \"purity\": {\n        \"type\": \"quantitative\",\n        \"scale\": {\n          \"domain\": [0, 1],\n          \"scheme\": \"yellowgreenblue\"\n        },\n        \"barScale\": { },\n        \"visible\": false\n      },\n      ...\n    }\n  },\n  ...\n}\n

The scale property specifies a scale for the color channel used to encode the values on the metadata heatmap. The optional barScale property enables positional encoding, changing the heatmap cells into a horizontal bar chart. The visible property configures the default visibility for the attribute.

"},{"location":"sample-collections/visualizing/#adjusting-font-sizes-etc","title":"Adjusting font sizes, etc.","text":"

The samples object can also be used to adjust the font sizes, etc. of the metadata attributes. For example, to increase the font sizes of the sample and attribute labels, use the following configuration:

Adjusting font sizes
{\n  \"samples\": {\n    ...,\n    \"labelFontSize\": 12,\n    \"attributeLabelFontSize\": 10\n  },\n  ...\n}\n

The following properties allow for fine-grained control of the font styles: labelFont, labelFontSize, labelFontWeight, labelFontStyle, labelAlign, attributeLabelFont, attributeLabelFontSize, attributeLabelFontWeight, attributeLabelFontStyle.

In addition, the following properties are supported:

labelTitleText

The title of the sample labels.

Default value: \"Sample name\"

labelLength

The space allocated for the sample labels in pixels.

Default value: 140

labelAlign

The horizontal alignment of the text. One of \"left\", \"center\", or \"right\".

Default value: \"left\"

attributeSize

Default size (width) of the metadata attribute columns. Can be configured per attribute using the attributes property.

Default value: 10

attributeLabelAngle

Angle to be added to the default label angle (-90).

Default value: 0

attributeSpacing

Spacing between attribute columns in pixels.

Default value: 1

"},{"location":"sample-collections/visualizing/#handling-variable-sample-heights","title":"Handling variable sample heights","text":"

The height of a single sample depends on the number of samples and the height of the sample view. Moreover, the end user can toggle between a bird's eye view and a closeup view, which makes the height very dynamic.

To adapt the maximum size of \"point\" marks to the height of the samples, you need to specify a dynamic scale range for the size channel. The following example demonstrates how to use expressions and the height parameter to adjust the point size:

Dynamic point sizes
\"encoding\": {\n  \"size\": {\n    \"field\": \"VAF\",\n    \"type\": \"quantitative\",\n    \"scale\": {\n      \"domain\": [0, 1],\n      \"range\": [\n        { \"expr\": \"0\" },\n        { \"expr\": \"pow(clamp(height * 0.65, 2, 18), 2)\" }\n      ]\n    }\n  },\n  ...\n}\n

In this example, the height parameter, provided by the sample view, contains the height of a single sample. By multiplying it with 0.65, the points get some padding at the top and bottom. To prevent the points from becoming too small or excessively large, the clamp function is used to limit the point's diameter to a minimum of 2 and a maximum of 18 pixels. As the size channel encodes the area, not the diameter of the points, the pow function is used to square the value. The technique shown here is used in the PARPiCL example.
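
To get a feel for the numbers, the same arithmetic can be reproduced in plain JavaScript. The clamp helper and the maxPointSize function below are written for this illustration only; they mirror the expression in the example above and are not part of the GenomeSpy API.

```javascript
// Limit a value to the closed interval [min, max].
const clamp = (value, min, max) => Math.min(Math.max(value, min), max);

// Upper end of the size range for a given sample height in pixels.
// The diameter is clamped to 2..18 pixels and squared, because the
// size channel encodes the area of the point.
function maxPointSize(height) {
  const diameter = clamp(height * 0.65, 2, 18);
  return Math.pow(diameter, 2);
}

for (const height of [1, 10, 20, 100]) {
  console.log(height, maxPointSize(height));
}
```

Very short samples hit the lower limit (a diameter of 2, an area of 4), while tall samples in the closeup view are capped at a diameter of 18 (an area of 324).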

"},{"location":"sample-collections/visualizing/#aggregation","title":"Aggregation","text":"

TODO

"},{"location":"sample-collections/visualizing/#bookmarking","title":"Bookmarking","text":"

With the GenomeSpy app, users can save the current visualization state, including scale domains and view visibilities, as bookmarks. These bookmarks are stored in the IndexedDB of the user's web browser. Each database is unique to an origin, which typically refers to the hostname and domain of the web server hosting the visualization. Since the server may host multiple visualizations, each visualization must have a unique ID assigned to it. To enable bookmarking, simply add the specId property with an arbitrary but unique string value to the top-level view. Example:

{\n  \"specId\": \"My example visualization\",\n\n  \"vconcat\": { ... },\n  ...\n}\n
"},{"location":"sample-collections/visualizing/#pre-defined-bookmarks-and-bookmark-tour","title":"Pre-defined bookmarks and bookmark tour","text":"

You may want to provide users with a few pre-defined bookmarks that showcase interesting findings from the data. Since bookmarks support Markdown-formatted notes, you can also explain the implications of the findings and present essential background information.

The remote bookmarks feature allows for storing bookmarks in a JSON file on a web server and providing them to users through the bookmark menu. In addition, you can optionally enable the tour function, which automatically opens the first bookmark in the file and allows the user to navigate the tour using previous/next buttons.

"},{"location":"sample-collections/visualizing/#enabling-remote-bookmarks","title":"Enabling remote bookmarks","text":"View specification
{\n  \"bookmarks\": {\n    \"remote\": {\n      \"url\": \"tour.json\",\n      \"tour\": true\n    }\n  },\n\n  \"vconcat\": { ... },\n  ...\n}\n

The remote object accepts the following properties:

url (string)

A URL to the remote bookmark file.

initialBookmark (string)

Name of the bookmark that should be loaded as the initial state. The bookmark description dialog is shown only if the tour property is set to true.

tour (boolean, optional)

Should the user be shown a tour of the remote bookmarks when the visualization is launched? If the initialBookmark property is not defined, the tour starts from the first bookmark.

Default: false

afterTourBookmark (string, optional)

Name of the bookmark that should be loaded when the user ends the tour. If null, the dialog will be closed and the current state is retained. If undefined, the default state without any performed actions will be loaded.

"},{"location":"sample-collections/visualizing/#the-bookmark-file","title":"The bookmark file","text":"

The remote bookmark file consists of an array of bookmark objects. The easiest way to create such bookmark objects is to create a bookmark in the app and choose Share from the submenu of the bookmark item. The sharing dialog provides the bookmark in a URL-encoded format and as a JSON object. Just copy-paste the JSON object into the bookmark file to make it available to all users. A simplified example:

Bookmark file (tour.json)
[\n  {\n    \"name\": \"First bookmark\",\n    \"actions\": [ ... ],\n    ...\n  },\n  {\n    \"name\": \"Second bookmark\",\n    \"actions\": [ ... ],\n    ...\n  }\n]\n

Providing the user with an initial state

If you want to provide the user with an initial state comprising specific actions performed on the samples, a particular visible genomic region, etc., you can create a bookmark with the desired settings and set the initialBookmark property to the bookmark's name. See the documentation above for details.

"},{"location":"sample-collections/visualizing/#toggleable-view-visibility","title":"Toggleable View Visibility","text":"

When working with a complex visualization that includes multiple tracks and extensive metadata, it may not always be necessary to display all views simultaneously. The GenomeSpy app offers users the ability to toggle the visibility of nodes within the view hierarchy. This visibility state is also included in shareable links and bookmarks, allowing users to easily access their preferred configurations.

Views have two properties for controlling the visibility:

visible (boolean)

If true, the view is visible. This property can be used to set the default visibility.

Default: true

configurableVisibility (boolean)

If true, the visibility is configurable from a menu in the app.

Configurability requires that the view has an explicitly specified name that is unique within the view specification.

Default: false for children of layer, true for others

"},{"location":"sample-collections/visualizing/#search","title":"Search","text":"

The location/search field in the toolbar allows users to quickly navigate to features in the data. To make features searchable, use the search channel on marks that represent the searchable data objects. Example:

{\n  ...,\n  \"mark\": \"rect\",\n  \"encoding\": {\n    \"search\": {\n      \"field\": \"geneSymbol\"\n    },\n    ...,\n  },\n  ...\n}\n
"},{"location":"sample-collections/visualizing/#a-practical-example","title":"A practical example","text":"

Work in progress

This part of the documentation is still under construction. For a live example, check the PARPiCL visualization, which is also available for interactive exploration.

"}]} \ No newline at end of file