Allele search web component by That-Thing · Pull Request #595 · legumeinfo/web-components

That-Thing · 2026-02-26T03:41:25Z

Web component version of the SoyBase allele search tool requested by @StevenCannon-USDA.

StevenCannon-USDA · 2026-02-26T16:27:01Z

Testing locally on 2/26 (npm run serve), I see only <lis-allele-search-element>.

matthewwiese · 2026-02-26T16:51:31Z

@StevenCannon-USDA FWIW from a fresh clone on the branch I see all existing components plus the allele search one.

@That-Thing Regarding the TODO in the element code (the glyma.Wm82.gnm4.Gm01 etc. placeholders): could this be made customizable, for example as part of the "collections"? That would help in case we want to use this for something other than soybean in the future.

StevenCannon-USDA

Testing now, after (re?)running npm run build, the content comes up fine for me.

This all looks good to me - though I think I would like two changes:

Change the text "Ref / Alt only" under 1.5 to "All strains". The behavior has changed vs. the implementation at https://www.soybase.org/tools/ . I think the new behavior is good, but that first radio button does give all strains, so that should be so-labeled.
I think we should limit the query region size, in both the "Identifier" and "Region" sections. I suggest 1000000. That probably then calls for a corresponding label: "Flanking Region (1 million max)"
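A minimal sketch of the suggested cap, written in Python for illustration only. The component's actual code is not shown in this thread, so the function name, the error messages, and the assumption that a region looks like `<sequence id>:<start>-<end>` (as in the URLs quoted in this PR) are all hypothetical.

```python
import re

# Cap suggested in the review comment above (1 million bases).
MAX_REGION_BP = 1_000_000

# Assumed region syntax: "<sequence id>:<start>-<end>", 1-based, inclusive.
_REGION_RE = re.compile(r"^(?P<seq>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_region(region: str, max_bp: int = MAX_REGION_BP) -> tuple[str, int, int]:
    """Parse a region string and reject spans larger than max_bp."""
    match = _REGION_RE.match(region.strip())
    if match is None:
        raise ValueError(f"Unrecognized region: {region!r}")
    start, end = int(match["start"]), int(match["end"])
    if start < 1 or end < start:
        raise ValueError(f"Invalid coordinates in region: {region!r}")
    span = end - start + 1
    if span > max_bp:
        raise ValueError(f"Region spans {span} bp; the maximum is {max_bp} bp")
    return match["seq"], start, end
```

The same check would apply to the "Identifier" section once the flanking region has been added to the feature's coordinates.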

That-Thing · 2026-03-04T15:28:12Z

@matthewwiese
Getting a weird error from fasta-api:

curl 'http://dev.lis.ncgr.org:50043/vcf/alleles/glyma.Wm82.gnm4.Gm16:3727736-3749207/https%253A%252F%252Fdata.legumeinfo.org%252FGlycine%252Fmax%252Fdiversity%252FWm82.gnm4.div.Song_Hyten_2015%252Fglyma.Wm82.gnm4.div.Song_Hyten_2015.vcf.gz?encoding=hap'

response:

{"error": "Unable to open file: [Errno 0] Closing failed: Success: 'https://data.legumeinfo.org/Glycine/max/diversity/Wm82.gnm4.div.Song_Hyten_2015/glyma.Wm82.gnm4.div.Song_Hyten_2015.vcf.gz'", "status": 400}
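For anyone reproducing the request: the VCF location in the URL above appears percent-encoded twice (`https://` shows up as `https%253A%252F%252F`). The sketch below rebuilds that URL from its parts. The function name and parameters are hypothetical, and whether fasta-api requires the double encoding is an assumption based only on the URL in this comment.

```python
from urllib.parse import quote


def build_alleles_url(base: str, region: str, vcf_url: str, encoding: str = "hap") -> str:
    """Build a /vcf/alleles/ request URL shaped like the one in the curl command."""
    # Encode the VCF URL twice so that "/" becomes "%252F", as seen above.
    once = quote(vcf_url, safe="")
    twice = quote(once, safe="")
    return f"{base}/vcf/alleles/{region}/{twice}?encoding={encoding}"


url = build_alleles_url(
    "http://dev.lis.ncgr.org:50043",
    "glyma.Wm82.gnm4.Gm16:3727736-3749207",
    "https://data.legumeinfo.org/Glycine/max/diversity/"
    "Wm82.gnm4.div.Song_Hyten_2015/glyma.Wm82.gnm4.div.Song_Hyten_2015.vcf.gz",
)
print(url)
```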

matthewwiese · 2026-03-04T17:01:05Z

@That-Thing That's interesting, I think the .tbi index is wrong? On my side the errors are:

[E::hts_open_format] Failed to open file "https://data.legumeinfo.org/Glycine/max/diversity/Wm82.gnm4.div.Song_Hyten_2015/glyma.Wm82.gnm4.div.Song_Hyten_2015.vcf.gz" : Destination address required
[E::hts_open_format] Failed to open file "https://data.legumeinfo.org/Glycine/max/diversity/Wm82.gnm4.div.Song_Hyten_2015/glyma.Wm82.gnm4.div.Song_Hyten_2015.vcf.gz" : Destination address required
[E::bgzf_read_block] Invalid BGZF header at offset 116078328
[E::bgzf_read_block] Invalid BGZF header at offset 116069636
[E::bgzf_read_block] Invalid BGZF header at offset 116069636
[E::bgzf_read_block] Invalid BGZF header at offset 116078328
[E::bgzf_read_block] Invalid BGZF header at offset 116069636
[E::bgzf_read_block] Invalid BGZF header at offset 116078328
[E::bgzf_read_block] Invalid BGZF header at offset 116078328
[E::bgzf_read_block] Invalid BGZF header at offset 116078328
[E::bgzf_read_block] Invalid BGZF header at offset 116078328
[E::bgzf_read_block] Invalid BGZF header at offset 116078328
[E::bgzf_read_block] Invalid BGZF header at offset 116078328
[E::hts_open_format] Failed to open file "https://data.legumeinfo.org/Glycine/max/diversity/Wm82.gnm4.div.Song_Hyten_2015/glyma.Wm82.gnm4.div.Song_Hyten_2015.vcf.gz" : Destination address required
[E::bgzf_read_block] Invalid BGZF header at offset 116078328
[E::bgzf_read_block] Invalid BGZF header at offset 116078328

Pysam's fetch() automatically looks for a tbi at the same URL as the VCF, but the file at that location is only ~70 bytes https://data.legumeinfo.org/Glycine/max/diversity/Wm82.gnm4.div.Song_Hyten_2015/glyma.Wm82.gnm4.div.Song_Hyten_2015.vcf.gz.tbi

Whereas e.g. https://data.soybase.org/Glycine/max/diversity/Wm82.gnm2.div.Valliyodan_Brown_2021/USB481-25Kshared50Kpos.vcf.gz.tbi is ~170 KB.
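File size is only a hint, so here is a rough sanity check for a downloaded .tbi file. It relies on two facts about the tabix format: the index is BGZF-compressed (which Python's gzip module can read), and the decompressed data starts with the magic bytes `TBI\1`. The function name is hypothetical, and passing this check does not prove the index matches its VCF; it only catches empty or mangled files.

```python
import gzip
import zlib

# Magic bytes at the start of a decompressed tabix index.
TBI_MAGIC = b"TBI\x01"


def looks_like_tabix_index(path: str) -> bool:
    """Return True if the file decompresses and begins with the tabix magic."""
    try:
        with gzip.open(path, "rb") as fh:
            return fh.read(4) == TBI_MAGIC
    except (OSError, EOFError, zlib.error):
        # Not gzip/BGZF at all, truncated, or otherwise corrupt.
        return False
```

A file that contains only an empty BGZF block decompresses to zero bytes, so it fails the magic comparison and is reported as invalid.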

My guess is the data is corrupt or something? @adf-ncgr @StevenCannon-USDA any ideas?

adf-ncgr · 2026-03-04T17:37:19Z

sounds plausible. When I run bcftools view I don't get an error but indexed retrieval seems to yield nothing but the headers. I'll see if I can fix the version in the datastore but @StevenCannon-USDA will then have to sync it to where the web-hosting takes place.

adf-ncgr · 2026-03-04T17:48:10Z

OK, there were several files that seemed to have a similar problem (different genome versions of the same Song_Hyten_2015 genotype set). I've fixed them on ceres but I think the problem @That-Thing reported will remain until @StevenCannon-USDA can sync the new files to the web-hosting server.

StevenCannon-USDA · 2026-03-04T18:08:19Z

I have schlepped these over to c2s2. This hasn't seemed to fix the problem for me -- though the explanation makes sense.

Adding @weihuang12 , who has been working on these files.

Is the problem that some of the indexes are out of date with the bgzipped VCFs?

adf-ncgr · 2026-03-04T18:22:59Z

I'm not really sure what happened. As @matthewwiese said the tbi files that were there before I re-ran the indexing seemed to be only ~70bytes in size, although it looked like they had been updated recently. Now if I run a bcftools view locally on ceres the indexing seems to work OK, but I agree that the web-hosted one is still mis-behaving; not sure if some sort of caching could be at play. Can you verify that the tbi files on c2s2 are appropriately sized? (they are hidden through the h5ai interface so I can't tell)

adf-ncgr · 2026-03-04T18:25:44Z

actually nevermind about validating the file size, I just curled it and it seems OK. Puzzled as to what may be happening.

adf-ncgr · 2026-03-04T18:29:09Z

Actually, I think it was a caching issue, at least in the way I was testing it. doing something like:
bcftools view https://data.legumeinfo.org/Glycine/max/diversity/Wm82.gnm4.div.Song_Hyten_2015/glyma.Wm82.gnm4.div.Song_Hyten_2015.vcf.gz glyma.Wm82.gnm4.Gm01:1-100000
first time around downloads the tbi file to where you ran the command. After deleting the bad one from a previous attempt it started working properly.
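A small sketch of that cleanup step, assuming (per the description above) that the cached copy is written to the working directory under the remote file's name plus `.tbi`. The helper name is hypothetical.

```python
import os
import posixpath
from urllib.parse import urlsplit


def remove_stale_local_index(vcf_url: str, workdir: str = ".") -> bool:
    """Delete a locally cached .tbi for vcf_url; return True if one was removed."""
    # Assumed cache name: basename of the remote VCF with ".tbi" appended.
    index_name = posixpath.basename(urlsplit(vcf_url).path) + ".tbi"
    local_path = os.path.join(workdir, index_name)
    if os.path.isfile(local_path):
        os.remove(local_path)
        return True
    return False
```

Running this in the directory where the earlier bcftools attempt was made, before retrying, forces the index to be fetched again.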

That-Thing added 4 commits January 16, 2026 11:28

Modify existing SoyBase allele search tool into a web component.

fdb18e7

center chr select

a634fd9

remove outdated comments

a142078

Fixes and additions to allele search tool

9f03d53

StevenCannon-USDA requested review from StevenCannon-USDA, ctcncgr and matthewwiese February 26, 2026 15:31

StevenCannon-USDA reviewed Feb 26, 2026

StevenCannon-USDA mentioned this pull request Feb 27, 2026

Adds filtering by strains to allele search tool soybase/jekyll-soybase#269

Closed

Allow setting collections data inside the component, limit queries to 1 million, change wording of ref / alt.

32bafbc

That-Thing wants to merge 5 commits into main from allele-search-tool

4 participants
