VCF reader unit tests #3

Open
wants to merge 39 commits into base: master
Commits
e6cfbb1
Create ci.yml
pdebski21 May 9, 2025
a27fd0f
WIP: VCF
mwiewior Feb 17, 2025
01fcddb
Refactor
mwiewior Feb 17, 2025
e33f884
Parsing info draft
mwiewior Feb 19, 2025
da8ee0a
Working parser
mwiewior Feb 19, 2025
6d9e915
Refactoring infos
mwiewior Feb 19, 2025
7a92f18
async-trait downgrade
mwiewior Feb 19, 2025
d796d1b
Reverting to 0-based
mwiewior Feb 19, 2025
8f0a070
Renamed columns
mwiewior Feb 20, 2025
078f043
Add retry go operator
mwiewior Feb 24, 2025
08d5e1e
Add IOTimeout
mwiewior Feb 24, 2025
0160b15
Fixing streams
mwiewior Feb 24, 2025
c489775
Adding s3
mwiewior Feb 24, 2025
4b1fbca
fix: Basic fields
mwiewior Feb 25, 2025
d607031
Adding support for remote reading of uncompressed VCFs
mwiewior Feb 28, 2025
dc02686
Fix header
mwiewior Feb 28, 2025
222a367
Optimize variant_end
mwiewior Feb 28, 2025
712dad9
Enabling projection
mwiewior Mar 1, 2025
b0ed8c5
Fixing local vcf without compression and gcs reads optimization
mwiewior Mar 4, 2025
a7e6c7c
Fixing local vcf reading with no compression
mwiewior Mar 4, 2025
dfdb5be
Describe VCF
mwiewior Mar 5, 2025
dec3402
fix: Tag case sensitive
mwiewior Mar 7, 2025
0713aa5
add unit tests for storage
Jun 1, 2025
a47b690
add performance/time measurement for batch processing vcf with noodles
pdebski21 Apr 13, 2025
5e0f826
add retry mechanism and adjust chunk size along with minimal concurre…
pdebski21 Apr 13, 2025
3dc5630
Refactor scan to separate projected schema computation and use Field:…
pdebski21 Apr 18, 2025
8a84533
propagate builders errors
pdebski21 Apr 18, 2025
492eb47
Optimize OptionalField::new() to use with_capacity
pdebski21 Apr 18, 2025
0e9b30e
Improve info_to_arrow_type logic
pdebski21 Apr 18, 2025
606a602
refactor format fields and cleanup code
pdebski21 Apr 19, 2025
46e4d72
complete bgzf compressed files format ingestion tests
pdebski21 Apr 20, 2025
24bb9df
add bgzf test in similar format to test_noodles.rs
pdebski21 Apr 27, 2025
fa7c1a5
add docker-compose for testing iceberg
pdebski21 Apr 28, 2025
7ed3d7d
Cleanup a few warnings
mwiewior May 31, 2025
1613662
Bump runner image
mwiewior May 31, 2025
ccb5a9f
Update storage.rs
pdebski21 Jun 1, 2025
545b1c9
Update storage.rs
pdebski21 Jun 1, 2025
2955381
Update storage.rs
pdebski21 Jun 1, 2025
0eb1be4
Update storage.rs
pdebski21 Jun 1, 2025
44 changes: 44 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,44 @@
name: CI

on:
  push:
    branches:
      - main
  pull_request:

jobs:
  build-test:
    runs-on: ubuntu-22.04
    concurrency:
      group: ${{ github.workflow }}-${{ github.ref }}
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
        with:
          submodules: "recursive"
          fetch-depth: 1

      - name: Setup Rust
        uses: actions-rust-lang/setup-rust-toolchain@v1
        with:
          toolchain: '1.85.0'
          components: 'clippy, rustfmt'

      - name: Cache Cargo registry and build
        uses: actions/cache@v3
        with:
          path: |
            ~/.cargo/registry
            ~/.cargo/git
            datafusion/vcf/target
          key: ${{ runner.os }}-cargo-${{ hashFiles('datafusion/vcf/Cargo.lock') }}
          restore-keys: |
            ${{ runner.os }}-cargo-

      - name: Check formatting
        working-directory: datafusion/vcf
        run: cargo fmt --all -- --check

      - name: Run tests
        working-directory: datafusion/vcf
        run: cargo test
2 changes: 1 addition & 1 deletion .gitignore
@@ -18,4 +18,4 @@ Cargo.lock
 # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
 # and can be added to the global gitignore or merged into this file. For a more nuclear
 # option (not recommended) you can uncomment the following to ignore the entire idea folder.
-#.idea/
+.idea/
8 changes: 8 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,8 @@
repos:
  - repo: https://github.com/doublify/pre-commit-rust
    rev: v1.0
    hooks:
      - id: fmt
        args: ["--all", "--"]
      - id: cargo-check
