Releases: bioinform/somaticseq

Added sequence complexity feature

01 Apr 01:33

  • Added linguistic sequence complexity (LC) as a feature, computed over 80-bp windows adjacent to and spanning the variant position. For the adjacent windows, the lower of the left and right values is retained. The feature set has therefore changed, so be careful with models trained before this release.
  • Fixed a bug in xgboost mode that occurred when training and prediction used different feature sets.
  • Changed the ada model file name to have "ada" in it.
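
To make the new feature concrete, here is a minimal sketch of one common definition of linguistic complexity: the number of distinct substrings observed in a window divided by the maximum number possible for that window length. Whether SomaticSeq uses exactly this definition is an assumption, as are the function names, the 0-based `pos` convention, and the centering of the spanning window. Only the 80-bp window size and the "keep the lower of left and right" rule come from the note above.

```python
def linguistic_complexity(seq, alphabet_size=4):
    """Distinct substrings observed / maximum possible, summed over all lengths k."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("cannot compute complexity of an empty sequence")
    observed = 0
    possible = 0
    for k in range(1, n + 1):
        # Distinct k-mers actually present in the window.
        observed += len({seq[i:i + k] for i in range(n - k + 1)})
        # At most alphabet_size**k distinct k-mers exist, and the window
        # only has room for n - k + 1 of them.
        possible += min(alphabet_size ** k, n - k + 1)
    return observed / possible


def adjacent_complexity(reference, pos, window=80):
    """Lower of the complexities of the windows left and right of pos (0-based)."""
    left = reference[max(0, pos - window):pos]
    right = reference[pos + 1:pos + 1 + window]
    # A flank may be empty at the very start or end of the reference.
    flanks = [flank for flank in (left, right) if flank]
    return min(linguistic_complexity(flank) for flank in flanks)


def spanning_complexity(reference, pos, window=80):
    """Complexity of a window roughly centered on pos (centering is an assumption)."""
    start = max(0, pos - window // 2)
    return linguistic_complexity(reference[start:start + window])
```

A homopolymer run such as `"AAAA"` scores low under this definition, while a window with no repeated substrings such as `"ACGT"` scores 1.0, so a low value flags a repetitive context around the variant.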

Incorporated xgboost algorithm as an option (default is still ada)

14 Oct 00:28

  • To invoke xgboost with somaticseq_parallel.py, pass --algorithm xgboost before the paired or single option. The default is still ada.
  • Set the tree depth to 16 for ada, which appeared optimal in internal benchmarking.
  • Minor bug fixes and improvements (see docs for details).
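
The ordering requirement for --algorithm is what you would expect if paired and single are argparse subcommands and --algorithm is defined on the top-level parser. The sketch below illustrates that behavior with a stand-in parser. It is an assumption about how the script is structured, not a copy of the real somaticseq_parallel.py parser, and it includes none of the script's other arguments.

```python
import argparse


def build_parser():
    # Stand-in parser: only --algorithm and the two modes from the release note.
    parser = argparse.ArgumentParser(prog="somaticseq_parallel.py")
    parser.add_argument("--algorithm", choices=["ada", "xgboost"], default="ada")
    modes = parser.add_subparsers(dest="mode", required=True)
    modes.add_parser("paired")
    modes.add_parser("single")
    return parser


def parse(argv):
    return build_parser().parse_args(argv)


# Top-level option first, then the mode: accepted.
args = parse(["--algorithm", "xgboost", "paired"])
print(args.algorithm, args.mode)

# Omitting the option falls back to the default algorithm.
print(parse(["single"]).algorithm)
```

With this layout, `parse(["paired", "--algorithm", "xgboost"])` exits with an "unrecognized arguments" error, because the subcommand's parser does not know about the top-level option.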

Special release for SEQC2 special project

09 Sep 21:24

  • This is a special release for the somatic mutation working group of the SEQC2 consortium to establish a set of tumor-normal reference samples.
  • It is based on v2 of SomaticSeq, containing custom scripts with hard-coded sample names, etc.
  • Corresponding docker image at lethalfang/somaticseq:seqc2_v1.1
  • NOT intended for general use.

Fixed the TA2CG variable in R script

24 Apr 23:54

Fixed a bug in the nucleotide change feature in training mode. Only the T>C base change was affected: it was not annotated properly.

Improved installation script

22 Mar 23:41

  • Rewrote in Python, at utilities/dockered_pipelines/makeSomaticScripts.py, some of the somatic caller run-script generators that were previously written in bash.
  • Fixed setup.py. Running "setup.py install" remains optional: you can always run the scripts from wherever you downloaded SomaticSeq.

minor maintenance release

19 Feb 21:32

  • Fixed some bash scripts involved with single-sample multi-thread callers.
  • vcfModifier/splitVcf.py now handles multi-allelic indel calls better and excludes complex variants that are technically not indels (e.g., variants like GCA>GATT). We may handle these separately in a future release.

Special release for one-time project

20 Jan 23:53

Pre-release

  • This is a special release for a one-time project.
  • It is based on v2 of SomaticSeq, containing project-specific custom scripts and sample names, etc.
  • Corresponding docker image at lethalfang/somaticseq:seqc2_v1.0
  • Not intended for general use.

bug fix and add support for platypus

05 Oct 06:13

  • Fixed a bug introduced in v3.0.1 that caused the program to handle .vcf.gz files incorrectly.
  • Incorporated Platypus into paired mode.
  • When splitting MuTect2 files into SNV and INDEL, an indel now requires that exactly one of the ref and alt alleles (not both) is a single base, which discards complex variants like GCAA>GCT.
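
The splitting rule in the last item can be sketched as a small classifier. This is an illustration of the stated rule only: the function names and the "snv"/"indel"/"complex" labels are hypothetical, each call is assumed to be a single REF/ALT pair (multi-allelic records already split), and symbolic alleles are not handled.

```python
def classify_variant(ref, alt):
    """Classify one REF/ALT pair by allele length alone."""
    if len(ref) == 1 and len(alt) == 1:
        return "snv"      # single-base substitution
    if len(ref) == 1 or len(alt) == 1:
        return "indel"    # exactly one allele is a single base
    return "complex"      # neither allele is a single base, e.g. GCAA>GCT


def split_calls(calls):
    """Split (ref, alt) pairs into SNV and indel lists, dropping complex variants."""
    snvs, indels = [], []
    for ref, alt in calls:
        kind = classify_variant(ref, alt)
        if kind == "snv":
            snvs.append((ref, alt))
        elif kind == "indel":
            indels.append((ref, alt))
    return snvs, indels


snvs, indels = split_calls([("A", "T"), ("G", "GCA"), ("GCA", "G"), ("GCAA", "GCT")])
print(snvs)
print(indels)
```

Here the GCAA>GCT call lands in neither list, matching the "discarding complex variants" behavior described above.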

bug fix

13 Sep 22:10

Fixed a bug introduced in v3.0.0 that caused Strelka and LoFreq indel files to be handled incorrectly.

version 3 release

27 Aug 20:59

Made SomaticSeq extendable as a Python library.
See docs/Manual.pdf for details.