VARMERGE
The VARMERGE command ensures that overlapping variants, represented as reference sequence and alternative sequence, are denoted in an equivalent manner.
The -seg option indicates that the variants are denoted as zero-based segments (e.g. Chrom,bpStart,bpStop,Ref,Alt) rather than in the default one-based position format (e.g. Chrom,Pos,Ref,Alt).
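The relationship between the two coordinate formats can be sketched in Python. This is only an illustration, not part of GOR: the type and function names are hypothetical, and it assumes that segments are zero-based and half-open, so that bpStop minus bpStart equals the length of the reference sequence.

```python
from typing import NamedTuple


class PosVariant(NamedTuple):
    """One-based position format: Chrom, Pos, Ref, Alt."""
    chrom: str
    pos: int  # one-based position of the first reference base
    ref: str
    alt: str


class SegVariant(NamedTuple):
    """Segment format: Chrom, bpStart, bpStop, Ref, Alt (assumed half-open)."""
    chrom: str
    bp_start: int  # zero-based, inclusive (assumption)
    bp_stop: int   # zero-based, exclusive (assumption)
    ref: str
    alt: str


def pos_to_seg(v: PosVariant) -> SegVariant:
    # The segment starts one base before the one-based position
    # and spans exactly the reference sequence.
    start = v.pos - 1
    return SegVariant(v.chrom, start, start + len(v.ref), v.ref, v.alt)


def seg_to_pos(v: SegVariant) -> PosVariant:
    # The one-based position is the zero-based start plus one.
    return PosVariant(v.chrom, v.bp_start + 1, v.ref, v.alt)
```

Under these assumptions, a two-base reference at one-based position 100 covers the segment from 99 to 101, and converting back recovers the original row.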
By default the variants are represented in a right-normalized format such that SNPs have one letter notation and InDels have an identical base in the first letter of refcol and altcol. The -nonorm option skips the normalization step and represents the variants with the maximum span of the reference sequence, which depends on the overlap between the reference sequences in the input stream.
Right-normalizing has the benefit of presenting the variations in different rows in a coherent form. This matters because most aligners do not guarantee consistent representations of InDels for sequence reads in repeat regions.
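The general idea of bringing overlapping variants onto a common reference span, and then minimizing them again, can be sketched in Python. This is a simplified illustration under stated assumptions, not GOR's implementation: the function names are hypothetical, positions are one-based, and the minimization step only trims shared bases (it does not shift InDels within repeats).

```python
from typing import List, Tuple

Variant = Tuple[int, str, str]  # (one-based pos, ref, alt)


def merge_span(variants: List[Variant]) -> Tuple[int, str]:
    """Return (start, ref_seq) covering all overlapping or touching variants."""
    ordered = sorted(variants)
    start, ref_seq = ordered[0][0], ordered[0][1]
    for pos, ref, _ in ordered:
        offset = pos - start
        if offset > len(ref_seq):
            raise ValueError("variants do not overlap or touch")
        overlap = min(len(ref_seq) - offset, len(ref))
        if ref_seq[offset:offset + overlap] != ref[:overlap]:
            raise ValueError("conflicting reference bases")
        # Append reference bases that extend past the current span.
        ref_seq += ref[len(ref_seq) - offset:]
    return start, ref_seq


def expand(variant: Variant, start: int, ref_seq: str) -> Variant:
    """Rewrite a variant over the full merged reference span."""
    pos, ref, alt = variant
    offset = pos - start
    new_alt = ref_seq[:offset] + alt + ref_seq[offset + len(ref):]
    return start, ref_seq, new_alt


def trim(variant: Variant) -> Variant:
    """Minimize a variant by trimming shared bases, keeping at least one."""
    pos, ref, alt = variant
    # Drop shared trailing bases.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Drop shared leading bases; an InDel keeps one shared anchor base.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# Two representations of the same "AC" deletion in the repeat GACAC at 99.
rows = [(99, "GAC", "G"), (101, "CAC", "C")]
start, ref_seq = merge_span(rows)
expanded = [expand(v, start, ref_seq) for v in rows]
trimmed = [trim(v) for v in expanded]
```

In this sketch both rows become (99, "GACAC", "GAC") over the merged span, which loosely corresponds to the maximum-span representation, and both trim to (99, "GAC", "G"), so the two rows end up denoted identically.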
The merge span (as defined by the -span option) is capped at 1M base-pairs; however, such a large merge span is not recommended. Instead, consider using VARNORM and the -norm option in VARJOIN.
Usage
gor ... | VARMERGE refcol altcol [ -seg | -nonorm | -span ]
Options
| -seg | The variant is denoted as a segment, e.g. (chr,bpstart,bpstop,ref,call). |
| -nonorm | Do not minimize (normalize) the variant after merging the reference sequences. |
| -span | Maximum merge span. The default is 100bp; the maximum is 1Mb, which is far higher than recommended. |
Examples
For examples of how to use the VARMERGE command, check the chapter on Merging Variants.