
Renaming and dropping annotations in a DVCF

At a glance

 

In some cases, an annotation’s name (rather than its value) changes between the Primary and Luft renditions. This happens upon a REF⇆ALT switch when the annotation name refers to the REF or ALT allele (for example, ALT_F1R2), and upon a strand reversal when the annotation name refers to the strand (for example, ADF). In other cases, the annotation makes no sense in Luft coordinates and should be dropped entirely. Genozip implements dropping by adding a “DROP_” prefix to the annotation’s name.

 

Genozip default annotation renaming and dropping

Annotations that are renamed by default:

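| Annotation | Type | Renamed to | Upon |
|---|---|---|---|
| MAX_AF | INFO | DROP_MAX_AF | REF⇆ALT switch |
| CLNHGVS | INFO | DROP_CLNHGVS | Always |
| ADF | FORMAT | ADR | Strand reversal |
| ADR | FORMAT | ADF | Strand reversal |
| RDF | FORMAT | RDR | Strand reversal |
| RDR | FORMAT | RDF | Strand reversal |
| F1R2 | FORMAT | F2R1 | Strand reversal |
| F2R1 | FORMAT | F1R2 | Strand reversal |
| REF_F1R2 | FORMAT | REF_F2R1 | Strand reversal |
| REF_F1R2 | FORMAT | ALT_F1R2 | REF⇆ALT switch |
| REF_F1R2 | FORMAT | ALT_F2R1 | REF⇆ALT + Strand |
| ALT_F1R2 | FORMAT | ALT_F2R1 | Strand reversal |
| ALT_F1R2 | FORMAT | REF_F1R2 | REF⇆ALT switch |
| ALT_F1R2 | FORMAT | REF_F2R1 | REF⇆ALT + Strand |
| REF_F2R1 | FORMAT | REF_F1R2 | Strand reversal |
| REF_F2R1 | FORMAT | ALT_F2R1 | REF⇆ALT switch |
| REF_F2R1 | FORMAT | ALT_F1R2 | REF⇆ALT + Strand |
| ALT_F2R1 | FORMAT | ALT_F1R2 | Strand reversal |
| ALT_F2R1 | FORMAT | REF_F2R1 | REF⇆ALT switch |
| ALT_F2R1 | FORMAT | REF_F1R2 | REF⇆ALT + Strand |
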
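The default renaming table can be modeled as a simple lookup, which may help when checking what name to expect in the Luft rendition. The following is a minimal Python sketch, not Genozip's internal code: the event labels, the `DEFAULT_RENAMES` dictionary and the `luft_name` function are hypothetical names introduced here. It only matches events that are explicitly listed for an annotation; how Genozip treats a combined event for an annotation that lists only a single-event rule is not covered by the table, so the sketch leaves such names unchanged.

```python
from typing import Optional

# Event labels are hypothetical names used only in this sketch.
STRAND = "strand"   # strand reversal
REFALT = "refalt"   # REF<->ALT switch
BOTH = "both"       # REF<->ALT switch + strand reversal
ALWAYS = "always"

# (type, annotation) -> {event: name in the Luft rendition}
DEFAULT_RENAMES = {
    ("INFO", "MAX_AF"): {REFALT: "DROP_MAX_AF"},
    ("INFO", "CLNHGVS"): {ALWAYS: "DROP_CLNHGVS"},
    ("FORMAT", "ADF"): {STRAND: "ADR"},
    ("FORMAT", "ADR"): {STRAND: "ADF"},
    ("FORMAT", "RDF"): {STRAND: "RDR"},
    ("FORMAT", "RDR"): {STRAND: "RDF"},
    ("FORMAT", "F1R2"): {STRAND: "F2R1"},
    ("FORMAT", "F2R1"): {STRAND: "F1R2"},
    ("FORMAT", "REF_F1R2"): {STRAND: "REF_F2R1", REFALT: "ALT_F1R2", BOTH: "ALT_F2R1"},
    ("FORMAT", "ALT_F1R2"): {STRAND: "ALT_F2R1", REFALT: "REF_F1R2", BOTH: "REF_F2R1"},
    ("FORMAT", "REF_F2R1"): {STRAND: "REF_F1R2", REFALT: "ALT_F2R1", BOTH: "ALT_F1R2"},
    ("FORMAT", "ALT_F2R1"): {STRAND: "ALT_F1R2", REFALT: "REF_F2R1", BOTH: "REF_F1R2"},
}


def luft_name(kind: str, name: str, event: Optional[str]) -> str:
    """Return the annotation name after applying the table for one event.

    `event` is STRAND, REFALT, BOTH, or None when nothing changed.
    Annotations without a matching rule keep their name.
    """
    rules = DEFAULT_RENAMES.get((kind, name), {})
    if event in rules:
        return rules[event]
    # An "always" rule applies regardless of the event.
    return rules.get(ALWAYS, name)
```

For example, `luft_name("FORMAT", "REF_F1R2", BOTH)` yields `ALT_F2R1`, matching the last REF_F1R2 entry of the table.
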
The --dvcf-rename option

Annotations may be renamed by specifying the --dvcf-rename command line option, together with --chain, for example:

 

genozip myfile.vcf --chain mychain.chain.genozip --dvcf-rename="FORMAT/ADF:STRAND>ADR|REFALT>DROP_ADF"

The argument is a comma-separated list of the annotations to be renamed (this example contains only one annotation, FORMAT/ADF):

- The annotation name (FORMAT/ADF in this case) can be the name only (e.g. ADF), or prefixed with INFO/ or FORMAT/ to resolve ambiguity.

- The rules for renaming the particular annotation are specified as a | (pipe)-separated list to the right of the colon. In the example above, we have two rules: STRAND>ADR and REFALT>DROP_ADF.

- Each rule consists of an event and a destination annotation name, separated by a > (greater-than) character. The event can be one of the following four:

| Rule | Rule activated upon |
|---|---|
| STRAND | Strand reversal |
| REFALT | REF⇆ALT switch |
| TLAFER | Concurrent strand reversal and REF⇆ALT switch |
| ALWAYS | Always |

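The argument syntax described above (comma-separated annotations, a colon, then pipe-separated `EVENT>NAME` rules) can be illustrated with a small parser. This is a sketch written for this page, not Genozip's own parsing code; the function name `parse_dvcf_rename` and its error messages are hypothetical.

```python
from typing import Dict

# The four events listed in the table above.
EVENTS = {"STRAND", "REFALT", "TLAFER", "ALWAYS"}


def parse_dvcf_rename(arg: str) -> Dict[str, Dict[str, str]]:
    """Parse a --dvcf-rename argument into {annotation: {event: new_name}}."""
    result: Dict[str, Dict[str, str]] = {}
    for item in arg.split(","):              # one item per annotation
        annotation, colon, rules_text = item.partition(":")
        if not annotation or not colon:
            raise ValueError(f"expected ANNOTATION:RULES, got {item!r}")
        rules: Dict[str, str] = {}
        for rule in rules_text.split("|"):   # one rule per event
            event, gt, dest = rule.partition(">")
            if event not in EVENTS or not gt or not dest:
                raise ValueError(f"bad rule {rule!r} for {annotation}")
            rules[event] = dest
        result[annotation] = rules
    return result


parsed = parse_dvcf_rename("FORMAT/ADF:STRAND>ADR|REFALT>DROP_ADF")
print(parsed)
```

Applied to the example argument, the parser returns one entry for `FORMAT/ADF` with two rules: `STRAND` mapped to `ADR` and `REFALT` mapped to `DROP_ADF`.
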
The --dvcf-drop option

Annotations may be dropped with the --dvcf-drop command line option, for example:

 

genozip myfile.vcf --chain mychain.chain.genozip --dvcf-drop=INFO/MAX_AF:REFALT

This is equivalent to --dvcf-rename="INFO/MAX_AF:REFALT>DROP_MAX_AF".
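The equivalence between the two options can be expressed as a one-line string transformation. The sketch below is illustrative only: `drop_to_rename` is a hypothetical helper, and it handles just the single `ANNOTATION:EVENT` form shown in the example, since other forms of the --dvcf-drop argument are not described here.

```python
def drop_to_rename(drop_arg: str) -> str:
    """Turn a --dvcf-drop value such as 'INFO/MAX_AF:REFALT' into the
    equivalent --dvcf-rename value."""
    annotation, colon, event = drop_arg.partition(":")
    if not annotation or not colon or not event:
        raise ValueError(f"expected ANNOTATION:EVENT, got {drop_arg!r}")
    # The DROP_ prefix goes on the bare name, without the INFO/ or FORMAT/ prefix.
    bare_name = annotation.split("/")[-1]
    return f"{annotation}:{event}>DROP_{bare_name}"


rename_arg = drop_to_rename("INFO/MAX_AF:REFALT")
print(rename_arg)
```
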

To override Genozip’s default renaming, just rename the tag to itself, for example:

--dvcf-rename="INFO/MAX_AF:ALWAYS>MAX_AF"
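When the override is issued from a script, note that the rule syntax uses `>` and `|`, which command shells treat as redirection and pipe characters. One way to sidestep quoting issues is to build the command as an argument list, as in this sketch. The helper `genozip_rename_command` is hypothetical; the file names are the ones used in the examples on this page.

```python
import shlex
from typing import List


def genozip_rename_command(vcf: str, chain: str, rename_spec: str) -> List[str]:
    """Build an argument list for genozip with a --dvcf-rename specification."""
    return ["genozip", vcf, "--chain", chain, f"--dvcf-rename={rename_spec}"]


cmd = genozip_rename_command(
    "myfile.vcf", "mychain.chain.genozip", "INFO/MAX_AF:ALWAYS>MAX_AF"
)

# shlex.join adds the quoting a POSIX shell would need to pass the
# argument through unchanged.
printable = shlex.join(cmd)
print(printable)
```

Passing `cmd` as a list to `subprocess.run` avoids the shell altogether, so no quoting is needed in that case; the quoted form printed by `shlex.join` is only for pasting into a POSIX shell.
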

The --show-rename-tags option

--show-rename-tags can be used in combination with --chain or when compressing a DVCF file, to display the list of annotations that are to be renamed.

Questions? support@genozip.com
