
Compress your genomic data
Save terabytes to petabytes
- Widely used at hundreds of institutions, hospitals and companies (our users)
- Best available compression for a wide range of use cases (benchmarks, publications)
- Compresses all common genomic formats: FASTQ, BAM, VCF, GFF, FASTA and more
- Lossless, verified by MD5 (see here)
- Affordably priced; free for certain research applications (see here)
Examples
Example of compressing paired-end FASTQ files. These are Illumina NovaSeq 30X WGS files:
[20:23:56]$ genozip GFX0241869_SA_L001_R1_001.fastq.gz GFX0241869_SA_L001_R2_001.fastq.gz --pair --reference GRCh38_full_analysis_set_plus_decoy_hla.fa
genozip GFX0241869_SA_L001_R1_001.fastq.gz : Done (2 minutes 50 seconds)
genozip GFX0241869_SA_L001_R2_001.fastq.gz : Done (6 minutes 30 seconds, FASTQ compression ratio: 21.8 - better than .fastq.gz by a factor of 4.5)
testing: genounzip GFX0241869_SA_L001_R1_001.fastq.gz : verified as identical to the original FASTQ
testing: genounzip GFX0241869_SA_L001_R2_001.fastq.gz : verified as identical to the original FASTQ
[20:35:04]$
[20:35:06]$ ls -nh *fastq*
-rw-------+ 1 100 100 29G Oct 12 2020 GFX0241869_SA_L001_R1_001.fastq.gz
-rw-------+ 1 100 100 33G Oct 12 2020 GFX0241869_SA_L001_R2_001.fastq.gz
-rw-rw-r--+ 1 100 100 14G Jan 12 20:33 GFX0241869_SA_L001_R1+2_001.fastq.genozip
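The "factor of 4.5" over .fastq.gz can be roughly cross-checked against the listing above. The following is a minimal sketch, not part of Genozip: the helper name `size_ratio` is hypothetical, and the sizes are the rounded values printed by `ls -nh`, so the result is only approximate.

```python
def size_ratio(original_sizes_gb, compressed_size_gb):
    """Return the combined original size divided by the compressed size."""
    if compressed_size_gb <= 0:
        raise ValueError("compressed size must be positive")
    return sum(original_sizes_gb) / compressed_size_gb


# Sizes in GB as shown in the listing: the R1 and R2 .fastq.gz files,
# and the single paired .genozip file that replaces both of them.
fastq_ratio = size_ratio([29, 33], 14)
print(f"{fastq_ratio:.1f}")  # prints 4.4
```

The small gap between 4.4 and the reported 4.5 is presumably due to the rounding of the listed sizes. The reported ratio of 21.8 is relative to the uncompressed FASTQ data, whose size does not appear in the listing, so it cannot be reproduced this way.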
Example of compressing a BAM file. This is the same NovaSeq data as above, aligned with bwa mem:
[20:56:02]$ genozip GFX0241869.h38.bam --reference GRCh38_full_analysis_set_plus_decoy_hla.fa
genozip GFX0241869.h38.bam : Done (11 minutes 26 seconds, BAM compression ratio: 3.8)
testing: genounzip GFX0241869.h38.bam.genozip : verified as identical to the original BAM
[21:20:02]$
[21:20:02]$ ls -nh GFX0241869.h38.bam*
-rw-------+ 1 100 100 56G Apr 10 2022 GFX0241869.h38.bam
-rw-rw-r--+ 1 100 100 15G Jan 12 21:07 GFX0241869.h38.bam.genozip
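When many samples need to be compressed, the two commands shown above can be assembled from a script. This is a minimal sketch under stated assumptions: it uses only the `--pair` and `--reference` options that appear in the examples, the function names `build_genozip_command` and `run_genozip` are hypothetical, and it assumes `genozip` is installed and on the PATH.

```python
import shlex
import subprocess
from typing import List, Sequence


def build_genozip_command(inputs: Sequence[str], reference: str,
                          paired: bool = False) -> List[str]:
    """Build the argument list for a genozip call like those shown above."""
    if not inputs:
        raise ValueError("at least one input file is required")
    if paired and len(inputs) != 2:
        raise ValueError("paired mode expects exactly two FASTQ files")
    cmd = ["genozip", *inputs]
    if paired:
        cmd.append("--pair")  # same option as in the FASTQ example
    cmd += ["--reference", reference]
    return cmd


def run_genozip(cmd: List[str]) -> None:
    """Run the command; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    bam_cmd = build_genozip_command(
        ["GFX0241869.h38.bam"],
        "GRCh38_full_analysis_set_plus_decoy_hla.fa",
    )
    # Print the command instead of running it, so the sketch is safe to try.
    print(shlex.join(bam_cmd))
```

Passing the arguments as a list avoids shell quoting problems with unusual file names. Call `run_genozip(bam_cmd)` to actually execute the compression.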
Contact
Sales inquiries: sales@genozip.com
Technical questions: support@genozip.com
All other inquiries: info@genozip.com