What Our Users Say
"The Institute of Genomic Medicine's (IGM) Bioinformatics Core, situated within the Columbia University Irving School of Medicine, manages a variant warehouse containing approximately 130,000 whole-genome sequencing (WGS) and whole-exome sequencing (WES) samples. This warehouse serves the dual purpose of gene discovery and diagnostic analysis and has been utilized in numerous published analyses. Additionally, the IGM acts as a long-term repository for original off-machine FASTQ files of internally and externally sequenced samples, which must be preserved in their original form.
After an extensive evaluation of the cost, compute, and compression benefits of multiple options, we decided on the Genozip Premium package.
We applied the lossless Genozip compression on approximately 172,000 of our most recent internally stored FASTQ pairs. This reduced their data footprint from 537.4 TB to 115.6 TB, resulting in an average space savings of 78.5%. Not only did this significantly reduce storage costs, but it also facilitated the migration of the entire dataset to our cloud infrastructure.
I can highly recommend Genozip to any organization looking to reduce the storage footprint of their FASTQ files." (ref)
Daniel S. T. Hughes MBioch (Hons; Oxford) PhD (Cambridge)
Director of Bioinformatics, Institute of Genomic Medicine
"I form part of the team working on the TargetID research project at the University of Malta, which is using RNA-seq and whole genome sequencing to find suitable drug targets that will prevent COVID-19 induced cytokine storms. Our pipelines were generating data at higher rates than our initial estimate, and we needed to free up some storage. Standard file compressors weren’t cutting it, and while researching file-specific compressors I encountered ‘Genozip’. At first I was sceptical of its claimed compression ratios and ability to compress CRAM files, but I was able to reproduce these results on our own files. We have now incorporated Genozip into our pipelines, which has reduced our initial estimate of 205 TB to a new estimate of 55 TB. I was pleasantly surprised with Genozip and I recommend it to those encountering similar storage space issues."
University of Malta
"Dramatic decreases in sequencing costs and advances in analysis techniques have significantly increased the amount of data obtained by NGS. However, the cost of data storage has not decreased in the same way, which is a major problem in sequence analysis. I am currently working as a member of a team at Kyoto University on single-cell and whole-genome analysis projects, and we have been struggling with the rapidly increasing amount of sequence data. Our collaborator introduced Genozip to us, and we carefully verified its performance. We were very impressed by the speed, high compression ratio, high versatility, ease of implementation, and quick support, and are now using Genozip to compress all our BAM and FASTQ files. The speed of compression and decompression is very important for daily use, and although we used to compress files only for long-term storage, we are now compressing files for short-term storage as well, which has resulted in a great reduction in the amount of data used."
Masahiro M Nakagawa M.D., Ph.D.
"Metagenomic pipelines can generate tons of intermediate files, some of which you want to retain until your analyses are complete. It's not uncommon for 100 GB of sequencing reads to balloon to 1 TB or more after assembly, alignment, and binning. Genozip is fast, efficient, and can be incorporated seamlessly to curtail file bloat in any metagenomics workflow. Even FASTQs that have been previously compressed by gzip are made smaller by Genozip. It really is the best software to manage genomics data."
Albert Vill, PhD