Monthly Archives: December 2019

STAR index for human genome – overcoming the hardware barriers

Recently I was testing a Docker image to run a container for next-generation sequencing, as a way to test an existing “pipeline” on the first published study of the effect of the Zika virus (https://hub.docker.com/r/maayanlab/zika/). Running a Docker container may make reproducibility easier, but sometimes there are also hardware barriers that need to… Read More »

Down-sampling FASTQ.gz paired ends

Downsampling. I searched for ways to create a set of down-sampled data from an actual large dataset, and while there are many creative suggestions on BioStar and other forums, I find that the most versatile and easy-to-use tool is one recommended on the forums: seqtk, which is available on GitHub: github.com/lh3/seqtk Quoting… Read More »
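
For readers who want to see the idea behind paired-end down-sampling, here is a minimal Python sketch. It is not how seqtk is implemented, and the function names (`read_fastq`, `downsample_pairs`) and the default seed are assumptions made up for illustration. It reads both gzipped FASTQ files in lockstep and reservoir-samples whole pairs, so each kept read stays with its mate.

```python
import gzip
import random
from itertools import islice


def read_fastq(handle):
    """Yield one FASTQ record at a time as a tuple of 4 newline-terminated lines."""
    while True:
        record = tuple(line.rstrip("\r\n") + "\n" for line in islice(handle, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def downsample_pairs(r1_in, r2_in, r1_out, r2_out, n, seed=100):
    """Reservoir-sample up to n read pairs from two gzipped FASTQ files.

    Hypothetical helper: both inputs are assumed to list mates in the same
    order. If one file is shorter, the extra records in the other are ignored.
    Returns the number of pairs written.
    """
    rng = random.Random(seed)
    reservoir = []
    with gzip.open(r1_in, "rt") as f1, gzip.open(r2_in, "rt") as f2:
        for i, pair in enumerate(zip(read_fastq(f1), read_fastq(f2))):
            if i < n:
                # Fill the reservoir with the first n pairs.
                reservoir.append(pair)
            else:
                # Replace a random slot with probability n / (i + 1).
                j = rng.randint(0, i)
                if j < n:
                    reservoir[j] = pair
    with gzip.open(r1_out, "wt") as o1, gzip.open(r2_out, "wt") as o2:
        for rec1, rec2 in reservoir:
            o1.writelines(rec1)
            o2.writelines(rec2)
    return len(reservoir)
```

The sketch holds only the n sampled pairs in memory, which matters for large datasets, but the output order is not the input order. For real work the seqtk tool mentioned above is the better choice.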

Hunting for SRA sequence archives

SRA: Sequence Read Archive. The Sequence Read Archive (SRA) makes biological sequence data available to the research community to enhance reproducibility and allow for new discoveries by comparing data sets. The SRA stores raw sequencing data and alignment information from high-throughput sequencing platforms, […] However, it is rather difficult to even find the download links… and even… Read More »