Category Archives: Uncategorized

Bioawk for handling bioinformatics formats

Today I found a new tool, bioawk, written by Heng Li, who also wrote samtools and bwa. I first discovered it on this blog: bioawk-basics (Bioinformatics Workbooks). There is also a short tutorial on GitHub: github.com/vsbuffalo/bioawk-tutorial. I also found a recent Docker image; in fact there are only 2 images on Docker Hub: lbmc/bioawk, updated 2 months ago,…

A Boy And His Atom: The World’s Smallest Movie

Moving atoms: For some reason a paper copy of “Chemical and Engineering News” (November 11, 2019 – Vol 97 Issue 44) ended up in my hands, and I almost missed this fun section named “30 years of moving atoms: How scanning probe microscopes revolutionized nanoscience” (link). The article progresses through time from 1993 til…

GREP was written overnight – Birth and Name

I use grep very often, and I made up an acronym that made sense to me: Get REgular ExPression. But I discovered this YouTube video that gives an accurate historical account of its birth and where its name came from. See the video below, titled “Where GREP Came From – Computerphile”. Summary: this comes from the command g/re/p…

STAR index for human genome – overcoming the hardware barriers

Recently I was testing a Docker image to run a container for next-generation sequencing, as a way to test an existing “pipeline” on the first published study of the effect of the Zika virus (https://hub.docker.com/r/maayanlab/zika/). Running a Docker container may make reproducibility easier, but sometimes there are also hardware barriers that need to…

Down-sampling FASTQ.gz paired ends

Downsampling: I searched for a way to create a down-sampled data set from an actual large dataset, and while there is much creative information on BioStar and other forums, I find that the most versatile and easy-to-use tool is one recommended on the forums: seqtk, which is available on GitHub: github.com/lh3/seqtk. Quoting…

Hunting for SRA sequence archives

SRA: Sequence Read Archive. “The Sequence Read Archive (SRA) makes biological sequence data available to the research community to enhance reproducibility and allow for new discoveries by comparing data sets. The SRA stores raw sequencing data and alignment information from high-throughput sequencing platforms, […]” However, it is rather difficult to even find the download links… and even…

Docker tutorials for Biologists

I have started a series of tutorials written from the perspective of a biologist who wants to use a Docker container for a specific application. An easy example could be using EMBOSS, the European Molecular Biology Open Software Suite. The tutorials are online at the Biochemistry department here: Docker tutorials (general page), Docker…

asciinema: record commands in terminal

RE: asciinema.org (Linux/macOS). It may be nice to share or show commands being typed in a text terminal and embed this simple “movie” within a blog or HTML page. It seems that the recording gets uploaded to their web site… Since it’s all text-based, the file should be rather small and the clarity of replay very good compared…