AI is at the forefront of societal change. We are at the dawn of a new world. Let’s hope that it will NOT be a “Brave New World”. Today I discovered a brand-new AI platform through a LinkedIn post by the group “Generative AI”, which has 1.3M followers.
It is called “youai.ai” and offers mostly free “mini-AI” applications. For the purpose of this blog I tried the free blog-post-generator (Remix). (Remix means that another user made some changes to an existing, previous version.)
The “mini-AIs” are created by developers (or anyone who registers for free) and run on a back-end engine that can be ChatGPT 3.5, 4, or Claude 1, Claude 2; other engines will be added in the future.
Short video demonstrations are available on their YouTube channel: https://www.youtube.com/@YouAi_ai/videos
Steps
This AI blog generator has some automated steps that help create a better-targeted post: it asks a few questions and requests a sample of previous writing, so that it can hopefully write in a similar style. I provided a few previous posts from DNA.today. Finally I asked the AI to write a short blog (I asked for 250 words) to answer the question:
How can I analyze RNA-Seq data from E.Coli
Below is the provided answer, which includes the title, the blog (which includes code), and a summary of the “thinking of the AI” that made it write the content of the post.
Interestingly, the post “evolved” as it was written… I believe that I saw hisat2 within the program list at first, but it is no longer in the final blog.
Copying does not preserve the formatting well enough to paste directly here, but pasting first into MacDown.app on the Mac helps. Here is the blog, unedited except for color added to code for clarity:
Analyzing RNA-Seq Data of E.Coli Using Bash Commands
by DNA TODAY
RNA sequencing (RNA-Seq) is a popular technique for studying gene expression and regulation at a genome-wide level. Analyzing RNA-Seq data accurately and efficiently is important for gaining biological insights. For small datasets, bash commands can be useful for a quick analysis without needing complex bioinformatics pipelines.
To get started, download E.Coli RNA-Seq data from the NCBI SRA database using the SRA Toolkit. Convert the SRA files to FASTQ format, the standard format for storing raw sequencing reads. The fastq-dump tool can be used for this conversion.
Align the reads to the E.Coli reference genome with a spliced aligner like HISAT2. The aligned reads in SAM format can then be sorted and converted to BAM, a binary and compressed version of SAM, using samtools sort and samtools view -bS.
To check the mapping quality, use samtools flagstat to get summary stats like the percent of reads that mapped to the genome. The samtools depth tool can generate a genome coverage plot to check for evenness of coverage across the genome.
For quantifying gene expression, use htseq-count with the E.Coli GFF3 gene annotation file. This will output raw gene counts that can be normalized and used for differential expression analysis.
In a nutshell, with a few bash commands and efficient bioinformatics tools, an RNA-Seq analysis of prokaryotic data can be performed. For analyzing larger eukaryotic datasets, a more comprehensive bioinformatics pipeline may be necessary. But for a quick look at your data, bash commands can be quite useful!
Thinking Overview
- I chose a title, opening paragraph, and closing paragraph in an accurate and concise style without excess fluff, as per the user’s description of their writing style.
- I focused on providing a high-level overview of core RNA-Seq analysis steps like read alignment, quality control, and gene quantification using tools commonly used for prokaryotic data.
- I described each step briefly without delving into too much detail, keeping the word count around 250 words as requested. The content is aimed at the target audience with knowledge of RNA-Seq and bioinformatics.
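The generated post names each tool but, as pasted here, shows no complete command lines. As a rough sketch, the steps it describes could be chained as below. This is an illustration added here, not output from the AI. The accession, file names and index prefix are placeholders, and single-end reads are assumed. The sketch passes --no-spliced-alignment to hisat2 because bacterial transcripts are not spliced, and it lets samtools sort write BAM directly instead of calling samtools view -bS first; both are choices made for this sketch, not something the post says. The htseq-count options (-t gene -i ID) are a guess at what suits a typical bacterial GFF3 file and may need adjusting for a given annotation.

```shell
#!/usr/bin/env bash
# Sketch only: every name below is a placeholder, not a value from the post.
set -eu

ACCESSION="${ACCESSION:-SRR_PLACEHOLDER}"  # hypothetical SRA run accession
GENOME_FA="${GENOME_FA:-ecoli.fa}"         # hypothetical reference genome FASTA
ANNOTATION="${ANNOTATION:-ecoli.gff3}"     # hypothetical GFF3 annotation file
INDEX="${INDEX:-ecoli_index}"              # hypothetical HISAT2 index prefix

fetch_reads() {
  # Download the run and convert it to FASTQ in the current directory
  # (assumes single-end reads, so one file named <accession>.fastq).
  fastq-dump --outdir . "$ACCESSION"
}

align_reads() {
  # Build the HISAT2 index, then align the reads to it.
  # Spliced alignment is switched off because E. coli has no introns.
  hisat2-build "$GENOME_FA" "$INDEX"
  hisat2 --no-spliced-alignment -x "$INDEX" \
    -U "$ACCESSION.fastq" -S "$ACCESSION.sam"
}

sort_alignments() {
  # samtools sort reads the SAM file and writes a coordinate-sorted BAM file.
  samtools sort -o "$ACCESSION.sorted.bam" "$ACCESSION.sam"
}

check_mapping() {
  # flagstat: summary counts, including the percentage of mapped reads.
  samtools flagstat "$ACCESSION.sorted.bam" > "$ACCESSION.flagstat.txt"
  # depth: per-position read depth as a text table (-a keeps zero-depth positions).
  samtools depth -a "$ACCESSION.sorted.bam" > "$ACCESSION.depth.txt"
}

count_genes() {
  # Raw read counts per gene; feature type and ID attribute are assumptions.
  htseq-count -f bam -r pos -t gene -i ID \
    "$ACCESSION.sorted.bam" "$ANNOTATION" > "$ACCESSION.counts.txt"
}

main() {
  fetch_reads
  align_reads
  sort_alignments
  check_mapping
  count_genes
}

# Run the whole pipeline only when explicitly requested.
if [ "${RUN_PIPELINE:-0}" = "1" ]; then
  main
fi
```

Saved under a name such as ecoli_rnaseq.sh (hypothetical), it could be started with RUN_PIPELINE=1 and a real run accession in ACCESSION. Without RUN_PIPELINE=1 the script only defines the functions, which makes it easy to try one step at a time. Note that samtools depth writes a table of per-position depths; turning that into a coverage plot needs a separate plotting step.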