Anna Syme

A very brief and very basic overview

Not necessarily the best way; certainly not the only way!


Tools and code

#Data: Nanopore long reads; Illumina short reads; in fastq format

#Assemble long reads (raw, untrimmed, unfiltered) 
flye --nano-raw nanopore.fastq --genome-size 1000000000 --out-dir flye-assembly  

less flye-assembly/assembly_info.txt
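#A quick summary can be easier to read than paging through the whole file.
#A minimal sketch, assuming the usual Flye layout of assembly_info.txt: a header line starting with "#",
#then tab-separated columns with the contig length in column 2 and the circular flag (Y/N) in column 4.
#The function name summarise_assembly_info is made up for this example.

```shell
# Summarise assembly_info.txt: number of contigs, total length,
# and number of contigs flagged as circular.
# Assumption: tab-separated, length in column 2, circular flag (Y/N) in column 4.
summarise_assembly_info() {
    awk -F'\t' '
        /^#/ { next }                          # skip the header line
        { n++; total += $2; if ($4 == "Y") circ++ }
        END { printf "contigs=%d total_bp=%.0f circular=%d\n", n, total, circ }
    ' "$1"
}

# Example use once Flye has finished:
# summarise_assembly_info flye-assembly/assembly_info.txt
```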

#View assembly_graph.gfa in Bandage to see how the contigs are connected
#Download Bandage and run it locally. In Bandage: File > Load graph, then Draw graph.
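#Before downloading the graph, it can help to check how big it is.
#A rough sketch, assuming the graph is in GFA format with tab-separated lines where
#"S" lines are segments (contigs or edges) and "L" lines are links between them.
#The function name summarise_gfa is made up for this example.

```shell
# Count segments (S lines) and links (L lines) in a GFA file.
summarise_gfa() {
    awk -F'\t' '
        $1 == "S" { segs++ }
        $1 == "L" { links++ }
        END { printf "segments=%d links=%d\n", segs, links }
    ' "$1"
}

# Example use:
# summarise_gfa flye-assembly/assembly_graph.gfa
```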

#Polish assembly with racon - using long reads
#First map the reads to the assembly 
minimap2 -x map-ont flye-assembly/assembly.fasta nanopore.fastq | gzip > overlaps.paf.gz
#Then use the reads, the overlaps, and the assembly to make polished assembly
racon nanopore.fastq overlaps.paf.gz flye-assembly/assembly.fasta > raconpolish1.fasta
#Can repeat, e.g. 4 times: each round, map the reads to the latest polished assembly, then run racon on it
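#Repeating the racon step by hand gets tedious, so here is one possible loop.
#A minimal sketch, not the only way: it reuses the minimap2 and racon commands above unchanged.
#The function name polish_with_racon and the per-round file names (overlapsN.paf.gz, raconpolishN.fasta)
#are made up for this example.

```shell
# Run several rounds of racon; each round maps the reads to the
# assembly produced by the previous round.
# Arguments: reads, starting assembly, number of rounds.
# Prints the name of the final polished assembly.
polish_with_racon() (
    reads=$1
    assembly=$2
    rounds=$3
    i=1
    while [ "$i" -le "$rounds" ]; do
        # Map the reads to the current assembly
        minimap2 -x map-ont "$assembly" "$reads" | gzip > "overlaps$i.paf.gz" || exit 1
        # Polish the current assembly with those overlaps
        racon "$reads" "overlaps$i.paf.gz" "$assembly" > "raconpolish$i.fasta" || exit 1
        # The next round starts from this round's output
        assembly="raconpolish$i.fasta"
        i=$((i + 1))
    done
    echo "$assembly"
)

# Example use, 4 rounds:
# polish_with_racon nanopore.fastq flye-assembly/assembly.fasta 4
```

#If you polish for more than one round, give the next step the last file (e.g. raconpolish4.fasta) rather than raconpolish1.fasta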

#Polish assembly with medaka - using long reads
#Note: medaka chooses its model based on how the reads were sequenced and basecalled:
#the pore type, the sequencing device (MinION or PromethION), the basecaller variant, and the basecaller version
#If no model is specified, a default model is used (as in this example)
medaka_consensus -i nanopore.fastq -d raconpolish1.fasta -o medaka
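#If you do know the sequencing details, the model can be passed with the -m option of medaka_consensus.
#A small sketch, assuming a wrapper function (run_medaka, a made-up name) that adds -m only when a model is given.
#MODEL_NAME below is a placeholder, not a real model; the names available depend on your medaka version.

```shell
# Wrapper around medaka_consensus that passes -m only when a model is given.
# Arguments: reads, draft assembly, output directory, [model name].
run_medaka() (
    reads=$1
    draft=$2
    outdir=$3
    model=${4:-}
    if [ -n "$model" ]; then
        medaka_consensus -i "$reads" -d "$draft" -o "$outdir" -m "$model"
    else
        # No model given: medaka falls back to its default
        medaka_consensus -i "$reads" -d "$draft" -o "$outdir"
    fi
)

# List the models your medaka version knows about:
# medaka tools list_models
# Then replace MODEL_NAME (placeholder) with one of them:
# run_medaka nanopore.fastq raconpolish1.fasta medaka MODEL_NAME
```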

#Polish assembly further with pilon - using short reads
#The Illumina reads are more accurate, so they can correct errors remaining in the nanopore assembly
#But first filter them to keep only the really good ones
bbduk.sh in="illumina.fastq.gz" out=bbdukked_illumina.fastq minlen=110 k=25 mink=8 ktrim=r ref="illumina_adapters.fa" hdist=1 overwrite=f qtrim=rl trimq=35 t=auto lhist="lhist.txt" > "bbduk_log.txt" 2>&1
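#The filtering settings here are strict, so it is worth checking how many reads survive.
#A minimal sketch, assuming well-formed fastq with exactly 4 lines per read.
#The function name count_fastq_reads is made up for this example.

```shell
# Count reads in a fastq file (4 lines per read); handles gzipped input.
count_fastq_reads() {
    case "$1" in
        *.gz) gzip -cd "$1" ;;
        *)    cat "$1" ;;
    esac | awk 'END { printf "%.0f\n", NR / 4 }'
}

# Example use, before and after filtering:
# count_fastq_reads illumina.fastq.gz
# count_fastq_reads bbdukked_illumina.fastq
```

#If very few reads are left, there may not be enough coverage for pilon to correct much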
#Map the Illumina reads to the polished assembly
bwa index medaka/consensus.fasta
bwa mem medaka/consensus.fasta bbdukked_illumina.fastq | samtools sort > aln.bam
#Index the bam file and the fasta file, then run pilon
samtools index aln.bam
samtools faidx medaka/consensus.fasta
pilon --genome medaka/consensus.fasta --frags aln.bam --output pilon1 --fix bases --mindepth 0.5 --changes --verbose

#Polished assembly is: pilon1.fasta
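#To see what each polishing step did to the assembly size, you can compare the fasta files.
#A rough sketch that only counts sequences and bases; it says nothing about accuracy.
#The function name fasta_stats is made up for this example.

```shell
# Print the number of sequences and total bases in a fasta file.
fasta_stats() {
    awk -v name="$1" '
        /^>/ { n++; next }                          # header line: one more sequence
        { gsub(/[ \t\r]/, ""); bases += length($0) } # sequence line: add its length
        END { printf "%s sequences=%d bases=%.0f\n", name, n, bases }
    ' "$1"
}

# Example use, comparing the assembly at each stage:
# fasta_stats flye-assembly/assembly.fasta
# fasta_stats raconpolish1.fasta
# fasta_stats medaka/consensus.fasta
# fasta_stats pilon1.fasta
```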