MetaCerberus Tutorial - What’s in the Results folder?

Now that we’ve run the MetaCerberus pipeline, let’s take a look at the results folder. Having run the pipeline with the --super option on my data, the results folder looks like this:

_images/results_folder.jpg

Now a closer look at each subdirectory of our results:

Step_5-format

_images/step_5_outputs.jpg

The Step 5 directory contains only a file named complete, which simply indicates that Step 5 ran to completion.
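If you want to check for this marker from a script rather than by eye, a minimal sketch follows. It assumes the marker file is literally named complete, as the listing above suggests. The function name step_completed and its parameters are hypothetical and are not part of MetaCerberus.

```python
from pathlib import Path


def step_completed(step_dir, marker_name="complete"):
    """Return True if step_dir contains the completion marker file.

    marker_name defaults to "complete", which is an assumption based on
    the directory listing shown in this tutorial.
    """
    return (Path(step_dir) / marker_name).is_file()
```

Call it with the path to the Step 5 directory inside your own results folder. A return value of False means the marker is missing, so the step may not have finished.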

Step_6-metaomeQC

Here are the contents of step_06-metaomeQC:

_images/S6_contents.jpg

The file read-stats.txt contains statistics for your input file, like so:

_images/step6-read-stats-txt.jpg

Note

The file stderr.out is a log file where any error messages will be stored.

Step_7-geneCall

Contents of the step_07-geneCall directory are:

_images/Step7_contents.jpg

These are protein files in different formats.

Step_8-hmmer

Contents of the step_08-hmmer directory are:

_images/Step8_contents.jpg

For your MetaCerberus run, you should get a subdirectory for each mode that MetaCerberus used (FragGeneScan, Prodigal, Prodigalgv, etc.). In this example run, we have several output files for FragGeneScan. The types of outputs are similar for Prodigal.

This is what they look like:

Note

.tsv files can be opened with Excel.
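If you would rather inspect a .tsv file from a script than from a spreadsheet, the sketch below reads one with Python's standard csv module. It assumes the first line of the file is a header row, which may not hold for every output file. The function name read_tsv and its max_rows parameter are hypothetical.

```python
import csv


def read_tsv(path, max_rows=None):
    """Read a tab-separated file and return (column_names, rows).

    Each row is a dict keyed by column name. The first line is assumed
    to be a header row. If max_rows is given, reading stops after that
    many data rows.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        rows = []
        for index, row in enumerate(reader):
            if max_rows is not None and index >= max_rows:
                break
            rows.append(row)
        return reader.fieldnames, rows
```

Passing a small max_rows is a quick way to preview the column names and the first few records of a large file.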

KOFam_all_FOAM-FragGeneScan_Lambda_phage_sequences.tsv:

_images/S8-KOFam_FOAM_FGS_tsv.jpg

KOFam_all_KEGG-FragGeneScan_Lambda_phage_sequences.tsv

_images/S8-FGS-KOFam-KEGG_tsv.jpg

filtered-KOFam_all_FOAM.tsv

_images/S8-filtered-KOFam-FOAM_tsv.jpg

filtered-KOFam_all_KEGG.tsv

_images/S8-filtered-KOFam-allKEGG_tsv.jpg

filtered.tsv

_images/S8-filtered_tsv.jpg

Step_9-parse

The contents of step_09-parse are:

_images/S9_contents.jpg

Looking a little closer:

HMMER-KOFam_all_FOAM_top_5.tsv

_images/S9-HMMR-KOFam_allFOAM_top5_tsv.jpg

HMMER-KOFam_all_KEGG_top_5.tsv

_images/S9-HMMR_KOFam_allKEGG_top5_tsv.jpg

HMMER_BH_KOFam_all_FOAM_rollup2.tsv

_images/S9-HMMR_BH_KOFam_FOAM_rollup2_tsv.jpg

HMMER_BH_KOFam_all_KEGG_rollup2.tsv

_images/S9-HMMR_BH_KOFam_all_KEGG_rollup2_tsv.jpg

HMMER_top_5.tsv

_images/S9_HMMR_top_5.tsv.jpg

KOFam_all_FOAM-rollup_counts.tsv

_images/S9_KOFam_all_FOAM_rollup_counts_tsv.jpg

KOFam_all_KEGG-rollup_counts.tsv

_images/S9-KOFam_allKEGG_rollup_counts_tsv.jpg

counts_KOFam_all_FOAM.tsv

_images/S9-counts_KOFam_allFOAM_tsv.jpg

counts_KOFam_all_KEGG.tsv

_images/S9-counts_KOFam_all_KEGG_tsv.jpg

top_5-FragGeneScan_Lambda_phage_sequences.tsv

_images/S9-top5_FGS_tsv.jpg

Step_10-visualizeData

The contents of step_10-visualizeData are:

_images/S10-contents.jpg

What’s in the FragGeneScan and Prodigal subdirectories?

_images/S10-FGS-Prod-contents.jpg

Files under FragGeneScan or Prodigal:

KOFam_all_FOAM_level-1.tsv

_images/S10-KOFam_all_FOAM_level-1_tsv.jpg

KOFam_all_FOAM_level-2.tsv

_images/S10-KOFam_all_FOAM_lvl2_tsv.jpg

KOFam_all_FOAM_level-3.tsv

_images/S10-KOFam_all_FOAM_lvl3_tsv.jpg

KOFam_all_FOAM_level-4.tsv

_images/S10-KOFam_all_FOAM_lvl4_tsv.jpg

KOFam_all_FOAM_level-id.tsv

_images/S10_KOFam_all_FOAM_lvl_id_tsv.jpg

KOFam_all_KEGG_level-1.tsv

_images/S10-KOFam_all_KEGG_lvl1_tsv.jpg

KOFam_all_KEGG_level-2.tsv

_images/S10-KOFam_all_KEGG_lvl2_tsv.jpg

KOFam_all_KEGG_level-3.tsv

_images/S10_KOFam_all_KEGG_lvl3_tsv.jpg

KOFam_all_KEGG_level-id.tsv

_images/S10_KOFam_all_KEGG_lvl-ID_tsv.jpg

fasta_stats.txt

_images/S10_fasta_stats_txt.jpg

sunburst_KOFam_all_FOAM.html — open in web browser

_images/S10_Sunburst_KOFam_all_FOAM_html.jpg

sunburst_KOFam_all_KEGG.html — open in web browser

_images/S10_Sunburst_KOFam_all_KEGG_html.jpg

Contents under combined:

At a glance:

_images/S10_combined_contents.jpg
counts_KOFam_all_FOAM.tsv
_images/step10-combined-countsKOFamFOAM.jpg
counts_KOFam_all_KEGG.tsv
_images/S10_combined_counts_KOFam_all_KEGG_tsv.jpg
stats.html — open in web browser
_images/S10_combined_stats_html.jpg
stats.tsv
_images/S10_Stats_tsv.jpg
img - contains the individual .png image files that appear together in stats.html
_images/S10_combined_img_contents.jpg

Final

The contents of final are:

_images/Final_contents.jpg
  • There are two .gbk files, which are in GenBank format.

  • The ./final/fasta subdirectory contains the .faa, .ffn, and .fna files from FragGeneScan, Prodigal, etc. (depending on the options given):


  • .faa - Protein FASTA file of the translated CDS/ORF sequences.

  • .ffn - FASTA Feature Nucleotide file: the nucleotide sequences of the translated CDS/ORFs.

  • .fna - Nucleotide FASTA file of the input contig sequences.

_images/Final_fasta_folder.jpg
  • The ./final/gff subdirectory contains .gff and .gtf files:


  • .gff - General Feature Format

  • .gtf - Gene Transfer Format

_images/Final_gff_folder.jpg
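The FASTA and GFF files described above are plain text, so they can be summarized with a few lines of standard-library Python. This is a minimal sketch, not part of MetaCerberus. The function names read_fasta and count_gff_features are hypothetical, and the GFF reader relies only on the general convention that the third tab-separated column holds the feature type.

```python
from collections import Counter


def read_fasta(path):
    """Return a dict mapping each record ID to its full sequence.

    The record ID is the first whitespace-delimited token after '>'.
    """
    records = {}
    current_id = None
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                tokens = line[1:].split()
                current_id = tokens[0] if tokens else ""
                records[current_id] = []
            elif current_id is not None:
                # Sequence lines may be wrapped, so collect the pieces.
                records[current_id].append(line)
    return {rid: "".join(parts) for rid, parts in records.items()}


def count_gff_features(path):
    """Count GFF/GTF records by feature type (third column)."""
    counts = Counter()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            # Skip comments, directives, and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 3:
                counts[fields[2]] += 1
    return counts
```

For example, len(read_fasta(path_to_faa)) gives the number of predicted proteins in a .faa file, and count_gff_features(path_to_gff) shows how many records of each feature type the matching .gff file holds. Both path variables are placeholders for files in your own results folder.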

What’s in the ./final/FragGeneScan_<file_name> and ./final/prodigal_<file_name> subdirectories?

_images/Final_contents_expanded_FGS_Prod.jpg

A closer look:

HMMER_top_5.tsv
_images/Final_HMMR_top5_tsv.jpg
annotation_summary_KOFam_all_FOAM.tsv
_images/Final_annotation_summary_KOFam_allFOAM_tsv.jpg
annotation_summary_KOFam_all_KEGG.tsv
_images/Final_annotation_summary_KOFam_allKEGG_tsv.jpg
final_annotation_summary.tsv
_images/Final_annotation_summary_tsv.jpg
rollup_KOFam_all_FOAM.tsv
_images/Final_rollup_KOFam_allFOAM_tsv.jpg
rollup_KOFam_all_KEGG.tsv
_images/Final_rollup_KOFam_all_KEGG_tsv.jpg