This repository contains code and analysis results used in the paper Mannen et al. (2023), "Multiple roles of a conserved glutamate residue for unique biophysical properties in a new group of microbial rhodopsins homologous to TAT rhodopsin".
The workflow is divided into two parts: most of the analyses are covered by the default snakemake file workflow/Snakefile, while the beast2 analysis is in workflow/Beast2.snakefile.
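As a rough sketch of how the two parts might be invoked, the snippet below builds the corresponding snakemake command lines without running them. The --use-conda and --cores options are standard snakemake flags, but their use here and the core count of 4 are assumptions rather than settings documented in this repository; snakemake_command is a hypothetical helper.

```python
import shlex


def snakemake_command(snakefile=None, cores=4):
    """Build a snakemake command line (hypothetical helper, not part of the repository).

    Assumptions: conda-managed dependencies are enabled with --use-conda,
    and 4 cores is an arbitrary illustrative value.
    """
    cmd = ["snakemake", "--use-conda", "--cores", str(cores)]
    if snakefile is not None:
        # --snakefile selects a workflow file other than the default one
        cmd += ["--snakefile", snakefile]
    return cmd


if __name__ == "__main__":
    # Main workflow: snakemake finds workflow/Snakefile by default
    print(shlex.join(snakemake_command()))
    # Separate beast2 workflow
    print(shlex.join(snakemake_command("workflow/Beast2.snakefile")))
```

Printing the commands instead of executing them makes it possible to inspect and adjust them before starting a long run.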
Most dependencies of the main workflow are taken care of with conda, but in addition to snakemake and conda themselves, the following dependencies have to be installed manually:
- usearch is expected to be available from the PATH;
- RootAnnotator is expected to be in the directory named RootAnnotator in the current directory;
- mad is expected to be available from the PATH.
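The requirements above can be checked before launching the pipeline. The following is a minimal sketch, not part of the repository: missing_dependencies is a hypothetical helper, and the executable names snakemake and conda are assumptions about how those tools are installed (usearch, mad and the RootAnnotator directory are taken from the list above).

```python
import shutil
from pathlib import Path


def missing_dependencies(
    executables=("snakemake", "conda", "usearch", "mad"),
    directories=("RootAnnotator",),
    base_dir=".",
):
    """Return the names of executables not on PATH and directories not found in base_dir."""
    # shutil.which returns None when the executable cannot be found on PATH
    missing = [exe for exe in executables if shutil.which(exe) is None]
    # RootAnnotator is expected as a directory inside the current directory
    missing += [d for d in directories if not (Path(base_dir) / d).is_dir()]
    return missing


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All checked dependencies were found.")
```

Run it from the directory in which the workflow will be started, since the RootAnnotator check is relative to the current directory.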
The protein fasta file with the expressed TwRs is provided in the file Expressed_TwRs.faa.
The workflow files are located in workflow/.
Input files to run the pipeline(s) from scratch are in input/. They include:
- input/ingroup.fna -- ORF sequences for the representative TwRs
- input/ingroup.tsv -- metadata for the representative TwRs
- input/beast2/beast_linked_models.xml -- input for beast2 including the CDS alignment
Final output files are in the folder output. Intermediate analysis results needed to produce them are included as well:
- analysis/TAT/rhodopsins.mafft -- fasta file with the alignment of all collected and reference rhodopsins
- analysis/IIIa_phylophlan/IIIa.tre.treefile -- results of the phylogenetic analysis of Pelagibacterales subclade IIIa
- analysis/diamond_collect/{gtdb,lanclos,oceandna}.tsv -- tsv files summarizing the presence of rhodopsins in Pelagibacterales genomes obtained from three sources
- analysis/metadata/{gtdb_filtered,lanclos,oceandna_filtered}.tsv -- metadata for the analyzed Pelagibacterales genomes
- analysis/metadata/gtdb_clade.nwk -- tree in newick format corresponding to the o__Pelagibacterales clade in GTDB release 214.1
- analysis/beast2/{beast_linked_models-codon12.trees,beast_linked_models.xml.state,beast_linked_models.log} -- results of the beast2 run
- analysis/beast2/beast_linked_models-rootAnnotator_annotatedMCCTree.nexus_fixed -- (fixed) output of rootAnnotator, tree in nexus format
- analysis/lazarus -- lazarus analysis
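To verify that a checkout contains the analysis results listed above, the brace patterns can be expanded into individual paths and checked for existence. This is an illustrative sketch only: expand_braces and missing_results are hypothetical helpers, not part of the workflow, and the paths are copied from the list above.

```python
import re
from pathlib import Path

# Paths copied from the list above; analysis/lazarus is a directory.
LISTED_RESULTS = [
    "analysis/TAT/rhodopsins.mafft",
    "analysis/IIIa_phylophlan/IIIa.tre.treefile",
    "analysis/diamond_collect/{gtdb,lanclos,oceandna}.tsv",
    "analysis/metadata/{gtdb_filtered,lanclos,oceandna_filtered}.tsv",
    "analysis/metadata/gtdb_clade.nwk",
    "analysis/beast2/{beast_linked_models-codon12.trees,"
    "beast_linked_models.xml.state,beast_linked_models.log}",
    "analysis/beast2/beast_linked_models-rootAnnotator_annotatedMCCTree.nexus_fixed",
    "analysis/lazarus",
]


def expand_braces(pattern):
    """Expand non-nested {a,b,c} alternatives into a list of plain paths."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[: match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        # Recurse so that any further brace groups in the tail are expanded too
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def missing_results(patterns=LISTED_RESULTS, root="."):
    """Return the expanded paths that do not exist under root."""
    paths = [p for pattern in patterns for p in expand_braces(pattern)]
    return [p for p in paths if not (Path(root) / p).exists()]


if __name__ == "__main__":
    for path in missing_results():
        print("not found:", path)
```

Run from the repository root, the script prints nothing if every listed file and directory is present.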
Use the Issue tracker for questions/requests.