Installation guide for PhageScanner

Dreycey Albin edited this page Jun 7, 2024 · 14 revisions

Overview

PhageScanner incorporates several command line tools in its pipeline. Many of them are designed for Unix systems, which are common in bioinformatics and on high-performance computing (HPC) clusters. This guide outlines installation steps for multiple operating systems. The necessary tools include:

  1. CD-HIT - Clusters proteins in the database pipeline to avoid duplicates during testing and training.
  2. BLAST - Used as a pseudo-model in both the train and predict pipelines.
  3. Megahit - Assembles reads in the predict pipeline.
  4. Phanotate - Used to identify open reading frames in the predict pipeline.

NOTE: PhageScanner is optimized for 64-bit macOS and Ubuntu Linux. Windows users can utilize the tool via the Docker image or by installing the Ubuntu Windows Subsystem for Linux (WSL). This workaround is necessary because some dependencies, such as cd-hit, phanotate, and megahit, rely on C++ libraries not natively supported on Windows. For the best experience, we recommend using PhageScanner on macOS or Linux.

Installation Options (Table of Contents)

Directions for installing these dependencies are provided for Mac, Linux, and Windows in the sections below.

Installing dependencies for Linux/Mac

Using Anaconda

For Mac or Linux (including WSL/WSL2), Anaconda simplifies the installation of the command line dependencies. Start by installing Miniconda. Then create a new virtual environment (e.g., conda create -n phagescanner python==3.12.3), activate it (conda activate phagescanner), and install the following tools:

CD-HIT (see CDHIT package here)

conda install bioconda::cd-hit

BLAST (see BLAST package here)

conda install bioconda::blast

Megahit (see Megahit package here)

conda install bioconda::megahit

PHANOTATE (see Phanotate package here)

conda install bioconda::phanotate
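
Once the four packages are installed, it can help to confirm that their executables are visible on the PATH of the active environment. The following is a minimal sketch, not part of PhageScanner itself: check_tools is a hypothetical helper name, and the executable names mentioned afterwards (cd-hit, blastp, megahit, phanotate.py) are assumptions about what the Bioconda packages install.

```shell
# Hypothetical helper: report which of the named executables are on PATH.
# Returns 0 if all are found, 1 if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "MISSING: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

With the environment activated, running check_tools cd-hit blastp megahit phanotate.py should list each tool as found. If one is reported as missing, re-run the matching conda install command above or check the executable name shipped by that package.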

Installing dependencies for Windows

There are a couple of ways we recommend using PhageScanner on Windows: install the Windows Subsystem for Linux (WSL), or run the dependencies in a Docker container. This guide walks through both options.

Using Docker to Install PhageScanner

Docker can be used to install PhageScanner locally on any operating system. For questions on using Docker, please visit the Docker Guide. If you are interested in the resources that Docker containers can use, please see here. For questions about problems with Docker, visit the corresponding Stack Overflow page here.

Using the Docker image hosted on DockerHub

PhageScanner is available on DockerHub, simplifying its deployment on Windows:

  • Download the Docker image:
docker pull dreyceyalbin/phagescanner
  • Verify the installation:
docker run --rm dreyceyalbin/phagescanner --help

Building Docker image locally

The docker image can be built locally to allow for more flexibility. There are two steps involved in this process:

  • Navigate to the Docker/ directory and build the image:
docker build -t dreyceyalbin/phagescanner .
  • Test the image:
docker run --rm dreyceyalbin/phagescanner --help
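
The Docker section above mentions the resources that containers can use. As a hedged illustration, the wrapper below passes Docker's --cpus and --memory flags when starting the image. The function name run_phagescanner, the variables PHAGESCANNER_CPUS, PHAGESCANNER_MEMORY, and DOCKER, and the default limits of 4 CPUs and 8g are all assumptions made for this sketch; they are not defined by PhageScanner.

```shell
# Hypothetical wrapper: run the PhageScanner image with capped CPU and memory.
# The default limits are illustrative; tune them to your machine.
run_phagescanner() {
    cpus="${PHAGESCANNER_CPUS:-4}"
    memory="${PHAGESCANNER_MEMORY:-8g}"
    # Setting DOCKER=echo prints the arguments instead of starting a container.
    "${DOCKER:-docker}" run --rm --cpus "$cpus" --memory "$memory" \
        dreyceyalbin/phagescanner "$@"
}
```

For example, run_phagescanner --help should behave like the test command above, but with the limits applied. Prefixing the call with DOCKER=echo previews the arguments without starting a container.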

Windows Subsystem for Linux (WSL)

Another option is using the Windows Subsystem for Linux:

  1. Install WSL with Ubuntu 22.04 or later following these directions.
  2. Install the dependencies using the Linux and Mac directions above. Make sure to follow the steps to verify the dependencies have been correctly installed.
  3. Install Python 3.12 and PhageScanner:
    • Install PhageScanner by running python -m pip install -r requirements.txt from the root directory of the repository.

After these steps, your system should be ready to test the examples provided by PhageScanner.
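
Because the WSL steps above depend on Python 3.12, a quick check of the active interpreter before running pip can save a failed install. The following is a minimal sketch; is_python_312 is a hypothetical helper name, and it assumes the interpreter prints its version in the usual "Python X.Y.Z" form.

```shell
# Hypothetical helper: succeed only if the given interpreter
# (default: python) reports a 3.12.x version.
is_python_312() {
    version="$("${1:-python}" --version 2>&1)" || return 1
    case "$version" in
        "Python 3.12."*) return 0 ;;
        *) return 1 ;;
    esac
}
```

From the repository root, is_python_312 && python -m pip install -r requirements.txt would then install the requirements only when the expected interpreter is active.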