NanoOK Reporter is a tool for real-time analysis of data generated by NanoOK RT. In the following text, we will guide you through analysis of data generated for the BAMBI project.
Please download NanoOK Reporter from https://github.com/richardmleggett/NanoOKReporter
Please download the BAMBI dataset from https://opendata.earlham.ac.uk/opendata/data/bambi/BAMBI_1D_19092017_P8_LSK108.tar.gz
Running NanoOK Reporter
To clone the repository from GitHub, type:
git clone https://github.com/richardmleggett/NanoOKReporter.git
NanoOK Reporter relies on taxonomy information from NCBI, which you will need to download separately from ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/. The files that NanoOK Reporter needs are nodes.dmp and names.dmp, both contained in taxdump.tar.gz. Download this file and untar it, then tell NanoOK Reporter where the files are by setting the environment variable NANOOK_TAXONOMY to the directory containing nodes.dmp and names.dmp. On Linux or macOS, you would typically do this by adding the following command (with your own path substituted) to your .bash_profile (or .profile on Ubuntu) file or ‘source’ script:
export NANOOK_TAXONOMY=/path/to/taxonomy
On macOS, you can edit the file using:
open -e ~/.bash_profile
You will also need to set another variable, NANOOK_CARD, to point to the directory containing the CARD files such as aro.csv:
export NANOOK_CARD=/path/to/card
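Before launching, it can help to confirm that both variables point where you expect. The following is a minimal sketch and not part of NanoOK Reporter: the function name check_nanook_env is hypothetical, and it only looks for the three files named above (nodes.dmp, names.dmp and aro.csv).

```shell
# Hypothetical helper: report whether the files NanoOK Reporter is said to
# need are present in the directories named by the two environment variables.
check_nanook_env() {
    status=0
    for f in "${NANOOK_TAXONOMY:-}/nodes.dmp" \
             "${NANOOK_TAXONOMY:-}/names.dmp" \
             "${NANOOK_CARD:-}/aro.csv"; do
        if [ -f "$f" ]; then
            echo "found:   $f"
        else
            echo "missing: $f"
            status=1
        fi
    done
    # Non-zero return value signals that at least one file was not found.
    return "$status"
}
```

Pasting the function into a new terminal and running check_nanook_env should list each file as found; a "missing" line usually means the variable is unset in that shell or points at the wrong directory.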
Once you have done this, you can run the program as follows:
cd NanoOKReporter
java -jar dist/NanoOKReporter.jar
A splash screen will display and then, after a moment, you will be prompted to select a sample directory.
Loading the BAMBI data
Untar the BAMBI dataset:
tar -xvzf BAMBI_1D_19092017_P8_LSK108.tar.gz
This is a NanoOK sample directory containing only the BLAST subdirectories, which are the part that NanoOK Reporter uses.
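If you want to see which subdirectories the archive holds without extracting it, something like the following sketch may help. The function name list_archive_dirs is hypothetical, and it assumes the archive has a single top-level sample directory; it simply prints the distinct second-level entries.

```shell
# Hypothetical helper: list the distinct second-level entries of a .tar.gz
# archive (for example, the subdirectories inside a sample directory).
list_archive_dirs() {
    # tar -t lists member paths; awk keeps "top/second" for every member
    # that is at least two levels deep; sort -u removes duplicates.
    tar -tzf "$1" | awk -F/ 'NF >= 2 && $2 != "" { print $1 "/" $2 }' | sort -u
}
```

It takes the path to the archive as its only argument.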
Once the archive is extracted, go back to NanoOK Reporter and browse to the BAMBI_1D_19092017_P8_LSK108 sample directory that you just untarred. Having highlighted the directory, click the “Choose” button.
Press the “Bacteria” button to load the bacteria data. You can follow its progress at the bottom of the window. There are 202 chunk files to load for this dataset, which may take a minute or two. If you want to load just a subset of the data, select “10” or “100” from the “Chunks” dropdown menu.
When the load has completed, the “Bacteria” tab will show a list of the top hitting sequences. This is calculated by taking the top BLAST hit for each read.
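The idea of counting only the top hit per read can be illustrated outside the tool. The sketch below is not NanoOK Reporter's own code: it assumes a BLAST tabular file (outfmt 6 style) in which column 1 is the read ID, column 2 is the subject ID, and the hits for each read are listed best first, as BLAST normally writes them. The function name top_hits is hypothetical.

```shell
# Hypothetical helper: tally the first (best) hit of each read in a
# tab-separated BLAST file and print "count<TAB>subject", highest count first.
top_hits() {
    tab=$(printf '\t')
    # '!seen[$1]++' is true only for the first line of each read ID,
    # so later (lower-ranked) hits for the same read are ignored.
    awk -F '\t' '!seen[$1]++ { count[$2]++ }
                 END { for (s in count) printf "%d\t%s\n", count[s], s }' "$1" |
        sort -t "$tab" -k1,1nr -k2,2
}
```

Called with the path to one such file, it prints one line per subject sequence.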
Clicking the “Taxonomy” tab will show a taxonomic tree of the species composition of the sample, with the radius and colour shade of each node indicating how abundant that species is. This is based on a Lowest Common Ancestor assignment, which considers multiple BLAST hits per read.
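To make the Lowest Common Ancestor idea concrete, here is a small sketch that walks the parent links in an NCBI nodes.dmp file. It shows the general technique only and is not NanoOK Reporter's implementation, which may filter or weight hits differently. The function name lca_of_taxids is hypothetical; it assumes the standard nodes.dmp layout, in which fields are separated by a tab, a vertical bar and a tab, with the taxon ID first and its parent's ID second.

```shell
# Hypothetical helper: print the lowest common ancestor of the given taxon IDs.
# Usage: lca_of_taxids /path/to/nodes.dmp TAXID [TAXID ...]
lca_of_taxids() {
    nodes_file=$1
    shift
    awk -F '\t[|]\t' -v ids="$*" '
        # Build the child -> parent map from the first two columns.
        { parent[$1] = $2 }

        # Return the parent of n, or "" at the root or for an unknown ID.
        function up(n) {
            if (!(n in parent) || parent[n] == n) return ""
            return parent[n]
        }

        # Mark a and all of its ancestors, then climb from b until a
        # marked node is reached; that node is the LCA of a and b.
        function lca(a, b,    seen, n) {
            for (n = a; n != ""; n = up(n)) seen[n] = 1
            for (n = b; n != ""; n = up(n)) if (n in seen) return n
            return ""
        }

        END {
            count = split(ids, t, " ")
            result = t[1]
            # Fold the pairwise LCA over the remaining IDs.
            for (i = 2; i <= count; i++) result = lca(result, t[i])
            print result
        }
    ' "$nodes_file"
}
```

For a read whose BLAST hits map to several taxa, passing those taxon IDs to the function gives the deepest node that all of them share; an empty result means at least one ID was not found in the file.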
Clicking the “Summary” tab will show a donut plot of the species composition of the sample, also based around the Lowest Common Ancestor assignment.
Press the blue “CARD” button to load the CARD AMR data. When the load has finished, click the CARD tab to see the top AMR hits.
In both the “Bacteria” and “CARD” tabs, you can drag the slider to step back in time through the data. You cannot yet do this for the “Summary” or “Taxonomy” views.
Pressing the “Walk” button will perform a walkout analysis in which the AMR hits are placed within containing bacteria. Once this has completed, the display will automatically switch to the “Walkout” tab, where plots will show the main bacteria containing AMR genes.
Running with live sequencing
When running with live sequencing, you can set the refresh rate of the application by selecting from the “Refresh” dropdown menu. This specifies how frequently (in minutes) new data will be loaded and the various views updated. You can also manually force the latest data to be loaded by clicking on the “CARD” or “Bacteria” buttons again. This will not reload chunk files that have already been analysed, only new files.
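The "only new files" behaviour can be pictured as keeping a record of what has already been processed. The sketch below illustrates that general pattern and is not how NanoOK Reporter is necessarily implemented; the function name list_new_files and the use of a plain-text record file are assumptions made for the example.

```shell
# Hypothetical helper: print files in a directory that have not been reported
# before, remembering them in a record file so later calls skip them.
# Usage: list_new_files DIRECTORY RECORD_FILE
list_new_files() {
    dir=$1
    seen=$2
    touch "$seen"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # -x matches whole lines, -F treats the path as a literal string.
        if ! grep -qxF -- "$f" "$seen"; then
            echo "$f"
            echo "$f" >> "$seen"
        fi
    done
}
```

Calling it repeatedly on the same directory prints each file once, on the first call after that file appears.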