Here you will find software for the analysis and processing of NGS datasets. The software is provided under the terms of the Creative Commons Attribution Non-Commercial License V2.0. Detailed usage information for each tool is provided in separate documentation. To download a tool, right-click on its logo and choose "save link/target as". Running the Perl scripts requires a Perl interpreter. Free Perl distributions are available online.

unitas
unitas is a convenient tool for efficient annotation of small non-coding RNA sequence datasets produced by Next Generation Sequencing. All you need is a computer and an internet connection. unitas uses the latest reference sequences from publicly available online databases to annotate user input sequences. It needs no installation and has no further prerequisites: it runs out-of-the-box on any popular platform (Linux, MacOS, Windows) and can be started with a single command from the command line (terminal). unitas accepts sequence files in FASTA or FASTQ format, or alternatively map files in SAM or ELAND3 format (the standard output of sRNAmapper).

Available downloads
Version   Linux                            MacOS                            Windows                          Source / Perl script
V 1.4.6   Standalone executable (64 bit)   Standalone executable (64 bit)   Standalone executable (64 bit)   Perl script / source code (all platforms)
V 1.5.0   request                          request                          request                          Perl script / source code (all platforms)
V 1.5.1   request                          request                          request                          Perl script / source code (all platforms)
* contact us for older versions of unitas.

We will continue to provide versions with updated internal URL lists. Because unitas always tries the URL lists stored on our server first (instead of its internal list), older versions of unitas will continue to work properly. New executables (including updated documentation) will be provided when functionality is added or bugs are fixed.

Information for Mac and Linux users:
Executable files downloaded from the web usually cannot be run until their file permissions are changed. Therefore, try one of the following terminal commands (the second also grants write permission to all users):

chmod 755 unitas
chmod a+rwx unitas

Further, unitas will download the SeqMap source code and compile it on your local machine with g++. If g++ is missing on your computer, unitas will download a precompiled SeqMap executable instead. In this case, you will most likely have to change the file permissions manually to make this file executable.
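
The manual permission change described above can also be scripted. The following is a minimal sketch in Python and is not part of unitas; the helper names and the file name `seqmap` are placeholders chosen for illustration, and the actual name of the downloaded executable may differ.

```python
import os
import shutil
import stat


def compiler_available(name="g++"):
    """Return True if the given compiler is found on the PATH."""
    return shutil.which(name) is not None


def make_executable(path):
    """Add execute permission for user, group and others (like chmod a+x)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


if __name__ == "__main__":
    # "seqmap" is a placeholder, not necessarily the file name unitas uses.
    target = "seqmap"
    if not compiler_available() and os.path.exists(target):
        make_executable(target)
```

Unlike `chmod a+rwx`, this only adds the execute bits and leaves read and write permissions unchanged.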

proTRAC
proTRAC predicts and analyzes genomic piRNA clusters based on mapped piRNA sequence reads. proTRAC 2.0 and later versions apply a sliding window approach to detect loci that exhibit high sequence read coverage. Subsequently, sequences mapped to these loci are analyzed with respect to typical piRNA and piRNA cluster characteristics to ensure high specificity. proTRAC runs with basic core Perl, which is commonly pre-installed on Unix and Mac computers. Windows users can install a free Perl distribution such as Strawberry Perl or ActivePerl. Alternatively, we provide a precompiled proTRAC executable (v.2.4.1) that runs on 64 bit Windows computers.
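
The sliding window idea described above can be illustrated with a short Python sketch. This is not proTRAC's implementation: the window size, step size and read threshold are assumed values chosen for illustration, and the subsequent checks for piRNA cluster characteristics are left out.

```python
from collections import Counter


def high_coverage_windows(read_starts, window=5000, step=1000, min_reads=50):
    """Return (start, end, count) for windows holding at least min_reads reads.

    read_starts: iterable of 0-based mapping positions on one chromosome.
    window, step and min_reads are illustrative values, not proTRAC defaults.
    """
    counts = Counter(read_starts)
    if not counts:
        return []
    last = max(counts)
    hits = []
    for start in range(0, last + 1, step):
        end = start + window
        n = sum(c for pos, c in counts.items() if start <= pos < end)
        if n >= min_reads:
            hits.append((start, end, n))
    return hits


def merge_windows(hits):
    """Merge overlapping or adjacent windows into candidate loci (start, end)."""
    loci = []
    for start, end, _ in hits:
        if loci and start <= loci[-1][1]:
            loci[-1] = (loci[-1][0], max(loci[-1][1], end))
        else:
            loci.append((start, end))
    return loci
```

Calling `merge_windows(high_coverage_windows(positions))` yields candidate loci that would then be filtered further.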

NGS TOOLBOX
NGS TOOLBOX is a collection of simple open source Perl scripts that perform basic analyses and processing steps using next generation sequencing (NGS) datasets. Each tool is designed to ensure convenient and intuitive usage. Installation and usage do not require any bioinformatics skills. All scripts work out-of-the-box. Advanced users may use the command line based Perl scripts to build their own automated sequence analysis and processing pipelines.

sRNAmapper
sRNAmapper is specifically designed to map small RNA sequences to genomes. To this end it uses a specialized mapping algorithm that requires a perfect 5' seed match (default: 18 nt) and optionally allows non-template 3' nucleotides as well as internal mismatches in the part of the sequence that follows the seed match. Allowing non-template 3' ends ensures the mapping of 3' modified (adenylated/uridylated) small RNAs, while allowing internal mismatches can enhance sensitivity given that read quality decreases towards the 3' end.
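
The matching rules described above can be sketched in Python as follows. This is an illustration only and not sRNAmapper's code: the function name is hypothetical, only the 18 nt seed length comes from the text, and the limits for mismatches and non-template nucleotides are assumed values. The sketch checks a single plus-strand position rather than searching a genome.

```python
def matches_at(read, genome, pos, seed=18, max_mismatch=1, max_tail=2):
    """Check whether read maps to genome at pos (plus strand only).

    Rules sketched from the description:
      - the first `seed` nucleotides must match perfectly;
      - after the seed, up to `max_mismatch` internal mismatches are tolerated;
      - up to `max_tail` trailing nucleotides may be non-templated.
    Only seed=18 is taken from the text; the other defaults are assumptions.
    """
    if len(read) < seed or pos + seed > len(genome):
        return False
    if read[:seed] != genome[pos:pos + seed]:
        return False
    # Try the shortest non-templated tail first (0 = the whole read must align).
    for tail in range(0, max_tail + 1):
        body = read[seed:len(read) - tail]
        ref = genome[pos + seed:pos + seed + len(body)]
        if len(ref) < len(body):
            continue  # read would run past the end of the reference
        mismatches = sum(a != b for a, b in zip(body, ref))
        if mismatches <= max_mismatch:
            return True
    return False
```

A read carrying two non-templated 3' adenines would be rejected with `max_tail=0, max_mismatch=0` but accepted once a tail of two nucleotides is allowed.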

reallocate
reallocate post-processes map files in order to reallocate read counts of multiple mapping sequences according to the transcription rate of genomic loci, estimated from uniquely mapping reads. Map files must be in ELAND format and can be created using sRNAmapper, which is provided along with the proTRAC software. reallocate outputs a modified map file that contains two additional columns: i) the total number of genomic hits of a sequence and ii) the read counts that are assigned to this locus. proTRAC 2.0.5 and later versions accept this format and use this information for cluster prediction. Generally, using reallocate results in a higher number of sequence reads that can be assigned to predicted piRNA clusters and may also alter the number of predicted piRNA clusters (more true positives, fewer false positives). Using reallocate is specifically recommended for datasets with large amounts (>= 50%) of transposon related small RNAs, such as pre-pachytene mammalian piRNA transcriptomes or Drosophila piRNA transcriptomes.
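
The reallocation principle described above can be sketched in Python. This is a simplified illustration under stated assumptions, not the reallocate script itself: the number of uniquely mapping reads per locus is taken as given (how reallocate measures it is not described here), the split is strictly proportional, and the even split for loci without unique reads is an assumed fallback.

```python
def reallocate_counts(read_count, unique_reads_per_locus):
    """Split read_count of one multi-mapping sequence across its loci.

    unique_reads_per_locus: dict mapping locus id -> number of uniquely
    mapping reads at that locus (a proxy for its transcription rate).
    Falls back to an even split when no locus has unique reads (assumption).
    """
    n_loci = len(unique_reads_per_locus)
    if n_loci == 0:
        return {}
    total = sum(unique_reads_per_locus.values())
    if total == 0:
        return {locus: read_count / n_loci for locus in unique_reads_per_locus}
    return {
        locus: read_count * unique / total
        for locus, unique in unique_reads_per_locus.items()
    }
```

For example, a sequence with 100 reads and two genomic hits, one supported by 30 and one by 10 unique reads, would be assigned 75 and 25 reads respectively.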

piFETCH
piFETCH is a tiny fetching tool to download data from the piRNA cluster database without using the web interface. piFETCH lets you download complete proTRAC results for available NCBI SRA datasets, or specific information (piRNA cluster sequence, reads mapped to a cluster, proTRAC image file) from selected piRNA clusters for a desired SRA dataset. You can also download clipped and filtered reads from any available SRA sequence set, as well as sequence reads from the specified SRA dataset(s) that matched miRNA or miRNA precursor sequences, respectively.

PHASER
PHASER is a tool to analyze the 3'-5' distances of mapped sequence reads. It has recently been described that secondary piRNA biogenesis (piRNA ping-pong) can induce Zucchini-dependent primary processing of targeted transcripts, resulting in the production of so-called phased piRNAs (Han et al. 2015, Mohn et al. 2015). In this process, the target molecule is sliced consecutively starting from a ping-pong target site, and each downstream cleavage position determines the 3' and 5' ends of adjacent (trail-) piRNAs, respectively. The amount of phased piRNAs can be determined by analyzing the 3'-5' distances of mapped sequence reads, where a distance of 1 indicates a pair of phased piRNAs.
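
The distance analysis described above can be sketched in Python as follows. This is an illustration and not PHASER's code: it assumes plus-strand reads given as 1-based inclusive (start, end) coordinates, and the maximum distance considered is an assumed cutoff.

```python
from collections import Counter


def three_to_five_distances(reads, max_distance=50):
    """Count distances from each 3' end to downstream 5' ends (plus strand).

    reads: list of (start, end) tuples, 1-based inclusive coordinates.
    A read ending at 25 followed by a read starting at 26 has distance 1.
    max_distance is an illustrative cutoff.
    """
    starts = Counter(start for start, _ in reads)
    distances = Counter()
    for _, end in reads:
        for d in range(1, max_distance + 1):
            n = starts.get(end + d, 0)
            if n:
                distances[d] += n
    return distances
```

A strong peak at distance 1 in the returned counts would point to phased piRNAs in the dataset.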

PPmeter
PPmeter is a tool to quantify and compare the amount of ongoing ping-pong amplification. Since the number of ping-pong pairs within a given dataset depends on dataset size and grows non-linearly, raw pair counts cannot be compared directly across datasets of different size. PPmeter generates pseudo-replicates by repeated bootstrapping (default=100) of a fixed number of sequence reads (default=1000000) from a set of original sRNA sequence datasets. PPmeter then calculates the ping-pong signature of each pseudo-replicate and counts the number of sequence reads that participate in the ping-pong amplification loop. The resulting parameter, ping-pong reads per million bootstrapped reads, is comparable across different datasets.
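
The bootstrapping procedure described above can be sketched in Python. This is a simplified illustration, not PPmeter's implementation: it assumes sampling with replacement, represents each read only by its strand and 5' position, and defines a ping-pong read by the commonly used criterion of a 10 nt 5' overlap with a read on the opposite strand. The function names are hypothetical.

```python
import random


def ping_pong_reads(reads, overlap=10):
    """Count reads that have a partner with a 5' overlap of `overlap` nt.

    reads: list of (strand, five_prime_position) with strand '+' or '-'.
    A '+' read with its 5' end at p pairs with a '-' read whose 5' end is
    at p + overlap - 1.
    """
    plus = {pos for strand, pos in reads if strand == "+"}
    minus = {pos for strand, pos in reads if strand == "-"}
    n = 0
    for strand, pos in reads:
        if strand == "+" and pos + overlap - 1 in minus:
            n += 1
        elif strand == "-" and pos - overlap + 1 in plus:
            n += 1
    return n


def bootstrap_ping_pong(reads, sample_size=1_000_000, replicates=100, seed=None):
    """Return ping-pong reads per million for each pseudo-replicate."""
    rng = random.Random(seed)
    results = []
    for _ in range(replicates):
        sample = rng.choices(reads, k=sample_size)  # sampling with replacement
        results.append(ping_pong_reads(sample) * 1_000_000 / sample_size)
    return results
```

Because every pseudo-replicate has the same size, the resulting values can be compared between datasets regardless of their original sequencing depth.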

Quasar
Quasar (quick and sensitive annotation of repeats) is an annotation tool designed for repeat annotation of whole genome sequences. It is similar to the RepeatMasker software developed by A.F.A. Smit, R. Hubley & P. Green (unpublished) but is optimized for subsequent repeat annotation of genomically mapped small RNA sequences. Quasar annotation differs from RepeatMasker annotation in that Quasar annotates according to the highest sequence similarity, whereas RepeatMasker rather mirrors the biological transposon insertion history. Therefore, using Quasar gives a more accurate target prediction for, e.g., piRNAs than using RepeatMasker annotation.