Introduction
Hmm… so what exactly is Cypiripi?
Cypiripi is a tool for exact genotyping of CYP2D6 using high-throughput sequencing data. A paper describing the algorithm is about to be published in ISMB 2015.
How do I get Cypiripi?
Just clone our repository to get the latest binary:
git clone https://github.com/sfu-compbio/cypiripi.git
DISCLAIMER: Due to the CPLEX licensing restrictions and the fact that CPLEX is statically linked to the binary, this software is only available for ACADEMIC use. It is strictly forbidden to use it for any commercial purpose, unless explicitly allowed in writing by the authors. This might change in the future, though. Thank you!
How do I run Cypiripi?
Ideally, just invoke cypiripi.py with the following options:
python cypiripi.py --fasta reference --fastq [mygenome.fastq] --cov [coverage]
where:
- --fasta is the prefix of a pre-processed CYP2D6 reference. One reference is already provided in the package, and it is named CYP2D6.
- --fastq is your interleaved and paired .fastq file.
- --cov is the coverage per chromosome. E.g. for a 40x sample, --cov should be 20.
Please check next section for more details and requirements.
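The invocation above can also be assembled programmatically. The following is only a sketch: the helper names `per_chromosome_coverage` and `wrapper_command` are hypothetical, the halving rule simply generalizes the 40x to 20 example, and it is an assumption that `--cov` accepts the value formatted this way.

```python
def per_chromosome_coverage(sample_coverage):
    """Halve the sample-wide coverage, following the 40x -> 20 example."""
    return sample_coverage / 2.0


def wrapper_command(reference_prefix, fastq_path, sample_coverage,
                    script="cypiripi.py"):
    """Return the argument list for the wrapper invocation described above.

    Only the options named in this README (--fasta, --fastq, --cov) are used.
    """
    cov = per_chromosome_coverage(sample_coverage)
    return [
        "python", script,
        "--fasta", reference_prefix,
        "--fastq", fastq_path,
        # "{:g}" drops a trailing ".0", so 20.0 is written as "20".
        "--cov", "{:g}".format(cov),
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(" ".join(wrapper_command("CYP2D6", "mygenome.fastq", 40)))
```

The resulting list can be handed to `subprocess.call` once the printed command looks right.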
Requirements
- The wrapper script needs at least Python 2.7 in order to run.
- mrfast and mrsfast should be located within the PATH in order for the wrapper script to complete.
- The binary has been compiled on CentOS 5.x with gcc 5.2, and it might not work with older distributions.
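Before starting a long run, it may be worth confirming that both mappers are visible on the PATH. This is a minimal sketch, not part of Cypiripi: the function name `missing_tools` is hypothetical, and the check relies on `shutil.which`, which needs Python 3.3 or newer even though the wrapper itself only requires 2.7.

```python
import shutil


def missing_tools(tools=("mrfast", "mrsfast")):
    """Return the names from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("mrfast and mrsfast are both available.")
```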
Caveats
While genotyping is usually very fast, mapping can take a lot of time. In addition to this, mrfast (unlike mrsfast) is known to load all of the reads in memory, so it can create problems for a sample with large FASTQ files. It is recommended that you split large FASTQ files into smaller ones (we usually use 20,000,000 lines per FASTQ file in our test configuration).
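One way to do the split is sketched below. It is an illustration rather than the authors' procedure: the function name `split_fastq` and the `.partNNN.fastq` naming are hypothetical, and it assumes unwrapped four-line FASTQ records. Because the input is interleaved, a read pair spans 8 lines, so the chunk size is required to be a multiple of 8 (20,000,000 is) to keep mates in the same file.

```python
def split_fastq(path, lines_per_chunk=20000000, prefix=None):
    """Split `path` into numbered chunks of at most `lines_per_chunk` lines.

    Returns the list of chunk file names that were written.
    """
    # One interleaved pair = 2 records x 4 lines = 8 lines.
    if lines_per_chunk <= 0 or lines_per_chunk % 8 != 0:
        raise ValueError("lines_per_chunk must be a positive multiple of 8")
    prefix = prefix or path
    chunks = []
    out = None
    with open(path) as source:
        # Stream line by line so the whole file is never held in memory.
        for line_number, line in enumerate(source):
            if line_number % lines_per_chunk == 0:
                if out is not None:
                    out.close()
                name = "%s.part%03d.fastq" % (prefix, len(chunks))
                chunks.append(name)
                out = open(name, "w")
            out.write(line)
    if out is not None:
        out.close()
    return chunks
```

Each resulting chunk can then be mapped on its own.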
The provided script assumes that you will invoke the mapping on only one (small) FASTQ file. You can pass the --sam [mymap.sam] parameter to the wrapper script instead of the --fastq parameter if you want to map the FASTQ files manually. Please note that both mymap.sam and mymap.sam.paired need to be present for Cypiripi in order to produce correct results.
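When mapping manually, a quick check that both files exist can catch a missing `.paired` file before Cypiripi is started. This is a small sketch with a hypothetical helper name, `missing_sam_inputs`; it only tests for presence, not for the content or ordering of the mappings.

```python
import os


def missing_sam_inputs(sam_path):
    """Return the required files that are absent for a given --sam argument.

    The two required files are `sam_path` itself and `sam_path + ".paired"`.
    """
    required = [sam_path, sam_path + ".paired"]
    return [name for name in required if not os.path.isfile(name)]
```

An empty list means both `mymap.sam` and `mymap.sam.paired` were found.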
Lines 128–141 in cypiripi.py contain the commands and parameters used for generation of the necessary SAM files.
DISCLAIMER: Cypiripi is not yet intended to be used with variable coverage data (e.g. PGRNSeq or similar technologies). It might work, but it is not guaranteed to produce correct results. Also, if your coverage is very high (>300x), Cypiripi might require a large amount of RAM. We’re currently working on resolving these issues.
Usage of Cypiripi binary
Parameter explanation
- -f [reference file]
  Pre-processed gene reference file. Use reference.combined.align provided in the package for the CYP2D6/CYP2D7 gene.
- -s [SAM file]
  Input SAM (mapping) file.
- -p [paired SAM file]
  Specify the input SAM file containing the paired-end mappings.
  Note: For every read, its pair mapping should appear in the line after the read’s mapping (e.g. lines should be as R1/1, R1/2, R2/1, R2/2 …). Do not sort your mapping files by the mapping coordinate, otherwise Cypiripi will fail. mrfast and mrsfast produce the required ordering by default.
  Default: SAM file with .paired suffix
- -E [exclusions]
  A file containing the read names of the reads to be excluded (e.g. CYP2D8-originating reads).
  Default: SAM file with .discard suffix
- -C [coverage]
  Expected coverage per chromosome for the sample.
  Default: 20
- -T [threshold]
  Expected minimum cut-off threshold. We recommend 25% or 30% of the coverage.
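The defaults and the threshold recommendation above can be summarized in a short sketch. The function name `binary_arguments` is hypothetical, the name of the binary itself is left out because this README does not state it, and it is an assumption that -T takes an absolute value in the same units as -C (so 25% of a coverage of 20 is passed as 5).

```python
def binary_arguments(reference, sam_path, coverage=20, fraction=0.25,
                     paired=None, exclusions=None):
    """Return the option list for the Cypiripi binary, filling in defaults.

    `fraction` is the share of the coverage used as the cut-off threshold;
    the README recommends 0.25 or 0.30.
    """
    return [
        "-f", reference,
        "-s", sam_path,
        # Documented defaults: the SAM file name plus a fixed suffix.
        "-p", paired or sam_path + ".paired",
        "-E", exclusions or sam_path + ".discard",
        "-C", "{:g}".format(coverage),
        # Assumption: the threshold is an absolute value, not a percentage.
        "-T", "{:g}".format(coverage * fraction),
    ]


if __name__ == "__main__":
    print(" ".join(binary_arguments("reference.combined.align", "mymap.sam")))
```

Prepend the path to the binary from your clone of the repository before running the command.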
Support
Contact & Support
Feel free to drop any inquiry to inumanag at sfu dot oh canada. Since this software is still in beta stage and thus unstable, we will be glad to troubleshoot any problem you might encounter!
Authors
Cypiripi has been brought to you by:
from the Lab for Computational Biology at Simon Fraser University.
Funding
- NSERC Discovery Grant
- Vanier Canada Graduate Fellowships
Licence
Copyright (c) 2014, 2015, Simon Fraser University, Indiana University Bloomington. All rights reserved. Redistribution and use in binary forms, without modification, is permitted provided that the following conditions are met:
- This software shall NOT BE USED in any commercial environment, unless explicitly allowed by the authors in writing.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- Neither the name of the Simon Fraser University, Indiana University Bloomington nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Release notes
- (15-Apr-2015) Cypiripi version 1.0 release
- Initial public release