From Comaiwiki

Revision as of 15:56, 2 May 2017 by Meric (Talk | contribs)

The indel-allelic-ratio program is part of the mutation genotyping pipeline of the Creation and Genomic Analysis of Irradiation Hybrids in Populus protocol.
Several earlier steps of that protocol must be completed before this one, including running CallAllelesAB and creating a region of interest file. Details can be found in the protocol here.

The programs CallAllelesAB.py and indel-allelic-ratio.py can be downloaded here.

This script combines the indel information obtained in Creation and Genomic Analysis of Irradiation Hybrids in Populus Basic Protocol 3 with the allelic calls obtained with the CallAllelesAB.py script to calculate, for each sample, the mean allelic ratio across each indel mutation and across the rest of the genome. The script outputs the following: the number of SNP alleles available in each indel, the percentage of alleles from parent A within each indel, the number of SNP alleles present outside of all indels for that sample, and the corresponding percentage of alleles from parent A.
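The calculation described above can be illustrated with a minimal sketch. This is not the code of indel-allelic-ratio.py: the in-memory records (chromosome, position, parent) and (chromosome, start, end), the inclusive coordinates, and the function names are all assumptions made for illustration, and the actual file formats are defined by the protocol.

```python
def percent_a(calls):
    """Return (number of calls, percent from parent A); percent is None if no calls."""
    n = len(calls)
    if n == 0:
        return 0, None
    return n, 100.0 * calls.count("A") / n


def allelic_ratios(snps, indels):
    """Summarize parent A percentage inside each indel and outside all indels.

    snps:   list of (chrom, pos, parent) with parent "A" or "B" (hypothetical layout)
    indels: list of (chrom, start, end), coordinates assumed inclusive
    """
    def inside(snp, indel):
        return snp[0] == indel[0] and indel[1] <= snp[1] <= indel[2]

    # SNPs that fall in none of the sample's indels form the genome-wide background.
    outside = [s[2] for s in snps if not any(inside(s, i) for i in indels)]
    out_n, out_pct = percent_a(outside)

    rows = []
    for indel in indels:
        within = [s[2] for s in snps if inside(s, indel)]
        in_n, in_pct = percent_a(within)
        rows.append((indel, in_n, in_pct, out_n, out_pct))
    return rows
```

Because the background values are computed once per sample, every row for that sample carries the same last two values.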

The parameters available for indel-allelic-ratio.py:
-a STR: Path to the output file generated by CallAllelesAB.py.
-l STR: Path to the region of interest file.
-o STR: Name of the output file.

The following is an example command line to run indel-allelic-ratio.py using the allele file AB-alleles.txt, the indel (region of interest) file AB-indels.txt, and the output file AB-output-geno.txt:

python2.6 /path-to-allelic-ratio-script/indel-allelic-ratio.py -a AB-alleles.txt -l AB-indels.txt -o AB-output-geno.txt
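When several samples need to be processed, the same command can be assembled from a script. The sketch below is only one way to do this; the helper names and the prefix-based file naming (prefix-alleles.txt, prefix-indels.txt, prefix-output-geno.txt) are assumptions modeled on the example above, not part of the protocol.

```python
import subprocess


def build_command(script, alleles, regions, output, interpreter="python2.6"):
    """Return the argument list for one indel-allelic-ratio.py run."""
    return [interpreter, script, "-a", alleles, "-l", regions, "-o", output]


def run_sample(prefix, script):
    """Run the script for one sample, assuming prefix-based file names (hypothetical)."""
    cmd = build_command(
        script,
        prefix + "-alleles.txt",
        prefix + "-indels.txt",
        prefix + "-output-geno.txt",
    )
    # check_call raises CalledProcessError if the script exits with an error.
    subprocess.check_call(cmd)
    return cmd
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.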

The output file AB-output-geno.txt is a tab-delimited text file with one line per indel mutation. The columns are defined as follows:
1-4: Correspond to columns 1 through 4 of the input indel file.
5: Total number of called SNPs in the sample that lie within the indel.
6: Percent SNPs within the indel mutation that derive from parent A.
7: Total number of called SNPs in the sample that lie outside of indel mutations.
8: Percent SNPs outside of all indel mutations that derive from parent A.

If a sample contains multiple indel mutations, the values in the last two columns will be the same for each line of that sample.
The output file AB-output-geno.txt can be used to evaluate the parental origin of each indel mutation as well as the ploidy of each sample.
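As a starting point for such an evaluation, the output file can be read back and the within-indel percentage (column 6) compared with the genome-wide percentage (column 8). The sketch below assumes the eight-column layout described above; the field names are hypothetical, and because this page does not say whether the file has a header line or how a missing percentage is written, lines whose count columns are not integers are skipped and unparsable percentages become None. How to interpret the difference is left to the protocol.

```python
import csv


def _to_float(text):
    """Convert a percentage field to float, or None if it is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def read_geno(handle):
    """Parse tab-delimited output lines into a list of dictionaries."""
    records = []
    for fields in csv.reader(handle, delimiter="\t"):
        if len(fields) < 8:
            continue  # blank or incomplete line
        try:
            snps_in = int(fields[4])
            snps_out = int(fields[6])
        except ValueError:
            continue  # possibly a header line (assumption)
        pct_in = _to_float(fields[5])
        pct_out = _to_float(fields[7])
        if pct_in is None or pct_out is None:
            diff = None
        else:
            diff = pct_in - pct_out
        records.append({
            "indel": fields[:4],         # columns 1-4, copied from the indel file
            "snps_in": snps_in,          # column 5
            "pct_a_in": pct_in,          # column 6
            "snps_out": snps_out,        # column 7
            "pct_a_out": pct_out,        # column 8
            "pct_a_diff": diff,          # column 6 minus column 8
        })
    return records
```

A typical call would be `with open("AB-output-geno.txt") as handle: records = read_geno(handle)`.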
