From Comaiwiki

Revision as of 15:50, 15 August 2010 by Root (Talk | contribs)


GC Plotter

Download it here: GC_plotter.

This program relies on Python libraries for numerical calculation and plotting that are not part of the standard Python distribution: you need to install Matplotlib and NumPy to run it. If this seems daunting, here is a possible fix: go to Enthought and download their Python distribution, which is free if you are in academia. Install the package and take the Matplotlib tutorial and any others that seem appropriate. Then you can run GC_plotter.

The program can be adapted to work under different settings, but to keep things simple the current version assumes that you have a directory (a folder) containing a single FASTA file with all the sequences you want to analyze. The name of the FASTA file should start with the characters fast& (example: fast&_my_sequences.txt). The program should be placed in the same directory. GC_plotter will find and open your FASTA file, extract the sequences, and produce one .svg figure file for each sequence, naming it after the FASTA header and saving it in the same folder. SVG images can be easily edited in a vector-graphics program such as Illustrator. You can also modify the program to output a different format, or to write the data to a text file.
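The file-handling steps described above can be sketched roughly as follows. This is not the code of GC_plotter itself, only a minimal illustration using the standard library; the function names find_fasta_file and parse_fasta are made up for this example, and the sketch assumes a plain FASTA layout in which each header line starts with ">".

```python
import os


def find_fasta_file(directory="."):
    """Return the path of the first file whose name starts with 'fast&'.

    Hypothetical helper: GC_plotter may locate its input differently.
    """
    for name in sorted(os.listdir(directory)):
        if name.startswith("fast&"):
            return os.path.join(directory, name)
    raise FileNotFoundError("no file starting with 'fast&' in " + directory)


def parse_fasta(lines):
    """Turn an iterable of FASTA lines into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            # Sequence lines are joined and upper-cased.
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

An open file can be passed directly to parse_fasta, since iterating over a file yields its lines. Each header could then serve as the base name of the output figure, although characters that are not allowed in file names would have to be replaced first.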

Below is an example of a plot using a window of 10. GC plotter takes windows of a predetermined size, recording the %GC as it advances in one-base steps through the sequence. To change the window size, open the program with a text editor and change the line with "window = 10" to the length you prefer. Save the file as .py, not .txt. The longer the window, the smoother the plot. One could also change the step of the window, for example to move forward in two-base increments, by adding a step argument to the following statement: "for base in range(len(seq)-window)", which would become "for base in range(0, len(seq)-window, 2)". See Range function. Changing window and step is recommended for longer sequences. I have used this program for relatively short sequences (~1000 bases) and have not experimented with long ones, such as whole chromosomes.

GC plot.jpg
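The sliding-window calculation with an adjustable window and step can be sketched as below. This is an illustration, not the program's actual code: the function name gc_percent_windows and its parameters are assumptions made for this example. Unlike the loop quoted above, the sketch also counts the last full window, so its range bound is len(seq) - window + 1.

```python
def gc_percent_windows(seq, window=10, step=1):
    """Return the %GC of each window of length `window`, moving `step` bases at a time.

    Hypothetical helper, not taken from GC_plotter.
    """
    seq = seq.upper()
    values = []
    # The last full window starts at len(seq) - window.
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        values.append(100.0 * gc / window)
    return values
```

The returned list can be passed to Matplotlib's plot function and written out with savefig using a file name that ends in .svg. A sequence shorter than the window yields an empty list.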

Back to Python page
