Thursday 30 January 2014

FASTX-Toolkit

Direct sequencing of the whole DNA from a community of microorganisms using Next Generation Sequencing technologies produces short reads with quality values in FASTA/FASTQ files. These files can be processed easily with the FASTX-Toolkit, which contains several command line tools. You can download the toolkit from this link: http://hannonlab.cshl.edu/fastx_toolkit/download.html

**** FASTX-Toolkit Quick Installation ****

1. First download the libgtextutils version that goes with the version of fastx you are going to download

extract it using tar -xjf libgtextutils.tar.bz2
go into the directory using cd li......

./configure
make
make install

go back up one directory using cd ..

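2. Download fastx_toolkit.tar.bz2, then extract it and go into its directory as in step 1

before running ./configure, use this command so that pkg-config can find libgtextutils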

export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig:$PKG_CONFIG_PATH

./configure
make
make install
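
The export line above can be wrapped in a small helper so that the directory is not added twice and no stray colon is left when PKG_CONFIG_PATH starts out empty. This is only a sketch: prepend_pkgconfig_dir is a hypothetical name, not part of FASTX-Toolkit, and the final check assumes that pkg-config is installed and that libgtextutils registers itself under the module name gtextutils.

```shell
#!/bin/sh
# Hypothetical helper (not part of FASTX-Toolkit): prepend a directory to
# PKG_CONFIG_PATH unless it is already listed there.
prepend_pkgconfig_dir() {
    dir=$1
    case ":${PKG_CONFIG_PATH:-}:" in
        *":$dir:"*) ;;  # already present, nothing to do
        *) PKG_CONFIG_PATH="$dir${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}" ;;
    esac
    export PKG_CONFIG_PATH
}

# Same directory as in the export line of step 2.
prepend_pkgconfig_dir /usr/local/lib/pkgconfig
echo "PKG_CONFIG_PATH=$PKG_CONFIG_PATH"

# Optional check. Assumption: the pkg-config module is called "gtextutils".
if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists gtextutils; then
    echo "libgtextutils is visible to pkg-config"
else
    echo "libgtextutils is not visible to pkg-config (yet)"
fi
```

If the check reports that the library is not visible, ./configure for fastx_toolkit will most likely stop with the same complaint, so it is worth looking at the path before running it.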

Wednesday 29 January 2014

Linux/Unix for file handling

For any computational biologist, Linux/Unix is a beautiful operating system for data analysis. Here I discuss a few basic commands for file handling.

LINUX COMMANDS:
1. Commands for bundling and compressing files
     tar -cvf files.tar file1 file2 file3     (Bundle the named files, Output: files.tar)
     gzip files.tar                           (For compressing, Output: files.tar.gz)
     For super compression
     bzip2 files.tar                          (Output: files.tar.bz2)
     bzip2 file                               (Output: file.bz2)

2. Unzipping
     tar -zxvf files.tar.gz                   (Direct unzipping, Output: you will get all files)
     gunzip files.tar.gz                      (First step: unzip the gz file, Output: files.tar)
     tar -xvf files.tar                       (Second step: untar, Output: all files)
     For super compressed files
     bunzip2 files.tar.bz2                    (Output: files.tar, then tar -xvf to open the bundle)
     tar -xjvf files.tar.bz2                  (Direct, Output: all files)
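
To try the bundling and unzipping commands safely, the whole round trip can be run in a throwaway directory. The sketch below is only an illustration: the file names and their contents are made up, it assumes mktemp and a tar that supports -C are available, and the bzip2 part is skipped when bzip2 is not installed.

```shell
#!/bin/sh
# Work in a temporary directory so no real files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Three small example files (hypothetical names and contents).
printf 'ACGT\n' > file1
printf 'TTGA\n' > file2
printf 'GGCC\n' > file3

# Bundle, then compress: gzip replaces files.tar with files.tar.gz.
tar -cf files.tar file1 file2 file3
gzip files.tar

# Extract into a separate directory and compare with the originals.
mkdir out
tar -xzf files.tar.gz -C out
roundtrip=ok
for f in file1 file2 file3; do
    cmp -s "$f" "out/$f" || roundtrip=failed
done
echo "gzip round trip: $roundtrip"

# Same idea with bzip2 in one step, listing the bundle instead of extracting it.
if command -v bzip2 >/dev/null 2>&1; then
    tar -cjf files.tar.bz2 file1 file2 file3
    tar -tjf files.tar.bz2
fi
```

Using tar -t in place of tar -x lists what is inside a bundle without unpacking it, which is a handy check before extracting a large archive.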