Perl - Iterate through files in a directory and count the frequency of a string


I'm trying to write a Perl program that iterates through the files in a given directory and identifies the number of times a specific string is present in each of the files.

It's combing through DNA sequences looking for the frequency of ATG on the forward strand or on the reverse complement, depending on the direction of the sequence I have. I know the sequences contain at least one ATG or CAT (the reverse complement of ATG), and many contain more, but in the output file it's giving me 0 or 1. Suggestions?

P.S. Ignore the unnecessary variables; I'm editing an already written script.

Here's the code:

    #!/usr/bin/perl

    @file = <*.fasta>;
    foreach $file (@file) {

        $get_file = <../[es]rr*/> or print "could not find";
        $check    = substr($file, 0, 9);
        $filename = substr($get_file, 3, 20);

        $pattern_reverse = 'ccattttgtccaa[ac]c';
        $pattern         = 'g[gt]ttggacaaaatgg';
        $forward_start   = 'atg';
        $reverse_start   = 'cat';

        open(data, $file) or die("couldn't open file.");

        my $contig_name;
        my $not_found_mark;
        my $position;
        my $symbol = ">";
        my $contig_string;
        my $contig_length;

        # The first line is the FASTA header; the following lines are the sequence.
        $contig_name    = <data>;
        $not_found_mark = 1;
        $contig_string  = "";

        # Collect sequence lines until the next ">" header is reached.
        while ((my $line = <data>) && ($not_found_mark)) {
            chop($line);
            $position = index($line, $symbol);
            if ($position < 0) {
                $contig_string .= $line;
            } else {
                $not_found_mark = 0;
            }
        }

        print "$filename \n";
        $contig_length = length $contig_string;
        print "the contig is $contig_length characters. \n";

        if ($contig_string =~ /($pattern)/) {
            print "found forward pattern.\n";
            if ($contig_string =~ /(atg)/) {
                $atg_count = 0;
                $atg_count++;
                open(match, ">>", atg_match) or die "could not open atg_match";
                print match ">$filename $check $atg_count \n"
                    or die "could not append.";
                print "$atg_count \n";
            }
        }
        elsif ($contig_string =~ /($pattern_reverse)/) {
            print "found reverse pattern.\n";
            if ($contig_string =~ /(cat)/) {
                $atg_count = 0;
                $atg_count++;
                open(match, ">>", atg_match) or die "could not open atg_match";
                print match ">$filename $check $atg_count \n"
                    or die "could not append.";
                print "$atg_count \n";
            }
        }
        else {
            print "$file \n";
            print "did not find pattern. \n";
            open(nomatch, ">>", no_atg_match) or die "could not open";
            print nomatch ">$filename $check\n" or die "could not append";
        }
    }
    print("there are $atg_count atg's \n");
    close(match);
    close(nomatch);
    close(data);

Any suggestions?

It looks like you're only ever setting the count to 1, because of these two lines:

    $atg_count = 0;
    $atg_count++;

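Given that you're using ++, I'm guessing that's not what you need to do.

Declaring my $atg_count = 0; near the top of the script is all you need to initialise it; thereafter, just increment it with ++. Note also that the if tests only check whether the contig contains atg (or cat) at least once, so the increment runs at most once per file however many occurrences there are. (While you're at it, is there a reason you're not beginning with use strict; use warnings;?)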
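For illustration, here is a minimal sketch of counting every occurrence per file instead of stopping at the first. The sub names count_motif and read_first_contig are made up for this example and are not from your script, and it leaves out your forward/reverse pattern check and the output files. It also assumes you want a case-insensitive match, since I can't tell from the question whether the sequences are upper or lower case.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count non-overlapping occurrences of a literal motif in a sequence.
# The "= () =" idiom runs the global match in list context and then
# takes the number of matches. The /i flag is an assumption (see above).
sub count_motif {
    my ($sequence, $motif) = @_;
    my $count = () = $sequence =~ /\Q$motif\E/gi;
    return $count;
}

# Return the sequence of the first record in a FASTA file as one string.
sub read_first_contig {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Couldn't open $path: $!";
    my $header   = <$fh>;    # skip the ">" header line
    my $sequence = '';
    while (my $line = <$fh>) {
        chomp $line;
        last if $line =~ /^>/;    # stop at the next record
        $sequence .= $line;
    }
    close $fh;
    return $sequence;
}

# Report both counts for every FASTA file in the current directory.
foreach my $file (glob '*.fasta') {
    my $contig  = read_first_contig($file);
    my $forward = count_motif($contig, 'atg');
    my $reverse = count_motif($contig, 'cat');
    print "$file: $forward atg, $reverse cat\n";
}
```

The counter is a fresh lexical inside the sub, so nothing carries over from one file to the next, and the /g match is what makes it count all the hits rather than just reporting that there is one.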

You say that

"I'm editing an already written script"

Why? This appears to be a simple task, and it would be easier to start again and write code that you want and understand than to try to make someone else's code do what you want.

