Using Cygwin, how do I group by one column and count occurrences of each item in an array from another column?


For example:
20150401 a,c,r,ab,cd,ef,ee,ff
20150401 a,c,ef,ff,g
20150401 a,bb,c,ef,fg
20150401 r,ab,cd,ef,g
20150401 r,c,ef,ee,gg
20150402 a,c,ef,ff,g
20150402 d,dd,cd,ff,gg,ab,ee,ee
20150403 r,r,cd,ef,g,ee
20150403 a,c,ef,ff,g
20150403 d,cd,ff,ee,g,gg
20150403 f,ef,g,ee,c,ab

How can I count how many times each item occurs on each date without specifying each item? Ideally, the output would give me a list of how many times "a" occurred on 20150401, 20150402, and 20150403, then the number of occurrences of "c" on 20150401, 20150402, and 20150403, etc.

Perl to the rescue!

Save the following as count.pl:

#! /usr/bin/perl
use warnings;
use strict;

my %table;
while (<>) {                          # Read the input line by line.
    my ($date, $list) = split;        # Split on whitespace.
    my @items = split /,/, $list;     # Split the list on commas.
    $table{$_}{$date}++ for @items;   # Record the occurrences.
}

for my $item (sort keys %table) {                  # Iterate over items.
    for my $date (sort keys %{ $table{$item} }) {  # Iterate over dates.
        print "$item $date $table{$item}{$date}\n";
    }
}

Then run:

perl count.pl input-file 
