I have the following directory structure in HDFS:

/dir1/dir2/dir3/2011/01/01/*
/dir1/dir2/dir3/2011/01/02/*
..

I have done the following to read the files, or at least I assume this is what reads them:

val data = sc.textFile("/dir1/dir2/dir3/2011/**/**")

I want to make sure that I have read all the data under 2011 (all months and dates), so I thought that checking the size of the RDD would give me an idea.
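That would be count; see the Spark RDD docs. Note that count returns the number of elements in the RDD, which for an RDD created with textFile is the number of lines, not the size in bytes.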
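As a rough sketch of how this could look, assuming a spark-shell session where `sc` is already defined: the first part calls count on the RDD from the question, and the second part lists what the glob actually matched using the Hadoop FileSystem API, which may be a more direct check that every month and date was picked up. The names `numLines`, `fs`, and `matched` are only illustrative.

```scala
import org.apache.hadoop.fs.{FileSystem, Path}

// Assumption: running inside spark-shell, so `sc` (the SparkContext) exists.
val pattern = "/dir1/dir2/dir3/2011/**/**"
val data = sc.textFile(pattern)

// count() is an action: it triggers the read and returns the number of lines.
val numLines: Long = data.count()
println(s"Lines read: $numLines")

// Optional cross-check: list the paths the glob pattern resolves to.
val fs = FileSystem.get(sc.hadoopConfiguration)
val matched = Option(fs.globStatus(new Path(pattern))).getOrElse(Array.empty)
matched.foreach(status => println(status.getPath))
```

The count alone only tells you how many lines were read, so comparing the listed paths against the months and dates you expect is probably the more reliable way to confirm nothing was missed.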