I extract a string from a PDF, and from that string I need a list of tracking numbers.
My extracted string looks like this, where "more text" stands for the rest of the extracted string:
more text...__freight: 0.00__sales tax: 0.00 __602256510000; 602256510002; 602256500001; tracking...more text
I locate the tracking numbers in the string by matching on the word "tracking". Here is my regex:

    ((?<trackingnumber>[a-zA-Z0-9]+);\s)+tracking

Here's the problem: after execution, the group "trackingnumber" contains only the last tracking number. As stated above, I need the group "trackingnumber" to have 3 matches, one for each tracking number (without the trailing ";" or space).
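The behaviour described above can be reproduced with a small sketch, assuming the pattern is run with .NET's System.Text.RegularExpressions. The class and method names (LastCaptureDemo, GroupValue) are made up for illustration and are not part of the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LastCaptureDemo
{
    // Hypothetical helper: returns what Group.Value reports for the named group.
    public static string GroupValue(string text)
    {
        Match m = Regex.Match(text, @"((?<trackingnumber>[a-zA-Z0-9]+);\s)+tracking");
        // Group.Value exposes only the most recent capture of a repeated group.
        return m.Success ? m.Groups["trackingnumber"].Value : string.Empty;
    }

    public static void Main()
    {
        string pdf = "__602256510000; 602256510002; 602256500001; tracking";
        Console.WriteLine(GroupValue(pdf)); // prints 602256500001
    }
}
```

Reading the group's value directly yields only the final number, because each repetition of the group overwrites the previous value.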
The way this is done in .NET is to use capture collections.
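Edit: note that you may want to make the tracking characters optional, i.e. [a-zA-Z0-9]*, in case there is a missing or blank number mid-stream. That way the regex continues capturing past the gap.
(Example: 602256510000; 602256510002;; 602256500001; tracking)
If the blank entry shows up as ";;" with no whitespace in between, as in this example, the \s after the semicolon would also have to become \s*.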
    # (?:(?<trackingnumber>[a-zA-Z0-9]+);\s)+tracking

    (?:
         (?<trackingnumber> [a-zA-Z0-9]+ )   # (1)
         ;
         \s
    )+
    tracking

C#:

    using System;
    using System.Text.RegularExpressions;

    string pdf = "__602256510000; 602256510002; 602256500001; tracking ";
    Regex rxTrack = new Regex(@"(?:(?<trackingnumber>[a-zA-Z0-9]+);\s)+tracking");
    Match trackMatch = rxTrack.Match(pdf);
    if (trackMatch.Success)
    {
        // Captures holds every substring the group matched, not just the last one.
        CaptureCollection cc = trackMatch.Groups["trackingnumber"].Captures;
        for (int i = 0; i < cc.Count; i++)
            Console.WriteLine("[{0}] = {1}", i, cc[i].Value);
    }

Output:

    [0] = 602256510000
    [1] = 602256510002
    [2] = 602256500001
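The optional-character variant from the edit note can be sketched the same way. This is only a sketch under stated assumptions: the names TrackingDemo and ExtractTrackingNumbers are hypothetical, blank entries are assumed to be unwanted and are skipped, and \s is relaxed to \s* so that a ";;" with no space in between still matches.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TrackingDemo
{
    // Hypothetical helper: returns every non-empty capture of the trackingnumber group.
    public static List<string> ExtractTrackingNumbers(string text)
    {
        // '*' allows an empty entry; '\s*' (an assumption, see above) tolerates ";;".
        Regex rxTrack = new Regex(@"(?:(?<trackingnumber>[a-zA-Z0-9]*);\s*)+tracking");
        Match m = rxTrack.Match(text);

        var numbers = new List<string>();
        if (!m.Success)
            return numbers;

        foreach (Capture c in m.Groups["trackingnumber"].Captures)
        {
            // A blank entry produces a zero-length capture; drop it.
            if (c.Length > 0)
                numbers.Add(c.Value);
        }
        return numbers;
    }

    public static void Main()
    {
        string pdf = "602256510000; 602256510002;; 602256500001; tracking";
        foreach (string number in ExtractTrackingNumbers(pdf))
            Console.WriteLine(number);
    }
}
```

With the example string from the edit note, the capture collection holds four entries, one of them empty, and the helper returns the three real tracking numbers. If the position of a blank entry matters, keep the empty captures instead of filtering them.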