C# Regex repeating match between 2 literals


I extract a string from a PDF, and from that string I need a list of tracking numbers.

My extracted string looks like this, where "more text" stands for the rest of the extracted string:

more text...__freight: 0.00__sales tax: 0.00 __602256510000; 602256510002; 602256500001; tracking...more text

I locate the tracking numbers in the string by matching on "tracking". Here is the regex:

    ((?<trackingnumber>[a-zA-Z0-9]+);\s)+tracking

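Here's the problem:
After execution, the group "trackingnumber" contains only the last tracking number. As stated above, I need the group "trackingnumber" to have 3 matches, one for each tracking number (without the trailing ";" or space).

The way this is done in .NET is to use capture collections: a repeated group keeps every value it matched in its Captures collection, while the group's own value is just the last one.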


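Edit: note that you may want to make the tracking characters optional,
[a-zA-Z0-9]*, in case there is a missing/blank number mid-stream,
so that it continues capturing. For two adjacent semicolons to match, the \s after the ";" has to become optional as well (\s*).
(Example: 602256510000; 602256510002;; 602256500001; tracking)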
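A minimal sketch of that variation, not part of the original answer. It assumes the whitespace after the semicolon is relaxed to `\s*` (without that, the adjacent `;;` in the example cannot match), and `BlankEntryDemo` / `ExtractCaptures` are hypothetical names chosen for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BlankEntryDemo
{
    // Hypothetical helper: returns every capture of the "trackingnumber" group,
    // including empty captures produced by blank entries.
    public static List<string> ExtractCaptures(string input)
    {
        // '*' lets an entry be empty; '\s*' is an assumption so that ";;" can match.
        var rx = new Regex(@"(?:(?<trackingnumber>[a-zA-Z0-9]*);\s*)+tracking");
        var result = new List<string>();

        Match m = rx.Match(input);
        if (m.Success)
        {
            foreach (Capture c in m.Groups["trackingnumber"].Captures)
                result.Add(c.Value);
        }
        return result;
    }

    public static void Main()
    {
        var found = ExtractCaptures("602256510000; 602256510002;; 602256500001; tracking");
        for (int i = 0; i < found.Count; i++)
            Console.WriteLine("[{0}] = '{1}'", i, found[i]);
    }
}
```

With this pattern the blank entry shows up as an empty capture, so the positions of the remaining numbers are preserved; filter out empty strings afterwards if only real tracking numbers are wanted.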


    # (?:(?<trackingnumber>[a-zA-Z0-9]+);\s)+tracking

    (?:
         (?<trackingnumber> [a-zA-Z0-9]+ )   # (1)
         ; \s
    )+
    tracking

C#:

    using System;
    using System.Text.RegularExpressions;

    string pdf = "__602256510000; 602256510002; 602256500001; tracking ";
    Regex rxtrack = new Regex(@"(?:(?<trackingnumber>[a-zA-Z0-9]+);\s)+tracking");

    Match trackmatch = rxtrack.Match( pdf );
    if ( trackmatch.Success )
    {
        // Captures holds one entry per iteration of the repeated group
        CaptureCollection cc = trackmatch.Groups["trackingnumber"].Captures;
        for (int i = 0; i < cc.Count; i++)
            Console.WriteLine("[{0}] = {1}", i, cc[i].Value);
    }

Output:

    [0] = 602256510000
    [1] = 602256510002
    [2] = 602256500001
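To tie this back to the question: in .NET a group's `Value` reflects only its last capture, which is why the original attempt appeared to return a single number, while `Captures` keeps one entry per iteration. A small sketch of the contrast, where `LastCaptureDemo`, `LastValue` and `AllValues` are hypothetical names used only for illustration:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LastCaptureDemo
{
    private static readonly Regex RxTrack =
        new Regex(@"(?:(?<trackingnumber>[a-zA-Z0-9]+);\s)+tracking");

    // Group.Value exposes only the final capture of a repeated group.
    public static string LastValue(string input) =>
        RxTrack.Match(input).Groups["trackingnumber"].Value;

    // Group.Captures exposes one Capture per iteration of the repeated group.
    public static string[] AllValues(string input) =>
        RxTrack.Match(input).Groups["trackingnumber"].Captures
               .Cast<Capture>()
               .Select(c => c.Value)
               .ToArray();

    public static void Main()
    {
        string pdf = "__602256510000; 602256510002; 602256500001; tracking ";
        Console.WriteLine(LastValue(pdf));
        Console.WriteLine(string.Join(", ", AllValues(pdf)));
    }
}
```

If the input does not match, `LastValue` returns an empty string and `AllValues` returns an empty array, so callers that care about the difference should check `Match.Success` first.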
