I am trying to search for a particular keyword, "ME", in different combinations using a regex across the whole corpus (with i being the loop iterator up to the length of the corpus). As a result I get the ME qualifications found in the resumes in the corpus, for example:
1. ME
2. M.E in Electronics
3. M.E.
4. ME.-Computer Science
etc.
matchME <- regmatches(as.String(docs[[i]]), gregexpr("\\WM\\.?E\\.?(\\s|\\.|\\-|\\(|\\:|\\,)|((Master)|(MASTER))[sS]?\\s?(((of)|(OF)|(Of))|((in)|(IN)|(In)))\\s?((Engineering)|(ENGINEERING)|(Engg)|(engineering))", as.String(docs[[i]])))

However, in a lot of documents I am also getting results such as "Windows - ME" for people who have worked on that platform but do not hold the qualification. I want that particular set removed. I am trying to build the following regex so that it does not include "Windows- ME", "Windows ME" and similar combinations, but it doesn't seem to work:

[^(Windows)]\WM\.?E\.?(\s|\.|\-|\(|\:|\,)|((Master)|(MASTER))[sS]?\s?(((of)|(OF)|(Of))|((in)|(IN)|(In)))\s?((Engineering)|(ENGINEERING)|(Engg)|(engineering))
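
One likely reason the attempt fails: in a regex, `[^( ... )]` is a character class, so it matches a single character that is not one of the listed letters or parentheses; it does not mean "not preceded by the word Windows". A minimal sketch of one possible workaround follows, assuming it is acceptable to blank out the "Windows ME" variants first and then run a simplified form of the qualification pattern. The sample `docs` vector and the names `me_pattern` and `cleaned` are hypothetical, and swapping the leading `\W` for `\b` is my own choice, not part of the original pattern.

```r
# Hypothetical sample lines standing in for resume text (not the real corpus).
docs <- c(
  "Qualification: M.E in Electronics",
  "Worked on Windows - ME and Windows ME platforms",
  "Completed ME.-Computer Science",
  "Master of Engineering, 2010"
)

# Simplified form of the qualification pattern. Assumption: \b replaces the
# leading \W so the character before "M" is not part of the match, and (?i:...)
# replaces the hand-written case variants of "Master of/in Engineering".
me_pattern <- paste0(
  "\\bM\\.?E\\.?(\\s|\\.|-|\\(|:|,)",
  "|(?i:masters?\\s?(of|in)\\s?(engineering|engg))"
)

# Step 1: remove "Windows ME", "Windows- ME", "Windows - ME" and similar.
cleaned <- gsub("windows\\s*-?\\s*m\\.?e\\b", "", docs,
                ignore.case = TRUE, perl = TRUE)

# Step 2: run the qualification pattern on what is left.
matchME <- regmatches(cleaned, gregexpr(me_pattern, cleaned, perl = TRUE))
print(matchME)
```

With the sample lines above, the second element of `matchME` should come back empty while the other three keep their match. On a real corpus the same two steps could be applied to `as.String(docs[[i]])` inside the existing loop; whether the cleanup pattern covers every "Windows ME" spelling in the data is something to check against the documents.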