regex - Create clean URL from text in Excel -


i want create clean url text such one:

alpha tests' purchase of berta global associates (c)

the url should this:

alpha-tests-purchase-of-berta-global-associates-c

currently use formula in excel:

=lower(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(a38;"--";"-");" / ";"-");"  ";"-");": ";"-");" - ";"-");"_";"-");"?";"");",";"");".";"");"'";"");")";"");"(";"");":";"");" ";"-");"&";"and");"!";"");"/";"-");"""";"")) 

however, don't seem catch special symbols etc. , consequence urls not clean want them be.

do know excel formula or vba code, ensures special symbols converted clean url?

thank you.

i can suggest following function can put vba module , use normal formula:

function normalizetourl(cell range)  dim strpattern string dim regex object  set regex = createobject("vbscript.regexp") strpattern = "[^\w-]+"  regex     .global = true     .pattern = strpattern end  normalizetourl = lcase(regex.replace(replace(cell.value, " ", "-"), "")) end function 

enter image description here

the point replace spaces hyphens @ beginning, use regex matches non-word , non-hyphen characters , remove them regexp.replace.

update:

after comments, still unclear want unicode letters. delete or replace hyphen. here function tried rebuild formula, logics may flawed. prefer generic approach above.

function normalizetourl(cell range)  dim strpattern string dim regex object  set regex = createobject("vbscript.regexp") strpattern = "[^\w -]"  regex     .global = true     .pattern = "[?,.')(:!""]+" ' these removed end  normalizetourl = regex.replace(cell.value, "") normalizetourl = replace(normalizetourl, "&", "and") ' & turns "and"  regex     .global = true     .pattern = strpattern ' replace non-word chars hyphen end normalizetourl = lcase(regex.replace(replace(normalizetourl, " ", "-"), "-")) regex     .global = true     .pattern = "--+" ' shrink hyphen sequences single hyphen end normalizetourl = regex.replace(normalizetourl, "-") end function 

Comments