PHP preg_replace_callback: remove HTML comments except inside <script> tags


I want to remove HTML comments using preg_replace_callback, but I want to keep the comments inside <script> elements, e.g.:

    <script> <!-- keep me --></script>

My code:

    $str = '     <script>     <!-- keep1 -->     keep </script> <!-- del me1 --> <body> <script> <!-- keep2 --></script> <!-- del me2 --> <script><!-- keep3 --></script> </body><!-- del me 3 -->';

    $str = preg_replace_callback(
        '/(<([^script]\/?)(\w|\d|\n|\r|\v)>)*((.*(<?!--.*-->)|(\w|\d|\n|\r|\v)*)+)(<\/?[^script](\w|\d)*>)/s',
        function($matches) {
            print_r($matches);
            return preg_replace('/<!--.*?-->/s', ' ', $matches[2]);
        },
        $str
    );

Technically, "HTML comments" between script tags are no longer HTML comments. If you use the DOM approach, these comments are not selected:

    $dom = new DOMDocument;
    $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    $xp = new DOMXPath($dom);
    // Select every comment node; text inside <script> is not a comment node.
    $comments = $xp->query('//comment()');

    foreach ($comments as $comment) {
        $comment->parentNode->removeChild($comment);
    }

    $result = $dom->saveHTML();
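A minimal sketch of how this behaves on markup similar to the question's. The helper name removeHtmlComments and the sample document are illustrative assumptions, not part of the original answer:

```php
<?php
// Hypothetical helper: removes every HTML comment node from a document.
function removeHtmlComments(string $html): string
{
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument;
    $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    // Comment-like text inside <script> is plain text, so it is not selected.
    $xp = new DOMXPath($dom);
    foreach ($xp->query('//comment()') as $comment) {
        $comment->parentNode->removeChild($comment);
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $dom->saveHTML();
}

// Sample input (assumed): two script "comments" to keep, two real comments to drop.
$html = '<html><head><script><!-- keep1 --> var a = 1;</script></head>'
      . '<body><!-- del me1 --><p>text</p>'
      . '<script><!-- keep2 --></script><!-- del me2 --></body></html>';

$result = removeHtmlComments($html);
echo $result;
```

The sample uses a complete document with an explicit html root so that LIBXML_HTML_NOIMPLIED has a single root element to work with.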

About conditional comments:

If you want to preserve conditional comments, you need to check the beginning of the comment. You can do this in two ways.

The first way is to check the comment in the foreach loop and, when the test is negative, remove the node.
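A minimal sketch of this first way, assuming a plain prefix test on the comment text. The helper names looksConditional and removeOrdinaryComments, and the sample markup, are hypothetical and only serve the illustration:

```php
<?php
// Hypothetical test: does the comment text start like a conditional comment?
function looksConditional(string $text): bool
{
    return strncmp($text, '[if', 3) === 0
        || strncmp($text, '<![endif]', 9) === 0;
}

// Hypothetical helper: removes comments unless they look conditional.
function removeOrdinaryComments(string $html): string
{
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument;
    $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    $xp = new DOMXPath($dom);
    foreach ($xp->query('//comment()') as $comment) {
        // The check happens inside the loop: a negative test removes the node.
        if (!looksConditional($comment->nodeValue)) {
            $comment->parentNode->removeChild($comment);
        }
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $dom->saveHTML();
}

// Sample input (assumed): one conditional comment, one ordinary comment.
$html = '<html><head><!--[if IE]><link href="ie.css" rel="stylesheet"><![endif]--></head>'
      . '<body><!-- del me --><p>x</p></body></html>';

$result = removeOrdinaryComments($html);
echo $result;
```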

But since you use the XPath way (which consists of selecting what you want once and for all), to follow the same logic you can change the XPath query to:

    //comment()[not(starts-with(., "[if") or starts-with(., "<![endif]"))]

The content between square brackets is called a "predicate" (a condition on the current element), and the dot represents the current element or its text content (depending on the context).

However, even if this works most of the time, the slightest leading space makes it fail. You need something more flexible than starts-with.
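To illustrate the failure mode, here is a small sketch with two comments that differ only by a leading space. The sample markup is an assumption made for the demonstration:

```php
<?php
libxml_use_internal_errors(true);

// Two closing markers; the second one has a leading space inside the comment.
$html = '<html><body><!--<![endif]--><!-- <![endif]--></body></html>';

$dom = new DOMDocument;
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

$xp = new DOMXPath($dom);

// Number of comment nodes in the document.
$all = $xp->query('//comment()')->length;

// Number of comment nodes whose text starts exactly with "<![endif]".
$matched = $xp->query('//comment()[starts-with(., "<![endif]")]')->length;

printf("comments: %d, matched by starts-with: %d\n", $all, $matched);
```

The comment with the leading space is not matched by starts-with, so a query built on it would treat that comment as an ordinary one.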

It is possible to register your own PHP function to be used in the XPath query, like this:

    // Receives an array of nodes from XPath; tests the text of the first one.
    function isConditionalComment($commentNode) {
        return preg_match('~\A(?:\[if\s|\s*<!\[endif])~', $commentNode[0]->nodeValue);
    }

    $dom = new DOMDocument;
    $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    $xp = new DOMXPath($dom);

    // Make the PHP function callable from XPath as php:function(...).
    $xp->registerNamespace('php', 'http://php.net/xpath');
    $xp->registerPHPFunctions('isConditionalComment');

    $comments = $xp->query('//comment()[not(php:function("isConditionalComment", .))]');

    foreach ($comments as $comment) {
        $comment->parentNode->removeChild($comment);
    }

Note: DOMDocument doesn't support the default Microsoft syntax (the one nobody uses), which is not an HTML comment:

    <![if !IE]>
    <link href="non-ie.css" rel="stylesheet">
    <![endif]>

This syntax causes a warning (since it is not HTML), and the "tag" is ignored and disappears from the DOM tree.

