Last Updated: February 25, 2016
· wkjagt

Remove anything from HTML

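Removing script tags or comments from HTML with regular expressions is a bad idea: patterns tend to break on attributes, mixed-case tag names, and nested or malformed markup. It is better to parse the markup with PHP's DOMDocument and remove the unwanted nodes with an XPath query.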
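To illustrate the problem with regular expressions, here is a minimal sketch. The input string and the naive pattern are both made up for this example; they are not taken from any real code base.

```php
<?php
// Hypothetical input: the script tag has an uppercase name and an attribute.
$html = '<p>Hi</p><SCRIPT type="text/javascript">alert(1)</SCRIPT>';

// A naive pattern (assumed for illustration) that only matches a bare, lowercase <script> tag.
$naive = preg_replace('#<script>.*?</script>#s', '', $html);

// The pattern never matches, so the script is still in the output.
var_dump($naive === $html); // bool(true)
```

The pattern can be patched for this particular input, but each patch tends to leave another variation of the markup uncovered.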

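<?php
function removeDomNodes($html, $xpathString)
{
    // Collect libxml parse warnings (malformed or HTML5 markup) instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query($xpathString);
    if ($nodes === false) {
        throw new InvalidArgumentException("Invalid XPath expression: $xpathString");
    }

    // Query once, then remove every match that has a parent to detach it from.
    foreach (iterator_to_array($nodes) as $node) {
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
        }
    }
    return $dom->saveHTML();
}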

For example, to remove all comments from an HTML string, pass an XPath expression that selects comment nodes:

$html = removeDomNodes($html, '//comment()');

Or to remove all script tags:

$html = removeDomNodes($html, '//script');
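Both removals can also be done in one call. The sketch below is self-contained, so it carries its own compact variant of removeDomNodes(); the sample markup is hypothetical. It relies on `|`, the XPath union operator, to select comments and script elements with a single expression.

```php
<?php
// Compact variant of removeDomNodes() so this snippet runs on its own.
function removeDomNodes($html, $xpathString)
{
    $previous = libxml_use_internal_errors(true); // keep parse warnings quiet
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Snapshot the matches first, then detach each one from its parent.
    foreach (iterator_to_array($xpath->query($xpathString)) as $node) {
        $node->parentNode->removeChild($node);
    }
    return $dom->saveHTML();
}

// Hypothetical sample input with one comment and one script element.
$html = '<html><body><!-- note --><p>Hello</p>'
      . '<script type="text/javascript">alert(1)</script></body></html>';

// "|" is the XPath union operator: match comments and script elements together.
$clean = removeDomNodes($html, '//comment() | //script');

echo $clean;
```

Keep in mind that saveHTML() serializes the whole document, so the result normally starts with a doctype and is wrapped in html and body elements, even if the input was only a fragment.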