An Example of How To Remove Empty HTML Tags

One of the most tedious aspects of building WordPress themes is customizing and styling the comments template. This includes not only the comment form and the pingbacks, but the response text, as well.

Don’t get me wrong: It could be worse, and after you’ve done it a few times, it’s likely that you’re going to use many of the same strategies that you’ve used in previous themes or templates.

But there are cases in which certain elements will render as empty HTML tags. If you’ve given those tags a specific style (a background color, say), the result can be an ugly experience for your readers.

The challenge, then, is removing the empty elements before the user can see them. But there’s a catch: It can’t always be done on the server side, because the server sees the HTML as it was written, whereas browsers take the liberty of parsing the document and adjusting the markup so that it’s a bit more correct.

At least that’s what most of them try to do.

Anyway, this can cause some unintended side-effects.

Remove Empty HTML Tags

Take a look at the following screenshot and you’ll notice two things:

  1. Code blocks that are set with a different type and a different background
  2. An empty code block in the middle of the two lines that creates a single gray square

An empty code element.

This isn’t an uncommon problem, and it’s exactly a symptom of the issue mentioned at the start of the article. Basically, the code that’s written and saved on the server is fine.

Odds are, it looks something (give or take some wrapping elements) like this:

But when it’s rendered, the browser ends up parsing the file like this:

Whether or not you opt to blame the browser is one thing, but browsers have come a long way in the past few years and do a pretty good job of taking what appears to be broken HTML and building the DOM in such a way that it’s more correct.

It just so happens that in this case, it results in an empty code block as an unintended side effect.

If you’re working with WordPress or some other server-side platform or templating system, then you may try to resolve this on the server side prior to serving it up to the browser, but there’s a catch: The DOM isn’t constructed until the markup gets to the browser, so the HTML that the server sees is different from what the client will eventually see.

To that end, a server-side solution may not be feasible. In that case, you’re left with needing to address it with JavaScript.

Now there are a number of ways that this can be done, and I’ve opted for the one that I believe to be the most readable and the easiest to explain (for the sake of this article). So if you need to remove an empty element (similar to what you see above), then check out this gist:

The code is making a few assumptions, but here’s what it’s doing:

  1. It looks for all of the code elements that are descendants of any elements with the comment class. In practical terms, perhaps this is a comment container.
  2. It then trims the current code element’s text.
  3. If it notices that it’s empty, it removes it from the DOM.
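
For reference, the three steps above can be sketched roughly as follows. This is only an illustration under a few assumptions: it uses plain DOM methods instead of jQuery so that it runs without a library, the function name removeEmptyElements is hypothetical, and the .comment code selector simply mirrors the assumption described in step 1.

```javascript
// Hypothetical helper (not the original gist): removes every descendant of
// `root` that matches `selector` and has no text once whitespace is trimmed.
// Returns the number of elements that were removed.
function removeEmptyElements(root, selector) {
  var removedCount = 0;

  // Step 1: find the candidate elements, e.g. code elements inside a comment.
  root.querySelectorAll(selector).forEach(function (element) {
    // Step 2: trim the element's text so that lone new lines don't count.
    var text = element.textContent.trim();

    // Step 3: if nothing is left, take the element out of the DOM.
    if (text === '') {
      element.remove();
      removedCount += 1;
    }
  });

  return removedCount;
}

// In a browser, run it once the DOM has been built. The guard only exists so
// that the file can also be loaded outside of a browser.
if (typeof document !== 'undefined') {
  document.addEventListener('DOMContentLoaded', function () {
    removeEmptyElements(document, '.comment code');
  });
}
```

Passing the root and the selector in as arguments keeps the helper reusable for other elements, such as empty paragraphs.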

Easy enough, right?

I know, I know. Doing this kind of work on the client side feels weird and even a little hacky. It should be able to be done on the server side, and in most cases it is.

The problem, as mentioned, is that this is a case in which the HTML is restructured on the client so it calls for a slightly different solution.

So, what does the final result look like?

Removed Empty Code Tag

Much better, isn’t it?

Server Side Resources

If you are interested in attempting to solve this problem on the server side, then take a look at the following (and props to Gary Jones for talking with me about some of this, as well):

  1. PHP DOMDocument
  2. Simple HTML DOM

Perhaps I’ll revisit this topic in a future post.

10 Comments

Hi Tom,
Kinda unrelated to the blog topic, but I’ve been paying attention more lately to some of the differences in jQuery syntax and I’m confused about one thing:

What is the difference between:


( function( $ ) {
// blah blah blah
})( jQuery );

And


( function ( $ ) {
// blah blah blah
}( jQuery ) );

The parentheses are different at the end.
Also, do these need to be wrapped in a document.ready function or does this only run once the document is ready?

    The difference between the two is where the parentheses that invoke the anonymous function are placed.

    The second example is my preferred form because the invocation sits inside the wrapping parentheses, so the whole expression reads as one closed unit. Functionally, though, the two are equivalent: both immediately invoke the anonymous function and pass jQuery to it as $. Neither one waits for the document to be ready.

    If you’re going to go the route of including document ready, then you’d do so in the context of the anonymous function like this:

    (function( $ ) {
        $(function() {
            // ...
        });
    }( jQuery ));

Parsing HTML server side with PHP can be a nightmare. The best solution that I’ve found is QueryPath and this HTML parser.

The default xml or html parser for QueryPath struggles too much with various HTML5 tags. Also, it defaults to ISO-8859-1 which can screw up your content so be careful.

This probably won’t work…but something like this could work

$dom = \HTML5::loadHTML($html);
$html = qp($dom)->find('p')->filterCallback(function ($i, $item) {
    if ($item->nodeValue === '') {
        return true;
    }
    return false;
})->remove()->top()->html();

    This should work if you opt to trim() the content of the nodeValue; however, I’m not specifically familiar with the loadHTML function so I can only offer cursory advice ;).

      Absolutely.

      If you’ve run into character encoding problems and you have invisible characters there (I’ve been there and it’s terrible), I recommend preg_match('/\w+/', $text) in addition to trim() to decide whether the text is empty or not.

      So:

      if (preg_match('/\w+/', trim($item->nodeValue))) { /* not empty */ }

hello tom,

i remember i found a solution like this but using only css, quite easy

code:empty {
display: none;
}

I use it with empty paragraphs in the posts

    Hmm, from what I’m reading right now, a new line counts as content,

    so the CSS solution will not hide this:

    <code>
    </code>

    It only matches elements that are completely empty or that contain nothing but comments:
    http://css-tricks.com/almanac/selectors/e/empty/

      Right – that’ll work but only if it’s an empty code block. In many cases, the code block isn’t empty as it has a new line character in between it, thus it’s not really empty (and needs the $.trim call).

    This works great if the code element is truly empty; however, sometimes it isn’t – it’s got a carriage return, new line so that’s treated as a character so code:empty won’t work.

    That’s why I’ve used the trim function :).

Trackbacks and Pingbacks

[…] An Example of How To Remove Empty HTML Tags (Tom McFarlin) […]
