What is WordPress kses?

Work with WordPress long enough and you’re bound to come across something called “kses” or the wp_kses function. Perhaps you run into it while writing your own code, while reading someone else’s source, or while reading core.

Whatever the case, the function has a weird name, right?

A lot of the WordPress API functions have clear names, so it’s easy to know what you’re doing. This one is the exception rather than the rule. That doesn’t change anything, though. It still raises the questions:

What’s the purpose of the function,
Why does it matter,
Why should we use it,
And what purpose do its variations serve?

We should be asking these questions for all functions with which we work. But when the name isn’t clear, the answers aren’t as easy to deduce.

What is wp_kses?

Straight from the Codex:

This function makes sure that only the allowed HTML element names, attribute names and attribute values plus only sane HTML entities will occur in $string. You have to remove any slashes from PHP’s magic quotes before you call this function.

As per the description, the function strips everything from the input except a subset of HTML that we choose to permit. Of course, this implies that we’re actually able to specify said HTML.
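The note about slashes in the Codex description can be sketched roughly like this. It's a minimal example that assumes WordPress is loaded and that a form posted a field named bio; the field name and variable names are hypothetical. WordPress adds slashes to the request superglobals itself, so wp_unslash is the usual way to remove them before filtering.

```php
<?php
// Sketch only: assumes WordPress is loaded (for example, inside a plugin).
// The "bio" field name is hypothetical and used only for illustration.

// WordPress slashes $_POST, so remove the slashes before filtering.
$raw_bio = isset( $_POST['bio'] ) ? wp_unslash( $_POST['bio'] ) : '';

// Permit only a tiny, illustrative subset of HTML; everything else is stripped.
$clean_bio = wp_kses(
	$raw_bio,
	array(
		'em'     => array(),
		'strong' => array(),
	)
);
```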

If you review the Codex article, you’ll see the function accepts three arguments:

The first argument is the string to filter. So if you have data from the user or the database and you want to process it, this is where you’d pass it into the function.
The second argument is the allowed tags. This is an array keyed by tag name, where each tag maps to the attributes permitted on it. You can see an actual example of this in the Codex.
The final argument, which is optional, is for the allowed protocols. This means that we can choose to accept HTTP, HTTPS, and FTP URLs while rejecting others, such as Telnet or javascript: links.
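Here's a rough sketch of the three arguments working together. It assumes WordPress is loaded; the input string and the variable names are made up for illustration.

```php
<?php
// Sketch only: assumes WordPress is loaded. The input is hypothetical.
$raw = '<a href="https://example.com" title="Example" onclick="evil()">Link</a> '
	. '<a href="javascript:evil()">Bad link</a> '
	. '<script>alert(1)</script> <strong>bold</strong>';

// Second argument: allowed tags, each mapped to its allowed attributes.
$allowed_html = array(
	'a'      => array(
		'href'  => array(),
		'title' => array(),
	),
	'strong' => array(),
);

// Third argument (optional): the URL protocols we are willing to accept.
$allowed_protocols = array( 'http', 'https' );

// First argument: the string to filter.
$clean = wp_kses( $raw, $allowed_html, $allowed_protocols );
```

With this configuration, the onclick attribute, the javascript: URL, and the script tags should all be stripped. Note that kses removes the script tags themselves but leaves the text that was between them, so it's still worth escaping on output.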

At this point, it’s easier to understand, isn’t it?

Why Does This Function Matter?

Before answering this, note the “kses” part of the function stands for “KSES Strips Evil Scripts.” Yes, it’s yet-another-recursive-acronym that the open source community seems to love so much.

But seriously, the purpose of the function is exactly that: to make sure we’re filtering malicious input.

Sure, you can rely on WordPress and some of the built-in functions to handle a lot of this. I’d argue when you’re building a custom solution for someone, it may not be as easy.

Why Should We Use It?

You have to know what your solution is going to accept and what it will reject. It’s not enough to trust that the built-in whitelist is going to provide you with all you need.

Instead, define and identify what you want and what you don’t. From there, strip out everything else.

This will ensure that if something goes wrong, you have a place at which to start debugging. Further, you have a place to watch as data comes into and out of the database.

In fact, you can even write unit tests around this code to ensure it works exactly as you expect.
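A test along those lines might look something like the sketch below. It assumes the WordPress PHPUnit test suite is bootstrapped (which provides WP_UnitTestCase) and a PHPUnit version that has assertStringNotContainsString. The acme_filter_bio helper and the class name are hypothetical.

```php
<?php
// Sketch only: assumes the WordPress PHPUnit test suite is bootstrapped.
// acme_filter_bio() is a hypothetical wrapper around wp_kses.

function acme_filter_bio( $raw ) {
	$allowed_html = array(
		'a'  => array( 'href' => array() ),
		'em' => array(),
	);

	return wp_kses( $raw, $allowed_html, array( 'http', 'https' ) );
}

class Acme_Filter_Bio_Test extends WP_UnitTestCase {

	// Tags outside the allowed list should not survive filtering.
	public function test_script_tags_are_stripped() {
		$clean = acme_filter_bio( 'Hi <script>alert(1)</script>' );
		$this->assertStringNotContainsString( '<script', $clean );
	}

	// Tags on the allowed list should pass through untouched.
	public function test_allowed_tags_survive() {
		$clean = acme_filter_bio( '<em>Hello</em>' );
		$this->assertSame( '<em>Hello</em>', $clean );
	}

	// Attributes that are not listed for a tag should be removed.
	public function test_disallowed_attributes_are_removed() {
		$clean = acme_filter_bio( '<a href="https://example.com" onclick="x()">x</a>' );
		$this->assertStringNotContainsString( 'onclick', $clean );
	}
}
```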

What About Its Variations?

Two other functions related to this one are worth knowing:

`wp_kses_post`
`wp_kses_allowed_html`

Both of these are easy to use, straightforward, and have a clear use case designed for them.

The first, wp_kses_post, filters input so that only the HTML tags allowed in post content remain. So if you’re working with, say, a plugin that will be returning information to a post (or a post type), then this function fits the bill.
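As a small sketch of that use case, assuming WordPress is loaded and that the markup comes from some outside source (the $external_html value here is invented for illustration):

```php
<?php
// Sketch only: assumes WordPress is loaded. $external_html is hypothetical.
$external_html = '<h2>Title</h2><p onclick="evil()">Body</p><script>alert(1)</script>';

// Keep only the tags and attributes WordPress allows in post content.
$safe_html = wp_kses_post( $external_html );

echo $safe_html;
```

There's no allowed-tags array to pass here because wp_kses_post uses the list WordPress already maintains for post content.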

The second function, wp_kses_allowed_html, will allow you to see what HTML is allowed for the context in which you’re working.

This means that if you inherit code someone else has written, or you’re working in a context where the allowed HTML is already defined, you can see exactly what is permitted.

Furthermore, given this information, you can tweak the allowed tags to fit your needs.
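One way to do that tweaking, sketched under the assumption that WordPress is loaded: start from the list for the post context, remove what you don't want, and hand the result to wp_kses. The choice of img as the tag to drop and the $raw value are just examples.

```php
<?php
// Sketch only: assumes WordPress is loaded. $raw is hypothetical input.
$raw = '<p>Hello <img src="https://example.com/a.png" alt=""> world</p>';

// Look up what is allowed for the "post" context.
$allowed_html = wp_kses_allowed_html( 'post' );

// Tighten the list for our own needs, for example by disallowing images.
unset( $allowed_html['img'] );

// Filter against the adjusted list.
$clean = wp_kses( $raw, $allowed_html );
```

WordPress also runs the list through a filter hook with the same name, wp_kses_allowed_html, so the same adjustment can be made site-wide with add_filter if that suits your project better.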

Writing Strict Code

Ultimately, this all comes back to writing the most secure code possible. Unfortunately, I’ve found this to be an ongoing effort.

Every time I think I’m writing robust code, something new appears showing me I’ve more to learn.

Nature of the industry, right?

Here’s a rule of thumb:

Your secure code isn’t secure enough. You need to constantly be on the lookout for things like this.

These APIs are a great starting point. But we shouldn’t stop here. If you’re reading this post and implementing these functions, you’re ahead of a lot of other WordPress developers.

From here, keep learning more about what techniques are available. Talk with others, read articles, look through other available code.

It’s what I’ve found to be the best way to make sure I’m writing the safest code possible. And that code still isn’t safe enough.