If you got back through any of the posts I’ve written in the last, say, two years, you’re likely to find me advocating using available APIs over directory database queries nearly every single time.

And the truth is that I still lean in that direction. That is, if there’s an abstraction or an API that’s available for doing something specific, then I try to use it.

But in a couple of recent projects, I’ve been working with some relatively large datasets (large in comparison to non-enterprise level datasets). And in doing so, I have tried to make sure the updates to the data are as fast as possible.

Direct Database Queries for WP VIP

In situations like this, though the WordPress Coding Standards don’t like it, I find direct database queries occasionally to be the best option for doing so under certain conditions.

Using Direct Database Queries

Direct database queries are one of those things that have a bit of a mixed reputation among WordPress developers:

  • they are discouraged in the coding standards,
  • WordPress offers access directly to the database via $wpdb,
  • there are nuances to database queries that should be understood.

So when you see something like this:

How do you know if it’s a good move or not? And better yet, how do you know if you should be using directory database queries in your code own?

There are a few questions I normally ask myself when dealing with them all shared below.

1. Are There Indexes on the Columns?

Before doing using direct database queries in WordPress is to check to see if the columns I’m querying have indexes on said columns.

And the reason is this:

Indexes are used to quickly locate data without having to search every row in a database table every time a database table is accessed.

The bottom line is that the speed at which you can query and update columns is faster if there are indexes on the columns.

Direct database queries depend on the structure of the database.

Direct database queries depend on the structure of the database.

To that end, this isn’t something that we should do all the time. But there are times in which it’s appropriate (like when you’re updating a set of meta values) and when it’s not.

2. Parameterize Your Queries

When using direct database queries it’s extremely important to parameterize your queries for the sake of security. You can read all about this in the Codex, and this is not something to be skipped.

Straight from the documentation:

For a more complete overview of SQL escaping in WordPress, see database Data Validation. It is a must-read for all WordPress code contributors and plugin authors.

That is, if you’re taking user input then you need to make sure it’s sanitized, clean, and ready to be inserted into the database. But, for what it’s worth, I almost never allow user input to impact direct database queries.

Instead, they are normally based on data that I have in the code and want to update or modify when they are safely within the realm of control of the codebase.

3. Test Them and Test Them Again

And finally, before implementing any direct queries, I recommend:

  1. Starting your SQL front-end and executing the queries,
  2. Note any errors that they may throw,
  3. Fix them and try them again.

Then, once you have a working query test for edge cases to make sure something isn’t going to be mangled. This is especially true if you’re going to be using LIKE clauses in your queries (which is probably an article unto itself).

Am I Missing an API?

With the size of WordPress’ API, there’s always a chance that I’m missing something, neglecting something, or simply am not being smart about the existing APIs.

And if that’s the case, then I’m happy to correct the information that’s above (so don’t hesitate to leave a comment).

But in the meantime, if you’re working with a relatively set piece of data, you’re sure there are indexes on the columns, you know how to parameterize the queries, and you’ve tested the data, then perhaps doing with direct database queries are the way to go.

And if that’s the case, there are a few ways to ignore the coding standards without having the PHP CodeSniffer yell at you. 🙂


Join the conversation! 4 Comments

  1. Well stated Tom,

    If I have a peeve w/ WP it revolves around the lack of relational database adherence to RDBM’s paradigms “to the max”

    Developers should be (IMHO) very well versed in RDBMS design, normalization processes and then some such as why one storage model might be preferable towards their work than another. In WP I have saw many a plugin using the datastore like “This is the place to stick stuff” w/ no regard in understanding how WP stores stuff or why(s). Thus when looking at the code, soup de’ jour in interfacing to WP datastore tables/records.

    Know when to use a table and not. For example, I’ve saw code (for a video gallery as I recall, maybe it was something else) that uses the wp_options store to save every type of option towards given galleries and/or individual videos. That is to say, skins, initial state, on and on.

    Every single page that a site loads where shortcode, widget, or as defined in the admin backend (categories etc) loads ALL of it up. I have saw this sort thing many times.

    Storing each as its own “configuration” in the options table becomes more cumbersome to manage in code .vs. here’s an array, its the everything config towards say the video config/info’s, all of them. That gets unserialized, 95% of the data is not applicable to the page/post/gallery to display.

    Whilst the waste of resource (CPU/Ram/Bandwidth (DB or DB Server -> HTTP Server) may well be minimal on a single request, 10 requests, 100 requests when it is 100,000 concurrent requests, its significant.

    Maintaining one’s own Database table(s) may seem like a whole lot more code. Designing them so they interface symbiotically with WP tables may see like lots of effort. Creating indexes based on a numeric integer .vs. strings in an array mapped to a DB and translated upon read/write may seem like a hassle… But thats often it’s the right way to do it when there exists a significant amount of said data. Looking up numeric indexes in a RDBMS is a WHOLE lot more efficient than strings and said indexes often used to seek/join other indexed data.

    There are entire books written on database normalization for a reason, its important. Important for data integrity, important for performance.

    Just my opinion, but I've saw a good deal of plugin code where the data model(s) seemed to go "application first" .vs. "datastore first" and not even considering the interface to WP data.

    I cannot even begin to tell you how many plugins dont clean themselves up in respect to the database on uninstall breaking every rule of RDBMS instantly.

    WP makes it relatively easy to work with the database. No, it doesnt have ORMS (Object Relational Mapping) but ORMS does not tend lend itself well to PHP in general strictly due to performance curve characteristics).

    One of the things I would love see WP core do is create an API that replaces how data access is managed. Consume the SQL, compile the queries, cache as stored procedures then deprecate existing function. Cut down the number of trips to the data store. Then perhaps consider normalization and requirements thereof for plugin/theme authors.

    I dont imagine that happen anytime soon as it appears that focus is being placed more on function (Customizer, Rest, etc) than that of the lower level I/O which, rather surprising considering Rest best usage characteristics and concepts towards large data sets. Aka: “Is it better to try grow apples on a ill tree or take the time to grow new trees (<- plural), fertilize well and have lots of apples.”

    I stated back about a year ago when I began exploring WP core code that perhaps rather than focus on new functionality towards the world wide web of rich application function that perhaps its time for a new WP. Was’nt real popular :)

    I made said statement after both dedicated host firms I have servers with said while they see many more WP installs on Shared and VPS platforms than any other CMS app, on dedicated they are seeing the exact opposite, a decline. They stated most installs are shared hosting and rather static sites, aka: Not used in any other way than to create a basic website, not a blog, not a this/not a that.

    Those that are more interactive in nature ending up in VPS strictly because of performance and most of that due to five matters:

    Round trips to the datastore.
    Plethora’s of include files which under significant concurrent server load can bog mechanical hard drives down as and at times more so that #1 (dependent on the host topology, DB on HTTP server or separate datastore in their network topographics.)
    Overly complex usage of OOP when unwarranted.
    Code brought into a session that is never even executed.
    Heaps and heaps of HTTP requests.

    One cited a significant upspike in usage of Concrete 5 across the dedicated servers.

    Point being, American car manufacturers thought they had the market. Then Japan happened. The Japanese thought they had the market until Hyundai and Kia. Sometimes it is wise to reinnovate using the existing customer base .vs. attempt innovate atop a platform that is unable due to its construction take advantage of new technology.

    Hyundai is a very good example of this. Every 5 years or so they re-engineer. The cars when they first hit markets were crap. In scant time they set the bar with the 100,000 mile warrantys and much more. The re-engineered.

    The concept of a product not being vulnerable even if it owns 90% marketshare is a untruth. Apple had essentially 100% of smart phones. As of the iPhone 7 they gained back some, its about 50-50 (apple/android) and the iPad has been loosing marketshare.

    IMHO WP is a perfect candidate for a new application, re-engineered, ground up.

    It has the existing install base to have a successful launch and, two, its extremely vulnerable as it sits. Vulnerable because it does have that large market share.

    Here’s the deal (IMHO) – Its clear ES script has emerged both server/client of where IP IT is headed. It is capable of provisioning so many things, even application state client/server. It is the hub by which device unification c/s rich application be that video gaming or shopping.

    Drupal, Joomla, WP (etc?) all live in codebases created before this emergence. Not gonna do a history lesson here on Moo, JQ, CSS this/that.

    All these codebases will fall in what is coming. What WP is currently attempting to “catch up to” (UI, Rest, etc) is already yesterday. Need be looking at tomorrow(s) and start working on tomorrow yesterday.

    Again, just my opinion.

  2. My general rule of thumb is

    a) If I am accessing the core WP tables I will use the appropriate WP API.

    b) If I am accessing non-core tables(say tables where I am storing my own data) then I will use Direct DB Queries.

    Saying that I have used direct db queries on core tables when I need performance, and I am not so worried about playing nicely with other plugins via hooks.

    • I think those rules make sense. There are always going to be exceptions, sure, but it’s also nice when third-party tables offer an abstraction layer, too ;).

      But when they don’t, prepared, direct queries are the way to go pretty much by default.

  3. I myself prefer to use direct database queries using my own ORM, and then manually applying wp_filters on the resulting data, because WordPress’ Wp_query is highly unoptimized: it uses too much memory and it is relatively harder to select only certain columns. (for example, when you only need post titles and nothing else). I haven’t checked, but I also believe Wp_query has the very well known n + 1 problem.

    I have developed an OO MVC framework for wordpress which I expect to release soon, that includes an ORM for easier, more maintainable, and better performant data read and write.

Leave a Reply