TL;DR: Exporting a lot of data? Don't forget the "posts_per_page" argument.

If you’ve ever done any work in WordPress, for yourself or for others, where you’re responsible for importing a lot of data in a specific format (such as a CSV), then the odds are pretty high that you’ve also had to write an export tool for the same data.

I mean, it makes sense, right? Get the original data into the new, WordPress-based system, do work in the new system, then export the data so it’s portable for others.

But how is this information usually structured?

When it comes to working with data like this, I’m not talking about its structure in terms of something like a linked list or an array, nor am I talking about whether it’s a CSV, a JSON file, or any other format.

I’m talking about how it’s structured within WordPress. In this case, what I’ve usually seen (and worked with) is the following:

  • Custom Post Types
  • Custom Taxonomies (both hierarchical and non-hierarchical)

So when it comes time to export the data in the same format in which it was imported, it’s common to need WP_Query. After all, it’s arguably the definitive API for working with posts, categories, tags, and metadata, right?

But, as with so many things programming-related, there’s always a gotcha. If you’re working with a very small set of records, you won’t actually run into this, but if you’re writing an exporter that’s going to export all of the custom data in the system, then there’s one simple parameter, posts_per_page, that you need to pass to WP_Query to make sure you grab all of the information. Set it to -1 and the query returns every matching post instead of a single page of them.
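A minimal sketch of what that can look like, assuming a custom post type registered as book. Both that post type name and the acme_export_query_args helper are hypothetical, made up for illustration:

```php
<?php
/**
 * Build WP_Query arguments for exporting every post of one type.
 * The function name is hypothetical; use whatever fits your plugin.
 */
function acme_export_query_args( string $post_type ): array {
    return array(
        'post_type'      => $post_type,
        // -1 turns paging off, so every matching post comes back
        // instead of a single page of results.
        'posts_per_page' => -1,
    );
}

// Inside WordPress, pass the arguments straight to WP_Query.
// 'book' is an assumed custom post type name.
if ( class_exists( 'WP_Query' ) ) {
    $query = new WP_Query( acme_export_query_args( 'book' ) );
    $rows  = array();

    foreach ( $query->posts as $post ) {
        // Collect whatever the export format needs for each record.
        $rows[] = array( $post->ID, $post->post_title );
    }
}
```

Leave out the posts_per_page line and the same query falls back to the site’s default page size.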

Simple, right? But if you don’t do that, then you end up exporting only the n most recent records, where n is the site’s “Blog pages show at most” setting (10 by default).

And yeah, if there’s a lot of information to export, it can take a little more time, but in my experience exporting is far faster than importing. That’s neither here nor there, though.