WordPress Refactoring: Refining Plugin Functions

Whenever it comes to writing code – be it starting a new project or working with an existing system – there are a lot of developers that aim to write clean, maintainable code.

It’s a challenge for sure, but it’s not impossible. Even if you’re responsible for simply introducing a new function into an existing system, there’s an opportunity to leave the place a little better than you found it.

Granted, trying to refactor a larger system too much can have serious unintended consequences, so I don’t necessarily advocate doing that, but if you have the opportunity to break a one larger function into several smaller, more focused functions, then I think the opportunity should be taken.

I recently had a chance to do a bit of WordPress refactoring in the context of a commissioned plugin I am working on, and wanted to share the process that I followed for doing so.

What The Function Does

In short, the function is working with a Project custom post type and is responsible for doing the following:

  • If the post is being published and the Project Owner hasn’t been emailed, then notify the Project Owner that their Project is published
  • Otherwise, email the project owner that the Project has been updated

Note that I’ve written a class that wraps the wp_mail function to add some additional functionality. In the code that you’re about to see, this is is referenced by the $mailer variable.

Here’s the code – I’ll discuss it after the block:

public function notify_project_owner( $post_id ) {

	if( 'project' == get_post_type() ) {

		// First, check to see if we're publishing this post and that it's not been previously emailed
		if( 'publish' == get_post_status( $post_id ) ) {

			// If the post doesn't yet exist, then we're creating it so we need to update the Project Owner
			if( 'true' != get_post_meta( $post_id, 'project_owner_emailed', true ) ) {

				$this->mailer->set_subject( __( 'Your Project Has Been Published!', 'cj' ) );
				$this->mailer->set_message( __( 'Need filler text.', 'cj' ) );
				$this->mailer->send_mail( $this->temp_email );

				// Mark that we've notified the project owner
				update_post_meta( $post_id, 'project_owner_emailed', 'true' );

			} // end if

		} elseif( 'inherit' == get_post_status( $post_id ) && ! isset( $_POST['project_is_completed'] ) && 'trash' != $_GET['action'] ) {

			$this->mailer->set_subject( __( 'Your Project Has Been Updated.', 'cj' ) );
			$this->mailer->set_message( __( 'Need filler text.', 'cj' ) );
			$this->mailer->send_mail( $this->temp_email );

		} // end if

	} // end if

} // end notify_project_owner

Though the comments should provide enough context, I’ll explain in detail here.

This function is attached to a save_post. If a Project is being saved, if first checks to see if the post is being published, and, if so, will then check the post’s meta data to see if the project_owner_email is set to true.

If it’s not set to true, then the email is sent and the post meta data is set.

Otherwise, if the project is being updated and has not been marked as completed (another set of post meta data), then the Project Owner should be notified that the Project has been updated.

Opportunities For Refactoring

There are two primary conditions that are too complex.

  1. The conditions are too complex.
  2. Each body of the conditional can be encapsulated into its own function.

Refactor The Conditional Evaluations

I generally try to make sure that my code reads as close to English as possible. It’s tough – I mean, it’s code - but that’s no excuse for not trying, right?

So first, I stub out the first conditional to read something that’s easier to read:

if( $this->post_is_published( $post_id ) ) {
	// Coming soon...
} // end if

Instead of:

if( 'true' != get_post_meta( $post_id, 'project_owner_emailed', true ) ) {
	// Snip Snip!
} // end if

And then I create a helper function that will allow me to encapsulate the evaluating into a single function:

private function post_is_published( $post_id ) {

	return
		'publish' == get_post_status( $post_id ) &&
		'true' != get_post_meta( $post_id, 'project_owner_emailed', true );

} // end post_is_published

Similarly, I needed to do the same for the conditional that was responsible for updating the user when the Project was updated.

So I took the following conditional:

if( 'inherit' == get_post_status( $post_id ) && ! isset( $_POST['project_is_completed'] ) && 'trash' != $_GET['action'] ) {
  // Snip Snip
}

Changed the call to this:

if( $this->post_is_updated( $post_id ) ) {
	// Coming soon...
} // end if

And introduced a new helper function:

private function post_is_updated( $post_id ) {

	 return
	 	'true' != get_post_meta( $post_id, 'project_owner_emailed', true ) &&
	 	'inherit' == get_post_status( $post_id ) &&
	 	! isset( $_POST['project_is_completed'] ) &&
	 	'trash' != $_GET['action'];

} // end post_is_updated

Refactor The Conditionals

Next, we can simplify the body of the conditional so that it’s in its own function thus making the evaluation of the conditionals even easier to read.

First, we start with the case when the user is first publishing a Project. We take the following block of code:

$this->mailer->set_subject( __( 'Your Project Has Been Published!', 'cj' ) );
$this->mailer->set_message( __( 'Need filler text.', 'cj' ) );
$this->mailer->send_mail( $this->temp_email );

// Mark that we've notified the project owner
update_post_meta( $post_id, 'project_owner_emailed', 'true' );

Then we move to code to a new function:

private function send_published_email( $post_id ) {

	$this->mailer->set_subject( __( 'Your Project Has Been Published!', 'cj' ) );
	$this->mailer->set_message( __( 'Need filler text.', 'cj' ) );
	$this->mailer->send_mail( $this->temp_email );

	// Mark that we've notified the project owner
	update_post_meta( $post_id, 'project_owner_emailed', 'true' );

 } // end send_published_email

Next, we do the same for the code that’s responsible for emailing users when the Project has been updated:

$this->mailer->set_subject( __( 'Your Project Has Been Updated.', 'cj' ) );
$this->mailer->set_message( __( 'Need filler text.', 'cj' ) );
$this->mailer->send_mail( $this->temp_email );

And then move it to its own function:

private function send_updated_email() {

	$this->mailer->set_subject( __( 'Your Project Has Been Updated.', 'cj' ) );
	$this->mailer->set_message( __( 'Need filler text.', 'cj' ) );
	$this->mailer->send_mail( $this->temp_email );

} // end send_updated_email

The End Result

At this point, we’re left with four new functions and one, smaller easier-to-read original function. Don’t mistake the number of functions for increasingly complexity, either.

Thanks to the helper functions mentioned above, the notify_project_owner function has been reduced to this:

public function notify_project_owner( $post_id ) {

	// We only care to notify the project owner if we're working with Projects
	if( 'project' == get_post_type() ) {

		// If the post is published, send the notification email
		if( $this->post_is_published( $post_id ) ) {
			$this->send_published_email( $post_id );
		} // end if

		// If the post is being updated, send the appropriate email
		if( $this->post_is_updated( $post_id ) ) {
			$this->send_updated_email();
		} // end if

	} // end if

} // end notify_project_owner

It’s significantly easier to read, isn’t it?

Despite actually adding code, we’ve actually simplified the process by giving each function a unique purpose. This makes it easier to read, easier to trace (or debug), and easier to maintain.

Sure, we’d love to be able to refactor every class and or set of functions that we come into contact with, but that’s not always a possibility, but I think that we can at least do the best with the time that we have available and the code that we have in front of us.

Truth is, the above could probably be refined even more. I just didn’t have the time :).

22 Comments

The outermost conditional could be inverted, and returned early, so that the rest of the whole function isn’t nested. Otherwise, pretty sensible refactoring, and good explanation :-)

    Yeah, the inversion of conditionals is something I haven’t made my mind up about.

    I’m of the mindset – as archaic as people say it is – that a function should have only a single point of return. Even though technically having the return statement near the header of the code is technically a single point, it still keeps logic below it.

    To me, return‘s should always be reserved for the end of a function and should be the last point of execution.

    I may come around, but this has been one of the hardest programming habits that I’ve debated with myself (as sad as that sounds).

      Hi @Tom,

      I’m with you on the single return. I had a long debate with @Rarst about this and several other people. I know that a lot of people think the “return early” style is an improvement but I find it just makes using a debugging that allows you to set breakpoints much harder.

      I’m really glad to see I’ve got some allies on this one. :)

      -Mike

        Oh yes – aliens indeed.

        I keep this card so close to my chest because of how much of a passionate debate it causes with fellow programmers :). But I’m with you 100%: single point of return specifically for debugging and to make the code easier to read (I think it keeps the flow more natural).

        But we’ll keep that between you and me ;).

          I disagree :). Though it never helps to have returns dotted all over a function I tend to have ‘very early’ returns and ‘very late’ returns. To borrow your example, at the beginning of save_post callback, I will have checks for permission, nonce, and post type. Rather than have all these checks result in some ugly nesting, I perform these security (and other ‘sanity’ checks) at the very beginning and abort straight away.

          That said – I can’t say doing that in the re-factored example above would make things any easier to read.

          Great post though – re-factoring is good the soul :) (At least I always feel better afterwards…)

          @Stephen Harris – If you write your functions with early returns you burden other developers who might need to debug through it but shouldn’t have to understand the structure of your function if your function is not the focus of their debugging sessions.

          Far too frequently I find myself having to read through someone else’s function to figure out why it didn’t stop at the breakpoint on the return like I expected it to. Using early returns is just discourteous to other developers. IMO of course. :)

          I’m with Stephen on this one. Returning early often increases readability especially for longer functions.

          Hi @Geert De Deckere

          Returning early often increases readability…

          I’m curious, what’s the basis of your assertion? Do you know of studies or other statistically valid information that would confirm that assertion? If you do I’d be very interested in reading anything you have.

          …especially for longer functions.

          Maybe rather than early returns the answer is to break up long functions into several shorter ones?

          No, I can’t back up my opinion with statistical data. For me, the reduced level of nesting improves readability.

          Have a look at the get_posts() method of WP_Query, for example. A long function which returns earlier if you request just “ids” or “id=>parent” to be returned. It doesn’t bother me, on the contrary. With a single exit point you’d have to carry the $r variable to the bottom and do another check there to see whether to return that variable or the $posts property. Again, just my opinion.

          @Stephen, to your point:

          I tend to have ‘very early’ returns and ‘very late’ return.

          I can respect that. I really dislike returns happening all over the codebase.

          But still, I just can’t bring myself to actually begin implementing it. The way I normally write code is in one of two fashions:

          I declare all variables as early in the plugin as possible. This is particularly common in JavaScript. In other languages, I declare the plugin as close to its initial use as possible.
          Because I always have a single return, I initialize that variable as the very first statement in the function because it’s basically saying “This is where I’m starting,” and then the final return says “this is where I’ve ended up.” Everything in between is what does the work on that code.

      @Geert: I’d have to agree with Mike on this – I don’t think that it increases readability for two reasons:

      Longer functions are a code smell. This doesn’t mean that I’m not guilty of this (because I am!), but some prominent developers argue that functions shouldn’t exceed 20 – 25 lines of code so that they do one thing (which is, coincidentally, what I tried to detail in this post :). As such, refactoring them will fix that particular issue.
      Though it may be a bit subjective, I think a case can be made that it does’t increase readability because the way code is read is more like “If this value is set to -1, then we’ll return. But there’s this other logic below it, so let’s see what it does. Ah, it also returns here if the value is equal to -2. And then here at the end, it looks like it’s returning a value, but I’m not sure what it is. Let me go back to earlier in the code and see where it’s being set.”

      Of course, that could just be me. I never claimed that my way of reading code is accurate, but I try to hard to make my code flow as close to English as possible, and scattering return throughout a function feels more like dropping random phrases breaking up a coherent set of sentences.

Another part I’d refactor is the way the project_owner_emailed is stored. I don’t like this line at all:

if ( 'true' != get_post_meta( $post_id, 'project_owner_emailed', true ) ) { }

Instead of storing “true” and “false” strings in the database, simply save an integer: 1 or 0. Those values will automatically be cast to the corresponding boolean values. Also, they take up a just a few bytes less of space.

if ( (bool) get_post_meta( $post_id, 'project_owner_emailed', true ) ) { }

Or even:

if ( get_post_meta( $post_id, 'project_owner_emailed', true ) ) { }

    For me, this is an issue of readability. 1 and 0 make perfect sense to developers, but for newer programmers there is still a bit of a mental hoop to jump through.

    Secondly, I think that verbally stating true or false is not only easier to read, it’s less to proces. Rather than calling 1 == true, I just read true.

    Now, I’m probably about to incriminate myself, but when it comes to “saving a few bytes,” as far as web applications are concerned – at least when it comes to booleans versus short strings – I actually don’t worry about it.

    Our hardware has become so powerful and so fast that whether or not a single byte or four bytes are stored in a column isn’t going to make that much of an impact.

    Don’t read me wrong: I’m not against efficient, optimized code, but I pick my battles and this particular one is one where readability wins out over a few bytes simply because we’re not going to be able to “feel” the difference.

    Again, I dig discussions like this and seeing where other developers fall, so I definitely appreciate the back and forth on it!

      Agreed on saving those few bytes. That’s a non-issue.

      Disagreed on readability. Code could become hard to debug if you forget to put the quotes around “true” and “false” strings. Having to put those quotes around these words when they’re actually intended to be used as real booleans feels counterintuitive and dirty.

      if ( false == get_post_meta( $post_id, 'project_owner_emailed', true ) ) {
      /* Will never be executed */
      }

      In order to increase readability you may want to store the option with a name prefixed with “has_” or “is_”, e.g. “has_project_owner_been_emailed”. That reads like English and makes it clearer that a boolean value is to be expected.

        @Geert De Deckere – Something else to consider is that storing 1 and 0 for true and false in post meta make it harder to “read” the data in the database. I much prefer ‘true’ and ‘false’ if only for that reason.

Ok, I have to chime in here, because there is a passion about this.
Common pros of returning early:
Exit early and often skips processing of later code and increases the speed of your function.
It makes it more readable and you scan the code quicker.

I disagree.

I like having a single variable for my results that is initialized at the top of the function. Then the developer can watch that single variable go through ALL processing needed for it and you can set a breakpoint at the ONLY return, to make sure it is what it is supposed to be.

I do not have the metrics on performance advantages of exiting early, but I will concede that it MAY prevent certain conditionals from being encountered.

My one point about all of this that I think we can all agree that which direction you choose, BE CONSISTENT. I have seen some code that every other function (practically) switch the pattern, so I guess it would be an anti-pattern. Ok, my 2 cents.

    To this point, I’d have to agree with your stance:

    Exit early and often skips processing of later code and increases the speed of your function.

    I say that simply because compilers have gotten so advanced. If you’re working on compiled code, most of them are smart enough to know how to optimize the code prior to generating the binary so this becomes moot.

    Of course, we’re talking about PHP here and I personally am not sure how the interpreter handles this so I can’t speak to it, but I will say that if a function is small enough, milliseconds is not going to be of any significant difference to a person.

    I like having a single variable for my results that is initialized at the top of the function. Then the developer can watch that single variable go through ALL processing needed for it and you can set a breakpoint at the ONLY return, to make sure it is what it is supposed to be.

    This is my stance, too.

    But you’re right – when it comes to whatever your coding style is, just be consistent. I’m a big fan of code reviews because I think code – especially written by a team – should look like it was written by a single developer.

    Coding standards, style, and so on can help enforce this. It’s a hard thing to push, though, especially in an open-source community.

I like the idea of refactoring. This is what I often do before working on a piece of code written by other people (with reformatting).

One thing I think you can improve in the code is using chained methods for mailer (I guess what you’re using support this. Just guess, no clues :D).

    I think chained methods work great in the context of libraries like jQuery and in frameworks like Rails, and I may be old fashioned in saying this, but I actually prefer not to violate the Law of Demeter if for nothing else because it may not always be clear what each function accepts or returns, thus you’re left with having to find where in the chain the problem occurs.

      @Tom & @Tran – Further I’ve found chaining to make debugging exceedingly difficult. You can’t set breakpoints inside a chain, and you also loose the ability to return other values, such as true or false for failed assertions similar to how get_transient() returns exactly boolean false when it doesn’t find a value which is distinct from an empty string or a zero.

Very interesting article and comments – it had never occurred to me that early returns for negative conditionals could be so contentious (how naive of me!).

Refactoring is one of those things that in an ideal world my inner geek would love to do on every project I take on that was started by other developers, and indeed on projects I have built where time and/or budget were limited at the time so I just banged out code that did the job. However we live in the real world and it’s very often not needed or worth it. It’s also not an easy thing to convince a client it’s worth spending time (i.e. their money!) on, as from their point of view the site works so what’s the point? Despite that, I’ll often squeeze it in anyway as I know it will make my life much easier in the long term.

    However we live in the real world and it’s very often not needed or worth it.

    I think that you’ve nailed the tension that exists every developer who cares about quality code feels when working on a project.

    Though I’d change what you said just a little bit and say that it’s probably needed, but not worth it (where worth is determined by the client’s budget).

    But here’d the paradox: not doing refactoring often results in an indeterminate level of technical debt that manifests itself after the initial project has been delivered. When it comes time to actually make a change or introduce a new feature, you immediately feel the pain of lack of refactoring which may ultimately drive up the cost of adding a new feature.

    This can be avoided on the front end by spending time doing a bit of proper engineering, but, as you mentioned, it’s a hard sale.

    The same can be said for doing unit testing, a user testing, but that’s a whole other post.

Leave a Reply

Name and email address are required. Your email address will not be published.

You may use these HTML tags and attributes:

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>