Fix Hanging Words in WordPress

Automatically fix hanging words (or typographic widows) in WordPress content using this helpful function to filter the_content().

Hanging words are known as typographic widows. They’re those annoying cases where a single word ends up on its own line at the end of an HTML element, like this:

Everything looks great until we size down the viewport and find that we have a single word left
hanging.

In this example, hanging is a widow. It’s an old typographic term that dates back to the printing press, and the result simply looks ugly. The Chicago Manual of Style flags widows as a no-no, so it’s a good idea to avoid them in your custom WordPress theme or plugin. We can visually correct a widow by forcing a line break early, which shortens the line above and bumps some of its text down to join the widow. I’ve found that a minimum of 3 words on the last line works well for design purposes.

For the record, sometimes people on the web incorrectly use the term orphan to describe what is actually a widow. An orphan is defined as:

A paragraph-opening line that appears by itself at the bottom of a page or column, thus separated from the rest of the text.

How to Fix Typographic Widows in WordPress

To solve this issue I wrote the following function, which filters the_content() and auto-corrects widows. Drop it into the functions.php file of your custom WordPress theme and the last 3 words of each supported block element will always stay together on the final line, avoiding widows automatically.

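/**
 * Avoid Typography Widows
 *
 * Joins the last three words of each paragraph or heading with non-breaking
 * spaces so a single word is never left alone on the final line.
 */
function kl_avoid_content_widows( $content ) {
    // Match three whitespace-separated "words" (word characters and punctuation)
    // that sit immediately before a closing </p> or </h1> to </h6> tag.
    $pattern = '@(?:\s)([[:punct:][:word:]]+)(?:\s)(?!/>)([[:punct:][:word:]]+)(?:\s)([[:punct:][:word:]]+)</(p|h1|h2|h3|h4|h5|h6)>@m';
    // Swap the whitespace in front of each of those words for &nbsp;.
    $replacement = '&nbsp;$1&nbsp;$2&nbsp;$3</$4>';
    $filtered = preg_replace( $pattern, $replacement, $content, -1 );

    // preg_replace() returns null on error; fall back to the untouched content.
    return null === $filtered ? $content : $filtered;
}
add_filter( 'the_content', 'kl_avoid_content_widows' );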

As written, this only affects post content created in the TinyMCE visual editor and output through the_content(). If you want to run it on widgets, custom fields, or other content outside of the main editor, all you have to do is call the function directly.
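As a rough sketch of calling the function directly, the snippet below runs a plain-text value (such as a custom field) through it. The helper name kl_widow_safe_text() is hypothetical and not part of the original code. It assumes the value has no closing block tag yet, so it wraps the text in a paragraph first, because the pattern only matches text that ends in one of the supported closing tags.

```php
<?php
// Copy of the article's filter function, guarded so it is not declared twice.
if ( ! function_exists( 'kl_avoid_content_widows' ) ) {
    function kl_avoid_content_widows( $content ) {
        $pattern     = '@(?:\s)([[:punct:][:word:]]+)(?:\s)(?!/>)([[:punct:][:word:]]+)(?:\s)([[:punct:][:word:]]+)</(p|h1|h2|h3|h4|h5|h6)>@m';
        $replacement = '&nbsp;$1&nbsp;$2&nbsp;$3</$4>';
        return preg_replace( $pattern, $replacement, $content, -1 );
    }
}

/**
 * Hypothetical helper for plain-text values such as a custom field.
 * Escape untrusted values (for example with esc_html()) before passing them in.
 */
function kl_widow_safe_text( $text ) {
    // Inside WordPress, wpautop() adds the paragraph tags. Outside WordPress
    // (as in this standalone sketch) fall back to a simple <p> wrapper.
    $html = function_exists( 'wpautop' )
        ? wpautop( $text )
        : '<p>' . trim( $text ) . '</p>';

    return kl_avoid_content_widows( $html );
}

echo kl_widow_safe_text( 'A short tagline pulled from a custom field' ) . PHP_EOL;
```

In a theme template, the string passed in would typically come from something like get_post_meta(), and the result would be echoed in place of the raw value.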

How it works

A regular expression finds every closing block-level tag that has three words before it separated by whitespace. Only block-level tags that can be created with the TinyMCE visual editor in WordPress are supported, but it’s easy to add others if you’d like. The supported tags are:

  • p
  • h1
  • h2
  • h3
  • h4
  • h5
  • h6
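If you do want to add other tags, one possible approach is to build the pattern from a list of tag names instead of editing the regex by hand. This is only a sketch: kl_widow_pattern() is a hypothetical helper, and the extra td and figcaption tags are chosen purely for illustration.

```php
<?php
/**
 * Hypothetical helper: build the widow pattern for a custom list of tags.
 */
function kl_widow_pattern( array $tags ) {
    // Escape each tag name so it is safe inside the pattern ("@" is the delimiter).
    $escaped = array_map(
        function ( $tag ) {
            return preg_quote( $tag, '@' );
        },
        $tags
    );

    // One "word": a run of word characters and punctuation, as in the original pattern.
    $word = '([[:punct:][:word:]]+)';

    return '@(?:\s)' . $word . '(?:\s)(?!/>)' . $word . '(?:\s)' . $word
        . '</(' . implode( '|', $escaped ) . ')>@m';
}

// The original tags plus two illustrative additions.
$tags    = array( 'p', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'td', 'figcaption' );
$pattern = kl_widow_pattern( $tags );

$html   = '<figcaption>A caption that wraps badly</figcaption>';
$result = preg_replace( $pattern, '&nbsp;$1&nbsp;$2&nbsp;$3</$4>', $html );

echo $result . PHP_EOL;
```

Passing only the seven original tags produces exactly the pattern used in the function above, so the helper can stand in for the hard-coded string.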

<blockquote> and <li> elements were deliberately skipped to avoid issues with nested paragraphs in quotes and the structure of lists. With additional rules these could be supported, but the use case is rare. When 3 words (including any punctuation marks) separated by spaces are found before the closing tag of one of the supported elements, we replace those spaces with &nbsp; entities. This keeps the words together when the text wraps, so the last line never holds a single word.
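To make the replacement concrete, here is a small standalone before-and-after using the same pattern and replacement string as the function above, applied to the example sentence from the start of this article. Note that the space in front of the first of the three words is replaced as well, so the word before them ("single" here) is joined to the group.

```php
<?php
// Same pattern and replacement as kl_avoid_content_widows().
$pattern     = '@(?:\s)([[:punct:][:word:]]+)(?:\s)(?!/>)([[:punct:][:word:]]+)(?:\s)([[:punct:][:word:]]+)</(p|h1|h2|h3|h4|h5|h6)>@m';
$replacement = '&nbsp;$1&nbsp;$2&nbsp;$3</$4>';

$before = '<p>Everything looks great until we size down the viewport and find that we have a single word left hanging.</p>';
$after  = preg_replace( $pattern, $replacement, $before );

echo $after . PHP_EOL;
// <p>Everything looks great until we size down the viewport and find that we have a single&nbsp;word&nbsp;left&nbsp;hanging.</p>

// List items are not in the tag list, so they pass through unchanged.
$list_item = '<li>Four words in here</li>';
echo preg_replace( $pattern, $replacement, $list_item ) . PHP_EOL;
// <li>Four words in here</li>
```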

This isn’t perfect, but it should cover 98% of use cases where this is needed without triggering additional content issues. Proceed with caution though: regex rules are very powerful and can occasionally lead to unexpected outcomes.

Why CSS Won’t Work

This may seem like something that can easily be solved with CSS, but unfortunately that’s not the case. CSS does provide a widows property, but it only applies where content breaks across pages or columns, such as @print media, where it sets the minimum number of lines left at the top of the new page or column. It has no effect on how the last line of a paragraph wraps on screen, so it isn’t an option here, although at first glance it does seem like it would work.

Meet the Author

Kevin Leary, WordPress Consultant

I'm a freelance web developer and WordPress consultant in Boston, MA with 17 years of experience building websites and applications.