Clean Up Microsoft Word Pasted HTML in TinyMCE

Microsoft Word works great for creating stationery and printed documents. However, if you publish content on the web there are a few formatting issues you should be aware of. When you copy and paste your Microsoft Word content into a content management system like WordPress, its proprietary HTML code is preserved, producing very messy HTML that can lead to browser and layout issues. The same thing happens with other word processors like OpenOffice.

It can be hard to describe this to content writers, and it’s not uncommon to find WordPress blog posts riddled with nasty Microsoft Word HTML. How can you solve it? Kill it at the source by automatically filtering all content pasted into the WYSIWYG/Visual Editor (in WordPress this is TinyMCE).

Automatically Clean Up Microsoft Word HTML

Drop this code into your theme’s functions.php file. Anything you paste into TinyMCE will automatically be run through the built-in Microsoft Word HTML filtering.

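/**
*	Safe Pasting for TinyMCE (automatically clean up MS Word HTML)
*/
function tinymce_paste_options($init) {
	// Run the paste plugin's cleanup on everything pasted with Ctrl+V / browser paste
	$init['paste_auto_cleanup_on_paste'] = true;
	// Turn pasted headings (h1-h6) into bold text instead of keeping the heading tags
	$init['paste_convert_headers_to_strong'] = true;
	return $init;
}
// Only hook the filter in the admin, where the editor is loaded
if( is_admin() ) add_filter('tiny_mce_before_init', 'tinymce_paste_options');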
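
The two options above come from the older TinyMCE 3.x paste plugin. As far as I know, newer TinyMCE releases dropped them and filter Word content by default, so on a recent WordPress install they may simply be ignored. If you would rather strip formatting altogether, here is a minimal sketch that forces every paste to plain text. It assumes the bundled paste plugin supports the paste_as_text setting; the function name mytheme_tinymce_paste_as_text is a hypothetical one chosen for this example, so check both against the TinyMCE version your site ships with.

```php
/**
*	Plain-text pasting for TinyMCE (a sketch, not tested against every version)
*	Assumption: the bundled paste plugin understands 'paste_as_text'.
*	The function name is hypothetical; rename it to match your theme.
*/
function mytheme_tinymce_paste_as_text($init) {
	// Treat every paste as plain text, discarding Word's markup entirely
	$init['paste_as_text'] = true;
	return $init;
}
// The function_exists() guard only lets this file load outside WordPress (e.g. in a quick test)
if( function_exists('add_filter') && is_admin() ) {
	add_filter('tiny_mce_before_init', 'mytheme_tinymce_paste_as_text');
}
```

The trade-off is that writers lose all formatting on paste, including bold text and links, and have to reapply it in the editor. Use one approach or the other depending on how much cleanup you want to enforce.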

Hopefully this helps you maintain your sanity as a developer. You may find that it helps lower the number of support calls you get about “bugs” caused by Microsoft Word pasted HTML.

Meet the Author

Kevin Leary, WordPress Consultant

I'm a custom WordPress web developer and analytics consultant in Boston, MA with 17 years of experience building websites and applications. View a portfolio of my work or request an estimate for your next project.