Posted 6.10.2018
By Amanda Tusing

Parsing WordPress Shortcodes in React/Redux

Shortcodes are a key WordPress feature, but using them in a front-end app is complicated.

One of the toughest issues we faced when integrating the WordPress REST API with a React/Redux front end was parsing WordPress shortcodes. This tutorial outlines the approach we used to solve it.

The first thing we had to decide was where in our workflow to parse each shortcode. For shortcodes that just need to return some simple markup, it's fine to handle them in PHP and call it a day, but many shortcodes and embeds are more complicated than that, and some even require JavaScript to function properly. In those cases, we must parse the HTML returned from the WordPress API in JavaScript and replace shortcodes and embeds with React components.

To do this, we've created a shortcode parsing utility. Essentially, we use regular expressions to find shortcodes and embeds in the content string, capture any attributes and content they contain, and pass these as arguments to plain JavaScript functions or React components. We then break up the content string into modular blocks: an array of smaller strings mixed with React components.

A note on naming conventions:
Embeds (a URL on its own line that WordPress turns into embedded content) are technically not shortcodes, but in our code we refer to them as "single-line shortcodes" (as opposed to regular bracket shortcodes). It just seemed easier to call everything a shortcode instead of having to throw the word embed in there.

Step 1: Identify the Shortcodes for Parsing

First, you'll need a list of shortcodes with their respective functions/components and whether they're React components or regular JavaScript functions (react vs nonreact). Make a file called shortcodes.js with something like the following (replace shortcode names and components with your own):
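As a minimal sketch of what such a file might contain: every shortcode name, component, and the renderButton helper below is a hypothetical placeholder rather than our actual list, and the stand-in components simply return null so the snippet runs on its own.

```javascript
// shortcodes.js (sketch). In a real project these would be imported React
// components and containers; the stand-ins below are hypothetical placeholders.
const PullQuote = () => null;
const TwitterEmbed = () => null;
const AudioContainer = () => null; // e.g. a Redux-connected container

// A plain JavaScript function: receives attributes and content, returns markup.
// The { named, numeric } attribute shape is an assumption used throughout.
const renderButton = (attributes, content) =>
  `<a class="button" href="${attributes.named.href}">${content}</a>`;

const bracketShortcodes = {
  pullquote: { component: PullQuote, type: 'react' },
  audio: { component: AudioContainer, type: 'react' },
  // Two shortcode names can share one component.
  tweet: { component: TwitterEmbed, type: 'react' },
  twitter: { component: TwitterEmbed, type: 'react' },
  // 'nonreact' entries are called as functions and return a string of markup.
  button: { component: renderButton, type: 'nonreact' },
};
```

Keying the object by shortcode name keeps the later lookup (name to component) a simple property access.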


Note that you can use the same component for multiple shortcodes, as we are with the Twitter embeds. You can even use containers if you need your component hooked up to Redux, as we are with the Audio component.

Step 2: Capture the Shortcodes and Embeds with RegEx

Next, you'll need a list of embeds with their associated component, type, regular expressions, capture group (matchPosition) for any data we want to pull out (such as the tweet ID in singleTwitterStatus), and an attributeName for the data we're capturing. In the same file as the bracket shortcodes list above, we also have:
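Here is a rough sketch of such an embed list. The entry names other than singleTwitterStatus, the attribute names, and the patterns themselves are illustrative assumptions; in particular, the patterns assume each embed URL arrives alone inside a paragraph tag, which may not match your API output.

```javascript
// Hypothetical stand-in components; real ones would be imported.
const TwitterEmbed = () => null;
const YouTubeEmbed = () => null;

const singleLineShortcodes = {
  singleTwitterStatus: {
    component: TwitterEmbed,
    type: 'react',
    // Assumes the URL sits alone inside <p>...</p>.
    regex: /<p>\s*https?:\/\/(?:www\.)?twitter\.com\/\w+\/status\/(\d+)[^\s<]*\s*<\/p>/g,
    matchPosition: 1, // capture group holding the tweet ID
    attributeName: 'tweetId',
  },
  // Two URL formats for the same component.
  youtubeWatch: {
    component: YouTubeEmbed,
    type: 'react',
    regex: /<p>\s*https?:\/\/(?:www\.)?youtube\.com\/watch\?v=([\w-]+)[^\s<]*\s*<\/p>/g,
    matchPosition: 1,
    attributeName: 'videoId',
  },
  youtubeShort: {
    component: YouTubeEmbed,
    type: 'react',
    regex: /<p>\s*https?:\/\/youtu\.be\/([\w-]+)[^\s<]*\s*<\/p>/g,
    matchPosition: 1,
    attributeName: 'videoId',
  },
};
```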


Note that you can also use the same components for embeds that you use for shortcodes if it makes sense. You might also need several regular expression patterns for the same component (e.g., there are multiple URL formats to embed a YouTube video).

Step 3: Set Up a Utility Function to Parse Post Content Strings

We then combine the bracket shortcode and single-line shortcode objects into a single object:
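One simple way to do this, sketched here with two tiny placeholder lists, is an object spread. This assumes bracket and single-line names never collide, since a later key would silently overwrite an earlier one.

```javascript
// Placeholder lists standing in for the two objects defined earlier in the file.
const Audio = () => null;
const TwitterEmbed = () => null;

const bracketShortcodes = {
  audio: { component: Audio, type: 'react' },
};

const singleLineShortcodes = {
  singleTwitterStatus: {
    component: TwitterEmbed,
    type: 'react',
    regex: /<p>\s*https?:\/\/twitter\.com\/\w+\/status\/(\d+)[^\s<]*\s*<\/p>/g,
    matchPosition: 1,
    attributeName: 'tweetId',
  },
};

// One flat lookup keyed by shortcode name. Single-line entries can still be
// told apart later because only they carry a `regex` property.
const shortcodes = { ...bracketShortcodes, ...singleLineShortcodes };
```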


Now we have a clean, consolidated object describing everything we want to parse out of the content string. Let's create a separate file called parseShortcodesInString.js. In it, we'll have a single function to find the shortcodes for us. We'll keep a list of found shortcodes sorted by their position in the content, which we'll use to break up the WordPress content string (stringToParse) into a postContent array of objects that reference either strings or React components with their associated props:


Step 4: Capture Shortcode Attributes with RegEx

Now we'll need some more regular expressions to parse the bracket shortcodes (findBracketShortcodes), but luckily we can use the same regular expressions for all of them since they all follow standard WordPress rules for shortcodes. We'll also need to parse the shortcodes' attributes for use as props/function arguments (parseShortcodeAttributes). Finally, we'll return an object for each shortcode with the relevant information, including the index where the shortcode appears in the content string so we can break the string up into pieces (returnShortcodeObject). Let's add these functions to our parseShortcodesInString function:
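The sketch below shows one way these three functions could look. The two regular expressions are adapted from wp.shortcode in WordPress core as best we can reproduce them, so check them against your WordPress version. The returned object's field names (index, length, and so on) are assumptions, and the functions are shown at top level rather than nested so the snippet runs on its own.

```javascript
// Builds the matcher for one tag (adapted from wp.shortcode.regexp).
// Groups: 1 extra "[", 2 tag, 3 attribute text, 4 self-closing "/",
// 5 enclosed content, 6 closing tag, 7 extra "]".
const shortcodeRegExp = (tag) =>
  new RegExp(
    '\\[(\\[?)(' + tag + ')(?![\\w-])([^\\]\\/]*(?:\\/(?!\\])[^\\]\\/]*)*?)' +
      '(?:(\\/)\\]|\\](?:([^\\[]*(?:\\[(?!\\/\\2\\])[^\\[]*)*)(\\[\\/\\2\\]))?)(\\]?)',
    'g'
  );

// Splits attribute text into named (key="value") and numeric (bare) values.
const parseShortcodeAttributes = (text) => {
  const named = {};
  const numeric = [];
  const pattern = /([\w-]+)\s*=\s*"([^"]*)"(?:\s|$)|([\w-]+)\s*=\s*'([^']*)'(?:\s|$)|([\w-]+)\s*=\s*([^\s'"]+)(?:\s|$)|"([^"]*)"(?:\s|$)|'([^']*)'(?:\s|$)|(\S+)(?:\s|$)/g;
  // Treat non-breaking and zero-width spaces as ordinary spaces.
  const normalized = text.replace(/[\u00a0\u200b]/g, ' ');
  let match;
  while ((match = pattern.exec(normalized)) !== null) {
    if (match[1]) named[match[1].toLowerCase()] = match[2];
    else if (match[3]) named[match[3].toLowerCase()] = match[4];
    else if (match[5]) named[match[5].toLowerCase()] = match[6];
    else if (match[7]) numeric.push(match[7]);
    else if (match[8]) numeric.push(match[8]);
    else if (match[9]) numeric.push(match[9]);
  }
  return { named, numeric };
};

// Standard description of one found shortcode (field names are assumptions).
const returnShortcodeObject = ({ shortcodeName, index, length, attributes, content = '' }) => ({
  shortcodeName,
  index, // where the match starts in stringToParse
  length, // how many characters the match covers
  attributes,
  content,
});

const findBracketShortcodes = (stringToParse, bracketShortcodes) => {
  const found = [];
  Object.keys(bracketShortcodes).forEach((tag) => {
    const pattern = shortcodeRegExp(tag);
    let match;
    while ((match = pattern.exec(stringToParse)) !== null) {
      // [[tag]] is WordPress's escape syntax, so leave it as plain text.
      if (match[1] === '[' && match[7] === ']') continue;
      found.push(
        returnShortcodeObject({
          shortcodeName: tag,
          index: match.index,
          length: match[0].length,
          attributes: parseShortcodeAttributes(match[3]),
          content: match[5] || '',
        })
      );
    }
  });
  return found;
};
```

Tag names are interpolated into the pattern unescaped, so this sketch assumes they contain only letters, digits, hyphens, and underscores.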


We referenced wp-includes/js/shortcode.js in WordPress core for the regular expressions used in the above functions.

Step 5: Capture Single-Line Embeds

Next we can find the single-line shortcodes/embeds (findSingleLineShortcodes). Here, we will use the regular expressions associated with each individual embed to check its existence in stringToParse. We will call the returnShortcodeObject function again to return a standard object of shortcode information:
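A minimal sketch of this step follows. The example embed entry, its pattern, and the fields of the returned object are assumptions made so the snippet runs on its own.

```javascript
// Hypothetical embed list with a single entry (see Step 2 for the idea).
const TwitterEmbed = () => null;
const singleLineShortcodes = {
  singleTwitterStatus: {
    component: TwitterEmbed,
    type: 'react',
    regex: /<p>\s*https?:\/\/twitter\.com\/\w+\/status\/(\d+)[^\s<]*\s*<\/p>/g,
    matchPosition: 1,
    attributeName: 'tweetId',
  },
};

// Same standard shape used for bracket shortcodes (field names are assumptions).
const returnShortcodeObject = ({ shortcodeName, index, length, attributes, content = '' }) => ({
  shortcodeName,
  index,
  length,
  attributes,
  content,
});

const findSingleLineShortcodes = (stringToParse, embeds) => {
  const found = [];
  Object.keys(embeds).forEach((name) => {
    const { regex, matchPosition, attributeName } = embeds[name];
    // Work on a fresh global copy so lastIndex state never leaks between calls.
    const flags = regex.flags.includes('g') ? regex.flags : regex.flags + 'g';
    const pattern = new RegExp(regex.source, flags);
    let match;
    while ((match = pattern.exec(stringToParse)) !== null) {
      if (match[0] === '') {
        pattern.lastIndex += 1; // guard against zero-length matches looping forever
        continue;
      }
      found.push(
        returnShortcodeObject({
          shortcodeName: name,
          index: match.index,
          length: match[0].length,
          // Expose the captured value under the configured attribute name.
          attributes: { named: { [attributeName]: match[matchPosition] }, numeric: [] },
        })
      );
    }
  });
  return found;
};
```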


Step 6: Combine Shortcodes and Embeds into an Ordered Array

Now we can get arrays of found items for bracket shortcodes as well as embeds and combine them into a single array (foundShortcodes). We will then iterate over these and push them to our sortedShortcodes object, using their starting index positions in stringToParse as the keys. This will allow us to get an array of all found shortcodes' starting indices in numerical order with Object.keys():
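To illustrate the ordering trick, here is a small sketch with hand-written sample results; the helper name sortShortcodesByIndex and the sample data are hypothetical. It relies on JavaScript engines enumerating integer-like property keys in ascending numeric order.

```javascript
// Keys each found shortcode by its starting index in stringToParse.
const sortShortcodesByIndex = (foundBracket, foundSingleLine) => {
  const foundShortcodes = [...foundBracket, ...foundSingleLine];
  const sortedShortcodes = {};
  foundShortcodes.forEach((shortcode) => {
    sortedShortcodes[shortcode.index] = shortcode;
  });
  return sortedShortcodes;
};

// Sample data: results arrive grouped by kind, not by position.
const sampleBracket = [{ shortcodeName: 'audio', index: 52, length: 20 }];
const sampleSingleLine = [
  { shortcodeName: 'singleTwitterStatus', index: 120, length: 41 },
  { shortcodeName: 'youtubeShort', index: 7, length: 30 },
];

const sortedSample = sortShortcodesByIndex(sampleBracket, sampleSingleLine);

// Integer-like keys come back in ascending order, as strings.
const sampleIndices = Object.keys(sortedSample); // ['7', '52', '120']
```

Note that the keys are strings, so convert them with Number() before using them in arithmetic or slice().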


Step 7: Rebuild the Post Content in a React-Ready Format

Now, we will want to break up the WordPress stringToParse into an array of objects with a standard format. Let's make a function to do this (returnContentObject). It will contain the content type (either string or component), the content (for string types this will just be a string of markup, while for component types it will be the content the shortcode surrounded, if any), the name of the shortcode (this will only be defined for component types), and finally the attributes (both named and numeric) which will only have data for component types:
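A possible shape for this helper is sketched below. The property names (type, content, shortcodeName, attributes) follow the description above, but the exact keys and defaults are assumptions.

```javascript
// Returns one entry of the postContent array in a standard shape.
const returnContentObject = ({ type, content = '', shortcodeName, attributes }) => ({
  type, // 'string' or 'component'
  content, // markup for strings; enclosed content (if any) for components
  shortcodeName, // undefined for string entries
  attributes: attributes || { named: {}, numeric: [] },
});

// A chunk of ordinary markup:
const stringBlock = returnContentObject({ type: 'string', content: '<p>Hello</p>' });

// A shortcode to be rendered by a React component:
const componentBlock = returnContentObject({
  type: 'component',
  content: 'A quote',
  shortcodeName: 'pullquote',
  attributes: { named: { author: 'Ann' }, numeric: [] },
});
```

Giving every entry the same keys means the rendering code never has to check whether a property exists before reading it.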


Now we go through the process of breaking up the WordPress stringToParse into pieces. If no shortcodes/embeds were found, we will end up with an array of just one object of string type with the entire WordPress post body as its content:


Otherwise, we'll iterate over each found shortcode/embed and push objects of strings and components to our postContent array. If the shortcode/embed will be parsed through a regular JavaScript function instead of a React component (based on the react/nonreact type in our original shortcodes object), we can call that function with the shortcode's associated attributes and content as arguments to get the returned string of markup, which we can then use in a content object of type string. The shortcodes/embeds that must be parsed with a React component go into content objects of type component:
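The splitting logic might look roughly like the sketch below, which also covers the case where nothing was found. The function name buildPostContent, the length field on each found shortcode, and the rule of skipping matches that overlap an earlier one are assumptions.

```javascript
const returnContentObject = ({ type, content = '', shortcodeName, attributes }) => ({
  type,
  content,
  shortcodeName,
  attributes: attributes || { named: {}, numeric: [] },
});

// sortedShortcodes: found shortcodes keyed by start index.
// shortcodes: the name -> { component, type } lookup from shortcodes.js.
const buildPostContent = (stringToParse, sortedShortcodes, shortcodes) => {
  // Integer-like keys are enumerated in ascending order.
  const indices = Object.keys(sortedShortcodes).map(Number);
  if (indices.length === 0) {
    return [returnContentObject({ type: 'string', content: stringToParse })];
  }

  const postContent = [];
  let cursor = 0; // first character not yet copied into postContent

  indices.forEach((index) => {
    const found = sortedShortcodes[index];
    if (index < cursor) return; // sits inside a previous match, so skip it

    if (index > cursor) {
      // Plain markup between the previous shortcode and this one.
      postContent.push(
        returnContentObject({ type: 'string', content: stringToParse.slice(cursor, index) })
      );
    }

    const definition = shortcodes[found.shortcodeName];
    if (definition.type === 'nonreact') {
      // Plain function: call it now and store the markup it returns.
      postContent.push(
        returnContentObject({
          type: 'string',
          content: definition.component(found.attributes, found.content),
        })
      );
    } else {
      postContent.push(
        returnContentObject({
          type: 'component',
          content: found.content,
          shortcodeName: found.shortcodeName,
          attributes: found.attributes,
        })
      );
    }
    cursor = index + found.length;
  });

  if (cursor < stringToParse.length) {
    // Whatever markup follows the last shortcode.
    postContent.push(returnContentObject({ type: 'string', content: stringToParse.slice(cursor) }));
  }
  return postContent;
};
```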


And that's it for parseShortcodesInString! It ends with return postContent.

Step 8: Integrate Parsing Utility into the Posts Reducer

Now we need to actually call parseShortcodesInString on our WordPress content data. We only want to do this operation once per post, so we'll call it in the reducer that adds WP post data to the Redux store, adding our postContent array with a key of parsedContent. In this case, post.content is the content string that comes from the WordPress REST API:
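A reducer along these lines could do the job. The action type RECEIVE_POSTS, the state shape keyed by post ID, and the stand-in parser are all assumptions so the sketch runs by itself; in a real project you would import your actual parseShortcodesInString.

```javascript
// Stand-in for the real parser: returns the whole body as one string block.
const parseShortcodesInString = (stringToParse) => [
  { type: 'string', content: stringToParse, shortcodeName: undefined, attributes: { named: {}, numeric: [] } },
];

const RECEIVE_POSTS = 'RECEIVE_POSTS'; // hypothetical action type

// Stores posts by ID, parsing each post's content once as it enters the store.
const postsReducer = (state = {}, action) => {
  switch (action.type) {
    case RECEIVE_POSTS:
      return action.posts.reduce(
        (nextState, post) => ({
          ...nextState,
          [post.id]: {
            ...post,
            // post.content is assumed to already be the content string.
            parsedContent: parseShortcodesInString(post.content),
          },
        }),
        state
      );
    default:
      return state;
  }
};
```

Because the parsing happens in the reducer, components read parsedContent straight from the store and never re-run the regular expressions on render.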


Last Step: Render the Post Data Array as React Components

Now we have the parsed content coming through as data in our Post component. However, it is still just an array of objects containing information about strings and components. We need to turn it into an array of actual strings and components. To do this, we need to import our list of shortcodes so that we can match the components by shortcode name to their associated React component. We also pass along a few extra props containing data about the post that we'll want to use in some of our components. For the string chunks, we'll use dangerouslySetInnerHTML on them since they contain markup.
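As a sketch of this mapping step, the function below turns the parsed array into elements. To keep it runnable without installing React, it takes createElement as a parameter; in a real Post component you would pass React.createElement or write the equivalent JSX. The function name, the wrapping div, and the prop names attributes and content are assumptions.

```javascript
// postContent: array from parseShortcodesInString.
// shortcodes: name -> { component, type } lookup from shortcodes.js.
// postProps: extra post data to hand to every component.
const renderPostContent = (postContent, shortcodes, postProps, createElement) =>
  postContent.map((block, i) => {
    if (block.type === 'string') {
      // String chunks contain markup, so they are injected as HTML.
      return createElement('div', {
        key: i,
        dangerouslySetInnerHTML: { __html: block.content },
      });
    }
    const definition = shortcodes[block.shortcodeName];
    if (!definition) return null; // unknown shortcode: render nothing
    return createElement(definition.component, {
      key: i,
      ...postProps,
      attributes: block.attributes,
      content: block.content,
    });
  });
```

In a component this might be called as renderPostContent(post.parsedContent, shortcodes, { postId: post.id }, React.createElement). Keep in mind that dangerouslySetInnerHTML bypasses React's escaping, so the markup should come from a source you trust.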


And that's it! Hopefully in the near future we'll be able to directly handle Gutenberg blocks in the WordPress API and lessen the need for shortcodes and embed parsing, but this is a good method for handling them in the meantime.