WordPress formatting bugs

I came across a few formatting bugs while writing the entry about my first plugin. When I used Movable Type, I opted not to have MT do any formatting of my posts for me. While that gave me complete control over the look and layout, it was a bit tedious to wrap each paragraph in <p> and </p>, use <br /> for line breaks and blank lines, etc. With WordPress, I’ve allowed it to do some of the formatting, and I checked “WordPress should correct invalidly nested XHTML automatically.”

Such handling comes with a price. Here are a few samples that trigger WordPress formatting bugs. The bugs involve at least the function balanceTags() in wp-includes/functions-formatting.php, and possibly others. I haven’t gotten deep enough into this part of the WordPress code to have a solution to these particular problems, but I’ll document them in case someone else can fix them. (I’m not the first to encounter the worst of the items listed below, as evidenced by this bug tracker item.)

  1. Create a post with the following body:

    <ul> <li>Won't close this item. <li>But will close this one.</li> <p>0123456789ABCDEFGH Note no close tag for the UL.</p>

    WordPress ‘fixes’ it as such:

    <ul> <li>Won't close this item. <li>But will close this one.</li> <p>0123456789ABCDEFGH Note no close tag for the UL.</p></li></ul>

    Which isn’t exactly what I’d expect. The first <li> tag should be closed immediately before the next <li> tag.

  2. Now let’s try this, an omitted </li>, but we WILL close the <ul> this time.

    <ul> <li>Won't close this item. <li>But will close this one.</li> </ul> <p>0123456789ABCDEFGH Only the LI wasn't closed this time.</p>

    WordPress ‘fixes’ it as such:

    <ul> <li>Won't close this item. <li>But will close this one.</li> </li></ul> <p>ABCDEFGH Only the LI wasn't closed this time.</p>

    That’s right, WordPress just ate 10 characters (the numbers 0 – 9) and those characters are gone gone gone.

  3. Now watch improper paragraph tagging:

    <div>Here is short pair of paragraphs that I

    want wrapped in a div.</div>

    WordPress ‘fixes’ it as such:

    <div>Here is short pair of paragraphs that I</p> <p>want wrapped in a div.</div>


    This invalidates the XHTML: there is a </p> without a matching preceding <p>, followed by an opening <p> without a matching </p>.

  4. A slight variation of the previous item, with slightly different results:

    Entry body:

    [[code]]This is some text.

    That I'll continue to another paragraph.[[/code]]

    WordPress ‘fixes’ it as such:

    <p>[[code]]This is some text.</p>
    <p>That I'll continue to another paragraph.[[/code]] </p>


    Which, strictly speaking, is improper nesting of tags.

  5. Entry body:

    <div class="myclass"> Some text here. <!--more--> Closing the div on this side of the more. </div>

    WordPress ‘fixes’ it as such:

    <div class="myclass"> Some text here. <!--more--> Closing the div on this side of the more. </></div>


    This, I think, is an example of the gotchas that can result from using the ‘more’ separator to separate text within a single text field. While I do believe the user is responsible for making sure their tags match properly above and below the line (treating the two as separate sections), I’m sure the tag balancing code could also take the ‘more’ separator into account. At the very least, it shouldn’t throw in the non-tag </>.
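
To make the expectations in items 1, 2 and 5 concrete, here is a minimal sketch of a tag balancer that behaves the way I would expect. It is written in Python rather than WordPress's PHP, it is not how balanceTags() actually works, and the names balance_fragment and balance_post are hypothetical. It assumes simple, well-formed tag syntax and only handles the cases shown above: an open <li> is closed before the next <li>, leftover open tags are closed at the end, stray closing tags are dropped, and the text on each side of the 'more' separator is balanced separately.

```python
import re

# Elements that never take a closing tag (an assumption, kept deliberately short).
VOID_TAGS = {"br", "hr", "img", "input"}

# Matches opening, closing and self-closing tags; comments such as
# <!--more--> do not match because a tag name must start with a letter.
TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*?(/?)>")
MORE_RE = re.compile(r"<!--more-->")


def balance_fragment(html):
    """Balance the tags of one fragment without touching its text."""
    out = []      # pieces of the rebuilt fragment
    stack = []    # names of the elements that are currently open
    pos = 0
    for m in TAG_RE.finditer(html):
        out.append(html[pos:m.start()])   # copy text verbatim, nothing is eaten
        pos = m.end()
        closing, name, self_closed = m.group(1), m.group(2).lower(), m.group(3)
        if closing:
            if name in stack:
                # Close anything still open inside, then the element itself.
                while True:
                    top = stack.pop()
                    out.append(f"</{top}>")
                    if top == name:
                        break
            # A closing tag with no open partner is dropped, never turned into </>.
        elif self_closed or name in VOID_TAGS:
            out.append(m.group(0))
        else:
            # A new <li> directly inside an open <li> closes the earlier one first.
            if name == "li" and stack and stack[-1] == "li":
                stack.pop()
                out.append("</li>")
            stack.append(name)
            out.append(m.group(0))
    out.append(html[pos:])
    while stack:                          # close whatever is still open
        out.append(f"</{stack.pop()}>")
    return "".join(out)


def balance_post(body):
    """Balance each side of the 'more' separator as its own section."""
    return "<!--more-->".join(balance_fragment(p) for p in MORE_RE.split(body))
```

Fed the body from item 1, balance_post() puts a </li> right before the second <li> and appends only </ul> at the end. Fed the body from item 2, it adds the missing </li> and leaves the digits 0 – 9 alone. For item 5 it closes the <div> above the separator and drops the unmatched </div> below it, which is a design choice on my part: it follows from treating the two halves as separate sections.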

4 thoughts on “WordPress formatting bugs”

  1. I came across some issues too. It looks like having <br> tags embedded within <p> </p> tags breaks WP parsing. All <br> tags have to be compliant, like this: <br />. I agree that WP should auto-correct this stuff instead of causing data loss like in my case.

    I always run into the non-tag </>. I run into it more when using the <!--more--> tag.

    -Ken
