A Warning: Node Module And Teaser Weirdness

Thu, 2009-01-15 16:01 By greg

I came in this morning to tackle the bug from hell. I spent a fair chunk of yesterday afternoon trying to figure out why, when some of my nodes were edited, the body content was prepended with the node teaser text. If the editor didn't notice and saved the node, the first paragraph would be duplicated. The weird thing was that once the node had been saved once, the problem went away.

We are running our own Content Distribution module, which mostly works great, but on this occasion I was finding the bug was manifesting directly after a node was distributed from one site to another.

I did all the usual. I printed the node, and it looked fine. I looked at the $form array for the node edit form, and although the default value for the body text field was screwy, everything else looked great. I took a database dump before and after an edit and scoured them in WinMerge for any tiny differences. Nada. I handed it over to a colleague, who uninstalled every single module that wasn't core, and the problem was still there (bang went the theory of some bad hook_nodeapi() or hook_form_alter() code).

It was only when I came to look at the differences between the node_revisions table on the two separate sites that I spotted a clue. The body text was being stored quite differently, but the teaser text was the same.

In the initial database the column data looked like this:


body => This is paragraph one.\r\rThis is paragraph 2.
teaser => This is paragraph one.

But in the receiving database, the processed content looked like this:


body => <p>This is paragraph one.</p><p>This is paragraph 2.</p>
teaser => This is paragraph one.

How did this happen? Well, we are publishing nodes from a central Drupal website out to external ones over web services. The web service uses a method called node.get to load the node to send to the other site, which is effectively a node_load() function call with XML around it. The node_load() function, however, applies HTML filters to the body field but not to the teaser field, so the publishing website sends the node body content as mark-up, even though the real node content is not mark-up. The node teaser content is still *not* mark-up, but plain text.

Turns out this discrepancy between the teaser and body fields in the node_revisions table causes the core node module to have a bit of a fit. (I never did pinpoint exactly how or where, as I fixed the problem and didn't want to spend a day chasing shadows in Eclipse.) In short, the node module expects the teaser to *perfectly* match the first X characters of the body or things go bad. If this is not the case (e.g. the body is wrapped in <p> tags and the teaser is not), the node module somehow prepends the teaser copy to the body copy when the node edit form is loaded (though nowhere else... go figure!).
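To make that "teaser must match the start of the body" expectation concrete, here is a minimal sketch in plain PHP. It is not the node module's actual code (I never tracked that down); the helper name example_teaser_matches_body() is hypothetical, and the sample strings assume the received body picked up <p> tags as described above.

```php
<?php
/**
 * Hypothetical helper (not part of Drupal core): returns TRUE when the
 * teaser is an exact prefix of the body, FALSE otherwise.
 */
function example_teaser_matches_body($body, $teaser) {
  // Compare only the first strlen($teaser) bytes of the body.
  return strncmp($body, $teaser, strlen($teaser)) === 0;
}

// As stored on the publishing site: the teaser is a prefix of the body.
$source_body = "This is paragraph one.\r\rThis is paragraph 2.";
$teaser = "This is paragraph one.";
var_dump(example_teaser_matches_body($source_body, $teaser));

// As stored on the receiving site (assumed): the body has gained <p> tags,
// so the plain-text teaser no longer matches its first characters.
$received_body = "<p>This is paragraph one.</p><p>This is paragraph 2.</p>";
var_dump(example_teaser_matches_body($received_body, $teaser));
```

The first call reports a match and the second does not, which is the state our distributed nodes were in until they were saved once. A check like this could be run against rows in node_revisions to spot affected nodes.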

The reason it fixed itself after the first edit was the teaser got updated on saving the node so the columns looked like this:


body => <p>This is paragraph one.</p><p>This is paragraph 2.</p>
teaser => <p>This is paragraph one.</p>

Thus the teaser matches the node body once more and the problem goes away.

We fixed it in the Content Distribution module by adding this line, rebuilding the teaser prior to our node_save() with the *actual* node body we receive from the publishing web service:

$node->teaser = node_teaser($node->body, $node->format);