gist.githubusercontent.com_simonw_6b76f8780bcbb8f6ad7b8c9f0dce5392_raw_617f45992427370d36be20ffd346bb84c07656b8_blog-paragraphs: 65
rowid | new_id | clean_paragraph |
---|---|---|
65 | 35-1 | passage: Thinking about it, almost all of the common errors I am experiencing come from the XML parser rather than the rules governing XHTML. I need an XML parser that examines each post as (or before) it is added to the blog and checks for well-formedness. Expat (used in PHP for event-based XML parsing) does not validate documents against a DTD, but it DOES die with an error if an XML document is malformed. It looks like it could be just what I need. |
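
The check described above can be sketched roughly as follows. This is only an illustration, not the blog's actual code: it uses Python's standard-library binding to Expat (`xml.parsers.expat`) instead of PHP's, and the function name `check_well_formed` and the wrapping `<div>` root element are assumptions made for the example (a post body is usually a fragment with several top-level elements, so it needs a single root before Expat will accept it).

```python
import xml.parsers.expat


def check_well_formed(post_body):
    """Return (True, None) if post_body is well-formed XML, else (False, error message).

    check_well_formed is a hypothetical helper name. The <div> wrapper is an
    assumption: it gives a multi-element post fragment the single root element
    that XML requires.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        # Second argument True marks this as the final (and only) chunk of input.
        parser.Parse("<div>" + post_body + "</div>", True)
    except xml.parsers.expat.ExpatError as err:
        # Expat does no DTD validation, but it raises on any malformed markup.
        return False, str(err)
    return True, None


# Run the check on each post before it is saved.
print(check_well_formed("<p>Hello <em>world</em></p>"))
print(check_well_formed("<p>Hello <em>world</p>"))
```

The first call should report success and the second should fail with Expat's mismatched-tag error. Line and column numbers in the message refer to the wrapped string, so the column is offset by the length of the added `<div>`.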