This repository was archived by the owner on May 30, 2025. It is now read-only.
XMLProcessor: Support SYSTEM and PUBLIC DOCTYPE sections (#7)
Support DOCTYPE declarations with SYSTEM and PUBLIC sections. The
following, previously unsupported, DOCTYPE declarations are now
correctly parsed and validated:
**SYSTEM identifiers**:
```xml
<!DOCTYPE html SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
```
**PUBLIC identifiers**:
```xml
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
```
Here's how the new API can be consumed:
```php
$processor = XMLProcessor::create_from_string(
'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"><root>Content</root>'
);
$processor->next_token();
// Output: html
echo $processor->get_doctype_name();
// Output: -//W3C//DTD XHTML 1.1//EN
echo $processor->get_pubid_literal();
// Output: http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd
echo $processor->get_system_literal();
```
This conforms to the following productions of the XML specification's DOCTYPE grammar:
```
doctypedecl ::= '<!DOCTYPE' S Name (S ExternalID)? S? ('[' intSubset ']' S?)? '>'
ExternalID ::= 'SYSTEM' S SystemLiteral | 'PUBLIC' S PubidLiteral S SystemLiteral
SystemLiteral ::= ('"' [^"]* '"') | ("'" [^']* "'")
PubidLiteral ::= '"' PubidChar* '"' | "'" (PubidChar - "'")* "'"
PubidChar ::= #x20 | #xD | #xA | [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%]
```
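To make the `ExternalID` production concrete, here is a minimal sketch that transcribes it into a regular expression. It is an illustration only and not the parser XMLProcessor uses; the helper name `sketch_parse_external_id` and its return shape are assumptions made for this example.

```php
<?php
// Illustrative sketch only: a regex transcription of the ExternalID grammar,
// not the parser that XMLProcessor actually uses.

/**
 * Hypothetical helper (not part of XMLProcessor).
 * Returns array( 'pubid' => ?string, 'system' => string ), or null on mismatch.
 */
function sketch_parse_external_id( string $input ): ?array {
	// S: one or more XML whitespace characters.
	$s = '[\x20\x09\x0D\x0A]+';
	// PubidChar without the apostrophe, written for use inside a character class.
	$pubid_chars = '\x20\x0D\x0Aa-zA-Z0-9\-()+,./:=?;!*#@$_%';
	// PubidLiteral: the apostrophe is only allowed in the double-quoted form.
	$pubid = '(?:"([' . $pubid_chars . '\']*)"|\'([' . $pubid_chars . ']*)\')';
	// SystemLiteral: anything except the enclosing quote character.
	$system = '(?:"([^"]*)"|\'([^\']*)\')';
	// 'SYSTEM' S SystemLiteral | 'PUBLIC' S PubidLiteral S SystemLiteral
	$pattern = '~\A(?:SYSTEM|PUBLIC' . $s . $pubid . ')' . $s . $system . '\z~';

	if ( 1 !== preg_match( $pattern, $input, $m, PREG_UNMATCHED_AS_NULL ) ) {
		return null;
	}

	return array(
		'pubid'  => $m[1] ?? $m[2], // null for SYSTEM identifiers
		'system' => $m[3] ?? $m[4],
	);
}

// Usage: a PUBLIC identifier must carry both literals.
$parsed  = sketch_parse_external_id( 'PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"' );
$invalid = sketch_parse_external_id( 'PUBLIC "-//W3C//DTD XHTML 1.1//EN"' );
```

In this sketch a `PUBLIC` identifier without a trailing system literal returns `null`, which mirrors the grammar: only the `SYSTEM` form may carry a single literal.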
### Other changes
XMLProcessor now distinguishes between invalid (malformed) DOCTYPE
syntax and unsupported DOCTYPE syntax (e.g. an inline entity
declaration), and yields an appropriate error for each.
## Testing instructions
Confirm all the XMLProcessor unit tests pass (the CI is red overall
because of other, unrelated failures).
cc @akirk