[go: up one dir, main page]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong default value of DOMDocument.xmlStandalone #11791

Closed
uuf6429 opened this issue Jan 3, 2023 · 5 comments
Closed

Wrong default value of DOMDocument.xmlStandalone #11791

uuf6429 opened this issue Jan 3, 2023 · 5 comments

Comments

@uuf6429
Copy link
uuf6429 commented Jan 3, 2023

The documentation here says that xmlStandalone defaults to false when missing, but the following test fails on PHP 8.1*:

    public function testThatStandaloneDefaultsToFalse(): void
    {
        $doc = new \DOMDocument();

        $doc->loadXML('<?xml version="1.0" encoding="US-ASCII"?><root/>');

        $this->assertFalse($doc->xmlStandalone);
    }

* I haven't tried it on other PHP versions and 3v4l.org sadly misses DOMDocument..

@cmb69
Copy link
Member
cmb69 commented Jan 4, 2023

It doesn't look like this depends on the PHP version; PHP treats that libxml2 int as bool. However, as of libxml2 2.6.31 (released 2004), it is no longer a simple bool, but rather:

    int             standalone; /* standalone document (no external refs) 
				     1 if standalone="yes"
				     0 if standalone="no"
				    -1 if there is no XML declaration
				    -2 if there is an XML declaration, but no
					standalone attribute was specified */

The W3C DOM Level 3 specs declare it as boolean, and state "This is false when unspecified." So the PHP documentation would actually be correct, but not the implementation.

However, the WHATWG DOM specification lists that interface member as removed.

Given that the WHATWG DOM specification supersedes the respective W3C specifications, I don't think that fixing the actual bug makes sense, so we should (a) document the current behavior, and (b) deprecate ::$xmlStandalone. If we don't want to deprecate that property, we should probably fix the bug (i.e. treat negative values as false). @beberlei, what do you think about this?

FWIW, we only require libxml2 ≥ 2.7.6 as of PHP 7.4.0; previously, we required only libxml2 ≥ 2.6.11, so builds with these old libxml2 versions would show a different behavior.

@uuf6429
Copy link
Author
uuf6429 commented Jan 4, 2023

The W3C DOM Level 3 specs declare it as boolean, and state "This is false when unspecified." So the PHP documentation would actually be correct, but not the implementation.

Yep, that's kind of what I meant...I started investigating especially after reading the W3C spec contradicting the PHP behaviour.

However, the WHATWG DOM specification lists that interface member as removed.

On the PHP side, apparently we've deprecated version (but not xmlVersion) and standalone (but not xmlStandalone), meanwhile encoding/xmlEncoding is a similar situation but not deprecated.
The bigger problem here is that:

  1. we cannot force DOMDocument not to output the XML declaration (because LIBXML_NOXMLDECL is not implemented)
  2. since the xmlXX fields are readonly, the only way to change the declaration is by changing the writeable (but deprecated fields)
  3. as it is right now, we can't remove the standalone attribute from the declaration either, ie the xml input will not (functionally) match the output

But these are obviously out of the scope of the documentation.

In my specific case, I think I'll change my implementation to save the root xml node and prepend the original xml declaration (instead of saving the entire document).

@cmb69
Copy link
Member
cmb69 commented Jan 4, 2023

On the PHP side, apparently we've deprecated version (but not xmlVersion) and standalone (but not xmlStandalone), meanwhile encoding/xmlEncoding is a similar situation but not deprecated.

Ugh. None of these properties are actually deprecated (E_DEPRECATED); only the docs say they are (and that since 2007 at least). We should probably really deprecate ::$version and ::$standalone, as these don't appear to have ever been specified. Not sure what to do about ::$encoding and friends; W3C DOM Level 3 specifies ::$inputEncoding and ::$xmlEncoding, while WHATWG has the latter removed, and says the former would be a legacy alias of ::$characterSet which we do not yet support at all.

  1. we cannot force DOMDocument not to output the XML declaration (because LIBXML_NOXMLDECL is not implemented)

Indeed, another doc bug (at least).

2. since the xmlXX fields are readonly, the only way to change the declaration is by changing the writeable (but deprecated fields)

Only ::$xmlEncoding (and ::$actualEncoding) are supposed to be readonly; ::$xmlStandalone and ::$xmlVersion are not.

@uuf6429
Copy link
Author
uuf6429 commented Jan 4, 2023

Not sure what to do about ::$encoding and friends

Only ::$xmlEncoding (and ::$actualEncoding) are supposed to be readonly; ::$xmlStandalone and ::$xmlVersion are not.

That is correct, my bad.

Doesn't that also mean encoding is PHP's way of providing a writable property (since the other two are not as per spec)?

@nielsdos
Copy link
Member

I'm going to move this issue to php-src, as this is an implementation bug.

@nielsdos nielsdos transferred this issue from php/doc-en Jul 25, 2023
nielsdos added a commit to nielsdos/php-src that referenced this issue Jul 25, 2023
At one point this was changed from a bool to an int in libxml2, with
negative values meaning it is unspecified. Because it is cast to a bool
this therefore returned true instead of the expected false.
nielsdos added a commit to nielsdos/php-src that referenced this issue Jul 25, 2023
At one point this was changed from a bool to an int in libxml2, with
negative values meaning it is unspecified. Because it is cast to a bool
this therefore returned true instead of the expected false.
nielsdos added a commit that referenced this issue Jul 26, 2023
* PHP-8.1:
  Fix GH-11791: Wrong default value of DOMDocument::xmlStandalone
nielsdos added a commit that referenced this issue Jul 26, 2023
* PHP-8.2:
  Fix GH-11791: Wrong default value of DOMDocument::xmlStandalone
jorgsowa pushed a commit to jorgsowa/php-src that referenced this issue Aug 16, 2023
At one point this was changed from a bool to an int in libxml2, with
negative values meaning it is unspecified. Because it is cast to a bool
this therefore returned true instead of the expected false.

Closes phpGH-11793.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants