[go: up one dir, main page]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Stream] STREAM_NOTIFY_PROGRESS over HTTP emitted irregularly for last chunk of data #10031

Closed
aetonsi opened this issue Dec 1, 2022 · 3 comments · Fixed by ThePHPF/thephp.foundation#90

Comments

@aetonsi
Copy link
aetonsi commented Dec 1, 2022

Description

Problem

Files

file echo.php
<?php

$fsize = 1000;
$chunksize = 99;
$chunks = floor($fsize / $chunksize); // 10 chunks * 99 bytes
$lastchunksize = $fsize - $chunksize * $chunks; // 1 chunk * 10 bytes

header("Content-length: " . $fsize);
flush();
for ($chunk = 1; $chunk <= $chunks; $chunk++) {
    echo str_repeat('x', $chunksize);
    @ob_flush();
    usleep(50 * 1000);
}

echo str_repeat('x', $lastchunksize);
file test.php
<?php

$context = stream_context_create(['http' => ['ignore_errors' => true,]]);
$lastBytesTransferred = 0;
stream_context_set_params($context, ['notification' => function ($code, $s, $m, $mc, $bytes_transferred, $bytes_max)
use (&$lastBytesTransferred) {
    $progress = $bytes_max ? $bytes_transferred / $bytes_max * 100 : 0;
    $chunksize = $bytes_transferred - $lastBytesTransferred;
    if ($code === STREAM_NOTIFY_PROGRESS) echo "\nprogress: $progress% ($bytes_transferred/$bytes_max b) (chunksize=$chunksize)";
    if ($code === STREAM_NOTIFY_FILE_SIZE_IS) echo "\nexpected filesize= $bytes_max";
    if ($code === STREAM_NOTIFY_COMPLETED) echo "\nSTREAM_NOTIFY_COMPLETED";
    $lastBytesTransferred = $bytes_transferred;
    @ob_flush();
}]);

$get = file_get_contents('http://localhost:8000/echo.php', false, $context);

if (!$get) die("\nERROR during file_get_contents. is the local server running? ( php -\"dxdebug.mode\"=off -S localhost:8000 )\n");

echo "\ngot filesize=" . strlen($get) . PHP_EOL;

CLI commands:

# to start local server
php -S localhost:8000
# to run the test
php test.php

Wrong output (happens irregularly):

> php .\src\test.php

expected filesize= 1000
progress: 0% (0/1000 b) (chunksize=0)
progress: 9.9% (99/1000 b) (chunksize=99)
progress: 19.8% (198/1000 b) (chunksize=99)
progress: 29.7% (297/1000 b) (chunksize=99)
progress: 39.6% (396/1000 b) (chunksize=99)
progress: 49.5% (495/1000 b) (chunksize=99)
progress: 59.4% (594/1000 b) (chunksize=99)
progress: 69.3% (693/1000 b) (chunksize=99)
progress: 79.2% (792/1000 b) (chunksize=99)
progress: 89.1% (891/1000 b) (chunksize=99)
progress: 90.1% (901/1000 b) (chunksize=10)
got filesize=1000

Correct output:

> php .\src\test.php

expected filesize= 1000
progress: 0% (0/1000 b) (chunksize=0)
progress: 9.9% (99/1000 b) (chunksize=99)
progress: 19.8% (198/1000 b) (chunksize=99)
progress: 29.7% (297/1000 b) (chunksize=99)
progress: 39.6% (396/1000 b) (chunksize=99)
progress: 49.5% (495/1000 b) (chunksize=99)
progress: 59.4% (594/1000 b) (chunksize=99)
progress: 69.3% (693/1000 b) (chunksize=99)
progress: 79.2% (792/1000 b) (chunksize=99)
progress: 89.1% (891/1000 b) (chunksize=99)
progress: 99% (990/1000 b) (chunksize=99)
progress: 100% (1000/1000 b) (chunksize=10)
got filesize=1000

Description

The file echo.php generates a series of 1000 consecutive "x"s. The data is emitted with intervals of 50ms (usleep(50 * 1000)), in 10 chunks sized 99bytes, plus one final chunk of 10bytes (99*10 + 10 = 1000).

The script test.php fetches the data from the url /echo.php of the local web server. file_get_contents is passed a context with a notification callback. This callback prints the download progress: percentage, bytes transferred, total bytes, current chunk size. The chunk size is calculated by subtracting the previous $bytes_transferred from the current $bytes_transferred.

If you run php test.php multiple times, you will see that sometimes the event last 1/2 STREAM_NOTIFY_PROGRESS events are not emitted correctly.

The last events should always be:

  • progress: 89.1% (891/1000 b) (chunksize=99) = chunk 9
  • progress: 99% (**990**/1000 b) (chunksize=**99**) = chunk 10
  • progress: 100% (**1000**/1000 b) (chunksize=**10**) = chunk 11

Instead, sometimes the last chunks are messed up:

  • progress: 89.1% (891/1000 b) (chunksize=99) = chunk 9
  • progress: 90.1% (**901**/1000 b) (chunksize=**10**) = chunk 10. This line here is the culprit. This chunk has the correct size of the last chunk (10 bytes), but it's reported as being the 10th! In reality, the 10th chunk is still 99bytes large, and the 11th is the 10bytes one.

When this problem occurs, 100% progress is never emitted (as the 10th chunk is mistaken for the 11th).


I don't think this is a problem in the communication between php and the web server. Any possible delay in the communication shouldn't interfere with how the chunks are passed to the notification callback. I tried running curl -#O multiple times to see how curl receives the data. This is the command line i used (powershell): 1..10 | % { curl -#LO http://localhost:8000/echo.php *>&1 | php -r "preg_match_all('/[0-9\.]+/m', stream_get_contents(STDIN), `$m);echo implode(', ', `$m[0]).PHP_EOL;" ;}. For each iteration (1..10), this prints every progress percentage reported by curl.

It prints something like this:

19.8, 39.6, 49.5, 59.4, 69.3, 79.2, 99.0, 100.0
19.8, 29.7, 39.6, 49.5, 59.4, 69.3, 79.2, 89.1, 99.0, 100.0
19.8, 29.7, 39.6, 49.5, 59.4, 69.3, 89.1, 99.0, 100.0
19.8, 39.6, 49.5, 59.4, 69.3, 79.2, 89.1, 99.0, 100.0
19.8, 29.7, 39.6, 49.5, 59.4, 69.3, 79.2, 99.0, 100.0
19.8, 29.7, 39.6, 59.4, 69.3, 79.2, 89.1, 99.0, 100.0
19.8, 29.7, 39.6, 49.5, 59.4, 79.2, 89.1, 99.0, 100.0
19.8, 29.7, 39.6, 49.5, 59.4, 69.3, 79.2, 89.1, 99.0, 100.0
19.8, 39.6, 49.5, 59.4, 69.3, 79.2, 89.1, 99.0, 100.0
19.8, 29.7, 39.6, 49.5, 59.4, 69.3, 79.2, 89.1, 99.0, 100.0

As you can see, curl doesn't always report its progress, but even though it skips some reporting, it always has the correct progress percentages. It always ends with 89.1, 99.0, 100.0, which is the correct sequence.

php on the other hand sometimes ends with 89.1, 90.1.


In conclusion, at the moment the STREAM_NOTIFY_PROGRESS event is not reliable.

I cannot determine why this problem occures only sometimes. I really can't. This appears extremely frequently when the chunks are transferred with varying sizes (eg if the php script on the server calls readfile(), and/or if the connection to the server is not stable).

PHP Version

PHP 8.1.12

Operating System

Windows 11 (10.0.22621.819)

@cmb69
Copy link
Member
cmb69 commented Dec 2, 2022

Interestingly, I cannot reproduce this on native Windows (10), but easily on WSL.

@aetonsi
Copy link
Author
aetonsi commented Dec 9, 2022

With the same php version? That's strange...

Just speculating without checking the source (i don't understand it): could it be a problem with the OS (win11 and wsl)? I mean, maybe the OS transfers the bytes correctly, but then notifies the progress incorrectly, and php blindly passes it to the notification callback. At the same time, with this hypothesis, curl and similar tools maybe don't rely on the OS reportings, but they do their own checks.

If that's the case, while waiting (and hoping) for an OS-level fix, a php-level fix would simply be to calculate the current chunk size ( $current_chunk_size = $current_file_size - $previous_file_size ) and set $bytes_transferred = $previous_notified_bytes_transferred + $current_chunk_size.

But that's all speculation as i said.

nielsdos added a commit to nielsdos/php-src that referenced this issue Feb 2, 2023
…regularly for last chunk of data

Fixes phpGH-10031

It's possible that the server already sent in more data than just the headers.
Since the stream only accepts progress increments after the headers are
processed, the already read data is never added to the process.
We account for this by adjusting the progress counter by the difference of
already read header data and the body.
nielsdos added a commit to nielsdos/php-src that referenced this issue Feb 2, 2023
Co-authored-by: aetonsi <18366087+aetonsi@users.noreply.github.com>
nielsdos added a commit to nielsdos/php-src that referenced this issue Feb 2, 2023
Co-authored-by: aetonsi <18366087+aetonsi@users.noreply.github.com>
nielsdos added a commit to nielsdos/php-src that referenced this issue Feb 2, 2023
…regularly for last chunk of data

Fixes phpGH-10031

It's possible that the server already sent in more data than just the headers.
Since the stream only accepts progress increments after the headers are
processed, the already read data is never added to the process.
We account for this by adjusting the progress counter by the difference of
already read header data and the body.
nielsdos added a commit to nielsdos/php-src that referenced this issue Feb 2, 2023
Co-authored-by: aetonsi <18366087+aetonsi@users.noreply.github.com>
nielsdos added a commit that referenced this issue May 5, 2023
* PHP-8.1:
  Fix GH-10031: [Stream] STREAM_NOTIFY_PROGRESS over HTTP emitted irregularly for last chunk of data
nielsdos added a commit that referenced this issue May 5, 2023
* PHP-8.2:
  Fix GH-10031: [Stream] STREAM_NOTIFY_PROGRESS over HTTP emitted irregularly for last chunk of data
@aetonsi
Copy link
Author
aetonsi commented Jun 12, 2023

... and thanks again @nielsdos !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants
@cmb69 @aetonsi and others