[HttpFoundation] Add StreamedJsonResponse for efficient JSON streaming #47709

Merged
1 change: 1 addition & 0 deletions src/Symfony/Component/HttpFoundation/CHANGELOG.md
@@ -9,6 +9,7 @@ CHANGELOG
6.2
---

* Add `StreamedJsonResponse` class for efficient JSON streaming
* The HTTP cache store uses the `xxh128` algorithm
* Deprecate calling `JsonResponse::setCallback()`, `Response::setExpires/setLastModified/setEtag()`, `MockArraySessionStorage/NativeSessionStorage::setMetadataBag()`, `NativeSessionStorage::setSaveHandler()` without arguments
* Add request matchers under the `Symfony\Component\HttpFoundation\RequestMatcher` namespace
139 changes: 139 additions & 0 deletions src/Symfony/Component/HttpFoundation/StreamedJsonResponse.php
@@ -0,0 +1,139 @@
<?php

/*
 * This file is part of the Symfony package.
 *
 * (c) Fabien Potencier <fabien@symfony.com>
 *
 * For the full copyright and license information, please view the LICENSE
 * file that was distributed with this source code.
 */

namespace Symfony\Component\HttpFoundation;

/**
 * StreamedJsonResponse represents a streamed HTTP response for JSON.
 *
 * A StreamedJsonResponse uses a structure and generators to create an
 * efficient, resource-saving JSON response.
 *
 * It is recommended to call flush() after a specific number of items so that the data is streamed directly.
 *
 * @see flush()
 *
 * @author Alexander Schranz <alexander@sulu.io>
 *
 * Example usage:
 *
 *     function loadArticles(): \Generator
 *     {
 *         // some streamed loading
 *         yield ['title' => 'Article 1'];
 *         yield ['title' => 'Article 2'];
 *         yield ['title' => 'Article 3'];
 *         // it is recommended to call flush() after every specific number of items
 *     }
 *
 *     $response = new StreamedJsonResponse(
 *         // JSON structure containing the generators which will be streamed
 *         [
 *             '_embedded' => [
 *                 'articles' => loadArticles(), // any generator which you want to stream as a list of data
 *             ],
 *         ],
 *     );
 */
class StreamedJsonResponse extends StreamedResponse
{
    private const PLACEHOLDER = '__symfony_json__';

    /**
     * @param mixed[] $data JSON data containing PHP generators which will be streamed as a list of data
     * @param int $status The HTTP status code (200 "OK" by default)
     * @param array<string, string|string[]> $headers An array of HTTP headers
     * @param int $encodingOptions Flags for the json_encode() function
     */
    public function __construct(
        private readonly array $data,
        int $status = 200,
        array $headers = [],
        private int $encodingOptions = JsonResponse::DEFAULT_ENCODING_OPTIONS,
    ) {
        parent::__construct($this->stream(...), $status, $headers);

        if (!$this->headers->get('Content-Type')) {
            $this->headers->set('Content-Type', 'application/json');
        }
    }

    private function stream(): void
    {
        $generators = [];
        $structure = $this->data;

        array_walk_recursive($structure, function (&$item, $key) use (&$generators) {
            if (self::PLACEHOLDER === $key) {
                // if the placeholder is already used as a key in the structure, it is registered like a generator
                // so that explode() still splits the structure as expected
                $generators[] = $key;
            }

            // generators should be used, but for better DX all kinds of Traversable and objects are supported
            if (\is_object($item)) {
                $generators[] = $item;
                $item = self::PLACEHOLDER;
            } elseif (self::PLACEHOLDER === $item) {
                // if the placeholder is already used as a value in the structure, it is registered like a generator
                // so that explode() still splits the structure as expected
                $generators[] = $item;
            }
        });

        $jsonEncodingOptions = \JSON_THROW_ON_ERROR | $this->encodingOptions;
        $keyEncodingOptions = $jsonEncodingOptions & ~\JSON_NUMERIC_CHECK;

        $jsonParts = explode('"'.self::PLACEHOLDER.'"', json_encode($structure, $jsonEncodingOptions));

        foreach ($generators as $index => $generator) {
            // send the first part of the structure and the parts in between
            echo $jsonParts[$index];
Member

@jderusse jderusse Oct 24, 2022

Are we 100% sure that array_walk_recursive and json_encode iterate over the items in exactly the same order (I'm thinking of (circular) references, for instance), so that the placeholder positions stay synchronized with the $generators keys?

Also, what if the PLACEHOLDER string is part of the serialized output?
For example, ["__symfony_json__" => "foo"] or new SerializableEntity(name: '__symfony_json__'): both cases will generate '"'.self::PLACEHOLDER.'"' in the output but won't be picked up by array_walk_recursive.

IMHO we should get rid of the second case by using a really random placeholder.

For the first issue, we could "labelize" the generators:

$generators = [];

if ($item instanceof \Traversable && !$item instanceof \JsonSerializable) {
  // number the placeholder with the zero-based index the generator is stored under
  $index = count($generators);
  $generators[$index] = $item;
  $item = self::PLACEHOLDER.$index;
}

later

$jsonParts = preg_split('/"'.self::PLACEHOLDER.'(\d++)"/', json_encode($structure, $jsonEncodingOptions), -1, PREG_SPLIT_DELIM_CAPTURE);
for ($i = 0; $i < count($jsonParts); $i += 2) {
  echo $jsonParts[$i];
  if ($i + 1 < count($jsonParts)) {
    // the captured number identifies the generator that belongs at this position
    $generator = $generators[(int) $jsonParts[$i + 1]];
  }
}
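
The numbered-placeholder idea in this comment can be tried out in isolation. The following is only a sketch in plain PHP, not code from the PR: the names `PLACEHOLDER_PREFIX`, `extractGenerators()` and `renderJson()` are hypothetical, and the output is buffered into a string with `iterator_to_array()` instead of being echoed item by item, purely to keep the example short.

```php
<?php

// Hypothetical marker prefix; the PR itself uses the constant '__symfony_json__'.
const PLACEHOLDER_PREFIX = '__placeholder_';

/**
 * Replaces every Traversable leaf with a numbered marker string and
 * stores the replaced object under that number in $generators.
 */
function extractGenerators(array $structure, array &$generators): array
{
    array_walk_recursive($structure, function (&$item) use (&$generators) {
        if ($item instanceof \Traversable && !$item instanceof \JsonSerializable) {
            $index = count($generators); // zero-based, equals the key used below
            $generators[$index] = $item;
            $item = PLACEHOLDER_PREFIX.$index;
        }
    });

    return $structure;
}

function renderJson(array $data): string
{
    $generators = [];
    $structure = extractGenerators($data, $generators);

    // With PREG_SPLIT_DELIM_CAPTURE the result alternates between
    // JSON fragments (even offsets) and captured indexes (odd offsets).
    $parts = preg_split(
        '/"'.preg_quote(PLACEHOLDER_PREFIX, '/').'(\d++)"/',
        json_encode($structure, JSON_THROW_ON_ERROR),
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    $out = '';
    foreach ($parts as $offset => $part) {
        if (0 === $offset % 2) {
            $out .= $part;

            continue;
        }

        // Simplification: the items are collected here instead of streamed one by one.
        $items = iterator_to_array($generators[(int) $part], false);
        $out .= json_encode($items, JSON_THROW_ON_ERROR);
    }

    return $out;
}

$articles = (function (): \Generator {
    yield 'A 1';
    yield 'A 2';
})();

echo renderJson(['articles' => $articles, 'total' => 2]), "\n";
// prints {"articles":["A 1","A 2"],"total":2}
```

Because each marker carries its own index, the mapping no longer depends on array_walk_recursive and json_encode visiting items in the same order. The sketch does not address user data that happens to contain a marker string.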

Member

A random placeholder does not guarantee that it won't be part of the data. It only makes it non-deterministic.
The case where the placeholder itself appears in the data is solved for values, because array_walk_recursive injects a generator for such a case, but it is indeed not solved for keys or JsonSerializable (for JsonSerializable, we could solve that by resolving them manually, though).

Contributor Author

We could solve both by doing the same as we currently do for values containing __symfony_json__. I will have a look at it this evening and create a test case for it.

Contributor Author

I fixed the handling; it is tested via testResponseOtherTraversable and testPlaceholderAsKeyAndValueInStructure.

Member

What about objects that are not JsonSerializable?

Contributor Author

@alexander-schranz alexander-schranz Oct 24, 2022

@jderusse A test for it was added in testResponseOtherTraversable; it also covers the case where the object contains the placeholder as both key and value. So it works as expected from my point of view.

Member

@jderusse jderusse Oct 24, 2022

Actually, I would expect this test to pass:

public function testPlaceholderAsObjectStructure()
{
    $object = new class() {
        public $__symfony_json__ = 'foo';
        public $bar = '__symfony_json__';
    };
    $content = $this->createSendResponse(
        [
            'object' => $object,
        ],
    );

    $this->assertSame('{"object":{"__symfony_json__":"foo","bar":"__symfony_json__"}}', $content);
}

Member

@jderusse jderusse Oct 24, 2022

Also, this test breaks, which makes me think that handling the placeholder is really tricky:

public function testPlaceholderWithNested()
{
    $content = $this->createSendResponse(
        [
            '__symfony_json__' => [
                '__symfony_json__' => '__symfony_json__',
            ],
        ],
    );

    $this->assertSame('{"__symfony_json__":{"__symfony_json__":"__symfony_json__"}}', $content);
}

Contributor Author

"Actually, I would expect this test to pass."

@jderusse Fixed that by handling all objects via the placeholders.

"Also, this test breaks, which makes me think that handling the placeholder is really tricky."

Yeah, not sure if we should optimize here for an edge case. This is something we cannot handle with array_walk_recursive currently.
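
A small standalone check can make the nested case from testPlaceholderWithNested concrete. It is only an illustration in plain PHP, and the variable names are hypothetical. array_walk_recursive calls its callback for leaf values only, so a key whose value is an array is never passed to the callback, although json_encode still writes that key into the output.

```php
<?php

$structure = [
    '__symfony_json__' => [
        '__symfony_json__' => '__symfony_json__',
    ],
];

// Record every key/value pair the callback is invoked with.
$visited = [];
array_walk_recursive($structure, function ($item, $key) use (&$visited) {
    $visited[] = ['key' => $key, 'value' => $item];
});

// Count how often the quoted placeholder appears in the encoded structure.
$occurrences = substr_count(json_encode($structure), '"__symfony_json__"');

printf("callback invocations: %d\n", count($visited));           // prints 1
printf("placeholder occurrences in JSON: %d\n", $occurrences);   // prints 3
```

The single invocation covers the inner key and the inner value, so at most two entries can be registered, while explode() finds three placeholders in the encoded string. The outer key accounts for the difference.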


            if ($generator instanceof \JsonSerializable || !$generator instanceof \Traversable) {
                // the placeholders, JsonSerializable and non-Traversable items in the structure are rendered here
                echo json_encode($generator, $jsonEncodingOptions);

                continue;
            }

            $isFirstItem = true;
            $startTag = '[';

            foreach ($generator as $key => $item) {
                if ($isFirstItem) {
                    $isFirstItem = false;
                    // depending on the first element's key, the generator is detected as a list or a map;
                    // we cannot check the whole list or map because that would hurt the performance
                    // of the streamed response, which is the main goal of this response class
                    if (0 !== $key) {
                        $startTag = '{';
                    }

                    echo $startTag;
                } else {
                    // if it is not the first element of the generator, a separator is required between the elements
                    echo ',';
                }

                if ('{' === $startTag) {
                    echo json_encode((string) $key, $keyEncodingOptions).':';
                }

                echo json_encode($item, $jsonEncodingOptions);
            }

            if ($isFirstItem) {
                // the generator was empty, so the opening bracket has not been sent yet
                echo '[';
            }

            echo '[' === $startTag ? ']' : '}';
        }

        // send the last part of the structure
        echo $jsonParts[array_key_last($jsonParts)];
    }
}
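
The list-or-map decision made in the inner loop of stream() can be looked at on its own. The sketch below is plain PHP without Symfony; the function name `echoIterableAsJson()` is hypothetical and the encoding options are reduced to JSON_THROW_ON_ERROR. It assumes, like the class above, that only the first key decides between `[` and `{`, and it writes `[]` for an empty iterable.

```php
<?php

/**
 * Echoes an iterable as a JSON list or a JSON object.
 * Only the first key is inspected, so the items never have to be buffered.
 */
function echoIterableAsJson(iterable $items): void
{
    $isFirstItem = true;
    $isList = true;

    foreach ($items as $key => $item) {
        if ($isFirstItem) {
            $isFirstItem = false;
            // A first key of integer 0 means "list"; anything else means "map".
            $isList = 0 === $key;
            echo $isList ? '[' : '{';
        } else {
            echo ',';
        }

        if (!$isList) {
            // Object keys must be JSON strings, even when they are integers in PHP.
            echo json_encode((string) $key, JSON_THROW_ON_ERROR).':';
        }

        echo json_encode($item, JSON_THROW_ON_ERROR);
    }

    if ($isFirstItem) {
        // Nothing was yielded: the opening bracket still has to be written.
        echo '[';
    }

    echo $isList ? ']' : '}';
}

echoIterableAsJson((function (): \Generator {
    yield 1 => 'one';
    yield 3 => 'three';
})());
// prints {"1":"one","3":"three"}
```

The trade-off is that a generator which starts at key 0 and later yields string keys is still rendered as a list, which is what the `list` case in testResponseWithMixedKeyType asserts.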
@@ -0,0 +1,241 @@
<?php

/*
 * This file is part of the Symfony package.
 *
 * (c) Fabien Potencier <fabien@symfony.com>
 *
 * For the full copyright and license information, please view the LICENSE
 * file that was distributed with this source code.
 */

namespace Symfony\Component\HttpFoundation\Tests;

use PHPUnit\Framework\TestCase;
use Symfony\Component\HttpFoundation\StreamedJsonResponse;

class StreamedJsonResponseTest extends TestCase
{
    public function testResponseSimpleList()
    {
        $content = $this->createSendResponse(
            [
                '_embedded' => [
                    'articles' => $this->generatorSimple('Article'),
                    'news' => $this->generatorSimple('News'),
                ],
            ],
        );

        $this->assertSame('{"_embedded":{"articles":["Article 1","Article 2","Article 3"],"news":["News 1","News 2","News 3"]}}', $content);
    }

    public function testResponseObjectsList()
    {
        $content = $this->createSendResponse(
            [
                '_embedded' => [
                    'articles' => $this->generatorArray('Article'),
                ],
            ],
        );

        $this->assertSame('{"_embedded":{"articles":[{"title":"Article 1"},{"title":"Article 2"},{"title":"Article 3"}]}}', $content);
    }

    public function testResponseWithoutGenerator()
    {
        // while it is not the intended usage, all kinds of iterables should be supported for good DX
        $content = $this->createSendResponse(
            [
                '_embedded' => [
                    'articles' => ['Article 1', 'Article 2', 'Article 3'],
                ],
            ],
        );

        $this->assertSame('{"_embedded":{"articles":["Article 1","Article 2","Article 3"]}}', $content);
    }

    public function testResponseWithPlaceholder()
    {
        // the placeholder must not conflict with generator injection
        $content = $this->createSendResponse(
            [
                '_embedded' => [
                    'articles' => $this->generatorArray('Article'),
                    'placeholder' => '__symfony_json__',
                    'news' => $this->generatorSimple('News'),
                ],
                'placeholder' => '__symfony_json__',
            ],
        );

        $this->assertSame('{"_embedded":{"articles":[{"title":"Article 1"},{"title":"Article 2"},{"title":"Article 3"}],"placeholder":"__symfony_json__","news":["News 1","News 2","News 3"]},"placeholder":"__symfony_json__"}', $content);
    }

    public function testResponseWithMixedKeyType()
    {
        $content = $this->createSendResponse(
            [
                '_embedded' => [
                    'list' => (function (): \Generator {
                        yield 0 => 'test';
                        yield 'key' => 'value';
                    })(),
                    'map' => (function (): \Generator {
                        yield 'key' => 'value';
                        yield 0 => 'test';
                    })(),
                    'integer' => (function (): \Generator {
                        yield 1 => 'one';
                        yield 3 => 'three';
                    })(),
                ],
            ]
        );

        $this->assertSame('{"_embedded":{"list":["test","value"],"map":{"key":"value","0":"test"},"integer":{"1":"one","3":"three"}}}', $content);
    }

    public function testResponseOtherTraversable()
    {
        $arrayObject = new \ArrayObject(['__symfony_json__' => '__symfony_json__']);

        $iteratorAggregate = new class() implements \IteratorAggregate {
            public function getIterator(): \Traversable
            {
                return new \ArrayIterator(['__symfony_json__']);
            }
        };

        $jsonSerializable = new class() implements \IteratorAggregate, \JsonSerializable {
            public function getIterator(): \Traversable
            {
                return new \ArrayIterator(['This should be ignored']);
            }

            public function jsonSerialize(): mixed
            {
                return ['__symfony_json__' => '__symfony_json__'];
            }
        };

        // while generators should be used for performance reasons, the response should also work with any Traversable
        // to make things easier for a developer
        $content = $this->createSendResponse(
            [
                'arrayObject' => $arrayObject,
                'iteratorAggregate' => $iteratorAggregate,
                'jsonSerializable' => $jsonSerializable,
                // add a generator to make sure it still works in combination with other Traversable objects
                'articles' => $this->generatorArray('Article'),
            ],
        );

        $this->assertSame('{"arrayObject":{"__symfony_json__":"__symfony_json__"},"iteratorAggregate":["__symfony_json__"],"jsonSerializable":{"__symfony_json__":"__symfony_json__"},"articles":[{"title":"Article 1"},{"title":"Article 2"},{"title":"Article 3"}]}', $content);
    }

    public function testPlaceholderAsKeyAndValueInStructure()
    {
        $content = $this->createSendResponse(
            [
                '__symfony_json__' => '__symfony_json__',
                'articles' => $this->generatorArray('Article'),
            ],
        );

        $this->assertSame('{"__symfony_json__":"__symfony_json__","articles":[{"title":"Article 1"},{"title":"Article 2"},{"title":"Article 3"}]}', $content);
    }

    public function testResponseStatusCode()
    {
        $response = new StreamedJsonResponse([], 201);

        $this->assertSame(201, $response->getStatusCode());
    }

    public function testPlaceholderAsObjectStructure()
    {
        $object = new class() {
            public $__symfony_json__ = 'foo';
            public $bar = '__symfony_json__';
        };

        $content = $this->createSendResponse(
            [
                'object' => $object,
                // add a generator to make sure it still works in combination with other objects holding placeholders
                'articles' => $this->generatorArray('Article'),
            ],
        );

        $this->assertSame('{"object":{"__symfony_json__":"foo","bar":"__symfony_json__"},"articles":[{"title":"Article 1"},{"title":"Article 2"},{"title":"Article 3"}]}', $content);
    }

    public function testResponseHeaders()
    {
        $response = new StreamedJsonResponse([], 200, ['X-Test' => 'Test']);

        $this->assertSame('Test', $response->headers->get('X-Test'));
    }

    public function testCustomContentType()
    {
        $response = new StreamedJsonResponse([], 200, ['Content-Type' => 'application/json+stream']);

        $this->assertSame('application/json+stream', $response->headers->get('Content-Type'));
    }

    public function testEncodingOptions()
    {
        $response = new StreamedJsonResponse([
            '_embedded' => [
                'count' => '2', // options are applied to the initial json_encode() call
                'values' => (function (): \Generator {
                    yield 'with/unescaped/slash' => 'With/a/slash'; // options are applied to keys and values
                    yield '3' => '3'; // numeric check for the value, but not for the key
                })(),
            ],
        ], encodingOptions: \JSON_UNESCAPED_SLASHES | \JSON_NUMERIC_CHECK);

        ob_start();
        $response->send();
        $content = ob_get_clean();

        $this->assertSame('{"_embedded":{"count":2,"values":{"with/unescaped/slash":"With/a/slash","3":3}}}', $content);
    }

    /**
     * @param mixed[] $data
     */
    private function createSendResponse(array $data): string
    {
        $response = new StreamedJsonResponse($data);

        ob_start();
        $response->send();

        return ob_get_clean();
    }

    /**
     * @return \Generator<int, string>
     */
    private function generatorSimple(string $test): \Generator
    {
        yield $test.' 1';
        yield $test.' 2';
        yield $test.' 3';
    }

    /**
     * @return \Generator<int, array{title: string}>
     */
    private function generatorArray(string $test): \Generator
    {
        yield ['title' => $test.' 1'];
        yield ['title' => $test.' 2'];
        yield ['title' => $test.' 3'];
    }
}