Skip to content

[Serializer][FrameworkBundle] Add a CSV encoder #19197

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@
use Symfony\Component\HttpKernel\DependencyInjection\Extension;
use Symfony\Component\Config\FileLocator;
use Symfony\Component\PropertyAccess\PropertyAccessor;
use Symfony\Component\Serializer\Encoder\CsvEncoder;
use Symfony\Component\Serializer\Mapping\Factory\CacheClassMetadataFactory;
use Symfony\Component\Serializer\Normalizer\DataUriNormalizer;
use Symfony\Component\Serializer\Normalizer\DateTimeNormalizer;
Expand Down Expand Up @@ -977,6 +978,12 @@ private function registerSerializerConfiguration(array $config, ContainerBuilder
$definition->addTag('serializer.normalizer', array('priority' => -900));
}

if (class_exists(CsvEncoder::class)) {
$definition = $container->register('serializer.encoder.csv', CsvEncoder::class);
$definition->setPublic(false);
$definition->addTag('serializer.encoder');
}

$loader->load('serializer.xml');
$chainLoader = $container->getDefinition('serializer.mapping.chain_loader');

Expand Down
181 changes: 181 additions & 0 deletions src/Symfony/Component/Serializer/Encoder/CsvEncoder.php
Original file line number Diff line number Diff line change
@@ -0,0 +1,181 @@
<?php

/*
* This file is part of the Symfony package.
*
* (c) Fabien Potencier <fabien@symfony.com>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/

namespace Symfony\Component\Serializer\Encoder;

use Symfony\Component\Serializer\Exception\InvalidArgumentException;

/**
* Encodes CSV data.
*
* @author Kévin Dunglas <dunglas@gmail.com>
*/
class CsvEncoder implements EncoderInterface, DecoderInterface
{
const FORMAT = 'csv';

private $delimiter;
private $enclosure;
private $escapeChar;
private $keySeparator;

/**
* @param string $delimiter
* @param string $enclosure
* @param string $escapeChar
* @param string $keySeparator
*/
public function __construct($delimiter = ',', $enclosure = '"', $escapeChar = '\\', $keySeparator = '.')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about using an array of options instead. That would make the configuration more explicit. This is also make it easier to add options.

Copy link
Contributor

@greg0ire greg0ire Jun 29, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And maybe use the options resolver, then (maybe that's already what you meant)?

Copy link
Member Author

@dunglas dunglas Jun 29, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It mimics the prototype of CSV related PHP functions and it carries a strong semantic. Why changing that to an array carrying less semantic value? New options can be added by adding new parameters to the constructor.

Copy link
Member Author

@dunglas dunglas Jun 29, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw we actually use this style for all encoders and normalizers. It would be weird to change it for this one.

Copy link
Contributor

@greg0ire greg0ire Jun 29, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the reasoning behind this is that if you use such an array, you can change the key separator without having to specify the other values. And you can still use the same names for option keys, so… also, I think you will probably add other options not related to f*csv 's prototype later, for instance, a withBom boolean.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is how this withBom boolean would be used. One more argument would be that function parameters are for dependencies, and that configuration should be provided as a separate object / array, because while dependencies are often not so numerous, options can quickly lead you to violate this rule

Copy link
Contributor

@greg0ire greg0ire Jun 29, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw we actually use this style for all encoders and normalizers. It would be weird to change it for this one.

But then if you have to be consistent… too bad, forget it.

{
$this->delimiter = $delimiter;
$this->enclosure = $enclosure;
$this->escapeChar = $escapeChar;
$this->keySeparator = $keySeparator;
}

/**
* {@inheritdoc}
*/
public function encode($data, $format, array $context = array())
Copy link
Contributor

@greg0ire greg0ire Jun 28, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if $data is huge ? It would be great to have an interator here instead be able to use an iterator here. Not sure line 50 is compatible with that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be an array because of the current design of the Serializer (Normalizers return arrays).

{
$handle = fopen('php://temp,', 'w+');

if (!is_array($data)) {
$data = array(array($data));
} elseif (empty($data)) {
$data = array(array());
} else {
// Sequential arrays of arrays are considered as collections
$i = 0;
foreach ($data as $key => $value) {
if ($i !== $key || !is_array($value)) {
$data = array($data);
break;
}

++$i;
}
}

$headers = null;
foreach ($data as $value) {
$result = array();
$this->flatten($value, $result);

if (null === $headers) {
$headers = array_keys($result);
fputcsv($handle, $headers, $this->delimiter, $this->enclosure, $this->escapeChar);
} elseif (array_keys($result) !== $headers) {
throw new InvalidArgumentException('To use the CSV encoder, each line in the data array must have the same structure. You may want to use a custom normalizer class to normalize the data format before passing it to the CSV encoder.');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe linebreak this ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, you convinced me !

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fabpot even if I'm personnaly convinced about no linebreak for method and class signatures, I don't really understand why we can't linebreak this thing? It's a new class, no merge conflict possible here.

Could you tell us why? I'm curious. :-) Maybe a coding standard I missed?

I think we can at least put the string argument onto a new line.

Thanks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the code to be dense vertically. Nowadays, screens have a lot of horizontal space. And anyway, this is just an error message, I don't want exceptions to take too much screen space when reading the code.

Copy link
Contributor

@jvasseur jvasseur Jul 1, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error messages should be keep on one line to allow searching them more easily in the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nowadays, screens have a lot of horizontal space

That's why we disagree apparently : I use the fact that my screen is big to split windows and often have 4 files on the same screen. I think this is the very first factor of speed on my workflow, because it makes navigating files way easier than browsing through tabs IMO, but I know many people seem to like to use their screen for one file at a time, while they often need to switch between a controller, a service, a view and a translation file in applications, a class and its test in libraries.

Other things to take into consideration is that Github (for instance) will not display this nicely, and also long lines are harder to edit : people scroll way faster vertically than horizontally.

But anyway, your repo, your rules, so thanks for taking the time to give your point of view even if you're probably very busy.

Copy link
Contributor

@greg0ire greg0ire Jul 1, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error messages should be keep on one line to allow searching them more easily in the code.

That's if you linebreak them in the middle of a sentence / proposition. This is still quite greppable :

throw new InvalidArgumentException(
    'To use the CSV encoder, 
each line in the data array must have the same structure. 
You may want to use a custom normalizer class to normalize the data format 
before passing it to the CSV encoder.'
);

}

fputcsv($handle, $result, $this->delimiter, $this->enclosure, $this->escapeChar);
}

rewind($handle);
$value = stream_get_contents($handle);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same problem here, what if $data is huge? $value will be too. Not sure you can do anything against that though, it depends if Symfony can work with a stream rather than a string…

Copy link
Member Author

@dunglas dunglas Jun 28, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not currently, but it impacts all other decoders as well. However it would be a nice addition to the component to be able to deal with streams.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would indeed! Php has yield now, which should make this kind of thing easier to work with (What if normalizers yielded values instead of returning a huge array they build ?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be done in Symfony 3 because the minimal PHP version is now 5.5. However encoders must stay backward compatible with normalizers returning raw arrays (they must support both cases).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I create a new issue about this ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding that to my todo list!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #19238

fclose($handle);

return $value;
}

/**
* {@inheritdoc}
*/
public function supportsEncoding($format)
{
return self::FORMAT === $format;
}

/**
* {@inheritdoc}
*/
public function decode($data, $format, array $context = array())
{
$handle = fopen('php://temp', 'r+');
fwrite($handle, $data);
rewind($handle);

$headers = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this var is intended to store an array later, it can be initialized as an empty array IMO. I don't see any special case null.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To remove

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get your point @roukmoute. Can you please show where is this happening?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to keep is as is to have a code as close as possible in encode and decode (and in encode $headers cannot be initialized as an empty array because the normalized data can contain an empty array).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phansys: sorry I write a mistake.

$nbHeaders = 0;
$result = array();

while (false !== ($cols = fgetcsv($handle, 0, $this->delimiter, $this->enclosure, $this->escapeChar))) {
$nbCols = count($cols);

if (null === $headers) {
$nbHeaders = $nbCols;

foreach ($cols as $col) {
$headers[] = explode($this->keySeparator, $col);
}

continue;
}

$item = array();
for ($i = 0; ($i < $nbCols) && ($i < $nbHeaders); ++$i) {
$depth = count($headers[$i]);
$arr = &$item;
for ($j = 0; $j < $depth; ++$j) {
// Handle nested arrays
if ($j === ($depth - 1)) {
$arr[$headers[$i][$j]] = $cols[$i];

continue;
}

if (!isset($arr[$headers[$i][$j]])) {
$arr[$headers[$i][$j]] = array();
}

$arr = &$arr[$headers[$i][$j]];
}
}

$result[] = $item;
}
fclose($handle);

if (empty($result) || isset($result[1])) {
return $result;
}

// If there is only one data line in the document, return it (the line), the result is not considered as a collection
return $result[0];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}

/**
* {@inheritdoc}
*/
public function supportsDecoding($format)
{
return self::FORMAT === $format;
}

/**
* Flattens an array and generates keys including the path.
*
* @param array $array
* @param array $result
* @param string $parentKey
*/
private function flatten(array $array, array &$result, $parentKey = '')
{
foreach ($array as $key => $value) {
if (is_array($value)) {
$this->flatten($value, $result, $parentKey.$key.$this->keySeparator);
} else {
$result[$parentKey.$key] = $value;
}
}
}
}
Loading