Skip to content

[Routing] Support UTF-8 in paths and parameters #19562

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions UPGRADE-4.0.md
Original file line number Diff line number Diff line change
Expand Up @@ -132,6 +132,14 @@ HttpKernel
have your own `ControllerResolverInterface` implementation, you should
inject an `ArgumentResolverInterface` instance.


Router
------
* URLs are now assumed to be encoded in UTF-8. Regular expressions for route
requirements now use the PCRE_UTF8 flag to allow matching non-ASCII
characters. To support legacy URLs in other encodings, the default encoding
can be overridden using the `charset` option.

Serializer
----------

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -90,9 +90,12 @@ protected function execute(InputInterface $input, OutputInterface $output)
$context->setHost($host);
}

$matcher = new TraceableUrlMatcher($router->getRouteCollection(), $context);
$pathInfo = $input->getArgument('path_info');

$traces = $matcher->getTraces($input->getArgument('path_info'));
$charset = mb_detect_encoding($pathInfo, null, true);
$matcher = new TraceableUrlMatcher($router->getRouteCollection(), $context, $charset);

$traces = $matcher->getTraces($pathInfo);

$io->newLine();

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@
<parameter key="router.options.matcher_dumper_class">Symfony\Component\Routing\Matcher\Dumper\PhpMatcherDumper</parameter>
<parameter key="router.options.matcher.cache_class">%router.cache_class_prefix%UrlMatcher</parameter>
<parameter key="router.options.generator.cache_class">%router.cache_class_prefix%UrlGenerator</parameter>
<parameter key="router.options.charset">%kernel.charset%</parameter>
<parameter key="router.request_context.host">localhost</parameter>
<parameter key="router.request_context.scheme">http</parameter>
<parameter key="router.request_context.base_url"></parameter>
Expand Down Expand Up @@ -66,6 +67,7 @@
<argument key="matcher_base_class">%router.options.matcher_base_class%</argument>
<argument key="matcher_dumper_class">%router.options.matcher_dumper_class%</argument>
<argument key="matcher_cache_class">%router.options.matcher.cache_class%</argument>
<argument key="charset">%router.options.charset%</argument>
</argument>
<argument type="service" id="router.request_context" on-invalid="ignore" />
<argument type="service" id="logger" on-invalid="ignore" />
Expand Down
10 changes: 10 additions & 0 deletions src/Symfony/Bundle/FrameworkBundle/Routing/Router.php
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,16 @@ public function __construct(ContainerInterface $container, $resource, array $opt

$this->resource = $resource;
$this->context = $context ?: new RequestContext();

if (array_key_exists('charset', $options)) {
try {
$this->setOption('charset', $options['charset']);
} catch (\InvalidArgumentException $e) {
// symfony/routing version 3.x does not support the charset option.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to which version will this PR be merged to? 4.x?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Because the regexp generated by RouteCompiler now has the u flag, this is a BC-breaking change. See my question regarding that issue here.

unset($options['charset']);
}
}

$this->setOptions($options);
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -639,6 +639,7 @@ protected function createContainer(array $data = array())
'kernel.bundles' => array('FrameworkBundle' => 'Symfony\\Bundle\\FrameworkBundle\\FrameworkBundle'),
'kernel.cache_dir' => __DIR__,
'kernel.debug' => false,
'kernel.charset' => 'UTF-8',
'kernel.environment' => 'test',
'kernel.name' => 'kernel',
'kernel.root_dir' => __DIR__,
Expand Down
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
{
"path": "\/hello\/{name}",
"pathRegex": "#^\/hello(?:\/(?P<name>[a-z]+))?$#s",
"pathRegex": "#^\/hello(?:\/(?P<name>[a-z]+))?$#us",
"host": "localhost",
"hostRegex": "#^localhost$#si",
"hostRegex": "#^localhost$#usi",
"scheme": "http|https",
"method": "GET|HEAD",
"class": "Symfony\\Component\\Routing\\Route",
Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
- Path: /hello/{name}
- Path Regex: #^/hello(?:/(?P<name>[a-z]+))?$#s
- Path Regex: #^/hello(?:/(?P<name>[a-z]+))?$#us
- Host: localhost
- Host Regex: #^localhost$#si
- Host Regex: #^localhost$#usi
- Scheme: http|https
- Method: GET|HEAD
- Class: Symfony\Component\Routing\Route
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,9 +3,9 @@
+--------------+---------------------------------------------------------+
| Route Name | |
| Path | /hello/{name} |
| Path Regex | #^/hello(?:/(?P<name>[a-z]+))?$#s |
| Path Regex | #^/hello(?:/(?P<name>[a-z]+))?$#us |
| Host | localhost |
| Host Regex | #^localhost$#si |
| Host Regex | #^localhost$#usi |
| Scheme | http|https |
| Method | GET|HEAD |
| Requirements | name: [a-z]+ |
Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
<?xml version="1.0" encoding="UTF-8"?>
<route class="Symfony\Component\Routing\Route">
<path regex="#^/hello(?:/(?P&lt;name&gt;[a-z]+))?$#s">/hello/{name}</path>
<host regex="#^localhost$#si">localhost</host>
<path regex="#^/hello(?:/(?P&lt;name&gt;[a-z]+))?$#us">/hello/{name}</path>
<host regex="#^localhost$#usi">localhost</host>
<scheme>http</scheme>
<scheme>https</scheme>
<method>GET</method>
Expand Down
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
{
"path": "\/name\/add",
"pathRegex": "#^\/name\/add$#s",
"pathRegex": "#^\/name\/add$#us",
"host": "localhost",
"hostRegex": "#^localhost$#si",
"hostRegex": "#^localhost$#usi",
"scheme": "http|https",
"method": "PUT|POST",
"class": "Symfony\\Component\\Routing\\Route",
Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
- Path: /name/add
- Path Regex: #^/name/add$#s
- Path Regex: #^/name/add$#us
- Host: localhost
- Host Regex: #^localhost$#si
- Host Regex: #^localhost$#usi
- Scheme: http|https
- Method: PUT|POST
- Class: Symfony\Component\Routing\Route
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,9 +3,9 @@
+--------------+---------------------------------------------------------+
| Route Name | |
| Path | /name/add |
| Path Regex | #^/name/add$#s |
| Path Regex | #^/name/add$#us |
| Host | localhost |
| Host Regex | #^localhost$#si |
| Host Regex | #^localhost$#usi |
| Scheme | http|https |
| Method | PUT|POST |
| Requirements | NO CUSTOM |
Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
<?xml version="1.0" encoding="UTF-8"?>
<route class="Symfony\Component\Routing\Route">
<path regex="#^/name/add$#s">/name/add</path>
<host regex="#^localhost$#si">localhost</host>
<path regex="#^/name/add$#us">/name/add</path>
<host regex="#^localhost$#usi">localhost</host>
<scheme>http</scheme>
<scheme>https</scheme>
<method>PUT</method>
Expand Down
Original file line number Diff line number Diff line change
@@ -1,9 +1,9 @@
{
"route_1": {
"path": "\/hello\/{name}",
"pathRegex": "#^\/hello(?:\/(?P<name>[a-z]+))?$#s",
"pathRegex": "#^\/hello(?:\/(?P<name>[a-z]+))?$#us",
"host": "localhost",
"hostRegex": "#^localhost$#si",
"hostRegex": "#^localhost$#usi",
"scheme": "http|https",
"method": "GET|HEAD",
"class": "Symfony\\Component\\Routing\\Route",
Expand All @@ -21,9 +21,9 @@
},
"route_2": {
"path": "\/name\/add",
"pathRegex": "#^\/name\/add$#s",
"pathRegex": "#^\/name\/add$#us",
"host": "localhost",
"hostRegex": "#^localhost$#si",
"hostRegex": "#^localhost$#usi",
"scheme": "http|https",
"method": "PUT|POST",
"class": "Symfony\\Component\\Routing\\Route",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,9 +2,9 @@ route_1
-------

- Path: /hello/{name}
- Path Regex: #^/hello(?:/(?P<name>[a-z]+))?$#s
- Path Regex: #^/hello(?:/(?P<name>[a-z]+))?$#us
- Host: localhost
- Host Regex: #^localhost$#si
- Host Regex: #^localhost$#usi
- Scheme: http|https
- Method: GET|HEAD
- Class: Symfony\Component\Routing\Route
Expand All @@ -22,9 +22,9 @@ route_2
-------

- Path: /name/add
- Path Regex: #^/name/add$#s
- Path Regex: #^/name/add$#us
- Host: localhost
- Host Regex: #^localhost$#si
- Host Regex: #^localhost$#usi
- Scheme: http|https
- Method: PUT|POST
- Class: Symfony\Component\Routing\Route
Expand Down
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
<?xml version="1.0" encoding="UTF-8"?>
<routes>
<route name="route_1" class="Symfony\Component\Routing\Route">
<path regex="#^/hello(?:/(?P&lt;name&gt;[a-z]+))?$#s">/hello/{name}</path>
<host regex="#^localhost$#si">localhost</host>
<path regex="#^/hello(?:/(?P&lt;name&gt;[a-z]+))?$#us">/hello/{name}</path>
<host regex="#^localhost$#usi">localhost</host>
<scheme>http</scheme>
<scheme>https</scheme>
<method>GET</method>
Expand All @@ -20,8 +20,8 @@
</options>
</route>
<route name="route_2" class="Symfony\Component\Routing\Route">
<path regex="#^/name/add$#s">/name/add</path>
<host regex="#^localhost$#si">localhost</host>
<path regex="#^/name/add$#us">/name/add</path>
<host regex="#^localhost$#usi">localhost</host>
<scheme>http</scheme>
<scheme>https</scheme>
<method>PUT</method>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@ public function testRedirectWhenNoSlash()
$coll = new RouteCollection();
$coll->add('foo', new Route('/foo/'));

$matcher = new RedirectableUrlMatcher($coll, $context = new RequestContext());
$matcher = new RedirectableUrlMatcher($coll, $context = new RequestContext(), 'UTF-8');

$this->assertEquals(array(
'_controller' => 'Symfony\Bundle\FrameworkBundle\Controller\RedirectController::urlRedirectAction',
Expand All @@ -43,7 +43,7 @@ public function testSchemeRedirect()
$coll = new RouteCollection();
$coll->add('foo', new Route('/foo', array(), array(), array(), '', array('https')));

$matcher = new RedirectableUrlMatcher($coll, $context = new RequestContext());
$matcher = new RedirectableUrlMatcher($coll, $context = new RequestContext(), 'UTF-8');

$this->assertEquals(array(
'_controller' => 'Symfony\Bundle\FrameworkBundle\Controller\RedirectController::urlRedirectAction',
Expand Down
6 changes: 6 additions & 0 deletions src/Symfony/Component/Routing/CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,12 @@
CHANGELOG
=========

4.0.0
-----
* URLs are now assumed to be encoded in UTF-8. Regular expressions for route requirements now use
the PCRE_UTF8 flag to allow matching non-ASCII characters. To support legacy URLs in other encodings,
the default encoding can be overridden using the `charset` option.

3.2.0
-----

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -58,10 +58,11 @@ class {$options['class']} extends {$options['base_class']}
/**
* Constructor.
*/
public function __construct(RequestContext \$context, LoggerInterface \$logger = null)
public function __construct(RequestContext \$context, LoggerInterface \$logger = null, \$charset = 'UTF-8')
{
\$this->context = \$context;
\$this->logger = \$logger;
\$this->charset = \$charset;
if (null === self::\$declaredRoutes) {
self::\$declaredRoutes = {$this->generateDeclaredRoutes()};
}
Expand Down
47 changes: 41 additions & 6 deletions src/Symfony/Component/Routing/Generator/UrlGenerator.php
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,13 @@ class UrlGenerator implements UrlGeneratorInterface, ConfigurableRequirementsInt
*/
protected $logger;

/**
* The output encoding used when generating URLs (path, query string and fragment).
*
* @var string
*/
protected $charset;

/**
* This array defines the characters (besides alphanumeric ones) that will not be percent-encoded in the path segment of the generated URL.
*
Expand Down Expand Up @@ -81,12 +88,14 @@ class UrlGenerator implements UrlGeneratorInterface, ConfigurableRequirementsInt
* @param RouteCollection $routes A RouteCollection instance
* @param RequestContext $context The context
* @param LoggerInterface|null $logger A logger instance
* @param string $charset The encoding used in the generated URLs
*/
public function __construct(RouteCollection $routes, RequestContext $context, LoggerInterface $logger = null)
public function __construct(RouteCollection $routes, RequestContext $context, LoggerInterface $logger = null, $charset = 'UTF-8')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

forgot to upgrade the docblock with new arg

{
$this->routes = $routes;
$this->context = $context;
$this->logger = $logger;
$this->charset = $charset;
}

/**
Expand Down Expand Up @@ -158,7 +167,7 @@ protected function doGenerate($variables, $defaults, $requirements, $tokens, $pa
if ('variable' === $token[0]) {
if (!$optional || !array_key_exists($token[3], $defaults) || null !== $mergedParams[$token[3]] && (string) $mergedParams[$token[3]] !== (string) $defaults[$token[3]]) {
// check requirement
if (null !== $this->strictRequirements && !preg_match('#^'.$token[2].'$#', $mergedParams[$token[3]])) {
if (null !== $this->strictRequirements && !preg_match('#^'.$token[2].'$#u', $mergedParams[$token[3]])) {
if ($this->strictRequirements) {
throw new InvalidParameterException(strtr($message, array('{parameter}' => $token[3], '{route}' => $name, '{expected}' => $token[2], '{given}' => $mergedParams[$token[3]])));
}
Expand All @@ -185,7 +194,7 @@ protected function doGenerate($variables, $defaults, $requirements, $tokens, $pa
}

// the contexts base URL is already encoded (see Symfony\Component\HttpFoundation\Request)
$url = strtr(rawurlencode($url), $this->decodedChars);
$url = strtr(rawurlencode($this->fromUtf8($url)), $this->decodedChars);

// the path segments "." and ".." are interpreted as relative reference when resolving a URI; see http://tools.ietf.org/html/rfc3986#section-3.3
// so we need to encode them as they are not used for this purpose here
Expand Down Expand Up @@ -266,19 +275,45 @@ protected function doGenerate($variables, $defaults, $requirements, $tokens, $pa
$fragment = isset($extra['_fragment']) ? $extra['_fragment'] : '';
unset($extra['_fragment']);

if ($extra && $query = http_build_query($extra, '', '&')) {
// "/" and "?" can be left decoded for better user experience, see
// ignore null values
$extra = array_filter($extra, function ($value) { return null !== $value; });

if ($extra) {
if ($this->charset != 'UTF-8') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Always use identical comparison unless you need type juggling;
  2. Use Yoda conditions when checking a variable against an expression to avoid an accidental assignment inside the condition statement (this applies to ==, !=, ===, and !==);

https://symfony.com/doc/current/contributing/code/standards.html#structure

if ('UTF-8' !== $this->charset) {

$extra = array_combine(
array_map(array($this, 'fromUtf8'), array_keys($extra)),
array_map(array($this, 'fromUtf8'), $extra)
);
}
$query = http_build_query($extra, '', '&');
// "/" and "?" can be left decoded for better user experience, see
// http://tools.ietf.org/html/rfc3986#section-3.4
$url .= '?'.strtr($query, array('%2F' => '/'));
}

if ('' !== $fragment) {
$url .= '#'.strtr(rawurlencode($fragment), array('%2F' => '/', '%3F' => '?'));
$url .= '#'.strtr(rawurlencode($this->fromUtf8($fragment)), array('%2F' => '/', '%3F' => '?'));
}

return $url;
}

/**
* Converts a string from UTF-8 to the encoding used in URLs.
*
* @param string $string A UTF-8-encoded string
*
* @return string A string in the encoded used in URLs
*/
protected function fromUtf8($string)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docblock documenting arg and return value

{
if ('UTF-8' === $this->charset) {
return $string;
}

return mb_convert_encoding($string, $this->charset, 'UTF-8');
}

/**
* Returns the target path as relative reference from the base path.
*
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -71,6 +71,9 @@ interface UrlGeneratorInterface extends RequestContextAwareInterface
*
* The special parameter _fragment will be used as the document fragment suffixed to the final URL.
*
* All parameters should be encoded in UTF-8 when passed to this function. The generated URL will be
* encoding using the output encoding configured for this generator.
*
* @param string $name The name of the route
* @param mixed $parameters An array of parameters
* @param int $referenceType The type of reference to be generated (one of the constants)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -73,9 +73,10 @@ class {$options['class']} extends {$options['base_class']}
/**
* Constructor.
*/
public function __construct(RequestContext \$context)
public function __construct(RequestContext \$context, \$charset)
{
\$this->context = \$context;
\$this->charset = \$charset;
}

{$this->generateMatchMethod($supportsRedirections)}
Expand Down Expand Up @@ -104,7 +105,7 @@ private function generateMatchMethod($supportsRedirections)
public function match(\$pathinfo)
{
\$allow = array();
\$pathinfo = rawurldecode(\$pathinfo);
\$pathinfo = \$this->toUtf8(rawurldecode(\$pathinfo));
\$context = \$this->context;
\$request = \$this->request;

Expand Down
Loading