Skip to content

[Validator] Add NoBannedWords constraint #49868

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions src/Symfony/Component/Validator/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ CHANGELOG
* Add the `pattern` parameter in violations of the `Regex` constraint
* Add a `NoSuspiciousCharacters` constraint to validate a string is not a spoofing attempt
* Add a `PasswordStrength` constraint to check the strength of a password (requires `bjeavons/zxcvbn-php` library)
* Add a `NoBannedWords` constraint to check if a text contains banned words (with basic l33t support)
* Add the `countUnit` option to the `Length` constraint to allow counting the string length either by code points (like before, now the default setting `Length::COUNT_CODEPOINTS`), bytes (`Length::COUNT_BYTES`) or graphemes (`Length::COUNT_GRAPHEMES`)
* Add the `filenameMaxLength` option to the `File` constraint
* Add the `exclude` option to the `Cascade` constraint
Expand Down
53 changes: 53 additions & 0 deletions src/Symfony/Component/Validator/Constraints/NoBannedWords.php
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
<?php

/*
* This file is part of the Symfony package.
*
* (c) Fabien Potencier <fabien@symfony.com>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/

namespace Symfony\Component\Validator\Constraints;

use Symfony\Component\Validator\Constraint;
use Symfony\Component\Validator\Exception\ConstraintDefinitionException;

/**
* @Annotation
*
* @Target({"PROPERTY", "METHOD", "ANNOTATION"})
*
* @author Florent Morselli <florent.morselli@spomky-labs.com>
*/
#[\Attribute(\Attribute::TARGET_PROPERTY | \Attribute::TARGET_METHOD | \Attribute::IS_REPEATABLE)]
final class NoBannedWords extends Constraint
{
public const BANNED_WORDS_ERROR = 'd187ff45-bf23-4331-aa87-c24a36e9b400';

protected const ERROR_NAMES = [
self::BANNED_WORDS_ERROR => 'BANNED_WORDS_ERROR',
];

public string $message = 'The value contains the following banned words: {{ wordList }}.';

/**
* @var array<string>
*/
public array $dictionary = [];

/**
* @param array<string> $dictionary
*/
public function __construct(array $dictionary = [], mixed $options = null, array $groups = null, mixed $payload = null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should keep $options as the first argument. Otherwise, this will break the XML and YAML loaders, which only pass a single argument with all options. Named arguments for all available options must come after it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK thanks.

{
array_walk($options['dictionary'], static function (mixed $value): void {
if (!\is_string($value)) {
throw new ConstraintDefinitionException(sprintf('The parameter "dictionary" of the "%s" constraint must be a list of strings.', static::class));
}
});
$options['dictionary'] = $dictionary;
parent::__construct($options, $groups, $payload);
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,78 @@
<?php

/*
* This file is part of the Symfony package.
*
* (c) Fabien Potencier <fabien@symfony.com>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/

namespace Symfony\Component\Validator\Constraints;

use Symfony\Component\Validator\Constraint;
use Symfony\Component\Validator\ConstraintValidator;
use Symfony\Component\Validator\Exception\UnexpectedTypeException;
use Symfony\Component\Validator\Exception\UnexpectedValueException;

final class NoBannedWordsValidator extends ConstraintValidator
{
private const LEET_MAP = [
'a' => '(a|4|/\\|@|\^|aye|∂|/\-\\|\|\-\\|q)',
'b' => '(b|8|6|13|\|3|ß|P\>|\|\:|\!3|\(3|/3|\)3)',
'c' => '(c|\(|¢|\<|\[|©)',
'd' => '(d|\[\)|\|o|\)|I\>|\|\>|\?|T\)|\|\)|0|\</)',
'e' => '(e|3|&|€|£|є|ë|\[\-|\|\=\-)',
'f' => '(f|\|\=|ƒ|\|\#|ph|/\=)',
'g' => '(g|6|&|\(_\+|9|C\-|gee|\(γ,)',
'h' => '(h|\#|/\-/|\[\-\]|\]\-\[|\)\-\(|\(\-\)|\:\-\:|\|~\||\|\-\||\]~\[|\}\{|\?|\}|\-\{|hèch)',
'i' => '(i|1|\!|\||\]\[|eye|3y3|\]|\:)',
'j' => '(j|_\||_/|¿|\</|\(/|ʝ| ;)',
'k' => '(k|X|\|\<|\|\{|ɮ|\<|\|\\“)',
'l' => '(l|1|£|1_|ℓ|\||\|_|\]\[_,)',
'm' => '(m|\|v\||\[V\]|\{V\}|\|\\/\||/\\/\\|\(u\)|\(V\)|\(\\/\)|/\|\\|\^\^|/\|/\||//\.|\.\\|/\^\^\\|///|\|\^\^\|)',
'n' => '(n|\^/|\|V|\|\\\||/\\/|\[\\\]|\<\\\>|\{\\\}|\]\\\[|//|\^|\[\]|/V|₪)',
'o' => '(o|0|\(\)|oh|\[\]|¤|°|\(\[\]\))',
'p' => '(p|\|\*|\|o|\|º|\|\^\(o\)|\|\>|\|"|9|\[\]D|\|̊|\|7|\?|/\*|¶|\|D)',
'q' => '(q|\(_,\)|\(\)_|0_|°\||\<\||0\.)',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All those groups should be non-capturing, to avoid reaching the limit on the number of capturing groups in PCRE

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the brackets should just be removed actually

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, the behaviour is not the same with [] instead of ().
As an example, "$yMph0NY" is not detected because of "ph". Same if I change "symfony" into "symfohny". It looks like multicharacters are not understood when grouping with []

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked to use non-capturing groups, not character classes

'r' => '(r|2|\|\?|/2|\|\^|lz|®|\[z|12|Я|\|2|ʁ|\|²|\.\-|,\-|\|°\\)',
's' => '(s|5|\$|z|§|ehs|es|_/¯)',
't' => '(t|7|\+|\-\|\-|1|\'\]\[\'|†|\|²|¯\|¯)',
'u' => '(u|\(_\)|\|_\||v|L\||µ|J)',
'v' => '(v|\\/|1/|\|/|o\|o)',
'w' => '(w|\\/\\/|vv|\'//|\\`|\\\^/|\(n\)|\\V/|\\X/|\\\|/|\\_\|_/|\\_\:_/|Ш|ɰ|`\^/|\\\./)',
'x' => '(x|\>\<|Ж|\}\{|ecks|×|\)\(|8)',
'y' => '(y|7|j|`/|Ψ|φ|λ|Ч|¥|\'/)',
'z' => '(z|≥|2|\=/\=|7_|~/_| %|\>_|\-\\_|\'/_)',
];

public function validate(mixed $value, Constraint $constraint): void
{
if (!$constraint instanceof NoBannedWords) {
throw new UnexpectedTypeException($constraint, NoBannedWords::class);
}

if (null === $value || !$constraint->dictionary) {
return;
}

if (!\is_string($value)) {
throw new UnexpectedValueException($value, 'string');
}

$toL33tRegex = fn (string $data): string => implode('', array_map(fn (string $char): string => strtr($char, self::LEET_MAP), str_split($data)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about all characters that are not keys in self::LEET_MAP (which include uppercase letters btw, as you don't enforce lowercase in the dictionary) ? Having a . in the dictionary would lead to pretty bad things (it gets even worse when you consider that anything else than a lowercase ASCII letter is injected as is in the regex, allowing to inject all regex special chars in it)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For uppercase, I use the modifier i. You are right, the dictionary shall only contain lowercase alpha otherwize it does not work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The strtr does not care that you use the i modifier in the final regex: it won't turn a A in the dictionary word into l33t equivalents

You are right, the dictionary shall only contain lowercase alpha otherwize it does not work.

This must be enforced, because it opens the door to bad behavior otherwise (and it is actually a nasty requirement as it would also add that requirement for the non-l33t banned words, which don't have this restriction otherwise)

$regex = sprintf('{%s}i', implode('|', array_map($toL33tRegex(...), $constraint->dictionary)));

preg_match_all($regex, $value, $matches);

if (!$matches = current($matches)) {
$this->context->buildViolation($constraint->message, [
'{{ matches }}' => implode(', ', $matches),
'{{ dictionary }}' => implode(', ', $constraint->dictionary),
])
->setCode(NoBannedWords::BANNED_WORDS_ERROR)
->addViolation();
}
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
<?php

/*
* This file is part of the Symfony package.
*
* (c) Fabien Potencier <fabien@symfony.com>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/

namespace Symfony\Component\Validator\Tests\Constraints;

use PHPUnit\Framework\TestCase;
use Symfony\Component\Validator\Constraints\NoBannedWords;
use Symfony\Component\Validator\Exception\ConstraintDefinitionException;

class NoBannedWordsTest extends TestCase
{
public function testConstructor()
{
$constraint = new NoBannedWords();
$this->assertEquals([], $constraint->dictionary);
}

public function testConstructorWithParameters()
{
$constraint = new NoBannedWords([
'dictionary' => ['foo', 'bar'],
]);

$this->assertEquals(['foo', 'bar'], $constraint->dictionary);
}

public function testInvalidDictionary()
{
$this->expectException(ConstraintDefinitionException::class);
$this->expectExceptionMessage('The parameter "dictionary" of the "Symfony\Component\Validator\Constraints\NoBannedWords" constraint must be a list of strings.');
new NoBannedWords(['dictionary' => [123]]);
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,95 @@
<?php

/*
* This file is part of the Symfony package.
*
* (c) Fabien Potencier <fabien@symfony.com>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/

namespace Symfony\Component\Validator\Tests\Constraints;

use Symfony\Component\Validator\Constraints\NoBannedWords;
use Symfony\Component\Validator\Constraints\NoBannedWordsValidator;
use Symfony\Component\Validator\Test\ConstraintValidatorTestCase;

class NoBannedWordsValidatorTest extends ConstraintValidatorTestCase
{
protected function createValidator(): NoBannedWordsValidator
{
return new NoBannedWordsValidator();
}

/**
* @dataProvider getValidValues
*/
public function testValidValues(string $value)
{
$this->validator->validate($value, new NoBannedWords([
'dictionary' => ['foo', 'bar'],
]));

$this->assertNoViolation();
}

public static function getValidValues(): iterable
{
yield ['This text does not contain any banned words.'];
yield ['Another text that does not contain any banned words'];
}

/**
* @dataProvider provideInvalidConstraints
*/
public function testBannedWordsAreCatched(NoBannedWords $constraint, string $password, string $expectedMessage, string $expectedCode, array $parameters = [])
{
$this->validator->validate($password, $constraint);

$this->buildViolation($expectedMessage)
->setCode($expectedCode)
->setParameters($parameters)
->assertRaised();
}

public static function provideInvalidConstraints(): iterable
{
yield [
new NoBannedWords([
'dictionary' => ['symfony'],
]),
'This text contains symfony, which is not allowed.',
'The value contains the following banned words: {{ wordList }}.',
NoBannedWords::BANNED_WORDS_ERROR,
[
'{{ matches }}' => 'symfony',
'{{ dictionary }}' => 'symfony',
],
];
yield [
new NoBannedWords([
'dictionary' => ['symfony'],
]),
'This text contains $yMph0NY, which is a banned words written in l337.',
'The value contains the following banned words: {{ wordList }}.',
NoBannedWords::BANNED_WORDS_ERROR,
[
'{{ matches }}' => '$yMph0NY',
'{{ dictionary }}' => 'symfony',
],
];
yield [
new NoBannedWords([
'dictionary' => ['symfony', 'foo', 'bar'],
]),
'This text contains $yMph0NY, f00 and b4r, which are all banned words written in l337.',
'The value contains the following banned words: {{ wordList }}.',
NoBannedWords::BANNED_WORDS_ERROR,
[
'{{ matches }}' => '$yMph0NY, f00, b4r',
'{{ dictionary }}' => 'symfony, foo, bar',
],
];
}
}