Last Updated: February 25, 2016 · Paratron

Quick and easy way to sanitize POST data in PHP

Today I'd like to introduce you to a method from a helper library that we use every day in our projects and that I open-sourced a while ago (download it on GitHub). The method is named array_clean.

So what does it do?

The array_clean method keeps only the array entries you actually need and, at the same time, runs type checks and/or conversions on the given data.

We use it specifically to pre-sanitize incoming POST data, which simplifies checking and validating user input on the server side.

Basic usage

So, let's look at an example. A user wants to register and submits the registration form, which requires an e-mail address and a password along with a first and last name.

$input = \Kiss\Utils::array_clean($_POST, array(
    'mail' => 'mail',
    'password' => 'string',
    'password_check' => 'string',
    'firstName' => 'string|trim',
    'lastName' => 'string|trim'
));

if(!$input['mail']){
    throw new ErrorException('No, or wrong e-mail given.');
}

if(!$input['password']){
    throw new ErrorException('No password given.');
}

if($input['password'] !== $input['password_check']){
    throw new ErrorException('Passwords don\'t match. Maybe a typo?');
}

if(!$input['firstName'] || !$input['lastName']){
    throw new ErrorException('Please enter your full name.');
}

$database->store($input);

The array_clean() method returns an array containing only the desired properties we defined above. You are also guaranteed that those properties are defined in the returned array, even if they have not been defined in the input array - so you don't need to check if the properties are defined but only if they contain a value.
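To make the "only the desired keys, but always all of them" guarantee concrete, here is a minimal sketch using plain PHP array functions. The helper name keep_keys_sketch is hypothetical and not part of the library; the library's own implementation may look quite different.

```php
<?php
// Hypothetical helper, not part of the library: keeps only the keys named in
// $allowed and guarantees that each of them exists in the result.
function keep_keys_sketch(array $input, array $allowed)
{
    // Start with every allowed key set to NULL...
    $defaults = array_fill_keys($allowed, null);
    // ...then overwrite with submitted values, dropping unknown keys.
    return array_replace($defaults, array_intersect_key($input, $defaults));
}

$post = array('mail' => 'me@example.com', 'is_admin' => '1');
$clean = keep_keys_sketch($post, array('mail', 'password'));
// $clean has 'mail' and 'password' (NULL); 'is_admin' is gone.
var_dump($clean);
```

Because 'password' is present but NULL, a plain if(!$clean['password']) check works without an isset() call first.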

For every property, you can specify which format you assume for that property. Formats can be: string, integer, url, email, array, boolean and regex.

integer will try to convert the value to an integer - if it contains something that isn't kosher, the result will be 0.

The url and email casts run a regex check on the value and set it to NULL if it is not valid. That way you can validate the property with a simple boolean check later in your program flow (see the example above).

The array cast does a typecast to array. You can also pass a modifier here to automatically cast all values in the array to integer or boolean (see the part about modifiers below).

The regex cast validates the value against a given regex and sets it to NULL if the regex doesn't match. You can also extract substrings and set the property value to the extracted string.
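The casts can be pictured roughly like this. This is only a sketch under assumptions: cast_sketch is a hypothetical name, it uses PHP's native filter_var() instead of the library's own regex checks, and it assumes that any value which is not a clean integer becomes 0.

```php
<?php
// Hypothetical stand-in for the casts described above; the library's own
// implementation may differ (it uses regex checks, this sketch uses filter_var).
function cast_sketch($value, $format, $pattern = null)
{
    switch ($format) {
        case 'integer':
            // Assumption: anything that is not a clean integer becomes 0.
            $int = filter_var($value, FILTER_VALIDATE_INT);
            return $int === false ? 0 : $int;
        case 'email':
            $mail = filter_var($value, FILTER_VALIDATE_EMAIL);
            return $mail === false ? null : $mail;
        case 'url':
            $url = filter_var($value, FILTER_VALIDATE_URL);
            return $url === false ? null : $url;
        case 'regex':
            // Keep the value only if the pattern matches.
            return preg_match($pattern, (string) $value) === 1 ? $value : null;
        case 'array':
            return (array) $value;
        default:
            return (string) $value;
    }
}

var_dump(cast_sketch('42', 'integer'));           // int(42)
var_dump(cast_sketch('4x2', 'integer'));          // int(0)
var_dump(cast_sketch('not-an-address', 'email')); // NULL
```

Invalid values end up as 0 or NULL, so the later checks can stay simple boolean tests.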

Modifiers

You have already seen one in the example above. Using a pipe symbol ( | ), you can apply modifiers to the typecasting to manipulate the values in different ways. Every typecast supports different modifiers.

Some modifiers expect parameters; multiple parameters are separated by commas. Separate the parameters from the modifier name with another pipe symbol.

Example: integer|range|10,20
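A rule string like this splits cleanly into its parts. The following is a sketch only: parse_rule_sketch is a hypothetical name, and it assumes a rule consists of one cast, at most one modifier, and then the comma-separated parameters.

```php
<?php
// Hypothetical parser for a rule string such as 'integer|range|10,20'.
// Assumption: one cast, at most one modifier, then comma-separated parameters.
function parse_rule_sketch($rule)
{
    // Limit to 3 parts so a pipe inside the parameters is left untouched.
    $parts = explode('|', $rule, 3);
    return array(
        'cast'     => $parts[0],
        'modifier' => isset($parts[1]) ? $parts[1] : null,
        'params'   => isset($parts[2]) ? explode(',', $parts[2]) : array(),
    );
}

$rule = parse_rule_sketch('integer|range|10,20');
// array('cast' => 'integer', 'modifier' => 'range', 'params' => array('10', '20'))
print_r($rule);
```

Note that the parameters come out as strings, so a modifier has to cast them itself where it needs numbers.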

String modifiers

trim will remove unnecessary whitespace from the value.

expect_length will set the value to NULL if the string does not have the expected length. Pass the expected length as the first parameter.

striptags will remove all HTML tags from the string.

htmlentities will convert all applicable characters in the string to HTML entities.

Boolean modifiers

boolcast will set the value to TRUE if it matches the given parameter; otherwise it will be FALSE. Example: 'tos_checked' => 'bool|boolcast|yes' will set the property to TRUE if the value yes has been submitted.

Integer modifiers

range forces the value into a given range. Pass the range's minimum and maximum as the first and second parameters.
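One way to read "forces" is clamping. The sketch below assumes exactly that: out-of-range values are pulled to the nearest bound rather than discarded. range_sketch is a hypothetical name, and the library may handle out-of-range values differently.

```php
<?php
// Hypothetical clamp; assumes "forces the value into a given range" means
// out-of-range values are pulled to the nearest bound rather than discarded.
function range_sketch($value, $min, $max)
{
    return max($min, min($max, (int) $value));
}

var_dump(range_sketch(5, 10, 20));    // int(10)
var_dump(range_sketch('25', 10, 20)); // int(20)
var_dump(range_sketch(15, 10, 20));   // int(15)
```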

Array modifiers

int or integer will cast all array values to an integer value.

bool or boolean will cast all array values to a boolean value.

Regex modifiers

extract will set the value to the first occurrence of the match.

Universal modifiers (applicable on all casts)

set checks if the value is contained in a given set. Otherwise it becomes NULL. Example: 'gender' => 'string|set|male,female'.

limit cuts the value at the given number of characters. This also works on integer values (they will remain of type integer).
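The two universal modifiers could be sketched as follows. set_sketch and limit_sketch are hypothetical names, and the comparison-as-strings in set_sketch is an assumption based on the allowed values arriving as part of a rule string.

```php
<?php
// Hypothetical versions of the two universal modifiers described above.

// set: keep the value only if it is one of the allowed options.
// Assumption: options come from the rule string, so compare as strings.
function set_sketch($value, array $allowed)
{
    return in_array((string) $value, $allowed, true) ? $value : null;
}

// limit: cut the value after $length characters, keeping integers as integers.
function limit_sketch($value, $length)
{
    $cut = substr((string) $value, 0, $length);
    return is_int($value) ? (int) $cut : $cut;
}

var_dump(set_sketch('male', array('male', 'female')));  // string(4) "male"
var_dump(set_sketch('other', array('male', 'female'))); // NULL
var_dump(limit_sketch('abcdef', 3));                    // string(3) "abc"
var_dump(limit_sketch(123456, 3));                      // int(123)
```

substr() counts bytes, so a real implementation handling multibyte text would want mb_substr() instead.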

Nested and repeated values

You can also nest your values if you like:

$input = \Kiss\Utils::array_clean($_POST, array(
    'first_name' => 'string|trim',
    'last_name' => 'string|trim',
    'age' => 'integer|range|18,99',
    'mail_address' => 'mail',
    'social' => array(
        'facebook' => 'url',
        'twitter' => 'url'
    )
));

If you want to be able to receive a nested value multiple times, array_clean() has you covered: it uses a special key called {{repeat}} which tells the method that the nested property can have multiple children. Yes, this means that you can never pass a real property named {{repeat}} to the method, but I just declare this to be a minor issue :P

$input = \Kiss\Utils::array_clean($_POST, array(
    'users' => array(
        '{{repeat}}' => 0,
        'first_name' => 'string|trim',
        'last_name' => 'string|trim'
    )
));

If you set the property to 0, any number of datasets may be passed in. Set it to a fixed number to cap the number of entries in the array.

In our example, the users property can have an unlimited number of firstname/lastname value pairs in it.
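The repeat idea boils down to applying the same cleaning to every dataset in a list. Here is a minimal sketch, assuming the input is a list of arrays as PHP builds from form fields named like users[0][first_name]. clean_repeated_sketch is a hypothetical name, and trimming stands in for the 'string|trim' rule.

```php
<?php
// Hypothetical sketch of the {{repeat}} idea: apply the same cleaning to every
// dataset in a list, optionally capping how many datasets are kept.
function clean_repeated_sketch(array $rows, array $keys, $max = 0)
{
    // 0 means "no cap", mirroring the convention described above.
    if ($max > 0) {
        $rows = array_slice($rows, 0, $max);
    }
    $defaults = array_fill_keys($keys, null);
    $clean = array();
    foreach ($rows as $row) {
        $row = is_array($row) ? $row : array();
        // Keep only the expected keys and make sure each one exists.
        $kept = array_replace($defaults, array_intersect_key($row, $defaults));
        // Stand-in for 'string|trim': trim any value that is present.
        $clean[] = array_map(function ($v) {
            return $v === null ? null : trim((string) $v);
        }, $kept);
    }
    return $clean;
}

$submitted = array(
    array('first_name' => ' Ada ', 'last_name' => 'Lovelace', 'role' => 'admin'),
    array('first_name' => 'Alan'),
);
$users = clean_repeated_sketch($submitted, array('first_name', 'last_name'));
print_r($users);
```

Every dataset comes back with the same two keys, whether or not the client sent them.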

Remapping / renaming properties

It's possible to rename incoming properties on the fly, like so:

'txt-user-name > userName' => 'string'

Assuming you had a text input field with name set to txt-user-name, the array clean method will rename the property to userName, so you can continue working with that property name.
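Splitting such a key into source and target name takes only a few lines. This is a sketch, not the library's code: parse_rename_sketch is a hypothetical name, and it assumes the first > in the key is the separator.

```php
<?php
// Hypothetical parser for a key such as 'txt-user-name > userName'.
// Returns array(sourceName, targetName); without '>' both are the same.
function parse_rename_sketch($key)
{
    $parts = array_map('trim', explode('>', $key, 2));
    return array($parts[0], isset($parts[1]) ? $parts[1] : $parts[0]);
}

list($from, $to) = parse_rename_sketch('txt-user-name > userName');

// Read the value under the old name, store it under the new one.
$post = array('txt-user-name' => 'chris');
$input = array($to => isset($post[$from]) ? $post[$from] : null);
print_r($input);
```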

Objectification

If you prefer to work with objects, rather than associative arrays in your program flow, just pass TRUE as third parameter to the array_clean() method and you can access all properties with the arrow symbol.
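What objectification amounts to can be shown with a small recursive conversion from nested arrays to stdClass objects. objectify_sketch is a hypothetical helper; how the library converts the result internally is not described in this post.

```php
<?php
// Hypothetical helper showing what "objectification" amounts to: nested
// associative arrays become stdClass objects.
function objectify_sketch($value)
{
    if (!is_array($value)) {
        return $value;
    }
    $object = new stdClass();
    foreach ($value as $key => $item) {
        // Convert nested arrays as well, so arrows work at every level.
        $object->{$key} = objectify_sketch($item);
    }
    return $object;
}

$input = objectify_sketch(array(
    'first_name' => 'Chris',
    'social' => array('twitter' => 'https://twitter.com/example'),
));
echo $input->social->twitter;
```

With the library itself, the equivalent would be passing TRUE as the third argument and then reading $input->first_name instead of $input['first_name'].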

Conclusion

Okay, that's the rough overview of the function. You can find a slightly more detailed description inside the function definition.

I hope you like the method and can make good use of it like we do.

greetings,

Chris

2 Responses

Just use the native filter functions, such as http://docs.php.net/manual/en/function.filter-input-array.php
They're available since PHP 5.2!

over 1 year ago ·

I could switch the internal functionality of array_clean to use PHP's filter functions - yes. On the "outside", I still like array_clean much more, because it's easier/quicker to write and looks way cleaner and easier to understand if you take a quick look at your code - but that's a matter of taste.

over 1 year ago ·