Quick and easy way to sanitize POST data in PHP
Today I'd like to introduce you to a method from a helper library we use every day in our projects and which I open-sourced a while ago (download it on GitHub). The method is named array_clean.
So what does it do?
The array_clean method keeps only the entries of an array that you actually need and, at the same time, performs type checks and/or conversions on the given data.
We use it specifically to pre-sanitize incoming POST data, which simplifies checking and validating user input on the server side.
Basic usage
So, let's look at an example. A user wants to register and submits the registration form, which requires an e-mail address and a password along with a first and last name.
$input = \Kiss\Utils::array_clean($_POST, array(
    'mail' => 'mail',
    'password' => 'string',
    'password_check' => 'string',
    'firstName' => 'string|trim',
    'lastName' => 'string|trim'
));
if(!$input['mail']){
    throw new ErrorException('No, or wrong e-mail given.');
}
if(!$input['password']){
    throw new ErrorException('No password given.');
}
if($input['password'] !== $input['password_check']){
    throw new ErrorException('Passwords don\'t match. Maybe a typo?');
}
if(!$input['firstName'] || !$input['lastName']){
    throw new ErrorException('Please enter your full name.');
}
$database->store($input);
The array_clean() method returns an array containing only the properties we defined above. Those properties are also guaranteed to exist in the returned array, even if they were missing from the input array - so you don't need to check whether a property is defined, only whether it contains a value.
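To illustrate that guarantee without the library, here is a minimal plain-PHP sketch. The function keep_known_keys() is a hypothetical stand-in invented for this example, not the actual array_clean() code; it only shows the idea of dropping unknown keys and filling in missing ones with NULL.

```php
<?php
// Hypothetical helper (not part of the Kiss library): keeps only the listed
// keys and guarantees that each of them exists in the result.
function keep_known_keys(array $input, array $keys): array
{
    $result = array();
    foreach ($keys as $key) {
        // Missing keys are filled with NULL instead of being left undefined.
        $result[$key] = array_key_exists($key, $input) ? $input[$key] : null;
    }
    return $result;
}

$clean = keep_known_keys(
    array('mail' => 'a@example.com', 'is_admin' => '1'),
    array('mail', 'password')
);
// $clean has the keys 'mail' and 'password'; 'is_admin' was dropped.
```

Because every expected key exists, a plain if(!$clean['password']) check works without triggering an undefined index notice.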
For every property, you can specify which format you assume for that property. Formats can be: string, integer, url, email, array, boolean and regex.
integer will try to convert the value to an integer - if it contains something that isn't kosher, the result will be 0.
The url and email casts run a regex check on the value and set it to NULL if it is not valid. That way you can validate the property with a simple boolean check later in your program flow (see the example above).
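The same validate-or-NULL idea can be sketched with PHP's built-in filter_var(). Note that this swaps the library's regex check for the native FILTER_VALIDATE_EMAIL and FILTER_VALIDATE_URL filters, so it may accept or reject slightly different values than array_clean does; valid_or_null() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: returns the value if it passes the given native
// filter, otherwise NULL. filter_var() itself returns false on failure.
function valid_or_null(string $value, int $filter): ?string
{
    $checked = filter_var($value, $filter);
    return $checked === false ? null : $checked;
}

$mail = valid_or_null('chris@example.com', FILTER_VALIDATE_EMAIL);
$bad  = valid_or_null('not-an-address', FILTER_VALIDATE_EMAIL);
$url  = valid_or_null('https://example.com/', FILTER_VALIDATE_URL);

// A later check stays as simple as in the registration example:
$mailIsUsable = (bool) $mail;
```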
The array cast does a typecast to array. You can also pass a modifier here to automatically cast all values in the array to integer or boolean (see the part about modifiers below).
The regex cast validates the value against a given regex and sets it to NULL if the regex doesn't match. You can also extract substrings and set the property value to the extracted string.
Modifiers
You have already seen it in the example above. Using a pipe symbol ( | ), you can apply modifiers to the typecasting to manipulate the values in different ways. Every typecast supports different modifiers.
Some modifiers expect multiple parameters which are separated by commas. Separate parameters from the filter name with another pipe symbol.
Example: integer|range|10,20
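To make the rule syntax concrete, here is a rough sketch of how such a string could be split into its parts. parse_rule() is a hypothetical function written for this post, and it assumes at most one modifier per rule; the real method may parse its rules differently.

```php
<?php
// Hypothetical parser for rules like 'integer|range|10,20'.
// Assumption: the rule has the shape type[|modifier[|param1,param2,...]].
function parse_rule(string $rule): array
{
    $parts = explode('|', $rule);
    return array(
        'type'     => $parts[0],
        'modifier' => $parts[1] ?? null,
        'params'   => isset($parts[2]) ? explode(',', $parts[2]) : array(),
    );
}

$parsed = parse_rule('integer|range|10,20');
// type: 'integer', modifier: 'range', params: array('10', '20')

$plain = parse_rule('string');
// type: 'string', modifier: NULL, params: empty array
```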
String modifiers
trim will remove unnecessary whitespace from the value.
expect_length will set the value to NULL if the string does not have the expected length. Pass the string length as the first parameter.
striptags will remove all HTML tags from the string.
htmlentities will convert special characters in the string to HTML entities.
Boolean modifiers
boolcast will set the value to TRUE if it matches the given parameter, otherwise it will be FALSE. Example: 'tos_checked' => 'bool|boolcast|yes' will set the property to TRUE if the value yes has been submitted.
Integer modifiers
range forces the value to be in a given range. Pass the range's min and max values as the first and second parameters.
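One plausible reading of "forces the value to be in a given range" is clamping: values below the minimum become the minimum, values above the maximum become the maximum. The sketch below assumes exactly that; clamp_int() is a hypothetical name and the library may handle out-of-range values differently.

```php
<?php
// Hypothetical helper: casts to integer, then clamps into [$min, $max].
function clamp_int($value, int $min, int $max): int
{
    return max($min, min($max, (int) $value));
}

$inside   = clamp_int('15', 10, 20);  // stays 15
$tooHigh  = clamp_int('42', 10, 20);  // becomes 20
$nonsense = clamp_int('abc', 10, 20); // casts to 0, then becomes 10
```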
Array modifiers
int or integer will cast all values in the array to integers.
bool or boolean will cast all values in the array to booleans.
Regex modifiers
extract will set the value to the first occurrence of the match.
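In plain PHP, "first occurrence of the match" can be sketched with preg_match(). regex_extract() is a hypothetical helper for illustration and only approximates what the extract modifier does.

```php
<?php
// Hypothetical helper: returns the first match of $pattern in $value,
// or NULL if the pattern does not match at all.
function regex_extract(string $value, string $pattern): ?string
{
    return preg_match($pattern, $value, $matches) === 1 ? $matches[0] : null;
}

$orderId = regex_extract('Order #12345 shipped', '/\d+/'); // '12345'
$nothing = regex_extract('no digits here', '/\d+/');       // NULL
```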
Universal modifiers (applicable on all casts)
set checks if the value is contained in a given set. Otherwise it becomes NULL. Example: 'gender' => 'string|set|male,female'.
limit cuts the value at the given number of characters. Also works on integer values (they will remain of type integer).
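Both universal modifiers are easy to picture in plain PHP. The helpers below are hypothetical stand-ins, not the library's code. Note that substr() counts bytes, so a multibyte-safe version would need mb_substr() instead.

```php
<?php
// Hypothetical helper: keeps the value only if it is one of the allowed ones.
function in_set_or_null($value, array $allowed)
{
    return in_array($value, $allowed, true) ? $value : null;
}

// Hypothetical helper: cuts the value after $max characters and keeps
// integers as integers.
function limit_value($value, int $max)
{
    $cut = substr((string) $value, 0, $max);
    return is_int($value) ? (int) $cut : $cut;
}

$gender  = in_set_or_null('male', array('male', 'female'));  // 'male'
$unknown = in_set_or_null('other', array('male', 'female')); // NULL
$short   = limit_value('abcdefgh', 3);                       // 'abc'
$number  = limit_value(123456, 3);                           // 123
```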
Nested and repeated values
You can also nest your values if you like:
$input = \Kiss\Utils::array_clean($_POST, array(
    'first_name' => 'string|trim',
    'last_name' => 'string|trim',
    'age' => 'integer|range|18,99',
    'mail_address' => 'mail',
    'social' => array(
        'facebook' => 'url',
        'twitter' => 'url'
    )
));
If you want to be able to receive a nested value multiple times, array_clean() has you covered: it uses a special key called {{repeat}} which tells the method that the nested property can have multiple children. Yes, this means that you can never pass a real property named {{repeat}} to the method, but I just declare this to be a minor issue :P
$input = \Kiss\Utils::array_clean($_POST, array(
    'users' => array(
        '{{repeat}}' => 0,
        'first_name' => 'string|trim',
        'last_name' => 'string|trim'
    )
));
If you set the property to 0, it means that you accept any number of datasets. Set it to a fixed number to cap the number of datasets in the array.
In our example, the users property can hold an unlimited number of firstname/lastname value pairs.
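As a rough picture of what a repeated block means, assume the form fields are named users[0][first_name], users[0][last_name], users[1][first_name] and so on, which PHP turns into a list of arrays inside $_POST. The sketch below cleans such a list in plain PHP; clean_rows() and the sample data are hypothetical and the real method may expect a different input shape.

```php
<?php
// Hypothetical helper: applies the same key whitelist (with trim) to every
// row. $max = 0 means "no cap", mirroring '{{repeat}}' => 0 above.
function clean_rows(array $rows, array $keys, int $max = 0): array
{
    if ($max > 0) {
        $rows = array_slice($rows, 0, $max);
    }
    $result = array();
    foreach ($rows as $row) {
        $clean = array();
        foreach ($keys as $key) {
            $clean[$key] = isset($row[$key]) ? trim((string) $row[$key]) : null;
        }
        $result[] = $clean;
    }
    return $result;
}

// Sample data standing in for $_POST['users'].
$rows = array(
    array('first_name' => ' Ada ', 'last_name' => 'Lovelace', 'role' => 'admin'),
    array('first_name' => 'Alan'),
);

$users  = clean_rows($rows, array('first_name', 'last_name'));
$capped = clean_rows($rows, array('first_name', 'last_name'), 1);
```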
Remapping / renaming properties
It's possible to rename incoming properties on the fly, like so:
'txt-user-name > userName' => 'string'
Assuming you had a text input field with name set to txt-user-name, the array_clean method will rename the property to userName, so you can continue working with that property name.
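The "source > target" notation can be pictured as a simple split on the > character. split_rename() below is a hypothetical helper for illustration; it is not taken from the library.

```php
<?php
// Hypothetical helper: splits 'incoming > outgoing' into its two names.
// A key without '>' keeps its name.
function split_rename(string $key): array
{
    $parts = array_map('trim', explode('>', $key, 2));
    return array($parts[0], $parts[1] ?? $parts[0]);
}

list($from, $to) = split_rename('txt-user-name > userName');
// $from is 'txt-user-name', $to is 'userName'

list($same, $unchanged) = split_rename('mail');
// both are 'mail'
```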
Objectification
If you prefer to work with objects rather than associative arrays in your program flow, just pass TRUE as the third parameter to the array_clean() method and you can access all properties with the arrow operator.
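If you want the same arrow access on an array that did not come from array_clean(), one common plain-PHP trick is a JSON round trip, which turns nested associative arrays into stdClass objects. This is only a sketch of the resulting access pattern, with made-up sample data; it says nothing about how the library builds its objects.

```php
<?php
// Made-up sample data in the shape of a cleaned input array.
$input = array(
    'userName' => 'chris',
    'social'   => array('twitter' => 'https://twitter.com/example'),
);

// Encoding and decoding again yields nested stdClass objects.
$object = json_decode(json_encode($input));

$name    = $object->userName;
$twitter = $object->social->twitter;
```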
Conclusion
Okay, that's the rough overview of the function. You can find a slightly more detailed description inside the function definition of the method.
I hope you like the method and can make good use of it like we do.
greetings,
Chris
Written by Christian Engel
2 Responses
Just use the native filter functions, such as http://docs.php.net/manual/en/function.filter-input-array.php
They're available since PHP 5.2!
I could switch the internal functionality of array_clean to use PHP's filter functions - yes. On the "outside", I still like array_clean much more, because it's easier/quicker to write and looks way cleaner and easier to understand if you take a quick look at your code - but that's a matter of taste.