Last Updated: February 25, 2016
· samdev

[PHP] String Length, The Right Way

The title may look like a very basic topic, since everyone knows you can use the strlen function to get a string's length. It's actually not that simple: there are several things to keep in mind when working with string lengths, so let's get into this guide.

The Easy and Common Way

This is the way most of us know, and many coders use it all the time:

$string = 'Hello';
echo strlen($string); 

What will this snippet output?

It will output 5, because the word 'Hello' consists of 5 letters. Good enough. Let's take another example, this time from German.

$string = 'Tschüss';
echo strlen($string); 

What will this snippet output?
You might expect 7, because the word 'Tschüss' consists of 7 letters. That is how humans count, though, not how strlen does: as far as strlen is concerned, 'Tschüss' has a length of 8, so this snippet outputs 8, not 7. You can give it a try.

Why does this happen?
strlen() doesn't really count letters; it counts bytes. The letter 'ü' is a non-ASCII character, and in UTF-8 such characters are not always 1 byte: a character takes 1 to 4 bytes (the original UTF-8 design allowed up to 6, but the current standard caps it at 4). 'ü' takes 2 bytes, so PHP counted it as 2.

Note: How many bytes a character takes depends on the character and on the encoding, so we won't go deep into it now; maybe in another tutorial.
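To make the byte counting visible, here is a minimal sketch that splits a string into characters and prints how many bytes each one takes. The helper name byteWidths() is made up for this example, and it assumes the source file is saved as UTF-8.

```php
<?php
// byteWidths() is a hypothetical helper, not a built-in PHP function.
// It returns a list of [character, byte count] pairs for a UTF-8 string.
function byteWidths(string $text): array
{
    // The /u modifier makes PCRE treat the subject as UTF-8, so the empty
    // pattern splits between characters rather than between bytes.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return []; // preg_split() fails on input that is not valid UTF-8
    }

    $widths = [];
    foreach ($chars as $char) {
        $widths[] = [$char, strlen($char)]; // strlen() counts bytes
    }
    return $widths;
}

foreach (byteWidths('Tschüss') as [$char, $bytes]) {
    echo $char, ' => ', $bytes, PHP_EOL; // every letter prints 1, except ü => 2
}
```

Adding up the byte counts gives 8, which is exactly what strlen() reported.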

The Right Way

If you're sure the string won't contain any multibyte characters, or you really want to count bytes rather than letters, then strlen() is fine. But if you want to count letters and you're not sure whether the string contains multibyte characters, use another function, <a href="http://php.net/manual/en/function.mb-strlen.php">mb_strlen</a>, as follows:

$string = 'Tschüss';
echo mb_strlen($string, 'UTF-8'); // 'UTF-8' is the canonical name; 'utf8' is an alias

The first parameter is the string you want to measure, and the second is its encoding. This outputs 7, as expected.
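When the second argument is left out, mb_strlen() falls back to the internal encoding, so passing it explicitly keeps the result independent of configuration. Below is a small sketch of a stricter counter built on that idea. The name characterCount() is invented for illustration, and the choice to throw on invalid input is an assumption about what you'd want, not something mb_strlen() does for you. It needs the mbstring extension.

```php
<?php
// characterCount() is a hypothetical helper, not part of PHP or mbstring.
function characterCount(string $text): int
{
    // Reject byte sequences that are not valid UTF-8 instead of guessing.
    if (!mb_check_encoding($text, 'UTF-8')) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }
    return mb_strlen($text, 'UTF-8');
}

echo characterCount('Hello'), PHP_EOL;   // 5
echo characterCount('Tschüss'), PHP_EOL; // 7
```

Checking the encoding first is a design choice: it surfaces bad input early rather than returning a count that may not mean what you think.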

Comparing String Length

What if we have a string and want to check whether its length is less than or equal to 5?
The first thing that comes to mind is this:

$string = 'This is some string';

if(strlen($string) <= 5)
{
    // the length of $string is less than or equal to 5
}

Well, this works and it's perfectly good code, but it isn't ideal for performance. There is a faster way to do this, using isset instead of strlen:

$string = 'This is some string';
if(!isset($string[5]))
{
    // the length of $string is less than or equal to 5
}

This may look weird at first, but when you treat a string as an array, the key you access is a character position. In this example, $string[5] is the character at position 5, which is 'i' (PHP strings are zero-indexed, so position 5 is the 6th character). If that position is not set, the string has at most 5 characters. Note that, like strlen, this indexes bytes rather than letters.

But why is this way better? Because isset is a language construct and strlen is a function, and in general function calls are more expensive than language constructs, so it performs better.
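Here is a rough sketch that runs the isset check next to strlen and mb_strlen on a few sample strings, to show where they agree and where they don't. The function name compareChecks() and the sample strings are just for this example; it assumes UTF-8 source and the mbstring extension.

```php
<?php
// compareChecks() is a hypothetical helper used only for this comparison.
// Each entry answers "is the length at most 5?" using a different check.
function compareChecks(string $text): array
{
    return [
        'isset'     => !isset($text[5]),                // no byte at offset 5
        'strlen'    => strlen($text) <= 5,              // byte count
        'mb_strlen' => mb_strlen($text, 'UTF-8') <= 5,  // character count
    ];
}

foreach (['Hello', 'Hello!', 'Tschü'] as $sample) {
    $result = compareChecks($sample);
    echo $sample, ': ',
        var_export($result['isset'], true), ' ',
        var_export($result['strlen'], true), ' ',
        var_export($result['mb_strlen'], true), PHP_EOL;
}
// Hello: true true true
// Hello!: false false false
// Tschü: false false true
```

The isset check always matches strlen, because both work on bytes. 'Tschü' is 5 letters but 6 bytes, so only mb_strlen treats it as fitting within 5.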

Add your questions as a comment if you have any. Good luck :)


6 Responses

Big pimpin

Great tip!

Here is a quick wrapper function for it (Gist on GitHub):

<?php
    // Counts characters when $use_encoding is true, bytes otherwise.
    function _strlen($str, $use_encoding = false, $encoding = 'UTF-8') {
        if ($use_encoding) {
            return mb_strlen($str, $encoding);
        }
        return strlen($str);
    }
    // usage
    $string = 'Tschüss';
    echo _strlen($string, true); // 7
over 1 year ago ·

I like your function, Thanks for sharing :)

over 1 year ago ·

Good job on highlighting the necessity of the mb_* extensions for handling UTF-8 strings. However, I disagree with your second assertion: the best way to compare string length in PHP is the obvious way. By making a micro-optimisation such as this early, you lose a communicative aspect of your code.

$str = 'hello!';

if (!isset($str[5])) {
    // ... $str is less than or equal to 5
}

if (mb_strlen($str) <= 5) {
    // ... don't need a comment here, the intention is obvious
}

What if I want to check if a string is greater than 6? Then I am back to using mb_strlen.

These cases should only be targeted for optimisation when actually needed. By taking this practice into common use, you may lose more than you gain.

over 1 year ago ·

Take a look at the mbstring.func_overload php.ini setting too: http://us1.php.net/manual/en/mbstring.overload.php

over 1 year ago ·

Working in a fully UTF-8 'environment' should not cause encoding issues, so always using mb_strlen can be the simplest and fastest approach.
Nice trick with isset() anyway.

over 1 year ago ·