Last Updated: May 15, 2019
·
316.6K
· fstrube

Splitting Strings with MySQL

Sometimes when you're writing SQL queries you may need to split a string on a certain delimiter. For instance, if you want to break up a multi-line address into individual columns. If you were doing this in PHP it would be very easy. You simply gather your returned results, and use explode to split the string. Unfortunately, there is not a "split" function that does this in MySQL, but it can be achieved with the clever use of a different string function.

Let's continue with the example of a multi-line address. Here is our address table:

CREATE TABLE `address` (
    `id` INTEGER AUTO_INCREMENT PRIMARY KEY,
    `fullname` VARCHAR(255),
    `company` VARCHAR(255),
    `street` VARCHAR(255),
    `city` VARCHAR(255),
    `state` VARCHAR(255),
    `zip` VARCHAR(20),
    `country` VARCHAR(255)
);

And here is the inserted data:

INSERT INTO `address` VALUES (
    NULL,
    'Test Testerson',
    'ACME, Inc',
    '123 Acme Way\nSuite 200',
    'San Francisco',
    'California',
    '94114',
    'United States'
);

Let's say we want to query this table and split the street information into multiple columns, so that each line is its own column. We can do this using the SUBSTRING_INDEX function in MySQL.

SELECT SUBSTRING_INDEX(street, '\n', 1) AS street1, SUBSTRING_INDEX(street, '\n', 2) AS street2, SUBSTRING_INDEX(street, '\n', 3) AS street3 FROM address;

But, this doesn't quite work yet. Here is what is returned:

street1         street2                    street3
123 Acme Way    123 Acme Way↵Suite 200    123 Acme Way↵Suite 200

It works for isolating street1, but street2 and street3 are wrong. If you read the MySQL documentation closely, you will notice that SUBSTRING_INDEX returns everything to the left of the delimiter. We want the string between the first and second occurrence of the delimiter, which we can accomplish by wrapping street2 and street3 in annother SUBSTRING_INDEX call. Now we can tell it to retrieve the everything to the right of the last occurrence of the delimiter. Our query now looks like this:

SELECT
    SUBSTRING_INDEX(street, '\n', 1) AS street1,
    SUBSTRING_INDEX(SUBSTRING_INDEX(street, '\n', 2), '\n', -1) AS street2,
    SUBSTRING_INDEX(SUBSTRING_INDEX(street, '\n', 3), '\n', -1) AS street3
FROM address;

We're almost there, but not quite. You can see that street3 is returning the same value as street2:

street1         street2         street3
123 Acme Way    Suite 200       Suite 200

It is doing this because there are only two lines in our street. In order to account for this, we need to count the number of lines, then use IF statements to control our SUBSTRING_INDEX:

SELECT
    @num_street_lines := 1 + LENGTH(street) - LENGTH(REPLACE(street, '\n', '')) AS num_street_lines,
    SUBSTRING_INDEX(street, '\n', 1) AS street1,
    IF(@num_street_lines > 1, SUBSTRING_INDEX(SUBSTRING_INDEX(street, '\n', 2), '\n', -1), '') AS street2,
    IF(@num_street_lines > 2, SUBSTRING_INDEX(SUBSTRING_INDEX(street, '\n', 3), '\n', -1), '') AS street3
FROM address;

Although our query is now long and difficult to read, it works. And for those situations where you must do everything in SQL, and cannot use a language like PHP to process your query, this is very useful.