Last Updated: February 20, 2016
· filipekiss

Denying groups using regex

So I was helping a friend with a little WordPress plugin when I stumbled across a problem. I was using regex to grab the first image of the post and use it as a thumbnail. The thing is: the regex was not quite right, and it was ignoring all the images and grabbing an iframe URL inside the post. So, after thinking about how to do it, I came across a solution.

What did I have to do?

Grab the src attribute inside an image tag, regardless of the position of the attribute.

What was I doing wrong?

For some reason, my first regex would only get the src attribute if it came BEFORE the style attribute that WP was generating for the image.

How to solve it

First, let's start the regex:

/<img /i

This finds all the image tags. Next, let's deny everything that's not a src:

/<img (?:(?!src).)+/i

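The (?:(?!src).) means something like "Get everything in your way until you hit a string that matches src". The (?!src) part is a negative lookahead: it only lets the . consume a character when src does not start at that position, and the + repeats this one character at a time.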
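To see what this part consumes on its own, here is a minimal sketch in Python, whose re module accepts the same lookahead syntax. The sample tag is made up for illustration and is not taken from the original plugin.

```python
import re

# Hypothetical sample tag, with style placed before src.
tag = '<img style="width:10px" src="photo.jpg">'

# (?!src) checks that "src" does not start at the current position,
# then . consumes one character. The + repeats that step, so the
# match walks forward and stops right before "src".
pattern = re.compile(r'<img (?:(?!src).)+', re.IGNORECASE)

consumed = pattern.match(tag).group(0)
print(repr(consumed))  # '<img style="width:10px" '
```

Everything up to (but not including) src is consumed, which is what lets the next part of the regex pick up the attribute itself.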

I would then proceed to extract the value of the attribute as usual:

/<img (?:(?!src).)+src="([^"]+)/i

The "([^"]+) means "Grab everything until you find a quote". The i after the regex itself, in case you don't know, means that the search should be case insensitive (that is: it should make no distinction between upper and lower case).
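Here is a small sketch of the finished regex in use, written in Python rather than the PHP a WordPress plugin would use. The function name, the sample post content, and the IMG_SRC_ANY variant are assumptions added for illustration. One thing worth knowing: because of the +, at least one character has to sit between `<img ` and src, so a tag where src is the very first attribute is not matched. Swapping + for * seems to cover that case.

```python
import re

# The regex from the post.
IMG_SRC = re.compile(r'<img (?:(?!src).)+src="([^"]+)', re.IGNORECASE)
# Assumed variant: * instead of +, so src may also come first.
IMG_SRC_ANY = re.compile(r'<img (?:(?!src).)*src="([^"]+)', re.IGNORECASE)


def first_image_src(html, pattern=IMG_SRC):
    """Return the src of the first matching <img> tag, or None."""
    match = pattern.search(html)
    return match.group(1) if match else None


# Made-up post content: an iframe first, then an image with style before src.
post = (
    '<iframe src="https://example.com/embed"></iframe>'
    '<p><img style="float:left" class="thumb" src="/uploads/cat.jpg" alt=""></p>'
)

print(first_image_src(post))                                       # /uploads/cat.jpg
print(first_image_src('<img src="/uploads/dog.jpg">'))             # None
print(first_image_src('<img src="/uploads/dog.jpg">', IMG_SRC_ANY))  # /uploads/dog.jpg
```

The iframe URL is skipped because the pattern only starts matching at `<img `.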

tl;dr: If you ever need to deny a whole group/string using regex, just use (?:(?!your string).)
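If the string you want to deny is not fixed, the same trick can be wrapped in a tiny helper. This is only a sketch: not_string is a hypothetical name, and re.escape is used so that characters like . in the denied string are taken literally.

```python
import re


def not_string(literal):
    # Hypothetical helper: matches one character, provided that
    # `literal` does not start at the current position.
    return '(?:(?!' + re.escape(literal) + ').)'


# Consume everything up to (not including) "END".
up_to_end = re.compile(not_string('END') + '+')
before_end = up_to_end.match('key=value;END;rest').group(0)
print(before_end)  # key=value;

# Thanks to re.escape, "aXb" is not mistaken for "a.b".
up_to_dotted = re.compile(not_string('a.b') + '+')
before_dotted = up_to_dotted.match('xxaXbyy a.b zz').group(0)
print(repr(before_dotted))  # 'xxaXbyy '
```

Keep in mind that . does not match a newline by default, so the match also stops at a line break unless the regex is compiled with re.DOTALL.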