
A REAL Regular Expression (Regex) for URLs using preg_match()

Today I wanted to find a regular expression that would validate any URL. For example, I wanted it to validate the following URLs, where .com could be any extension:

  • http://domain.com, http://www.domain.com, http://subdomain.domain.com
  • https://domain.com, https://www.domain.com, https://subdomain.domain.com
  • www.domain.com
  • subdomain.domain.com
  • domain.com

Since I, too, wasted a ton of time looking for this, I am re-posting it here for my own reference, because I am not sure I would find all those posts again.

I have modified splattermania’s post very slightly. It reads:

As I wasted lots of time finding a REAL regex for URLs and ended up building it on my own, I have now found one that seems to work for all kinds of URLs:

<?php 
$regex = "((https?|ftp):\/\/)?"; // SCHEME 
// My Version: $regex = "((https?|ftps?):\/\/)?"; // SCHEME 
$regex .= "([a-z0-9+!*(),;?&=\$_.-]+(:[a-z0-9+!*(),;?&=\$_.-]+)?@)?"; // User and Pass 
$regex .= "([a-z0-9-.]*)\.([a-z]{2,3})"; // Host (must end in a 2-3 letter TLD) 
$regex .= "(:[0-9]{2,5})?"; // Port 
$regex .= "(\/([a-z0-9+\$_-]\.?)+)*\/?"; // Path 
$regex .= "(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?"; // GET Query 
$regex .= "(#[a-z_.-][a-z0-9+\$_.-]*)?"; // Anchor 

Then, the correct way to check against the regex is as follows:

<?php 
if( preg_match( "/^$regex$/", $url ) )
	return true; 
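
As a rough sketch, the pieces above could be wrapped in a small helper and tried against a few sample URLs. The function name wps_is_url() is my own invention for illustration, and the parts are written as single-quoted strings here so PHP does not try to interpolate the $ characters:

```php
<?php
/**
 * Hypothetical helper (name assumed for illustration): returns true when
 * $url matches the pattern assembled from the parts shown above.
 */
function wps_is_url( $url ) {
    $regex  = '((https?|ftp):\/\/)?';                                   // Scheme
    $regex .= '([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?'; // User and pass
    $regex .= '([a-z0-9-.]*)\.([a-z]{2,3})';                            // Host
    $regex .= '(:[0-9]{2,5})?';                                         // Port
    $regex .= '(\/([a-z0-9+$_-]\.?)+)*\/?';                             // Path
    $regex .= '(\?[a-z+&$_.-][a-z0-9;:@&%=+\/$_.-]*)?';                 // Query
    $regex .= '(#[a-z_.-][a-z0-9+$_.-]*)?';                             // Anchor

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match( '/^' . $regex . '$/', $url ) === 1;
}

$samples = array(
    'http://www.domain.com',                               // matches
    'domain.com',                                          // matches
    'https://subdomain.domain.com/path/page.html?x=1#top', // matches
    'not a url',                                           // no match: spaces, no dot
    'http://domain.info',                                  // no match: TLD is four letters
);

foreach ( $samples as $sample ) {
    echo $sample, ' => ', wps_is_url( $sample ) ? 'valid' : 'invalid', "\n";
}
```

Note that the host part only accepts extensions of two or three letters, so something like .info is rejected unless {2,3} is widened.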

splattermania’s post is priceless and belongs on php.net. It also shows the value of checking php.net before, or while, Googling for information about PHP.

You can test the expression with the Regular Expression (Regex) Test Tool by selecting preg_match() & entering the following:

/^((https?|ftps?):\/\/)?([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,3})(:[0-9]{2,5})?(\/([a-z0-9+$_-]\.?)+)*\/?(\?[a-z+&$_.-][a-z0-9;:@&%=+\/$_.-]*)?(#[a-z_.-][a-z0-9+$_.-]*)?$/

So as a practical use in one line:

<?php
if ( preg_match( '/^((https?|ftp):\/\/)?([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,3})(:[0-9]{2,5})?(\/([a-z0-9+$_-]\.?)+)*\/?(\?[a-z+&$_.-][a-z0-9;:@&%=+\/$_.-]*)?(#[a-z_.-][a-z0-9+$_.-]*)?$/', $url ) ) {
	// do something
}
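
One possible variation on the one-liner, offered as a sketch rather than as part of splattermania’s post: using ~ as the delimiter means the slashes need no escaping, and the i modifier makes the match case-insensitive. The constant name WPS_URL_PATTERN and the sample URL are assumptions made up for this example:

```php
<?php
// Same pattern as the one-liner, but delimited with "~" (so "/" is not
// escaped), anchored at both ends, and case-insensitive via the "i" modifier.
// WPS_URL_PATTERN is a hypothetical name used only for this illustration.
const WPS_URL_PATTERN = '~^((https?|ftp)://)?([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,3})(:[0-9]{2,5})?(/([a-z0-9+$_-]\.?)+)*/?(\?[a-z+&$_.-][a-z0-9;:@&%=+/$_.-]*)?(#[a-z_.-][a-z0-9+$_.-]*)?$~i';

// Mixed-case sample input, made up for the example.
$url = 'HTTPS://www.Domain.com/Some/Page.html?id=42#top';

if ( preg_match( WPS_URL_PATTERN, $url ) ) {
    echo "looks like a URL\n";
}
```

Without the i modifier, the mixed-case URL above would be rejected, because every character class in the pattern lists lowercase letters only.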

Now, later you may need to add http:// to the front of a “validated” URL. Here’s what I used:

<?php
if ( isset( $url ) && ! preg_match( '~^(?:f|ht)tps?://~i', $url ) )
	$url = 'http://' . $url;

Or as a function:

<?php
wps_url_fix( 'test.com' ); // http://test.com
wps_url_fix( 'www.test.com' ); // http://www.test.com
wps_url_fix( 'ftp://test.com' ); // ftp://test.com
wps_url_fix( 'https://test.com' ); // https://test.com
wps_url_fix( 'http://test.com' ); // http://test.com
wps_url_fix( 'test' ); // http://test

/**
 * Fixes URL, adds http:// if missing
 *
 * @param string $url String containing URL with(out) prefix
 * @return string URL with a scheme prefix
 */
function wps_url_fix( $url ) {
    if ( isset( $url ) && ! preg_match( '~^(?:f|ht)tps?://~i', $url ) ) {
        $url = 'http://' . $url;
    }
    return $url;
}
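
If you want both steps in one place, here is a minimal sketch that validates first and only then adds the prefix. The function name wps_url_validate_and_fix() and the choice to return false for rejected input are my assumptions, not something from splattermania’s post:

```php
<?php
/**
 * Hypothetical helper: checks $url against the URL pattern, then adds
 * http:// when no scheme is present.
 *
 * @param mixed $url Candidate URL
 * @return string|false URL with a scheme, or false if it does not match
 */
function wps_url_validate_and_fix( $url ) {
    // The URL pattern from this post, "~"-delimited and case-insensitive.
    $pattern = '~^((https?|ftp)://)?([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,3})(:[0-9]{2,5})?(/([a-z0-9+$_-]\.?)+)*/?(\?[a-z+&$_.-][a-z0-9;:@&%=+/$_.-]*)?(#[a-z_.-][a-z0-9+$_.-]*)?$~i';

    if ( ! is_string( $url ) || ! preg_match( $pattern, $url ) ) {
        return false;
    }

    // Same scheme check as wps_url_fix(): leave existing schemes alone.
    if ( ! preg_match( '~^(?:f|ht)tps?://~i', $url ) ) {
        $url = 'http://' . $url;
    }

    return $url;
}

var_dump( wps_url_validate_and_fix( 'www.test.com' ) );     // http://www.test.com
var_dump( wps_url_validate_and_fix( 'https://test.com' ) ); // https://test.com (unchanged)
var_dump( wps_url_validate_and_fix( 'test' ) );             // false: no dot, so no match
```

Unlike wps_url_fix(), this version refuses to turn a bare word such as test into http://test, because the pattern requires a dot and an extension.
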
About Travis Smith

As a WordPress Enthusiast, Travis writes about his journey in WordPress trying to help other WordPress travelers and enthusiasts with tutorials, explanations, & demonstrations of the things he learns.
