Today I wanted to find a regular expression that would validate any URL. For example, I wanted it to validate the following URLs, where the .com could be any extension:
- http://domain.com, http://www.domain.com, http://subdomain.domain.com
- https://domain.com, https://www.domain.com, https://subdomain.domain.com
- www.domain.com
- subdomain.domain.com
- domain.com
Since I too wasted a ton of time looking for this, I am re-posting it here for my own reference, because I am not sure I would find all those posts again.
I have slightly (and very slightly) modified splattermania's post.
As I wasted lots of time looking for a REAL regex for URLs and ended up building one on my own, I have now found one that seems to work for all kinds of URLs:
[php]<?php
// Single-quoted strings keep PHP from interpolating $_ as a variable.
$regex  = '((https?|ftp):\/\/)?'; // Scheme
// My version: $regex = '((https?|ftps?):\/\/)?'; // Scheme
$regex .= '([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?'; // User and pass
$regex .= '([a-z0-9-.]*)\.([a-z]{2,3})'; // Host or IP
$regex .= '(:[0-9]{2,5})?'; // Port
$regex .= '(\/([a-z0-9+$_-]\.?)+)*\/?'; // Path
$regex .= '(\?[a-z+&$_.-][a-z0-9;:@&%=+\/$_.-]*)?'; // GET query
$regex .= '(#[a-z_.-][a-z0-9+$_.-]*)?'; // Anchor
[/php]
Then, the correct way to check against the regex is as follows:
[php]<?php
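// The i modifier makes the match case-insensitive, since the pattern only lists lowercase letters.
if ( preg_match( "/^$regex$/i", $url ) ) {
    return true;
}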
[/php]
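To sanity-check the pattern against the URLs listed at the top of this post, here is a minimal sketch that wraps the check in a function. The name wps_is_valid_url() is my own placeholder, not part of splattermania's post, and the sample URLs are simply the ones from the list above.
[php]<?php
/**
 * Hypothetical helper: returns true if $url matches the pattern above.
 */
function wps_is_valid_url( $url ) {
    $regex  = '((https?|ftp):\/\/)?';                                   // Scheme
    $regex .= '([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?'; // User and pass
    $regex .= '([a-z0-9-.]*)\.([a-z]{2,3})';                            // Host
    $regex .= '(:[0-9]{2,5})?';                                         // Port
    $regex .= '(\/([a-z0-9+$_-]\.?)+)*\/?';                             // Path
    $regex .= '(\?[a-z+&$_.-][a-z0-9;:@&%=+\/$_.-]*)?';                 // GET query
    $regex .= '(#[a-z_.-][a-z0-9+$_.-]*)?';                             // Anchor

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return 1 === preg_match( '/^' . $regex . '$/i', $url );
}

$samples = array(
    'http://domain.com',
    'https://www.domain.com',
    'www.domain.com',
    'subdomain.domain.com',
    'domain.com',
    'domain', // no extension, so this one should fail
);

foreach ( $samples as $sample ) {
    echo $sample . ' => ' . ( wps_is_valid_url( $sample ) ? 'valid' : 'invalid' ) . "\n";
}
[/php]
One thing I noticed while testing: the ([a-z]{2,3}) part only accepts two- or three-letter extensions, so a URL such as domain.info is rejected as written.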
splattermania's post is priceless and belongs on php.net. It also shows the value of checking php.net before, or while, Googling for information about PHP.
You can test the expression with the Regular Expression (Regex) Test Tool by selecting preg_match() & entering the following:
[code gutter="false"]/^((https?|ftps?):\/\/)?([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,3})(:[0-9]{2,5})?(\/([a-z0-9+$_-]\.?)+)*\/?(\?[a-z+&$_.-][a-z0-9;:@&%=+\/$_.-]*)?(#[a-z_.-][a-z0-9+$_.-]*)?$/i[/code]
So as a practical use in one line:
[php]<?php
if ( preg_match( '/^((https?|ftp):\/\/)?([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,3})(:[0-9]{2,5})?(\/([a-z0-9+$_-]\.?)+)*\/?(\?[a-z+&$_.-][a-z0-9;:@&%=+\/$_.-]*)?(#[a-z_.-][a-z0-9+$_.-]*)?$/i', $url ) ) {
    // do something
}
[/php]
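Since every component sits in its own capture group, the same pattern can also pull a URL apart. Here is a small sketch, assuming the one-line pattern with the end anchor and the i modifier. The sample URL is made up, and the group numbers in the comments come from counting the opening parentheses in the pattern.
[php]<?php
$pattern = '/^((https?|ftp):\/\/)?([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,3})(:[0-9]{2,5})?(\/([a-z0-9+$_-]\.?)+)*\/?(\?[a-z+&$_.-][a-z0-9;:@&%=+\/$_.-]*)?(#[a-z_.-][a-z0-9+$_.-]*)?$/i';

// Made-up sample URL that exercises every part of the pattern.
$url = 'https://user:pass@sub.domain.com:8080/path/file.php?x=1#top';

if ( preg_match( $pattern, $url, $m ) ) {
    $scheme = $m[2];                // group 2: scheme without "://"
    $host   = $m[5] . '.' . $m[6];  // groups 5 and 6: host name and extension
    $port   = $m[7];                // group 7: port, including the leading ":"
    $query  = $m[10];               // group 10: query string, including "?"
    $anchor = $m[11];               // group 11: anchor, including "#"

    echo "$scheme | $host | $port | $query | $anchor\n";
}
[/php]
Optional parts that did not take part in the match come back as empty strings, or are missing from the end of $m altogether, so check with isset() before reading them on arbitrary input.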
Now, later you may need to add http:// to the front of a "validated" URL. Here's what I used:
[php]<?php
if ( isset( $url ) && ! preg_match( '~^(?:f|ht)tps?://~i', $url ) ) {
    $url = 'http://' . $url;
}
[/php]
Or as a function:
[php]<?php
wps_url_fix('test.com' ); // http://test.com
wps_url_fix('www.test.com' ); // http://www.test.com
wps_url_fix('ftp://test.com' ); // ftp://test.com
wps_url_fix('https://test.com' ); // https://test.com
wps_url_fix('http://test.com' ); // http://test.com
wps_url_fix('test' ); // http://test
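/**
 * Fixes URL, adds http:// if missing
 *
 * @param string $url String containing URL with(out) prefix
 * @return string URL with a scheme prefix
 */
function wps_url_fix( $url ) {
    // Leave the URL alone if it already starts with http://, https://, ftp:// or ftps://
    if ( isset( $url ) && ! preg_match( '~^(?:f|ht)tps?://~i', $url ) ) {
        $url = 'http://' . $url;
    }
    return $url;
}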
[/php]
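Putting the two steps together, here is a rough sketch of a helper that validates first and only then adds the scheme. The function name wps_validate_and_fix_url() and the choice to return false for invalid input are my own assumptions, not something from the original posts.
[php]<?php
/**
 * Hypothetical helper: validates $url, then adds http:// if no scheme is present.
 *
 * @param string $url URL with or without a scheme prefix
 * @return string|false Normalized URL, or false if the URL does not validate
 */
function wps_validate_and_fix_url( $url ) {
    $pattern = '/^((https?|ftp):\/\/)?([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,3})(:[0-9]{2,5})?(\/([a-z0-9+$_-]\.?)+)*\/?(\?[a-z+&$_.-][a-z0-9;:@&%=+\/$_.-]*)?(#[a-z_.-][a-z0-9+$_.-]*)?$/i';

    // Reject anything that is not a string or does not match the pattern.
    if ( ! is_string( $url ) || 1 !== preg_match( $pattern, $url ) ) {
        return false;
    }

    // Same prefix check as wps_url_fix(): only add http:// when no scheme is present.
    if ( ! preg_match( '~^(?:f|ht)tps?://~i', $url ) ) {
        $url = 'http://' . $url;
    }

    return $url;
}

var_dump( wps_validate_and_fix_url( 'www.test.com' ) );
var_dump( wps_validate_and_fix_url( 'test' ) );
[/php]
Unlike wps_url_fix(), this version does not turn 'test' into http://test, because the validation step rejects a host without an extension.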