I recently had a requirement for Scribblar (more on that site in another post) to verify whether a domain name or page URL entered by a user is valid. Luckily, ActionScript 3 supports regular expressions; unfortunately, my RegExp skills are nonexistent. So I reached out via Twitter to see if anyone could help. It took all of 10 minutes and a quick session on pastebin for Robert 'Da Man' Hall to sort the problem out for me. To preserve this nugget of knowledge for future generations, here it is.

var regex:RegExp = /^http(s)?:\/\/((\d+\.\d+\.\d+\.\d+)|(([\w-]+\.)+([a-z][\w-]*)))(:[1-9][0-9]*)?(\/([\w-.\/:%+@&=]+[\w- .\/?:%+@&=]*)?)?(#(.*))?$/i;

Usage

var url:String = "http://www.google.com";
var regex:RegExp = /^http(s)?:\/\/((\d+\.\d+\.\d+\.\d+)|(([\w-]+\.)+([a-z][\w-]*)))(:[1-9][0-9]*)?(\/([\w-.\/:%+@&=]+[\w- .\/?:%+@&=]*)?)?(#(.*))?$/i;
trace(regex.test(url)); // traces true if the whole string is a valid URL
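
For reuse, the check can be wrapped in a small helper. The following is only a sketch, written in TypeScript rather than ActionScript 3 (both use the same style of regex literal). The names `URL_PATTERN`, `isValidHttpUrl`, `samples` and `results` are made up for illustration, and the pattern is a lightly tidied transcription: hyphens are moved to the end of each character class so they cannot be read as ranges, and the class for the first letter of the top-level domain is written as `[a-z]`.

```typescript
// Tidied transcription of the pattern from the post (an assumption, not the original text).
const URL_PATTERN: RegExp =
  /^http(s)?:\/\/((\d+\.\d+\.\d+\.\d+)|(([\w-]+\.)+([a-z][\w-]*)))(:[1-9][0-9]*)?(\/([\w.\/:%+@&=-]+[\w .\/?:%+@&=-]*)?)?(#(.*))?$/i;

// Hypothetical helper: trims surrounding whitespace, then tests the whole string.
function isValidHttpUrl(input: string): boolean {
  return URL_PATTERN.test(input.trim());
}

// A few sample inputs: two that should pass, two that should not
// (wrong scheme, missing scheme).
const samples: string[] = [
  "http://www.google.com",
  "https://example.org:8080/docs/index.html#top",
  "ftp://example.org",
  "www.google.com",
];
const results: boolean[] = samples.map(isValidHttpUrl);
```

Because the pattern has no `g` flag, calling `test` repeatedly on the same RegExp object does not carry state between calls.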

Thanks Robert!

Comments
what the?
Looks like alien gibberish, but neat stuff.
# Posted By marcus | 2/17/09 10:14 AM
I have reason to believe regular expressions ARE indeed alien gibberish. Rumor has it that they can apparently be read by a small group of Russian monkeys, though this is yet to be proven.
# Posted By Dustin Sparks | 2/24/09 11:01 PM
Once there was a programmer with a problem. He said to himself, "I could solve this using regular expressions."

Now he had two problems.

:)
# Posted By Almo | 2/28/09 11:46 PM
# Posted By James | 4/9/09 11:50 AM
I guess you don't need RegExp to compare two strings.

var string:String = "http://www.google.com"; // the value to check
var pattern:String = "http://www.google.com";

if (string == pattern) {
    trace('true');
} else {
    trace('false');
}
# Posted By Sylwester | 9/1/09 11:30 AM
Lol, Almo. So true.
# Posted By egfx | 10/4/09 1:59 AM
Could be useful, but sadly it doesn't validate URLs like:
http://www.google.com/?a=b
Fortunately,
http://www.google.com/c?a=b
is OK.

If I knew regex well enough, I would have tried to fix it, but I don't.
# Posted By Olivier | 10/7/09 8:38 PM
Well, this one does the trick:

var regex:RegExp = /^http(s)?:\/\/((\d+\.\d+\.\d+\.\d+)|(([\w-]+\.)+([a-z,A-Z][\w-]*)))(:[1-9][0-9]*)?(\/([\w-.\/\?:%+@&=]+[\w- .\/\?:%+@&=]*)?)?(#(.*))?$/i;

I've added a "\?" in front of ":%+@&=" in the first character class. Then, to fix what I guess was a bug (though I'm not sure), I've added a "\" in front of "?:%+@&=" in the second class, because the question mark should be escaped, no?
If my "\" isn't correct, it should probably also be removed from my first correction...

I tested it with a few URLs and it worked well for me.
# Posted By Olivier | 10/7/09 10:20 PM
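
The effect of adding "?" to the first path class can be checked side by side. This is a rough sketch in TypeScript, not ActionScript 3; `withoutQuestionMark`, `withQuestionMark` and the other names are invented for illustration, and both patterns are tidied transcriptions (hyphens moved to the end of each class, top-level-domain class written as `[a-z]`) that differ only in that one character.

```typescript
// First path class without "?": a query string may not follow the slash directly.
const withoutQuestionMark: RegExp =
  /^http(s)?:\/\/((\d+\.\d+\.\d+\.\d+)|(([\w-]+\.)+([a-z][\w-]*)))(:[1-9][0-9]*)?(\/([\w.\/:%+@&=-]+[\w .\/?:%+@&=-]*)?)?(#(.*))?$/i;

// First path class with "?": the path may now start with a query string.
const withQuestionMark: RegExp =
  /^http(s)?:\/\/((\d+\.\d+\.\d+\.\d+)|(([\w-]+\.)+([a-z][\w-]*)))(:[1-9][0-9]*)?(\/([\w.\/?:%+@&=-]+[\w .\/?:%+@&=-]*)?)?(#(.*))?$/i;

const queryAfterSlash: string = "http://www.google.com/?a=b";
const queryAfterPath: string = "http://www.google.com/c?a=b";

// Results for [queryAfterSlash, queryAfterPath] under each pattern.
const before: boolean[] = [
  withoutQuestionMark.test(queryAfterSlash),
  withoutQuestionMark.test(queryAfterPath),
];
const after: boolean[] = [
  withQuestionMark.test(queryAfterSlash),
  withQuestionMark.test(queryAfterPath),
];
```

With the first pattern only the second URL passes; with the second pattern both do.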
It's still me here...
Shouldn't the "+" be escaped too?
# Posted By Olivier | 10/7/09 10:23 PM
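
On the escaping question: inside a character class, "?" and "+" are ordinary characters, so the backslash is optional there and harmless if present. Outside a class they are quantifiers and do need escaping to be matched literally. A minimal TypeScript sketch (the variable names are illustrative only):

```typescript
// Inside [...] both spellings describe the same set of two characters.
const unescapedClass: RegExp = /^[?+]+$/;
const escapedClass: RegExp = /^[\?\+]+$/;

// Outside a class, ? and + are quantifiers: optional "a", then one or more "b".
const quantifiers: RegExp = /^a?b+$/;

// Escaped outside a class, they match the literal characters again.
const literals: RegExp = /^a\?b\+$/;

const classesAgree: boolean =
  unescapedClass.test("?+?") === escapedClass.test("?+?") &&
  unescapedClass.test("abc") === escapedClass.test("abc");
```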
I'm sorry for spamming the website... Anyway, I hope it's useful for everybody.
I also hope someone will correct me if I'm making mistakes.

The IP address part should be corrected to: "(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
It will allow IPs from 0.0.0.0 up to 999.999.999.999 (which isn't valid, but it's still better than 9999999.9999.99999999.99).

And about the port part: ports can only go up to 65535, not infinity, so this one is better: "(:[1-9][0-9]{0,4})". It will still allow ports from "1" up to "99999", though. I've no idea how to add better restrictions.
# Posted By Olivier | 10/7/09 10:38 PM
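
One way to get tighter limits without making the regex longer is to let the regex check only the shape and then compare the captured numbers in code. The sketch below is TypeScript, not ActionScript 3, and `isValidIpv4` and `isValidPort` are hypothetical helper names, not something from the thread.

```typescript
// Shape check via regex, range check via plain number comparison.
function isValidIpv4(host: string): boolean {
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (match === null) return false;
  // match[0] is the whole string; the four octets are the captured groups.
  return match.slice(1).every((octet) => Number(octet) <= 255);
}

function isValidPort(port: string): boolean {
  // One to five digits, no leading zero, as in "(:[1-9][0-9]{0,4})" without the colon.
  if (!/^[1-9][0-9]{0,4}$/.test(port)) return false;
  return Number(port) <= 65535;
}
```

The same idea applies inside a larger URL check: capture the host and port with groups and hand them to helpers like these.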
I'm back :)

From http://www.ietf.org/rfc/rfc1738.txt, some valid characters are still missing, so I've added these to the regex: "'\(\)$,\*!"

The complete regex now looks like this:
var regex:RegExp = /^http(s)?:\/\/((\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})|(([\w-]+\.)+([a-z,A-Z][\w-]*)))(:[1-9][0-9]{0,4})?(\/([\w-.\/\?:%\+@&=]+[\w- .\/\?:%\+@&='\(\)$,\*!]*)?)?(#(.*))?$/i;

I still hope somebody will be able to tell us if it's correct or if I'm posting nonsense :-/
Remember: I'm quite new to regex, so I'm not sure everything I write is OK. From my tests it's OK, but you never know.
# Posted By Olivier | 10/8/09 1:04 AM
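
A pattern of this length is easier to review when it is assembled from named pieces. The following TypeScript sketch (not ActionScript 3) rebuilds roughly the same pattern that way; the fragment names are invented here, the capturing groups are replaced with non-capturing ones, and hyphens are moved to the end of each class, so treat it as an approximation of the regex above rather than an exact copy.

```typescript
// Named fragments; each is a small regex whose .source is reused below.
const IPV4 = /\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/;
const HOSTNAME = /(?:[\w-]+\.)+[a-z][\w-]*/;
const PORT = /:[1-9][0-9]{0,4}/;
const PATH = /\/(?:[\w.\/?:%+@&=-]+[\w .\/?:%+@&='()$,*!-]*)?/;
const FRAGMENT = /#.*/;

// Assemble: scheme, host (IP or name), then optional port, path and fragment.
const ASSEMBLED_URL_PATTERN: RegExp = new RegExp(
  "^https?:\\/\\/" +
    "(?:" + IPV4.source + "|" + HOSTNAME.source + ")" +
    "(?:" + PORT.source + ")?" +
    "(?:" + PATH.source + ")?" +
    "(?:" + FRAGMENT.source + ")?$",
  "i"
);
```

Each fragment can then be tested or replaced on its own, for example swapping `PORT` for a stricter version.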
Well, thanks very much, guys, for the good tips. Nice work.

I have a couple of comments though:

1. These expressions will not accept http://test/file or http://test because the host has no extension, although I should be able to pass them. For the same reason, they will not accept http://localhost.

2. The http:// should be optional because I may need to pass relative paths.

3. It doesn't accept ftp://test.com either; I assume this is because of the http part.

4. It also does not accept e-mail addresses, although it should, because an e-mail address is also a URL, right? Actually, it should accept http://tes@t.com
# Posted By ASM | 12/30/09 6:30 PM
I think this can never be done by regular expressions alone. It might be better to divide the URL into parts and then work on each one depending on its position. For example, [http://, ftp://, gopher://, mailto: etc ...] and [name, name.extension, username@name.extension], then each part [name, name.extension].
# Posted By ASM | 12/30/09 6:42 PM
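
The "split first, then validate each part" idea could look roughly like this. The sketch is in TypeScript, not ActionScript 3. The splitting pattern is the generic one given in RFC 3986, Appendix B; everything else (`UrlParts`, `splitUrl`, `isValidWebUrl`, and the policy of allowing only http/https with a dotted host name) is an assumption made for illustration and would need adjusting to whatever rules are actually wanted.

```typescript
// Generic URI splitter from RFC 3986, Appendix B (slashes escaped for a regex literal).
const URI_SPLIT: RegExp = /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?/;

interface UrlParts {
  scheme: string;
  authority: string;
  path: string;
  query: string;
  fragment: string;
}

function splitUrl(input: string): UrlParts {
  const m = URI_SPLIT.exec(input);
  // Groups that did not take part in the match come back as undefined.
  const group = (i: number): string => m?.[i] ?? "";
  return {
    scheme: group(2).toLowerCase(),
    authority: group(4),
    path: group(5),
    query: group(7),
    fragment: group(9),
  };
}

// Hypothetical policy: http/https only, dotted host name, optional port 1-65535.
function isValidWebUrl(input: string): boolean {
  const parts = splitUrl(input);
  if (parts.scheme !== "http" && parts.scheme !== "https") return false;

  // Drop an optional "user:password@" prefix, then separate host and port.
  const hostPort = parts.authority.slice(parts.authority.lastIndexOf("@") + 1);
  const hostMatch = /^([^:]+)(?::(\d+))?$/.exec(hostPort);
  if (hostMatch === null) return false;

  const host: string = hostMatch[1] ?? "";
  const port: string | undefined = hostMatch[2];
  if (port !== undefined && (Number(port) < 1 || Number(port) > 65535)) return false;

  // At least two labels separated by dots; this deliberately rejects "localhost".
  return /^(?:[a-z0-9-]+\.)+[a-z0-9-]+$/i.test(host);
}
```

Allowing ftp, dotless hosts such as localhost, or relative paths then becomes a change to one small rule instead of a rewrite of one large expression.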
To clarify, I originally needed a regex to validate URLs for websites, not including relative URLs. I'm also not interested in FTP addresses or other protocols besides http and https. Basically, I needed to verify whether someone entered a syntactically valid website address that I could (at least theoretically) reach when typing it into my browser. Maybe it's worth trying again with some of the above suggestions.
# Posted By Stefan Richter | 12/30/09 8:35 PM
I think this might be a good validation expression

^(?:(?:http|https|ftp|telnet|gopher|ms\-help|file|notes)://)?(?:(?:[a-z][\w~%!&',;=\-\.$\(\)\*\+]*):.*@)?(?:(?:[a-z0-9][\w\-]*[a-z0-9]*\.)*(?:(?:(?:(?:[a-z0-9][\w\-]*[a-z0-9]*)(?:\.[a-z0-9]+)?)|(?:(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)))(?::[0-9]+)?))?(?:(?:(?:/(?:[\w`~!$=;\-\+\.\^\(\)\|\{\}\[\]]|(?:%\d\d))+)*/(?:[\w`~!$=;\-\+\.\^\(\)\|\{\}\[\]]|(?:%\d\d))*)(?:\?[^#]+)?(?:#[a-z0-9]\w*)?)?$
# Posted By ASM™ | 1/3/10 10:40 AM
This one is for an optional http://, a required [subdomain.]domain name/IP[.ext], and an optional path, query string, etc.,

which means you will have at least a domain name:

^(http://)?(?:(?:[a-z0-9][\w\-]*[a-z0-9]*\.)*(?:(?:(?:(?:[a-z0-9][\w\-]*[a-z0-9]*)(?:\.[a-z0-9]+)?)|(?:(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)))(?::[0-9]+)?))(?:(?:(?:/(?:[\w`~!$=;\-\+\.\^\(\)\|\{\}\[\]]|(?:%\d\d))+)*/(?:[\w`~!$=;\-\+\.\^\(\)\|\{\}\[\]]|(?:%\d\d))*)(?:\?[^#]+)?(?:#[a-z0-9]\w*)?)?$

I hope this helps
# Posted By ASM™ | 1/3/10 10:48 AM
I'm sorry, I think this is the final thing


function IsValidRegEx($value:String, $regEx:Object):Boolean
{
   if (!$value || !$regEx) return false;
   return ($value.match($regEx) != null);
}

function IsValidUri($value:String):Boolean
{
   return IsValidRegEx($value, /^(?:(?:http|https|ftp|telnet|gopher|ms\-help|file|notes):\/\/)?(?:(?:[a-z][\w~%!&',;=\-\.$\(\)\*\+]*):.*@)?(?:(?:[a-z0-9][\w\-]*[a-z0-9]*\.)*(?:(?:(?:(?:[a-z0-9][\w\-]*[a-z0-9]*)(?:\.[a-z0-9]+)?)|(?:(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)))(?::[0-9]+)?))?(?:(?:(?:\/(?:[\w`~!$=;\-\+\.\^\(\)\|\{\}\[\]]|(?:%\d\d))+)*\/(?:[\w`~!$=;\-\+\.\^\(\)\|\{\}\[\]]|(?:%\d\d))*)(?:\?[^#]+)?(?:#[a-z0-9]\w*)?)?$/);
}

If you'd like to check only for http, without a user name and password in the URL, you can replace the expression with this one:

/^(?:http:\/\/)?(?:(?:[a-z0-9][\w\-]*[a-z0-9]*\.)*(?:(?:(?:(?:[a-z0-9][\w\-]*[a-z0-9]*)(?:\.[a-z0-9]+)?)|(?:(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)))(?::[0-9]+)?))?(?:(?:(?:\/(?:[\w`~!$=;\-\+\.\^\(\)\|\{\}\[\]]|(?:%\d\d))+)*\/(?:[\w`~!$=;\-\+\.\^\(\)\|\{\}\[\]]|(?:%\d\d))*)(?:\?[^#]+)?(?:#[a-z0-9]\w*)?)?$/
# Posted By ASM | 1/3/10 3:22 PM