
Let’s practice Regular Expressions

Writing a regular expression against some text has never been hard, but writing one that is precise and still general enough has never been easy. A regular expression reflects its author’s logic: the sounder the logic, the better the matches. Practice makes perfect.

So let’s practice writing regular expressions that pick the required text out of a larger input.

1.  How to match a username?

Here is the regular expression for it

/^[\w-]{3,16}$/

^ marks the start of the string and $ marks the end. In between, [\w-]{3,16} requires at least 3 and at most 16 characters, each of which is a letter, a digit, an underscore or a hyphen.
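
As a rough illustration, the username pattern can be tried out in Python. This is only a sketch: the name is_valid_username is made up for this example, and the re.ASCII flag is an assumption added here so that \w is limited to ASCII letters, digits and underscore.

```python
import re

# The pattern from the text, without the /.../ delimiters.
# re.ASCII is an assumption made here so that \w means [A-Za-z0-9_] only.
USERNAME_RE = re.compile(r"^[\w-]{3,16}$", re.ASCII)


def is_valid_username(name: str) -> bool:
    """Return True if name is 3 to 16 letters, digits, underscores or hyphens."""
    return USERNAME_RE.match(name) is not None


for candidate in ["bob", "my-user_01", "ab", "has space", "x" * 17]:
    print(f"{candidate!r}: {is_valid_username(candidate)}")
```

The first two candidates pass; the others fail because they are too short, contain a space, or are too long.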

 

2. How to match a password?

If you want a regular expression for a password that may include lowercase or uppercase letters, digits, punctuation marks and other special characters, use this:

/^[a-zA-Z -~]{8,20}$/

The thing to notice here is the - between the space and the ~. Let me explain: in the ASCII table every character has a decimal value, as shown in the picture. Inside square brackets “[]”, “ -~” is a range, so it allows any character from space (decimal value 32) to ~ (decimal value 126). That covers every printable ASCII character, so a user can make the password as strong as they like. Because this range already includes all the letters, the a-zA-Z in front of it is redundant but harmless.

[Image: ASCII character table]
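
To make the range idea concrete, here is a small Python sketch. The names PASSWORD_RE, SHORTER_RE and is_valid_password are made up for this example. It checks that “ -~” accepts every character from decimal 32 to 126 and that dropping a-zA-Z does not change the result for a few sample passwords.

```python
import re

PASSWORD_RE = re.compile(r"^[a-zA-Z -~]{8,20}$")  # pattern from the text
SHORTER_RE = re.compile(r"^[ -~]{8,20}$")         # same class without a-zA-Z

# Every character from space (decimal 32) to tilde (decimal 126).
printable = "".join(chr(code) for code in range(32, 127))

# Check that the range " -~" accepts each of those characters on its own.
all_printable_ok = all(re.fullmatch(r"[ -~]", ch) for ch in printable)


def is_valid_password(password: str) -> bool:
    """Return True if password is 8 to 20 printable ASCII characters."""
    return PASSWORD_RE.match(password) is not None


samples = ["P@ssw0rd!", "correct horse", "short", "tab\tinside1"]
for sample in samples:
    print(f"{sample!r}: {is_valid_password(sample)}")
```

The tab character has decimal value 9, which is outside the 32 to 126 range, so the last sample is rejected.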

 

3. Getting any word starting with goo out of a given sentence: “Hi, Google is God of IT industry”.

Regular expression for this would be:  

/\bgoo[a-z]*\b/i

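Now let me explain why. \b is a word boundary: it matches the position between a word character and a non-word character, or the start or end of the string. With \b on both sides, goo must begin a word and the match must run to the end of that word, so only whole words are returned. The i flag makes the match case-insensitive, which is why it finds “Google”.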
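
A quick way to check this is a short Python sketch. The second sentence is an extra made-up test input that shows the word boundary at work.

```python
import re

# Pattern from the text; the trailing /i becomes re.IGNORECASE in Python.
GOO_RE = re.compile(r"\bgoo[a-z]*\b", re.IGNORECASE)

sentence = "Hi, Google is God of IT industry"
print(GOO_RE.findall(sentence))  # ['Google']

# "mongoose" contains "goo", but not at the start of a word, so it is skipped.
extra = "A good goose is no goo, not a mongoose"
print(GOO_RE.findall(extra))  # ['good', 'goose', 'goo']
```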

 

4. Getting any URI out of text.

Regular expression for this would be:

(?:ftps?|https?):\/\/[\w_-]{2,}(?:\.[\w_-]+)+[\w.,@?^=%&:/~+#-]*

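Try it against http://abcd.xyz.com/. A common URI starts with http, https, ftp or ftps, followed by ‘:’ and two ‘/’ characters, which are escaped because / usually serves as the pattern delimiter in PCRE-style syntax. We cannot be sure which characters will appear in the domain or host name, but there will be at least one dot, so the pattern requires a name followed by one or more dot-separated parts. The final character class picks up whatever path, query string or fragment follows.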
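
Here is a hedged Python sketch of the URI pattern in use. The sample text and the ftp://files.example.org address are invented for the example. Note that the last character class contains a dot, so a full stop directly after a URI gets captured as well.

```python
import re

# Pattern from the text, unchanged. Python accepts the escaped slashes as-is.
URI_RE = re.compile(
    r"(?:ftps?|https?):\/\/[\w_-]{2,}(?:\.[\w_-]+)+[\w.,@?^=%&:/~+#-]*"
)

text = (
    "Docs live at http://abcd.xyz.com/ and files at "
    "ftp://files.example.org/pub/readme.txt today"
)
uris = URI_RE.findall(text)
print(uris)

# A sentence-ending dot is swallowed because "." is in the final class.
with_dot = URI_RE.findall("See https://example.com.")
print(with_dot)
```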

 

5. How to capture Hex values?

Hex values are easy to capture. Here, a hex value is written as a # followed by two hex digits, each of which is either a digit (0-9) or a letter (a-f or A-F). So the expression should match that logic, right?

/^(?:#[a-fA-F0-9]{2}\s?)+/

It matches a run of hex values at the start of the string, which may or may not have whitespace between them: \s? allows one optional whitespace character after each hex value.
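
The following Python sketch shows one possible way to use the pattern. The helper leading_hex_values is a made-up name, and splitting the matched run into single values with a second findall is an addition for this example, not part of the original pattern.

```python
import re

# Pattern from the text: a run of #XX values anchored at the start.
HEX_RE = re.compile(r"^(?:#[a-fA-F0-9]{2}\s?)+")


def leading_hex_values(text: str) -> list[str]:
    """Return the #XX values that open the string, or [] if there are none."""
    match = HEX_RE.match(text)
    if match is None:
        return []
    # Split the matched run back into its individual values.
    return re.findall(r"#[a-fA-F0-9]{2}", match.group(0))


print(leading_hex_values("#1A #ff#0b rest"))  # ['#1A', '#ff', '#0b']
print(leading_hex_values("value #1A"))        # [] because of the ^ anchor
```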

 

6. How to capture IPv4 Address?

A regex for an IP address looks simple to write, but trust me, it’s not as simple as it looks. You need to take care of the octets: each octet has a value from 0 to 255, written with 1 to 3 digits. It is not enough to let the first digit range over 0-2 and the other two over 0-9, because that would also accept values such as 299. We have to make sure the pattern matches every address from 0.0.0.0 to 255.255.255.255 and nothing above. Let’s make it work then:

/\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b/

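The above pattern matches every address in that range and rejects any octet above 255. It does accept leading zeros (for example 01), and because \b also matches next to a dot, it will pick 1.2.3.4 out of a longer dotted string such as 1.2.3.4.5.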
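
To see how the pattern behaves, here is a small Python sketch. The function name find_ipv4 and the sample log line are invented for the example.

```python
import re

# Pattern from the text: three "octet." groups followed by a final octet.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
IPV4_RE = re.compile(r"\b(?:" + OCTET + r"\.){3}" + OCTET + r"\b")


def find_ipv4(text: str) -> list[str]:
    """Return every IPv4-looking address found in text."""
    return IPV4_RE.findall(text)


log_line = "gateway 192.168.1.1, dns 8.8.8.8, bad 256.1.1.1"
print(find_ipv4(log_line))     # ['192.168.1.1', '8.8.8.8']
print(find_ipv4("1.2.3.4.5"))  # ['1.2.3.4'] is picked out of the longer string
```

Building the pattern from an OCTET string is only a readability choice; the resulting expression is the same as the one above.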

7. How to capture an email address?

With so many domains available and so much freedom given to users in choosing their usernames, email address patterns vary with the validation rules each email service provider applies. Here is a pattern that should cover the vast majority of email addresses in everyday use:

/\b[\w.%+-]{2,}@[a-z\d.-]{2,}(?:\.[a-z]{2,})+\b/i
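
A short Python sketch of the email pattern in use follows. The addresses are invented examples. One thing to be aware of: the {2,} before the @ means a one-character local part such as a@example.com is not matched.

```python
import re

# Pattern from the text; the trailing /i becomes re.IGNORECASE in Python.
EMAIL_RE = re.compile(
    r"\b[\w.%+-]{2,}@[a-z\d.-]{2,}(?:\.[a-z]{2,})+\b", re.IGNORECASE
)

text = "Contact john.doe+news@mail.example.com or admin@example.org for details."
emails = EMAIL_RE.findall(text)
print(emails)

# The local part needs at least two characters, so this finds nothing.
short_local = EMAIL_RE.findall("a@example.com")
print(short_local)
```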

I’ll keep adding more examples based on demand. Thanks for reading the post.

8. How to capture blank lines/tab lines/newlines?

To capture blank lines (lines made up of spaces only), tab lines (lines made up of tabs only) and empty lines (just \n), the simplest regex would be:

/([\s\t]{0,50}\r?\n)+/

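It’ll work on both Windows text files (\r\n line endings) and Linux ones (\n), because the \r is optional. Note that \s already includes the tab, so the \t in the class is redundant but harmless.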
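
As one possible use, the Python sketch below collapses each run of blank lines into a single newline. The sample text is made up. Keep in mind that the pattern also matches an ordinary line ending on its own, which is why the \r\n after “second” is normalised too.

```python
import re

# Pattern from the text, without the /.../ delimiters.
BLANK_RE = re.compile(r"([\s\t]{0,50}\r?\n)+")

text = "first\n   \n\t\t\r\n\nsecond\r\nthird"

# The pattern has a capturing group, so use group(0) to get the whole match.
runs = [match.group(0) for match in BLANK_RE.finditer(text)]
print(runs)

# Replace every matched run with one newline.
collapsed = BLANK_RE.sub("\n", text)
print(repr(collapsed))  # 'first\nsecond\nthird'
```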
