Sunday 3 July 2016

Java Regular Expression

In this blog post we will learn about Java regular expressions. This feature was added in Java 1.4.
Java provides the java.util.regex package for searching, extracting, and modifying text in a String. It is widely used to validate strings such as passwords and email addresses. Before proceeding, let us first look at what a regular expression is.


Regular Expression:

A regular expression defines a pattern that describes a group of strings. Regular expressions can be used to search, validate, or manipulate strings.

OR

Representing a group of String objects by a particular pattern is known as a regular expression.










The most important application areas for regular expressions are:

>Validating phone numbers and email addresses on login and sign-up forms.
>Validation frameworks.
>Pattern-matching applications (search).
>Compiler design (translators such as compilers and interpreters).
>Digital circuit design (such as binary adders and binary incrementers).
>Communication protocols (TCP/IP, UDP).

The java.util.regex package provides three classes and one interface. Its internal hierarchy is outlined below.






The Java regular expression types live in the java.util.regex package, which contains three classes and one interface:

1) java.util.regex.Pattern – Used to define (compile) patterns
2) java.util.regex.Matcher – Used to perform match operations on text using patterns
3) java.util.regex.PatternSyntaxException – Thrown when a regular expression has invalid syntax
4) java.util.regex.MatchResult – Interface that represents the result of a match operation


1- A Pattern object is the compiled version of a regular expression. This class has no public constructor; we use its public static method compile to create a Pattern object, passing the regular expression (the pattern String) as the argument.

2- Matcher is the regex engine object that matches an input String against the Pattern object. We get a Matcher by calling the matcher method on the Pattern object, passing the input String as the argument. We then use the matches method, which returns a boolean indicating whether the entire input String matches the regex pattern.

3- PatternSyntaxException is thrown if the regular expression syntax is not correct.
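To make the roles of these three classes concrete, here is a minimal sketch. The class name RegexClassesDemo and the helper methods matchesWhole and isValidRegex are made up for illustration; only Pattern.compile, Pattern.matcher, Matcher.matches and PatternSyntaxException come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class, not part of the original post
class RegexClassesDemo {

    // Returns true only if the whole input matches the regular expression.
    static boolean matchesWhole(String regex, String input) {
        Pattern pattern = Pattern.compile(regex);   // compiled form of the regex
        Matcher matcher = pattern.matcher(input);   // matcher bound to this input
        return matcher.matches();                   // checks the entire input
    }

    // Returns true if the regex compiles, false if its syntax is invalid.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // getDescription() explains what is wrong with the syntax
            System.out.println("Bad regex: " + e.getDescription());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("abc", "abc"));      // true
        System.out.println(matchesWhole("abc", "abcdef"));   // false: extra characters remain
        System.out.println(isValidRegex("[a-z"));            // false: the character class is never closed
    }
}
```

Note that matches() looks at the whole input, while find() (used later in this post) searches for the pattern anywhere inside it.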


If you want to work with regular expressions in Java, you always start with the Pattern class. To define and validate any expression we need a pattern, which comes from the Pattern class. The code below shows how to define one.

Pattern p=Pattern.compile("abc");              // compile() returns the compiled version of the regular expression

String target="abcdefgabchuydabc";

Suppose the target String is "abcdefgabchuydabc" and I want to find out whether it contains the String "abc" and, if it does, at which positions it occurs and how many times. Regular expressions are made for exactly this kind of task. They were not part of the original Java library; support was added in Java 1.4. Other familiar examples of the same pattern-searching idea are the Ctrl+F (find) command in Microsoft Windows and the grep command in UNIX.


Here is a sample pattern-matching Java program that finds occurrences of a pattern String in a target String using regular expressions.

The program needs three steps:

1-Define the pattern (using the compile method of the Pattern class)
2-Create a Matcher for the input String (using the matcher method)
3-Use the find method to locate each position of the pattern and count the occurrences.

1-Pattern Matching Application:

package com.navneet.javatutorial;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegularExpression {

    public static void main(String[] args) {
        int count = 0;
        Pattern pt = Pattern.compile("xy");               // Define the pattern using the compile method of the Pattern class
        Matcher m = pt.matcher("zxvxybfxyuvxyioxy");      // Create a Matcher object to search the target String
        while (m.find()) {                                // find() moves to the next occurrence, if any
            count++;
            System.out.println(m.start());                // Start index of the current occurrence
        }
        System.out.println("The number of occurrences: " + count);
    }

}
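The MatchResult interface listed earlier fits naturally into this kind of search. The sketch below is one possible way to keep a snapshot of every occurrence instead of printing it right away; the class name MatchResultDemo and the helper findAll are hypothetical, and the target is the "abc" example string from earlier in this post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post
class MatchResultDemo {

    // Collects one MatchResult snapshot per occurrence of the regex.
    static List<MatchResult> findAll(String regex, String target) {
        List<MatchResult> results = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(target);
        while (matcher.find()) {
            // toMatchResult() copies the current match state, so the
            // snapshot is unaffected by later calls to find()
            results.add(matcher.toMatchResult());
        }
        return results;
    }

    public static void main(String[] args) {
        List<MatchResult> results = findAll("abc", "abcdefgabchuydabc");
        for (MatchResult r : results) {
            // start() is inclusive, end() is exclusive
            System.out.println(r.group() + " at " + r.start() + "-" + r.end());
        }
        System.out.println("Occurrences: " + results.size());   // 3 (at indexes 0, 7 and 14)
    }
}
```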

2-Validating Mobile And Email IDs:


package com.navneet.javatutorial;

import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ValidateMobileNumber {

    public static void main(String[] args) {
        // An optional leading 0, followed by a 10-digit mobile number whose first digit is 7 to 9.
        Pattern p = Pattern.compile("[0]?[7-9][0-9]{9}");
        System.out.println("Please enter a 10 or 11 digit mobile number");
        Scanner sc = new Scanner(System.in);
        String input = sc.nextLine();
        Matcher m = p.matcher(input);
        // matches() succeeds only if the entire input matches the pattern
        if (m.matches()) {
            System.out.println("Valid Mobile Number");
        } else {
            System.out.println("Invalid Mobile Number");
        }
    }

}
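To see how the mobile number pattern behaves without typing input by hand, it can be run against a few sample values. This is only a sketch: the class name MobileNumberCheck, the helper isValidMobile and the sample numbers are made up for illustration, while the pattern is the one from the program above.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post
class MobileNumberCheck {

    // Optional leading 0, then a digit 7-9, then nine more digits
    private static final Pattern MOBILE = Pattern.compile("[0]?[7-9][0-9]{9}");

    // Returns true only if the whole input is a valid mobile number.
    static boolean isValidMobile(String input) {
        return MOBILE.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Made-up sample inputs
        String[] samples = {"9876543210", "09876543210", "6876543210", "98765", "9876543210x"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValidMobile(s));
        }
    }
}
```

Only the first two samples are accepted: the third starts with 6, the fourth is too short, and the last one has a trailing letter.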

Email Validation:

Suppose I want to validate an email ID such as "xxx7_.yddhh89@gmail.com". How can I write a regular expression to validate such an email? The pattern can be built up as follows.

First, build the pattern part by part:

1-The first character is any digit or upper-case/lower-case letter: [a-zA-Z0-9]
2-The second part covers the remaining characters before the @ symbol: [a-zA-Z0-9_.]*
3-The third part is the mandatory symbol @
4-The fourth part is one or more alphanumeric characters: [a-zA-Z0-9]+
5-The fifth part is a dot followed by letters, repeated one or more times: ([.][a-zA-Z]+)+

So the complete pattern for any mail ID is

[a-zA-Z0-9][a-zA-Z0-9_.]*@[a-zA-Z0-9]+([.][a-zA-Z]+)+

Pattern for only gmail IDs

[a-zA-Z0-9][a-zA-Z0-9_.]*@gmail[.]com

Pattern for yahoo IDs

[a-zA-Z0-9][a-zA-Z0-9_.]*@yahoo[.]com

package com.navneet.javatutorial;

import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GmailIDValidation {

    public static void main(String[] args) {
        // A letter or digit, then any letters, digits, underscores or dots, then @gmail.com
        Pattern p = Pattern.compile("[a-zA-Z0-9][a-zA-Z0-9_.]*@gmail[.]com");
        System.out.println("Please enter Gmail ID");
        Scanner sc = new Scanner(System.in);
        String input = sc.nextLine();
        Matcher m = p.matcher(input);
        // matches() succeeds only if the entire input matches the pattern
        if (m.matches()) {
            System.out.println("Valid Mail ID");
        } else {
            System.out.println("Invalid Mail ID");
        }
    }

}
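The general mail ID pattern can be exercised in the same way. The sketch below assumes a version of the pattern that writes every letter range as A-Z, because a range written as A-z would also accept characters such as [ and ^ and _ that sit between Z and a in the character table. The class name EmailPatternCheck, the helper isValidEmail and the sample addresses are made up, and real-world email validation has many more rules than this simple pattern covers.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post
class EmailPatternCheck {

    // first char, rest of the name, @, domain name, one or more ".letters" parts
    private static final Pattern EMAIL =
            Pattern.compile("[a-zA-Z0-9][a-zA-Z0-9_.]*@[a-zA-Z0-9]+([.][a-zA-Z]+)+");

    // Returns true only if the whole input matches the mail ID pattern.
    static boolean isValidEmail(String input) {
        return EMAIL.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Made-up sample inputs
        String[] samples = {
            "xxx7_.yddhh89@gmail.com",   // accepted
            "user@example.co.in",        // accepted: two ".letters" parts
            "_user@gmail.com",           // rejected: starts with an underscore
            "user@gmail",                // rejected: no ".letters" part
            "user@gm^il.com"             // rejected: ^ is not alphanumeric
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValidEmail(s));
        }
    }
}
```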


Character Class:

Character classes are the building blocks most often used to make a pattern.

[abc]=Either a or b or c.
[^abc]=Any character except a, b, and c (the ^ symbol means negation).
[a-z]=Any lower-case letter from a to z.
[A-Z]=Any upper-case letter from A to Z.
[a-zA-Z]=Any lower-case or upper-case letter.
[0-9]=Any digit from 0 to 9.
[a-zA-Z0-9]=Any alphanumeric character.
[^a-zA-Z0-9]=Any character except an alphanumeric character, i.e. a special character.
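A quick way to try these character classes is Pattern.matches, which compiles a regex and checks the whole input in a single call. In this sketch the class name CharacterClassDemo and the helper matches are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post
class CharacterClassDemo {

    // Checks whether the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("[abc]", "b"));          // true
        System.out.println(matches("[^abc]", "b"));         // false: b is excluded
        System.out.println(matches("[^abc]", "d"));         // true
        System.out.println(matches("[a-z]", "Q"));          // false: Q is upper case
        System.out.println(matches("[a-zA-Z0-9]", "7"));    // true
        System.out.println(matches("[^a-zA-Z0-9]", "#"));   // true: # is a special character
    }
}
```

Each class matches exactly one character, so a quantifier such as + or * is needed to match longer input.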


Pre-defined character classes:

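\s =Any whitespace character (space, tab, newline, etc.)
\S =Any character except whitespace
\d =Any digit from 0 to 9
\D =Any character except a digit
\w =Any word character [a-zA-Z0-9_]
\W =Any character except a word character [^a-zA-Z0-9_] (special characters, including whitespace)
. =Any character (except line terminators, by default)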
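When these pre-defined classes are written inside a Java String literal, every backslash has to be doubled, so \d becomes "\\d". The sketch below tries a few of them; the class name PredefinedClassDemo and the helper matches are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post
class PredefinedClassDemo {

    // Checks whether the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // In Java source, each backslash of the regex is written as "\\"
        System.out.println(matches("\\d", "5"));     // true: a digit
        System.out.println(matches("\\s", "\t"));    // true: a tab is whitespace
        System.out.println(matches("\\w", "_"));     // true: underscore is a word character
        System.out.println(matches("\\W", " "));     // true: a space is not a word character
        System.out.println(matches("\\W", "a"));     // false
        System.out.println(matches(".", "\n"));      // false: . skips line terminators by default
    }
}
```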

Note:
Another important method in the Pattern class is the split() method:

This method splits the target String around matches of the given pattern.
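As a small sketch of split(), the example below breaks a String on runs of whitespace and on a set of separator characters. The class name SplitDemo, the helper splitOn and the sample strings are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post
class SplitDemo {

    // Splits the target around every match of the regex.
    static String[] splitOn(String regex, String target) {
        return Pattern.compile(regex).split(target);
    }

    public static void main(String[] args) {
        // One or more whitespace characters act as a single separator
        System.out.println(Arrays.toString(splitOn("\\s+", "Java  regular   expression")));
        // prints [Java, regular, expression]

        // Either a comma or a semicolon separates the parts
        System.out.println(Arrays.toString(splitOn("[,;]", "a,b;c")));
        // prints [a, b, c]
    }
}
```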

3 comments:

  1. Thank you so much sir for giving more useful concept

  2. \W = does this read white space or spaces?

  3. The below regular expressions are used for following purpose.

    \w Any word character (alphanumeric or underscore)
    \W Any non-word character (this includes white space)


