Regular Expressions

What are regular expressions?

A regular expression, often abbreviated as regex, defines a search pattern for strings. The pattern can be a single character, a fixed string, or a complex expression containing special characters. For a given string, the pattern may match once, several times, or not at all. Regular expressions can be used to search, edit, and manipulate text.

Analyzing or modifying text with a regex is described as applying the regular expression to the text (string). The pattern is applied to the text from left to right, and once a source character has been used in a match it cannot be reused. For example, the regex aba matches ababababa only two times (aba_aba__).
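The left-to-right, no-reuse behavior can be checked with a short sketch. The class and method names here (NonOverlappingMatches, countMatches) are hypothetical and chosen only for illustration; the sketch simply counts how often Matcher.find() succeeds.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonOverlappingMatches {
    // Counts how often the regex is found while scanning the input left to right.
    // Each call to find() resumes after the end of the previous match,
    // so characters already consumed by a match are not reused.
    static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Matches at index 0 and index 4; the trailing "ba" is too short to match.
        System.out.println(countMatches("aba", "ababababa")); // prints 2
    }
}
```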

Regex examples

Regex               Matches
this is text        Matches exactly "this is text".
this\s+is\s+text    Matches the word "this" followed by one or more whitespace characters, followed by the word "is", followed by one or more whitespace characters, followed by the word "text".
^\d+(\.\d+)?        ^ requires the pattern to start at the beginning of the line. \d+ matches one or several digits. \. matches ".", and the parentheses are used for grouping. The ? makes the group in parentheses optional. Matches, for example, "5", "1.5" and "2.21".
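The example patterns above can be tried out with String.matches(), as in the following sketch. The class name and constants are hypothetical. Note that inside a Java string literal every backslash of the regex has to be doubled.

```java
class RegexExamplesCheck {
    // The regexes from the table, written as Java string literals.
    static final String WORDS = "this\\s+is\\s+text";
    static final String NUMBER = "^\\d+(\\.\\d+)?";

    public static void main(String[] args) {
        System.out.println("this is text".matches("this is text")); // true
        System.out.println("this   is\ttext".matches(WORDS));       // true: any run of whitespace is accepted
        System.out.println("thisistext".matches(WORDS));            // false: whitespace is required
        System.out.println("5".matches(NUMBER));                    // true
        System.out.println("1.5".matches(NUMBER));                  // true
        System.out.println("1.".matches(NUMBER));                   // false: a dot must be followed by digits
    }
}
```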

Rules of writing regular expressions

Regular Expression    Description
.                     Matches any character (in Java, by default, any character except a line terminator).
^regex                Finds regex that must match at the beginning of the line.
regex$                Finds regex that must match at the end of the line.
[abc]                 Set definition: matches the letter a, b, or c.
[abc][vz]             Set definition: matches a, b, or c followed by either v or z.
[^abc]                When a caret appears as the first character inside square brackets, it negates the set. This matches any character except a, b, or c.
[a-d1-7]              Ranges: matches a single character that is a letter between a and d or a digit from 1 to 7. It does not match the two-character string d1.
X|Z                   Finds X or Z.
XZ                    Finds X directly followed by Z.
$                     Checks whether a line end follows.
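As a rough illustration of these rules, the sketch below runs a few of them against sample inputs. The class name and the helper found() are hypothetical. String.matches() tests the whole string, while the helper uses Matcher.find() to search inside a longer string, which is what the ^ and $ anchors need.

```java
import java.util.regex.Pattern;

class CharacterClassCheck {
    // Returns true if the regex is found anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println("bv".matches("[abc][vz]")); // true
        System.out.println("dv".matches("[abc][vz]")); // false: d is not in the first set
        System.out.println("x".matches("[^abc]"));     // true: any character except a, b, c
        System.out.println("5".matches("[a-d1-7]"));   // true
        System.out.println("d1".matches("[a-d1-7]"));  // false: the set matches one character only
        System.out.println("Z".matches("X|Z"));        // true
        System.out.println(found("^Java", "Java regex"));  // true: "Java" is at the beginning
        System.out.println(found("regex$", "Java regex")); // true: "regex" is at the end
        System.out.println(found("^regex", "Java regex")); // false
    }
}
```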

Meta characters

Regular Expression    Description
\d                    Any digit, short for [0-9]
\D                    A non-digit, short for [^0-9]
\s                    A whitespace character, short for [ \t\n\x0b\r\f]
\S                    A non-whitespace character, short for [^\s]
\w                    A word character, short for [a-zA-Z_0-9]
\W                    A non-word character, short for [^\w]
\S+                   One or more non-whitespace characters
\b                    Matches a word boundary, where a word character is [a-zA-Z0-9_]

Tip: These meta characters have the same first letter as their representation, e.g., digit, space, word, and boundary. Uppercase symbols define the opposite.
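A small sketch of the meta characters in use follows. The class name and the helper firstMatchStart() are hypothetical, introduced only to show where a \b word boundary lets a match begin.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MetaCharacterCheck {
    // Returns the start index of the first match, or -1 if there is none.
    static int firstMatchStart(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        System.out.println("7".matches("\\d"));         // true
        System.out.println("x".matches("\\D"));         // true
        System.out.println("\t".matches("\\s"));        // true: a tab is whitespace
        System.out.println("user_42".matches("\\w+"));  // true: letters, underscore, digits
        System.out.println("user 42".matches("\\S+"));  // false: contains a space
        // "cat" inside "concat" is not preceded by a word boundary, so the
        // first match is the standalone word starting at index 7.
        System.out.println(firstMatchStart("\\bcat\\b", "concat cat")); // 7
    }
}
```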

There is much more to learn about regular expressions. The following sections continue with further rules and examples to help you understand them.


Quantifier

A quantifier defines how often an element can occur.

Regular Expression    Description
*                     Occurs zero or more times, short for {0,}
+                     Occurs one or more times, short for {1,}
?                     Occurs zero or one time, short for {0,1}
{X}                   Occurs exactly X times, e.g., \d{3} matches three digits
{X,Y}                 Occurs between X and Y times, e.g., \d{1,4} matches one to four digits
*?                    A ? after a quantifier makes it reluctant: it tries to find the smallest possible match
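Quantifiers control how often the preceding element may repeat. The sketch below exercises the standard Java quantifiers *, +, ?, {X}, {X,Y} and the reluctant form with a trailing ?. The class name and the helper firstMatch() are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierCheck {
    // Returns the text of the first match, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println("".matches("a*"));            // true: zero occurrences are allowed
        System.out.println("".matches("a+"));            // false: at least one a is required
        System.out.println("color".matches("colou?r"));  // true: the u is optional
        System.out.println("123".matches("\\d{3}"));     // true: exactly three digits
        System.out.println("12345".matches("\\d{1,4}")); // false: more than four digits
        // Greedy .+ grabs as much as possible; reluctant .+? stops at the first '>'.
        System.out.println(firstMatch("<.+>", "<a><b>"));  // <a><b>
        System.out.println(firstMatch("<.+?>", "<a><b>")); // <a>
    }
}
```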

Using regular expressions with String methods



Method                                     Description
s.matches("regex")                         Evaluates whether "regex" matches s. Returns true only if the WHOLE string can be matched.
s.split("regex")                           Creates an array with the substrings of s divided at each occurrence of "regex". The text matched by "regex" is not included in the result.
s.replaceFirst("regex", "replacement")     Replaces the first occurrence of "regex" with "replacement".
s.replaceAll("regex", "replacement")       Replaces all occurrences of "regex" with "replacement".
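The four String methods can be compared side by side with a sketch like the one below. The class name and the sample input "a1b22c" are made up for illustration.

```java
import java.util.Arrays;

class StringRegexMethods {
    static final String INPUT = "a1b22c";

    public static void main(String[] args) {
        // matches() must cover the whole string, not just a part of it.
        System.out.println(INPUT.matches("\\d+"));          // false
        System.out.println(INPUT.matches("[a-c0-9]+"));     // true
        // split() drops the matched digits and keeps what lies between them.
        System.out.println(Arrays.toString(INPUT.split("\\d+"))); // [a, b, c]
        System.out.println(INPUT.replaceFirst("\\d+", "-"));      // a-b22c
        System.out.println(INPUT.replaceAll("\\d+", "-"));        // a-b-c
    }
}
```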

Example of Java Regular Expressions

In Java, a regular expression match can be written in three ways, all shown in the example below.
Create the package javaessential.com.regex.example1 in your Java project and copy the following code into it.

            package javaessential.com.regex.example1;

            import java.util.regex.Matcher;
            import java.util.regex.Pattern;

            public class RegexExample1 {
              public static void main(String[] args) {
                // 1st way: compile the pattern, create a matcher, then match
                Pattern p = Pattern.compile(".s"); // . represents any single character
                Matcher m = p.matcher("as");
                boolean b = m.matches();

                // 2nd way: the same steps chained in one statement
                boolean b2 = Pattern.compile(".s").matcher("as").matches();

                // 3rd way: the static convenience method
                boolean b3 = Pattern.matches(".s", "as");

                System.out.println(b + " " + b2 + " " + b3);
              }
            }

            // Output:
            // true true true
        


Example 2

Create the package javaessential.com.regex.example2 in your Java project and copy the following code into it.

            package javaessential.com.regex.example2;

            public class RegexTestStrings {
              public static final String EXAMPLE_TEST = "This is my small example "
                  + "string which I'm going to " + "use for pattern matching.";

              public static void main(String[] args) {
                System.out.println(EXAMPLE_TEST.matches("\\w.*"));
                String[] splitString = (EXAMPLE_TEST.split("\\s+"));
                System.out.println(splitString.length);// should be 14
                for (String string : splitString) {
                  System.out.println(string);
                }
                // replace all whitespace with tabs
                System.out.println(EXAMPLE_TEST.replaceAll("\\s+", "\t"));
              }
            }


        


Example 3

Create the package javaessential.com.regex.example3 in your Java project and copy the following code into it.

            package javaessential.com.regex.example3;

            public class StringMatcher {
              // returns true if the string matches exactly "true"
              public boolean isTrue(String s){
                return s.matches("true");
              }
              // returns true if the string matches exactly "true" or "True"
              public boolean isTrueVersion2(String s){
                return s.matches("[tT]rue");
              }

              // returns true if the string matches exactly "true" or "True"
              // or "yes" or "Yes"
              public boolean isTrueOrYes(String s){
                return s.matches("[tT]rue|[yY]es");
              }

              // returns true if the string contains the substring "true"
              public boolean containsTrue(String s){
                return s.matches(".*true.*");
              }


              // returns true if the string consists of exactly three letters
              public boolean isThreeLetters(String s){
                return s.matches("[a-zA-Z]{3}");
                // equivalent longer form (note: [a-Z] is not a valid range in Java):
                // return s.matches("[a-zA-Z][a-zA-Z][a-zA-Z]");
              }



              // returns true if the string does not have a number at the beginning
              public boolean isNoNumberAtBeginning(String s){
                return s.matches("^[^\\d].*");
              }
              // returns true if the string consists of any number of word characters except b
              public boolean isIntersection(String s){
                return s.matches("([\\w&&[^b]])*");
              }
              // returns true if the string contains a number less than 300
              public boolean isLessThenThreeHundred(String s){
                return s.matches("[^0-9]*[12]?[0-9]{1,2}[^0-9]*");
              }

            }

        


Pattern and Matcher



For more advanced regular expression work, Java provides the classes java.util.regex.Pattern and java.util.regex.Matcher.

You first create a Pattern object, which defines the regular expression. The Pattern object then creates a Matcher object for a given string, and the Matcher object lets you perform regex operations on that string.

Pattern and Matcher Example

        package javaessential.com.regex.example4;

        import java.util.regex.Matcher;
        import java.util.regex.Pattern;

        public class RegexTestPatternMatcher {
          public static final String EXAMPLE_TEST = "This is my small example string which I'm going to use for pattern matching.";

          public static void main(String[] args) {
            Pattern pattern = Pattern.compile("\\w+");
            // to match case-insensitively, you could compile the pattern
            // with a flag instead:
            // Pattern pattern = Pattern.compile("\\w+", Pattern.CASE_INSENSITIVE);
            Matcher matcher = pattern.matcher(EXAMPLE_TEST);
            // iterate over all occurrences
            while (matcher.find()) {
              System.out.print("Start index: " + matcher.start());
              System.out.print(" End index: " + matcher.end() + " ");
              System.out.println(matcher.group());
            }
            // now create a new pattern and matcher to replace whitespace with tabs
            Pattern replace = Pattern.compile("\\s+");
            Matcher matcher2 = replace.matcher(EXAMPLE_TEST);
            System.out.println(matcher2.replaceAll("\t"));
          }
        }
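The commented-out Pattern.CASE_INSENSITIVE flag in the example above can be seen in action with a sketch like this one. The class name, the helper count(), and the sample text are hypothetical. The pattern is compiled once, with or without the flag, and the matches are counted with find().

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaseInsensitiveCount {
    static final String TEXT = "This is this and THIS";

    // Counts the matches of a compiled pattern in the given text.
    static int count(Pattern pattern, String text) {
        Matcher matcher = pattern.matcher(text);
        int n = 0;
        while (matcher.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Pattern sensitive = Pattern.compile("this");
        Pattern insensitive = Pattern.compile("this", Pattern.CASE_INSENSITIVE);
        System.out.println(count(sensitive, TEXT));   // 1: only the lowercase "this"
        System.out.println(count(insensitive, TEXT)); // 3: "This", "this" and "THIS"
    }
}
```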

Example 5: Create a regular expression that accepts only alphanumeric characters and requires a length of exactly 6 characters.

        package javaessential.com.regex.example5;

        import java.util.regex.Pattern;

        class RegexExample6 {
          public static void main(String[] args) {
            System.out.println(Pattern.matches("[a-zA-Z0-9]{6}", "arun32"));    // true
            System.out.println(Pattern.matches("[a-zA-Z0-9]{6}", "kkvarun32")); // false (more than 6 characters)
            System.out.println(Pattern.matches("[a-zA-Z0-9]{6}", "JA2Uk2"));    // true
            System.out.println(Pattern.matches("[a-zA-Z0-9]{6}", "arun$2"));    // false ($ is not matched)
          }
        }
    

Example 6: Create a regular expression that accepts a 10-digit number starting with 7, 8, or 9.

        package javaessential.com.regex.example6;
        import java.util.regex.*;
        class RegexExample7
        {
            public static void main(String args[])
            {
                System.out.println("by character classes and quantifiers ...");
                System.out.println(Pattern.matches("[789]{1}[0-9]{9}", "9953038949"));//true
                System.out.println(Pattern.matches("[789][0-9]{9}", "9953038949"));//true

                System.out.println(Pattern.matches("[789][0-9]{9}", "99530389490"));//false (11 characters)
                System.out.println(Pattern.matches("[789][0-9]{9}", "6953038949"));//false (starts with 6)
                System.out.println(Pattern.matches("[789][0-9]{9}", "8853038949"));//true

                System.out.println("by metacharacters ...");
                System.out.println(Pattern.matches("[789]{1}\\d{9}", "8853038949"));//true
                System.out.println(Pattern.matches("[789]{1}\\d{9}", "3853038949"));//false (starts with 3)
            }
        }