Class MatchRatingApproachEncoder

java.lang.Object
org.apache.commons.codec.language.MatchRatingApproachEncoder
All Implemented Interfaces:
Encoder, StringEncoder

public class MatchRatingApproachEncoder extends Object implements StringEncoder
Match Rating Approach phonetic algorithm, developed by Western Airlines in 1977. This class is immutable and thread-safe.
Since:
1.8
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    private static final String[]
    DOUBLE_CONSONANT
     
    private static final int
    ELEVEN
    Constants used mainly for the min rating value.
    private static final String
    EMPTY
     
    private static final int
    FIVE
    Constants used mainly for the min rating value.
    private static final int
    FOUR
    Constants used mainly for the min rating value.
    private static final int
    ONE
    Constants used mainly for the min rating value.
    private static final String
    PLAIN_ASCII
    The plain letter equivalent of the accented letters.
    private static final int
    SEVEN
    Constants used mainly for the min rating value.
    private static final int
    SIX
    Constants used mainly for the min rating value.
    private static final String
    SPACE
     
    private static final int
    THREE
    Constants used mainly for the min rating value.
    private static final int
    TWELVE
    Constants used mainly for the min rating value.
    private static final int
    TWO
    Constants used mainly for the min rating value.
    private static final String
    UNICODE
    Unicode characters corresponding to various accented letters.
  • Constructor Summary

    Constructors
    Constructor
    Description
    MatchRatingApproachEncoder()
     
  • Method Summary

    Modifier and Type
    Method
    Description
    (package private) String
    cleanName(String name)
    Cleans up a name by upper-casing it and removing common punctuation, accents, and spaces.
    final Object
    encode(Object pObject)
    Encodes an Object using the Match Rating Approach algorithm.
    final String
    encode(String name)
    Encodes a String using the Match Rating Approach (MRA) algorithm.
    (package private) String
    getFirst3Last3(String name)
    Gets the first and last 3 letters of a name if it is longer than 6 characters; otherwise returns the name unchanged.
    (package private) int
    getMinRating(int sumLength)
    Obtains the min rating of the length sum of the 2 names.
    boolean
    isEncodeEquals(String name1, String name2)
    Determines if two names are homophonous via Match Rating Approach (MRA) algorithm.
    (package private) boolean
    isVowel(String letter)
    Determines if a letter is a vowel.
    (package private) int
    leftToRightThenRightToLeftProcessing(String name1, String name2)
    Processes the names from left to right first, then from right to left, removing identical letters in the same positions.
    (package private) String
    removeAccents(String accentedWord)
    Removes accented letters and replaces them with their non-accented ASCII equivalents. Case is preserved.
    (package private) String
    removeDoubleConsonants(String name)
    Replaces any double consonant pair with the single letter equivalent.
    (package private) String
    removeVowels(String name)
    Deletes all vowels unless the vowel begins the word.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • SPACE

      private static final String SPACE
    • EMPTY

      private static final String EMPTY
    • ONE

      private static final int ONE
      Constants used mainly for the min rating value.
    • TWO

      private static final int TWO
      Constants used mainly for the min rating value.
    • THREE

      private static final int THREE
      Constants used mainly for the min rating value.
    • FOUR

      private static final int FOUR
      Constants used mainly for the min rating value.
    • FIVE

      private static final int FIVE
      Constants used mainly for the min rating value.
    • SIX

      private static final int SIX
      Constants used mainly for the min rating value.
    • SEVEN

      private static final int SEVEN
      Constants used mainly for the min rating value.
    • ELEVEN

      private static final int ELEVEN
      Constants used mainly for the min rating value.
    • TWELVE

      private static final int TWELVE
      Constants used mainly for the min rating value.
    • PLAIN_ASCII

      private static final String PLAIN_ASCII
      The plain letter equivalent of the accented letters.
    • UNICODE

      private static final String UNICODE
      Unicode characters corresponding to various accented letters. For example, Ú is U acute.
    • DOUBLE_CONSONANT

      private static final String[] DOUBLE_CONSONANT
  • Constructor Details

    • MatchRatingApproachEncoder

      public MatchRatingApproachEncoder()
  • Method Details

    • cleanName

      String cleanName(String name)
      Cleans up a name:
      1. Upper-cases everything
      2. Removes some common punctuation
      3. Removes accents
      4. Removes any spaces

      API Usage

      Consider this method private; it is package-private for unit testing only.

      Parameters:
      name - The name to be cleaned
      Returns:
      The cleaned name
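
      The four cleaning steps can be sketched as a stand-alone helper. This is a minimal illustration, not the library source: the class name CleanNameSketch is hypothetical, the punctuation set (- & ' . ,) is an assumption because this page does not list it, and accents are stripped with java.text.Normalizer rather than with the class's own lookup constants.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical stand-alone sketch; not the Commons Codec implementation.
final class CleanNameSketch {

    static String cleanName(String name) {
        // 1. Upper-case everything.
        String result = name.toUpperCase(Locale.ENGLISH);
        // 2. Remove some common punctuation (assumed set: - & ' . ,).
        result = result.replaceAll("[-&'.,]", "");
        // 3. Remove accents: decompose each letter, then drop the combining marks.
        result = Normalizer.normalize(result, Normalizer.Form.NFD).replaceAll("\\p{M}", "");
        // 4. Remove any whitespace.
        return result.replaceAll("\\s+", "");
    }

    public static void main(String[] args) {
        System.out.println(cleanName("O'Brien, Jr."));  // prints OBRIENJR
        System.out.println(cleanName("Zo\u00EB-Anne")); // prints ZOEANNE
    }
}
```

      The order follows the list above, so punctuation is removed before spaces and a name such as "O'Brien, Jr." collapses to a single token.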
    • encode

      public final Object encode(Object pObject) throws EncoderException
      Encodes an Object using the Match Rating Approach algorithm. This method exists to satisfy the requirements of the Encoder interface. It throws an EncoderException if the input object is not of type java.lang.String.
      Specified by:
      encode in interface Encoder
      Parameters:
      pObject - Object to encode
      Returns:
      An object (of type java.lang.String) containing the Match Rating Approach code that corresponds to the String supplied.
      Throws:
      EncoderException - if the parameter supplied is not of type java.lang.String
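
      The type check described above might look like the following sketch. It is self-contained for illustration only: SketchEncoderException is a hypothetical stand-in for the library's EncoderException, and the String overload is a placeholder that merely upper-cases its input instead of computing an MRA code.

```java
import java.util.Locale;

// Hypothetical sketch of the Object-to-String dispatch; not the Commons Codec implementation.
final class ObjectEncodeSketch {

    // Stand-in for org.apache.commons.codec.EncoderException.
    static class SketchEncoderException extends Exception {
        SketchEncoderException(String message) {
            super(message);
        }
    }

    // Placeholder for the String overload; a real encoder would return an MRA code.
    static String encode(String name) {
        return name.toUpperCase(Locale.ENGLISH);
    }

    // Rejects anything that is not a String, then delegates to the String overload.
    static Object encode(Object input) throws SketchEncoderException {
        if (!(input instanceof String)) {
            throw new SketchEncoderException("Parameter is not of type java.lang.String");
        }
        return encode((String) input);
    }

    public static void main(String[] args) throws SketchEncoderException {
        System.out.println(encode((Object) "smith")); // prints SMITH
    }
}
```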
    • encode

      public final String encode(String name)
      Encodes a String using the Match Rating Approach (MRA) algorithm.
      Specified by:
      encode in interface StringEncoder
      Parameters:
      name - String object to encode
      Returns:
      The MRA code corresponding to the String supplied
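
      As a rough illustration of how the helper steps documented on this page could combine into one code, the sketch below chains cleaning, vowel removal, double-consonant removal, and first-3/last-3 truncation. The class name MraEncodeSketch, the step order, the punctuation set, and the handling of blank input are all assumptions; the library's actual behavior may differ in edge cases.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical stand-alone sketch of an MRA-style pipeline; not the Commons Codec implementation.
final class MraEncodeSketch {

    static String encode(String name) {
        if (name == null || name.trim().isEmpty()) {
            return ""; // assumption: blank input yields an empty code
        }
        // Clean: upper-case, strip assumed punctuation, accents, and whitespace.
        String code = name.toUpperCase(Locale.ENGLISH).replaceAll("[-&'.,]", "");
        code = Normalizer.normalize(code, Normalizer.Form.NFD).replaceAll("\\p{M}", "");
        code = code.replaceAll("\\s+", "");
        if (code.isEmpty()) {
            return "";
        }
        // Delete all vowels unless the vowel begins the word.
        code = code.charAt(0) + code.substring(1).replaceAll("[AEIOU]", "");
        // Replace each double consonant pair with the single letter.
        code = code.replaceAll("([B-DF-HJ-NP-TV-Z])\\1", "$1");
        // Keep only the first 3 and last 3 letters when longer than 6.
        if (code.length() > 6) {
            code = code.substring(0, 3) + code.substring(code.length() - 3);
        }
        return code;
    }

    public static void main(String[] args) {
        System.out.println(encode("Smith"));      // prints SMTH
        System.out.println(encode("Catherine"));  // prints CTHRN
        System.out.println(encode("Montgomery")); // prints MNTMRY
    }
}
```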
    • getFirst3Last3

      String getFirst3Last3(String name)
      Gets the first and last 3 letters of a name if it is longer than 6 characters; otherwise returns the name unchanged.

      API Usage

      Consider this method private; it is package-private for unit testing only.

      Parameters:
      name - The string to get the substrings from
      Returns:
      Annexed first and last 3 letters of input word.
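
      A minimal sketch of the truncation rule, assuming the hypothetical class name First3Last3Sketch; it only mirrors the description above and is not the library source.

```java
// Hypothetical stand-alone sketch; not the Commons Codec implementation.
final class First3Last3Sketch {

    static String getFirst3Last3(String name) {
        int length = name.length();
        if (length > 6) {
            // Join the first three and the last three letters.
            return name.substring(0, 3) + name.substring(length - 3);
        }
        // Names of six characters or fewer are returned unchanged.
        return name;
    }

    public static void main(String[] args) {
        System.out.println(getFirst3Last3("MNTGMRY")); // prints MNTMRY
        System.out.println(getFirst3Last3("KTHRYN"));  // prints KTHRYN
    }
}
```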
    • getMinRating

      int getMinRating(int sumLength)
      Obtains the minimum rating for the combined length of the two names. In essence, the larger the combined length, the smaller the minimum rating. The values are taken strictly from the documentation.

      API Usage

      Consider this method private; it is package-private for unit testing only.

      Parameters:
      sumLength - The combined length of the two strings
      Returns:
      The min rating value
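
      The inverse relationship between combined length and minimum rating can be sketched as a threshold table. The cut-offs below are an assumption based on the commonly published Match Rating Approach table (this page does not list them), and the class name MinRatingSketch is hypothetical.

```java
// Hypothetical stand-alone sketch; thresholds assumed from the commonly published MRA table.
final class MinRatingSketch {

    static int getMinRating(int sumLength) {
        if (sumLength <= 4) {
            return 5;
        }
        if (sumLength <= 7) {
            return 4;
        }
        if (sumLength <= 11) {
            return 3;
        }
        if (sumLength == 12) {
            return 2;
        }
        return 1; // assumption: sums above 12 get the lowest rating
    }

    public static void main(String[] args) {
        System.out.println(getMinRating(9)); // prints 3
    }
}
```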
    • isEncodeEquals

      public boolean isEncodeEquals(String name1, String name2)
      Determines if two names are homophonous via the Match Rating Approach (MRA) algorithm. Note that the strings are cleaned in the same way as in encode(String).
      Parameters:
      name1 - First of the 2 strings (names) to compare
      name2 - Second of the 2 names to compare
      Returns:
      true if the encodings are identical, false otherwise.
    • isVowel

      boolean isVowel(String letter)
      Determines if a letter is a vowel.

      API Usage

      Consider this method private; it is package-private for unit testing only.

      Parameters:
      letter - The letter under investigation
      Returns:
      True if a vowel, else false
    • leftToRightThenRightToLeftProcessing

      int leftToRightThenRightToLeftProcessing(String name1, String name2)
      Processes the names from left to right first, then from right to left, removing identical letters in the same positions. Then subtracts the length of the longer remaining string from 6 and returns the result.

      API Usage

      Consider this method private; it is package-private for unit testing only.

      Parameters:
      name1 - The first name to compare
      name2 - The second name to compare
      Returns:
      6 minus the length of the longer string that remains
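
      One way to read the two-pass comparison is sketched below: a left-to-right pass drops letters that match at the same index, then a right-to-left pass drops matches among the leftovers. This is a hypothetical illustration (class name SimilarityRatingSketch, method name rating); the library's exact bookkeeping may differ, and inputs are assumed to be already-encoded names of at most 6 letters.

```java
// Hypothetical stand-alone sketch of a two-pass comparison; not the Commons Codec implementation.
final class SimilarityRatingSketch {

    static int rating(String name1, String name2) {
        StringBuilder rest1 = new StringBuilder();
        StringBuilder rest2 = new StringBuilder();
        int longest = Math.max(name1.length(), name2.length());

        // Pass 1, left to right: keep only letters that differ at the same index.
        for (int i = 0; i < longest; i++) {
            boolean in1 = i < name1.length();
            boolean in2 = i < name2.length();
            if (in1 && in2 && name1.charAt(i) == name2.charAt(i)) {
                continue;
            }
            if (in1) {
                rest1.append(name1.charAt(i));
            }
            if (in2) {
                rest2.append(name2.charAt(i));
            }
        }

        // Pass 2, right to left: count matches among the leftovers, aligned at the end.
        int matched = 0;
        int i1 = rest1.length() - 1;
        int i2 = rest2.length() - 1;
        while (i1 >= 0 && i2 >= 0) {
            if (rest1.charAt(i1) == rest2.charAt(i2)) {
                matched++;
            }
            i1--;
            i2--;
        }

        // Subtract the length of the longer remaining string from 6.
        int unmatched = Math.max(rest1.length(), rest2.length()) - matched;
        return 6 - unmatched;
    }

    public static void main(String[] args) {
        System.out.println(rating("BYRN", "BRN"));     // prints 5
        System.out.println(rating("CTHRN", "KTHRYN")); // prints 4
    }
}
```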
    • removeAccents

      String removeAccents(String accentedWord)
      Removes accented letters and replaces them with their non-accented ASCII equivalents. Case is preserved. See http://www.codecodex.com/wiki/Remove_accent_from_letters_%28ex_.%C3%A9_to_e%29
      Parameters:
      accentedWord - The word that may have accents in it.
      Returns:
      De-accented word
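
      A lookup with two parallel strings, in the spirit of the UNICODE and PLAIN_ASCII fields, could be sketched as follows. The six-letter tables here are an assumed subset chosen for illustration, since the real constant values are not shown on this page; the class and field names are hypothetical.

```java
// Hypothetical stand-alone sketch; the lookup tables are an assumed subset, not the library's constants.
final class RemoveAccentsSketch {

    // A-grave, a-grave, E-acute, e-acute, U-acute, u-acute.
    private static final String ACCENTED = "\u00C0\u00E0\u00C9\u00E9\u00DA\u00FA";
    // Plain letter at the same index as its accented counterpart.
    private static final String PLAIN = "AaEeUu";

    static String removeAccents(String accentedWord) {
        if (accentedWord == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder(accentedWord.length());
        for (int i = 0; i < accentedWord.length(); i++) {
            char c = accentedWord.charAt(i);
            int pos = ACCENTED.indexOf(c);
            // Swap in the plain letter when a mapping exists; case is preserved by the tables.
            sb.append(pos >= 0 ? PLAIN.charAt(pos) : c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(removeAccents("\u00DAna caf\u00E9")); // prints Una cafe
    }
}
```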
    • removeDoubleConsonants

      String removeDoubleConsonants(String name)
      Replaces any double consonant pair with the single letter equivalent.

      API Usage

      Consider this method private; it is package-private for unit testing only.

      Parameters:
      name - String to have double consonants removed
      Returns:
      Single consonant word
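
      The replacement can be sketched with a back-referencing regular expression. This is an assumption-laden illustration: the class name DoubleConsonantSketch is hypothetical, the input is upper-cased first, and every letter other than A, E, I, O, U is treated as a consonant.

```java
import java.util.Locale;

// Hypothetical stand-alone sketch; not the Commons Codec implementation.
final class DoubleConsonantSketch {

    static String removeDoubleConsonants(String name) {
        String upper = name.toUpperCase(Locale.ENGLISH);
        // A consonant followed by the same consonant is replaced by a single occurrence.
        return upper.replaceAll("([B-DF-HJ-NP-TV-Z])\\1", "$1");
    }

    public static void main(String[] args) {
        System.out.println(removeDoubleConsonants("Miller")); // prints MILER
    }
}
```

      Doubled vowels are left alone, so "AARON" is returned unchanged.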
    • removeVowels

      String removeVowels(String name)
      Deletes all vowels unless the vowel begins the word.

      API Usage

      Consider this method private; it is package-private for unit testing only.

      Parameters:
      name - The name to have vowels removed
      Returns:
      De-voweled word
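
      A minimal sketch of the vowel rule, assuming upper-case input and treating only A, E, I, O, and U as vowels; the class name RemoveVowelsSketch is hypothetical and the code is not the library source.

```java
// Hypothetical stand-alone sketch; not the Commons Codec implementation.
final class RemoveVowelsSketch {

    static String removeVowels(String name) {
        if (name.isEmpty()) {
            return name;
        }
        // Keep the first letter as-is, then drop every vowel from the rest.
        return name.charAt(0) + name.substring(1).replaceAll("[AEIOU]", "");
    }

    public static void main(String[] args) {
        System.out.println(removeVowels("EDWARD"));    // prints EDWRD
        System.out.println(removeVowels("CATHERINE")); // prints CTHRN
    }
}
```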