Finding anagrams, rhyming words and misspelled words in a dictionary.

Surprisingly, finding anagrams, rhyming words and corrections (for misspelled words) have a common O(n) solution.
- To find anagrams, we have a dictionary in which all the words are sorted internally and then compare the input( which is again sorted ). see Bentley (1998) Programming Pearls Addison-Wesley for further discussion on this.

- To find rhyming words, we have a dictionary in which we have the words in reverse order and we reverse the input and compare for similarity towards the end (say last 3 characters )

- A find the spelling correction, we use a related hybrid approach, to quote Knuth The Art of Computer Programming Vol.3 (Sorting and Searching) p.394 :

More complicated search problems can often be reduced to the simpler case
considered here. For example, suppose that the keys are words that might be
slightly misspelled; we might want to find the correct record in spite of this
error. If we make two copies of the file, one in which the keys are in normal
lexicographic order and another in which they are ordered from right to left (as
if the words were spelled backwards), a misspelled search argument will probably
agree up to half or more of its length with an entry in one of these two files.

No comments: