Channel: Active questions tagged string-manipulation - Mathematica Stack Exchange

Dictionary lookups of words composed of specified characters


I am trying to find all English words composed of specified characters. For example, all words composed of one or more of the following: {"A", "C", "D", "H", "I", "R", "N"}, where all the letters except "N" can appear 0 or more times, and "N" must appear at least once.

I have been handling this by brute force, creating all Tuples of the letters, screening for those that contain "N", then running a DictionaryLookup[] over the whole list. For example, for words of length seven:

    letterList = {"A", "C", "D", "H", "I", "R", "N"};
    requiredLetter = "N";
    wordLength = 7;
    candidates = Map[ToLowerCase[StringJoin[#]] &,
       Select[Tuples[letterList, wordLength], MemberQ[#, requiredLetter] &]];
    evals = Map[{#, DictionaryLookup[#, "IgnoreCase" -> False]} &, candidates];
    winners = Select[evals, Length[#[[2]]] > 0 &];

The qualifying words of length 7 in this case are "candida" and "handcar".

The main problem with this approach is that the list of tuples has length Length[letterList]^wordLength, which grows exponentially: with seven letters it reaches the billions at wordLength = 11, and the kernel crashes for word lengths longer than 10.
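To make the growth concrete, here is a small sketch that tabulates how many tuples the brute-force approach builds, and how many survive the "contains N" filter, for a few word lengths. The names alphabetSize and tupleCounts are introduced here for illustration only. The count of tuples containing the required letter is all tuples minus those built from the other six letters.

```mathematica
alphabetSize = 7;  (* Length[letterList] in the example above *)

(* {word length, all tuples, tuples containing the required letter} *)
tupleCounts[n_] := {n, alphabetSize^n, alphabetSize^n - (alphabetSize - 1)^n};

TableForm[
 tupleCounts /@ {7, 10, 11},
 TableHeadings -> {None, {"length", "all tuples", "tuples with N"}}]
```

For length 7 this gives 823,543 tuples, of which 543,607 contain "N"; at length 11 the full list already has about 1.98 billion entries, each of which would need its own dictionary lookup.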

I invite suggestions for more efficient approaches that would let me check for longer words. It feels as though regular-expression searches of the dictionary might work, but that is not an area I have any skill in.
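One way the regular-expression idea could look, as a minimal sketch rather than a definitive answer: instead of generating candidates and looking each one up, ask the dictionary directly for every word built only from the allowed letters, then keep those containing the required letter. The variables allowed, required, pattern, and words are introduced here for illustration, and the sketch assumes DictionaryLookup accepts a RegularExpression as its string pattern and matches it against the whole word.

```mathematica
letterList = {"A", "C", "D", "H", "I", "R", "N"};
requiredLetter = "N";

(* Lowercase forms, matching the case-sensitive lookup in the question *)
allowed = ToLowerCase[StringJoin[letterList]];
required = ToLowerCase[requiredLetter];

(* Character class repeated one or more times: words using only allowed letters *)
pattern = RegularExpression["[" <> allowed <> "]+"];

(* One pass over the dictionary, then filter for the required letter *)
words = Select[DictionaryLookup[pattern], StringContainsQ[#, required] &];

(* Group the results by word length for inspection *)
KeySort[GroupBy[words, StringLength]]
```

Because the work is proportional to the size of the dictionary rather than to Length[letterList]^wordLength, no word length needs to be fixed in advance. The length-7 group can be checked against the brute-force result above ("candida" and "handcar").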

