
Select upper case list items with spaces and punctuation

I have a sample list that includes people's names in Title Case, and their companies in UPPER CASE.

list = {"Bob Jones", "ACME-SYSTEMS", "John Smith", "FUTURETECH123", "Sally Jones", "CITY SCHOOL", "Jane Black", "CONSULTANT", "Max Speed", "A.B. CORP"}

I would like to select only the list items that are in upper case. The most obvious approach would be to use UpperCaseQ:

Select[list, UpperCaseQ]
(* {"CONSULTANT"} *)

Unfortunately, UpperCaseQ returns False for any list item that contains a non-letter character, such as a space or punctuation mark.
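
To make the behavior concrete, here is a minimal sketch that applies UpperCaseQ to a few strings taken from the list above. The names samples and results are introduced only for this illustration.

```wolfram
(* Sample strings taken from the list above; `samples` is an illustrative name *)
samples = {"CONSULTANT", "CITY SCHOOL", "FUTURETECH123", "A.B. CORP"};

(* UpperCaseQ is True only when every character is an upper-case letter *)
results = UpperCaseQ /@ samples
(* {True, False, False, False} *)
```

Only the string made up entirely of upper-case letters passes. The space, the digits, and the periods each cause a False.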

I found that I can use StringReplace to strip some characters before applying UpperCaseQ:

Select[list, UpperCaseQ[StringReplace[#, {" ", "-", "."} -> ""]] &]
(* {"ACME-SYSTEMS", "CITY SCHOOL", "CONSULTANT", "A.B. CORP"} *)

Besides spaces, periods, and hyphens, strings can also contain digits and other symbols (which is why FUTURETECH123 is missing from the result above). On a small dataset you can hand-code the exceptions, but on a larger one this would need to be automated.

I came up with the following code to collect the non-letter characters automatically, but I suspect it would not be very efficient on a very large document.

allcharacters = DeleteDuplicates[Flatten[Characters[list]]];
nonletters = Select[allcharacters, ! LetterQ[#] &];
Select[list, UpperCaseQ[StringReplace[#, nonletters -> ""]] &]
(* {"ACME-SYSTEMS", "FUTURETECH123", "CITY SCHOOL", "CONSULTANT", "A.B. CORP"} *)

Is there a simpler way to select the items in a list that are in UPPER CASE while ignoring non-letter characters?

