Channel: Active questions tagged string-manipulation - Mathematica Stack Exchange
Removing non-word characters in certain parts of a string

I would like to ask how I can remove non-word characters from a string, but only in certain cases.

I have read this article, so I know how to get the words out of a string. My text, however, is a bit more complicated.

For example:

trialtext = ",,temp sp.a tiral - dump NV-A rambo.6833. 16,rgcht";

From this text, I would like to get as output:

{"temp","sp.a","tiral","dump","NV-A","rambo","6833","16","rgcht"}

In other words, I want to split on spaces, commas, hyphens and dots, EXCEPT when a hyphen or a dot has a letter character directly before and after it (the exception does not apply to commas or other signs!)

This has been my most successful attempt so far:

StringSplit[trialtext, Except[WordCharacter, WordCharacter .. ~~ "." ~~ WordCharacter]]

{"temp sp.a tiral dump NV-A rambo.6833 16,rgcht"}

although I do not understand why, if I ask for ".", it also takes "," and "-".

Therefore also the related question: can someone please explain to me why this

StringSplit[trialtext, Except[WordCharacter, ","]]

gives this output:

 {"temp sp.a tiral dump NV-A rambo.6833 16", "rgcht"}

while this:

StringSplit[trialtext, Except[WordCharacter, "."]]

produces this output:

{"temp", "sp", "a", "tiral", "dump", "NV", "A", "rambo", "6833", "16", "rgcht"}

Thanks a bunch!
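
The rule described above (keep a hyphen or dot only when a letter sits directly on both sides of it) can perhaps be expressed more easily by extracting tokens than by splitting. The following is only a sketch, not a confirmed answer: it swaps StringSplit for StringCases with a RegularExpression, the name tokenPattern is made up for illustration, and it assumes that the POSIX classes [[:alnum:]] and [[:alpha:]] are a close enough stand-in for WordCharacter and LetterCharacter on this kind of input.

```mathematica
trialtext = ",,temp sp.a tiral - dump NV-A rambo.6833. 16,rgcht";

(* Hypothetical helper name. A token is a run of letters/digits,
   optionally extended by "." or "-" when that separator has a
   letter directly before it (lookbehind) and after it (lookahead). *)
tokenPattern = RegularExpression[
   "[[:alnum:]]+(?:(?<=[[:alpha:]])[.-](?=[[:alpha:]])[[:alnum:]]+)*"];

StringCases[trialtext, tokenPattern]
(* {"temp", "sp.a", "tiral", "dump", "NV-A", "rambo", "6833", "16", "rgcht"} *)
```

Because the lookahead requires a letter, "rambo.6833" is cut at the dot (a digit follows it), while "sp.a" and "NV-A" stay whole. Commas, spaces and the free-standing " - " are never part of a token, so they act as separators without being listed explicitly.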

