I am trying to split a string using Regex in C#. I want to split on all non-alphanumeric characters, but a word containing an apostrophe should be kept as a single token when the apostrophe is part of a contraction or possessive (for example you'd or Steve's).
An example should clarify what I would like to achieve. Given a sentence such as:
"Steve's dog is mine 'not yours' I know you'd like'it"
I would like to obtain the following tokens:
steve's, dog, is, mine, not, yours, i, know, you'd, like, it
At the moment I am using:
Regex.Split(str.ToLower(), @"[^a-zA-Z0-9_']").Where(s => s != String.Empty).ToArray<string>();
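but this returns:
steve's, dog, is, mine, 'not, yours', i, know, you'd, like'it
The apostrophes used as quotation marks stay attached ('not, yours') and like'it is not split.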
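One direction that might work is to match the tokens instead of splitting, and to allow an apostrophe only when it is followed by a known contraction or possessive suffix. The following is a minimal sketch under that assumption, not a verified general solution: the suffix list (s, d, t, ll, re, ve, m) and the names `Tokenizer` and `Tokenize` are made up for illustration, and a regex alone cannot tell a real contraction from a typo that happens to end in one of those suffixes.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class Tokenizer
{
    // A run of word characters, optionally followed by an apostrophe and one of
    // an ASSUMED whitelist of contraction/possessive suffixes. The \b ensures the
    // suffix ends the word, so "like'it" is not treated as "like'" + "it...".
    private static readonly Regex Token =
        new Regex(@"[a-z0-9_]+(?:'(?:s|d|t|ll|re|ve|m)\b)?");

    // Hypothetical helper: lower-cases the input and returns every match as a token.
    public static string[] Tokenize(string input)
    {
        return Token.Matches(input.ToLowerInvariant())
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        string[] tokens = Tokenize("Steve's dog is mine 'not yours' I know you'd like'it");
        Console.WriteLine(string.Join(", ", tokens));
        // prints: steve's, dog, is, mine, not, yours, i, know, you'd, like, it
    }
}
```

Because this matches tokens rather than delimiters, there are no empty strings to filter out afterwards. Any apostrophe that is not followed by a whitelisted suffix is simply skipped, which is what strips the quotes from 'not and yours' and splits like'it.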