I am working on a Chinese NLP problem. I need to find the words that end with specific suffixes. For example, I have two lists:
suffixs = ['aaa','bbb','cc'.....]
words_list = ['oneaaa','twobbb','three','four']
for w in words_list:
    if w has suffix in suffixs:
        func(s, w)
I know I can use the re package, but re can only deal with fewer than 100 suffixes, and I have 1000+ suffixes. I tried:
for w in words_list:
    for s in suffixs:  # suffixs sorted by length
        if w.endswith(s):
            func(s, w)
            break
But it is too slow.
func(s, w) splits the word w into the word without its suffix and the suffix itself.
For example, 'oneaaa' becomes ['one', 'aaa'], but func depends on some conditions and is more complex, so any() doesn't work here.
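To make the simple case concrete, here is a minimal sketch of the split. split_suffix is a hypothetical name standing in for only the simplest part of func; the extra conditions in the real func are not shown.

```python
def split_suffix(s, w):
    # Hypothetical stand-in for the simple case of func(s, w):
    # split w into the part before the suffix and the suffix itself.
    if not s or not w.endswith(s):
        raise ValueError(f"{s!r} is not a suffix of {w!r}")
    return [w[:-len(s)], s]

print(split_suffix('aaa', 'oneaaa'))  # ['one', 'aaa']
```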
So I want to know whether there is a better way to deal with it.
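One direction that might help, as a rough sketch rather than a measured solution: keep the suffixes in a set and, for each word, test only the tail slices whose lengths actually occur among the suffixes. That costs at most one set lookup per distinct suffix length instead of one comparison per suffix. find_suffix is a hypothetical helper name, the lists are sample data, and trying the longest suffix first is an assumption about the intended order.

```python
suffixs = ['aaa', 'bbb', 'cc']  # sample data only; the real list has 1000+ entries
words_list = ['oneaaa', 'twobbb', 'three', 'four']

suffix_set = set(suffixs)
# Distinct non-zero suffix lengths, longest first (assumed matching order).
lengths = sorted({len(s) for s in suffix_set if s}, reverse=True)

def find_suffix(w):
    """Return the longest suffix of w that is in suffix_set, or None."""
    for n in lengths:
        if n <= len(w) and w[-n:] in suffix_set:
            return w[-n:]
    return None

# Collect (suffix, word) pairs; the real code would call func(s, w) here.
matches = []
for w in words_list:
    s = find_suffix(w)
    if s is not None:
        matches.append((s, w))

print(matches)  # [('aaa', 'oneaaa'), ('bbb', 'twobbb')]
```

With 1000+ suffixes the number of distinct lengths is usually small, so the inner loop stays short regardless of how many suffixes there are.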