I have a list of strings and I want to filter out the strings that are too similar to each other, based on Levenshtein distance. So if
lev(list[i], list[j]) < 50, then
del list[j]. Is there a more efficient way to calculate this distance between every pair of strings in the list? Thanks!

    data2 = []
    for i in data:
        for index, j in enumerate(data):
            s = levenshtein(i, j)
            if s < 50:
                del data[index]
        data2.append(i)

The rather dumb code above is taking too long to compute...
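
For reference, here is a rough sketch of the direction I am considering instead; it is untested on my real data and may well not be the best approach. It assumes that "filter out" means keeping the first string of each group of near-duplicates, and it only compares each string against the strings already kept rather than against the whole list. It also uses the fact that the Levenshtein distance is never smaller than the difference in length between the two strings, so pairs whose lengths differ by at least the threshold are skipped without computing anything. The names `filter_similar` and the `limit` parameter are my own, and `levenshtein` here is a plain dynamic-programming stand-in for whatever implementation is actually in use.

```python
def levenshtein(a, b, limit=None):
    """Two-row dynamic-programming edit distance.

    If `limit` is given, the function may stop early and return `limit`
    once the true distance is known to be at least `limit`.
    """
    if len(a) < len(b):
        a, b = b, a  # make `a` the longer string
    # The distance is at least the length difference.
    if limit is not None and len(a) - len(b) >= limit:
        return limit
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution (free on a match)
            ))
        # The final distance cannot be smaller than the minimum of any row.
        if limit is not None and min(current) >= limit:
            return limit
        previous = current
    return previous[-1]


def filter_similar(strings, threshold=50):
    """Keep a string only if it is at least `threshold` edits away
    from every string kept so far (hypothetical helper)."""
    kept = []
    for s in strings:
        if all(levenshtein(s, k, limit=threshold) >= threshold for k in kept):
            kept.append(s)
    return kept
```

This builds a new list instead of deleting from `data` while iterating over it, and it never compares a string with itself. It is still quadratic in the worst case (when nearly every string is kept), so it only helps when many strings are dropped or when lengths vary a lot.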