ruby - How do I split a string token when it is not all numbers?



I want to split tokens in an array if they are of the form "a number, a dot ("."), and then non-numbers". If a token is of the form "number, dot, number", I don't want to split it. I thought this would do the trick:

tokens.flat_map {|o| o.scan(/^\d+\.|[a-z]+/i) }

The expression works correctly for this case:

tokens = ["44.WORD"]
tokens.flat_map {|o| o.scan(/^\d+\.|[a-z]+/i) }
# => ["44.", "WORD"] 

but it cuts off the token in this case:

tokens = ["72.9"]
tokens.flat_map {|o| o.scan(/^\d+\.|[a-z]+/i) }
# => ["72."] 

How do I adjust my regular expression so that if the token is a number, a dot, and a number, I keep it just as it is and split it in two otherwise?

2 Answers: 

Try this:

tokens.flat_map { |token| token =~ /[a-z]/i ? token.split('.') : token }

This doesn't adjust your regexp, but it is sometimes easier, and often more readable, to use plain Ruby rather than cramming everything into a regexp.
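One thing to watch for with this approach: `split('.')` discards the separator, so "44.WORD" becomes "44" and "WORD" rather than "44." and "WORD" as in the question. If the dot should stay attached to the number, a possible variation (a sketch, not part of the original answer) is to split on the empty position right after the dot using a lookbehind:

```ruby
tokens = ["44.WORD", "72.9"]

# split('.') removes the dot from the result.
without_dot = tokens.flat_map { |token| token =~ /[a-z]/i ? token.split('.') : token }
p without_dot # => ["44", "WORD", "72.9"]

# Variation: split at the zero-width position just after the dot,
# so the dot stays with the number.
with_dot = tokens.flat_map { |token| token =~ /[a-z]/i ? token.split(/(?<=\.)/) : token }
p with_dot # => ["44.", "WORD", "72.9"]
```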


Since you have a well defined notion of where to split, use split instead of scan.

["44.WORD"].flat_map{|s| s.split(/(?<=\d\.)(?=\D)/)}
# => ["44.", "WORD"]

["72.9"].flat_map{|s| s.split(/(?<=\d\.)(?=\D)/)}
# => ["72.9"]
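The pattern works because both parts are zero-width: `(?<=\d\.)` requires a digit and a dot immediately before the split point, and `(?=\D)` requires a non-digit immediately after it, so nothing is consumed and "72.9" has no position that qualifies. As a rough sketch, here is the same idea applied to a whole array, next to a `scan` version closer to the regexp in the question. The method names `split_tokens` and `scan_tokens` are made up for illustration.

```ruby
# Hypothetical helper names; they do not come from the answers above.

# Split only at the empty position preceded by "digit, dot"
# and followed by a non-digit.
def split_tokens(tokens)
  tokens.flat_map { |s| s.split(/(?<=\d\.)(?=\D)/) }
end

# scan-based variant: take a leading "digits, dot" only when a
# non-digit follows; otherwise take the rest of the token whole.
def scan_tokens(tokens)
  tokens.flat_map { |s| s.scan(/\A\d+\.(?=\D)|.+/) }
end

tokens = ["44.WORD", "72.9", "WORD"]
p split_tokens(tokens) # => ["44.", "WORD", "72.9", "WORD"]
p scan_tokens(tokens)  # => ["44.", "WORD", "72.9", "WORD"]
```

The original `scan` pattern truncated "72.9" because `^\d+\.` matched "72." and `[a-z]+` then found no letters in "9"; the lookahead in the variant above prevents the first alternative from matching in that case.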