arrays - JavaScript RegEx Troubles - Matching then Splitting


Keywords:javascript 


Question: 

I am trying to match a pattern of numbers/words and then split on that grouping into an array. I have a regex pattern that seems to identify things properly (for the most part). Using RegExr.com I have made this:

It is ALMOST Correct. I cannot see why it will not get the word "crumbs" in bread crumbs.

I want to split on each ingredient and then push each one into an index in an array.

I have tried:

var regExp = '/(\d+)([^\d]+) |(\d+\/\d+) [^\d]+ /g';
cleanedText = ingredients.split(regExp);
console.log(cleanedText);

This does not work. It's not splitting at all. It's looking for [number space words] | [or Fraction space Words] and should split on each one it finds.


4 Answers: 

Try this one:

/((\d+\/\d+|\d+)\s*)([^\d]+)/g

I like regex101.com better. They provide much more information on what is going on in your RegExp.

See my example:

This RegExp works for integers like:

  • 3 Cups or
  • 1lb flour

and fractions like

  • 1/3 teaspoons

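BTW: If you use match instead of split, you get back an array of the matched ingredients. If you also want to separate the number from the rest of the string, read the capture groups of each match. With the g flag, match returns only the full matches, so use exec or matchAll for that.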
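A rough sketch of that idea, assuming an environment that supports String.prototype.matchAll (ES2020). The sample string and the names sample, pattern, fullMatches and parts are made up for illustration; only the pattern comes from this answer.

```javascript
// Hypothetical sample input, modeled on the examples listed above.
const sample = '3 Cups sugar 1lb flour 1/3 teaspoons salt';

// Pattern from this answer: group 2 is the quantity, group 3 is the text after it.
const pattern = /((\d+\/\d+|\d+)\s*)([^\d]+)/g;

// With the g flag, match() returns only the full matches.
const fullMatches = sample.match(pattern);
console.log(fullMatches);
// [ '3 Cups sugar ', '1lb flour ', '1/3 teaspoons salt' ]

// matchAll() yields one result per match, including the capture groups.
const parts = Array.from(sample.matchAll(pattern), function (m) {
  return { quantity: m[2], rest: m[3].trim() };
});
console.log(parts);
// quantities: '3', '1', '1/3'; rest: 'Cups sugar', 'lb flour', 'teaspoons salt'
```

The trim() call is only there because each match keeps the space that precedes the next number.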

 

Remove the trailing space after the last +.

Use match instead of split

Swap the two alternatives: put the one with / first, then the plain-number one.

cleanedText = ingredients.match(/(\d+\/\d+)[^\d]+|(\d+)([^\d]+)/g);

var ingredients = '3 strips bacon 6 large mushrooms 1 tablespoon butter 1lb - Simply Balanced 1/2 onion, diced 1 clove garlic, sliced 3 ounces cream cheese 3 ounces blue cheese 1/3 cup bread crumbs';
cleanedText = ingredients.match(/(\d+\/\d+)[^\d]+|(\d+)([^\d]+)/g);
console.log(cleanedText);
 

OK, why isn't your regex matching "crumbs"? Because each alternative in your pattern must end with a space, and since "bread crumbs" is at the very end of the string, there is no space after "crumbs" for it to match.
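To see the effect in isolation, here is a small sketch. The first pattern is the one from the question, applied to the tail of the ingredient string; the shorter string and the two simplified patterns are hypothetical and only there to show the difference a trailing space makes.

```javascript
// Tail of the ingredient string from the question.
const tail = '3 ounces blue cheese 1/3 cup bread crumbs';

// Pattern from the question, including the trailing spaces.
const found = tail.match(/(\d+)([^\d]+) |(\d+\/\d+) [^\d]+ /g);
console.log(found);
// [ '3 ounces blue cheese ', '1/3 cup bread ' ]  ("crumbs" is lost)

// Hypothetical simplified comparison: same idea, with and without the trailing space.
const simple = '2 cups bread crumbs';
const withTrailing = simple.match(/\d+[^\d]+ /g);
const withoutTrailing = simple.match(/\d+[^\d]+/g);
console.log(withTrailing);    // [ '2 cups bread ' ]
console.log(withoutTrailing); // [ '2 cups bread crumbs' ]
```

The engine backtracks until it finds a space to satisfy the pattern, so the match stops after "bread" and the last word is left out.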

Regular expressions are sensitive to whitespace, so remove the spaces from the pattern:

/(\d+)([^\d]+)|(\d+\/\d+)([^\d]+)/g

Also, the split method works differently from what you expect: it returns the text between the matches, not the matches themselves. You probably meant to use the match method:

var sentence = "3 strips bacon 6 large mushrooms 1 tablespoon butter 1lb - Simply Balanced 1/2 onion, diced 1 clove garlic, sliced 3 ounces cream cheese 3 ounces blue cheese 1/3 cup bread crumbs"; 
var ingredients = sentence.match(/(\d+)([^\d]+)|(\d+\/\d+)([^\d]+)/g);

It returns an array in which each element is one full match, i.e. one ingredient.
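A short sketch of the difference between the two methods, using a shortened, hypothetical input and a simplified pattern without capture groups. It also shows a likely reason the original attempt did not split at all: the pattern was wrapped in quotes, so split received a plain string rather than a regular expression.

```javascript
// Shortened, hypothetical input.
const text = '3 strips bacon 6 large mushrooms';

// Quoted pattern, as in the question: split() treats it as a literal
// separator string, which never occurs in the text.
const asString = text.split('/(\d+)([^\d]+) |(\d+\/\d+) [^\d]+ /g');
console.log(asString.length); // 1 (the whole text, unsplit)

// A real regex literal does split, but split() keeps what lies BETWEEN
// the matches, which here is nothing but empty strings.
const betweenMatches = text.split(/\d+[^\d]+/);
console.log(betweenMatches); // [ '', '', '' ]

// match() with the g flag returns the matches themselves.
const matched = text.match(/\d+[^\d]+/g);
console.log(matched); // [ '3 strips bacon ', '6 large mushrooms' ]
```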

Edit: This regex fixes the problems with fractions:

/(\d+)([^(\d\/)]+)|(\d+\/\d+)([^\d]+)/g
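As a quick check of what the edit changes, the sketch below runs both patterns from this answer against a fragment of the ingredient string. The variable names are made up for illustration.

```javascript
// Fragment of the ingredient string that contains a fraction.
const fragment = '1/2 onion, diced 1 clove garlic, sliced';

// First version: the whole-number alternative wins and consumes "1/".
const before = fragment.match(/(\d+)([^\d]+)|(\d+\/\d+)([^\d]+)/g);
console.log(before);
// [ '1/', '2 onion, diced ', '1 clove garlic, sliced' ]

// Edited version: a slash right after the number is no longer accepted by
// the first alternative, so the fraction alternative gets its turn.
const after = fragment.match(/(\d+)([^(\d\/)]+)|(\d+\/\d+)([^\d]+)/g);
console.log(after);
// [ '1/2 onion, diced ', '1 clove garlic, sliced' ]
```

Note that parentheses inside a character class are literal characters, so [^(\d\/)] also excludes "(" and ")". If ingredients may contain parentheses, [^\d\/] would be the narrower class.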