javascript - replace string start with >


Keywords:javascript 


Question: 

I'd like to read the string from the text area, remove every row that starts with '>' (regardless of any spaces in front of the '>'), remove every run of one or more '-' characters, and then split the string into an array, with the following result:

[BBAAAACC, CBDAAACC, EBAAAACC, FBAAAACC]

I tried the following regex but could not get that result.

var data = document.getElementById("data").value.replace(/^[>-]+/g,"");
var array = data.split("\n");
console.log(array);
        <textarea rows="15" cols="50" id="data">
            		>1/1AA
            BBAAAACC-------------------------------------------------------------
            ------------------------------------------------------------------------
            -------------
            >2/1B
            CBDAAACC-------------------------------------------------------------
            ------------------------------------------------------------------------
            -------------
            >3/1-CD
            EBAAAACC-------------------------------------------------------------
            ------------------------------------------------------------------------
            -------------
            >4/1-11
            FBAAAACC-------------------------------------------------------------
            ------------------------------------------------------------------------
            ------------- 
            	</textarea>
  

3 Answers: 

Your code is a start; all your array needs is a bit of cleanup:

var array = document
    .getElementById("data")
    .value
    .split("\n")
    .map(l => l.trim() // Removes extra spaces before and after
             .replace(/-/g,'')) // Removes all occurrences of '-'
    .filter(l => l.length && l[0] !== ">"); // Keeps only non-empty lines that do not begin with '>'

console.log(array);
<textarea rows="15" cols="50" id="data">
            		>1/1AA
            BBAAAACC-------------------------------------------------------------
            ------------------------------------------------------------------------
            -------------
            >2/1B
            CBDAAACC-------------------------------------------------------------
            ------------------------------------------------------------------------
            -------------
            >3/1-CD
            EBAAAACC-------------------------------------------------------------
            ------------------------------------------------------------------------
            -------------
            >4/1-11
            FBAAAACC-------------------------------------------------------------
            ------------------------------------------------------------------------
            ------------- 
</textarea>
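
For reference, here is a minimal sketch of the same cleanup run on a plain string instead of the textarea, together with a check of why the attempt in the question changes nothing. The `sample` string and the `cleanLines` helper are assumptions made up for illustration; they are not part of the original code.

```javascript
// Hypothetical sample standing in for the textarea's value.
const sample = [
  "  \t>1/1AA",
  "  BBAAAACC-----",
  "  -----",
  "  >2/1B",
  "  CBDAAACC-----",
  "  ----- ",
  ""
].join("\n");

// The attempt from the question: without the `m` flag, ^ anchors only at the
// very start of the string, and that position holds whitespace rather than
// '>' or '-', so nothing is replaced.
const attempt = sample.replace(/^[>-]+/g, "");

// Hypothetical helper wrapping the split/map/filter chain from this answer.
function cleanLines(text) {
  return text
    .split("\n")
    .map(l => l.trim().replace(/-/g, "")) // trim, then drop every '-'
    .filter(l => l.length && l[0] !== ">"); // drop empty lines and '>' rows
}

const cleaned = cleanLines(sample);

console.log(attempt === sample); // true
console.log(cleaned); // ["BBAAAACC", "CBDAAACC"]
```

Trimming before the filter is what lets the `l[0] !== ">"` test catch rows that have spaces or tabs in front of the '>'.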
 

Your original idea is to remove lines starting with > and any - symbols, but it seems all you need is to extract chunks of 1+ uppercase ASCII letters from the start of each line.

You may use

var data = document.getElementById("data").value.match(/^\s*[A-Z]+/mg).map(x => x.trim())

where the regex, /^\s*[A-Z]+/mg, matches

  • ^ - start of a line
  • \s* - 0+ whitespaces
  • [A-Z]+ - 1 or more uppercase ASCII letters
  • /mg - multiline and global modifiers.
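
The effect of the pieces listed above can be checked on a small string. This is only a sketch; the `sample` value is a shortened, made-up stand-in for the textarea content.

```javascript
// Hypothetical, shortened stand-in for the textarea's value.
const sample = "  >1/1AA\n  BBAAAACC-----\n  -----\n  >2/1B\n  CBDAAACC-----\n";

// Without the m flag, ^ matches only at index 0, where "  >" follows,
// so there is no match at all and match() returns null.
const withoutM = sample.match(/^\s*[A-Z]+/g);

// With the m flag, ^ matches at the start of every line.
// The leading whitespace consumed by \s* is part of each match.
const raw = sample.match(/^\s*[A-Z]+/mg);

// Trimming removes that leading whitespace.
const trimmed = raw.map(x => x.trim());

console.log(withoutM); // null
console.log(raw);      // ["  BBAAAACC", "  CBDAAACC"]
console.log(trimmed);  // ["BBAAAACC", "CBDAAACC"]
```

Because `match()` returns `null` when nothing matches, real code may want to guard the result (for example with `|| []`) before calling `.map()`.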

If you want to make sure the matched letters are followed by -, append a positive lookahead, (?=-), to the end of the pattern.

Note that you still need to trim the leading whitespace matched by \s*, using .map(x => x.trim()) (or .map(function(x) {return x.trim();})). You may also capture the letters, ([A-Z]+), and then use RegExp#exec() in a loop, collecting the match[1] values, but that involves more code.
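
The RegExp#exec() alternative mentioned above could look roughly like this. It is a sketch under assumptions: `sample` is a made-up stand-in for the textarea value, and the variable names are illustrative only.

```javascript
// Hypothetical, shortened stand-in for the textarea's value.
const sample = "  >1/1AA\n  BBAAAACC-----\n  -----\n  >2/1B\n  CBDAAACC-----\n";

// Group 1 captures only the letters, so no trim() is needed afterwards.
// The g flag is required: it makes exec() resume from re.lastIndex on each call.
const re = /^\s*([A-Z]+)(?=-)/mg;

const result = [];
let match;
while ((match = re.exec(sample)) !== null) {
  result.push(match[1]); // the captured run of uppercase letters
}

console.log(result); // ["BBAAAACC", "CBDAAACC"]
```

Every match contains at least one letter, so `lastIndex` always advances and the loop terminates.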

See the JS demo:

var data = document.getElementById("data").value.match(/^\s*[A-Z]+(?=-)/mg).map(x => x.trim());
console.log(data);
<textarea rows="15" cols="50" id="data">
                >1/1AA
        BBAAAACC-------------------------------------------------------------
        ------------------------------------------------------------------------
        -------------
        >2/1B
        CBDAAACC-------------------------------------------------------------
        ------------------------------------------------------------------------
        -------------
        >3/1-CD
        EBAAAACC-------------------------------------------------------------
        ------------------------------------------------------------------------
        -------------
        >4/1-11
        FBAAAACC-------------------------------------------------------------
        ------------------------------------------------------------------------
        ------------- 
</textarea>
 

Use a map and a filter. The other answers use ES6 arrow functions, which do not work in older browsers:

var array = document.getElementById("data").value
  .split("\n")
  .map(function(line) {
    line = line ? line.trim() : "";
    return line.replace(/-/g, ""); // Removes all occurrences of '-'
  })
  .filter(function(line) {
    // Keeps only non-empty lines that do not start with '>'
    return line && line.indexOf(">") !== 0;
  });
console.log(array);
<textarea rows="15" cols="50" id="data">
            		>1/1AA
            BBAAAACC-------------------------------------------------------------
            ------------------------------------------------------------------------
            -------------
            >2/1B
            CBDAAACC-------------------------------------------------------------
            ------------------------------------------------------------------------
            -------------
            >3/1-CD
            EBAAAACC-------------------------------------------------------------
            ------------------------------------------------------------------------
            -------------
            >4/1-11
            FBAAAACC-------------------------------------------------------------
            ------------------------------------------------------------------------
            ------------- 
</textarea>
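
Note that trim(), map() and filter() are themselves ES5 additions, so very old engines (IE8, for example) lack them. If that matters, a plain loop avoids all three. This is a minimal sketch; `extractRows` is a hypothetical helper name and the sample string is made up.

```javascript
// Hypothetical helper using only a for loop and regex replaces.
function extractRows(text) {
  var lines = text.split("\n");
  var result = [];
  for (var i = 0; i < lines.length; i++) {
    var line = lines[i]
      .replace(/^\s+|\s+$/g, "") // regex-based trim of both ends
      .replace(/-/g, "");        // remove every '-'
    if (line && line.charAt(0) !== ">") {
      result.push(line); // keep non-empty rows that do not start with '>'
    }
  }
  return result;
}

// Hypothetical, shortened stand-in for the textarea's value.
var rows = extractRows("  >1/1AA\n  BBAAAACC-----\n  -----\n  >2/1B\n  CBDAAACC-----\n");
console.log(rows); // ["BBAAAACC", "CBDAAACC"]
```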