Specifically when does ^ mean "match start" and when does it mean "not the following" in regular expressions? From the Wikipedia article and other references, I've concluded it means the former a...
JavaScript's RegExp() allows you to specify a multi-line mode (m) which changes the behavior of ^ and $. ^ represents the start of the current line in multi-line mode, otherwise the start of the string. $ represents the end of the current line in multi-line mode, otherwise the end of the string. For example, this allows you to match something like semicolons at the end of a line where the next ...
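To make the effect of multi-line mode concrete, here is a minimal sketch in Python rather than JavaScript. It assumes Python's re.MULTILINE flag as the counterpart of the m flag, and the sample text is made up for illustration.

```python
import re

# Hypothetical sample: three lines, two of which end in a semicolon.
text = "a = 1;\nb = 2\nc = 3;"

# Without MULTILINE, $ only matches at the end of the whole string.
single = re.findall(r";$", text)

# With MULTILINE, $ also matches at the end of every line.
multi = re.findall(r";$", text, flags=re.MULTILINE)

# ^ changes the same way: it now matches at the start of every line.
starts = re.findall(r"^\w", text, flags=re.MULTILINE)

print(single, multi, starts)
```

Only the flag differs between the first two calls, so any difference in the results comes from how $ is interpreted.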
Normally the dot matches any character except newlines. So if .* isn't working, set the "dot matches newlines, too" option (or use (?s).*). If you're using a JavaScript engine that predates ES2018, which added the s ("dotAll") flag, try [\s\S]*. This means "match any number of characters that are either whitespace or non-whitespace" - effectively "match any string". Another option that only works for JavaScript (and is ...
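The three approaches can be compared side by side. This is a small sketch in Python, assuming its re.DOTALL flag stands in for the "dot matches newlines" option; the two-line sample string is hypothetical.

```python
import re

# Hypothetical two-line input.
text = "first line\nsecond line"

# By default the dot stops at the newline.
default = re.match(r".*", text).group(0)

# DOTALL (or the inline (?s) form) lets the dot cross line breaks.
dotall = re.match(r".*", text, flags=re.DOTALL).group(0)
inline = re.match(r"(?s).*", text).group(0)

# [\s\S] matches whitespace or non-whitespace, i.e. any character,
# without needing any flag.
char_class = re.match(r"[\s\S]*", text).group(0)

print(repr(default), repr(dotall))
```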
That regex "\\s*,\\s*" means: \s* (any number of whitespace characters), then a comma, then \s* (any number of whitespace characters again). It will split on commas and consume any spaces on either side.
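As a quick check of that reading, here is a sketch using Python's re.split. The doubled backslashes in the quoted pattern are string-literal escaping; a Python raw string needs only single backslashes. The input string is invented for the example.

```python
import re

# Hypothetical input with uneven spacing around the commas.
line = "apple ,  banana,cherry , date"

# Split on a comma together with any whitespace on either side of it.
parts = re.split(r"\s*,\s*", line)

print(parts)
```

Whitespace at the very start or end of the whole string is not touched by this pattern, since it only consumes spaces adjacent to a comma.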
I'm reading the regular expressions reference and I'm thinking about the ? and ?? characters. Could you explain their usefulness with some examples? I don't understand them well enough. Thank you.
It indicates that the subpattern is a non-capturing subpattern. That means whatever is matched in (?:\w+\s), even though it's enclosed by (), won't appear in the list of matches; only (\w+) will. You're still looking for a specific pattern (in this case, a single whitespace character following at least one word character), but you don't care what's actually matched.
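A short sketch in Python shows the difference between the two kinds of group. It assumes the full pattern is (?:\w+\s)(\w+), which is how the two pieces quoted above would fit together; the input string is hypothetical.

```python
import re

text = "hello world"

# Non-capturing first group: it must match, but it is not reported.
non_capturing = re.search(r"(?:\w+\s)(\w+)", text)

# Same pattern with an ordinary capturing first group, for comparison.
capturing = re.search(r"(\w+\s)(\w+)", text)

print(non_capturing.groups())  # only the (\w+) group
print(capturing.groups())      # both groups
```

In both cases the overall match (group 0) is the same; only the list of captured groups changes.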
Now, when the regex engine tries to match against aaaaaaaab, the .* will again consume the entire string. However, since the engine will have reached the end of the string and the pattern is not yet satisfied (the .* consumed everything but the pattern still has to match b afterwards), it will backtrack, one character at a time, and try to match b.
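The individual backtracking steps are not visible from the outside, but their result is. The sketch below uses Python and assumes the pattern under discussion is .*b, with a capturing group added so that what .* ends up holding can be inspected. The second input and the lazy variant are additions for contrast only.

```python
import re

text = "aaaaaaaab"

# .* first grabs the whole string, then gives back one character
# so that the trailing b in the pattern can match.
greedy = re.match(r"(.*)b", text)

# With two b's, greedy .* backs off only as far as the LAST b,
# while the lazy .*? stops at the FIRST one.
two_bs = "aaabaaab"
greedy_two = re.match(r"(.*)b", two_bs).group(1)
lazy_two = re.match(r"(.*?)b", two_bs).group(1)

print(greedy.group(1), greedy_two, lazy_two)
```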
Parentheses in regular expressions define groups, which is why you need to escape the parentheses to match the literal characters. So to modify the groups just remove all of the unescaped parentheses from the regex, then isolate the part of the regex that you want to put in a group and wrap it in parentheses. Groups are evaluated from left to right so if you want something to be in the second ...
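Here is a small illustration in Python of escaped versus unescaped parentheses and of left-to-right group numbering. The pattern and input are made up for the example.

```python
import re

# Hypothetical input containing literal parentheses.
text = "(42) answer"

# \( and \) match literal parentheses; the unescaped pairs define groups.
# Groups are numbered by the position of their opening parenthesis.
m = re.match(r"\((\d+)\) (\w+)", text)

first = m.group(1)   # digits inside the literal parentheses
second = m.group(2)  # the word that follows

print(first, second)
```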
By putting ^ at the beginning of your regex and $ at the end, you ensure that no other characters are allowed before or after your regex. For example, the regex [0-9] matches the strings "9" as well as "A9B", but the regex ^[0-9]$ only matches "9".
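The same two patterns can be checked directly. This is a minimal sketch with Python's re.search, which looks for a match anywhere in the string unless the pattern itself is anchored.

```python
import re

# Unanchored: a digit anywhere in the string is enough.
loose_9 = re.search(r"[0-9]", "9") is not None
loose_a9b = re.search(r"[0-9]", "A9B") is not None

# Anchored: the whole string must be exactly one digit.
strict_9 = re.search(r"^[0-9]$", "9") is not None
strict_a9b = re.search(r"^[0-9]$", "A9B") is not None

print(loose_9, loose_a9b, strict_9, strict_a9b)
```

One Python-specific caveat: $ also matches just before a single trailing newline, so \Z or re.fullmatch is the stricter choice when that matters.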
For reference, from regular-expressions.info/dot.html: "JavaScript and VBScript do not have an option to make the dot match line break characters. In those languages, you can use a character class such as [\s\S] to match any character. This character matches a character that is either a whitespace character (including line break characters), or a character that is not a whitespace character ...