The best I can suggest is a regex that matches the string as a whole, because the problem here is that a local analysis can produce different results from a global one.
My attempt at a solution would be:
^[^"]*(?:"(?:[^"\s]|[^"\s][^"]*[^"\s])?"[^"]*)*$
Example on Rubular. Explanation:
^
- string start
[^"]*
- followed by zero or more characters that are not quotation marks (text outside quotation marks)
(?:...)*
- followed by zero or more of:
"
- opening quote
(?:...|...)?
- optionally, one of:
[^"\s]
- a single character that is neither a quotation mark nor whitespace; or:
[^"\s]
- a character that is neither a quotation mark nor whitespace, followed by
[^"]*
- zero or more characters that are not quotation marks, followed by
[^"\s]
- a character that is neither a quotation mark nor whitespace
"
- closing quote
[^"]*
- zero or more characters that are not quotation marks (text outside quotation marks)
$
- end of string
In plain language: the regex consumes a stretch outside quotation marks, then a stretch inside, then a stretch outside, then a stretch inside, and so on. A quoted stretch can take one of three forms: a) empty - ""
; b) a single non-whitespace character - "a"
; c) a non-whitespace character at each end, with anything in between - "a...b"
.
It should be noted that all this regex tells you is whether the string is valid or invalid: it cannot show you at which character the error occurs.
Update: if what you want is a regex that matches strings containing an error, and tells you where the mistake is, this is the best I could do:
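As a minimal sketch of how the pattern above behaves, here it is wrapped in a small Python function. The name `is_valid` and the sample strings are my own, chosen only for illustration; the pattern itself is unchanged.

```python
import re

# The pattern from the answer, unchanged.
VALID = re.compile(r'^[^"]*(?:"(?:[^"\s]|[^"\s][^"]*[^"\s])?"[^"]*)*$')

def is_valid(text):
    """Return True if every quoted stretch is closed and has no
    whitespace right after the opening or right before the closing quote."""
    return VALID.match(text) is not None

# Hypothetical sample strings, not taken from the question.
samples = [
    'say "hello" and "a"',   # valid
    'empty "" is fine',      # valid: empty quotes are allowed
    'bad " hello"',          # invalid: space after the opening quote
    'bad "hello "',          # invalid: space before the closing quote
    'unclosed "hello',       # invalid: the quote never closes
]
for s in samples:
    print(is_valid(s), repr(s))
```

Note that the result is a plain yes/no: nothing in the match says which quoted stretch caused the failure.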
^[^"]*(?:"(?:[^"\s]|[^"\s][^"]*[^"\s])?"[^"]*)*("(?:\s[^"]*|[^"]*\s)")[^"]*(?:"(?:[^"\s]|[^"\s][^"]*[^"\s])?"[^"]*)*$
Example on jsFiddle. This "monstrosity" boils down to:
^ regex_original ("(?:\s[^"]*|[^"]*\s)") regex_original $
That is: "Match something that is correct, followed by something that is incorrect, followed by something that is correct". It will detect one and only one error of this kind: if the string has two or more errors, or a quote that opens but never closes, etc., the regex will not catch it.
I believe that with a little more effort we could improve this somewhat, but we are getting to the point where regex is no longer the most suitable tool for the job...
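A rough sketch of the error-locating regex in Python, built from the same three pieces. The names `OK_PART`, `BAD_QUOTE` and `find_bad_quote` are my own; the assembled pattern is the same as the long one above. The position of the error comes from the single capturing group.

```python
import re

# "regex_original" without its anchors: any number of well-formed stretches.
OK_PART = r'[^"]*(?:"(?:[^"\s]|[^"\s][^"]*[^"\s])?"[^"]*)*'
# A quoted stretch that starts or ends with whitespace (the error).
BAD_QUOTE = r'("(?:\s[^"]*|[^"]*\s)")'
ERROR = re.compile('^' + OK_PART + BAD_QUOTE + OK_PART + '$')

def find_bad_quote(text):
    """Return (index, offending_text) when the string contains exactly one
    badly padded quoted stretch and is otherwise well formed; else None."""
    m = ERROR.match(text)
    if m is None:
        return None
    return m.start(1), m.group(1)

print(find_bad_quote('say "ok" then " bad" end'))  # (14, '" bad"')
print(find_bad_quote('all "good" here'))           # None: nothing wrong
print(find_bad_quote('" a" and "b "'))             # None: two errors
```

A result of `None` is therefore ambiguous: it means either "no error" or "more than one error, or an unclosed quote", so it would need to be combined with the validity check to tell those cases apart.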
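To illustrate that last point, here is one possible way to do the same check without a regex: a single pass that pairs up the quotes and reports every problem it finds. This is only a sketch under my own assumptions (the function name, the `'padding'` and `'unclosed'` labels, and the use of `str.isspace()` as the whitespace test are all hypothetical choices, not something from the question).

```python
def find_quote_errors(text):
    """Scan once, pairing quotes in order, and return a list of
    (index_of_opening_quote, kind) for every problem found."""
    errors = []
    start = None  # index of the currently open quote, if any
    for i, ch in enumerate(text):
        if ch != '"':
            continue
        if start is None:
            start = i  # this quote opens a stretch
        else:
            inner = text[start + 1:i]  # text between the two quotes
            # Empty quotes are fine; otherwise both ends must be non-space.
            if inner and (inner[0].isspace() or inner[-1].isspace()):
                errors.append((start, 'padding'))
            start = None  # this quote closed the stretch
    if start is not None:
        errors.append((start, 'unclosed'))
    return errors

print(find_quote_errors('" a" and "b " and "c'))
# [(0, 'padding'), (9, 'padding'), (18, 'unclosed')]
```

Unlike the regex, this reports several errors at once and also flags a quote that opens but never closes.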
@Kyllopardiun In my opinion, the example in the question is correct, not least because no false positive would occur if the
e
were inside the quotes. – mgibsonbr
@Kyllopardiun No. The sentence is like that, with the "e" outside the quotes.
– Dinho
@mgibsonbr Exactly.
– Dinho