Re: Regex doesn't recognize single quote
On 08/01/12 08:41, Roedy Green wrote:
On 7 Jan 2012 11:42:26 GMT, ram@zedat.fu-berlin.de (Stefan Ram) wrote,
quoted or indirectly quoted someone who said :
That is not what a regex is for.
How do you know what it is for?
Regexes are for searching for patterns. Transforming or deleting
characters is much more simply done with a for loop.
How do I know what a regex is for? I am familiar with the API. I have
attempted to use them for various purposes and discovered they were
suitable for some and not for others.
Just use a StringBuilder the length of your String. Then
loop through the chars with charAt. If the character is a
' or \w, ignore it, else append. If it gets complex, use a
switch or if it gets really complicated use a BitSet.
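For what it's worth, the BitSet variant mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: it treats \w as the ASCII set [a-zA-Z0-9_], and the names BitSetFilter and DROP are made up here, not taken from the thread.

```java
import java.util.BitSet;

class BitSetFilter {
    // Lookup table of characters to drop: the apostrophe plus the
    // ASCII word characters [a-zA-Z0-9_] (an assumption about what \w means here).
    private static final BitSet DROP = new BitSet(128);
    static {
        DROP.set('\'');
        DROP.set('_');
        DROP.set('0', '9' + 1); // the upper bound of set(from, to) is exclusive
        DROP.set('a', 'z' + 1);
        DROP.set('A', 'Z' + 1);
    }

    // Copies every character that is not in the DROP table.
    static String scrunch(final String s) {
        final StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            final char c = s.charAt(i);
            if (!DROP.get(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + scrunch("don't stop_42!") + "]"); // prints "[ !]"
    }
}
```

The table only pays off when the set of characters is irregular; for a couple of characters a plain if is easier to read.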
This might be a needless (as far as we know right now)
optimization that bloats the code and reduces its readability.
Such low-level thinking might be required sometimes, but it
does not serve as a general rule. Still, it is nice to know
how it could be done if required.
What is your simpler implementation?
/** remove ' and \w from string
 * @param s string to process
 * @return string without ' or \w
 */
private static String scrunch( final String s )
{
    final Stringbuilder sb = new StringBuilder( s.length() );
    for (int i=0; i<s.length(); i++ )
    {
        char c = s.charAt(i);
        if ( !( c = '\'' || c = '\w' ) )
        {
            sb.append ( c );
        }
    }
    return sb.toString();
}
In most cases it is better to use a StringBuilder to perform replacements,
but in this particular case String.replaceAll() is better. By the way,
\w is not a Java escape sequence; it belongs to the regex Pattern syntax
(although you should already know that, as you say you are familiar with
the API).
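To illustrate the point about \w, here is a small sketch. The class name EscapeDemo and the helper isWordChar are invented for this example; the only library call used is Pattern.matches from java.util.regex.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // char bad = '\w';  // does not compile: javac reports an illegal escape character

    // In Java source the backslash must be doubled, so this literal holds
    // two characters: a backslash and a 'w'. The regex engine then reads \w.
    static final String WORD = "\\w";

    // Hypothetical helper: true if c is a regex word character.
    static boolean isWordChar(final char c) {
        return Pattern.matches(WORD, String.valueOf(c));
    }

    public static void main(String[] args) {
        System.out.println(WORD.length());     // 2
        System.out.println(isWordChar('a'));   // true
        System.out.println(isWordChar('\''));  // false
    }
}
```

So \w only has meaning once the string reaches the regex engine; it cannot be compared against a char directly.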
Anyway, here is a simpler implementation (and one that works, because yours
doesn't):
/** Remove every character that is neither ' nor a word character (\w).
 * @param s string to process
 * @return string containing only ' and \w characters
 */
private static String scrunch( final String s ) {
    return s.replaceAll("[^'\\w]+", "");
}
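One thing worth checking before picking either version is the direction of the filter. The loop described earlier in the thread drops ' and word characters, whereas the negated class [^'\\w] drops everything else. The sketch below puts both side by side, together with a loop equivalent of the regex. The class and method names are made up for illustration, and the loop assumes the default ASCII meaning of \w, that is [a-zA-Z_0-9].

```java
class ScrunchDemo {
    // Drops apostrophes and word characters, keeping everything else.
    static String removeQuoteAndWord(final String s) {
        return s.replaceAll("['\\w]+", "");
    }

    // Keeps only apostrophes and word characters (negated character class).
    static String keepQuoteAndWord(final String s) {
        return s.replaceAll("[^'\\w]+", "");
    }

    // Loop equivalent of keepQuoteAndWord, assuming ASCII word characters.
    static String keepQuoteAndWordLoop(final String s) {
        final StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            final char c = s.charAt(i);
            if (c == '\'' || isAsciiWordChar(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Same set as the default \w: [a-zA-Z_0-9].
    static boolean isAsciiWordChar(final char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9') || c == '_';
    }

    public static void main(String[] args) {
        final String in = "don't stop, Bob_42!";
        System.out.println("[" + removeQuoteAndWord(in) + "]");   // prints "[ , !]"
        System.out.println("[" + keepQuoteAndWord(in) + "]");     // prints "[don'tstopBob_42]"
        System.out.println("[" + keepQuoteAndWordLoop(in) + "]"); // prints "[don'tstopBob_42]"
    }
}
```

Which of the two is wanted depends on the original question; the only difference in the regex is the leading ^ inside the brackets.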