Re: regex capability
On 5 Apr., 14:28, Patricia Shanahan <p...@acm.org> wrote:
On 4/5/2011 2:10 AM, Paul Cager wrote:
On Apr 5, 2:35 am, markspace<-@.> wrote:
On 4/4/2011 1:13 PM, Robert Klemme wrote:
if ( m.matches() ) {
  for (m = number.matcher(m.group(1)); m.find();) {
    int x = Integer.parseInt(m.group());
  }
}
Why re-invent the wheel?
In this case I just wanted to demonstrate the strategy to first check
overall validity of the input and extract the interesting part and
then rip that interesting part apart. Whether a Scanner or
another Matcher is used for the second step wasn't that important to
me. Also, the thread is called "regex capability". :-)
But, of course, your approach using the Scanner is perfectly
compatible with the two step strategy as Patricia also pointed
out. :-)
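
For reference, a minimal self-contained sketch of the two step strategy with two Matchers. The master pattern here is only an assumption made up for the sample input (a run of integers separated by "/"), not the pattern from earlier in the thread, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwoStepMatch {
    // Assumed master pattern: validates the whole input and captures
    // the slash-separated list of integers in group 1.
    static final Pattern MASTER =
            Pattern.compile(".*?\\b(\\d+(?:/\\d+)+)\\b.*");
    // Second step: pick the individual integers out of the captured part.
    static final Pattern NUMBER = Pattern.compile("\\d+");

    static List<Integer> extract(String input) {
        List<Integer> result = new ArrayList<>();
        Matcher m = MASTER.matcher(input);
        if (m.matches()) {
            // Step 2 only ever sees the interesting part, group(1).
            for (m = NUMBER.matcher(m.group(1)); m.find();) {
                result.add(Integer.parseInt(m.group()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [100, 200, 300, 400]
        System.out.println(extract("Support DDR2 100/200/300/400 DDR2 SDRAM"));
    }
}
```

Unlike the Scanner version with the "[^0-9]+" delimiter, this does not pick up the "2" from "DDR2", because the second step never looks outside the captured group.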
import java.io.StringReader;
import java.util.Scanner;

public class ScannerTest {
    public static void main(String[] args) {
        StringReader in = new StringReader(
                "Support DDR2 100/200/300/400 DDR2 SDRAM");
        Scanner scanner = new Scanner(in);
        // Treat every run of non-digit characters as a delimiter.
        scanner.useDelimiter( "[^0-9]+" );
        while( scanner.hasNextInt() ) {
            System.out.println( scanner.nextInt() );
        }
    }
}
(Lightly tested.)
$ java ScannerTest
2
100
200
300
400
2
This is a nice illustration of the case for a strategy I often use in
this sort of situation: combining tools, using each to do the job it
does best.
For example, a regular expression match could pull out the
"100/200/300/400" substring, and a Scanner could extract the integers
from that. More generally, it could be split and then each of the split
results processed some other way.
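
A rough sketch of that combination, assuming the interesting part is a run of integers separated by "/". The pattern and the names RegexThenScanner and extract are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexThenScanner {
    // Assumed shape of the interesting part: integers joined by "/".
    static final Pattern LIST = Pattern.compile("\\b\\d+(?:/\\d+)+\\b");

    static List<Integer> extract(String input) {
        List<Integer> result = new ArrayList<>();
        Matcher m = LIST.matcher(input);
        if (m.find()) {
            // The regex isolates the substring; the Scanner parses it.
            try (Scanner scanner = new Scanner(m.group())) {
                scanner.useDelimiter("/");
                while (scanner.hasNextInt()) {
                    result.add(scanner.nextInt());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [100, 200, 300, 400]
        System.out.println(extract("Support DDR2 100/200/300/400 DDR2 SDRAM"));
    }
}
```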
I generally prefer scanning over splitting in those cases. The
difference might be negligible for this case, but assume that the
original pattern changes (e.g. because we want to allow "@" as a
separator instead of, or in addition to, "/"). Then for the split
approach two patterns need to be changed, while for scanning of
integers (pattern \d+) only the master pattern needs to change. Also,
with scanning it is clear what I want (positively defining the matched
portion), while with splitting it is not so clear (negatively defining
what I do not want, the separator), and that leaves a lot of room for
what is returned from _between_ separators.
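
A small sketch to illustrate the difference, assuming the master pattern has already been relaxed to allow "@" and has handed us "100/200@300". The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScanVersusSplit {
    static final Pattern NUMBER = Pattern.compile("\\d+");

    // Scanning: say what we want (integers); separators do not matter.
    static List<Integer> scan(String part) {
        List<Integer> result = new ArrayList<>();
        for (Matcher m = NUMBER.matcher(part); m.find();) {
            result.add(Integer.parseInt(m.group()));
        }
        return result;
    }

    // Splitting: say what we do not want (the separator) and take
    // whatever is left between separators.
    static List<String> split(String part, String separatorRegex) {
        return Arrays.asList(part.split(separatorRegex));
    }

    public static void main(String[] args) {
        String part = "100/200@300";
        // prints [100, 200, 300]
        System.out.println(scan(part));
        // prints [100, 200@300] because the split pattern was not updated
        System.out.println(split(part, "/"));
    }
}
```

With the stale "/" separator the split version returns "200@300", which would then fail in Integer.parseInt, while the \d+ scan is unaffected.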
Kind regards
robert