in message <4NtOcFDtiLTFFwAO@nowhere.nnn>, Jeffrey Spoon
('JeffreySpoon@hotmail.com') wrote:
In message <d6rmk29nb7eef9rdn2n500e45e22d09lij@4ax.com>, David Segall
<david@address.invalid> writes
Jeffrey Spoon <JeffreySpoon@hotmail.com> wrote:
Hello, has anybody seen well-known/good practice CSV parsing algorithms
in Java? I've been googling about but can't see anything suitable so
far. I'm not interested in using library functions, rather implementing
the algorithm myself (or at least learning how to).
Any pointers appreciated, thanks.
Roedy Green has assembled some useful information on this topic.
<http://mindprod.com/jgloss/csv.html>
Thanks, I had a look. The reason I'm asking is because I had a graduate
role interview and they asked this as a question, as in to write one. I
didn't know how to anyway, but looking at Roedy's, just the get() method
is 200 lines. Am I really expected to know this stuff off by
heart?
Thanks to the others who made suggestions as well, I'll get around to them.
Heavens, writing a CSV parser is trivial. It's simply a case of a
StringTokenizer in a for loop:
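import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.StringTokenizer;

public ResultClass parse( InputStream in, String separatorChars)
    throws IOException
{
    ResultClass result = new ResultClass();
    BufferedReader buffy =
        new BufferedReader( new InputStreamReader( in));

    for ( String line = buffy.readLine(); line != null;
          line = buffy.readLine())
    {
        /* note: StringTokenizer treats a run of separators as one,
         * so empty fields are skipped, and it knows nothing about
         * quoted fields */
        StringTokenizer tok =
            new StringTokenizer( line, separatorChars);

        while ( tok.hasMoreTokens())
        {
            String field = tok.nextToken();
            // do something with result and field
        }
    }

    /* consider (and document) whether it's your or the caller's
     * responsibility to close the stream; since you were passed
     * the stream I suggest it's the caller's */
    return result;
}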
As to what that ResultClass object should be, if the first line in your
CSV may be column headers and each value in the first row is distinct then
probably what you want is a vector of maps where the keys of the maps are
the corresponding values from the first line; otherwise I'd probably just
return a vector of vectors.
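
A minimal sketch of those two result shapes, under some assumptions: the
names CsvTables, parseRows and parseWithHeader are made up for
illustration, a single separator character is used, and String.split with
a limit of -1 stands in for StringTokenizer so that empty fields are kept.
Quoted fields are still not handled.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical helper; the names are illustrative only. */
final class CsvTables {

    /** Every line becomes a list of fields: the "vector of vectors". */
    static List<List<String>> parseRows(Reader in, char separator) {
        // Pattern.quote so that separators like '|' are taken literally
        String regex = Pattern.quote(String.valueOf(separator));
        List<List<String>> rows = new ArrayList<>();
        // the caller still owns (and closes) the underlying reader
        BufferedReader buffy = new BufferedReader(in);
        buffy.lines().forEach(line ->
            // a limit of -1 keeps trailing empty fields
            rows.add(new ArrayList<>(Arrays.asList(line.split(regex, -1)))));
        return rows;
    }

    /** The first line supplies the keys: the "vector of maps". */
    static List<Map<String, String>> parseWithHeader(Reader in,
                                                     char separator) {
        List<List<String>> rows = parseRows(in, separator);
        List<Map<String, String>> records = new ArrayList<>();
        if (rows.isEmpty()) {
            return records;
        }
        List<String> header = rows.get(0);
        for (List<String> row : rows.subList(1, rows.size())) {
            Map<String, String> record = new LinkedHashMap<>();
            // assumption: values beyond the header width are ignored,
            // and a short row just leaves its missing keys absent
            int n = Math.min(header.size(), row.size());
            for (int i = 0; i < n; i++) {
                record.put(header.get(i), row.get(i));
            }
            records.add(record);
        }
        return records;
    }
}
```

With this sketch, CsvTables.parseWithHeader(new StringReader(text), ',')
returns one map per data line, and record.get("name") looks a value up by
its column header. LinkedHashMap is used only so that the columns keep
their file order when a record is iterated.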
Obviously you may not want to schlurp a whole CSV file into core memory at
one go; it may be better to produce a parser to which you can add
callbacks/listeners for the fields or patterns you are interested in. But
the general pattern is as given.
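
The callback variant might look roughly like this. It is only a sketch:
FieldListener, StreamingCsvParser and addListener are hypothetical names,
the listener is called once per field, and String.split is again used in
place of StringTokenizer so that column numbers stay aligned when a field
is empty. Only one line is held in memory at a time.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical callback interface; invoked once for every field. */
interface FieldListener {
    void field(int row, int column, String value);
}

/** Hypothetical streaming parser that keeps no parsed data itself. */
final class StreamingCsvParser {
    private final String separatorRegex;
    private final List<FieldListener> listeners = new ArrayList<>();

    StreamingCsvParser(char separator) {
        // quote the separator so regex metacharacters are literal
        this.separatorRegex = Pattern.quote(String.valueOf(separator));
    }

    void addListener(FieldListener listener) {
        listeners.add(listener);
    }

    /** Reads line by line; the caller remains responsible for closing in. */
    void parse(Reader in) {
        BufferedReader buffy = new BufferedReader(in);
        try {
            int row = 0;
            for (String line = buffy.readLine(); line != null;
                 line = buffy.readLine()) {
                // a limit of -1 keeps trailing empty fields
                String[] fields = line.split(separatorRegex, -1);
                for (int col = 0; col < fields.length; col++) {
                    for (FieldListener l : listeners) {
                        l.field(row, col, fields[col]);
                    }
                }
                row++;
            }
        } catch (IOException e) {
            // rethrown unchecked to keep the listener-style API simple
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller interested only in the second column could register
parser.addListener((row, col, value) -> { if (col == 1) use(value); })
where use() is whatever the application needs, and then call
parser.parse(reader). Filtering by pattern rather than by column would go
in the listener in the same way.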
--
simon@jasmine.org.uk (Simon Brooke) http://www.jasmine.org.uk/~simon/
;; Let's have a moment of silence for all those Americans who are stuck
;; in traffic on their way to the gym to ride the stationary bicycle.
;; Rep. Earl Blumenauer (Dem, OR)