The preferred CSV parser for Java?
My project needs to parse a csv file.
Of course I could write "yet-another-csv-parser", but this time I looked a bit further, past my own code, to see whether someone else had already created the preferred CSV API for Java.
My requirements seemed fair:
- Open source:
  - I would like to see the code and have the ability to extend it where needed.
  - If the code is available from a Maven repository, it is easy to download and add to my favorite IDE (IntelliJ IDEA).
  - Distribution to ibiblio with source code would be great, as it eases my work even more.
- Some kind of error reporting:
  - It would be nice if the parser could report a more specific error than just an IOException with no details, for example which line could not be parsed.
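To make the error-reporting wish concrete, here is a minimal sketch of the kind of behaviour I have in mind. The names CsvParseException and SimpleCsvReader are made up for illustration and do not come from any of the libraries mentioned in this post. It assumes comma-separated fields, double quotes around fields, a doubled quote as an escaped quote, and no quoted fields spanning several lines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical exception: still an IOException, but it knows which line failed. */
class CsvParseException extends IOException {
    private final int lineNumber;

    CsvParseException(int lineNumber, String message) {
        super("line " + lineNumber + ": " + message);
        this.lineNumber = lineNumber;
    }

    /** 1-based number of the line that could not be parsed. */
    int getLineNumber() {
        return lineNumber;
    }
}

/** Hypothetical minimal reader, only meant to illustrate line-aware errors. */
class SimpleCsvReader {

    /** Reads every line and parses it, tracking the line number as it goes. */
    static List<String[]> readAll(BufferedReader in) throws IOException {
        List<String[]> rows = new ArrayList<>();
        String line;
        int lineNumber = 0;
        while ((line = in.readLine()) != null) {
            lineNumber++;
            rows.add(parseLine(line, lineNumber));
        }
        return rows;
    }

    /** Splits one line into fields; fails with the line number on an unclosed quote. */
    static String[] parseLine(String line, int lineNumber) throws CsvParseException {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // "" inside quotes is a literal quote
                        i++;
                    } else {
                        inQuotes = false; // closing quote
                    }
                } else {
                    current.append(c); // commas inside quotes are plain data
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (inQuotes) {
            throw new CsvParseException(lineNumber, "unterminated quoted field");
        }
        fields.add(current.toString());
        return fields.toArray(new String[0]);
    }
}
```

A caller that only catches IOException keeps working, while a caller that cares can catch CsvParseException and ask for getLineNumber().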
I found a couple of APIs that I took a closer look at:
http://www.csvreader.com/
- Seemed easy to use, but not available from ibiblio (or any other repository I could find)
- Binary and source code were both available from SourceForge
- No error reporting besides IOException
http://www.mvnrepository.com/artifact/genjava/gj-csv
- Binary and source available from ibiblio.
- No error reporting besides IOException
http://www.mvnrepository.com/artifact/net.sf.opencsv/opencsv
- Only binary available from ibiblio.
- No error reporting besides IOException
I looked at some other APIs as well, but my general observation was that none of them focused on error reporting or validation of the document.
Why doesn't the Java community have a preferred API for CSV parsing?
One explanation could be that every project implements its own, as parsing a CSV file seems like a trivial operation.
Another explanation could be that this kind of work suffers from the Not Invented Here syndrome? :-)
Please tell me which implementation I am missing; it simply must be out there.
7 comments:
A suggestion could be http://flatpack.sourceforge.net/
Flatpack looks promising, but I can't find it in any Maven repository, in contrast to what this thread states.
Do you or anyone else know if it is available from a maven repository?
Check out this one: http://jffp.sourceforge.net/
How about:
String[] fields = line.split(","); // line: one row already read from the file
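For what it's worth, a small sketch of where a plain split starts to hurt. NaiveSplitDemo is a made-up class name, and the sample rows are invented for illustration; only String.split itself is real API.

```java
class NaiveSplitDemo {

    /** The one-liner: split on every comma, quoted or not. */
    static String[] naiveSplit(String line) {
        return line.split(",");
    }

    /** Same split, but a negative limit keeps trailing empty fields. */
    static String[] splitKeepingEmpty(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        // A quoted field containing a comma is cut in two: 3 pieces instead of 2 fields.
        System.out.println(naiveSplit("\"Smith, John\",42").length);

        // Trailing empty fields are silently dropped: 2 pieces instead of 4 fields.
        System.out.println(naiveSplit("a,b,,").length);

        // The negative limit fixes the second problem, but not the first.
        System.out.println(splitKeepingEmpty("a,b,,").length);
    }
}
```

So the one-liner is fine for files you fully control, but quoted fields and empty trailing columns need more than split.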
Just to add to the list, I did a *very* basic one a couple of years ago: http://kasparov.skife.org/csv/ and I know Henri Yandell wrote one somewhere in http://www.osjava.org/ as well. I think another was donated to Jakarta at one point, but no one picked up the ball. That one was a commercial thing and is probably the most robust I know of in the face of bizarro input, but it was much slower than mine.
Hi,
Regarding the comment about FlatPack, I'd like to say that FlatPack will be available in a Maven repository as soon as we complete this release.
In the meantime, feel free to grab a SNAPSHOT at:
http://objectlabkit.sf.net/m1-repo/
under net.sf.flatpack
Thanks
Benoit
Apache Commons has a fairly intelligent CSVParser, and includes a CSVStrategy class that allows one to set delimiters, encapsulators, escape characters, etc., and to handle complex encapsulated fields.
http://commons.apache.org/sandbox/csv/apidocs/org/apache/commons/csv/package-summary.html