Java CSV parsing with Apache Commons CSV parser 

June 21, 2011 15:41:51    Last update: June 22, 2011 11:33:36
Demo code for CSV parsing with Apache Commons CSV parser.
  1. Java code:
    import java.io.*;
    import org.apache.commons.csv.CSVParser;
    import org.apache.commons.csv.CSVStrategy;
    
    public class CSVParseDemo {
        public static void main(String[] args) throws IOException {
            if (args.length < 1) {
                System.out.println("Usage: java CSVParseDemo <csv_file>");
                return;
            }
    
            CSVParser parser = new CSVParser(new FileReader(args[0]), CSVStrategy.EXCEL_STRATEGY);
            String[] values = parser.getLine();
            while (values != null) {
                printValues(parser.getLineNumber(), values);
                values = parser.getLine();
            }
        }
    
        private static void printValues(int lineNumber, String[] as) {
            System.out.println("Line " + lineNumber + " has " + as.length + " values:");
            for (String s: as) {
                System.out.println("\t|" + s + "|");
            }
            System.out.println();
        }
    }
    

  2. Test with a simple CSV file:
    psmith01,CLASS2B,Peter Smith 1,YEAR2,1,N,ADVANCED,STAFF,1,Y,Y
    smehta,CLASS3G,Smeeta Mehta,LOCAL,1,Y,STANDARD,PUPIL,2.1,N,Y
    

    Result:
    Line 1 has 11 values:
    	|psmith01|
    	|CLASS2B|
    	|Peter Smith 1|
    	|YEAR2|
    	|1|
    	|N|
    	|ADVANCED|
    	|STAFF|
    	|1|
    	|Y|
    	|Y|
    
    Line 2 has 11 values:
    	|smehta|
    	|CLASS3G|
    	|Smeeta Mehta|
    	|LOCAL|
    	|1|
    	|Y|
    	|STANDARD|
    	|PUPIL|
    	|2.1|
    	|N|
    	|Y|
    

    The parser worked correctly.

  3. Test with a more complicated CSV file:
    "psmith01 abc", "CLASS2B            "    , " Peter, Smith 1", "\", YEAR2 \""
    " smehta ' \", \\, "   ,     "CLASS3G \\"
    " smehta ' \", \\, "   ,     "CLASS3G \"
    

    Result:
    Line 1 has 4 values:
    	|psmith01 abc|
    	|CLASS2B            |
    	| Peter, Smith 1|
    	|", YEAR2 "|
    
    Line 2 has 2 values:
    	| smehta ' ", \\, |
    	|CLASS3G \\|
    
    Exception in thread "main" java.io.IOException: (startline 2)eof reached before encapsulated token finished
    	at org.apache.commons.csv.CSVParser.encapsulatedTokenLexer(CSVParser.java:510)
    	at org.apache.commons.csv.CSVParser.nextToken(CSVParser.java:365)
    	at org.apache.commons.csv.CSVParser.getLine(CSVParser.java:239)
    	at CSVParseDemo.main(CSVParseDemo.java:16)
    

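    The third line is invalid input (its last quoted value is never closed), but aborting the whole parse with a Java IOException seems heavy-handed. Also, the parser does not unescape a backslash: \" is reduced to ", but \\ comes through unchanged as two backslashes.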
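
To make the backslash point concrete, here is a minimal sketch, independent of Commons CSV, of the unescaping one might have expected for the values above. The class and method names (BackslashUnescapeSketch, unescape) are hypothetical, and the rule itself (a backslash followed by \ or " yields that character) is an assumption about the desired behavior, not something the library documents.

```java
// Hypothetical helper for illustration; not part of Apache Commons CSV.
final class BackslashUnescapeSketch {

    // Assumed rule: a backslash followed by '\' or '"' yields that character;
    // any other backslash is kept literally.
    static String unescape(String field) {
        StringBuilder out = new StringBuilder(field.length());
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            if (c == '\\' && i + 1 < field.length()) {
                char next = field.charAt(i + 1);
                if (next == '\\' || next == '"') {
                    out.append(next);
                    i++; // skip the escaped character
                    continue;
                }
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Field content as typed in the CSV file: CLASS3G \\
        System.out.println("|" + unescape("CLASS3G \\\\") + "|");
    }
}
```

Under this assumed rule, the second value of line 2 would come out with a single trailing backslash instead of two.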

  4. Put a line break inside the second value:
    "One", "Two
    ", "Three"
    

    Result:
    Line 2 has 3 values:
    	|One|
    	|Two
    |
    	|Three|
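
In the result above, the line break stays inside the second value and the two physical lines are reported as one record. The following is a rough sketch of how a parser can tell a record boundary from a line break inside quotes. It is not the Commons CSV implementation; the class and method names (QuotedNewlineSketch, splitRecords) are hypothetical, and escaped quotes are deliberately ignored to keep it short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch for illustration; not the Commons CSV implementation.
final class QuotedNewlineSketch {

    // Splits text into records at '\n', except where the newline falls
    // inside a double-quoted value. Escaped quotes are not handled here.
    static List<String> splitRecords(String text) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char c : text.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes; // entering or leaving a quoted value
            }
            if (c == '\n' && !inQuotes) {
                records.add(current.toString()); // record boundary
                current.setLength(0);
            } else {
                current.append(c); // includes newlines inside quotes
            }
        }
        if (current.length() > 0) {
            records.add(current.toString()); // last record without a trailing newline
        }
        return records;
    }

    public static void main(String[] args) {
        String text = "\"One\", \"Two\n\", \"Three\"\n";
        System.out.println(splitRecords(text).size() + " record(s)");
    }
}
```

For the two-line input from this step, the sketch yields a single record whose text still contains the line break.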
    
