Java: Read .csv file and save into arrays

I get an exception while trying to read a .csv file and save each column into an array. The program may look long, but it isn't; it just uses 15 different arrays.

This is the exception: "Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 2", thrown on the line

department[i] = dataArray[2];

Is there something that I could do?

      BufferedReader CSVFile = 
            new BufferedReader(new FileReader("Sub-Companies.csv"));

      String dataRow = CSVFile.readLine();
      // Read the number of the lines in .csv file 
      // i = row of the .csv file
      int i = 0; 
      while (dataRow != null){
          i++;
          dataRow = CSVFile.readLine();

        }
      System.out.println(i);
      // Close the file once all data has been read.
      CSVFile.close();

      // End the printout with a blank line.
      System.out.println();

      // Save into arrays
      customer_id = new String[i];
      company_name = new String[i];
      department = new String[i];
      employer = new String[i];
      country = new String[i];
      zipcode = new String[i];
      address = new String[i];
      city = new String[i];
      smth1 = new String[i];
      smth2 = new String[i];
      phone_no1 = new String[i];
      phone_no2 = new String[i];
      email = new String[i];
      website = new String[i];
      customer_no = new String[i];

      // Read first line.
      // The while checks to see if the data is null. If 
      // it is, we've hit the end of the file. If not, 
      // process the data.
      int j;
      int counter;
      i = 0;

      // Read the file again to save the data into arrays
      BufferedReader CSV = 
            new BufferedReader(new FileReader("Sub-Companies.csv"));

      String data = CSV.readLine();

      while (data != null){
          String[] dataArray = data.split(";");
          for (String item:dataArray) {
            customer_id[i] = dataArray[0];
            company_name[i] = dataArray[1];
            department[i] = dataArray[2];
            employer[i] = dataArray[3];
            country[i] = dataArray[4];
            zipcode[i] = dataArray[5];
            address[i] = dataArray[6];
            city[i] = dataArray[7];
            smth1[i] = dataArray[8];
            smth2[i] = dataArray[9];
            phone_no1[i] = dataArray[10];
            phone_no2[i] = dataArray[11];
            email[i] = dataArray[12];
            website[i] = dataArray[13];
            customer_no[i] = dataArray[14];
            }


          //System.out.print(address[i] + "\n"); 
          data = CSV.readLine(); // Read next line of data.
          i++;
      }

Thank you in advance!

A sample line is "E3B3C5EB-B101-4C43-8E0C-ADFE76FC87FE;"Var Welk" Inh. Kar;NULL;NULL;DE;16278;Rotr 3;Angermünde;NULL;NULL;03331/354348-0;0343331/364548-15;info@aalls.com;http://www.adss.com;ipo241", but other lines may differ (they can be shorter or longer).

Answers


The best approach is to use an ArrayList<String> and, if you need to, convert it to an array afterwards.

Your problem is that you size the arrays by counting the lines in the file, but you fill them from the result of split(";"), so the number of values that split(";") returns for a line may not match the indexes you read from.
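As a rough sketch of the ArrayList approach, a column can be collected in a single pass without counting the lines first. The class name ColumnLists and the readColumn helper are made up for illustration and are not part of the question's code. The sketch assumes ';' as the separator and stores null when a line has too few fields:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
class ColumnLists {

    // Reads all lines from the source and returns the values of one column.
    // Lines with too few fields contribute null instead of throwing.
    static String[] readColumn(Reader source, int col) {
        List<String> values = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // limit -1 keeps trailing empty fields
                String[] fields = line.split(";", -1);
                values.add(col < fields.length ? fields[col] : null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Convert to an array only at the end, once the size is known
        return values.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Two sample lines; the second one has no third field
        String[] department =
            readColumn(new StringReader("id1;Acme;Sales\nid2;Initech"), 2);
        for (String value : department) {
            System.out.println(value);
        }
    }
}
```

For the real file, a FileReader for "Sub-Companies.csv" would be passed in place of the StringReader.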


This should do the trick: it basically creates a matrix representation of the csv file.

LinkedList<String[]> rows = new LinkedList<String[]>();
String dataRow;
// Count the lines while collecting them
// i = number of rows in the .csv file
int i = 0;
while ((dataRow = CSVFile.readLine()) != null){
    i++;
    // The sample data in the question is separated by ';'
    rows.addLast(dataRow.split(";"));
}

String[][] csvMatrix = rows.toArray(new String[rows.size()][]);

Values are then accessed as csvMatrix[row][col].

When accessing a column, check that the column number you are trying to access is in range:

if(col < csvMatrix[row].length)
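To avoid repeating that check everywhere, it could be wrapped in a small accessor. This is only a sketch: the CsvMatrix class, the cell method and the fallback parameter are hypothetical names, not something from the code above.

```java
// Hypothetical helper class, for illustration only.
class CsvMatrix {

    // Returns matrix[row][col], or the fallback when the row or
    // column index is outside the available data.
    static String cell(String[][] matrix, int row, int col, String fallback) {
        if (row < 0 || row >= matrix.length) {
            return fallback;
        }
        String[] fields = matrix[row];
        if (fields == null || col < 0 || col >= fields.length) {
            return fallback;
        }
        return fields[col];
    }

    public static void main(String[] args) {
        String[][] csvMatrix = {
            {"id1", "Acme", "Sales"},
            {"id2", "Initech"}          // short row: no third field
        };
        System.out.println(cell(csvMatrix, 0, 2, "n/a"));
        System.out.println(cell(csvMatrix, 1, 2, "n/a"));
    }
}
```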

There are several problems with your code. The exception is caused by the fact that one of the lines doesn't contain enough of the ';' separated values.

The strange thing about your code is this bit:

  for (String item:dataArray) {
    customer_id[i] = dataArray[0];

This simply means you repeat the same assignments 15 times (just remove the for (String item: ...)).

If I were you, I'd do the following:

Create a class, something like this:

public class Customer {
    private String customerId;
    private String companyName;

    // ...
    public static Customer create(final String... args) {
        if (args.length != 15) {
            return null; // or throw an exception
        }
        final Customer rv = new Customer();
        rv.setCustomerId(args[0]);
        rv.setCompanyName(args[1]);
        // ...
        return rv;
    }

    public String getCustomerId() {
        return customerId;
    }

    public void setCustomerId(final String customerId) {
        this.customerId = customerId;
    }

    public String getCompanyName() {
        return companyName;
    }

    public void setCompanyName(final String companyName) {
        this.companyName = companyName;
    }
}

Use a collection (as suggested in the post above):

    BufferedReader csv = new BufferedReader(new FileReader("Sub-Companies.csv"));
    List<Customer> customers = new LinkedList<Customer>();

    String data;
    while ((data = csv.readLine()) != null){
        Customer customer = Customer.create(data.split(";"));
        if (customer != null) {
            customers.add(customer);
        }
    }

If you require array instead of collection, you can do:

Customer[] arr = customers.toArray(new Customer[customers.size()]);

Use a library to read the file. You can try http://opencsv.sourceforge.net/ for example.


department[i] = dataArray[2];  

The exception means that dataArray does not have that many elements (i.e. 3). If you want to parse your CSV file, you can make your life easier by requiring a placeholder for any missing element. What I mean is that you can have a record like:

a;b;c;d;e;f;g;h;j

where each character represents the value of one of your columns, but when elements are missing the format must be a;;;;;f;g;h;j and not a;f;g;h;j.

This is not an unusual expectation but the norm in CSV files. It would simplify your code a lot and avoid the array index exception, as each line will always have the expected columns.
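One detail worth checking when relying on placeholders: String.split with a single argument drops trailing empty strings, so a line whose last columns are empty still comes back shorter than expected. Passing a negative limit keeps them. A small sketch (SplitDemo and fields are illustrative names only):

```java
// Hypothetical demo class, for illustration only.
class SplitDemo {

    // Splits a line on ';' and keeps trailing empty fields (limit -1).
    static String[] fields(String line) {
        return line.split(";", -1);
    }

    public static void main(String[] args) {
        // Empty fields in the middle are kept by plain split: prints 9
        System.out.println("a;;;;;f;g;h;j".split(";").length);

        // Trailing empty fields are dropped by plain split: prints 2
        System.out.println("a;b;;;".split(";").length);

        // With limit -1 the trailing empty fields survive: prints 5
        System.out.println(fields("a;b;;;").length);
    }
}
```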


Using ArrayList:

public ArrayList<ArrayList<String>> parseDataFromCsvFile()
{
     ArrayList<ArrayList<String>> dataFromFile=new ArrayList<ArrayList<String>>();
     try{
         Scanner scanner=new Scanner(new FileReader("CSV_FILE_PATH"));

         // Read line by line; each line is split on ';' below
         while(scanner.hasNextLine())
         {
            String dataInRow=scanner.nextLine();
            String []dataInRowArray=dataInRow.split(";");
            ArrayList<String> rowDataFromFile=new ArrayList<String>(Arrays.asList(dataInRowArray));
            dataFromFile.add(rowDataFromFile);
         }
         scanner.close();
     }catch (FileNotFoundException e){
        e.printStackTrace();
     }
     return dataFromFile;
}

Calling the method (displaying the CSV content):

ArrayList<ArrayList<String>> csvFileData=parseDataFromCsvFile();

public void printCsvFileContent(ArrayList<ArrayList<String>> csvFileData)
{
    for(ArrayList<String> rowInFile:csvFileData)
    {
        System.out.println(rowInFile);
    }
}

If you want to load data into a Parameterized JUnit test using Gradle (instead of Maven), here is the method:

// import au.com.bytecode.opencsv.CSVReader;
@Parameters(name = "{0}: {1}: {2}")
public static Iterable<String[]> loadTestsFromFile2() {
    String separator = System.getProperty("file.separator");
    File tFile = loadGradleResource( System.getProperty("user.dir") + 
        separator +  "build" + separator + "resources" + separator +  "test" + 
            separator + "testdata2.csv" );
    List<String[]> rows = null;
    if ( tFile.exists() ) {
        CSVReader reader = null;
        try {
            reader = new CSVReader( new FileReader( tFile ), ',' );
            rows = reader.readAll();
        } catch (FileNotFoundException e) {
                e.printStackTrace();
        } catch (IOException e) {
                e.printStackTrace();
        }   
    }
    staticlogger.info("Finished loadTestsFromFile2()");
    return rows;
} 

Please check if java.util.StringTokenizer helps

Example:

StringTokenizer tokenizer = new StringTokenizer(inputString, ";");
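A caveat before relying on it for column positions: StringTokenizer never returns empty tokens, so consecutive delimiters collapse and the remaining values shift to the left. A minimal sketch that shows this (TokenizerDemo and tokens are illustrative names only):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, for illustration only.
class TokenizerDemo {

    // Collects all tokens of a ';'-separated line into a list.
    static List<String> tokens(String line) {
        List<String> result = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line, ";");
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        // The empty middle field is skipped, so only two tokens come back
        System.out.println(tokens("a;;c"));
    }
}
```

If empty columns can occur in the file, split(";", -1) is probably the safer choice for fixed column positions.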

Manual: StringTokenizer docs

