Parsing CSV Files

When it comes to parsing CSV files, the approach is largely the same from project to project (hence the need for a standard format, right?):

  1. Read the file
  2. Break each line into an entry in a collection
  3. Iterate through the collection
  4. Create an object or element with attributes based on the data in the current line
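
The four steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the function name `parse_csv_file` is hypothetical, and it assumes the first row of the file holds the column names and that fields are comma-separated.

```php
<?php
/**
 * Hypothetical helper: returns one associative array per data row.
 * Assumes the first row holds the column names.
 */
function parse_csv_file(string $path): array
{
    // Step 1: read the file.
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Unable to open {$path}");
    }

    $header = fgetcsv($handle, 0, ',', '"', '\\');
    if ($header === false || $header === [null]) {
        fclose($handle);
        return []; // Empty file or blank header row.
    }

    $records = [];

    // Steps 2 and 3: break each line into fields and iterate over the lines.
    while (($fields = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        // fgetcsv() returns [null] for a blank line; also skip rows whose
        // field count does not match the header.
        if ($fields === [null] || count($fields) !== count($header)) {
            continue;
        }

        // Step 4: build an element keyed by column name.
        $records[] = array_combine($header, $fields);
    }

    fclose($handle);

    return $records;
}
```

I'm using `fgetcsv()` here rather than splitting on commas by hand so that quoted fields containing commas stay intact.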

Sure, this is a high-level view of how it’s done, and your specific implementation may have finer nuances, but, as I mentioned, I think it’s safe to say that the way we go about parsing a file is the same regardless of what we do with the data once we start reading it.

When you’re working on an implementation of a CSV parser and you’re accepting data from an upload, there are several things that need to be checked, such as the validity of the file type.

For example, if a person uploads an image, you don’t want to proceed with parsing it; if they upload an actual CSV, you obviously want to process the file.

But this can be tricky depending on the operating system.

Parsing CSV Files in PHP: The File Type

Generally speaking, I think the way we go about validating the incoming file type follows a pattern like this:

  1. If the user has permission to upload files
  2. And if the given file in the $_FILES collection is a CSV
  3. Process the CSV file

But one of the problems that can crop up when implementing this across operating systems is that the file type reported for the same CSV file will not always be the same.

For example, let’s say that you have a conditional set up to check the file type like this:
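
A minimal sketch of such a check might look like this. The helper name `is_csv_upload` and the form field name `csv_file` are hypothetical; the only thing taken from PHP itself is the `type` key that each `$_FILES` entry carries.

```php
<?php
/**
 * Hypothetical helper: $file is a single entry from $_FILES,
 * for example $_FILES['csv_file'].
 */
function is_csv_upload(array $file): bool
{
    // 'type' is the MIME type reported by the browser; PHP does not verify it.
    return isset($file['type']) && 'text/csv' === $file['type'];
}
```

In an upload handler, you would call it as `is_csv_upload($_FILES['csv_file'])` before doing any parsing.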

But let’s say you run this script on a Windows-based system where Microsoft Excel is set as the default reader for CSV files.

In that case, the file is typically reported as `application/vnd.ms-excel` rather than `text/csv`, so you need to add another clause to the condition. For example:
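
Here is a sketch of the extended condition, again with a hypothetical helper name (`is_csv_upload_any_platform`). It assumes the two MIME types you care about are `text/csv` and `application/vnd.ms-excel`; your own list may need more entries depending on the browsers and systems involved.

```php
<?php
/**
 * Hypothetical helper: accepts a single $_FILES entry and allows the
 * MIME type that Windows machines with Excel installed tend to report.
 */
function is_csv_upload_any_platform(array $file): bool
{
    if (!isset($file['type'])) {
        return false;
    }

    // The second clause covers CSV files reported as Excel documents.
    return 'text/csv' === $file['type']
        || 'application/vnd.ms-excel' === $file['type'];
}
```

Since the reported type comes from the client, it's best treated as a first filter rather than proof that the contents really are CSV.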

Easy enough, right?

Even so, this particular nuance can cause issues if you’re working on projects where the application will be used across a variety of operating systems.

So if you find that you (or your clients) are unable to upload the file, check the conditional that validates the file type to make sure the type is what you expect it to be.

Join the conversation! 2 Comments

  1. Hi,
    Thanks for the awesome tutorial.

    I have one question, though: will this work for large CSV files? I mean files with millions of rows, because I have done almost the same thing (getting the contents of a CSV into an array) and it breaks after 10K rows.

    Any suggestions?

    • This is usually related to one or more things:

      • Server-side timeout (`max_execution_time`)
      • Max upload size (`upload_max_filesize` and `post_max_size`)

      Check those variables in your `php.ini` and you should be able to raise the limits; however, another strategy may be to break the rows up into smaller batches of several thousand, then import each one so that you know it’s within the available limits.
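
One way to sketch the batching idea is with a generator, so the whole file never has to sit in an array at once. The function name `read_csv_in_batches` and the default batch size of 1000 are assumptions for illustration, not values from the post.

```php
<?php
/**
 * Hypothetical helper: yields arrays of up to $batchSize rows at a time.
 */
function read_csv_in_batches(string $path, int $batchSize = 1000): Generator
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }

    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Unable to open {$path}");
    }

    try {
        $batch = [];

        while (($fields = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
            if ($fields === [null]) {
                continue; // Skip blank lines.
            }

            $batch[] = $fields;

            if (count($batch) === $batchSize) {
                yield $batch;
                $batch = [];
            }
        }

        // Hand back whatever is left over after the last full batch.
        if ($batch !== []) {
            yield $batch;
        }
    } finally {
        fclose($handle);
    }
}
```

Looping with `foreach (read_csv_in_batches($path, 5000) as $batch)` would then let you import each batch before the next one is read.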
