Photo by Freddy Castro on Unsplash

Reading data from a data source is very common when building web applications. CSV is one of the most popular data sources because its simple, line-based format makes the files easy to parse. In this tutorial, we will see how to read the content of a CSV file and then parse its content for further use in the application.

Dataset to use

For the tutorial, we need a sample file with data. I found a CSV file containing the cities in the world. You can download this file at this link.

Let's open the sample file and see what is inside:

Content of the CSV file to read

Our goal is to read this data and convert it to TypeScript objects that we can use inside the application, for example to save them in a database or return them as a JSON response.

Mapping between table headers and the TypeScript types

Based on this mapping, we will have a type similar to this:

type WorldCity = {
  name: string;
  country: string;
  subCountry: string;
  geoNameId: number;
};
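Keep in mind that a TypeScript type only exists at compile time, so nothing checks that parsed data really has this shape. As a minimal sketch, a runtime guard could look like the following. The names WorldCityRecord and isWorldCity are hypothetical and are not part of the tutorial's code or of any library.

```typescript
// Same shape as the WorldCity type above, renamed to keep the sketch standalone.
type WorldCityRecord = {
  name: string;
  country: string;
  subCountry: string;
  geoNameId: number;
};

// Hypothetical runtime guard: returns true only when every property
// exists and has the expected primitive type.
function isWorldCity(value: unknown): value is WorldCityRecord {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.name === "string" &&
    typeof candidate.country === "string" &&
    typeof candidate.subCountry === "string" &&
    typeof candidate.geoNameId === "number"
  );
}

// Placeholder values, made up for the example.
console.log(isWorldCity({ name: "A", country: "B", subCountry: "C", geoNameId: 1 }));   // true
console.log(isWorldCity({ name: "A", country: "B", subCountry: "C", geoNameId: "1" })); // false
```

A guard like this is optional, but it makes a mismatch between the declared type and the actual data visible at runtime.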

Set up the project

Initialize a Node.js project with TypeScript:

mkdir node-csv-read

cd node-csv-read

yarn init -y

yarn add -D typescript ts-node @types/node

yarn tsc --init

touch index.ts

Install csv-parse, the Node package we will use to parse the file:

yarn add csv-parse

Inside the file index.ts, add the following code:

import * as fs from "fs";
import * as path from "path";
import { parse } from 'csv-parse';

type WorldCity = {
  name: string;
  country: string;
  subCountry: string;
  geoNameId: number;
};

(() => {
  const csvFilePath = path.resolve(__dirname, 'files/world-cities_csv.csv');

  const headers = ['name', 'country', 'subCountry', 'geoNameId'];

  const fileContent = fs.readFileSync(csvFilePath, { encoding: 'utf-8' });

  parse(fileContent, {
    delimiter: ',',
    columns: headers,
  }, (error, result: WorldCity[]) => {
    if (error) {
      console.error(error);
      return; // Stop here: there is no result to print when parsing failed.
    }

    console.log("Result", result);
  });
})();

Here, we first define the path of the file to read. In our case, we create a folder named files in the project root directory, then copy the CSV file we downloaded earlier into this folder.

We read the content of the file and use the parse() function from csv-parse to parse the string and return the result as an array of items of type WorldCity.

We also pass two options: delimiter, which defines the character separating the values, and columns, which maps each CSV column to a property of the WorldCity type.
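To make the effect of the columns option more concrete, here is a minimal sketch of the mapping it performs conceptually: each value in a row is paired with the header name at the same position. The helper rowToRecord is hypothetical and is not part of csv-parse, and the real library handles many more cases than this sketch does.

```typescript
// Hypothetical helper, not part of csv-parse: pairs each value in a row
// with the header at the same position, similar in spirit to `columns`.
function rowToRecord(headerNames: string[], row: string[]): Record<string, string> {
  const mapped: Record<string, string> = {};
  headerNames.forEach((header, index) => {
    // In this sketch, a missing value falls back to an empty string.
    mapped[header] = row[index] ?? "";
  });
  return mapped;
}

const headerNames = ["name", "country", "subCountry", "geoNameId"];
// Placeholder row; the values are made up for the example.
const sampleRow = ["Example City", "Example Country", "Example Region", "12345"];

const mappedRow = rowToRecord(headerNames, sampleRow);
console.log(mappedRow);
```

Note that every value in the mapped object is still a string at this point, including geoNameId.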

Let's try to execute the application to see the result:

yarn ts-node index.ts

We got an output similar to this:

The content of the CSV file is parsed and printed in the console.

Apply transformation on parsing

In the printed output, we can see that geoNameId is a string, although the WorldCity type declares it as a number. We need to convert the string to a number while parsing the data.

The csv-parse library provides an option called cast that applies a custom transformation to each column value of each row. The parse function now looks like this:

parse(fileContent, {
    delimiter: ',',
    columns: headers,
    fromLine: 2,
    cast: (columnValue, context) => {
      if (context.column === 'geoNameId') {
        return parseInt(columnValue, 10);
      }

      return columnValue;
    }
  }, (error, result: WorldCity[]) => {
    if (error) {
      console.error(error);
      return; // Stop here: there is no result to print when parsing failed.
    }

    console.log("Result", result);
  });

We check whether the column's name is geoNameId and, if so, parse the value to a number; otherwise, we return the value unchanged. We also added a new option called fromLine, which starts parsing at line 2 and therefore excludes the CSV header from the data to parse. Run the code and see the result:

Data parsed with the correct type.

The number is now parsed as expected.
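One thing the conversion above does not cover is a value that is not numeric, in which case parseInt returns NaN. As a hedged sketch, the transformation can be extracted into a standalone function that rejects such values. The names ColumnContext and castCityValue are hypothetical, and ColumnContext is only a minimal local stand-in for the context object that csv-parse passes to cast.

```typescript
// Minimal local stand-in for the context object passed to `cast`;
// only the `column` property is used in this sketch.
type ColumnContext = { column: number | string };

// Converts geoNameId to a number and throws on non-numeric input.
// Every other column is returned unchanged.
function castCityValue(value: string, context: ColumnContext): string | number {
  if (context.column !== "geoNameId") {
    return value;
  }
  const parsed = parseInt(value, 10);
  if (isNaN(parsed)) {
    throw new Error(`Invalid geoNameId: "${value}"`);
  }
  return parsed;
}

console.log(castCityValue("12345", { column: "geoNameId" })); // 12345
console.log(castCityValue("Paris", { column: "name" }));      // Paris
```

A function with this shape could then be passed as the cast option in place of the inline arrow function.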

Retrieve specific lines in the CSV data

Let's say we want to retrieve only the cities in France. How could we do that? The csv-parse library provides another option called on_record, which lets us filter data at the line level: when the callback returns nothing, the whole line is excluded from the result. The parse function now looks like this:

parse(fileContent, {
    delimiter: ',',
    columns: headers,
    fromLine: 2,
    cast: (columnValue, context) => {
      if (context.column === 'geoNameId') {
        return parseInt(columnValue, 10);
      }

      return columnValue;
    },
    on_record: (line, context) => {
      if (line.country !== 'France') {
        return;
      }

      return line;
    },
  }, (error, result: WorldCity[]) => {
    if (error) {
      console.error(error);
      return; // Stop here: there is no result to print when parsing failed.
    }

    console.log("Result", result);
  });

Run the code and see the result:

Lines filtered to display only the cities of France.

We can see the number of items retrieved dropped drastically.
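If you need to filter on a different country, the callback can be produced by a small factory instead of hard-coding 'France'. The sketch below is only an illustration: keepCountry and the City type are hypothetical names, and the sample data is made up (the geoNameId values are placeholders, not real identifiers). It simulates the filtering on an array so that it runs without the library.

```typescript
type City = {
  name: string;
  country: string;
  subCountry: string;
  geoNameId: number;
};

// Hypothetical factory: builds a callback that keeps records for one
// country and returns undefined for all the others.
function keepCountry(country: string) {
  return (city: City): City | undefined =>
    city.country === country ? city : undefined;
}

// Made-up sample data; the geoNameId values are placeholders.
const sampleCities: City[] = [
  { name: "Paris", country: "France", subCountry: "Île-de-France", geoNameId: 1 },
  { name: "Berlin", country: "Germany", subCountry: "Berlin", geoNameId: 2 },
  { name: "Lyon", country: "France", subCountry: "Auvergne-Rhône-Alpes", geoNameId: 3 },
];

const onlyFrance = keepCountry("France");

// Simulate the filtering: apply the callback, then drop undefined results.
const frenchCities = sampleCities
  .map(onlyFrance)
  .filter((city): city is City => city !== undefined);

console.log(frenchCities.map((city) => city.name)); // [ 'Paris', 'Lyon' ]
```

The function returned by keepCountry("France") behaves like the inline on_record callback shown earlier, so it could be passed as that option instead.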

Wrap up

We saw how to read a CSV file using the csv-parse library, whose options give us flexibility in how we parse the file. csv-parse has many other options; if you want to learn more about them, check out this link.

You can find the source code in the GitHub repository.

Follow me on Twitter or subscribe to my newsletter to not miss the upcoming posts and the tips and tricks I share every week.

Happy to see you soon 😉