Photo by Lance Anderson / Unsplash

Reading data from a data source is very common when building web applications. Excel is widely used among the many data sources because of how easily the data are formatted inside, making it easy to parse these files.

In this tutorial, we will see how to read the content of an Excel file and then parse his content for further usage in the application.

Dataset to use

For the tutorial, we need a sample file with data. I found an Excel file containing the Hockey players of the USA and Canada for the 2018 Olympic Games. You can download this file at this link.

Let's open the sample file and see what is inside:

Content of the Excel file to read.
Content of the Excel file to read.

So here, our goal is to read these data and convert them to a Typescript object to use inside the application, like saving in a database or returning it as a JSON response.

Excel Column Typescript property Typescript type
ID id number
Team team M or W
Country country Canada or USA
NameF firstName string
NameL lastName string
Weight weight number
Height height number
DOB dateOfBirth string (YYY-MM-DD)
Hometown hometown string
Prov province string
Pos position enum
Age age number
HeightFt heightFt number
Htln htln number
BMI bmi number

From the table above, the type Player will look like this:

type Team = 'M' | 'W';
type Country = 'Canada' | 'USA';
type Position = 'Goalie' | 'Defence' | 'Forward';

type Player = {
  id: number;
  team: Team;
  country: Country;
  firstName: string;
  lastName: string;
  weight: number;
  height: number;
  dateOfBirth: string; // (YYY-MM-DD)
  hometown: string;
  province: string;
  position: Position;
  age: number;
  heightFt: number;
  htln: number;
  bmi: number;
};

Set up the project

Initialize a Node.js project with Typescript

mkdir node-excel-read

cd node-excel-read

yarn init -y

yarn add -D typescript ts-node @types/node

yarn tsc --init

touch index.ts

Download the excel file below and copy it into the project directory; it is the file we will read the content:

Install the Node package to use for reading the file called exceljs.

yarn add exceljs

Add the code below to the file index.ts

import * as path from 'path';
import Excel from 'exceljs';

const filePath = path.resolve(__dirname, 'olympic-hockey-player.xlsx');

type Team = 'M' | 'W';
type Country = 'Canada' | 'USA';
type Position = 'Goalie' | 'Defence' | 'Forward';

type Player = {
  id: number;
  team: Team;
  country: Country;
  firstName: string;
  lastName: string;
  weight: number;
  height: number;
  dateOfBirth: string; // (YYY-MM-DD)
  hometown: string;
  province: string;
  position: Position;
  age: number;
  heightFt: number;
  htln: number;
  bmi: number;
};

const getCellValue = (row:  Excel.Row, cellIndex: number) => {
  const cell = row.getCell(cellIndex);
  
  return cell.value ? cell.value.toString() : '';
};

const main = async () => {
  const workbook = new Excel.Workbook();
  const content = await workbook.xlsx.readFile(filePath);

  const worksheet = content.worksheets[1];
  const rowStartIndex = 4;
  const numberOfRows = worksheet.rowCount - 3;

  const rows = worksheet.getRows(rowStartIndex, numberOfRows) ?? [];

  const players = rows.map((row): Player => {
    return {
      // @ts-ignore
      id: getCellValue(row,1),
      // @ts-ignore
      team: getCellValue(row, 2),
      // @ts-ignore
      country: getCellValue(row, 3),
      firstName: getCellValue(row, 4),
      lastName: getCellValue(row, 5),
      // @ts-ignore
      weight: getCellValue(row, 6),
      height: +getCellValue(row, 7),
      dateOfBirth: getCellValue(row, 8), // (YYY-MM-DD)
      hometown: getCellValue(row, 9),
      province: getCellValue(row, 10),
      // @ts-ignore
      position: getCellValue(row, 11),
      age: +getCellValue(row, 12),
      heightFt: +getCellValue(row, 13),
      htln: +getCellValue(row, 14),
      bmi: +getCellValue(row, 15),
    }
  });

  console.log(players);
};

main().then();

Save and run the code with the command yarn start

Display the content of the Excel file in the console.
Display the content of the Excel file in the console.

As you can see, we retrieved the content of the excel file as expected, but there are improvements to make.

  • There are many // @ts-ignore in the code, meaning Typescript complains about the value we get from the cell.
  • The value of properties team, height, dateOfBirth and height aren't well-formed.
  • The properties age, heightFt, htln and bmi return NaN for a particular reason that we will see how to fix.

Fix the value of the property "team"

The value must be M or W yet the output return Men or Women. We will write the code to transform the value:

const transformTeam = (value: string): Team => {
  return value === 'Men' ? 'M' : 'W';
};

The property team will be now:

team: transformTeam(getCellValue(row, 2)),

Fix the value of the property "height"

The value is NaN; we try to convert an invalid string number. The code below fixes it:

const transformHeight = (value: string): number => {
  return +value.replace("'", ".");
};

Retrieve value from cell formula

The properties age, heightFt, htln and bmi return NaN because they are formulas, meaning the cell's value is the arithmetical formula from other cells.

If we look at the value of those cells, we have this output:

View the structure of a formula cell.
View the structure of a formula cell.

In the contrary to simple cells that have a plain value, formula cells have an object, and the value we want is inside the property result. We will create a function getCellFormulaValue for this specific case.

Here is the final code below:

import * as path from 'path';
import Excel from 'exceljs';

const filePath = path.resolve(__dirname, 'olympic-hockey-player.xlsx');

type Team = 'M' | 'W';
type Country = 'Canada' | 'USA';
type Position = 'Goalie' | 'Defence' | 'Forward';

type Player = {
  id: number;
  team: Team;
  country: Country;
  firstName: string;
  lastName: string;
  weight: number;
  height: number;
  dateOfBirth: string; // (YYY-MM-DD)
  hometown: string;
  province: string;
  position: Position;
  age: number;
  heightFt: number;
  htln: number;
  bmi: number;
};

const getCellValue = (row:  Excel.Row, cellIndex: number) => {
  const cell = row.getCell(cellIndex);

  console.log(cell.value);

  return cell.value ? cell.value.toString() : '';
};

const getCellFormulaValue = (row:  Excel.Row, cellIndex: number) => {
  const value = row.getCell(cellIndex).value as Excel.CellFormulaValue;

  return value.result ? value.result.toString() : '';
};

const transformTeam = (value: string): Team => {
  return value === 'Men' ? 'M' : 'W';
};

const transformHeight = (value: string): number => {
  return +value.replace("'", ".");
};

const transformDateOfBirth = (value: string) => {
  const date = new Date(value);

  return `${date.getFullYear()}-${date.getMonth() + 1}-${date.getDate()}`;
};

const main = async () => {
  const workbook = new Excel.Workbook();
  const content = await workbook.xlsx.readFile(filePath);

  const worksheet = content.worksheets[1];
  const rowStartIndex = 4;
  const numberOfRows = worksheet.rowCount - 3;

  const rows = worksheet.getRows(rowStartIndex, numberOfRows) ?? [];

  const players = rows.map((row): Player => {
    return {
      id: +getCellValue(row,1),
      team: transformTeam(getCellValue(row, 2)),
      country: getCellValue(row, 3) as Country,
      firstName: getCellValue(row, 4),
      lastName: getCellValue(row, 5),
      weight: +getCellValue(row, 6),
      height: transformHeight(getCellValue(row, 7)),
      dateOfBirth: transformDateOfBirth(getCellValue(row, 8)), // (YYY-MM-DD)
      hometown: getCellValue(row, 9),
      province: getCellValue(row, 10),
      position: getCellValue(row, 11) as Position,
      age: +getCellFormulaValue(row, 12),
      heightFt: +getCellFormulaValue(row, 13),
      htln: +getCellFormulaValue(row, 14),
      bmi: +getCellFormulaValue(row, 15),
    }
  });

  console.log(players);
};

main().then();

Re-run the command yarn start to see the output:

The content of the Excel is well extracted and displayed in the console.
The content of the Excel is well extracted and displayed in the console.

Wrap up

exceljs make it easy to read and parse an Excel file and while using it, make sure to handle simple cell and formula values. There are other cell types like RichText, Hyperlink, Date, and Shared Formula.

You can do a lot more with this library, so check out the documentation to see where you can go with.

You can find the code source on the GitHub repository.

Follow me on Twitter or subscribe to my newsletter to avoid missing the upcoming posts and the tips and tricks I occasionally share.