Parse City Data using JavaScript in Node.js

Parsing city data comes up when a project needs a database of cities and you don’t already have one lying around. Several sources publish this data, and some are more up to date than others.

I went to this site and downloaded their CSV database:
http://simplemaps.com/resources/us-cities-data

(They also included a JS function for calculating the distance between two lat/long coordinates.)
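I haven’t reproduced their function here, but distance helpers like that are commonly built on the haversine formula. Here’s a minimal sketch of that approach, assuming a spherical Earth with a mean radius of 6371 km. The names distanceKm, toRadians, and EARTH_RADIUS_KM are my own for illustration and are not taken from the simplemaps code.

```javascript
// Assumed mean Earth radius in kilometres (an approximation).
var EARTH_RADIUS_KM = 6371;

// Convert degrees to radians.
function toRadians(degrees) {
    return degrees * Math.PI / 180;
}

// Great-circle distance between two lat/long points using the haversine formula.
// Inputs are in decimal degrees; the result is in kilometres.
function distanceKm(lat1, lng1, lat2, lng2) {
    var dLat = toRadians(lat2 - lat1),
        dLng = toRadians(lng2 - lng1),
        a = Math.sin(dLat / 2) * Math.sin(dLat / 2) +
            Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) *
            Math.sin(dLng / 2) * Math.sin(dLng / 2);

    return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}
```

The lat and lng columns come out of the CSV parser as strings, so run them through parseFloat before passing them in.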

It’s worth looking over the data manually and running a few analyses on it. Here’s how to parse it quickly and optionally count the unique zip, state, and city values.

First, install the fast-csv module:

npm install fast-csv

Source of cities.js

var fs = require('fs'),
    csv = require('fast-csv'),
    fileStream = fs.createReadStream('cities.csv'),
    csvStream,
    lines = 0,
    zips = {},
    states = {},
    cities = {};

// increment a count so we can see unique values
function inc(obj, val) {
    if (!obj[val]) {
        obj[val] = 1;
    } else {
        obj[val] += 1;
    }
}

// csv line event
function csvLine(data) {
    var zip = data[0],
        state = data[1],
        city = data[2],
        lat = data[3],
        lng = data[4];

    // increment line count
    lines += 1;

    // skip the header in processing
    if (lines === 1) {
        return;
    }

    // increment unique value counts
    inc(zips, zip);
    inc(states, state);
    inc(cities, city);
}

// csv end event
function csvEnd() {
    console.log('unique zips', Object.keys(zips).length);
    console.log('unique states', Object.keys(states).length);
    console.log('unique cities', Object.keys(cities).length);
    console.log('total lines', lines);
}

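// create csv parser
// (csv.parse() is the current fast-csv API; older releases also allowed calling csv() directly)
csvStream = csv.parse()
    .on('data', csvLine)
    .on('end', csvEnd);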

// pipe the csv file into the parser
fileStream.pipe(csvStream);

Here’s the output:

unique zips 29468
unique states 52
unique cities 16699
total lines 29472

52 US states? It turned out that some rows had an undefined value in the state column rather than an actual value. Overall the data was of good quality and consistency, though I couldn’t say how fresh it is. I hope this is helpful if you’re parsing out city data.
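If you want to see which rows are missing a state, one option is to filter for them. This is a minimal sketch, assuming the same column order as cities.js (zip, state, city, lat, lng). The helper name findRowsMissingState and the sample rows are made up for illustration and are not from the real file.

```javascript
// Hypothetical helper: collect rows whose state column is missing or blank.
// Assumes each row is an array in the order zip, state, city, lat, lng.
function findRowsMissingState(rows) {
    return rows.filter(function (row) {
        var state = row[1];
        return state === undefined || String(state).trim() === '';
    });
}

// Sample rows invented for illustration only.
var sampleRows = [
    ['84101', 'UT', 'Salt Lake City', '40.75', '-111.89'],
    ['00000'],
    ['99999', '', 'Nowhere', '0', '0']
];

console.log(findRowsMissingState(sampleRows).length); // prints 2
```

In cities.js you could get the same effect by pushing each row onto an array inside csvLine and calling the helper from csvEnd.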

Toodles!

Author Tony Crowe, Salt Lake City, UT