

# Jq convert string to number plus

## Step 1: Visual examination of the raw data

Having downloaded sdss1738478.csv, the first thing I want to do is take a look at the first few lines of the file to get a sense of what I am dealing with. For one thing, I expect that the first line of the file will be a list of column headings which I can use to structure the JSON output I'm going to generate.

I use the head command to examine the column header row plus a couple of lines of data. head is also useful here because sdss1738478.csv is over a million lines long: it lets me set aside the complexity of working with a very large file and concentrate on figuring out the column headings and the data format. Seeing this helps me enormously in designing the command that will convert this file to JSON.

## Step 2: Sketch the finished data format

Now that I have a sense of what the raw data looks like, I can sketch out what I'd like the finished data set to look like once it has been converted into JSON. The simplest thing would be to deliver an array of JSON records (one for each line) containing labeled data fields. The column headings obviously should be used as the data labels, but I'm going to edit them slightly so that they'll make naturally convenient keys for a JSON object.

Here's some pseudocode to illustrate the format I've designed:

```
[
```

## Step 3: Write a jq expression against a limited data set

Now that I have my format, I need to create a factory that takes a line of CSV as input and returns an array of formatted JSON records. I tell jq to skip the first line of the data file, since that row just has the column labels. (I'm still truncating the data file with head to make it easy to see what's going on):

```
head -n3 sdss1738478.csv | \
```
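As a concrete sketch of the inspection step, here is the kind of head invocation described above, run against a tiny hypothetical stand-in file. The column names and values below are made up for illustration; the real sdss1738478.csv and its headings are not reproduced here.

```shell
# Build a tiny stand-in for sdss1738478.csv -- the column names and
# values are invented; the real file has over a million rows.
cat > sample.csv <<'EOF'
objid,ra,dec
1237648720693755918,237.504,39.121
1237648720693755919,237.505,39.122
1237648720693755920,237.506,39.123
EOF

# Show the header row plus two rows of data, without ever reading
# the rest of a (potentially huge) file.
head -n 3 sample.csv
```

Because head stops after the requested number of lines, this costs the same whether the file has three rows or a million.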

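To make the conversion step concrete, here is one hedged sketch of a jq pipeline in this spirit. It is not the article's actual expression (which is truncated above), and the column names objid, ra, and dec are assumptions. It slurps the whole input as one string (`-R -s`), drops the header row, and uses `tonumber` to turn the numeric columns into real JSON numbers, i.e. the string-to-number conversion the title refers to:

```shell
# Hypothetical three-column CSV piped in, standing in for
# `head -n3 sdss1738478.csv`; the real column names differ.
printf 'objid,ra,dec\n123,237.5,39.25\n' |
jq -R -s '
  split("\n")                  # one string per input line
  | .[1:]                      # skip the header row
  | map(select(length > 0))    # drop the trailing empty line
  | map(split(","))            # naive CSV: assumes no quoted fields
  | map({
      objid: .[0],             # left as a string: very large ids
                               # would lose precision as numbers
      ra:  (.[1] | tonumber),  # string -> number
      dec: (.[2] | tonumber)
    })
'
```

Note that `split(",")` is only safe because this data set has no quoted or comma-containing fields; data that does would need a real CSV parser.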