Fun with CSV Files in Linux: Viewing, Editing, and Parsing

 


Fun with CSV Files in Linux: Viewing, Editing, and Parsing


Introduction

CSV (Comma-Separated Values) files are a widely used and simple format for storing tabular data. While they are often associated with spreadsheet applications, CSV files play a significant role in system administration, programming, and data science due to their versatility. In this article, we’ll explore how to view, edit, and parse CSV files in Linux, leveraging powerful command-line tools that make working with this format a breeze.


Create a CSV File

For this lesson, let's first begin by creating a CSV file we can work with.  In vi or vim or nano, create a file called yourfile.csv and enter the following text:


Name,City,Age,Amount
Joe,Dallas,29,1000
Henry,Fort Worth,33,500
Kelly,Tulsa,40,2000
John,Houston,37,5000


Viewing CSV Files in Linux

When working with CSV files, your first step is often to view their contents. The simplest way to display a CSV file is using the cat command:


$ cat yourfile.csv <ENTER>

Name,City,Age,Amount
Joe,Dallas,29,1000
Henry,Fort Worth,33,500
Kelly,Tulsa,40,2000
John,Houston,37,5000


However, cat simply dumps the file contents without any formatting, which can be difficult to read if the file contains lots of data. For a more structured view, you can use the column command:


column -s, -t < yourfile.csv | less -#2 -N -S <ENTER>
1 Name   City        Age  Amount
2 Joe    Dallas      29   1000
3 Henry  Fort Worth  33   500
4 Kelly  Tulsa       40   2000
5 John   Houston     37   5000


Here, -s tells column to use a comma as the delimiter, while -t aligns the data into neatly formatted columns. The less command allows you to scroll through larger files interactively.


For more advanced formatting, csvkit offers the csvlook tool, which prettifies your CSV into a table:


$ csvlook yourfile.csv <ENTER>

| Name  | City       | Age | Amount |
| ----- | ---------- | --- | ------ |
| Joe   | Dallas     |  29 |  1,000 |
| Henry | Fort Worth |  33 |    500 |
| Kelly | Tulsa      |  40 |  2,000 |
| John  | Houston    |  37 |  5,000 |


This provides a polished, easy-to-read table format, especially helpful when working with larger datasets.


Editing CSV Files in Linux

There are multiple ways to edit CSV files from the command line. For quick edits, you can use a text editor like vim or vi or nano:


$ vim yourfile.csv <ENTER>


While these editors work for small tweaks, Linux offers specialized tools for CSV manipulation, such as csvkit. With csvcut, you can select specific columns to create a new file or view just a portion of your data.


For instance, if you want to extract only the first and third columns, you can run:


csvcut -c 1,3 yourfile.csv > newfile.csv <ENTER>
$ cat new file.csv <ENTER>
Name,Age
Joe,29
Henry,33
Kelly,40
John,37


This command helps you manage large CSV files by narrowing the focus to just the columns you need.


Another powerful editing tool is awk, which allows you to edit CSV files on a row-by-row basis. For example, to extract rows where a particular value in the third column is greater than 30, you can use:


awk -F',' '$3 > 30' yourfile.csv <ENTER>
Name,City,Age,Amount
Henry,Fort Worth,33,500
Kelly,Tulsa,40,2000
John,Houston,37,5000


Parsing and Processing CSV Files

Parsing CSV files in Linux is simple and efficient with command-line utilities. One of the most common tools for processing CSV files is awk, which can filter and extract data based on conditions.


For example, if you want to print only the names (first column) from a CSV file, you can use:


awk -F',' '{print $1}' yourfile.csv <ENTER>
Name
Joe
Henry
Kelly
John


CSV Files and SQLite

For more complex operations, csvkit offers the csvsql command, which can turn your CSV file into SQL queries:


csvsql --insert --db sqlite:///mydatabase.db --tables mytable yourfile.csv <ENTER>

$ sqlite3 mydatabase.db <ENTER>

sqlite> .tables <ENTER>
mytable
sqlite> SELECT * FROM mytable; <ENTER>
Name   City        Age   Amount
-----  ----------  ----  ------
Joe    Dallas      29.0  1000.0
Henry  Fort Worth  33.0  500.0 
Kelly  Tulsa       40.0  2000.0
John   Houston     37.0  5000.0


This command converts your CSV data into SQL commands that can be executed in an SQLite database, providing a seamless way to migrate data into relational databases.


CSV Files and JSON

You can also convert your CSV file into other formats, such as JSON, using in2csv:


in2csv yourfile.csv > newfile.json <ENTER>


This flexibility allows you to easily integrate CSV data into a variety of different applications or workflows.


Conclusion

Working with CSV files in Linux is not only efficient but also enjoyable when you have the right tools. Whether you're viewing, editing, or parsing data, tools like column, awk, and csvkit make the process simple and powerful. With these utilities, you can handle large datasets, manipulate content, and even convert your data into different formats. Give these tools a try the next time you work with CSV files on Linux—you might find that the command line is your new favorite way to process data! 😊



Image:  Pixabay

Comments

Popular posts from this blog

The New ChatGPT Reason Feature: What It Is and Why You Should Use It

Raspberry Pi Connect vs. RealVNC: A Comprehensive Comparison

The Reasoning Chain in DeepSeek R1: A Glimpse into AI’s Thought Process