
Bash scripting: How to read data from text files

Here's how to extract data from a text file, such as reading in a list of servers to test connectivity to them.

Something that I like in Linux (and in Unix-like systems in general) is that configurations and properties are contained in text files. This allows an administrator with the right permissions to examine the files and make changes if required. Text files are also simple and convenient data sources for a sysadmin's typical operations. In certain situations, you can use text files as an output to be shared with regular users as well. I cover examples of both cases in this article.

Note: Writing to stdout and reading from stdin using pipes is like using a virtual text file. In many cases, you do have a text file, but in others, you simply use the output of some previous command as if it were a text file.
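
As a minimal sketch of the note above, the same read loop can consume a real file or the output of another command without any change. The function name count_entries and the sample data are assumptions made for this illustration; they are not part of the script discussed later.

```shell
#!/bin/bash
# Count the non-empty lines arriving on stdin. The function does not care
# whether stdin is a real file or the output of another command.
count_entries() {
    local count=0 line
    while IFS= read -r line; do
        [ -n "$line" ] && count=$((count + 1))
    done
    echo "$count"
}

# Hypothetical sample file, created only for this demonstration.
sample_file=$(mktemp)
printf 'alpha\nbeta\ngamma\n' > "$sample_file"

from_file=$(count_entries < "$sample_file")                  # stdin is a real file
from_pipe=$(printf 'alpha\nbeta\ngamma\n' | count_entries)   # stdin is a pipe

echo "from file: $from_file, from pipe: $from_pipe"
rm -f "$sample_file"
```

Both calls report the same count, because the loop only ever sees a stream of lines.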


So, when should you use one method or the other? Well, in some cases, the text file already exists, like the /etc/hosts file, for instance. In other cases, you don't need to have the file physically written because you're only interested in the result (and the data structure is really simple). Different scenarios might require that you store the information to a file for reasons such as clarity, troubleshooting, auditing, or being able to analyze the information's structure and other types of content that are there.
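
When you want both options at once, one possible approach (a sketch, not something the rest of this article depends on) is tee, which saves a copy of the data to a file while passing it along the pipeline. The file name hosts_seen.txt and the sample hostnames are hypothetical.

```shell
#!/bin/bash
work_dir=$(mktemp -d)
saved_copy="$work_dir/hosts_seen.txt"   # hypothetical file name

# tee writes its input to the file and also passes it down the pipeline,
# so the data can be processed now and inspected later.
host_count=$(printf 'node1\nnode2\nnode3\n' | tee "$saved_copy" | grep -c '')
saved_count=$(grep -c '' "$saved_copy")

echo "pipeline saw $host_count lines; saved copy has $saved_count lines"
rm -rf "$work_dir"
```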

Check the reachability and name resolution for a list of servers

Suppose you have a list of new machines and need to verify that they:

  1. Are reachable from your server
  2. Have name resolution working for them
  3. Are listening on port 22 (for SSH)

You also need to report the status to the project team, which requires that you submit it in a spreadsheet format. If you're working with dozens of servers and need to repeat these tests on different days, it is worth automating them.

The input file

This is the spreadsheet from which I got the CSV (comma-separated values) file used in the following examples.

Image: Spreadsheet with column A containing hostnames and column B containing IPs to test

And this is the CSV file:

ServerName,IP
m2.example.com,192.168.2.99
xtower.example.com,192.168.2.111
win2k16.example.com,192.168.101.41
control.example.com,192.168.101.200
node1.example.com,192.168.101.201
node2.example.com,192.168.101.202
node3.example.com,192.168.101.203
node4.example.com,192.168.101.204
node5.example.com,192.168.101.205

(Converting the spreadsheet to/from CSV was done manually and will not be covered in this article.)
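
One thing to watch for, depending on how the CSV was exported: some spreadsheet applications write Windows-style CRLF line endings, which would leave a stray carriage return at the end of the last field on every line. The sketch below assumes that situation and strips the carriage returns with tr. The helper name strip_cr is made up for this example.

```shell
#!/bin/bash
# Remove carriage returns so that CRLF line endings become plain LF.
strip_cr() {
    tr -d '\r'
}

# Hypothetical sample: a header and one data row with CRLF line endings.
cleaned=$(printf 'ServerName,IP\r\nm2.example.com,192.168.2.99\r\n' | strip_cr)
last_ip=$(printf '%s\n' "$cleaned" | tail -n 1 | cut -d, -f2)

# The brackets make any leftover invisible character easy to spot.
echo "last IP: [$last_ip]"
```

If your file already uses Unix line endings, this step changes nothing and can be skipped.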

The script

The following is the script I use to test the servers:

1     #!/bin/bash
2     
3     input_file=hosts.csv
4     output_file=hosts_tested.csv
5     
6     echo "ServerName,IP,PING,DNS,SSH" > "$output_file"
7     
8     tail -n +2 "$input_file" | while IFS=, read -r host ip _
9     do
10        if ping -c 3 "$ip" > /dev/null; then
11            ping_status="OK"
12        else
13            ping_status="FAIL"
14        fi
15    
16        if nslookup "$host" > /dev/null; then
17            dns_status="OK"
18        else
19            dns_status="FAIL"
20        fi
21    
22        if nc -z -w3 "$ip" 22 > /dev/null; then
23            ssh_status="OK"
24        else
25            ssh_status="FAIL"
26        fi
27    
28        echo "Host = $host IP = $ip PING_STATUS = $ping_status DNS_STATUS = $dns_status SSH_STATUS = $ssh_status"
29        echo "$host,$ip,$ping_status,$dns_status,$ssh_status" >> "$output_file"
30    done

The following line items explain the script entries above:

Line 6: Initialize the output file with the header, which adds three new fields (PING, DNS, and SSH) for the status of reachability via ping, name resolution, and the SSH port check

Line 8: Read the input file line by line using a while loop, skipping the first line (the header) with tail -n +2. Setting IFS to a comma makes read split each line on commas, so the first two fields go into the variables host and ip, and anything left over goes into the throwaway variable _.

Lines 10 to 26: Run the ping, nslookup, and nc commands, sending the output to /dev/null because we are interested only in the exit status of each command

Line 28: Send the output to stdout for the person running the script

Line 29: Send the data to the output file with the three new columns (ping_status, dns_status, and ssh_status)
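
To see the field splitting from line 8 in isolation, here is a small sketch that feeds the same loop header with inline sample data instead of hosts.csv. The second data row carries two extra columns, invented for this example, to show what the trailing _ variable absorbs.

```shell
#!/bin/bash
# Sample data is inlined here. The last row has extra, hypothetical
# columns that end up in the throwaway variable "_".
csv_data='ServerName,IP
m2.example.com,192.168.2.99
node1.example.com,192.168.101.201,rack 4,spare'

parsed=$(
    printf '%s\n' "$csv_data" | tail -n +2 | while IFS=, read -r host ip _
    do
        echo "host=$host ip=$ip"
    done
)

echo "$parsed"
```

The header line never reaches the loop, and the extra columns are dropped rather than being glued onto the ip value.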

Open the output file as a spreadsheet

Send the output file to a workstation where you can open it in your favorite spreadsheet application. If you're sending this to non-technical people, you might want to save it in the default spreadsheet format used in the company to make their life easier.

Image: Spreadsheet with hostnames in column A, IP addresses in column B, ping results in column C, name resolution results in column D, and SSH results in column E
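
Before sending the file, you may want a quick look at which hosts failed any test. The sketch below is one way to do that with awk. It uses made-up results in the same column layout that the script writes to the output file, not real test output.

```shell
#!/bin/bash
# Hypothetical results in the same layout as the script's output file.
results='ServerName,IP,PING,DNS,SSH
m2.example.com,192.168.2.99,OK,OK,OK
node1.example.com,192.168.101.201,OK,FAIL,OK
node2.example.com,192.168.101.202,FAIL,OK,FAIL'

# Skip the header (NR > 1) and print the hostname of every row in which
# at least one of the three status columns is FAIL.
failed_hosts=$(printf '%s\n' "$results" |
    awk -F, 'NR > 1 && ($3 == "FAIL" || $4 == "FAIL" || $5 == "FAIL") { print $1 }')

echo "$failed_hosts"
```

To run it against the real results, replace the printf with a redirect from the output file.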


Wrap up

In this article, I applied some common and simple tools that are widely available on Linux systems to automate the tests for reachability, name resolution, and connectivity via SSH. In some projects, this type of validation would have to be done for dozens of servers and repeated a significant number of times due to the change processes that involve other teams (Network and Firewall, for example).

The principles can be extended to other types of tests. For example, you could test connectivity to a different port. If the tests become more complex, such as running a command on the host when it is reachable via SSH, then you're dealing with a different type of problem, which requires a different tool. For a situation like that, I strongly recommend that you look at Ansible.
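
As a rough sketch of the first kind of extension, the repeated if/else blocks could be folded into a helper that turns any command's exit status into OK or FAIL, with the port passed as a parameter. The function names status_of and port_status are assumptions for this illustration and do not appear in the script above.

```shell
#!/bin/bash
# Run any command quietly and translate its exit status into OK or FAIL.
status_of() {
    if "$@" > /dev/null 2>&1; then
        echo "OK"
    else
        echo "FAIL"
    fi
}

# Hypothetical wrapper: the same nc test as in the script, with the port
# as a parameter instead of a hard-coded 22.
port_status() {
    local ip=$1 port=$2
    status_of nc -z -w3 "$ip" "$port"
}

# Stand-in commands show the mapping without touching the network.
echo "true  -> $(status_of true)"
echo "false -> $(status_of false)"

# Example call (commented out because it needs a reachable host):
# https_status=$(port_status 192.168.2.99 443)
```

Because the helper also discards stderr, a missing tool shows up as FAIL rather than as an error message, so it is worth confirming that nc is installed before trusting the results.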


Roberto Nozaki

Roberto Nozaki (RHCSA/RHCE/RHCA) is an Automation Principal Consultant at Red Hat Canada where he specializes in IT automation with Ansible.
