Nushell and ARP
February 17th 2023
Today has been a day. After staring at my physics lab assignment for a long time and realizing I didn’t have the emotional energy to read through a PDF explanation of how to set up yet another Excel spreadsheet, I did the mature thing: I hastily packed up my things, grabbed a highly caffeinated beverage, and went to the library.
After sitting down in my usual study area with the intent of working on the lab myself in a room that wasn’t so crowded, I decided to give myself about 30 minutes to play with something technical. My semester is almost entirely calculus and physics right now, and I spent a pretty late night in the library trying to show convergence/divergence on improper integrals.
Anyways, the point of all that is to say that I’m a bit tired, and fairly surly. None of this is relevant to the post I’m about to write; I just felt like dumping it out here. But I decided to reward myself for a relatively productive week by playing around with networks and listening to Daft Punk a little before spending the remainder of my night doing physics homework.
One tool I’ve been excited about lately is nushell. Being able to write queries against structured data is something I’ve found myself wanting for a while. In the words of the project home page: “Stop parsing strings, start solving problems.” I can really get behind that.
One problem I often have is with the arp table. On your average MacBook, dumping the arp cache will look something like this:
λ arp -a
? (161.28.87.254) at 7c:21:e:87:ce:00 on en0 ifscope [ethernet]
? (161.28.87.255) at ff:ff:ff:ff:ff:ff on en0 ifscope [ethernet]
? (224.0.0.251) at 1:0:5e:0:0:00 on en0 ifscope permanent [ethernet]
? (239.255.255.250) at 1:0:5e:7f:ff:00 on en0 ifscope permanent [ethernet]
I took a networks class a few years ago, and ARP is one of the protocols we covered that I’ve gotten the most mileage out of. It’s the Address Resolution Protocol, and it’s what allows each computer on the network to generate Ethernet frames directed at a specific IP address. Since Ethernet frames sit lower on the OSI model than the Internet Protocol, each node on the network needs to keep a mapping of IP addresses to MAC addresses so that nodes can actually send packets to one another.
Apparently there’s no longer a need for this with IPv6, for reasons that I have not yet had time to get into. Yet another reason to upgrade to IPv6, in case you needed one. I think it’s been replaced with the Neighbor Discovery Protocol, but I haven’t yet looked into it.
As with most network protocols (and software in general), it was designed to solve a problem in the simplest way possible, with no consideration for how it might be abused by a malicious actor on the network. This makes ARP poisoning possible, which is one of my favorite network attacks.
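To make the trust problem concrete, here is a toy Python model of an ARP cache. It is only a sketch: the `ArpCache` class and the attacker MAC `02:00:00:00:00:01` are made up for illustration, and no real packets are involved. What it tries to show is that a cache with no authentication simply believes whichever reply arrived last.

```python
class ArpCache:
    """Toy model of an ARP cache: a plain IP -> MAC mapping (hypothetical class)."""

    def __init__(self):
        self.table = {}

    def handle_reply(self, ip, mac):
        # ARP has no authentication, so this toy cache trusts the reply
        # and overwrites whatever it had stored for that IP.
        self.table[ip] = mac

    def resolve(self, ip):
        # Return the MAC currently associated with the IP, or None if unknown.
        return self.table.get(ip)


cache = ArpCache()
cache.handle_reply("161.28.87.254", "7c:21:0e:89:cb:00")  # genuine gateway reply
cache.handle_reply("161.28.87.254", "02:00:00:00:00:01")  # forged reply (made-up MAC)
print(cache.resolve("161.28.87.254"))
```

After the forged reply, the gateway’s IP resolves to the attacker’s MAC, so frames meant for the gateway would be addressed to the attacker instead.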

Now, as a basic networks refresher the MAC address is a value that’s baked into the network card sending a packet. The IP address on the other hand is an arbitrary value that (most often) is dynamically assigned by the router to a new device on the network, probably through DHCP. There’s a lot going on just by associating with a network, and that’s beyond the scope of this particular post, but a basic understanding of the process is necessary to understand the rest of this post.
Back to the arp table: listing the current arp cache with arp -a will print out a list of IP addresses and their associated MAC addresses. This is handy if you want to passively look at who else is on the network with you but don’t wish to start spamming out packets with nmap or anything else like that. On its own, this pairing is not incredibly useful to anyone except the network card and the router. However, a MAC address is a useful piece of information because its first three octets hold a value known as the Organizationally Unique Identifier (OUI), which is a specific code assigned to the manufacturer of the network card.
Wireshark has a nice web-based lookup tool if you’d like to play with it, though for the rest of this post I’ll be using a tool called manuf.
This is an interesting piece of information because it allows you to guess roughly who or what is on your network. Let me grab my current access point as an example:
λ arp -a | grep 161.28.87.254
? (161.28.87.254) at 7c:21:e:89:cb:00 on en0 ifscope [ethernet]
λ manuf 7c:21:0e:89:cb:00
Vendor(manuf='Cisco', manuf_long='Cisco Systems, Inc', comment=None)
So I’m connected to an access point manufactured by Cisco.
What I would like to be able to do, however, is perform those lookups every time I ask for my current arp cache. That’s where I’m hoping nushell can help me out.
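Conceptually, a vendor lookup is just "take the first three octets and look them up in a table". The Python sketch below illustrates that idea; it is not how manuf is actually implemented. The `oui` and `vendor` helpers are hypothetical names, and the one-entry `VENDORS` table is a stand-in seeded from the lookup above.

```python
def oui(mac):
    """Return the first three octets of a colon-separated MAC address."""
    octets = mac.lower().split(":")
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets, got {len(octets)}: {mac!r}")
    return ":".join(octets[:3])


# Hypothetical one-entry vendor table; a real tool ships a much larger database.
VENDORS = {"7c:21:0e": "Cisco Systems, Inc"}


def vendor(mac):
    """Look up the vendor for a MAC address, or return None if the OUI is unknown."""
    return VENDORS.get(oui(mac))


print(vendor("7c:21:0e:89:cb:00"))
```

Since the table is keyed by exact text, the lookup only works when the prefix is written the same way the table writes it.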
Nushell
Since nushell is all about structured queryable data, I would like to tell it to dump out the arp cache, run a query to grab each individual MAC address, and replace the first three octets with the manufacturer code so it’s easy to see at a glance.
This involves firing up nushell in the first place. I want it to become my daily driver shell at some point, but it needs a little more setup (from me) and a bit more developmental maturity (as a project) before I fully switch over.
So let’s get that going:
λ nu
__ ,
.--()°'.' Welcome to Nushell,
'|, . ,' based on the nu language,
!_-(_\ where all data is structured!
λ
Now, many everyday commands are built into nu and return structured data in whatever format it is that nu uses. For example, ls produces this structured output:
λ ls /var/tmp/ 02/17/2023 05:47:02 PM
╭───┬───────────────────────────────────────────────┬────────┬──────────┬──────────────╮
│ # │ name │ type │ size │ modified │
├───┼───────────────────────────────────────────────┼────────┼──────────┼──────────────┤
│ 0 │ /var/tmp/bc3902d8132343e3ae586a009979fa88.db │ file │ 24.0 KiB │ 2 months ago │
│ 1 │ /var/tmp/bc3902d8132343e3ae586a009979fa88.db- │ file │ 32.0 KiB │ 2 months ago │
│ │ shm │ │ │ │
│ 2 │ /var/tmp/bc3902d8132343e3ae586a009979fa88.db- │ file │ 0 B │ 2 months ago │
│ │ wal │ │ │ │
│ 3 │ /var/tmp/bc3902d8132343e3ae586a009979fa88.db. │ file │ 51 B │ 3 months ago │
│ │ ses │ │ │ │
│ 4 │ /var/tmp/filesystemui.socket │ socket │ 0 B │ 2 weeks ago │
│ 5 │ /var/tmp/mat-debug-14128.log │ file │ 1.8 KiB │ 3 months ago │
│ 6 │ /var/tmp/mat-debug-5226.log │ file │ 311 B │ 2 months ago │
│ 7 │ /var/tmp/mat-debug-67562.log │ file │ 311 B │ 2 months ago │
╰───┴───────────────────────────────────────────────┴────────┴──────────┴──────────────╯
This is cool because it means we can write ad hoc queries to get exactly the information we want. Say I wanted the above listing of /var/tmp but only wanted to see logfiles and sockets:
λ ls /var/tmp/ | where name ends-with .log or name ends-with .socket
╭───┬──────────────────────────────┬────────┬─────────┬──────────────╮
│ # │ name │ type │ size │ modified │
├───┼──────────────────────────────┼────────┼─────────┼──────────────┤
│ 0 │ /var/tmp/filesystemui.socket │ socket │ 0 B │ 2 weeks ago │
│ 1 │ /var/tmp/mat-debug-14128.log │ file │ 1.8 KiB │ 3 months ago │
│ 2 │ /var/tmp/mat-debug-5226.log │ file │ 311 B │ 2 months ago │
│ 3 │ /var/tmp/mat-debug-67562.log │ file │ 311 B │ 2 months ago │
╰───┴──────────────────────────────┴────────┴─────────┴──────────────╯
Neat, huh?
There are a bunch of commands that just don’t have that kind of structure though. For external programs that nushell has no structured version of, the default behavior is to shunt the unstructured text straight to STDOUT, as is tradition. ARP is one of these (MAC addresses have been changed to protect the innocent):
λ arp -a 02/17/2023 05:57:13 PM
? (161.28.87.254) at 7c:21:e:89:ce:5f on en0 ifscope [ethernet]
? (161.28.87.255) at ff:ff:ff:ff:ff:ff on en0 ifscope [ethernet]
? (224.0.0.251) at 1:0:5e:0:0:fb on en0 ifscope permanent [ethernet]
? (239.255.255.250) at 1:0:5e:7f:ff:fa on en0 ifscope permanent [ethernet]
λ arp -a | describe 02/17/2023 06:00:37 PM
raw input
This is not so nice because it means we can’t, for instance, write a query to select only the MAC addresses of each entry in the arp table.
So how to get the unstructured text into structured data? Nushell has a handy utility called parse just for this purpose:
λ help parse 02/17/2023 07:21:48 PM
Parse columns from string data using a simple pattern.
Search terms: pattern, match
Usage:
> parse {flags} <pattern>
Parameters:
pattern <String>: the pattern to match. Eg) "{foo}: {bar}"
Examples:
Parse a string into two named columns
> echo "hi there" | parse "{foo} {bar}"
So to get the arp data we need:
λ arp -a | lines | parse "? ({ip}) at {mac} on {info}" 02/17/2023 07:26:57 PM
╭───┬─────────────────┬───────────────────┬──────────────────────────────────╮
│ # │ ip │ mac │ info │
├───┼─────────────────┼───────────────────┼──────────────────────────────────┤
│ 0 │ 161.28.87.254 │ 7c:21:e:89:cb:00 │ en0 ifscope [ethernet] │
│ 1 │ 161.28.87.255 │ ff:ff:ff:ff:ff:ff │ en0 ifscope [ethernet] │
│ 2 │ 224.0.0.251 │ 1:0:5e:0:0:00 │ en0 ifscope permanent [ethernet] │
│ 3 │ 239.255.255.250 │ 1:0:5e:7f:ff:00 │ en0 ifscope permanent [ethernet] │
╰───┴─────────────────┴───────────────────┴──────────────────────────────────╯
Ain’t that handy?
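For anyone more used to regular expressions, the same pattern can be approximated in Python. This is a sketch rather than a description of how nushell’s parse works internally: the `ARP_LINE` regex and the `parse_arp` helper are hypothetical names, and the sample text is copied from the output above.

```python
import re

# Regex counterpart of the nushell pattern "? ({ip}) at {mac} on {info}".
ARP_LINE = re.compile(r"^\? \((?P<ip>[^)]+)\) at (?P<mac>\S+) on (?P<info>.+)$")


def parse_arp(text):
    """Turn `arp -a` output into a list of dicts, skipping lines that do not match."""
    rows = []
    for line in text.splitlines():
        match = ARP_LINE.match(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows


sample = """\
? (161.28.87.254) at 7c:21:e:89:cb:00 on en0 ifscope [ethernet]
? (161.28.87.255) at ff:ff:ff:ff:ff:ff on en0 ifscope [ethernet]
"""
for row in parse_arp(sample):
    print(row["ip"], row["mac"])
```

Each named group plays the role of one `{column}` in the parse pattern, and the literal `? (` prefix has to be escaped because `?` and `(` are special in a regex.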
Now we can just say:
λ arp -a | lines | parse "? ({ip}) at {mac} on {info}" | each {|addr| manuf $addr.mac} 02/17/2023 07:48:07 PM
Vendor(manuf=None, manuf_long=None, comment=None)
Vendor(manuf='Broadcast', manuf_long=None, comment=None)
Vendor(manuf=None, manuf_long=None, comment=None)
Vendor(manuf=None, manuf_long=None, comment=None)
Which is cool, except that the program is getting None back as the manufacturer identity. This is easily explained: dumping the current arp cache produces a MAC address that looks like 7c:21:e:89:cb:00. You’ll notice that little isolated single character :e:, which is just :0e: in hex; the arp command doesn’t print leading zeros, and manuf evidently doesn’t recognize the unpadded form. Nushell gives us the tools to solve this, however:
λ "7c:21:e:89:cb:00" | split row ":" 02/17/2023 07:54:16 PM
╭───┬────╮
│ 0 │ 7c │
│ 1 │ 21 │
│ 2 │ e │
│ 3 │ 89 │
│ 4 │ cb │
│ 5 │ 00 │
╰───┴────╯
λ "7c:21:e:89:cb:00" | split row ":" | each {|oct| if (($oct | str length) == 1) { ["0", $oct] | str join } else { $oct } } | str join ":"
7c:21:0e:89:cb:00
I’ll make that a quick alias with:
alias macpad = ($in | split row ":" | each {|oct| if (($oct | str length) == 1) { ["0", $oct] | str join } else { $oct } } | str join ":")
Then we can get nicer output about vendor info. Even with the padding, many lookups come back without an entry; I’ve omitted them in the output here:
λ arp -a | lines | parse "? ({ip}) at {mac} on {info}" | par-each {|addr| [$addr.ip, (manuf ($addr.mac | macpad))] }
╭────┬───────────────────────────────────────────────────────────────────────────────────────────╮
│ ... │
│ │ │
│ │ │
│ 12 │ 10.0.84.1 │
│ 13 │ Vendor(manuf='Routerbo', manuf_long='Routerboard.com', comment=None) │
│ │ │
│ │ │
│ 18 │ 10.0.87.78 │
│ 19 │ Vendor(manuf='SamsungE', manuf_long='Samsung Electronics Co.,Ltd', comment=None) │
│ │ │
│ ... │
│ │ │
│ 42 │ 10.0.86.93 │
│ 43 │ Vendor(manuf='Chongqin', manuf_long='Chongqing Fugui Electronics Co.,Ltd.', comment=None) │
│ │ │
│ ... │
│ │ │
│ 50 │ 224.0.0.251 │
│ 51 │ Vendor(manuf='IPv4mcast', manuf_long=None, comment=None) │
│ │ │
╰────┴───────────────────────────────────────────────────────────────────────────────────────────╯
By structuring the data from arp, I’ve been able to pull out some interesting information on the fly without having to copy and paste a dozen values.
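For comparison, here is roughly the same pad-and-look-up step as a Python sketch. The `macpad` name mirrors the alias above, while `annotate` and the two-entry `VENDORS` table are hypothetical stand-ins for the manuf lookup; the two prefixes in the table are the ones whose vendors appear in this post.

```python
def macpad(mac):
    """Zero-pad each octet of a colon-separated MAC address to two hex digits."""
    return ":".join(octet.zfill(2) for octet in mac.lower().split(":"))


# Hypothetical stand-in for the manuf database, keyed by padded OUI.
VENDORS = {"7c:21:0e": "Cisco", "01:00:5e": "IPv4mcast"}


def annotate(rows):
    """Attach a vendor name (or None) to each row that has 'ip' and 'mac' keys."""
    out = []
    for row in rows:
        padded = macpad(row["mac"])
        # The first 8 characters of a padded MAC are its three-octet OUI.
        out.append({"ip": row["ip"], "mac": padded, "vendor": VENDORS.get(padded[:8])})
    return out


rows = [
    {"ip": "161.28.87.254", "mac": "7c:21:e:89:cb:00"},
    {"ip": "224.0.0.251", "mac": "1:0:5e:0:0:00"},
]
for row in annotate(rows):
    print(row)
```

Returning one record per entry keeps each IP next to its vendor, which is a little easier to scan than alternating rows.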