As part of my recent work on core:net for Odin, I’ve spent a chunk of time digging through a mountain of RFCs to get DNS domain name decoding and encoding working.
So, terms here are goofy, but important.
A hostname is typically the name of a particular machine (ex: mycomputer), whereas a domain name is the name for the domain generally (ex: gravitymoth.com), and a FQDN (Fully Qualified Domain Name) is typically the hostname + the domain name (ex: mycomputer.gravitymoth.com).
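Decoding is fairly simple (assuming you ignore newer Unicode extensions).
To keep DNS packet sizes small, FQDNs are stored as length-prefixed labels (a bit like RLE, run-length encoding, in that a count byte precedes the data), with a little bit of extra magic to allow label pasting: a later name can point back at labels that already appear earlier in the packet.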
Segments separated by dots (.) are referred to as “labels”. A label, represented as <length><data>, is a maximum of 63 bytes long. FQDNs are a max of 255 bytes. Keep in mind while writing your DNS code that DNS packet data is big-endian.
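To make the <length><data> layout and the two size limits concrete, here is a minimal encoding sketch in Python. It skips compression entirely, and the names encode_name, MAX_LABEL_LEN and MAX_NAME_LEN are made up for illustration. It also assumes ASCII-only names, and that the 255-byte limit counts the encoded form (length bytes and the terminating 0 included).

```python
MAX_LABEL_LEN = 63   # limit on a single label's data
MAX_NAME_LEN = 255   # assumed to cover the whole encoded name


def encode_name(fqdn: str) -> bytes:
    """Encode a dotted name as <length><data> labels plus a terminating 0 byte."""
    # A trailing dot is optional in text form, so drop it before splitting
    name = fqdn.rstrip(".")
    labels = name.split(".") if name else []

    out = bytearray()
    for label in labels:
        data = label.encode("ascii")
        if not 1 <= len(data) <= MAX_LABEL_LEN:
            raise ValueError(f"label {label!r} must be 1 to {MAX_LABEL_LEN} bytes")
        out.append(len(data))  # the length is a single byte
        out += data
    out.append(0)  # the terminal 0

    if len(out) > MAX_NAME_LEN:
        raise ValueError(f"encoded name is {len(out)} bytes, max is {MAX_NAME_LEN}")
    return bytes(out)


print(encode_name("mycomputer.gravitymoth.com"))
```

Note that the length prefix is a raw byte value, not an ASCII digit, which is why the printed bytes start with an escape rather than a readable number.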
Thinking about the hostname decoder as a bytecode interpreter, it supports 3 types of operations:
<length><data>
0xC0<offset>
0
<length><data> (ex: 6google, where the 6 is a single byte with the value 6, not the ASCII digit) is easy: you read the length, then append that many characters after it, plus a dot (.), to your result.
For 0xC0<offset>, the top two bits of the first byte (the 0xC0 part) mark a pointer, and the remaining 14 bits of the two-byte, big-endian value are the offset. You jump to the offset, indexed from the start of the packet (0 is packet[0]), and then read your next op.
0 is terminal. When you hit it, you’ve decoded the full FQDN.
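The pointer op is the one that tends to bite: the marker is the top two bits of the first byte, not the whole byte, so the first byte is only exactly 0xC0 when the offset is under 256. Here is a small sketch of telling the three ops apart. classify_op is a hypothetical helper, not part of any library, and it does no bounds checking.

```python
def classify_op(packet: bytes, idx: int):
    """Return (kind, value) for the op that starts at packet[idx].

    kind is "end", "pointer" (value is the offset from packet[0]),
    or "label" (value is the label length).
    """
    first = packet[idx]
    if first == 0:
        return ("end", 0)
    if (first & 0xC0) == 0xC0:
        # Pointer: read 16 bits big-endian, then mask off the two marker bits
        raw = int.from_bytes(packet[idx:idx + 2], "big")
        return ("pointer", raw & 0x3FFF)
    if (first & 0xC0) == 0:
        # Plain label: the byte itself is the length (0 to 63)
        return ("label", first)
    # Top bits 01 and 10 are reserved, so treat them as malformed input
    raise ValueError(f"reserved label type 0x{first:02x}")


print(classify_op(b"\xc0\x0c", 0))   # pointer to offset 12
print(classify_op(b"\x06google", 0)) # label of length 6
```

Offset 12 shows up a lot in real packets because, assuming the usual 12-byte DNS header, that is where the first name in the question section starts.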
Decoding looks roughly like this:
def decode_name(packet, start_idx):
    # packet is assumed to be a bytes object holding the whole DNS message
    name = ""
    cur_idx = start_idx
    labels_added = 0

    while True:
        match packet[cur_idx]:
            # We're at the end of the FQDN
            case 0:
                return name

            # Top two bits set: jump through the offset to more data in the packet
            case b if (b & 0xC0) == 0xC0:
                # The pointer is 16-bit, big-endian; the low 14 bits are the offset
                pointer = int.from_bytes(packet[cur_idx:cur_idx+2], "big")
                cur_idx = pointer & 0x3FFF

            # This is a label, insert it into the name
            case label_size:
                if labels_added > 0:
                    name += "."
                labels_added += 1

                cur_idx += 1
                name += packet[cur_idx:cur_idx+label_size].decode("ascii")
                cur_idx += label_size
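If you feed a decoder like this data from the network, it needs to survive garbage: a pointer that points at itself would loop forever, and a bad length would run off the end of the buffer. Below is a more defensive sketch plus a usage example on a hand-built buffer. The name decode_name_checked and the max_jumps limit of 16 are my own assumptions, not something taken from the RFCs, and the 12 zero bytes stand in for a DNS header.

```python
def decode_name_checked(packet: bytes, start_idx: int, max_jumps: int = 16) -> str:
    """Decode a possibly compressed name, rejecting loops and truncated data."""
    labels = []
    cur_idx = start_idx
    jumps = 0

    while True:
        if cur_idx >= len(packet):
            raise ValueError("name runs past the end of the packet")
        first = packet[cur_idx]

        if first == 0:
            return ".".join(labels)

        if (first & 0xC0) == 0xC0:
            if cur_idx + 1 >= len(packet):
                raise ValueError("truncated pointer")
            jumps += 1
            if jumps > max_jumps:
                raise ValueError("too many pointers, possible loop")
            cur_idx = int.from_bytes(packet[cur_idx:cur_idx + 2], "big") & 0x3FFF
            continue

        if first & 0xC0:
            raise ValueError("reserved label type")

        cur_idx += 1
        data = packet[cur_idx:cur_idx + first]
        if len(data) != first:
            raise ValueError("truncated label")
        labels.append(data.decode("ascii"))
        cur_idx += first


# 12 placeholder header bytes, a full name at offset 12,
# then a second name at offset 29 that reuses it through a pointer
packet = bytes(12) + b"\x0bgravitymoth\x03com\x00" + b"\x0amycomputer\xc0\x0c"

print(decode_name_checked(packet, 12))  # gravitymoth.com
print(decode_name_checked(packet, 29))  # mycomputer.gravitymoth.com
```

The second name only stores mycomputer itself; the trailing 0xC0 0x0C pastes on the labels that start at offset 12, which is the whole point of the compression scheme.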