Names and Addresses

Paul Buis

Covers the relationships between physical hardware addresses and network protocol addresses as well as the relationships between protocol addresses and protocol host names.

1.0 Overview

We will look at physical hardware addresses with special attention to Ethernet interfaces and then go on to consider both IP and IPX network protocol addresses. We will look at host names in the IP, Netware, and NetBIOS systems and then consider how one translates between the three levels of identifiers used.

Here is a table of contents for our discussion:

2.0 Physical Hardware Addresses.

3.0 Network Protocol Addresses.

4.0 Network Names.

5.0 Mapping between Hardware and Protocol Addresses.

6.0 Mapping between Names and Protocol Addresses.

2.0 Physical Hardware Addresses

Each interface on a networked device has an address which serves as an identifier for the device for other devices on the network. Note that this is distinct from any identifier that might be associated with the interface that distinguishes it from other interfaces or other hardware components within the device.

For example, the 10 Mbps Ethernet interface on the machine named ``bsu-cs.bsu.edu'' has a physical hardware address of 08:00:20:79:ff:cd which designates a 48-bit number to distinguish it from all other interfaces on the network. This particular machine has three network interfaces. To the operating system these are known as ``le0,'' ``hm2,'' and ``lo0.'' However, the internal device identifiers are totally irrelevant in the discussion at hand.

In networks that do not make use of shared media, physical hardware addresses are not relevant. Such networks make use of ``point-to-point'' media such as a serial line (with or without a modem) or digital telephone lines (DS0, DS1, DS3, or ISDN). In these cases, the destination and source of any frame is clear: if it arrived at one end it must have originated at the other.

However, if one is using shared media, one must have some way of specifying the intended destination and the presumed source of each frame. Ethernet provides us with a canonical example of shared media. In an Ethernet system, each interface is given a 48-bit universally unique identifier when it is manufactured. No two interfaces have the same address, which prevents ambiguity in the specification of destination and source. This addressing system extends beyond Ethernet to most of the IEEE 802.x series of datalink MAC sublayer specifications.

2.1 Ethernet Addresses

Ethernet addresses are traditionally written as six pairs of hexadecimal digits separated by colons or periods such as 08:00:20:79:ff:cd. Each Ethernet frame header begins with a destination address and a source address. Each interface on a network will examine the destination address and if it is interested in receiving packets with that address, the interface will read the entire packet and place it in a holding buffer. After it has been placed in the buffer and checked for data integrity with the CRC checksum contained at the end of the frame, the interface will send an interrupt to the CPU to notify it that a packet is available for transfer out of the interface's buffer and into the operating system kernel memory.

Note that an interface may be configured to be interested in frames other than those that contain the universally unique 48-bit address it was assigned during the manufacturing process. Virtually all interfaces are configured to pay attention to frames with a destination address of ff:ff:ff:ff:ff:ff, known as the ``broadcast'' address. ``Multicast'' addresses may also be used to allow a single frame to be delivered to some, but not all, hosts on the network.
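The address formats just described can be sketched in a few lines of Python. This is only an illustration, not part of any driver: the function names are made up, and the multicast test relies on the IEEE 802 convention (not spelled out above) that the low-order bit of the first byte marks a group address.

```python
def parse_mac(text):
    """Convert six hex pairs separated by colons or periods into 6 raw bytes."""
    parts = text.replace(".", ":").split(":")
    if len(parts) != 6:
        raise ValueError("expected six hexadecimal pairs: %r" % text)
    return bytes(int(part, 16) for part in parts)


def classify_destination(text):
    """Label a destination address as broadcast, multicast, or unicast."""
    mac = parse_mac(text)
    if mac == b"\xff" * 6:
        return "broadcast"
    # Assumption from IEEE 802: low-order bit of the first byte set
    # means the address names a group of interfaces.
    if mac[0] & 0x01:
        return "multicast"
    return "unicast"


print(classify_destination("08:00:20:79:ff:cd"))  # unicast
print(classify_destination("ff:ff:ff:ff:ff:ff"))  # broadcast
```

An interface that is not promiscuous would, in effect, apply a test like this to every frame and interrupt the host only for frames it is interested in.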

In particular, it is possible to configure an Ethernet interface to be ``promiscuous'' and send all packets on to the operating system. Normally, this would not be desirable, since interrupt processing requires context switches that slow the system considerably. However, it does allow one to construct devices to monitor network traffic for analyses of various sorts. For example, one might be able to track down misconfigured systems on the network by looking for unexpected kinds of network traffic. One might be able to determine traffic patterns that will allow one to partition the network with bridges and switches to reduce the collision rate. One also might simply grab all the user names and passwords that are transmitted without encryption over the network.

Another anomaly that routinely occurs is the sending of a frame with a source address other than the universally unique 48-bit address assigned to the interface doing the sending. A bridge receives a packet on one interface and will forward it using another interface. The sending interface will not use its own address in the source field of the Ethernet packet; it will use the source address that was there originally. In fact, one can make a bridge out of an ordinary PC, two interfaces, and a bit of software. The obvious conclusion is that one can ``forge'' Ethernet addresses very easily in the sense that one can configure an interface to transmit packets with a source address that corresponds to the 48-bit universally unique address assigned by the manufacturer to another interface on another machine. In fact, DECnet protocols require that Ethernet interfaces not use their universally unique, manufacturer-assigned addresses in the source field of packets they send.

Anomalies aside, the standard arrangement is for each interface to interrupt its host only for packets with destination addresses matching its own manufacturer assigned address or the broadcast address. The standard arrangement is that each packet is sent out with the source address matching the manufacturer assigned address.

A common confusion arises between a host's address and an interface's address. We must be clear on this: a host does not have a unique or constant physical hardware address. Over time, the same host may have a sequence of different interfaces and since the address is associated with the interface rather than the host, each time an interface is replaced, the host's address will change. Also, a host may have more than one interface at the same point in time. Multiple interfaces may be connected to the same network in order to improve throughput (actually moving 10 Mbps between a host and an interface may not be practical; most interfaces are spared this problem because they are not involved with the vast majority of the packets, but the interface on a particular network server may be) or they may be connected to different networks to either allow the host to serve as a router, or allow it to bypass a router. The key point is that the physical hardware address is associated with an interface, not with a host.

3.0 Network Protocol Addresses

Like physical hardware addresses, network protocol addresses are also associated with each interface rather than each host. In addition to identifying an interface, a network protocol address may also need to identify which network the interface is attached to so internetworking can be done by routers.

Hence, a network protocol address typically will be divided into parts: the network part and the host part. Different network protocols have different addressing schemes. Hence, an interface with a single physical hardware address may have multiple network protocol addresses. Typically, one would expect it to have one network protocol address for each network protocol it supports. For example, if one interface is being used for IP, IPX, and DECnet, one would expect it to have a separate network protocol address for each of these three protocols. Each of these addresses would have a different format and may be totally unrelated to the others.

In fact, each interface may have several network protocol addresses for the same protocol. No simple, one-to-one relationship exists between physical hardware address and network protocol address except in trivial network protocols. However, the typical case is for each interface to have exactly one network protocol address per network protocol.

Similarly, no simple one-to-one relationship exists between the network part of a network protocol address and individual datalink layer networks. Not only may a given Ethernet have different network protocol addresses for different network protocols, but a single Ethernet may have several different network protocol addresses for the same protocol. Again, the typical case is for each datalink layer network to have a single network layer address.

Network protocol addresses are simply identifiers used to specify interfaces and networks. As an analogy, identifiers are like personal names: while most of us are content with a single identifier for ourselves, in some cases it makes sense for us to use different identifiers for ourselves depending on what role we are playing at the moment. Similarly, having multiple identifiers for interfaces and networks may be used as a form of aliasing.

3.1 IPX Addresses

An IPX address is a 10 byte value with a 4 byte network part and a 6 byte host part. IPX routers use the 4 byte network part. The 4 byte network part can be arbitrarily assigned to a particular data-link layer network as long as it is unique in the internetwork of communicating IPX systems. The 6 byte host part is the IEEE 802 physical address for the interface if it has one. For systems that do not use this form of physical addressing, the physical address is still used and padded with 0's on the left.
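As a rough sketch of this layout, the following Python packs a network number and a physical address into the 10 byte value. The function name and the sample network number 0x00112102 are made up for illustration, and writing the network part in big-endian order is an assumption.

```python
def ipx_address(network, mac_text):
    """Pack a 4-byte network part and a 6-byte host part into 10 bytes."""
    node = bytes(int(pair, 16) for pair in mac_text.split(":"))
    # Physical addresses shorter than 6 bytes are padded with 0's on the left.
    node = node.rjust(6, b"\x00")
    if len(node) != 6:
        raise ValueError("host part must fit in 6 bytes")
    # Assumption: the network part is written most significant byte first.
    return network.to_bytes(4, "big") + node


address = ipx_address(0x00112102, "08:00:20:79:ff:cd")
print(address.hex())  # 0011210208002079ffcd
```

Because the host part is simply the interface's own physical address, nothing has to be looked up to turn the last six bytes into a frame destination.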

Since IPX addresses are associated directly with the physical address, an IPX datagram can be delivered directly from one machine on the datalink layer network to the destination without any need to do any form of address lookup. However, if a machine changes its interface, its IPX address will change. Also, if a machine is replaced (when upgrading to a new model to play the same role) the IPX address changes. Hence, one cannot store the IPX address of an interface for a machine for any length of time and be assured of reaching that machine.

Since IPX is typically limited to a single organization, these numbers are typically assigned by the network system administrator for the organization. At Ball State the system used by our network system administrators (in UCS) to assign the network part of the address is as follows: take the last two bytes of the IP address for a Novell server, write them down in decimal, concatenate the decimal digits and use them as the hexadecimal address for the network to which the server is attached. If multiple Novell servers are connected to the same Ethernet, then that Ethernet is given multiple IPX addresses. Of course, with an assignment policy like that, not everyone manages to follow it, and I may have not quite gotten it straight myself.
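Taking the description above at face value, the assignment rule might be sketched as follows. The function name is hypothetical, and the sample IP address is simply a Ball State address used as a stand-in for a Novell server's address.

```python
def ipx_network_from_server_ip(ip_text):
    """Derive an IPX network number from a server's IP address."""
    octets = [int(part) for part in ip_text.split(".")]
    # Write the last two bytes in decimal and concatenate the digits...
    digits = str(octets[2]) + str(octets[3])
    # ...then read the resulting digit string as a hexadecimal number.
    return int(digits, 16)


# Stand-in address (an assumption, not a known Novell server).
print(hex(ipx_network_from_server_ip("147.226.112.102")))  # 0x112102
```

At most six digits result, so the value always fits in the 4 byte network part.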

3.2 DECnet Addresses

Just to point out that there are many alternatives for network protocol addressing schemes, we will look at DECnet addresses for a moment even though we will not talk about DECnet much elsewhere. DECnet uses 16-bit network protocol addresses. The first 6 bits are a network part (which DEC calls an area number to point out the similarity between network parts and area codes in telephone numbers) and the last 10 bits are a host part. DECnet demands that the 48-bit IEEE address be altered to use this two byte value (byte swapped since a VAX is little-endian) as the last two bytes and use the standard 4 byte prefix of AA:00:04:00.
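A small Python sketch of this embedding, assuming the four byte prefix AA:00:04:00 and a function name of our own choosing:

```python
def decnet_mac(area, node):
    """Build the Ethernet address DECnet expects for a given area and node."""
    if not (0 <= area < 64 and 0 <= node < 1024):
        raise ValueError("area must fit in 6 bits and node in 10 bits")
    address = (area << 10) | node          # 16-bit DECnet address
    low, high = address & 0xFF, address >> 8
    # Byte swapped: the low-order byte goes on the wire first.
    return "AA:00:04:00:%02X:%02X" % (low, high)


print(decnet_mac(1, 1))  # AA:00:04:00:01:04
```

Area 1, node 1 gives the 16-bit value 0x0401, which after the byte swap supplies the last two bytes 01:04.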

Additionally, DECnet specifies the addresses of AB:00:00:03:00:00 as a network layer multicast address for all routers to listen to and AB:00:00:04:00:00 as a network layer multicast address for endnodes in a datalink layer network to listen to.

DECnet takes just the opposite approach from IPX. IPX uses more bytes in its addresses than IEEE, so it embeds the IEEE address in the IPX address. DECnet uses fewer bytes in its addresses than IEEE, so it embeds its addresses into the IEEE address.

3.3 IP Addresses

IP addresses are 32-bit values that have no direct relationship to IEEE datalink addresses. Hence, they need a special mechanism for mapping a datalink layer address into an IP address and an IP address into a datalink layer address. This mapping will be considered separately a bit later. The reason for the lack of relationship is the simple historical fact that IP was developed before IEEE LANs such as Ethernet and Token Ring.

IP addresses are divided into several classes by the location of the first 0 bit in the binary expansion of the 32-bit address. Addresses with a 0 in the first bit are class A addresses. Addresses with the first 0 in the second bit are class B addresses, and so on through class E. Classes D and E are not commonly used. Each of the remaining three address classes has a progressively smaller host part. Class A addresses have a 24 bit host part (allowing for millions of hosts in the same network), class B addresses have a 16 bit host part (allowing for thousands of hosts in the same network), and class C addresses have an 8 bit host part (allowing for hundreds of hosts in the same network).
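The class rule can be written out directly as a scan for the first 0 bit. The sketch below is illustrative only; the function and table names are our own.

```python
HOST_BITS = {"A": 24, "B": 16, "C": 8}  # host part sizes for classes A to C


def address_class(ip_text):
    """Return the class letter determined by the first 0 bit of the address."""
    first_octet = int(ip_text.split(".")[0])
    for position, letter in enumerate("ABCD"):
        # Examine bits from the most significant end of the first byte.
        if not first_octet & (0x80 >> position):
            return letter
    return "E"  # the first four bits are all 1's


cls = address_class("147.226.112.102")
print(cls, HOST_BITS[cls])  # B 16
```

Since 147 is 10010011 in binary, the first 0 is in the second bit, so the address is class B.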

To further complicate matters, some network part addresses and some host part addresses are reserved and given special meanings. No interface may be assigned a host part with all binary 1's or all binary 0's. No network may be assigned an address with these two values either. Hence, a class C network can have at most 254 host addresses (256 values for an 8 bit host part, less the two reserved ones) and there can be at most 126 class A networks (128 values for the 7 bit network part, less the two reserved ones).

Furthermore, the network part of an address is broken down into two parts: the logical site part and the physical datalink part. The strange thing is that the logical site part is whatever is left over from the host part and the physical datalink part is embedded in the host part itself. The logical site part is the network identifier for an organization's internetwork (note lower case ``i'') as seen from outside. Internally, an organization's internetwork has some structure at the datalink level. This internal structure is hidden from routers outside the organization. These ``external'' routers deliver datagrams to ``border'' routers that are inside the organization and are aware of its internal structure. These ``border'' routers may then use ``internal'' routers to forward datagrams to the correct datalink layer network.

Essentially, ``external'' routers use the network part of the IP address as an identifier for an organization as described by the class A, B, C scheme. By the time a datagram has reached the right organization, ``internal'' routers use the remaining part of the address as a two part identifier. The part that is associated with the datalink layer structure of the network is called the ``subnet'' part of the IP address, and the remaining part, which identifies a particular interface on that datalink layer network, is misnamed the ``host'' part. Not only does this part not uniquely specify a host (only an interface), the word ``host'' was already used to describe the part left over from the class A, B, C network addressing scheme.

To separate the subnet part of an IP address from the host part of an IP address, each datalink layer network is assigned a 32-bit subnet ``mask.'' The bits in a subnet mask are bitwise ``anded'' with the 32-bit IP address to determine the ``subnet'' address. For example, UCS at Ball State has been assigned the class B network address of 147.226.0.0. The address of our Web server is 147.226.112.102. The subnet mask assigned to the Ethernet network it uses is 255.255.255.0. This mask is commonly used on many class B networks to allow a couple of hundred data link layer networks with a couple of hundred hosts on each. The bitwise and of the mask with the address yields 147.226.112.0 which is treated as the network address of the Ethernet it uses by internal routers. External routers just use the 147.226.0.0 address to get datagrams to a Ball State border router. Hence, we say that our Web server is host 102 on subnet 112 of the BSU IP network.
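The masking arithmetic for the Web server example can be checked with a short Python sketch. It uses the standard ipaddress module only to convert between dotted notation and integers; the function name is our own.

```python
import ipaddress


def split_address(ip_text, mask_text):
    """Return (subnet address, host part) for an IP address and subnet mask."""
    ip = int(ipaddress.IPv4Address(ip_text))
    mask = int(ipaddress.IPv4Address(mask_text))
    subnet = ip & mask                  # bits selected by the mask
    host = ip & ~mask & 0xFFFFFFFF      # bits the mask leaves out
    return str(ipaddress.IPv4Address(subnet)), host


print(split_address("147.226.112.102", "255.255.255.0"))
# ('147.226.112.0', 102)
```

The same function works unchanged for masks whose 1 bits are not contiguous, since it never assumes a prefix length.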

Subnetting can be used more creatively than shown here. While the subnet mask must include 1 bits for the external network part of an address, there is no requirement that the remaining bits start with 1's or that the 1's be all together. In fact, it is not required that all data link layer networks in an organization use the same subnet mask. One would expect a class A network to be extensively subnetted, and class C networks to rarely be subnetted.

4.0 Network Names

Using numeric identifiers for interfaces works well for network protocols, but is quite inconvenient for people. People find mnemonic identifiers much easier to use: such identifiers are usually called ``names.'' Again, each network protocol may define its own naming system unrelated to other network protocols. However, for a machine that supports multiple network protocols, a similar name will typically be assigned by a network administrator who wishes to remain sane. It is crucial to note that any similarity between these names for a given machine is purely a matter of some convention contrived by the network administrator. A machine does not have some network name from which names are derived for each protocol in some universal standard way.

4.1 Netware Names

Individual hosts do not have names in the Netware system, only ``servers'' have names. Names may be fairly long and contain some punctuation (such as dashes and underscores) as well as letters and numbers. For example, at Ball State we have Netware servers with names such as CSSERVER1, BL_MAIN, and CD-REF-BL. The first one is the primary Netware server in the CS Department, the second is the primary Netware server in Bracken Library, and the third is in the reference section of Bracken Library and is used with CD-ROMs.

4.2 DNS Names

The TCP/IP suite uses the Domain Name System (DNS). DNS is quite general and is used to support more than just IP addresses, so not all DNS names have corresponding IP addresses. Some DNS names refer to the DNS name of mail exchangers for hosts that are not continually connected to the Internet, so DNS names that are part of e-mail addresses are often not directly associated with any IP address. Other DNS names refer to other DNS names with no specific reason given (there is always a reason, but it might not be obvious or explicit) for this association. In this case the name doing the referring is called an ``alias'' and the name referred to is called the ``canonical'' name (aliases are not supposed to refer to other aliases). For example, ``www.cs.bsu.edu'' is an alias for a machine with the canonical name ``dns.cs.bsu.edu.''
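The alias mechanism can be pictured with a toy lookup table. This sketch does not talk to any real nameserver: the table layout and function name are invented, and the address 192.0.2.10 is a made-up value from a range reserved for documentation, not the real address of the machine.

```python
# Hypothetical records: an alias entry and an address entry.
RECORDS = {
    "www.cs.bsu.edu": ("CNAME", "dns.cs.bsu.edu"),
    "dns.cs.bsu.edu": ("A", "192.0.2.10"),  # made-up address
}


def resolve(name, records=RECORDS):
    """Return (canonical name, address), following at most one alias."""
    canonical = name
    kind, value = records[canonical]
    if kind == "CNAME":
        canonical = value
        kind, value = records[canonical]
        if kind == "CNAME":
            # Aliases are not supposed to refer to other aliases.
            raise ValueError("alias refers to another alias: %s" % name)
    return canonical, value


print(resolve("www.cs.bsu.edu"))  # ('dns.cs.bsu.edu', '192.0.2.10')
```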

The Domain Name System gets its own name from the fact that it splits up the set of names into ``domains'' that can be separately administered. Every domain has a name that is associated with the name of a ``nameserver.'' Note that the name of a domain is not directly associated with the address of a nameserver, but with its name. Also, a DNS name may be associated with more than one thing. The name ``cs.bsu.edu'' refers to the name ``bsu-cs.bsu.edu'' as an alias, it refers to the name ``bsu-cs.bsu.edu'' as a mail exchanger, and it refers to the name ``dns.cs.bsu.edu'' as a nameserver for the domain named ``cs.bsu.edu.'' (nameserver names are not supposed to be aliases for the canonical name associated with the address to prevent double indirection).

Any domain may be divided into subdomains by the person administering the domain. The ``edu'' domain is divided into subdomains ``bsu.edu,'' ``purdue.edu,'' and ``indiana.edu'' (among others). The ``bsu.edu'' domain is further divided into domains named ``cs.bsu.edu,'' ``ucs.bsu.edu'' and ``cp.bsu.edu'' (designating the Computer Science Department, University Computing Services, and any machine located in the Cooper Physical Science building). There is no causal relationship between DNS domain names and IP network addresses. Two hosts with addresses having different network parts may have names for those addresses that belong to the same domain. Conversely, two hosts whose addresses have the same network part may have names that belong to different domains.

DNS names may also be associated with more than one IP address. In some cases, such a name is used as the name for a host that has multiple interfaces (and hence multiple addresses, since two interfaces should never have the same IP address). In other cases, it is for redundancy. For example, a company providing e-mail services may have several machines with the same DNS name. A system delivering mail may try the address of a machine which happens to be down at the moment and then go down the list of addresses for the DNS name until it finds one that will accept the mail message.

4.3 NetBIOS Names

Like NetWare names, NetBIOS names are not hierarchical and hence limited in scope. However, like a TCP/IP system using DNS, each machine is given a name, but this name is assigned to the machine by the machine's user, not a central naming authority (although the user may abide by the wishes of some central authority).

The notion of a ``workgroup'' has been added to the NetBIOS system as a means of partitioning the namespace. However, it is only a two-level system: one can have a machine named ``nt'' in a workgroup named ``cs'' that is different from a machine named ``nt'' in a workgroup named ``ucs,'' for example. A machine belonging to a particular workgroup need not specify the workgroup name of another machine belonging to the same workgroup in order to get its address. Also, when the user interface displays lists of hosts, it limits the display to the hosts in the same workgroup unless specifically instructed otherwise.

A host may belong to a ``domain'' instead of a workgroup. As far as naming goes they are essentially equivalent with the bizarre twist that a domain and a workgroup with the same name are two distinct sets of machines (although many user interfaces combine these sets). Membership in a ``domain'' is part of a Microsoft authentication system rather than an integral part of the naming system. An individual NetBIOS machine belongs to either a domain or a workgroup, but not both.

5.0 Mapping between Hardware and Protocol Addresses

As we saw earlier, IPX and DECnet addresses were tied closely to the IEEE addresses for the datalink layer with one embedded in the other. IP addresses bear no direct relationship to IEEE datalink addresses. IP requires that each kind of datalink protocol it is used with have some standard procedure for converting between the two kinds of addresses.

Whenever an IP datagram needs to be sent out over a particular interface connected to a shared media network, the operating system needs to determine the destination address before it can instruct the interface on how to encapsulate the datagram in a packet. This process is called ``address resolution'' and protocols to do address resolution are supported by some datalink protocols. Other datalink protocols require that address resolution be accomplished by providing the operating system with a method such as table lookup where a table of datalink addresses is indexed by network protocol addresses.

Reverse address resolution may also need to be done on occasion. That is, given a physical hardware (datalink) address, one must determine the network protocol address that should be used. While any IP address corresponds to at most one Ethernet address, for example, a given Ethernet address may correspond to several IP addresses. In this case, reverse resolution may only need to provide one such address rather than a list of addresses, but it may provide an entire list depending on the intent of the protocol.

5.1 Ethernet ARP/RARP

Ethernet provides an elegant and general mechanism for address resolution and reverse address resolution. Note that these protocols, ARP and RARP respectively, operate at the Ethernet level and are not part of the IP protocol. In fact, the ARP design is not specific to IP: other protocols that use Ethernet can resolve addresses in the same way, and AppleTalk, for example, uses a closely related protocol of its own (AARP).

ARP requests are broadcast by a machine that needs to send to a known network protocol address. The ARP request contains the network protocol address of both the source and the target as well as the hardware address of the source. The ARP reply is not broadcast, but sent back directly to the source. The ARP reply is a copy of the ARP request with the target hardware address added.
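To make the exchange concrete, here is a sketch that builds the 28 byte body of an ARP request and the matching reply for IP over Ethernet, using the field layout of RFC 826. The function names and the sample addresses are illustrative assumptions. Note that in RFC 826 the responder also swaps the sender and target fields and changes the operation code from 1 to 2, which the sketch follows; nothing is actually transmitted.

```python
import socket
import struct

# htype, ptype, hlen, plen, operation, sender MAC, sender IP, target MAC, target IP
ARP_FORMAT = "!HHBBH6s4s6s4s"


def arp_request(source_mac, source_ip, target_ip):
    """Build the body of an ARP request; the target hardware address is unknown."""
    return struct.pack(
        ARP_FORMAT,
        1,               # hardware type: Ethernet
        0x0800,          # protocol type: IP
        6, 4,            # address lengths in bytes
        1,               # operation: request
        source_mac, socket.inet_aton(source_ip),
        b"\x00" * 6, socket.inet_aton(target_ip),
    )


def arp_reply(request, responder_mac):
    """Answer a request: fill in the missing hardware address and swap roles."""
    htype, ptype, hlen, plen, _op, sha, spa, _tha, tpa = struct.unpack(
        ARP_FORMAT, request)
    return struct.pack(ARP_FORMAT, htype, ptype, hlen, plen, 2,
                       responder_mac, tpa, sha, spa)


# Sample addresses are made up for illustration.
request = arp_request(bytes.fromhex("08002079ffcd"),
                      "147.226.112.102", "147.226.112.1")
reply = arp_reply(request, bytes.fromhex("0800207a0001"))
print(len(request), len(reply))  # 28 28
```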

In effect, the ARP request broadcast is asking, ``Interface with network protocol address such and such, where are you?'' The reply is saying, ``Here I am.'' Technically, it is not required that the machine with that network protocol address respond; it is only required that some machine respond. An ``ARP server'' may respond with a reply that says ``there it is.'' An ARP server may be used if a target machine is, for example, not paying attention to broadcasts in order to dedicate all its CPU cycles to doing something more important. Of course, a machine might reply to an ARP request when it should not. Often an IP address is misconfigured on one computer so it duplicates the IP address of another. If both machines are trying to use IP, at least one of them experiences problems. Sometimes, a malicious user will cause such a duplication on purpose (a technique known as ``ARP spoofing'') so as to have one machine impersonate (imcomputerate?) another, and may succeed in doing so if sufficiently clever.

ARP requests are not made every time a network protocol needs to resolve an address. Instead, responses are cached for a few minutes, so requests for a given address are only made every few minutes rather than a few hundred times a second. A response is not kept in the cache indefinitely, though not primarily because of space constraints (one would only expect a few hundred to a few thousand entries to ever occur, since an Ethernet cannot support an unlimited number of hosts effectively). Rather, the hardware address to protocol address mapping may change over time. If a response cached before such a change were kept forever, the system would have no way of sending data to a machine after its Ethernet interface was replaced.

Since an ARP request is broadcast and contains both the network protocol address and the hardware address of the source, all machines on the network are interrupted and presented with an opportunity to cache this information. They might do so to prevent themselves from needing to make an ARP request for this address themselves. However, since the information would expire in a few minutes anyway and would just increase the time required to search the cache for needed information, few systems actually take advantage of this opportunity.

Many operating systems do however have provisions for making entries in the ARP cache ``static'' so they do not time out every few minutes. I recommend using this feature to prevent ARP spoofing, but it requires updating the cache manually every time a hardware address changes.
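The caching behavior described in the last few paragraphs, including ``static'' entries, might be sketched like this. The class is a toy model, not any operating system's implementation; the five minute default lifetime and the injectable clock are assumptions made so the example can be exercised without waiting.

```python
import time


class ArpCache:
    """Toy ARP cache: entries time out unless they are marked static."""

    def __init__(self, lifetime=300.0, clock=time.monotonic):
        self.lifetime = lifetime   # assumed: seconds before an entry expires
        self.clock = clock         # replaceable clock, handy for testing
        self.entries = {}          # IP address -> (hardware address, expiry)

    def store(self, ip, mac, static=False):
        expiry = None if static else self.clock() + self.lifetime
        self.entries[ip] = (mac, expiry)

    def lookup(self, ip):
        """Return the cached hardware address, or None if an ARP request is needed."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, expiry = entry
        if expiry is not None and self.clock() >= expiry:
            del self.entries[ip]   # stale: forget it and resolve again
            return None
        return mac
```

A caller would try lookup first and broadcast an ARP request only when it returns None, storing the reply afterwards.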

RARP performs reverse address resolution to determine a network protocol address corresponding to a known hardware address. It was commonly used by network devices that did not have any form of persistent memory, such as a diskless workstation. Such a device can determine its hardware address by querying its interface device internally. It then proceeds to broadcast a request which includes its own hardware address but no other information. The reply comes from one or more ``RARP servers.'' A RARP server is a machine that has been manually configured to contain a table of physical hardware addresses along with the corresponding network protocol addresses. It is suggested that if RARP is in use on an Ethernet that at least two RARP servers be set up so that machines requiring RARP to boot may do so when one of them is not functioning.

5.2 BOOTP/DHCP

A more advanced mechanism for boot time reverse address resolution is the ``Bootstrap Protocol'' (BOOTP). Like RARP, BOOTP requires a machine to make a broadcast to query for information that will allow it to participate in a network layer protocol. Also like RARP it can get this information from a server that is waiting for such a request. Unlike RARP, which is part of the datalink layer, BOOTP is an application layer protocol using UDP as a transport protocol. Hence, the machine must be capable of making a limited UDP broadcast to the IP address 255.255.255.255 before it can make the request. The server software may need to broadcast the reply if it is not a privileged application that can modify the ARP cache (the machine making the request certainly is not in a position to respond to ARP requests for its own address yet, since it doesn't know it).

At a bare minimum, a machine needs to know its own IP address in order to participate in normal IP communication. It should also know things like the subnet mask and the IP address of at least one router and at least one DNS nameserver, although in a pinch one can get by without that additional information for limited operation. BOOTP can also be used to specify the name of a machine to consult for further booting instructions and the name of a file to request from that machine with those instructions. Some machines actually receive their operating system in this file, with BOOTP/UDP being built into the boot ROM to initiate the boot process. Since BOOTP provides a router address, the machine holding the file with further boot instructions need not be part of the same subnet as the machine that is booting: it can be anywhere on the internet.

BOOTP was designed to be very simple with a fixed number of fixed length fields so it could be burned into small boot ROMs. However, if one enhances the protocol with additional fields in the response, all sorts of interesting information can be made available to a machine in the response. The Dynamic Host Configuration Protocol (DHCP) is just such an extension to BOOTP. A DHCP server can respond to BOOTP requests with limited information or to DHCP requests with whatever additional information it has in its database associated with the hardware address for the machine making the request. The primary advantage of DHCP is that it allows for autoconfiguration of machines. These machines may, in fact, be capable of storing this configuration information in persistent storage. But modifying persistent storage on a bunch of machines is time consuming and tedious, especially if it requires being physically present in the same room with machines scattered in many different rooms.

The odd part about DHCP is that it has a mechanism for ``leasing'' an IP address. Rather than have a static relationship between hardware address and IP address, DHCP sets up a dynamic relationship. It assigns an IP address to a requestor, and gives it a time limit on how long the address should be used. The host should request a renewal of this lease each time it boots or when the lease is half over. Thus, DHCP allows the server to learn about hardware addresses on the fly and keep track of the mapping between them. Since the DHCP server may assign a name as part of the additional information provided, it may also inform the naming system of this newly formed correspondence or it may require the host to do so. A host that fails to request a lease renewal or fails to get its lease renewed when it should can seriously confuse the network administrator (a person who is usually confused anyway despite being intelligent and well educated).
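The bookkeeping behind leasing can be sketched with a toy lease table. This is not the DHCP wire protocol, only the server-side idea; the class, the function, and the sample address pool (taken from a range reserved for documentation) are assumptions made for illustration.

```python
class LeaseTable:
    """Toy model of a server handing out addresses for a limited time."""

    def __init__(self, pool, lease_seconds):
        self.free = list(pool)          # addresses not currently leased
        self.lease_seconds = lease_seconds
        self.leases = {}                # hardware address -> (IP, start time)

    def request(self, mac, now):
        """Grant or renew a lease; returns (IP address, expiry time)."""
        if mac in self.leases:
            ip, _start = self.leases[mac]   # a renewal keeps the same address
        else:
            ip = self.free.pop(0)           # raises IndexError if pool is empty
        self.leases[mac] = (ip, now)
        return ip, now + self.lease_seconds

    def expire(self, now):
        """Return addresses whose leases were never renewed to the pool."""
        for mac, (ip, start) in list(self.leases.items()):
            if now >= start + self.lease_seconds:
                del self.leases[mac]
                self.free.append(ip)


def renewal_due(start, lease_seconds, now):
    """A host should ask for renewal once the lease is half over."""
    return now >= start + lease_seconds / 2
```

The table is how the server learns hardware addresses on the fly: an entry appears the first time a machine asks and disappears if the machine stops renewing.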

DHCP may provide information that is not strictly limited to IP. For example, it may provide information such as a NetBIOS host name, a NetBIOS workgroup name, or a preferred NetWare server.

6.0 Mapping between Names and Protocol Addresses

Previously, we described the structure of names for several network protocol systems. Here we will examine how the naming systems actually work. For example, given a name, how does a system translate it into a (list of) protocol address(es)? Or, given a protocol address, how does a system translate it into a (list of) name(s)?

6.1 Netware

Netware assigns names only to servers, not to all machines on a network. Servers advertise their services with periodic broadcasts. Not only does this allow Microsoft Windows to browse for resources offered by Netware servers on a particular Ethernet, but it allows IPX routers to be made aware of the names and addresses of all named machines on a particular datalink layer network. An IPX router is attached to two or more datalink layer networks and forwards these broadcasts (perhaps less frequently and perhaps after aggregating the information from several different servers) to all of the datalink layer networks it is connected to.

This system requires each IPX router in an IPX internet to have a complete list of all names in that IPX internet. Clearly, such a requirement adversely affects the scalability of such an internet (note the lower case ``i'').

6.2 DNS

DNS does not require any machine to have complete knowledge of all names in the Internet (there were millions last time someone attempted a count). For each domain in the DNS, at least two nameservers are registered as being ``authoritative'' for that domain. These two nameservers are to have no common point of failure, which implies that they be sufficiently distant as to get their power from separate electric generators and have independent routes to the Internet. Generally, this means that anyone wishing to set up a DNS domain must rely on some off-site support or fudge a bit on the application for the domain registration.

Generally, one of these nameservers is designated as the ``primary'' nameserver, which actually holds a database for the root ``zone'' of the domain and refers to authoritative nameservers for any subdomains of the domain. Other nameservers may be ``secondary'' for a zone, and do a ``zone transfer'' from the primary to obtain their copy of the database. Secondary nameservers might not all be ``authoritative,'' in the sense that they may not specifically be listed as such in the zone database. Whenever the primary is updated, the serial number for the zone database is increased, so a secondary may simply check the serial number of the database before doing a complete zone transfer. Zone transfers are a ``pull'' initiated by the secondary rather than a ``push'' initiated by the primary. Hence, secondaries need to periodically poll the primary. The polling interval is a property specified in the zone database.
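The serial number check can be sketched as follows. This is a minimal Python model rather than real nameserver code: the class names Primary and Secondary are invented for illustration, the zone is just a dictionary, and serial numbers are compared as plain integers on the assumption that they never wrap around.

```python
class Primary:
    """Holds the master copy of a zone database (toy model)."""

    def __init__(self, records):
        self.records = dict(records)
        self.serial = 1

    def update(self, name, address):
        self.records[name] = address
        self.serial += 1          # every update increases the serial number


class Secondary:
    """Pulls its copy of the zone from a primary."""

    def __init__(self):
        self.records = {}
        self.serial = 0           # 0 means "no copy yet" in this sketch
        self.transfers = 0        # number of full zone transfers performed

    def poll(self, primary):
        # Cheap check first: compare serial numbers only.
        if primary.serial <= self.serial:
            return False          # already up to date, no transfer needed
        # The "pull": copy the whole zone database.
        self.records = dict(primary.records)
        self.serial = primary.serial
        self.transfers += 1
        return True
```

Between an update on the primary and the secondary's next poll, the secondary keeps answering from its older copy.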

Any particular DNS nameserver may be slightly out of date while it has not yet performed a zone transfer from a primary for a domain (yes, there can be more than one primary, all or none of which might actually be authoritative, but keeping them synchronized is not part of the protocol).

The nameservers for a subdomain may be the same as for the domain or they may not. Thus, a nameserver might be responsible for many zones, all of which may form a complete subtree of a domain, or it might be responsible for a single zone in an arbitrary portion of the subtree (not necessarily a leaf of the tree).

A host requesting information about a DNS name has a list of addresses of DNS nameservers. If no response is obtained after a while when querying the first address, the second address is queried, and so on down the list. A response might be positive or negative, in addition to being authoritative or non-authoritative. A positive response indicates that the information was successfully obtained. A negative response indicates that the requested information did not exist for the specified name. Alternatively, an iterative response indicates that the nameserver did not have the information, but is suggesting the address(es) of nameservers that are more likely to have the information. These nameservers may also give iterative responses until the host actually tracks down a nameserver that can give a definitive answer, either a positive one or a negative one.
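The host's side of this exchange can be sketched in Python. Everything here is a simulation with no real network traffic: the functions make_server and resolve, the response labels, and the idea of modeling the network as a dictionary from server address to answering function (with missing entries standing for servers that never respond) are all assumptions for illustration.

```python
def make_server(zone, referrals):
    """Build a toy nameserver.

    zone maps names to addresses; referrals maps a name suffix to the
    addresses of nameservers more likely to know names with that suffix.
    """
    def answer(name):
        if name in zone:
            return ("positive", zone[name])
        for suffix, addresses in referrals.items():
            if name.endswith(suffix):
                return ("iterative", addresses)
        return ("negative", None)
    return answer


def resolve(name, server_addresses, network, max_steps=10):
    """Work down a list of nameserver addresses, following iterative responses."""
    to_try = list(server_addresses)
    for _ in range(max_steps):
        if not to_try:
            break
        address = to_try.pop(0)
        server = network.get(address)
        if server is None:
            continue              # no response: move on to the next address
        kind, data = server(name)
        if kind == "iterative":
            to_try = list(data)   # start over with the suggested nameservers
            continue
        return (kind, data)       # definitive answer, positive or negative
    return ("no-response", None)
```

The max_steps limit is a guard added for this sketch so that a loop of mutual referrals cannot keep the host querying forever.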

Typically, when a nameserver does not have the requested information in its local zone database(s), it will request the information from another nameserver itself. This nameserver may go through the same process, forming a recursive rather than an iterative process to track down the information. Alternatively, some combination of iteration and recursion may occur. Generally, any information returned from one nameserver to another for a recursive response is cached. If the same query is made in the future, a response can be made from the cached information rather than requiring the name system to track down the authoritative server each time. A nameserver will typically receive the same request many times from its clients, so caching is critical to the performance of the system.
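The effect of caching on a recursive lookup can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the class CachingNameserver is hypothetical, each nameserver has at most one other nameserver to ask, and cached entries never expire (a real nameserver would eventually discard them).

```python
class CachingNameserver:
    """Toy nameserver that resolves recursively and caches what it learns."""

    def __init__(self, zone, upstream=None):
        self.zone = dict(zone)       # names this server holds in its own zone database
        self.upstream = upstream     # another nameserver to ask, or None
        self.cache = {}              # answers learned from other nameservers
        self.upstream_queries = 0    # how often we had to ask someone else

    def lookup(self, name):
        if name in self.zone:
            return self.zone[name]
        if name in self.cache:
            return self.cache[name]  # answered without any further queries
        if self.upstream is None:
            return None              # nobody left to ask
        self.upstream_queries += 1
        answer = self.upstream.lookup(name)   # the recursive step
        if answer is not None:
            self.cache[name] = answer
        return answer
```

Chaining three of these together shows the point made above: the first query for a name travels the whole chain, while a repeat of the same query is answered from the first nameserver's cache.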

Furthermore, a nameserver may respond with additional information that was not specifically requested. For example, if a request is made to find the list of names of the nameservers for a domain, the response might additionally include the addresses for those nameservers as well. If a request is made to find the name of a mail exchanger for a given name, again, the address might be additionally supplied. This additional information may be cached along with the information specifically requested. If, as might be expected, the initiator of the original request makes the anticipated followup request of its nameserver, the nameserver will be able to supply the information immediately from its cache rather than track the information down recursively or iteratively.

Unfortunately, a mischievous nameserver administrator might have his/her nameserver respond with incorrect additional information and corrupt the cache of any nameserver on the Internet that consults his/her nameserver either directly or indirectly in the course of a recursive name lookup. Hence, although caching of additional information records is critical to the performance of DNS, it is a potential security problem. I recommend that systems take steps to flush their caches more frequently than required and use caches that are as independent of other systems as possible.

Note that nameservers cache DNS information. Individual hosts may or may not do so. Most Unix systems I am familiar with do not cache DNS information unless they run the nameserver software. To speed up DNS, the Ball State Computer Science Department runs an authoritative, primary nameserver for its domain (cs.bsu.edu) on the same Unix system as its DHCP server and its Web server. This machine is also a router with a single interface having both the 147.226.112.102 address and the 204.75.248.1 address. In addition, it runs a nonauthoritative, secondary nameserver on its primary timesharing machine. This machine then forms its own DNS cache which allows it to resolve queries for cached information without creating any network traffic or delay.

6.3 NetBIOS

NetBIOS names, used in Microsoft's networking protocols for Windows and the older LANManager product, can become interrelated with DNS names. Recall that NetBIOS can be encapsulated in a number of different network protocols. It can be encapsulated by NetBEUI, which is not a routable protocol, or it can be encapsulated by either IPX or IP.

For NetBIOS/NetBEUI, computers advertise their names in a manner similar to NetWare server advertisements (SAP) in the sense that broadcasts are periodically made to advertise a NetBIOS name to datalink address correspondence. However, additionally, a machine on the network is elected to be the ``browser'' for the network. The ``browser'' caches the name to datalink address associations. A host wishing to resolve a name into an address can contact this browser and request the information. Additionally, both the advertising protocol and the browsing protocol may allow for additional information about a machine with a specific name beyond just its datalink address.

For NetBIOS/IPX, a host may simply choose to participate in the NetWare SAP protocol to advertise some information about itself including its name and address. However, the SAP protocol as implemented by Novell and Microsoft is not as compatible as it should be. In particular, Windows 95 machines may be configured to do SAP advertisements, but not respond to SAP queries. In this case, a machine running Novell software may see the SAP advertisement and direct a SAP query to the Windows 95 machine. When the Windows 95 machine fails to respond, the machine running Novell software is unable to boot. The basic problem is that either Microsoft is not implementing the SAP protocol as fully as it should, or Novell is making the assumption that implementations of the SAP protocol go beyond the published specification.

For NetBIOS/IP, the NetBIOS name need not be the same as the DNS name. The normal method for NetBIOS/IP name resolution is to use a Windows Internet Name Server (WINS). A WINS server keeps a database of the NetBIOS names that correspond to an IP address as each host registers its NetBIOS name with it. WINS has a different operating philosophy than DNS: WINS is not authoritarian, dictating what names should be associated with a particular address; instead, it allows each host to register its name and address. A NetBIOS machine may register itself with as many WINS servers as it wishes and may specify different names to each: in this way WINS and DNS are similar in the sense that many names may correspond to the same IP address. However, WINS servers do not generally consult each other to resolve names that they do not know how to resolve. No established hierarchy of domains and authority exists in a worldwide standard.
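The registration model can be contrasted with DNS in a short Python sketch. The class WinsServer and its methods are hypothetical and do not follow the actual WINS protocol; in particular, the sketch simply overwrites an existing entry when a name is registered again, which is an assumption made to keep the example small.

```python
class WinsServer:
    """Toy name server that believes whatever each host registers."""

    def __init__(self):
        self.names = {}                  # NetBIOS name -> IP address

    def register(self, name, ip):
        # The host, not the server's administrator, decides the name.
        self.names[name] = ip

    def resolve(self, name):
        # No hierarchy and no other server to consult: unknown means unknown.
        return self.names.get(name)

    def names_for(self, ip):
        # Many names may correspond to the same IP address.
        return sorted(n for n, a in self.names.items() if a == ip)
```

A host that registers as one name with the first server and as another name with the second is known to each server only under the name it gave that server, since neither server asks the other.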

If one wishes for the NetBIOS names to correspond to DNS names, several mechanisms exist. One may simply instruct a NetBIOS machine to consult a DNS server instead of consulting a WINS server in order to resolve names into addresses. Or, one may configure a machine as a WINS-DNS gateway to consult the DNS when a machine broadcasts a NetBIOS name query. Further confusion results when one wishes to use the NetBIOS browsing mechanism to browse a NetBIOS network that has elements on Ethernets separated by IP routers. The browsers get information generated by broadcasts that are filtered by a standard router. More on this later.


Last Modified: 12:41pm , September 30, 1996