RFC 3467: Role of the Domain Name System (DNS)

1. Introduction and History

   The DNS was designed as a replacement for the older "host table"
   system.  Both were intended to provide names for network resources at
   a more abstract level than network (IP) addresses (see, e.g.,
   [RFC625], [RFC811], [RFC819], [RFC830], [RFC882]).  In recent years,
   the DNS has become a database of convenience for the Internet, with
   many proposals to add new features.  Only some of these proposals
   have been successful.  Often the main (or only) motivation for using
   the DNS is that it exists and is widely deployed, not that its
   existing structure, facilities, and content are appropriate for the
   particular application or data involved.  This document reviews the
   history of the DNS, including examination of some of those newer
   applications.  It then argues that the overloading process is often
   inappropriate.  Instead, it suggests that the DNS should be
   supplemented by systems better matched to the intended applications
   and outlines a framework and rationale for one such system.

   Several of the comments that follow are somewhat revisionist.  Good
   design and engineering often require a level of intuition by the
   designers about things that will be necessary in the future; the
   reasons for some of these design decisions are not made explicit at
   the time because no one is able to articulate them.  The discussion
   below reconstructs some of the decisions about the Internet's primary
   namespace (the "Class=IN" DNS) in the light of subsequent development
   and experience.  In addition, the historical reasons for particular
   decisions about the Internet were often severely underdocumented
   contemporaneously and, not surprisingly, different participants have
   different recollections about what happened and what was considered
   important.  Consequently, the quasi-historical story below is just
   one story.  There may be (indeed, almost certainly are) other stories
   about how the DNS evolved to its present state, but those variants do
   not invalidate the inferences and conclusions.

   This document presumes a general understanding of the terminology of
   RFC 1034 [RFC1034] or of any good DNS tutorial (see, e.g., [Albitz]).

1.1. Context for DNS Development

   During the entire post-startup-period life of the ARPANET and nearly
   the first decade or so of operation of the Internet, the list of host
   names and their mapping to and from addresses was maintained in a
   frequently-updated "host table" [RFC625], [RFC811], [RFC952].  The
   names themselves were restricted to a subset of ASCII [ASCII] chosen
   to avoid ambiguities in printed form, to permit interoperation with
   systems using other character codings (notably EBCDIC), and to avoid
   the "national use" code positions of ISO 646 [IS646].  These
   restrictions later became collectively known as the "LDH" rules for
   "letter-digit-hyphen", the permitted characters.  The table was just
   a list with a common format that was eventually agreed upon; sites
   were expected to frequently obtain copies of, and install, new
   versions.  The host tables themselves were introduced to:

   o  Eliminate the requirement for people to remember host numbers
      (addresses).  Despite apparent experience to the contrary in the
      conventional telephone system, numeric numbering systems,
      including the numeric host number strategy, did not (and do not)
      work well for more than a (large) handful of hosts.

   o  Provide stability when addresses changed.  Since addresses -- to
      some degree in the ARPANET and more importantly in the
      contemporary Internet -- are a function of network topology and
      routing, they often had to be changed when connectivity or
      topology changed.  The names could be kept stable even as
      addresses changed.

   o  Provide the capability to have multiple addresses associated with
      a given host to reflect different types of connectivity and
      topology.  Use of names, rather than explicit addresses, avoided
      the requirement that would otherwise exist for users and other
      hosts to track these multiple host numbers and addresses and the
      topological considerations for selecting one over others.
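
   The three purposes listed above can be illustrated with a toy,
   in-memory table.  This is only a sketch: it does not reproduce the
   actual host table file format, and the host names, addresses, and
   function names are hypothetical, chosen for illustration.

```python
# Toy host table: each name maps to one or more addresses.
# Names and addresses are invented (documentation address ranges).
HOST_TABLE = {
    "host-a": ["192.0.2.10", "198.51.100.7"],  # host with two attachments
    "host-b": ["192.0.2.20"],
}


def lookup(name, table=HOST_TABLE):
    """Return the addresses recorded for a host name (case-insensitive)."""
    return list(table.get(name.lower(), []))


def renumber(name, old, new, table=HOST_TABLE):
    """Replace one address of a host; the name itself does not change."""
    key = name.lower()
    table[key] = [new if addr == old else addr for addr in table[key]]
```

   A user who remembers only "host-b" is unaffected when
   renumber("host-b", "192.0.2.20", "203.0.113.5") is applied, and a
   lookup of "host-a" returns both of its addresses.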

   After several years of using the host table approach, the community
   concluded that the model did not scale adequately and that it would not
   adequately support new service variations.  A number of discussions
   and meetings were held which drew several ideas and incomplete
   proposals together.  The DNS was the result of that effort.  It
   continued to evolve during the design and initial implementation
   period, with a number of documents recording the changes (see
   [RFC819], [RFC830], and [RFC1034]).

   The goals for the DNS included:

   o  Preservation of the capabilities of the host table arrangements
      (especially unique, unambiguous, host names),

   o  Provision for adding further services (e.g., the special
      record types for electronic mail routing which quickly followed
      introduction of the DNS), and

   o  Creation of a robust, hierarchical, distributed, name lookup
      system to accomplish the other goals.

   The DNS design also permitted distribution of name administration,
   rather than requiring that each host be entered into a single,
   central, table by a central administration.

1.2. Review of the DNS and Its Role as Designed

   The DNS was designed to identify network resources.  Although there
   was speculation about including, e.g., personal names and email
   addresses, it was not designed primarily to identify people, brands,
   etc.  At the same time, the system was designed with the flexibility
   to accommodate new data types and structures, both through the
   addition of new record types to the initial "INternet" class, and,
   potentially, through the introduction of new classes.  Since the
   appropriate identifiers and content of those future extensions could
   not be anticipated, the design provided that these fields could
   contain any (binary) information, not just the restricted text forms
   of the host table.

   However, the DNS, as it is actually used, is intimately tied to the
   applications and application protocols that utilize it, often at a
   fairly low level.

   In particular, despite the ability of the protocols and data
   structures themselves to accommodate any binary representation, DNS
   names as used were historically not even unrestricted ASCII, but a
   very restricted subset of it, a subset that derives from the original
   host table naming rules.  Selection of that subset was driven in part
   by human factors considerations, including a desire to eliminate
   possible ambiguities in an international context.  Hence character
   codes that had international variations in interpretation were
   excluded, the underscore character and case distinctions were
   eliminated as being confusing (in the underscore's case, with the
   hyphen character) when written or read by people, and so on.  These
   considerations appear to be very similar to those that resulted in
   similarly restricted character sets being used as protocol elements
   in many ITU and ISO protocols (cf. [X29]).
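
   The restricted subset described above (letters, digits, and hyphen,
   with no case distinctions and no underscore) can be sketched as a
   simple check.  The length limits (63 octets per label, 253
   characters per name) and the acceptance of a leading digit are
   assumptions taken from later DNS and host requirements
   specifications, not from the text above, and the function names are
   hypothetical.

```python
import re

# One label: letters, digits, hyphens; no leading or trailing hyphen;
# 1 to 63 characters (length limit is an assumption, see lead-in).
_LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_ldh_name(name):
    """Return True if every dot-separated label follows the LDH convention."""
    if name.endswith("."):
        name = name[:-1]  # accept a single trailing root dot
    if not name or len(name) > 253:
        return False
    return all(_LDH_LABEL.fullmatch(label) for label in name.split("."))


def same_name(a, b):
    """Compare two names without regard to ASCII case."""
    return a.rstrip(".").lower() == b.rstrip(".").lower()
```

   Under this sketch, "my_host.example.com" is rejected because of the
   underscore, while "Example.COM" and "example.com" compare as the
   same name.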

   Another assumption was that there would be a high ratio of physical
   hosts to second level domains and, more generally, that the system
   would be deeply hierarchical, with most systems (and names) at the
   third level or below and a very large percentage of the total names
   representing physical hosts.  There are domains that follow this
   model: many university and corporate domains use fairly deep
   hierarchies, as do a few country-oriented top level domains
   ("ccTLDs").  Historically, the "US." domain has been an excellent
   example of the deeply hierarchical approach.  However, by 1998,
   comparison of several efforts to survey the DNS showed a count of SOA
   records that approached (and may have passed) the number of distinct
   hosts.  Looked at differently, we appear to be moving toward a
   situation in which the number of delegated domains on the Internet is
   approaching or exceeding the number of hosts, or at least the number
   of hosts able to provide services to others on the network.  This
   presumably results from synonyms or aliases that map a great many
   names onto a smaller number of hosts.  While experience up to this
   time has shown that the DNS is robust enough -- given contemporary
   machines as servers and current bandwidth norms -- to be able to
   continue to operate reasonably well when those historical assumptions
   are not met (e.g., with a flat structure under ".COM" containing
   well over ten million delegated subdomains [COMSIZE]), it is still
   useful to remember that the system could have been designed to work
   optimally with a flat structure (and very large zones) rather than a
   deeply hierarchical one, and was not.

   Similarly, despite some early speculation about entering people's
   names and email addresses into the DNS directly (e.g., see
   [RFC1034]), electronic mail addresses in the Internet have preserved
   the original, pre-DNS, "user (or mailbox) at location" conceptual
   format rather than a flatter or strictly dot-separated one.
   Location, in that instance, is a reference to a host.  The sole
   exception, at least in the "IN" class, has been one field of the SOA
   record.
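
   The SOA field in question (RNAME in RFC 1035) stores a mailbox as a
   domain name: the first unescaped dot stands in for the "@", and a
   dot inside the local part is written with a backslash escape.  The
   following is a minimal sketch of the reverse conversion, assuming
   presentation-format input; the function name is hypothetical.

```python
def rname_to_mailbox(rname):
    """Convert an SOA RNAME in presentation form to a mailbox address."""
    local = []
    i = 0
    while i < len(rname):
        ch = rname[i]
        if ch == "\\" and i + 1 < len(rname):
            # Escaped character (such as a dot) belongs to the local part.
            local.append(rname[i + 1])
            i += 2
        elif ch == ".":
            # First unescaped dot separates local part from the domain.
            domain = rname[i + 1:].rstrip(".")
            return "".join(local) + "@" + domain
        else:
            local.append(ch)
            i += 1
    raise ValueError("no unescaped dot found in RNAME")
```

   For example, "hostmaster.example.com." corresponds to the mailbox
   "hostmaster@example.com".  Everywhere else, the "user at location"
   form is kept outside the DNS name itself.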

   Both the DNS architecture itself and the two-level (host name and
   mailbox name) provisions for email and similar functions (e.g., see
   the finger protocol [FINGER]), also anticipated a relatively high
   ratio of users to actual hosts.  Despite the observation in RFC 1034
   that the DNS was expected to grow to be proportional to the number of
   users (section 2.3), it has never been clear that the DNS was
   seriously designed for, or could, scale to the order of magnitude of
   number of users (or, more recently, products or document objects),
   rather than that of physical hosts.

   Just as was the case for the host table before it, the DNS provided
   critical uniqueness for names, and universal accessibility to them,
   as part of overall "single internet" and "end to end" models (cf.
   [RFC2826]).  However, there are many signs that, as new uses evolved
   and original assumptions were abused (if not violated outright), the
   system was being stretched to, or beyond, its practical limits.

   The original design effort that led to the DNS included examination
   of the directory technologies available at the time.  The design
   group concluded that the DNS design, with its simplifying assumptions
   and restricted capabilities, would be feasible to deploy and make
   adequately robust, which the more comprehensive directory approaches
   were not.  At the same time, some of the participants feared that the
   limitations might cause future problems; this document essentially
   takes the position that they were probably correct.  On the other
   hand, directory technology and implementations have evolved
   significantly in the ensuing years: it may be time to revisit the
   assumptions, either in the context of the two- (or more) level
   mechanism contemplated by the rest of this document or, even more
   radically, as a path toward a DNS replacement.

1.3. The Web and User-visible Domain Names

   From the standpoint of the integrity of the domain name system -- and
   scaling of the Internet, including optimal accessibility to content
   -- the web design decision to use "A record" domain names directly in
   URLs, rather than some system of indirection, has proven to be a
   serious mistake in several respects.  Convenience of typing, and the
   desire to make domain names out of easily-remembered product names,
   has led to a flattening of the DNS, with many people now perceiving
   that second-level names under COM (or in some countries, second- or
   third-level names under the relevant ccTLD) are all that is
   meaningful.  This perception has been reinforced by some domain name
   registrars [REGISTRAR] who have been anxious to "sell" additional
   names.  And, of course, the perception that one needed a second-level
   (or even top-level) domain per product, rather than having names
   associated with a (usually organizational) collection of network
   resources, has led to a rapid acceleration in the number of names
   being registered.  That acceleration has, in turn, clearly benefited
   registrars charging on a per-name basis, "cybersquatters", and others
   in the business of "selling" names, but it has not obviously
   benefited the Internet as a whole.

   This emphasis on second-level domain names has also created a problem
   for the trademark community.  Since the Internet is international,
   and names are being populated in a flat and unqualified space,
   similarly-named entities are in conflict even if there would
   ordinarily be no chance of confusing them in the marketplace.  The
   problem appears to be unsolvable except by a choice between draconian
   measures.  These might include significant changes to the legislation
   and conventions that govern disputes over "names" and "marks".  Or

   they might result in a situation in which the "rights" to a name are
   typically not settled using the subtle and traditional product (or
   industry) type and geopolitical scope rules of the trademark system.
   Instead they have depended largely on political or economic power,
   e.g., the organization with the greatest resources to invest in
   defending (or attacking) names will ultimately win out.  The latter
   raises not only important issues of equity, but also the risk of
   backlash as the numerous small players are forced to relinquish names
   they find attractive and to adopt less-desirable naming conventions.

   Independent of these sociopolitical problems, content distribution
   issues have made it clear that it should be possible for an
   organization to have copies of data it wishes to make available
   distributed around the network, with a user who asks for the
   information by name getting the topologically-closest copy.  This is
   not possible with simple, as-designed, use of the DNS: DNS names
   identify target resources or, in the case of email "MX" records, a
   preferentially-ordered list of resources "closest" to a target (not
   to the source/user).  Several technologies (and, in some cases,
   corresponding business models) have arisen to work around these
   problems, including intercepting and altering DNS requests so as to
   point to other locations.
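
   The "preferentially-ordered list" that MX records provide can be
   sketched as follows.  The point of the sketch is that the ordering
   depends only on data published for the target, so every querier
   sees the same order regardless of its own location.  The record
   values and the function name are hypothetical; lower preference
   values are tried first, as in the mail specifications.

```python
def order_mx(records):
    """Return exchange names sorted by ascending MX preference value.

    `records` is an iterable of (preference, exchange) pairs.  The sort
    is stable, so records with equal preference keep their input order.
    """
    return [exchange for _, exchange in sorted(records, key=lambda r: r[0])]


# Hypothetical MX data for one mail domain.
EXAMPLE_MX = [
    (20, "backup.example.net"),
    (10, "mail.example.net"),
]
```

   Calling order_mx(EXAMPLE_MX) yields the primary exchange first and
   the backup second for every client; nothing in the data lets the
   answer vary with the topological position of the user.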

   Additional implications are still being discovered and evaluated.

   Approaches that involve interception of DNS queries and rewriting of
   DNS names (or otherwise altering the resolution process based on the
   topological location of the user) seem, however, to risk disrupting
   end-to-end applications in the general case and raise many of the
   issues discussed by the IAB in [IAB-OPES].  These problems occur even
   if the rewriting machinery is accompanied by additional workarounds
   for particular applications.  For example, security associations and
   applications that need to identify "the same host" often run into
   problems if DNS names or other references are changed in the network
   without participation of the applications that are trying to invoke
   the associated services.

1.4. Internet Applications Protocols and Their Evolution

   At the applications level, few of the protocols in active,
   widespread, use on the Internet reflect either contemporary knowledge
   in computer science or human factors or experience accumulated
   through deployment and use.  Instead, protocols tend to be deployed
   at a just-past-prototype level, usually including the kinds of
   expedient compromises typical of prototypes.  If they prove useful,
   the nature of the network permits very rapid dissemination (i.e.,
   they fill a vacuum, even if a vacuum that no one previously knew
   existed).  But, once the vacuum is filled, the installed base
   provides its own inertia: unless the design is so seriously faulty as
   to prevent effective use (or there is a widely-perceived sense of
   impending disaster unless the protocol is replaced), future
   developments must maintain backward compatibility and workarounds for
   problematic characteristics rather than benefiting from redesign in
   the light of experience.  Applications that are "almost good enough"
   prevent development and deployment of high-quality replacements.

   The DNS is both an illustration of, and an exception to, parts of
   this pessimistic interpretation.  It was a second-generation
   development, with the host table system being seen as at the end of
   its useful life.  There was a serious attempt made to reflect the
   computing state of the art at the time.  However, deployment was much
   slower than expected (and very painful for many sites) and some fixed
   (although relaxed several times) deadlines from a central network
   administration were necessary for deployment to occur at all.
   Replacing it now, in order to add functionality, while it continues
   to perform its core functions at least reasonably well, would
   presumably be extremely difficult.

   There are many, perhaps obvious, examples of this.  Despite many
   known deficiencies and weaknesses of definition, the "finger" and
   "whois" [WHOIS] protocols have not been replaced (despite many
   efforts to update or replace the latter [WHOIS-UPDATE]).  The Telnet
   protocol and its many options drove out the SUPDUP [RFC734] one,
   which was arguably much better designed for a diverse collection of
   network hosts.  A number of efforts to replace the email or file
   transfer protocols with models which their advocates considered much
   better have failed.  And, more recently and below the applications
   level, there is some reason to believe that this resistance to change
   has been one of the factors impeding IPv6 deployment.
