Proceedings of Name Collisions Workshop Available

Burt Kaliski | Mar 26, 2014

Presentations, papers and video recordings from the name collisions workshop held earlier this month in London are now available at the workshop web site, namecollisions.net.

The goal for the workshop, described in my “colloquium on collisions” post, was that researchers and practitioners would “speak together” to keep name spaces from “striking together.” The program committee put together an excellent set of talks toward this purpose, providing a strong, objective technical foundation for dialogue. I’m grateful to the committee, speakers, attendees and organizers for their contributions to a successful two-day event, which I hope will benefit the security and stability of Internet naming for many days to come.

Keynote speaker and noted security industry commentator Bruce Schneier (Co3 Systems) set the tone for the two days with a discussion on how humans name things and the shortcomings of computers in doing the same. Names require context, he observed, and “computers are really bad at this” because “everything defaults to global.” Referring to the potential that new gTLDs could conflict with internal names in installed systems, he commented, “It would be great if we could go back 20 years and say ‘Don’t do that’,” but concluded that policymakers have to work with DNS the way it is today.

Bruce said he remains optimistic about long-term prospects as name collisions and other naming challenges are resolved:  “I truly expect computers to adapt to us as humans,” to provide the same kind of trustworthy interactions that humans have developed in their communications with one another.

After the keynote, Matt Thomas (Verisign) and Andrew Simpson (Verisign) shared their research on DNS query analysis techniques, which was well placed as a technical introduction to the day’s topics.  In the Q&A that followed, Olaf Kolkman (NLnet Labs) noted that “throughout this whole conference” researchers are looking for the “underlying theory” of name collisions, building on work such as Matt and Andy’s.  

Colin Strutt (Interisle) followed with a review of query traffic to corp.com, which receives around 1,400 queries per minute, half of the form <3ld>.corp.com, and many with the higher-risk “underscore” patterns reported in root server traffic, thus providing a basis for studying these kinds of queries and the systems that generate them. Some of the queries to corp.com originate from installed systems that employ “.CORP” as an internal name (through the addition of “.com” by search-list processing during DNS resolution). Paul Mockapetris (inventor of the DNS, whom we were privileged to have as a guest at the workshop) shared a memorable criticism of the long-ago practice that led to this potential for confusion:

Use of domain names like .corp "bad idea at the time"; people who recommended it "guilty of malpractice" -- Paul Mockapetris #namecollision

— Burt Kaliski Jr. (@modulomathy) March 9, 2014
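
To make the search-list mechanism behind the corp.com traffic concrete, here is a minimal sketch of how a stub resolver might expand a name. It is a simplified model, not the behavior of any particular resolver: the function name `candidate_names`, the `ndots` threshold and the ordering rule are assumptions chosen for illustration.

```python
def candidate_names(name, search_list, ndots=1):
    """Return the fully qualified names a stub resolver might try, in order.

    Simplified model (an assumption, not a specific resolver's logic):
    names with at least `ndots` dots are tried as-is first, then with each
    search-list suffix; shorter names try the suffixes first.
    """
    if name.endswith("."):
        # Already fully qualified: no search-list processing applies.
        return [name]
    absolute = name + "."
    suffixed = [f"{name}.{suffix.rstrip('.')}." for suffix in search_list]
    if name.count(".") >= ndots:
        return [absolute] + suffixed
    return suffixed + [absolute]


# An internal name under ".corp" on a host whose search list contains "com":
print(candidate_names("mail.corp", ["com"]))
```

In this model, the first candidate (`mail.corp.`) is the kind of query that can leak to the root, and the second (`mail.corp.com.`) is the kind that arrives at corp.com.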

Joe Abley (Dyn) then presented on behalf of Jim Reid (RTFM LLP), offering a preliminary review of query traffic to the root that mistakenly treats the root as a recursive name server, concluding that misconfigurations and rogue applications probably explain most of the so-called “RD=1” behavior and that it’s “not a major issue” but more testing should be done.
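
For readers unfamiliar with the “RD=1” shorthand: RD (recursion desired) and RA (recursion available) are single bits in the flags field of the 12-byte DNS message header. The sketch below shows one way to read them from raw message bytes. The helper names `build_header` and `header_flags` are hypothetical, and this is not the analysis method used in the paper.

```python
import struct

# Bit positions within the 16-bit flags field of the DNS header (RFC 1035).
RD_MASK = 0x0100  # recursion desired, set by the querier
RA_MASK = 0x0080  # recursion available, set by the responder


def build_header(query_id, rd):
    """Build a bare 12-byte query header with one question (illustrative only)."""
    flags = RD_MASK if rd else 0
    # Fields: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT
    return struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)


def header_flags(message):
    """Extract the RD and RA bits from the start of a DNS message."""
    if len(message) < 12:
        raise ValueError("DNS header must be at least 12 bytes")
    _query_id, flags = struct.unpack("!HH", message[:4])
    return {"rd": bool(flags & RD_MASK), "ra": bool(flags & RA_MASK)}


# A query with RD set is asking the server to recurse on its behalf,
# which a root server will not do.
print(header_flags(build_header(0x1234, rd=True)))
```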

In the afternoon, Geoff Huston (APNIC) described his use of advertising networks to measure end-system behavior on IPv6 and DNS.  He sets up unattractive ads (the kind people are unlikely to be drawn to click on) and embeds content that will automatically test an end-system’s use of IPv6 or DNS when the ad is served, whether or not it’s clicked.  There’s no problem if a user clicks, but the less attractive the ad is, the more copies of the ad will be served to ensure that someone clicks (and the advertising network gets paid); thus the more measurements he collects.  It’s a clever technique for actively probing the DNS ecosystem to correlate specific queries with observations at the authoritative servers, rather than looking at the observations alone.

Matt Thomas gave another talk, joint work with Andy Simpson and Yannis Labrou (Verisign), on the statistical invalidity of SLD blocking.  Like other talks, it was followed by a well-rounded set of questions from the audience that helped both to validate the contributions and to confirm the limitations of the analysis.  

Keith Mitchell (OARC) gave an overview of DNS-OARC and the Day-in-the-Life of the Internet (DITL) data that has been the focus of much attention in recent studies.  While the DITL data remains a valuable reference for researchers, Keith noted that “Total knowledge of DNS is a big multi-dimensional pie.”  The DITL data is one “slice” that needs to be correlated with others.

Suzanne Woolf next chaired an expert panel on the standards and engineering challenges, joined by Peter Koch (DENIC), Olaf Kolkman, Warren Kumari (Google), and John Levine (Taughannock Networks). I was encouraged to see the seriousness of the discussion of solutions, including how to improve the DNS over time, as well as the appropriate role of the IETF in designing a solution. John’s observation was perhaps the best counterpoint to Paul Mockapetris’ earlier criticism of the practice of selecting internal names that one presumes would not ever conflict with external ones:

"Everyone using .CORP is in a formal state of sin ... it's up to [us] to forgive them" for benefit of Internet -- John Levine #namecollision

— Burt Kaliski Jr. (@modulomathy) March 9, 2014

Danny McPherson (Verisign) then led a wrap-up session for the first day that included two “lightning talks” by Roy Arends (Nominet UK). The first offered evidence that a spambot generates many of the “RD=1” queries to the root described in Jim Reid’s paper. The second argued that a significant fraction of the NXDOMAIN queries for new gTLDs seen at the root are a result of a “spambot-killer” countermeasure that provides fake email addresses containing new gTLDs to the spambot (which are subsequently presented to the root by the spambot). It will be interesting to see more analysis on that.

The second day continued the first one’s momentum with three solid research talks in the first session. Paul Hoffman (VPNC) spoke about approaches enterprises can take to mitigate name collision risks. A memorable quote: “Creating your initial private TLD is really easy. Changing from one to another is mind-bogglingly hard.” He also observed that name collisions occur only when DNS queries involving internal names “leak” into the global DNS. The overlap of different naming contexts isn’t the problem in and of itself; it’s the potential leakage of queries from one context to another that leads to collision risks.

Paul summarized in one slide the 13-step plan published in December for avoiding potential name collisions associated with a private TLD.  Warren Kumari responded with a comment on the complexities of this apparently simple advice by comparing it to the statement “We can solve world hunger by giving people more food.”

Casey Deccio (Verisign) then introduced a formal definition and model for name collisions to help analyze the causes of name collisions from an end-system perspective. Andy Simpson followed with a presentation on how to identify search-list behavior in query data, including a new observation that the “controlled interruption” technique may not “interrupt” the WPAD protocol, thus masking some potential collisions.

After a break, Jeff Schmidt (JAS Global Advisors) gave a well-prepared presentation on the recently published name collision mitigation framework, most of which drew parallels with naming and numbering transitions of the past, including ZIP codes and area codes in the US. In his introduction, he acknowledged that “bad things can happen” in name collisions and that they can be serious “in some circumstances,” but didn’t elaborate. He also noted that “collisions have occurred prior to the delegation of every TLD since (at least) 2007,” and that they have been observed under TLDs as well as under second-level domains. A lot of this “brittleness” in the DNS infrastructure, he continued, “is actually being tolerated” by enterprises.

The framework adopts an established pattern of advance notification, communication, and a grace period before a transition is final. The culmination of the talk was a one-slide summary of the framework’s recommendations. The primary focus of the recommendations is a new “controlled interruption” technique that is intended to serve as the “grace” or “negative acknowledgement” period notifying participants who still operate in the pre-transition mode that the transition is about to occur.

There was a significant amount of debate on whether there has been enough advance notification and communication, as well as some technical discussion of whether the controlled interruption should go to a localhost address or to an internal network address (Paul Hoffman’s preference). Jeff recommended the former because it can be handled more easily by most impacted parties, but he described an option where an enterprise can use a Response Policy Zone to redirect clients to the latter.
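
The difference between the two options can be illustrated with a short sketch that classifies the address a client receives. The function name `classify_answer` and the sample addresses are assumptions for illustration. In particular, 127.0.53.53 is my understanding of the loopback address proposed in the framework and is not stated in the talk summary above.

```python
import ipaddress


def classify_answer(address):
    """Classify a DNS answer as loopback, internal (private), or public."""
    ip = ipaddress.ip_address(address)
    # Check loopback first: Python also reports 127.0.0.0/8 as private.
    if ip.is_loopback:
        return "loopback"
    if ip.is_private:
        return "internal"
    return "public"


# Assumed controlled-interruption answer: traffic stays on the client itself.
print(classify_answer("127.0.53.53"))
# Hypothetical internal address an enterprise might substitute via a
# Response Policy Zone: traffic goes to a host the enterprise controls.
print(classify_answer("10.1.2.3"))
```

A loopback answer never leaves the querying machine, which is why most impacted parties can handle it without extra infrastructure. An internal address lets an enterprise collect the redirected traffic in one place, at the cost of having to set that up.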

The prize committee awarded the workshop’s $50K first prize to Jim Reid for his paper “Analysing the Use of the RA and RD bits in Queries to Root Servers,” in support of further research.

The closing comments generally called for more research and more awareness, but with the recognition that the time is short for the present issue.  Bruce Schneier had opened the workshop with an encouragement to focus on how humans name things, and to improve the way computers name things to support that understanding. His perspective of putting the user first serves as a continual reminder of priorities for researchers as well as policy makers pursuing the next steps on name collisions.  

For a full set of tweets on the workshop, please see the hash tag #namecollision