Yahoo Develops Interface Classification System for Hadoop

Developers at Yahoo have been working on a new interface classification system in Hadoop to distinguish two facets of an interface from the perspective of backward compatibility: the audience of the interface and the stability of the interface.

The "audience" of an interface refers to its scope or visibility, that is, who its potential consumers are. The audience classifications in the new system include "public," "limited private" (for hooks exposed to peer frameworks or systems), and "private." The "stability" of an interface refers to whether and how changes to it might break compatibility. The stability classifications include "stable," "evolving," and "unstable."

Hadoop is a popular Java-based open-source framework for data-intensive distributed computing, designed to support parallel computations over large data sets on so-called unreliable computer clusters. It's based on Google's MapReduce, a programming model for processing and generating large data sets, which divides an application into multiple units of work, each of which can be executed on any node in a server cluster. Hadoop supports the Hadoop Distributed File System (HDFS), which is designed to scale to petabytes of storage and to run on top of the file systems of the underlying OS.

In his Yahoo Developer Network Blog, Hadoop team member Sanjay Radia wrote: "Hadoop is increasingly being used to run large, long-lived, enterprise-class applications. Porting these applications to non-compatible upgrades of Hadoop is an arduous, expensive task that distracts teams from finding new and better ways of using Hadoop to bring value to their companies. Today, Hadoop users are demanding backwards compatibility and interface stability; these features are necessary for the next growth phase of Hadoop, as it gains wider enterprise adoption."

According to Radia, an interface can be a Java API, a configuration variable, the parameters or output of a command, or a metrics variable. The system tags Java APIs using Java annotations, while other types of interfaces (configuration options and output formats, for example) are tagged using informal documentation conventions. The upcoming release 0.21 of Hadoop will be the first to expose this classification, Radia said.

Yahoo's recommendation to app developers: stick to "public-stable" interfaces. "If you are early adopter, you may use a public-evolving interface," Radia wrote, "but be aware that the interface may change slightly in the future, forcing a change to your application." If you're a framework developer on Hadoop: "You can of course safely use any of the public interfaces, but can also use limited-private interfaces targeted to your framework. For example the Hadoop RPC layer provides limited-private interfaces for HDFS and MapReduce."
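To make the tagging idea concrete, here is a minimal Java sketch of how audience and stability annotations might be declared and checked. It is an illustration under stated assumptions, not Hadoop's actual code: the annotation types, the example classes, and the two helper methods are all hypothetical names invented for this sketch, and the real annotations shipped with Hadoop may differ in naming and packaging.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.Arrays;

public class InterfaceClassificationSketch {

    // Audience annotations (hypothetical stand-ins, not Hadoop's own classes).
    @Retention(RetentionPolicy.RUNTIME) @interface Public {}
    @Retention(RetentionPolicy.RUNTIME) @interface LimitedPrivate { String[] value(); }
    @Retention(RetentionPolicy.RUNTIME) @interface Private {}

    // Stability annotations (also hypothetical stand-ins).
    @Retention(RetentionPolicy.RUNTIME) @interface Stable {}
    @Retention(RetentionPolicy.RUNTIME) @interface Evolving {}
    @Retention(RetentionPolicy.RUNTIME) @interface Unstable {}

    // Hypothetical example classes, each tagged with one audience and one stability.
    @Public @Stable
    static class ExampleJobApi {}

    @Public @Evolving
    static class ExampleNewApi {}

    @LimitedPrivate({"HDFS", "MapReduce"}) @Evolving
    static class ExampleRpcHook {}

    @Private @Unstable
    static class ExampleInternals {}

    // Application developers are advised to stick to "public-stable" interfaces.
    static boolean isSafeForApplications(Class<?> type) {
        return type.isAnnotationPresent(Public.class)
            && type.isAnnotationPresent(Stable.class);
    }

    // Framework developers may use any public interface, plus limited-private
    // interfaces that name their framework as an intended audience.
    static boolean isUsableByFramework(Class<?> type, String framework) {
        if (type.isAnnotationPresent(Public.class)) {
            return true;
        }
        LimitedPrivate limited = type.getAnnotation(LimitedPrivate.class);
        return limited != null && Arrays.asList(limited.value()).contains(framework);
    }

    public static void main(String[] args) {
        System.out.println(isSafeForApplications(ExampleJobApi.class));       // true
        System.out.println(isSafeForApplications(ExampleNewApi.class));       // false
        System.out.println(isUsableByFramework(ExampleRpcHook.class, "HDFS")); // true
        System.out.println(isUsableByFramework(ExampleRpcHook.class, "Pig"));  // false
        System.out.println(isUsableByFramework(ExampleInternals.class, "HDFS")); // false
    }
}
```

The annotations are retained at runtime here only so that the sketch can inspect them by reflection; a classification scheme used mainly for documentation would not necessarily need that.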

The new classification system, which is derived from OpenSolaris and Yahoo's own internal system, has been in the works for the last year. It's part of Yahoo's plan to provide stronger backward compatibility, Radia said.

Details of the interface classification system can be found here.

About the Author

John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at [email protected].
