NSF middleware initiative goes beyond science

Effort spreads to corporate use

BOSTON - A multifaceted, federally funded initiative aimed at developing and deploying open-source, open-standards middleware and services so that scientists can share data and collaborate on research has released the fifth version of its software, as the effort spreads into corporate use and beyond U.S. borders.

The National Science Foundation's Middleware Initiative, or NMI, was launched in 2001 with $12 million in grants to be distributed over three years. Well before the end of the funding, NMI projects have swept into universities across the U.S., linking far more than scientists and researchers in an effort that now includes collaboration with software developers and scientists in the U.K., Europe and Asia.

Last week, NMI Release 5 rolled out as part of the initiative's twice-annual update of software, services and documentation. The software suite and individual components, tested and debugged before release, are distributed for free at the NMI Web site (http://www.nsf-middleware.org). Middleware, as defined by the NMI, is software connecting two or more separate applications across the Internet. On a larger level, it consists of a layer of services, sitting between a network and its applications, that manages security, access and information.

"It's the right people (involved in NMI) who have been around long enough to know what the real issues are," said Renee Shuey to explain the initiative's growing popularity. Shuey is the lead systems programmer in academic services and emerging technologies at The Pennsylvania State University, which has 24 campus locations.

Penn State is using Shibboleth, a federated ID management environment based on Security Assertion Markup Language (SAML), to allow its students access to physics class material at North Carolina State University and also for single sign-on and password access to the Napster music download service.

Although Shibboleth was being developed before NMI was up and running, it is a good example of an existing middleware component that is rolled in and developed with other software in NMI's releases. Shibboleth was developed by Internet2 middleware architects so that they could collaborate on Web-based projects across their university networks. Penn State came to use Shibboleth as part of that university's involvement in Internet2, a high-performance network developed for higher education use, and through that involvement Penn State is also by default involved in NMI.

Internet2 is part of NMI, which started with two systems-integration teams at multiple university locations: the Grid Research Integration Deployment and Support Center (GRIDS Center), and Enterprise and Desktop Integration Technologies, or EDIT. Internet2 and Educause, a nonprofit organization that supports higher education's "intelligent" use of IT, are part of EDIT, along with the Southeastern Universities Research Association (SURA), a consortium of more than 60 universities.

Last year, NMI added two more teams, the Open Grid Computing Environments Collaboratory (OGCE) and Common Instrument Middleware Architecture (CIMA). The teams develop tools that make up the core of the GRIDS Center Software Suite, which features the Globus Toolkit, Condor-G and Network Weather Service.

The Globus Toolkit is an open-source enabling technology for computer grids that allows users to securely share computational power, databases and tools online across networks worldwide while maintaining local autonomy. It is the underpinning of commercial grid products, as well as a key piece of science and engineering projects worldwide.

Condor-G is a specialized workload management system for jobs that are compute-intensive, while Network Weather Service monitors and forecasts network and computational resource performance. Like Shibboleth, all three existed before NMI and were developed at universities.

Those involved in NMI acknowledge that it is difficult, if not impossible, to discern what has been developed as part of NMI and what has come from the various open-source middleware projects that were under way when the initiative started and have continued in tandem. Participants started off by sharing what they had already done, bringing those development pieces to the giant puzzle of creating middleware for wide distribution and use across disparate networks. Unwieldy as such an initiative could have been, given its size and scope, NMI has taken strong hold, though a lot of work remains to be done, developers say.

"The thing the middleware initiative is really getting right is pulling together the larger vision," said Randy Heffner, a Dallas-based analyst for Forrester Research Inc., who follows application architecture. "It's definitely good stuff."

Different universities, scientists and researchers use different components of NMI, depending upon their needs. Given that the software is free and available to anyone who wants it, there undoubtedly are projects going on that benefit from NMI at places that don't have anyone working on NMI development. But there also are huge collaborations operating as part of the initiative.

NEESgrid, or Network for Earthquake Engineering Simulation, is one such collaboration aimed at advancing earthquake engineering and discovering new ways to reduce the hazards earthquakes present to property and lives. The grid links major equipment sites for research collaboration.

"This new environment allows completely new and powerful forms of experiments and simulations done at a scale and complexity that simply are not possible by any single equipment site," said Kevin Thompson, NMI program officer, by e-mail in response to questions. "Understand that this is a whole new way for the earthquake and civil engineers to conduct their studies; massive shake tables at some of these sites are now accessible for scheduled, controlled and coordinated experimental use, really for the first time in a consistent and supportable manner. ... NEESgrid moves into an operational phase later this year. This is exciting stuff."

The Biomedical Informatics Research Network (BIRN) is another NMI project that is picking up steam, with 14 universities and 22 research groups developing an integrated IT system for large-scale data sharing and analysis among medical researchers. BIRN labs are connected by Internet2.

"What's fascinating is the emergence of distributed collaborations within and among scientific communities that, for many of them, are a huge change in their culture and the way they conduct their work," Thompson said.

NMI projects also are ongoing in the U.S. and Europe to build grid-like systems for biologists, who speak often of the ongoing vexations related to trying to share the huge amounts of data their research generates.

"Their sharing problems are much more complex than high-energy physics where you have one experimental device and the problem is the same for every community," said grid expert Ian Foster, a University of Chicago computer science professor who also is a scientist at Argonne National Laboratory in Argonne, Illinois, and a Globus Toolkit co-creator.

The biology data-sharing projects still have much work ahead of them because "it's not enough to give people some middleware that provides single sign-on and access." Part of the needed work is on the software development side, but "it's partly also in a lot of scientific disciplines moving into an entirely new regime," and that requires an attendant cultural shift.

At the eight universities serving as sites in the NMI Integration Testbed developed and managed by SURA, there's a "classic early-adoption technology curve," said Mary Fran Yafchak, SURA's IT program coordinator. The sites test all NMI component software at the same time as it is released to the public through the NMI Web site, with a focus on systematic and practical evaluation. Testbed representatives serve as campus evangelists, seeking out those who are willing to test software components, and those testers in turn spread the word to colleagues about the software that is available. The early adopters may be "driven by an unsolved problem" to try middleware components, she said.

Potential users "really need to see this work first -- having an educated middleware evangelist to say, 'here's what this is, here's what is available today, and here's what you should be thinking about'," Yafchak said. "If they get one researcher, that researcher will bring others in. ... If a picture is worth a thousand words, then seeing functionality must be worth millions."

She keeps in close touch with Educause and Internet2 representatives, including those who are developing components for the initiative. One such developer is Ken Klingenstein, director of Internet2 middleware and security and chief technologist at the University of Colorado in Boulder. He likens his part of NMI, including Shibboleth, to the "plumbing," connecting the pieces together that will eventually bring water out of the taps. "The three-quarter-inch pipes we've connected to the plumbing are really working," he said.

Even though there is no coordinating mechanism in place for international contributions and use, NMI components have begun taking hold globally, he said, echoing an observation offered by others. That is occurring in large measure because of the growing clamor for federated identity services, which let domains share local identity and security information, while keeping their own internal directory, metadirectory, public-key infrastructure and account provisioning.

"I think the most important contribution in the long term (that his part of the project will make) is going to be to bring to fruition a marketplace for identity management that couples privacy and security," he said. "I think most people felt those were going to be very difficult to accomplish in a single framework."

A lot of the individual pieces of that effort had already been accomplished. Klingenstein works closely with the Liberty Alliance, an industry trade group focused on identity management and security standards, and with those involved in developing WS-Security, a widely supported specification for Web services security.

"We hope that Shibboleth and Liberty will converge around something we call 'Schliberty'," he said, with a laugh. Already, "there really is a lot of harmony" between NMI and the corporate community, and that also is spreading internationally. The first-ever meeting will occur soon in the U.K. to address issues of international trust, including the European Union's Safe Harbor privacy provisions that are more protective than U.S. regulations.

U.S. universities also have strict privacy rules regarding personal information about students, which is what led Penn State to Shibboleth. The first pilot test was in 2002 to allow students taking physics classes access to online tests and quizzes at North Carolina State University using their Penn State sign-ons and passwords. Data sharing is increasingly common in higher education, but raises security, privacy and authentication issues.

When the campuses started the physics data-sharing, Penn State students were given sign-ons and passwords to access the N.C. State data and, at least in the first weeks of the term, had trouble remembering yet another sign-on and password. After Penn State deployed Shibboleth, calls to the physics help desk plunged 85 percent in the first two weeks of classes, Shuey said.

"They were never intended to be managers of security databases," she said of the physics help desk, which had previously been deluged with calls from students who had forgotten their passwords to access the N.C. State system.

Then came the decision to provide Napster service to students. Shibboleth allowed Penn State programmers to set up Napster access without breaching regulations that forbid providing personal information about students. When students go to Napster's target site for the campus, they are redirected to a Penn State site for authentication, and no identifiable information is exchanged. Penn State will roll out Napster as a service to its entire 80,000-student population at the beginning of the next school year and so far has had no problems using Shibboleth for Napster access at campus dormitories.
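
The sign-on flow described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Shibboleth's actual interface: the class names IdentityProvider and ServiceProvider, the Assertion record and the sign helper are all hypothetical, and a shared-key HMAC stands in for the signed SAML assertions that a real deployment would exchange. The point it shows is that the home campus checks the password against its own directory and releases only an opaque handle and a coarse attribute, so the outside service never sees a student's name or password.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass


def sign(key: bytes, handle: str, affiliation: str) -> str:
    """Stand-in for a SAML signature (assumption: shared-key HMAC)."""
    message = f"{handle}|{affiliation}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


@dataclass(frozen=True)
class Assertion:
    """Everything the home campus releases to the outside service."""
    handle: str       # opaque, random per sign-on; not derived from the user
    affiliation: str  # coarse attribute such as "student"
    signature: str    # lets the service verify who issued the assertion


class IdentityProvider:
    """Hypothetical home-campus login site; only it holds the directory."""

    def __init__(self, directory: dict, key: bytes):
        # username -> (password, affiliation); a real directory would
        # store password hashes, not plain text.
        self._directory = directory
        self._key = key

    def authenticate(self, username: str, password: str):
        record = self._directory.get(username)
        if record is None:
            return None
        stored_password, affiliation = record
        if not hmac.compare_digest(stored_password.encode(), password.encode()):
            return None
        handle = secrets.token_hex(16)
        return Assertion(handle, affiliation, sign(self._key, handle, affiliation))


class ServiceProvider:
    """Hypothetical outside service that trusts the campus's assertions."""

    def __init__(self, key: bytes, allowed_affiliations: set):
        self._key = key
        self._allowed = allowed_affiliations

    def grant_access(self, assertion: Assertion) -> bool:
        expected = sign(self._key, assertion.handle, assertion.affiliation)
        if not hmac.compare_digest(expected, assertion.signature):
            return False  # not issued by the trusted campus
        return assertion.affiliation in self._allowed


if __name__ == "__main__":
    key = b"demo-federation-key"  # illustrative only
    campus = IdentityProvider({"jdoe": ("s3cret", "student")}, key)
    service = ServiceProvider(key, {"student"})
    assertion = campus.authenticate("jdoe", "s3cret")
    print(service.grant_access(assertion))
```

In this sketch the service decides access from the affiliation alone, which mirrors the arrangement Shuey describes: authentication stays on the home campus, and the other side learns only that the visitor is entitled to the service.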

Meanwhile, Shuey is watching where the middleware initiative is going, thinking about which parts of it could be used at Penn State in the future.

"It's very cool technology," she said. "I get very excited about it."

Copyright © 2004 IDG Communications, Inc.
