Microsoft Research's eScience group is helping scientific researchers use database and online sharing tools in ways they might not have imagined, and sharing those experiences with Microsoft product groups that can tweak their software for easier use by the scientific community.
Microsoft researchers showed off some of their projects at the American Geophysical Union annual conference in San Francisco this week, said Catherine van Ingen, partner architect with Microsoft Research's eScience group.
One example of the eScience group's work is a project with climate scientists. Many researchers in the field are independent scientists who might be doing research into local climates for the sake of their crops or other specific interests, van Ingen said. Other scientists are looking at climate change from a more global perspective. Now, the two types of researchers are combining their data.
"It becomes possible to effectively mash up scientific data from different sources and mix that with locally acquired sensor data to do fantastic science," she said.
But that presents a problem. "Most scientists I deal with aren't used to this kind of quantity of data," she said. "An awful lot of scientists, if it gets much bigger than an Excel file, it's more data than they know what to do with."
Microsoft has helped the climate scientists create a database that includes around 800 million data points, she said. Her group is also helping the scientists learn how to use the database for the kind of analysis their research requires.
The global climate scientists are particularly interested in learning exactly how efficient plants are at absorbing carbon dioxide (CO2). Scientists have discovered that there is less CO2 in the atmosphere than they would expect based on carbon emissions. It's going somewhere: either into the ocean or into plants, she said. Combining and analyzing the massive amount of data already being collected around the globe about forests and trees might help them figure out where it goes, she said.
One of the projects the eScience group presented at the conference aims to learn why salmon aren't spawning as they used to in the Russian River in California. The eScience group is helping to combine data collected by a number of agencies, including the National Oceanic and Atmospheric Administration, marine fisheries groups, the U.S. Geological Survey, and other local groups that are interested in the subject. By combining all of that data, the researchers can investigate factors such as water temperature and water clarity, both of which affect spawning salmon, she said.
"We're stretching tools in different ways," van Ingen said of the work of the eScience group. Typically, scientists want to find data that points to the extremes, a need that corporate users often don't have. For example, a big retailer probably wouldn't use database software to look for sales of an odd product. But that is precisely what many scientists are trying to do: look for extreme cases. The unique requirements that the eScience group discovers in the scientific community are fed back into Microsoft product groups, which can then decide to introduce new capabilities that might better serve scientists, she said.