I've always thought a lot about how data relates to architecture, especially architectural patterns such as SOA. While some would like to break them apart, I think that SOA is architecture, and the foundation of any architecture is always going to be information/data. To that end, I've always been an advocate for a common data model (CDM), or for forcing a semantic/data-level understanding of the domain and then attempting a logical restructuring before that information is bound to services and/or processes. In fact, I talk about CDM in my EAI book, b-to-b book, and SOA books, written years ago.
My last mention of this topic was a blog post more than a year ago here on InfoWorld, which I still stand behind. However, as I speak to people about data governance and SOA, I'm often taken aback by the lack of understanding of how the two notions relate to one another. While most thought leaders in this space agree on the connection, I still think that the rank-and-file SOA architect is ignoring their data. In most cases, I think that's because the data is a huge mess, while in other cases it's due to ownership issues within the enterprise; sometimes it's a lot of both.
So let's get a few things very clear.
First, you can't do SOA right without a clear understanding of the data, whether as is, to be, abstracted, or not. I call this a semantic understanding in my SOA methodology, but it's really just defining a common understanding and model of the information within the architecture.
Second, force that metadata into a common data model for use within the new architecture. The common data model should be more reflective of the business, with clean and understandable schemas and entities. This is both logical and physical, but not yet deployed.
Finally, figure out a technical approach to managing and changing the data. This may mean a physical change, an abstraction, or a larger, more invasive redesign and normalization effort.
Then you can move on to the cool stuff.
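To make the three steps above a bit more concrete, here is a minimal Python sketch of the abstraction option. Everything in it is an assumption made up for illustration: the `crm` and `billing` systems, their field names, the `Customer` entity, and the `to_common` helper. It is one possible shape for this kind of mapping, not a prescribed implementation.

```python
from dataclasses import dataclass


# Step 2: a common data model entity, named in business terms.
# (Hypothetical entity and fields, for illustration only.)
@dataclass(frozen=True)
class Customer:
    customer_id: str
    full_name: str
    email: str


# Step 1: the semantic understanding, captured as metadata. Each
# hypothetical source system maps its own field names onto the
# common model's field names.
FIELD_MAPPINGS = {
    "crm": {"customer_id": "cust_id", "full_name": "cust_nm", "email": "email_addr"},
    "billing": {"customer_id": "ACCT_NO", "full_name": "CUSTOMER_NAME", "email": "EMAIL"},
}


# Step 3: an abstraction layer. The source systems are left untouched;
# services only ever see the common Customer entity.
def to_common(source: str, record: dict) -> Customer:
    mapping = FIELD_MAPPINGS[source]
    missing = [src for src in mapping.values() if src not in record]
    if missing:
        raise KeyError(f"{source} record is missing fields: {missing}")
    # Normalize every value to a trimmed string so records from
    # different systems compare equal once mapped.
    return Customer(**{cdm: str(record[src]).strip() for cdm, src in mapping.items()})


crm_row = {"cust_id": 1001, "cust_nm": " Ada Lovelace ", "email_addr": "ada@example.com"}
billing_row = {"ACCT_NO": "1001", "CUSTOMER_NAME": "Ada Lovelace", "EMAIL": "ada@example.com"}

# Two differently shaped records resolve to the same business entity.
assert to_common("crm", crm_row) == to_common("billing", billing_row)
```

The point of the sketch is that the mapping metadata lives in one place, separate from any service. A service bound to `Customer` does not need to know which system a record came from, and a later physical redesign of a source only changes its entry in the mapping.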