Saturday, March 8, 2008

Using SUN Master Index at Research Labs

This struck me as a real use case while I was discussing Project Mural offerings with a delegate from Govt. of India Research Labs during SUN Tech Days at Hyderabad.

His lab deals with data on the protein samples they work on. The various lab functions are geographically distributed, and each of them generates similar data on the protein samples, which needs to be consolidated at a central point. The data shipped in for consolidation often has subtle differences in the way it is reported. Generating a master index out of data with mismatches in complex protein names and associated parameters is a nightmare: it takes a lot of manual effort to narrow down human errors and create a single view of the data in the central data pool. The volume of data generated is huge, and the shipping format is a simple delimited flat file.

To me the problem looks like a simple use case where the SUN MDM solution can make a difference. The Master Index Studio application in the Project Mural suite is equipped to deal with large volumes of similar data and can produce a master index of related data sets. The application facilitates matching, de-duplication, merging, and cleansing of data from various data sources, and performs probabilistic matching on the fields in the data records. MIDM (Master Index Data Manager) rounds out the offering by providing a web application interface to the master index data, giving the user better control over the data being indexed.
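
To make the idea of field-level matching on delimited records a little more concrete, here is a minimal Python sketch. It is not how Master Index Studio works internally and it does not use any Project Mural API; it only illustrates the general technique of scoring two records field by field and grouping those that score above a threshold. The field names, the weights, the threshold, the "|" delimiter, and the sample rows are all assumptions made up for illustration, and the simple string similarity used here is a stand-in for a real probabilistic matching model.

```python
import csv
import io
from difflib import SequenceMatcher

# Hypothetical weights: how much each field contributes to the overall score.
FIELD_WEIGHTS = {"protein_name": 0.7, "organism": 0.3}
# Assumed cut-off; a real deployment would tune this against known data.
MATCH_THRESHOLD = 0.85


def normalize(value):
    """Lower-case the value and drop everything except letters and digits."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def similarity(a, b):
    """Similarity of two strings in [0, 1], compared after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_score(rec_a, rec_b):
    """Weighted sum of the per-field similarities of two records."""
    return sum(
        weight * similarity(rec_a[field], rec_b[field])
        for field, weight in FIELD_WEIGHTS.items()
    )


def load_records(text, delimiter="|"):
    """Parse delimited flat-file text (with a header row) into dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def build_index(records):
    """Greedy grouping: a record joins the first group whose first member
    it matches at or above the threshold, otherwise it starts a new group."""
    groups = []
    for rec in records:
        for group in groups:
            if match_score(group[0], rec) >= MATCH_THRESHOLD:
                group.append(rec)
                break
        else:
            groups.append([rec])
    return groups


# Made-up sample data: the first two rows report the same protein differently.
SAMPLE = """lab|protein_name|organism
Hyderabad|Cytochrome c oxidase subunit 1|Homo sapiens
Pune|cytochrome-c oxidase, subunit 1|homo sapiens
Delhi|Hemoglobin subunit alpha|Homo sapiens
"""

if __name__ == "__main__":
    for number, group in enumerate(build_index(load_records(SAMPLE)), start=1):
        labs = ", ".join(rec["lab"] for rec in group)
        print(f"Entry {number}: {group[0]['protein_name']} (reported by {labs})")
```

With the sample rows above, the Hyderabad and Pune records land in one group because their names differ only in case and punctuation, while the Delhi record starts its own. The greedy grouping keeps the sketch short; it compares each record only against the first member of a group, which is a simplification compared with what a full MDM product would do.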

