Data Migration and Translation Part 2
So, what does a structured process for data migration and translation look like?
First, a few definitions:
- Source system – the origin of the data that needs to be translated or migrated. This could be a database or a directory structure.
- Target system – the final destination for the data. On completion of the process, data in the target should be in the correct format.
- Verify – Ensure that data placed in the target system is complete, accurate, and meets defined standards.
- Staging area – an interim location where data is transformed, cleaned, or converted before being sent to the target.
The process consists of five steps as shown below:
The process can be described as follows:
- Data to be migrated is identified in the source system. This is an important step and ensures that only relevant data is moved. Junk data is left behind.
- The identified data is extracted from the source system and placed in the staging area.
- The data is then transformed into a format ready for the target system. Such transformation could be a CAD to CAD translation, a metadata change, or a cleaning process. Transformation may also entail data enrichment – for example, appending additional properties to the objects so they can be found more easily in the target system.
- Transformed data is then loaded into the target system. This can be done automatically via programs or manually, depending on the chosen method. Automatic routines can fail, and any failures are flagged for analysis and action.
- Once data is loaded, validation is carried out to ensure that the migrated data is correct in the target system and not corrupted in some fashion.
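The five steps above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the source and target systems are plain dictionaries, and every function name, the `is_relevant` filter, and the `migrated_from` property are hypothetical stand-ins rather than part of any real PLM API.

```python
def identify(source, is_relevant):
    """Step 1: pick the keys of the records worth migrating."""
    return [key for key, record in source.items() if is_relevant(record)]


def extract(source, keys):
    """Step 2: copy the identified records into a staging area."""
    return {key: dict(source[key]) for key in keys}


def transform(staging):
    """Step 3: clean and enrich the staged records."""
    result = {}
    for key, record in staging.items():
        # Cleaning: normalise property names to lower case.
        cleaned = {name.lower(): value for name, value in record.items()}
        # Enrichment: add a (hypothetical) property to aid searching later.
        cleaned["migrated_from"] = "legacy"
        result[key] = cleaned
    return result


def load(staging, target):
    """Step 4: insert records into the target; return keys that failed."""
    failures = []
    for key, record in staging.items():
        if key in target:
            failures.append(key)  # flagged for analysis, not overwritten
        else:
            target[key] = dict(record)
    return failures


def validate(staging, target):
    """Step 5: return keys whose target record differs from the staged one."""
    return [key for key, record in staging.items() if target.get(key) != record]
```

Keeping each step as its own function means a failed load or validation can be investigated and rerun from the staging area without touching the source system again.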
The process as described above is shown at working level:
Shown in this diagram are two software tools – extractors and loaders. These are usually custom utilities that use APIs, or hooks into the source and target systems, to move the identified data. For example, an extractor tool may query a source PLM system for all released and frozen data that was released after a given date. Once this search is complete, the extractor downloads the data it identified from the PLM system into the staging area.
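As a rough sketch of what such an extractor might do, the Python below filters a list of items by state and release date, then copies their files into a staging directory. The `PlmItem` record, its fields, and both function names are assumptions made for illustration. A real extractor would call the source system's own API instead of reading an in-memory list, and the query here treats "released and frozen" as items in either state.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import Path
from typing import Optional
import shutil


@dataclass
class PlmItem:
    """Hypothetical record describing one item in the source PLM system."""
    item_id: str
    state: str                    # e.g. "released", "frozen", "in_work"
    released_on: Optional[date]   # None if the item was never released
    file_path: Path


def select_for_extraction(items, cutoff):
    """Return released or frozen items whose release date is after cutoff."""
    return [
        item for item in items
        if item.state in {"released", "frozen"}
        and item.released_on is not None
        and item.released_on > cutoff
    ]


def extract_to_staging(items, staging_dir):
    """Copy each selected item's file into staging; return the new paths."""
    staging_dir = Path(staging_dir)
    staging_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for item in items:
        # Name the staged file after the item ID so the loader can match it.
        destination = staging_dir / f"{item.item_id}{item.file_path.suffix}"
        shutil.copy2(item.file_path, destination)
        copied.append(destination)
    return copied
```

Separating the query from the download lets the list of identified items be reviewed before any files are moved.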
In a similar manner, a loader will run against a correct data set in the staging area and insert it into the target system, creating the required objects and adding the files.
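A loader can be sketched along the same lines. In the Python below, `create_object` is a hypothetical callable standing in for whatever API the target system provides, and staged items are assumed to be dictionaries with an `id` key. The point of the sketch is that one failed item is recorded for later analysis instead of stopping the whole run.

```python
def load_staged_items(staged_items, create_object):
    """Load each staged item via create_object and collect any failures.

    Returns two lists: the IDs that loaded, and (ID, reason) pairs that failed.
    """
    loaded, failed = [], []
    for item in staged_items:
        try:
            create_object(item)
        except Exception as error:
            # Broad catch on purpose: any failure is flagged for review
            # and the loader moves on to the next item.
            failed.append((item["id"], str(error)))
        else:
            loaded.append(item["id"])
    return loaded, failed
```

The list of failures gives the migration team a concrete worklist to analyse and reload once the underlying problem is fixed.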
It is highly recommended that pilot migrations be carried out on test data in development environments to verify the process. This testing will identify potential bugs and allow them to be fixed before actual data is touched.
No process can guarantee success, but a structured one like this greatly improves the odds!