SAP® Best Practices for Data Migration – Hits and Misses

Data migration guru John Morris wrote that “data migration is a business issue and not a technical issue.” Having been through many data migration projects myself, I couldn’t agree more. When choosing a framework for data migration, one has to keep this in mind. Furthermore, when it comes to migrating data into SAP, one needs to take a few other truisms into consideration, such as:

  1. Data quality is not achieved all at once but over repeated iterations
  2. While data migration itself is a one-time event, within the implementation repeated cycles of extraction, transformation and loading (ETL) into multiple SAP environments are required
  3. Loading into SAP is a time-consuming process

Back in the day, data migration was considered an unglamorous job that paid little. Implementation consultants preferred to fight more glorious battles such as business process re-engineering. It was left to lowly clerks and programmers to duke it out with data. It took a while, but system integrators finally realized that a fabulous implementation with great processes could be quickly brought to its knees by poor data. This realization has sparked the creation of numerous approaches to migrating data. One approach that has gained popularity, and rightfully so, is the SAP Best Practices for Data Migration.

The SAP Best Practices for Data Migration framework is based on SAP Data Services, which is SAP’s ETL tool. The main components of the framework are SAP Data Services, a staging database, a set of migration templates, a set of corresponding mapping templates, a tool to map source values to target values and a methodology to be followed.

The migration templates are ‘programs’ or in SAP Data Services parlance ‘jobs’ that SAP Data Services executes. These jobs fall into multiple categories. One job category downloads SAP’s configuration tables into the staging area. The framework uses the downloaded configuration / lookup data to validate the data to be migrated. A second job category handles ETL of data into SAP. Tasks such as reconciliation of data loads, status checks and creation of staging area data stores are handled by other job categories. The framework also supports creation of SAP BusinessObjects™ Universes used for reporting on the progress of the data migration project.

The jobs have a uniform and modular design that allows them to support the data migration process. The standardized design makes maintenance of the jobs easier and significantly reduces the duration of the inevitable knowledge transfers between developers that occur on a typical data migration project.

From a developer’s standpoint, data migration programs can be thought of as executing a set of steps in a specific sequence. The typical steps are field mapping, value mapping or transformation, validation and finally loading. Each of the SAP Best Practices for Data Migration jobs in SAP Data Services comprises parts that mirror these tasks. In the first part the source fields are mapped to the target fields. In the second part the data is validated. The third part transforms source values into target values and the final part loads the data into SAP using IDOCs.
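The four parts of a job can be pictured with a minimal Python sketch. This is not how SAP Data Services jobs are actually written; it only illustrates the order of the parts. The field names, the mappings and the `load` callback are assumptions made up for the example.

```python
from typing import Callable

Record = dict[str, str]

# Hypothetical field mapping: legacy column name -> target field name.
FIELD_MAP = {"cust_no": "KUNNR", "cust_name": "NAME1", "country": "LAND1"}

# Hypothetical value mapping for one target field.
COUNTRY_MAP = {"USA": "US", "Germany": "DE"}


def map_fields(source: Record) -> Record:
    """Part 1: rename source fields to target fields."""
    return {target: source.get(src, "") for src, target in FIELD_MAP.items()}


def validate(record: Record) -> list[str]:
    """Part 2: return a list of problems; an empty list means the record passed."""
    return [f"{field} is mandatory" for field in ("KUNNR", "NAME1") if not record[field]]


def transform(record: Record) -> Record:
    """Part 3: translate source values into target values."""
    out = dict(record)
    out["LAND1"] = COUNTRY_MAP.get(record["LAND1"], record["LAND1"])
    return out


def run_job(rows: list[Record], load: Callable[[Record], None]):
    """Run the parts in order; part 4 (loading) is delegated to `load`."""
    loaded, rejected = [], []
    for row in rows:
        mapped = map_fields(row)
        problems = validate(mapped)
        if problems:
            rejected.append((row, problems))  # never reaches the load step
            continue
        target = transform(mapped)
        load(target)  # stand-in for the IDOC load
        loaded.append(target)
    return loaded, rejected
```

Passing something as simple as `list.append` for `load` is enough to try the flow without any target system.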

Like other frameworks, SAP Best Practices for Data Migration jobs validate data. The differentiator is how the validation is implemented. Three categories of validation are applied to the data: whether mandatory fields are populated, whether the format of the data is acceptable to SAP, and whether the values of the data are acceptable to SAP. While these seem like standard validations, the framework offers two advantages. First, it is not necessary to load the data into SAP to determine its validity. SAP Best Practices for Data Migration allows data to be validated within its staging area. Anyone who has spent hours waiting for data to load into SAP will appreciate that the framework allows you to validate the data as many times as required, at the click of a button and in a very short period of time. Second, SAP Best Practices for Data Migration allows you to identify all issues with an individual data record at one time.
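To make the second advantage concrete, here is a small Python sketch of a validator that runs all three categories of checks and collects every issue for a record instead of stopping at the first one. The field names, format patterns and the set of allowed country codes are assumptions for illustration; in the framework the allowed values would come from the configuration data downloaded into the staging area.

```python
import re

# Hypothetical stand-in for configuration values held in the staging area.
VALID_COUNTRIES = {"US", "DE", "IN"}

MANDATORY = ("KUNNR", "NAME1", "LAND1")

# Hypothetical format rules per field.
FORMATS = {
    "KUNNR": re.compile(r"\d{1,10}"),
    "PSTLZ": re.compile(r"[A-Za-z0-9 -]{1,10}"),
}


def validate_record(record: dict[str, str]) -> list[str]:
    """Return every issue found in the record, not just the first."""
    issues = []

    # Category 1: mandatory fields must be populated.
    for field in MANDATORY:
        if not record.get(field, "").strip():
            issues.append(f"{field}: mandatory field is empty")

    # Category 2: populated fields must match the expected format.
    for field, pattern in FORMATS.items():
        value = record.get(field, "")
        if value and not pattern.fullmatch(value):
            issues.append(f"{field}: value '{value}' has an invalid format")

    # Category 3: values must exist in the target configuration.
    country = record.get("LAND1", "")
    if country and country not in VALID_COUNTRIES:
        issues.append(f"LAND1: value '{country}' is not configured in the target system")

    return issues
```

Because the function returns a list rather than raising on the first failure, a single pass is enough to tell the business everything that is wrong with a record.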

After running the validation, the framework stores invalid data in a separate table. Each record is tagged with the reason for the validation failure in plain English. These records can be provided to the business users, who can then fix the errors. The corrected records can then be passed through the jobs again, and if any data issues are encountered again the data can be returned to the business for fixing. This process can be repeated as often as required until all the data quality issues are resolved. SAP Best Practices for Data Migration thus supports close and frequent interaction with the business, which shows how the framework incorporates the aphorism that data migration is a business issue and not a technical issue.
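The correct-and-rerun cycle can be sketched in Python with an in-memory SQLite database standing in for the staging area. The table layout, the column names and the two checks are assumptions for the example and are not the framework's actual schema.

```python
import sqlite3


def make_staging() -> sqlite3.Connection:
    """Create a throwaway staging database with a hypothetical rejects table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rejects (id TEXT, name TEXT, country TEXT, reason TEXT)")
    return conn


def check(row: dict) -> list[str]:
    """Return plain-English reasons why a row fails validation."""
    reasons = []
    if not row["name"]:
        reasons.append("Name is missing")
    if len(row["country"]) != 2:
        reasons.append("Country must be a two-letter code")
    return reasons


def run_cycle(conn: sqlite3.Connection, rows: list[dict]) -> list[dict]:
    """Validate all rows, rebuild the rejects table, return the valid rows."""
    conn.execute("DELETE FROM rejects")  # each cycle starts from a clean slate
    valid = []
    for row in rows:
        reasons = check(row)
        if reasons:
            conn.execute(
                "INSERT INTO rejects (id, name, country, reason) VALUES (?, ?, ?, ?)",
                (row["id"], row["name"], row["country"], "; ".join(reasons)),
            )
        else:
            valid.append(row)
    conn.commit()
    return valid
```

The contents of `rejects` are what would go back to the business users; once they return corrected rows, `run_cycle` is simply called again until the table comes back empty.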

While validation is a strength, there is a shortcoming as well. Cross-field validation is not available and has to be custom built. My colleague Chuck Schardong recommends that these custom extensions be built outside of the SAP Best Practices for Data Migration templates. Though they are not very difficult to build, it would’ve been nice if the framework took such validations into account. Another drawback of the framework is that it does not provide an interface that allows the business users to correct the data. At the moment the data can be provided in the lingua franca of all business users, i.e. Microsoft Excel®. However, handling numerous files and keeping track of which files are current is a headache that any data migration practitioner could do without.
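As an illustration of what such a custom extension might look like, the Python sketch below keeps cross-field rules in their own list, separate from any single-field checks. The two rules and the field names are invented for the example; real rules would come from the business requirements.

```python
from datetime import date
from typing import Callable, Optional

Record = dict[str, str]
Rule = Callable[[Record], Optional[str]]


def region_required_for_us(record: Record) -> Optional[str]:
    """Hypothetical rule: a region is needed whenever the country is US."""
    if record.get("LAND1") == "US" and not record.get("REGIO"):
        return "REGIO is required when LAND1 is US"
    return None


def valid_from_before_valid_to(record: Record) -> Optional[str]:
    """Hypothetical rule: a validity period must not end before it starts."""
    start, end = record.get("valid_from"), record.get("valid_to")
    if start and end and date.fromisoformat(start) > date.fromisoformat(end):
        return "valid_from is later than valid_to"
    return None


# Kept as a separate list so the rules live outside the standard checks.
CROSS_FIELD_RULES: list[Rule] = [region_required_for_us, valid_from_before_valid_to]


def cross_field_issues(record: Record) -> list[str]:
    """Apply every cross-field rule and collect the messages that fire."""
    return [msg for rule in CROSS_FIELD_RULES if (msg := rule(record)) is not None]
```

Keeping the rules in a separate list mirrors the recommendation to build extensions outside the templates: the standard checks stay untouched and a new rule is just one more function.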

SAP implementations span multiple months if not years and involve multiple environments and test cycles. Hence a good framework that supports an SAP data migration should be capable of repeating the data migration tasks often and in multiple environments, an area which many frameworks miss. Not so with the SAP Best Practices for Data Migration. Once the SAP Data Services jobs are customized and tested, they can be executed any number of times. Anyone who has worked on an SAP implementation knows that there can be different configurations in different SAP ‘Clients’. Last-minute configuration changes are the rule rather than the exception. To cope with multiple environments and changes, the framework downloads the SAP configuration information each time and uses it to validate the data before loading into that environment. This ensures that changes to the SAP configuration are always taken into account and that the data provided for loading will load into the SAP system.
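The idea of validating against whatever configuration the target environment currently holds can be sketched in Python as follows. The environment names, the `download_config` stand-in and the payment-terms values are all assumptions; a real job would read the configuration tables from the SAP client itself.

```python
# Hypothetical configuration snapshots, one per SAP client.
CONFIG_BY_ENV = {
    "DEV-100": {"payment_terms": {"NT30", "NT60"}},
    "QA-200": {"payment_terms": {"NT30"}},
}


def download_config(env: str) -> dict[str, set[str]]:
    """Stand-in for fetching the current configuration of one environment."""
    return CONFIG_BY_ENV[env]


def records_ready_for(env: str, records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split records into those that will load into `env` and those that will not."""
    allowed = download_config(env)["payment_terms"]  # refreshed on every run
    ready = [r for r in records if r["ZTERM"] in allowed]
    blocked = [r for r in records if r["ZTERM"] not in allowed]
    return ready, blocked
```

The same set of records can pass in one client and be blocked in another, which is exactly why the configuration is fetched again before every load rather than cached from an earlier cycle.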

One of the major challenges of a data migration is loading data into SAP. The Legacy System Migration Workbench (LSMW) is the preferred option for loading data into SAP. However, the process is lengthy and the errors thrown by LSMW tend to be cryptic. There are also instances when LSMW stops loading a record at the first error it encounters. If there are additional errors, they will be picked up only when the data is loaded into SAP again. Since loading data is itself a time-consuming process, finding all the errors can turn into a long and painful exercise. This is where the framework really shines and proves its value. Validating within the staging area, and validating all fields in a record at one time, eliminates the need to repeatedly load the data into SAP to discover all the errors. SAP Data Services provides a variety of ways to load data into an SAP system. The framework supports IDOCs, and in cases where traditional LSMW methods are required, it can also provide validated text files that can be loaded into SAP. IDOCs are a fast and efficient way of loading data into SAP. By changing just a few parameters in the SAP Data Services configuration, the environment into which the framework loads data can be changed. This allows users to load data into SAP practically at the touch of a button.

In addition to the above, there are other advantages of using the framework. The framework includes pre-built templates for various SAP data objects, which shortens the development cycle. In my experience I’ve seen a reduction of about 38% in the effort required to build ETL jobs for data migration. These pre-built templates also contain most of the validations that are required for SAP, which reduces the probability of missing a particular validation and provides peace of mind to developers. Of course, the ease of use of the templates comes at a price. Since they necessarily have to cater to multiple requirements, they are built with a one-size-fits-all mentality. The templates contain all the fields that are available for an SAP object, which becomes tedious when the input contains only a few fields. Removing the extra fields from the template or setting them to null can be a long, drawn-out process. This is perhaps the reason why I’ve seen a 38% decrease in effort and not a greater saving.
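The tedium of nulling out unused template fields is easy to picture with a small Python sketch. The short field list here is an assumption; an actual template for an SAP object would carry far more fields, which is precisely the problem.

```python
# Hypothetical template field list for one object.
TEMPLATE_FIELDS = ["KUNNR", "NAME1", "NAME2", "LAND1", "REGIO", "PSTLZ", "TELF1"]


def fill_template(source_row: dict, template_fields: list[str] = TEMPLATE_FIELDS) -> dict:
    """Return a row with every template field present; unsupplied fields become None."""
    unknown = set(source_row) - set(template_fields)
    if unknown:
        # Fail loudly rather than silently dropping data the template cannot hold.
        raise KeyError(f"Fields not in template: {sorted(unknown)}")
    return {field: source_row.get(field) for field in template_fields}
```

With only two fields supplied, five of the seven come back as `None`; scaled up to a full template, that ratio is what makes the manual clean-up so slow.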

In summary, the SAP Best Practices for Data Migration is a solid and reliable framework that can be and has often been used to migrate data successfully. SAP Best Practices for Data Migration has a superior validation method within the staging area. The integration with SAP allows close interaction with business users even as it supports technical tasks such as repeated loading of data into multiple SAP environments. As a practitioner I feel this is one of the better frameworks around.  

As an SAP partner and data expert, Utopia has extensive knowledge of data objects and how they align to SAP implementations for specific SAP modules. Learn more about our data migration jumpstart services.