**Source URL:** https://general.veevavault.dev/safety/migrations/references/data-transformation.md

# Data Transformation Considerations



Several complications can occur when populating Vault metadata. Consider the following best practices to transform data before a migration.

## CSV Format {#csv-format}

CSV files used to create or update documents using [Vault Loader](https://platform.veevavault.help/en/gr/26605) must use UTF-8 encoding and conform to [RFC4180](https://tools.ietf.org/html/rfc4180).
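As a rough illustration of producing a compliant file, the sketch below uses Python's standard `csv` module to write UTF-8 text with CRLF line endings and doubled quotes, which is how RFC 4180 describes quoting. The helper names and the sample field names (`name__v`, `type__v`) are assumptions chosen for the example, not part of Vault Loader.

```python
import csv
import io


def write_loader_csv(rows, fieldnames):
    """Serialize dict rows to CSV text using RFC 4180 conventions."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        quoting=csv.QUOTE_MINIMAL,  # quote only fields that need it
        lineterminator="\r\n",      # RFC 4180 uses CRLF between records
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def save_utf8(path, text):
    """Write the CSV text as UTF-8 without altering the line endings."""
    # newline="" stops Python from translating the CRLF endings again
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(text)
```

Embedded quotes are doubled and any field containing a comma or quote is wrapped in double quotes, so values such as `Protocol "A", v2` survive a round trip through a CSV parser.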

## Date Formatting {#date-formatting}

Dates migrated into Vault must use the format `YYYY-MM-DD`.
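A minimal sketch of this conversion in Python, assuming the legacy system exports dates in a single known pattern. The function name and the default `MM/DD/YYYY` source pattern are hypothetical; adjust them to match the legacy data.

```python
from datetime import datetime


def to_vault_date(value, source_format="%m/%d/%Y"):
    """Convert a legacy date string to the YYYY-MM-DD form."""
    parsed = datetime.strptime(value.strip(), source_format)
    return parsed.date().isoformat()
```

For example, `to_vault_date("07/04/2019")` returns `2019-07-04`. A value that does not match the source pattern raises `ValueError`, which surfaces bad rows before the load rather than after.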

## Date/Time Formatting {#datetime-formatting}

Date/time values must use the Coordinated Universal Time (UTC) format `YYYY-MM-DDTHH:MM:SS.sssZ`, for example `2019-07-04T17:00:00.000Z`. The value must end with a three-digit millisecond component followed by the `Z` UTC designator, as in `.000Z`; the millisecond digits can be any value. Ensure that date/time fields map to the correct day, because the calendar day can shift when a local time is converted to UTC.
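The sketch below shows one way to convert a local timestamp to this UTC form in Python. It assumes the legacy data carries a known fixed offset from UTC; the function name and the `utc_offset_hours` parameter are hypothetical. Data recorded in a zone with daylight saving time would need a real time zone database (for example the standard `zoneinfo` module) instead of a fixed offset.

```python
from datetime import datetime, timedelta, timezone


def to_vault_datetime(local_value, utc_offset_hours,
                      source_format="%Y-%m-%d %H:%M:%S"):
    """Convert a local timestamp to YYYY-MM-DDTHH:MM:SS.sssZ in UTC."""
    # Assumption: the source offset is fixed and known for every row
    source_tz = timezone(timedelta(hours=utc_offset_hours))
    aware = datetime.strptime(local_value, source_format).replace(tzinfo=source_tz)
    utc_value = aware.astimezone(timezone.utc)
    # isoformat() ends in "+00:00" for UTC; swap it for the "Z" designator
    return utc_value.isoformat(timespec="milliseconds").replace("+00:00", "Z")
```

With an offset of `-4`, the local value `2019-07-04 22:00:00` becomes `2019-07-05T02:00:00.000Z`, which illustrates how the day can change during conversion.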

## Case {#case}

If Vault metadata is case-sensitive, convert it to match the expected format.

## Special Characters {#special-characters}

Metadata must not contain special characters, for example, tabs and smart quotes. These special characters can be lost when migrating data into Vault.
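One possible cleanup step, sketched in Python, replaces tabs and smart quotes with plain equivalents before loading. The replacement table is an assumption covering only the characters named above; extend it to match what actually appears in the legacy data.

```python
# Hypothetical replacement table: smart quotes become straight quotes,
# tabs become single spaces
REPLACEMENTS = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\t": " ",       # tab
}
_TRANSLATION = str.maketrans(REPLACEMENTS)


def clean_special_characters(value):
    """Return the value with tabs and smart quotes replaced."""
    return value.translate(_TRANSLATION)
```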

## Character Encodings {#character-encodings}

Saving Excel files in CSV format for use with Vault Loader can corrupt the file in an undetectable manner. If the file becomes corrupt, your load will fail. Failure logs contain a record of each row that has failed and are accessible by email or Vault notification. Correct the CSV files to continue loading.

## Language {#language}

If the data being migrated is multilingual, ensure your Vault is configured to [support different languages](https://platform.veevavault.help/en/gr/13309).

## Multi-value Field Comma Separator {#multi-value-field-comma-separator}

When mapping multi-value fields, values with commas can be entered through quoting and escaping. For example, `"veeva,,vault"` is equivalent to `"veeva,vault"`.

## Windows & MacOS Formatting {#windows--macos-formatting}

Data formatting can differ between operating systems. For instance, line separators differ between files created on Windows (CRLF, `\r\n`) and files created on macOS (LF, `\n`).
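A small Python sketch for making line separators consistent before a load. The function name is hypothetical, and the choice of target separator is an assumption: pick whichever one the rest of your pipeline expects.

```python
def normalize_newlines(text, newline="\n"):
    """Rewrite CRLF, lone CR, and LF separators to one chosen separator."""
    # Collapse everything to LF first so CRLF is not counted twice
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    return unified.replace("\n", newline)
```

Calling `normalize_newlines(text, "\r\n")` produces CRLF throughout, which matches the record separator described in RFC 4180.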

## Boolean Fields {#boolean-fields}

Format *Yes/No* fields as `true` or `false` when migrating using the API. This doesn’t apply to Vault Loader, as it handles boolean values regardless of case.
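For API-based loads, a conversion along these lines could normalize legacy values. This is a sketch: the function name and the accepted spellings (`yes`, `y`, `1`, and so on) are assumptions about the legacy data, not values defined by Vault.

```python
TRUE_VALUES = {"yes", "y", "true", "1"}
FALSE_VALUES = {"no", "n", "false", "0"}


def to_api_boolean(value):
    """Map a legacy Yes/No style value to the string 'true' or 'false'."""
    normalized = str(value).strip().lower()
    if normalized in TRUE_VALUES:
        return "true"
    if normalized in FALSE_VALUES:
        return "false"
    # Fail loudly instead of guessing at an unknown spelling
    raise ValueError(f"Unrecognized boolean value: {value!r}")
```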

## Trailing Spaces {#trailing-spaces}

Remove any trailing spaces from metadata. These are commonly found after commas.
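A short Python sketch of this cleanup, using the standard `csv` module so that quoted values are parsed correctly before trimming. The function name and sample field names are hypothetical. It strips whitespace from both ends of each value, which also covers spaces that follow a comma.

```python
import csv
import io


def strip_fields(csv_text):
    """Parse CSV text and strip surrounding whitespace from every value."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    return [[cell.strip() for cell in row] for row in reader]
```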

## Leading Zeros {#leading-zeros}

Migrate numbers in String fields as String values to preserve leading zeros and prevent their conversion to integers.
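The sketch below illustrates the point in Python: the standard `csv` module returns every value as a string, so leading zeros survive as long as the transformation code never casts the value to a number. The function name and the `site_number__c` field are hypothetical.

```python
import csv
import io


def read_as_strings(csv_text):
    """Read CSV rows as dicts; every value stays a string."""
    return list(csv.DictReader(io.StringIO(csv_text, newline="")))


rows = read_as_strings("site_number__c\r\n00123\r\n")
preserved = rows[0]["site_number__c"]  # still the string "00123"
lost = str(int(preserved))             # casting to int drops the zeros
```

Spreadsheet tools often apply the same numeric conversion when opening a CSV file, so it is worth checking the file in a plain text editor before loading.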

## Unique Identifiers {#unique-identifiers}

On documents or object records where *Name* is not unique or is system-managed, set the *External ID* (`external_id__v` or `external_id__c`) to relate it to the original ID used in the legacy system. Additionally, this field helps distinguish between records in success and failure logs.

## Maximum Field Length {#maximum-field-length}

Values in Long Text fields must not exceed the maximum length configured in Vault. Vault Loader does not truncate these values.
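Because overlong values are not truncated, it can help to check lengths before loading. The following Python sketch assumes you supply the configured limits yourself in a `max_lengths` mapping and that limits are counted in characters. Both the function name and the sample field `description__c` are hypothetical.

```python
def find_overlong_values(rows, max_lengths):
    """Return (row_number, field, length) for each value over its limit."""
    problems = []
    for row_number, row in enumerate(rows, start=1):
        for field, limit in max_lengths.items():
            value = row.get(field) or ""
            # Assumption: the limit is measured in characters, not bytes
            if len(value) > limit:
                problems.append((row_number, field, len(value)))
    return problems
```

An empty result means every checked value fits. Otherwise the tuples identify which rows to shorten or remap before the load.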

## References to Users and Persons {#references-to-users-and-persons}

Documents and objects can reference *User* (`user__sys`) and *Person* (`person__sys`) records. These records must be active in order to be referenced. If referencing people who have left the company or had a name change, reference a *Person* record, as it does not have to be linked to a Vault user account. *User* names and *Person* names are not unique, so reference these records by their external IDs.

## Object Lookups {#object-lookups}

Many object records have relationships with other records. For example, the *Study Site* object has a field called `study_country__v` of data type *Parent Object*, which links it to the *Study Country* object. If you create a new *Study Site* record using Vault Loader or the API and know the ID of the desired *Study Country* record, you can populate the field with that ID. However, these IDs differ between Vault environments. Use a lookup table to obtain the Vault record IDs from the `name__v` or `external_id__v` fields. An alternative is to use an [object lookup field](https://developer.veevavault.com/vql/#object-lookup-fields) in the form `study_country__vr.name__v = 'United States'`.
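The lookup-table approach can be sketched in Python as follows. It assumes you have already exported the *Study Country* records from the target Vault as dicts containing `id` and `external_id__v`. The function names and the `legacy_field` parameter are hypothetical, and any record IDs you pass in are environment-specific.

```python
def build_lookup(target_records, key_field="external_id__v"):
    """Map each record's key field (e.g. external ID) to its Vault ID."""
    lookup = {}
    for record in target_records:
        key = record[key_field]
        if key in lookup:
            raise ValueError(f"Duplicate {key_field}: {key!r}")
        lookup[key] = record["id"]
    return lookup


def resolve_parent_ids(rows, lookup, legacy_field,
                       target_field="study_country__v"):
    """Swap each row's legacy key for the Vault ID of its parent record."""
    resolved, unmatched = [], []
    for row in rows:
        legacy_key = row[legacy_field]
        if legacy_key in lookup:
            new_row = {k: v for k, v in row.items() if k != legacy_field}
            new_row[target_field] = lookup[legacy_key]
            resolved.append(new_row)
        else:
            # Keep unmatched rows aside for review instead of loading them
            unmatched.append(row)
    return resolved, unmatched
```

Rebuilding the lookup for each environment keeps the transformation scripts unchanged while the record IDs vary.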



