Importing 3rd Party Data into CDB


Vault allows you to import data from 3rd party systems for cleaning and reporting in the Data Workbench application. This is intended for importing subject data that exists outside of the subject’s Casebook in Vault EDC, for example, data about a Subject from an IRT system.

Prerequisites

Before you can import data into Vault, your organization must perform the following tasks:


The following permissions are required to perform the actions described below.

Users with the Vault Owner security profile are able to perform the actions described below by default. If your vault uses custom Security Profiles, your profile must grant the following permissions:

All of the following permissions are granted through the Security Profile:

  • Tabs: Workbench: Ability to access the Data Workbench application (via the Workbench tab)
  • Objects: Study: Read: Ability to view records in the Study object, which is required for creating records (via import) in Data Workbench
  • Application: FTP Staging: Access: Ability to access your vault’s FTP staging server
  • Application: API: Access API: Ability to make an API call, which is required to create and update records via FTP

Preparing Data for Import

You can import your data as a series of CSV files to your Data Workbench vault’s FTP server. To import your data, you’ll first need to create the CSV files, which list your clinical data, and a manifest file (.json), describing your data and how Vault should process it. Then, you’ll create a ZIP package (.zip) containing all of those files for import into Vault.

Data CSVs

Create a CSV for each data collection Form in your study that is independent of EDC. For each CSV, you must provide four (4) required columns, and then a column for each execution data item. For the four required columns, you can name them whatever you choose, as long as you define those names in your manifest file for each CSV. Note that these column names are case-sensitive and must be an exact match to the values provided in your manifest file. Aside from the study, site, subject, event, and sequence columns, Vault considers every column to be a data item and will import it as such.

  • Study (manifest key: study): Provide the Name of the Study (from the Study record in Vault EDC).
  • Site (manifest key: site): Provide the Name (site number) of the Site (from the Site record in Vault EDC).
  • Subject (manifest key: subject): Provide the Subject ID for the subject (from the Subject record’s Name field in Vault EDC).
  • Event (manifest key: event): Provide the Event (visit) associated with the data row. Depending on the edc_matching configuration in your manifest file, you can use the Name or External ID of the EDC Event Definition, the Event Date of the EDC Event, or use Events in this column to have Workbench create events separate from the EDC schedule. Learn more about event matching below.
  • Form (optional; manifest key: form): Provide the Form associated with the data row (from the Name of the Form Definition record in Vault EDC). This column is only needed if you’re importing data for multiple Forms in a single CSV. If you don’t include this column, Workbench assumes that each data row is an occurrence of a single Form. Learn more about form mapping below.
  • Form Sequence (optional; manifest key: formsequence): The Sequence Number of the Form, if the form is repeating. A repeating Form represents the collection of the same data more than once during a single Event for a single Subject; the Sequence Number then uniquely identifies the data row. If there is more than one row (form) for a Subject and Event, Workbench assumes that the form is repeating and uses this column to set the Sequence Number. This column is optional, so you may leave it blank or omit it if none of your forms are repeating. In that case, Workbench sets the Sequence Number to “1” for each row by default. Note: with the 20R1 release, “sequence” was renamed to “formsequence”. You can continue to import data using “sequence” until the 20R2 release (August 2020).
  • Item Group (optional; manifest key: itemgroup): Provide the Item Group associated with the Item columns after it (from the Name of the Item Group Definition record in Vault EDC). This column is only needed if you’re importing data for forms with multiple Item Groups. If you don’t include this column, Workbench assumes that all Items in the data row are in a single Item Group. Learn more about item group mapping below.
  • Item Group Sequence (optional; manifest key: itemgroupsequence): The Sequence Number of the Item Group, if the item group is repeating. A repeating Item Group represents the collection of the same data more than once during a single Form for a single Subject; the Sequence Number then uniquely identifies the data row. If there is more than one set of Items in a form row for a Subject and Event, Workbench assumes that the item group is repeating and uses this column to set the Sequence Number. This column is optional, so you may leave it blank or omit it if none of your item groups are repeating. In that case, Workbench sets the Sequence Number to “1” for each row by default.
  • Row ID (optional; manifest key: rowid): Provide a list of columns that, when their values are combined, Workbench can use to uniquely identify a row. This is useful when you don’t have a numerical key to identify a sequence. For example, when importing lab data, you could use Study, Subject, and Lab Test to uniquely identify rows. If you include a rowid column mapping, Workbench ignores any values for sequence and automatically sets the sequence to “1”. If omitted, Workbench uses the default array of columns to identify rows: study, subject, event, form, and formsequence.

Here is an example CSV file:

STUDY_ID,SITE,SUBJECT_ID,VISIT,INITIALS,DOB,GENDER,RACE
S.Deetoza,101,101-1001,Screening,CDA,03-27-1991,F,Hispanic

In this example, we have the following column mappings:

  • Study: STUDY_ID
  • Site: SITE
  • Subject: SUBJECT_ID
  • Event: VISIT

The INITIALS, DOB, GENDER, and RACE columns are all data items.
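Because column names are case-sensitive and must exactly match the values in your manifest, a quick pre-flight check can catch mismatches before you upload. Here is a minimal sketch; the helper name and structure are illustrative, not part of Workbench:

```python
import csv
import io

# The four columns every data CSV must map in the manifest.
REQUIRED_KEYS = ("study", "site", "subject", "event")

def check_csv_columns(csv_text, mapping):
    """Return the data-item columns, raising if a mapped column is missing."""
    header = next(csv.reader(io.StringIO(csv_text)))
    for key in REQUIRED_KEYS:
        column = mapping.get(key)
        if column is None:
            raise ValueError(f"manifest mapping is missing the '{key}' key")
        if column not in header:  # exact, case-sensitive match
            raise ValueError(f"column '{column}' (for '{key}') not found in CSV header")
    mapped = set(mapping.values())
    # Every unmapped column is treated as a data item.
    return [c for c in header if c not in mapped]

csv_text = "STUDY_ID,SITE,SUBJECT_ID,VISIT,INITIALS,DOB,GENDER,RACE\n"
mapping = {"study": "STUDY_ID", "site": "SITE",
           "subject": "SUBJECT_ID", "event": "VISIT"}
print(check_csv_columns(csv_text, mapping))
# → ['INITIALS', 'DOB', 'GENDER', 'RACE']
```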

Manifest File

Your manifest file provides the Study and Source for Vault, lists the files in your ZIP, and for each file, lists which columns map to the required data points.

Source is a custom defined value that allows you to identify the contents of the package. This value is stored in the Source field in Workbench, so you can use it to identify data from this package in Workbench via CQL. The Source value must be unique within your Study.

For your Study, provide the Name of your Study (study__v) record in Vault EDC. For Source, specify the source of the data. Vault applies this source to all imported data. Then, as an array, list each CSV file, including its filename (with the extension) and its column mappings, as the value for “data”.

You can choose to include configuration metadata about each data item in your import file. See details below.

Save this file as “manifest.json”.

Note that Workbench will only accept manifest files with this exact filename (case-sensitive) and extension.
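If you generate the manifest programmatically, a standard JSON serializer is enough. A sketch using the single-form example values from this article (study, source, and column names are examples):

```python
import json

# Build a manifest for one CSV; values mirror the example in this article.
manifest = {
    "study": "Deetoza",
    "source": "eCOA",
    "data": [
        {
            "filename": "Survey.csv",
            "study": "protocol_id",
            "site": "site_id",
            "subject": "patient",
            "event": "visit_name",
        }
    ],
}

# Workbench only accepts the exact, case-sensitive filename "manifest.json".
with open("manifest.json", "w") as f:
    json.dump(manifest, f, indent=2)
```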

Example Manifest: Single Form

Here is an example manifest file for the Deetoza study that imports a package from an eCOA containing the Survey form:

{
  "study": "Deetoza",
  "source": "eCOA",
  "data": [
    {
      "filename": "Survey.csv",
      "study": "protocol_id",
      "site": "site_id",
      "subject": "patient",
      "event": "visit_name"
    }  
  ]   
}

Example Manifest: Multiple Forms

Here is an example manifest file for the Deetoza study that imports a package from a laboratory vendor, containing the Chemistry and Hematology forms:

{
  "study": "Deetoza",
  "source": "lab",
  "data": [
    {
      "filename": "Chemistry.csv",
      "study": "STUDY_ID",
      "site": "SITE_ID",
      "subject": "SUBJECT_ID",
      "event": "VISIT",
      "formsequence": "LAB_SEQ"
    },
    {
      "filename": "Hematology.csv",
      "study": "STUDY_ID",
      "site": "SITE_ID",
      "subject": "SUBJECT_ID",
      "event": "VISIT",
      "formsequence": "LAB_SEQ"
    }
  ]
}

You could also use a Row ID mapping and import Chemistry and Hematology lab data using a single CSV file. In that CSV file, you could use a Lab Test Set column to indicate Chemistry or Hematology for the row.

{
  "study": "Deetoza",
  "source": "lab",
  "data": [
    {
      "filename": "Labs.csv",
      "study": "STUDY_ID",
      "site": "SITE_ID",
      "subject": "SUBJECT_ID",
      "event": "VISIT",
      "rowid": "STUDY_ID, SUBJECT_ID, LAB_TEST_SET"
    }
  ]
} 
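Conceptually, a rowid mapping builds a composite key from the listed columns, so two rows that share the same combined values would collide. A rough client-side check for duplicate keys before import (this is illustrative validation, not Workbench's own logic):

```python
import csv
import io
from collections import Counter

def find_duplicate_rowids(csv_text, rowid_columns):
    """Return composite rowid keys that occur more than once in the CSV."""
    rows = csv.DictReader(io.StringIO(csv_text))
    keys = [tuple(row[c] for c in rowid_columns) for row in rows]
    return [key for key, count in Counter(keys).items() if count > 1]

csv_text = (
    "STUDY_ID,SUBJECT_ID,LAB_TEST_SET,RESULT\n"
    "Deetoza,101-1001,Chemistry,5.2\n"
    "Deetoza,101-1001,Hematology,7.1\n"
    "Deetoza,101-1001,Chemistry,5.3\n"
)
print(find_duplicate_rowids(csv_text, ["STUDY_ID", "SUBJECT_ID", "LAB_TEST_SET"]))
# → [('Deetoza', '101-1001', 'Chemistry')]
```

The Chemistry row appears twice for the same subject, so its key is reported as a duplicate.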

Optional: Event Matching

By default, Workbench matches the Event from the form’s CSV file to the Name of the matching Event Definition record in Vault EDC. You can also choose to have Workbench match the Events based on the event definition’s External ID. If your data was collected outside of the event scheduled in EDC, you can choose to not map to any EDC event and have Workbench create events as needed to match those in your import package.

If your Event is within a repeating Event Group, Workbench uses the event definition’s External ID and appends a sequence number to it for matching. The sequence number increments for each unique Event in the file. If an Event is reused in more than one Event Group, it isn’t considered repeating. Instead, Workbench matches it to the first matching Event in the EDC schedule.

Match on Name (Default)

If you don’t include edc_matching in your manifest file, Workbench automatically matches using the Name of the Event Definition. You can also specify Name-based matching explicitly in the manifest file.

{
  "study": "Deetoza",
  "source": "Labs", 
  "edc_matching": {
    "event": {
      "target": ["event_name"]
    } 
  }
}

Match on External ID

To match on the External ID (formerly known as OID) of the Event Definition, you must include edc_matching and set the event target to external_id. See the example excerpt below:

{
  "study": "Deetoza",
  "source": "Labs", 
  "edc_matching": {
    "event": {
      "target": ["external_id"]
    } 
  }
}

Create New Event Definition with No Matching

If your data was collected outside of the EDC event schedule, you can choose to not match with any EDC events by including edc_matching and setting the event to false. When configured, this removes the requirement that all Events in the import package map to a matching Event in EDC. Workbench creates new Events for any non-EDC events that are specific to the Source. These new Events become part of the overall Workbench Header Event record (accessible via CQL from @HDR.Event).

{
  "study": "Deetoza",
  "source": "Labs", 
  "edc_matching": {
    "event": false
  }
}

Optional: Form & Item Group Mapping

You can choose to map rows in your CSV to different Forms and Item Groups, including defining how Workbench interprets repeating forms and item groups within your data set. This allows Workbench to transform a single CSV into multiple Forms and Item Groups, instead of a single Form and single Item Group.

For example, a Labs form could contain multiple Item Groups - one Item Group per each lab test category. You would use an itemgroup column to indicate which Item Group the data items in the row belong to. Workbench treats all unmapped columns in the row as items. Workbench imports those items within the row’s specified item group. You can specify additional metadata for those columns in the item configuration. See details below.

Use the table below to understand how Workbench will handle your data with different Form and Item Group configuration scenarios:

  • Form not defined, Item Group not defined: Workbench treats each row as one occurrence of a single Form.
  • Form defined, Item Group not defined: Workbench treats each row as a different Form, identifying the Form from the form column. Each unique value in the form column is handled as a different Form.
  • Form not defined, Item Group defined: Workbench treats each row as an instance of an Item Group (as defined in the itemgroup column) and groups these rows together by Subject and Form. To treat each Item Group as a repeating Item Group, use an itemgroupsequence column.
  • Form defined, Item Group defined: Workbench groups rows together by Subject, form, and itemgroup. Each unique value in the form column is handled as a different Form, and each unique value in the itemgroup column is an Item Group.
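The four scenarios amount to choosing a grouping key per row. A simplified model of this behavior (the function and default group names are illustrative; Workbench's internal logic may differ):

```python
from collections import defaultdict

def group_rows(rows, form_col=None, itemgroup_col=None):
    """Group CSV rows by (subject, form, item group), per the table above."""
    groups = defaultdict(list)
    for row in rows:
        # When a column isn't mapped, everything falls into one group.
        form = row[form_col] if form_col else "single_form"
        itemgroup = row[itemgroup_col] if itemgroup_col else "single_itemgroup"
        groups[(row["subject"], form, itemgroup)].append(row)
    return dict(groups)

rows = [
    {"subject": "101-1001", "form": "Labs", "lab_test": "Chemistry", "value": "5.2"},
    {"subject": "101-1001", "form": "Labs", "lab_test": "Hematology", "value": "7.1"},
]
grouped = group_rows(rows, form_col="form", itemgroup_col="lab_test")
print(sorted(grouped))
```

With both columns mapped, the two rows land in separate item groups of the same Labs form; with neither mapped, they would collapse into a single form and item group.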

Example Manifest: Multiple Forms

In the example below, the “Labs.csv” file contains multiple Forms. These are identified in the form column of the “Labs.csv” file.

{
  "study": "Deetoza",
  "source": "lab",
  "data": [
    {
      "filename": "Labs.csv",
      "study": "protocol_id",
      "site": "site_id",
      "subject": "subject_id",
      "event": "visit",
      "form": "form"
    }
  ]
}

Labs.csv

Example Manifest: Multiple Item Groups

In the example manifest file below, the Labs form has multiple Item Groups. These are identified in the lab_test column of the “Labs.csv” file.

{
  "study": "Deetoza",
  "source": "lab",
  "data": [
    {
      "filename": "Labs.csv",
      "study": "protocol_id",
      "site": "site_id",
      "subject": "subject_id",
      "event": "visit",
      "form": "form"
    }
  ]
}

Labs.csv

Optional: Item Configuration

Within your manifest file, you can include Item metadata to inform how Data Workbench handles data items. You can specify a data type for each Item, as well as additional properties depending on your data type selection.

When item configuration is omitted, Data Workbench treats all items as text.

Simple vs. Advanced Configuration

The item “config” object has two configuration formats, simple and advanced. For simple configuration, you specify only the item’s data type. The item’s properties use the default values. For advanced configuration, you specify the data type and properties.

Simple:
"items" {
    "Weight": "float"
}
Advanced:
"items" {
    "Weight": {
        "type": "float",
        "precision": 2
    }
}
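Simple entries can be treated as shorthand for an advanced entry that relies on default property values. A sketch of normalizing one form into the other (the helper is illustrative, not a Workbench API):

```python
def normalize_item_config(items):
    """Expand simple entries like {"Weight": "float"} into advanced dicts."""
    normalized = {}
    for name, config in items.items():
        if isinstance(config, str):
            # Simple form: the value is just the data type name.
            normalized[name] = {"type": config}
        else:
            # Advanced form: already a dict of type plus properties.
            normalized[name] = dict(config)
    return normalized

print(normalize_item_config({"Weight": "float",
                             "Height": {"type": "float", "precision": 2}}))
```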

Available Properties by Data Type

There are different configuration properties available for each data type.

Data Type Example Configuration Property Property Description
Text
"items": {
    "Subject_Initials": {
        "type": "text",
        "length": 3
    }
}
Length: The number of characters allowed. The default length is 1500.
Integer
"items": {
   "Actual_Dosage": {
      "type": "integer",
      "min": 1,
      "max": 1000
   }
}
Min: The minimum (lowest) allowable value. The default is -4,294,967,295.
Max: The maximum (highest) allowable value. The default is 4,294,967,295.
Float
"items": {
    "Weight": {
        "type": "float",
        "length": 4,
        "precision": 1,
        "min": 80,
        "max": 400
    }
}
Length: The maximum number of digits allowed, both to the left and to the right of the decimal point. If the maximum value has more digits than this property, Workbench uses that number instead.
Precision: The number of decimal places allowed. The default is 5. If the maximum value has more decimal places than this property, Workbench uses that number instead.
Min: The minimum (lowest) allowable value. The default is -4,294,967,295.
Max: The maximum (highest) allowable value. The default is 4,294,967,295.
Date
"items": {
    "Start_Date": {
        "type": "date",
        "format": "yyyy-MM-dd"
    }
}
Format: The format pattern with which to parse date values. See a list of supported date formats below. The default is "yyyy-MM-dd".

DateTime
"items": {
    "End_DateTime": {
        "type": "datetime",
        "format": "MM-dd-yyyy HH:mm"
    }
}
Format: The format pattern with which to parse datetime values. See a list of supported date and time formats below. The default is "yyyy-MM-dd HH:mm".

Time
"items": {
    "Time_Collected": {
        "type": "time",
        "format": "HH:mm:ss"
    }
}
Format: The format pattern with which to parse time values. See a list of supported time formats below. The default is "HH:mm".

Boolean
"items": {
    "Exam_Performed": "boolean"
}
N/A: The boolean data type doesn't have any configuration properties.

Supported Date & Time Format Patterns

The format patterns listed below are available for date, datetime, and time items. For datetime items, combine a date and time pattern, for example, “yy-MM-dd HH:mm”. If you use a time format pattern for a date, or vice versa, your import will fail with an error (D-012).

Format Pattern Example Description
dd MM yy 02 18 20 2-digit day, 2-digit month, and 2-digit year, using a whitespace ( ) delimiter
dd MM yyyy 02 18 2020 2-digit day, 2-digit month, and full year, using a whitespace ( ) delimiter
dd MMM yyyy 02 Feb 2020 2-digit day, Abbreviated month (text), and full year, using a whitespace ( ) delimiter
dd-MM-yy 18-02-20 2-digit day, 2-digit month, and 2-digit year, using a dash (-) delimiter
dd-MM-yyyy 18-02-2020 2-digit day, 2-digit month, and full year, using a dash (-) delimiter
dd-MMM-yyyy 18-Feb-2020 2-digit day, Abbreviated month (text), and full year, using a dash (-) delimiter
dd.MM.yy 18.02.20 2-digit day, 2-digit month, and 2-digit year, using a period (.) delimiter
dd.MM.yyyy 18.02.2020 2-digit day, 2-digit month, and full year, using a period (.) delimiter
dd/MM/yy 18/02/20 2-digit day, 2-digit month, and 2-digit year, using a forward slash (/) delimiter
dd/MM/yyyy 18/02/2020 2-digit day, 2-digit month, and full year, using a forward slash (/) delimiter
ddMMMyyyy 18Feb2020 2-digit day, Abbreviated month (text), and full year (no delimiter)
ddMMyy 180220 2-digit day, 2-digit month, and 2-digit year (no delimiter)
ddMMyyyy 18022020 2-digit day, 2-digit month, and full year (no delimiter)
MM/dd/yy 02/18/20 2-digit month, 2-digit day, and 2-digit year, using a forward slash (/) delimiter
MM/dd/yyyy 02/18/2020 2-digit month, 2-digit day, and full year, using a forward slash (/) delimiter
MMddyy 021820 2-digit month, 2-digit day, and 2-digit year (no delimiter)
MMddyyyy 02182020 2-digit month, 2-digit day, and full year (no delimiter)
MMM dd yyyy Feb 18 2020 Abbreviated month (text), 2-digit day, and full year, using a whitespace ( ) delimiter
MMM/dd/yyyy Feb/18/2020 Abbreviated month (text), 2-digit day, and full year, using a forward slash (/) delimiter
MMMddyyyy Feb182020 Abbreviated month (text), 2-digit day, and full year (no delimiter)
yy-MM-dd 20-02-18 2-digit year, 2-digit month, and 2-digit day, using a dash (-) delimiter
yy/MM/dd 20/02/18 2-digit year, 2-digit month, and 2-digit day, using a forward slash (/) delimiter
yyyy MM dd 2020 02 18 Full year, 2-digit month, and 2-digit day, using a whitespace ( ) as a delimiter
yyyy-MM-dd 2020-02-18 Full year, 2-digit month, and 2-digit day, using a dash (-) as a delimiter
yyyy.dd.MM 2020.18.02 Full year, 2-digit day, and 2-digit month, using a period (.) as a delimiter
yyyy.MM.dd 2020.02.18 Full year, 2-digit month, and 2-digit day, using a period (.) as a delimiter
yyyy/MM/dd 2020/02/18 Full year, 2-digit month, and 2-digit day, using a forward slash (/) delimiter
yyyyMMdd 20200218 Full year, 2-digit month, 2-digit day (no delimiter)
yyyy-MM-dd'T'HH:mm:ss+HH:mm 2020-02-18T18:30:22+00:00 4-digit year, 2-digit month, 2-digit day, using a dash (-) as a delimiter, and the 24-hour time with seconds, with the time offset from UTC (+HH:mm)
yyyy-MM-dd'T'HH:mm:ssZ 2020-02-18T18:30:22Z 4-digit year, 2-digit month, 2-digit day, using a dash (-) as a delimiter, and the 24-hour time with seconds in UTC time
yyyyMMdd'T'HH:mm:ssZ 20200218T18:30:22Z 4-digit year, 2-digit month, 2-digit day, and the 24-hour time with seconds in UTC time
HH:mm 18:30 24-hour time
HH:mm:ss 18:30:15 24-hour time with seconds
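The patterns above use Java-style letters (dd, MM, MMM, yyyy, HH, mm, ss). To sanity-check your values locally before import, you can translate a subset of these patterns into Python's strptime directives. This rough sketch covers only the plain tokens from the table; 'T' literals and UTC offsets are not handled:

```python
from datetime import datetime

# Partial translation table: Java-style tokens -> strptime directives.
# Longer tokens must be replaced before their shorter prefixes.
_TOKENS = [("yyyy", "%Y"), ("MMM", "%b"), ("yy", "%y"), ("MM", "%m"),
           ("dd", "%d"), ("HH", "%H"), ("mm", "%M"), ("ss", "%S")]

def parses(value, pattern):
    """Return True if value matches the (translated) format pattern."""
    fmt = pattern
    for token, directive in _TOKENS:
        fmt = fmt.replace(token, directive)
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False

print(parses("18-Feb-2020", "dd-MMM-yyyy"))  # → True
print(parses("18:30:15", "HH:mm:ss"))        # → True
print(parses("2020-18-02", "yyyy-MM-dd"))    # → False (18 is not a month)
```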

Example Package: Single Form, Item Metadata

Here is an example import package from Verteo Pharma’s randomization vendor, containing the Randomization form with item metadata:

manifest.json:

{
  "study": "Cholecap",
  "source": "IRT",
  "data": [{
    "filename": "Randomization.csv",
    "study": "protocol_id",
    "site": "site_id",
    "subject": "patient",
    "event": "visit_name",
    "items": {
      "randomization_number": {
        "type": "integer",
        "length": "14"
      },
      "date_of_randomization": {
        "type": "date",
        "format": "yyyy-MM-dd"
      }
    }
  }]
}

Randomization.csv: Randomization.csv for the manifest.json file above

Optional: Restricting Data (Blinding)

You can restrict (blind) data at the Item, row, listing data file, and Source levels to hide it from users who don’t have permission to access restricted data. For example, your study may only order certain labs for subjects who are using the study drug. Knowing which labs a subject has would unblind the subject. You can restrict this information to prevent blinded users from identifying those subjects.

Restriction Level Manifest File Configuration
Item
"items": {
  "itemname": {
    "type": "text",
    "blinded": true
  }
}
Row *To restrict data at the row level (for example, to restrict an entire form), use the column which uniquely identifies a form.
{
  "study": "Deetoza",
  "source": "Labs",
  "data": [
    {
      ...
      "event": "visit",
      "row_blinded": "column_id"
    }
  ]
}
Listing Data File
{
  "study": "Deetoza",
  "source": "Labs",
  "data": [
    {
      "filename": "chempanel.csv",
      "blinded": true,
      ...
    }
  ]
}
Source Package
{
  "study": "Deetoza",
  "source": "Labs",
  "blinded": true,
  ...
}

For users who have access to restricted data (typically lead data managers), restricted data behaves the same way as unrestricted data. For blinded users (users without the Restricted Data Access permission), the following behavioral rules apply to any imported restricted data:

  • If an Item (column) is restricted:
    • The CQL projection doesn’t return the restricted item’s column.
    • The CQL projection doesn’t return any derived columns that reference the restricted item.
    • If a blinded user references the restricted Item in their CQL statement, CQL still doesn’t return the column.
    • SHOW and DESCRIBE do not return the restricted Item.
  • If a row is restricted:
    • The result set doesn’t return any records from the Form or Item Group.
  • If a listing file (csv) is restricted:
    • The default @HDR columns are included in the listing, but no Item columns are included.
  • If a Source (package) is restricted:
    • CQL doesn’t return any Items or column results from the restricted Source in any listing.
    • CDB marks all Item Definitions, Item Group Definitions, and Form Definitions within the Source as restricted.
    • All data rows are marked as restricted.
    • The default @HDR columns will still appear in the core listings.

ZIP Package

Once you’ve finished creating your manifest file and your CSVs, create a ZIP file containing all of those files. You can name this file whatever you’d like; however, we recommend including a unique identifier, for example, “Study-Name_Source_datetime.zip”.

A ZIP file for the Deetoza study's laboratory import
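A package like this can be assembled with a few lines of Python using the standard zipfile module. The file names below are examples, and the ZIP name follows the recommended Study-Name_Source_datetime convention:

```python
import zipfile
from datetime import datetime, timezone
from pathlib import Path

# Create stand-in files for the example (in practice these already exist).
Path("manifest.json").write_text('{"study": "Deetoza", "source": "lab", "data": []}')
Path("Chemistry.csv").write_text("STUDY_ID,SITE,SUBJECT_ID,VISIT\n")

# Assemble the import package. The manifest must be named exactly
# "manifest.json", and all files sit at the root of the ZIP.
package_name = f"Deetoza_lab_{datetime.now(timezone.utc):%Y%m%d%H%M%S}.zip"
with zipfile.ZipFile(package_name, "w", zipfile.ZIP_DEFLATED) as zf:
    for name in ["manifest.json", "Chemistry.csv"]:
        zf.write(name)

print(zipfile.ZipFile(package_name).namelist())
# → ['manifest.json', 'Chemistry.csv']
```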

Accessing your Vault’s FTP Server

Each vault in your domain has its own FTP staging server. The FTP server is a temporary storage area for files you’re uploading to or extracting from Vault.

Server URL

The URL for each staging server is the same as the corresponding vault, for example, veepharm.veevavault.com.

How to Access FTP Servers

You can access your staging server using your favorite FTP client or through the command line.

Use the following settings with an FTP client:

  • Protocol: FTP (File Transfer Protocol)
  • Encryption: Require explicit FTP over TLS (FTPS). This is a security requirement. Your network infrastructure must support FTPS traffic.
  • Port: This does not typically need to be added and will default to Port 21.
  • Host: {DNS}.veevavault.com. For example: “veepharm” is the DNS in veepharm.veevavault.com.
  • User: {DNS}.veevavault.com+{USERNAME}. This uses the same user name that you log in with. For example: veepharm.veevavault.com+tchung@veepharm.com.
  • Password: Your login password for this vault. This is the same password used for your standard login.
  • Login Type: Normal
  • Transfer File Type: Transfer files as binary
If you are experiencing issues uploading large files, increase the FTP client timeout setting to 180 seconds.

If you have remote verification enabled on a proxy or a firewall, FTP traffic from computers on your network to Veeva FTP servers might be refused. If possible, work with your IT department to disable remote verification. If it cannot be disabled, contact Veeva Support.
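The settings above map directly onto Python's standard ftplib.FTP_TLS client, which performs explicit FTP over TLS. A minimal sketch; the host, username, and file name in the commented example are placeholders:

```python
from ftplib import FTP_TLS

def upload_package(host, user, password, zip_path):
    """Upload an import ZIP to the vault's "workbench" directory over FTPS."""
    ftps = FTP_TLS(timeout=180)      # raised timeout helps with large files
    ftps.connect(host, 21)           # explicit FTPS starts on port 21
    ftps.login(user, password)
    ftps.prot_p()                    # encrypt the data channel as well
    ftps.cwd("workbench")            # Vault watches this directory
    with open(zip_path, "rb") as f:  # transfer as binary, per the settings
        ftps.storbinary(f"STOR {zip_path}", f)
    ftps.quit()

# Example call (placeholders, not real credentials):
# upload_package("veepharm.veevavault.com",
#                "veepharm.veevavault.com+tchung@veepharm.com",
#                "your-vault-password",
#                "Deetoza_lab_20200218.zip")
```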

FTP Directory Structure

Inside of your user directory, there is a “workbench” directory. This is where you will upload your 3rd party data. Vault is automatically aware of any files that you put here.

Directory for workbench

Vault moves any successfully imported files into “workbench/_processed”. See details below.

Importing the Data

To import your data, upload your ZIP file to the “workbench” directory of your FTP staging server, using an FTP client of your choice. Once you upload your ZIP to the FTP staging server, Vault imports it and transforms your data.

Successful Import

Upon successful import:

  • Vault creates all definition records and the relationships between them.
  • Vault automatically sets the Source field with the value provided in your manifest on all imported records to uniquely identify the data source. (Any Forms imported from Vault EDC automatically have this value set to “EDC”.)
  • Vault creates a Core Listing for each unique Form in your import package. See details below.
  • Vault moves your import ZIP file from “workbench” into “workbench/_processed/{study}/{source}” and appends the date and time of import to the filename. (For failed imports, your ZIP file stays in the root of the “workbench” directory.)

You can now view your listings in the Data Workbench application. In the current release, Workbench doesn’t show new listings right away: first, navigate to your study’s Listings page, click to open another Core Listing, and then return to the Listings page. Your new data listings then display in the list, and you can click the Name of a listing to open it.

Failed Import

Upon failed import:

  • Vault appends the date and time of the attempted import to the ZIP file’s filename, but it keeps your ZIP file in the root of the “workbench” directory.
  • Vault creates an error log (“<import datetime>_<package name>_errors.csv”).

See this list of possible errors and how to resolve them.

Viewing Import Status

You can check the status of your import package from Import > Packages. This page lists the status of every import package, from both Vault EDC and 3rd party tools. You can also download import packages and issue logs (errors and warnings) from this page.

Downloadable import packages and issue logs only include restricted (blinded) data when downloaded by users with permission to access restricted data.

Any time you import a package into Workbench, Workbench automatically reprocesses the most recent packages for all other sources. For example, when your nightly Workbench Export EDC job runs, and then imports into Workbench, Workbench reprocesses your most recent Lab Data and Imaging packages as well.

Workbench Import Statuses

When an import package is able to import with only warnings, Workbench highlights the status in orange to indicate that there are warnings. When import finishes, you can download the issue log to review the warnings.

Status Description
In Progress The import process has begun for this package, and Workbench has not identified any errors or warnings.
In Progress (with warnings) The import process is ongoing, but Workbench has identified a warning.
Error Import failed because there are one or more errors in the import package. Download the issue log and review the errors.
Complete Workbench imported the package successfully, with no errors or warnings.
Complete (with warnings) Workbench imported the package successfully, but there are one or more warnings. Download the issue log and review the warnings.
Not Imported Workbench skipped this package because a newer package for the same source was uploaded before processing began.
Reprocess in Progress Workbench has begun reprocessing this package because a new package from another source was imported.
Reprocessed Complete Workbench finished reprocessing this package with no errors or warnings.
Reprocessed Complete (with warnings) Workbench finished reprocessing this package, but there are one or more warnings. Download the issue log and review the warnings.
Reprocessed Error Reprocessing failed because there are one or more errors in the import package. Download the issue log and review the errors.

Download the Import Package

To download the import package:

  1. Navigate to Import > Packages.
  2. Locate your import package in the list.
  3. Click the Package link.
    Click to download the import package

  4. Extract the files from the ZIP folder and view them in a tool of your choosing.

Download the Issue Log

You can download an Issue Log (CSV) that lists all errors and warnings that Workbench encountered while importing the package. See a list of possible errors and warnings here.

Viewing Imported Data

Upon upload, Workbench creates a Form for each unique file in the import package. Workbench automatically generates a Core Listing for each unique Form in a Study, whether that form comes from Vault EDC or is imported from a 3rd party system.

The default CQL query for these Core Listings is:

SELECT @HDR, * from source.filename

In the laboratory import example above, Vault creates two Core Listings: Chemistry and Hematology (one for each CSV file), using these queries:

Chemistry
SELECT @HDR, * from labs.Chemistry
Hematology
SELECT @HDR, * from labs.Hematology

Definitions

Vault creates a Form Definition for each CSV file to define your imported data as “forms” within the Data Workbench application. These records use the CSV filename (without the extension) as the Name, for example, “hematology”. Vault also creates an Item Group Definition to group your data items together within the form. Vault names these by prepending “ig_” to the CSV filename (without the extension), for example, “ig_hematology”.

Both of these definitions display as a column within your listing (form_name and ig_name). You can edit the listing’s CQL query to hide these columns.