Study Data Extracts

The Study Data Extract job allows data managers to extract and download study execution data from EDC Tools. You can choose to download your data as a CSV or SAS file before running the job. Note that the encoding for CSVs is UTF-8.

Clinical Datasets

Each clinical dataset, and each Form (Adverse Events, Concomitant Medications, etc.), contains a set of Key Columns and a set of Clinical Data columns, which are detailed below.

Key Columns

Key Columns contain data to help identify which Form each row is referring to (ex: which Subject, Site, Cycle, or Event that particular form is located in).

The list of Key Columns is as follows:

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
STUDYID Study Text Char $ 100 $100
COUNTRY Country Text Char $ 100 $100
SITENUM Study Site Number Text Char $ 100 $100
SITE Site Text Char $ 100 $100
SUBJID Subject Text Char $ 100 $100
EGROUP Event Group Text Char $ 200 $200
EVENT Event Text Char $ 200 $200
EVENTID Event External ID Text Char $ 200 $200
EVENTDT Event Date Date Num : vdate. DATE9. 14 8
ESEQ Event Sequence Integer Num 14 8
FORM Form Text Char $ 200 $200
FORMEID Form External ID Text Char $ 100 $100
FSEQ Form Sequence Integer Num 14 8
IGSEQ Item Group Sequence Integer Num 14 8
DLASTMOD* Datetime of Last Data Change (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
FGUID Internal Vault ID (Forms) Text Char $ 200 $200
IGGUID Internal Vault ID (item groups) Text Char $ 1000 $1000
FORMILB Intentionally Left Blank Boolean Char $ 1 $1
LASTRUN Last run of this data/listing Datetime Num : vdatetime. DATETIME22.3 14 8
LINKEDTO Links to other forms Text Char $ 1500 $1500

*Datetime columns can be separated into two separate Date and Time columns.

Clinical Data

After the Key Columns, the system will insert Clinical Data columns, with each column representing an item on the Form.

Clinical Data Item Headers use the item names or external ID defined in Studio. Some items can be associated with more than one column, depending on their data type:

  1. Codelists are split into two columns:
    • ITEMNAME: this column contains the codelist code
    • ITEMNAME_D: this column contains the codelist decode or label
  2. Units are split into four columns:
    • ITEMNAME: this column contains the value entered in the item
    • ITEMNAME_U: this column contains the unit entered in the item
    • ITEMNAME_TRANS: this column contains the “translated” item value, which is the value that was entered that has been converted to the standard unit
    • ITEMNAME_TRANSU: this column contains the standard unit
  3. Datetimes are split into three columns:
    • ITEMNAME: this column contains the datetime value entered (site’s timezone)
    • ITEMNAME_UTC: this column contains the datetime value converted to UTC
    • ITEMNAME_USER: this column contains the datetime value converted to the timezone of the user running the Study Data Extract

Booleans: Boolean items always have either a Y (true) or N (false/null) value.

Unknown parts in dates and datetimes are represented by UN in ISO format:

  • For a date: 2021-UN-UN
  • For a datetime: 2021-UN-TUN:UN:UN.UNKZ

You have the option to exclude blank forms from your export.

Casebook Versions

All casebook versions are displayed in Study Data Extracts, so items that were deleted from the schedule in subsequent casebook versions will still appear in Clinical Datasets. This inclusion guarantees that all data is extracted, even if certain subjects were kept on a previous casebook version.

The Study Data Extract job first adds columns for all items that are present in Casebook Version 1 before adding columns for any items that may have been added in subsequent casebook versions to create the Study Data Extract file.

Coding Data

If a form is set up for coding, meaning that a coding configuration has been defined in Studio, coding-related columns will be added after clinical data columns. If there are multiple verbatims set up in a form, the system will add a set of coding columns for each verbatim configured. If the same item in different item groups is configured for coding, the system will add “itemgroup.item” to each column header to indicate which set of columns belongs to which verbatim.

List of coding columns for MedDRA:

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
CRSTATUS Coding Status Text Char $ 200 $200
DICTTYPE Dictionary Type Text Char $ 200 $200
DICTVER Dictionary Release Text Char $ 200 $200
SOC System Organ Class Text Char $ 200 $200
SOCID System Organ Class Code Text Char $ 200 $200
HLGT High Level Group Term Text Char $ 200 $200
HLGTID High Level Group Term ID Text Char $ 200 $200
HLT High Level Term Text Char $ 200 $200
HLTID High Level Term ID Text Char $ 200 $200
PT Preferred Term Text Char $ 200 $200
PTID Preferred Term ID Text Char $ 200 $200
LLT Lowest Level Term Text Char $ 200 $200
LLTID Lowest Level Term ID Text Char $ 200 $200
PRIMPATH Primary Path Text Char $ 200 $200
LASTCODEDAT Last Coded Date Text Char $ 200 $200
LASTCODEBY Last Coded By Text Char $ 200 $200

List of coding columns for WHODrug:

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
CRSTATUS Coding Status Text Char $ 200 $200
DICTTYPE Dictionary Type Text Char $ 200 $200
DICTVER Dictionary Release Text Char $ 200 $200
ATC1 ATC 1 Text Char $ 200 $200
ATC1CD ATC 1 Code Text Char $ 200 $200
ATC2 ATC 2 Text Char $ 200 $200
ATC2CD ATC 2 Code Text Char $ 200 $200
ATC3 ATC 3 Text Char $ 200 $200
ATC3CD ATC 3 Code Text Char $ 200 $200
ATC4 ATC 4 Text Char $ 200 $200
ATC4CD ATC 4 Code Text Char $ 200 $200
PREFNAME Preferred Name Text Char $ 200 $200
PREFCODE Preferred Code Text Char $ 200 $200
DRUGNAME Drug Name Text Char $ 200 $200
DRUGCODE Drug Code Text Char $ 200 $200
PREFBASE Preferred Base Code Text Char $ 200 $200
PREFLABEL Preferred Base Text Char $ 200 $200
LASTCODEDAT Last Coded Date Text Char $ 200 $200
LASTCODEBY Last Coded By Text Char $ 200 $200

List of coding columns for JDrug:

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
CRSTATUS Coding Status Text Char $ 200 $200
DICTTYPE Dictionary Type Text Char $ 200 $200
DICTVER Dictionary Release Text Char $ 200 $200
DRUGCODE Drug Code Text Char $ 200 $200
DRUGNAME Drug Name Text Char $ 200 $200
GDRUGNAME Generic Drug Name Text Char $ 200 $200
DRUGCODECAT1 Drug Code Category 1 Text Char $ 200 $200
DRUGCODECAT2 Drug Code Category 2 Text Char $ 200 $200
USECAT1 Use Category 1 Text Char $ 200 $200
USECAT2 Use Category 2 Text Char $ 200 $200
MANNAME Manufacturer Name Text Char $ 200 $200
MANCODE Manufacturer Code Text Char $ 200 $200
MAINTFLG Maintenance Flag Text Char $ 200 $200
MAINTDT Maintenance Date Text Char $ 200 $200
LASTCODEDAT Last Coded Date Text Char $ 200 $200
LASTCODEDBY Last Coded By Text Char $ 200 $200

Definitions File

The Study Data Extract ZIP file contains a definition file, which is a listing of all columns across all datasets included in the extract.

List of columns included in the Definitions file:

Column Header Data
Dataset Name Name of the dataset
Column Name of the column
Type Item type for clinical data items
SAS Type SAS type of the column
SAS Informat SAS informat applied
SAS Format SAS format applied
SAS Length SAS length
Label SAS Label of the column

System Datasets

Along with clinical datasets, the Study Data Extract file contains several system datasets that contain operational data and metrics.

SYS_SITE

The SYS_SITE dataset lists all sites for the selected study.

List of SYS_SITE columns:

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
STUDYID Study Text Char $ 100 100
COUNTRY Study Country Text Char $ 100 100
SITENUM Study Site Number Text Char $ 100 100
SITENAME Study Site Name Text Char $ 100 100
INVNAME Principal Investigator Text Char $ 100 100
CASEBDEF Casebook Version Integer Char $ 100 100
STATUS Site Status Text Char $ 100 100
TIMEZONE Timezone of the Site Text Char $ 100 100
GUID Internal Vault ID of the site Text Char $ 100 100
LASTRUN Last run of this data/listing Datetime Num : vdatetime. DATETIME22.3 14 8

SYS_SUB

The SYS_SUB dataset lists all subjects for the selected study.

List of SYS_SUB columns:

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
STUDYID Study Text Char $ 100 100
COUNTRY Country Text Char $ 100 100
SITENUM Study Site Number Text Char $ 100 100
SUBJID Subject Text Char $ 100 100
CASEBDEF Casebook Version Integer Char $ 100 100
STATUS Status of the Subject/Casebook Text Char $ 100 100
GUID Internal Vault ID of the site Text Char $ 100 100
LASTRUN Last run of this data/listing Datetime Num : vdatetime. DATETIME22.3 14 8
LATESTARM Latest Arm Text Char $ 100 100
LATESTCOHORT Latest Cohort Text Char $ 100 100
LATESTSUBSTUDY Latest Substudy Text Char $ 100 100
SCRDDT Screened date Date Num : vdate. DATE9. 14 8
SCRFAILDT Screen failed date Date Num : vdate. DATE9. 14 8
ENRDDT Enrolled date Date Num : vdate. DATE9. 14 8
RDMDDT Randomized date Date Num : vdate. DATE9. 14 8
ENDTRTDT End of treatment date Date Num : vdate. DATE9. 14 8
WTHDRWNDT Withdrawn date Date Num : vdate. DATE9. 14 8
CMPLTDT End of study date Date Num : vdate. DATE9. 14 9

Please note that the LATESTARM, LATESTCOHORT, and LATESTSUBSTUDY columns will only display if you have Arms & Cohorts enabled.

SYS_EVT

The SYS_EVT dataset lists all events as well as event dates and status for the selected study.

List of SYS_EVT columns:

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
STUDYID Study Text Char $ 100 $100
COUNTRY Country Text Char $ 100 $100
SITENUM Study Site Number Text Char $ 100 $100
SUBJID Subject Text Char $ 100 $100
EVENT Event Text Char $ 200 $200
EVENTDT Event Date Date Num : vdate. DATE9 14 8
ESEQ Event Group Sequence Integer Num 14 8
CASEBDEF Casebook Version Integer Char $ 20 $20
STATUS Status of the event/visit Text Char $ 100 $100
LASTREAS Last change reason Text Char $ 200 $200
GUID Internal Vault ID of the site Text Char $ 100 $100
LASTRUN Last run of this data/listing Datetime Num : vdatetime. DATETIME22.3 14 8
EVENTID Event External ID Text Char $ 200 $200
LASTMODDAT Last modified date/time of the event date (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
WINSTAT Scheduled Window Status Text Char $ 100 $100
DAYSOW Days Outside Window Integer Num 14 8
EXPFORMS Expected Number of Forms Integer Num 14 8
FORMEOD Number of days past due when one or more forms is not yet complete Integer Num 14 8

SYS_FORM

The SYS_FORM dataset lists all forms for the selected study, including forms Intentionally Left Blank as well as review statuses, freeze and lock statuses, and additional metrics.

List of SYS_FORM columns:

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
STUDYID Study Text Char $ 100 $100
COUNTRY Country Text Char $ 100 $100
SITENUM Study Site Number Text Char $ 100 $100
SUBJID Subject Text Char $ 100 $100
EVENT Event Text Char $ 200 $200
EVENTDT Event date Date Num : vdate. DATE9. 14 8
ESEQ Event group sequence Integer Num 14 8
FORM Form Text Char $ 200 $200
FSEQ Form Sequence Integer Num 14 8
STATUS Form Status Text Char $ 100 $100
CREATEDT Datetime form created Datetime Num : vdatetime. DATETIME22.3 14 8
SUBMITDT Datetime form last submitted Datetime Num : vdatetime. DATETIME22.3 14 8
NSUBMITS Number of times form was submitted Integer Num 14 8
SDVDT Datetime considered SDV complete (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
TOSDV Days from submit to SDV Float Num 14 8
DMRDT Datetime considered DMR complete Datetime Num : vdatetime. DATETIME22.3 14 8
DMRLAST Datetime of last DMR modification (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
TODMR Days from submit to DMR Float Num 14 8
FREEZEDT Datetime frozen (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
TOFREEZE Days from submit to freeze Float Num 14 8
LOCKDT Datetime locked (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
TOLOCK Days from submit to lock Float Num 14 8
SIGNDT Datetime signed (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
LSTSGNDT Last Signature Date Datetime Num : vdatetime. DATETIME22.3 14 8
TOSIGN Days from submit to sign Integer Num 14 8
EDT_SUBM Days from event date (of form) to submit date Float Num 14 8
ILB Intentionally Left Blank Text Char $ 100 $100
ILBREAS Reason for Intentionally Left Blank Text Char $ 200 $200
GUID Internal Vault ID of the event Text Char $ 100 $100
LASTRUN Last run of this data/listing Datetime Num : vdatetime. DATETIME22.3 14 8
EVENTEID Event External ID Text Char $ 200 $200
FORMEID Form External ID Text Char $ 100 $100
SDVREQ SDV Required Boolean Char $ 1 $1
SDVLAST Datetime of last SDV modification (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
DMRREQ DMR Required Boolean Char $ 1 $1

SYS_Q

The SYS_Q dataset lists all queries for the selected study, along with some query metrics. Only the first query message is listed in this dataset. The full list of query messages (first message, answers, etc.) is included in the SYS_QT dataset.

List of SYS_Q columns:

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
STUDYID Study Text Char $ 100 $100
COUNTRY Country Text Char $ 100 $100
SITENUM Study Site Number Text Char $ 100 $100
SUBJID Subject Text Char $ 100 $100
EVENT Event Text Char $ 200 $200
ESEQ Event sequence Integer Num 14 8
FORM Form Text Char $ 200 $200
FSEQ Form Sequence Integer Num 14 8
ITEMGROUP Item group Text Char $ 200 $200
IGSEQ Item group sequence Integer Num 14 8
ITEM Item Text Char $ 200 $200
QUERYID Query ID Text Char $ 200 $200
STATUS Query status Text Char $ 100 $100
MANUAL Manual query Text Char $ 200 $200
RULEID Rule name Text Char $ 200 $200
TRIGID Trigger name Text Char $ 200 $200
QTEXT Initial query text/message Text Char $ 200 $200
CREATED Created date/time (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
CREATEDB Created by Text Char $ 100 $100
QUERYTEAM Query team Text Char $ 100 $100
QTFRESP Time to first response (days) Float Num 14 8
QAGE Query Age (days) Float Num 14 8
QOTOCL Query open to close (days) Float Num 14 8
GUID Internal Vault ID of the event Text Char $ 100 $100
LASTRUN Last run of this data/listing Datetime Num : vdatetime. DATETIME22.3 14 8
EVENTEID Event External ID Text Char $ 200 $200
FORMEID Form External ID Text Char $ 100 $100
QCHGDATA Whether the data changed after query creation (field query attached) Boolean Char $ 1 $1

SYS_QT

The SYS_QT dataset lists all query messages for the selected study.

You can link the content of this file with the SYS_Q dataset using the QUERYID column. There can be multiple SYS_QT records matching one SYS_Q record (i.e. a query in SYS_Q can have multiple query messages in SYS_QT).

List of SYS_QT columns:

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
STUDYID Study Text Char $ 100 $100
COUNTRY Country Text Char $ 100 $100
SITENUM Study Site Number Text Char $ 100 $100
SUBJID Subject Text Char $ 100 $100
QUERYID Query ID Text Char $ 200 $200
STATUS Query status Text Char $ 100 $100
QTEXT Initial query text/message Text Char $ 200 $200
TEXTDT Text/Message Date/Time (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
TEXTBY Text/Message By Text Char $ 100 $100
QUERYTEAM Query team Text Char $ 100 $100
GUID Internal Vault ID of the event Text Char $ 100 $100
LASTRUN Last run of this data/listing Datetime Num : vdatetime. DATETIME22.3 14 8

SYS_ILB

The SYS_ILB dataset lists all items that have been marked Intentionally Left Blank in the selected study.

List of SYS_ILB columns:

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
STUDYID Study Text Char $ 100 $100
COUNTRY Country Text Char $ 100 $100
SITENUM Study Site Number Text Char $ 100 $100
SUBJID Subject Text Char $ 100 $100
EVENT Event Text Char $ 200 $200
ESEQ Event Sequence Integer Num 14 8
FORM Form Text Char $ 200 $200
FSEQ Form Sequence Integer Num 14 8
IGROUP Item Group Text Char $ 100 $100
IGSEQ Item Group Sequence Integer Num 14 8
ITEM Item Text Char $ 200 $200
ITEMDEF Item Definition Text Char $ 200 $200
LABANALYTENAME Analyte Name Text Char $ 200 $200
DATASETNAME Dataset name Text Char $ 100 $100
COLNAME Column name Text Char $ 100 $100
ILBREAS Reason for intentionally left blank Text Char $ 200 $200
GUID Internal Vault ID of the site Text Char $ 100 $100
LASTRUN Last run of this data/listing Datetime Num : vdatetime. DATETIME22.3 14 8
EVENTEID Event External ID Text Char $ 200 $200

The SYS_LINKS dataset lists all forms that belong to a form link in the selected study. Forms that are linked together will share the same link ID, which will display in the GUID column. The form unique identifier (FGUID) can then be used to identify a specific form and access its data.

List of SYS_LINKS columns:

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
STUDYID Study Text Char $ 100 $100
COUNTRY Country Text Char $ 100 $100
SITE Study Site Number Text Char $ 100 $100
SUBJID Subject Text Char $ 100 $100
EGROUP Event group Text Char $ 200 $200
EVENT Event Text Char $ 200 $200
EVENTEID Event external ID Text Char $ 200 $200
ESEQ Event group sequence Integer Num 14 8
FORM Form Text Char $ 20 $20
FORMEID Form external ID Text Char $ 20 $20
FSEQ Form Sequence Integer Num 14 8
CREATED Created date/time (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
CREATEDB Created by Text Char $ 100 $100
FGUID Internal vault ID (forms) Text Char $ 100 $100
GUID Internal Vault ID of the site Text Char $ 100 $100
LASTRUN Last run of this data/listing Datetime of when the extract was run (UTC)
LINKEDTO Forms linked to dataset Datetime Num : vdatetime. DATETIME22.3 14 8

SYS_ASM

The SYS_ASM dataset lists the following information about assessments, if assessments are configured for a Study.

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
STUDYID Study Text Char $ 100 $100
COUNTRY Country Text Char $ 100 $100
SITENUM Study Site Number Text Char $ 100 $100
SUBJID Subject Text Char $ 100 $100
ASMNAME Assessment Name Text Char $ 200 $200
ASMLABEL Assessment Label Text Char $ 200 $200
ASMTYPE Assessment Type Text Char $ 200 $200
SOURCEF Source Form Text Char $ 200 $200
STATUS Status of Assessment Text Char $ 200 $200
CREATEDT Created Date/Time (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
COMPDT Completed Date/Time (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
NSUBMITS Submit Count Text Char $ 100 $100
LASTMODDT Last Modified Date/Time (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
LASTMODB Last Modified By Text Char $ 100 $100
GUID Assessment Vault ID Text Char $ 100 $100
FGUID Source Form Vault ID Text Char $ 100 $100
LASTRUN Last run of this data/listing Datetime Num : vdatetime. DATETIME22.3 14 8

SYS_ASMR

The SYS_ASMR dataset lists questions and answers for all assessments in a Study, if assessments are configured.

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
STUDYID Study Text Char $ 100 $100
COUNTRY Country Text Char $ 100 $100
SITENUM Study Site Number Text Char $ 100 $100
SUBJID Subject Text Char $ 100 $100
ASMLABEL Assessment Label Text Char $ 200 $200
QUESTEXT Question Text Text Char $ 200 $200
QUESANS Question Answer Text Char $ 200 $200
CREATEDT Created Date/Time (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
LASTMODDT Last Modified Date/Time (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
LASTMODB Last Modified By Text Char $ 100 $100
AGUID Assessment Vault ID Text Char $ 100 $100
LASTRUN Last run of this data/listing Datetime Num : vdatetime. DATETIME22.3 14 8

SYS_PD

The SYS_PD dataset lists the following Protocol Deviations information, if Protocol Deviations is enabled.

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
STUDYID Study Text Char $ 100 $100
COUNTRY Country Text Char $ 100 $100
SITENUM Study Site Number Text Char $ 100 $100
SUBJID Subject Text Char $ 100 $100
PDID Protocol Deviation Identifier Text Char $ 200 $200
PDSUM Protocol Deviation Summary Text Char $ 1000 $1000
PDDAT Date of Deviation Date Num : vdate. DATE9. 14 8
PDDATID Date Identified Date Num : vdate. DATE9. 14 8
PDCAT Category Text Char $ 100 $100
PDSUBCAT Subcategory Text Char $ 100 $100
PDSEV Severity Text Char $ 100 $100
PDDESC Description Text Char $ 1000 $1000
PDSTATUS Protocol Deviation Status Text Char $ 100 $100
PDRES Protocol Deviation Resolution Text Char $ 100 $100
PDRULE Protocol Deviation Rule Text Char $ 100 $100
EGROUP Event Group Text Char $ 100 $100
EVENT Event Text Char $ 100 $100
ESEQ Event Sequence Text Char $ 100 $100
FORM Form Text Char $ 100 $100
IGROUP Item Group Text Char $ 100 $100
IGSEQ Item Group Sequence Text Char $ 100 $100
ITEM Item Text Char $ 100 $100
CREATEDT Created Date/Time (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
CREATEDB Created By Text Char $ 100 $100
LASTMODDT Last Modified Date/Time (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
USERMODDT User Modified Date Text Char $ 100 $100
USERMODB User Modified By Text Char $ 100 $100
INACBYSYS Inactivated by System Text Char $ 100 $100
LASTINACDT Last Inactivated Date Text Char $ 100 $100
LASTRUN Last run of this data/listing Datetime Num : vdatetime. DATETIME22.3 14 8

SYS_RAND

The SYS_RAND dataset lists the following Randomization information, if Randomization is enabled.

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
STUDYID Study Text Char $ 100 $100
COUNTRY Country Text Char $ 100 $100
SITE Study Site Number Text Char $ 100 $100
SUBJID Subject Text Char $ 100 $100
RANDSTATUS Randomization Status Text Char $ 200 $200
RANDID Randomization ID Text Char $ 200 $200
RANDSTRATA Strata Group Text Char $ 200 $200
RANDBY Randomized By Text Char $ 200 $200
RANDDAT Randomized Date Datetime Num : vdatetime. DATETIME22.3 14 8
RANDFILE Randomization File Label Text Char $ 200 $200
RANDFILEDAT Randomization File Upload Date Datetime Num : vdatetime. DATETIME22.3 14 8
LASTRUN Last run of this data/listing Datetime Num : vdatetime. DATETIME22.3 14 8

SYS_LABRANGES

The SYS_LABRANGES dataset lists the following Labs information, if Labs is enabled.

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
LABID Lab ID Text Char $ 100 $100
LABLOC Lab Location Text Char $ 200 $200
LABANALYTENAME Analyte Name Text Char $ 200 $200
LABANALYTELABEL Analyte Label Text Char $ 200 $200
LABREFRSTATUS Lab Reference Range Status Text Char $ 20 $20
LABTESTCODE Test Code Text Char $ 200 $200
LABSEX Sex Text Char $ 20 $20
LABLOWERAGE Lower Age Integer Num 14 8
LABUPPERAGE Upper Age Integer Num 14 8
LABAGEUNIT Age Unit Text Char $ 100 $100
LABFEMCYCLE Female Cycle Text Char $ 100 $100
LABLNORMAL Lower Normal Text Char $ 100 $100
LABUNORMAL Upper Normal Text Char $ 100 $100
LABMEASUNIT Measurement Unit Text Char $ 100 $100
LABMEASUNITLABEL Measurement Unit Label Text Char $ 100 $100
LABCODENORM Codelist Normal Text Char $ 100 $100
LABTEXTNORM Text Normal Text Char $ 100 $100
LABMODIFIER Lab Modifier Text Char $ 5 5
LABFROMDT Effective From Date Datetime Num : vdatetime. DATETIME22.3 14 8
LABTODT Effective To Date Datetime Num : vdatetime. DATETIME22.3 14 8
LABDESCR Description Text Char $ 100 $100
LABSPECTYPE Specimen Type Text Char $ 100 $100
LABTESTMET Testing Method Text Char $ 100 $100
CREATB Created By Text Char $ 100 $100
CREATEDT Created Date/Time (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
LASTMODB Last Modified By Text Char $ 100 $100
LASTMODDT Last Modified Date/Time (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
LASTRUN Last run of this data/listing Datetime Num : vdatetime. DATETIME22.3 14 8

SYS_LABLOC

The SYS_LABLOC dataset lists the following Labs information, if Labs is enabled.

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
LABID Lab ID Text Char $ 100 $100
LABLOC Lab Location Text Char $ 100 $100
LABLOCSTATUS Lab Location Status Text Char $ 100 $100
LABAPPROVED Approved Lab Location Text Char $ 100 $100
LABADDRESS Lab Address Text Char $ 200 $200
LABCOUNTRY Lab Country Text Char $ 200 $200
LABCONTACT Lab Contact Name Text Char $ 200 $200
LABCONTITLE Lab Contact Title Text Char $ 200 $200
LABTEL Lab Telephone Text Char $ 20 $20
LABFAX Lab Fax Text Char $ 100 $100
LABPRIMEMAIL Lab Primary Email Text Char $ 100 $100
LABSECMAIL Lab Secondary Email Text Char $ 100 $100
LABSTATUS Lab Status Text Char $ 100 $100
CREATEDB Created By Text Char $ 100 $100
CREATEDT Created Date/Time (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
LASTMODB Last Modified By Text Char $ 100 $100
LASTMODDT Last Modified Date/Time (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
GUID Internal Vault ID Text Char $ 100 $100
LASTRUN Last run of this data/listing Datetime Num : vdatetime. DATETIME22.3 14 8

SYS_ANALYTES

The SYS_ANALYTES dataset lists the following Labs analyte information, if Labs is enabled.

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
LABANALYTENAME Analyte Name Text Char $ 200 $200
LABANALYTELABEL Analyte Label Text Char $ 200 $200
LABDESCR Description Text Char $ 1000 $1000
LABDATATYPE Data Type Text Char $ 200 $200
LABMEASUNIT Measurement Unit Text Char $ 200 $200
LABCODELIST Codelist Text Char $ 200 $200
LABMODIFIER Lab Modifier Text Char $ 5 5
LABSPECTYPE Specimen Type Text Char $ 200 $200
LABLEN Length Integer Num 14 8
LABPREC Precision Text Char $ 200 $200
LABSYSMANAGED System Managed Text Char $ 5 5
LABTESTMET Testing Method Text Char $ 200 $200
LABLOINC LOINC Code Text Char $ 200 $200
LABSDTMNM SDTM Name Text Char $ 1000 $1000
CREATEDB Created By Text Char $ 100 $100
CREATEDT Created Date/Time (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
LASTMODB Last Modified By Text Char $ 100 $100
LASTMODDT Last Modified Date/Time (UTC) Datetime Num : vdatetime. DATETIME22.3 14 8
GUID Internal Vault ID Text Char $ 100 $100
LASTRUN Last run of this data/listing Datetime Num : vdatetime. DATETIME22.3 14 8

Form, Event, & Query Status

Below are statuses and their descriptions for forms, events, and queries.

Form Status

Status Description
Blank The Form has not been submitted and has no queries, no autosaves, and no fields left blank.
In Progress The Form has either not been submitted since the last data change or there are tasks that have not been completed on the Form.
Submitted The Form has been submitted and there are no open tasks.
In Progress Post-Submit During a retrospective casebook amendment, there is an option to set the form status to In Progress for any modified forms.

Event Status

Status Description
Planned The Event has no Actual Start Date or is scheduled in the future.
Blank The Event has an Actual Start Date, but all forms are in the Planned status.
In Progress The Event has an Actual Start Date and at least one form is not Submitted or has at least one task. This status can also indicate that at least one but not all Forms in the Event are frozen or locked.
Submitted All Forms in the Event are submitted and no forms have a Planned or Blank status.
Did Not Occur The Event did not occur during study conduct.

Query Status

Status Description
Open "Open" indicates that a query has not been answered.
Answered "Answered" indicates that a query has been addressed. For example, the site has responded with a reason for the queried value.
Closed "Closed" indicates that a query requires no further action or discussion. A CRA or Data Manager can reopen a closed query if needed.

SAS Format

If you select SAS as the Export File Format when scheduling your Study Data Extract job, the generated ZIP file contains two additional folders:

  • SAS: Contains sas7bdat files for each dataset
  • XPT: Contains XPT files for each dataset

Key columns use fixed conversions to either num or char SAS types, whereas clinical data columns are converted to a specific SAS type based on the item data type in EDC.

SAS validations and formatting:

  • Column names are limited to 32 characters. The system will truncate as necessary to append special suffixes like _TRANSU or _D.
  • SAS labels are limited to 255 characters. Labels that exceed 250 characters are truncated and have an ellipsis (…) added to the label.
  • If a dataset starts with a number, it will be prefixed with an underscore (_).
  • If a clinical dataset has the same name as one of the system datasets, it will be appended with “_X” (X being a number that can be incremented).
  • If a column header starts with a number, it will be prefixed with an underscore (_).
  • If a value is entered with carriage returns (new line), these characters are replaced with spaces.
  • If the same column header is present multiple times (same item used in multiple item groups in the same form), the system appends “_X” (X being a number that can be incremented) to the column header. For column headers that already have a suffix (e.g. _TRANS), the “_X” suffix will be appended after the special suffix is added (_TRANS_2).

In 21R2, the Datetime in the Site’s timezone and the Datetime in the timezone of the user running the job will be treated as text and will no longer use the SAS formats, such as DATETIME22.3.

FTP Connection

As of 21R2, you can select the FTP path that you would like your files to go into. For example, if you specify a directory called custom_directory/, the file will be placed in the path of /root/u(userid)/(custom_directory). The “/(userid)/” indicates that the file will be placed in the home directory of the user for which the FTP connection is set up. Please note that you must append a trailing “/” to your directory path (e.g. custom_directory/).

You can specify up to 3 directory paths in the FTP connection configuration. The default path is vaultloader, which places the file in “/root/u(userid)/Study Data Extracts” for the SDE.

Please note that the default behavior for Vault Owners is to place the files in the user’s root directory, not in the /root/u(userid)/. For example, files would be placed in /root/Study Data Extracts if the directory is vaultloader.

FTP connection SDE

Pictured above: an example configuration for SDE with a default destination directory path of vaultloader, which will place files in the /Study Data Extracts folder.

FTP connection SDE

Pictured above: an example configuration for SDE with a destination directory path with multiple levels.

To configure a connection, see Connecting to an FTP Server from EDC Tools.

Custom Objects

You can include custom objects in your exports by adding records to the Study Data Extract Custom Object Config object in Business Administration where you must specify the Name of the record, custom object name (objectname__c), and the Study that you’re going to run the SDE for.

Once you’ve configured a custom object and run the SDE job, you will receive a file named CUSTOM_.csv, which will have records for each custom object you configured.

The order of the columns in the custom object file follows the Page Layout configuration, with the exception of the GUID of the object always appearing at the beginning of the file.

Duplicate column names are appended with an _2, _3, etc. to prevent column name conflicts.

List of Custom Objects columns:

Column Header Label Type SAS Type SAS Informat SAS Format Length SAS Length
GUID ID Text Char $ 200 $200
NAME Name Text Char $ 128 $128
STATUS Status Text Char $ 200 $200
CREATEDB Created By Text Char $ 200 $200
CREATED Created Date Datetime Num : vdatetime. DATETIME22.3 14 8
MODIFIEDB Modified By Text Char $ 200 $200
DLASTMOD Last Modified Date Datetime Num : vdatetime. DATETIME22.3 14 8
OBJECT_TYPE Object Type Text Char $ 200 $200

Running the Study Data Extract Job

To run the Study Data Extract job:

  1. Navigate to your Study in Tools > EDC Tools.
  2. Click Jobs.
  3. Click + New Job.
  4. Select Study Data Extract for Type. Study Data Extract new job

  5. Optional: Select Include Restricted Data to include data from restricted Forms in your extract.
  6. Choose which version of the SDE you’d like to run. Each job version determines which files are included in the extract and which version of the CSVs must be used. Please note that the default version selected will be the latest version that is supported in the current release.
  7. Optional: Make your selections for Clinical Data Options to determine how Vault handles clinical data formatting.
  8. Optional: Enter a ZIP File Name. This is the filename that Vault will use for the generated ZIP file.
  9. Select an Export File Type.
  10. Optional: Select an External Connection to receive the output file. Otherwise, Vault will send you a link to download the output when the job finishes.
  11. Optional: Configure a Frequency to run the job on a recurring basis. Learn more about scheduled jobs.
  12. Click Run Now.