DISCUSSION: SDTM UPDATES MARCH 2013

Public comments

Comments were requested by CDISC on the new and updated domains for SDTM v3.1.4 Batch 1:
• RD - Reproduction Details / RP - Reproductive System Findings
• IS - Immunogenicity Specimen Assessment
• SR - Skin Response
• EX - Exposure
• EC - Exposure as Collected

Overview of the comments:
Most comments were about:
• Label inconsistencies
• Incorrect Type, CORE and Controlled Term/Codelist/Format information
• Spelling errors
• Controlled Terminology updates
• Questions
• Suggestions

Public comments Update: EX / EC domain IG v3.1.4, batch 1
Comments (EXOCCUR/EXREASND):

EX is an Interventions domain, so --PRESP, --OCCUR, --STAT and --REASND are already available. They were excluded from EX only because the current Implementation Guide (page 85) states that ‘EX should contain only medications received’. As this concept has now changed, it makes sense to allow these variables in EX and populate them as in all other domains.

• Also, SDTMIG V3.1.2 section 126.96.36.199 should be updated to document the use of --REASND in combination with --OCCUR=N. The table in this section is used by many electronic compliance-checking tools as the set of allowed situations. It should be clearly stated that an exception is to be made for EX/EC. (Please note that I referred to SDTMIG V3.1.2 because an updated version of this part of the IG is not yet available.)

- In every other SDTM domain, REASND is used for the reason that an observation was not made. In findings domains, this is the reason a test was not done. In events domains, this is the reason that a question about a pre-specified event was not asked. In other interventions domains, this is the reason that a question about a pre-specified intervention was not asked.
REASND is populated when STAT=NOT DONE.

EXSTAT/EXREASND

In this proposal, REASND would be populated when OCCUR = N. If the aim is to record the reason why a dose was not given, then a new variable for "reason not given" (or some such) should be created, instead of overloading the existing variable REASND. Even if we don't foresee using REASND for its original purpose in EX, this proposal would create a domain-specific exception to the general mapping of REASND to BRIDG, complicating the process of mapping future BRIDG-based SHARE content to SDTM.

Public comments Update: EX / EC domain IG v3.1.4, batch 1
Comments (EC Domain):

It is good to have both the CRF-collected dose/units and the protocol-specified ones, but in my opinion it would be a lot easier to introduce something like Standard Dose and Standard Units, as in the Findings domains, and to hold there the dosage and units as specified in the protocol. The originally collected dosage and unit can be kept in EXDOSE and EXDOSU. That would be clearer than creating two very similar domains and linking them via RELREC.

Another argument for the EC/EX solution was that the values from EC can be summarized in EX to support analysis. In my opinion this should not be part of SDTM. SDTM is usually not meant for preparing analysis tables; it should reflect the CRF data as entered. Some values are derived as well, of course, but a whole domain that is only derived, summarizing and pooling treatment data, is perhaps not what SDTM was intended for. To me that sounds more like ADaM.

Update: EX / EC domain IG v3.1.4, batch 1

It is indicated that EXVAMT will be deprecated, but it is not clear to me what replaces it. There are no examples showing collected doses based on a preparation step, such as an infusion where study drug is diluted to a certain concentration in a vehicle such as saline.
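The EXOCCUR/EXREASND comments above turn on what a compliance check would accept. As a rough illustration only, the sketch below encodes the rule quoted by the commenters (REASND is populated when STAT = NOT DONE) together with the proposed EX/EC exception (REASND populated when OCCUR = N). The function name `check_reasnd`, the flag `allow_occur_exception` and the sample record are hypothetical; this is not the SDTMIG table itself nor the logic of any real validation tool.

```python
def check_reasnd(record, domain, allow_occur_exception=False):
    """Return a list of findings for one record (an empty list means no issue).

    Simplified assumption: --REASND is only expected when --STAT = NOT DONE.
    With allow_occur_exception=True, EX/EC records may also carry --REASND
    when --OCCUR = N, as in the proposal under discussion.
    """
    stat = record.get(domain + "STAT", "")
    occur = record.get(domain + "OCCUR", "")
    reasnd = record.get(domain + "REASND", "")

    if not reasnd:
        return []  # nothing to check when no reason is given
    if stat == "NOT DONE":
        return []  # the classic, generally accepted situation
    if allow_occur_exception and domain in ("EX", "EC") and occur == "N":
        return []  # hypothetical domain-specific exception for EX/EC
    return [domain + "REASND is populated but " + domain + "STAT is not NOT DONE"]


# Hypothetical record: a planned dose that was not given.
dose_not_given = {"USUBJID": "SUBJ-001", "EXOCCUR": "N", "EXREASND": "SUBJECT REFUSED"}

strict_result = check_reasnd(dose_not_given, "EX")
proposed_result = check_reasnd(dose_not_given, "EX", allow_occur_exception=True)
```

Under the strict rule the record is flagged, and under the proposed exception it passes. That difference is the domain-specific special case that one commenter asks to have documented and another warns against.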
Public comments Update: RD domain IG v3.1.4, batch 1

Comments:
In the NCI Terminology Codelists an RP (Reproductive System Findings) domain is assigned. Why did it change to RD?

The Response:
It was decided by CDISC that this domain will be renamed to RP (Reproductive System Findings) as per CDISC controlled terminology.

Comments:
• “Why is RDSTDTC not spelled out in the list of variables, given that one might want to know when a particular ‘finding’ occurred? What's being collected is RDDTC, the collection date. For some ‘tests’, like PREGNT, it is important to know when a pregnancy began, not simply the collection date. Reconsider this as a findings class.”

The Response:
RDDTC is the date of the reproductive finding and should not be confused with the collection date relative to when it was recorded on a CRF.

Controlled Terminology Suggestions:
RDTESTCD - PREGSTAT / RDTEST - Pregnancy Status
RDTESTCD - EARLYTRM / RDTEST - Early Termination
RDTESTCD - LIVEBRTH / RDTEST - Live Birth
PREGSTAT - PREGNANCY ONGOING, LIVE BIRTH, STILLBIRTH, EARLY TERMINATION
EARLYTRM - SPONTANEOUS ABORTION, THERAPEUTIC ABORTION, ELECTIVE ABORTION, OTHER
LIVEBRTH - NORMAL, BIRTH DEFECT, OTHER

The Response:
Suggestions for controlled terminology should be sent to the NCI EVS for consideration upon the release of the domain in provisional status.

Public comments Update: IS and SR domains IG v3.1.4, batch 1

Comments:
• IS needs controlled terminology. Many of the tests used as examples are existing LBTESTCD codes, either existing terminology (e.g. HCAB) or extensible (e.g. as in Example 3). Will they be copied from LBTESTCD terms? Will they then be removed from LBTESTCD terminology?
• Will ALL Serology and Immunology tests now reside in this domain rather than LAB? How about VIROLOGY?
Can you provide a list of tests which will move? I think this needs clear guidance, as the definition of serology/immunology in relation to specific testing can be rather vague. Defining the CAT for each LBTESTCD/ISTEST would be helpful.
• Please explain why you don't have an upper limit of quantitation. We have serology data that requires this.

Comments:
• I am concerned about moving categories of data out of LAB into their own domain. Generally all the categories fit well within the existing LAB domain and may only need a few extra variables. My preference would have been to keep serology, immunology and virology within LAB but add extra optional variables to LAB, e.g. ISLLOQ and ASSAYV, with examples of how to use them. Dividing the data up too much leads to problems when people have different opinions as to whether the data is immunology, virology or biomarker. We end up having the same data in different domains due to study team preference. Does LAB just end up being haematology, chemistry and urinalysis?
• ISTEST: Immunoglobulin E is given as an example in the CDISC notes and HCAB in Example 1. These are already lab test codes. Will ISTESTCD/ISTEST share the same codelist as LB? If so, how will users know whether to report results in LB or IS? Does the purpose of the test determine the domain to use?
• Is there any specific reason why the data from IS can't be included in LB, differentiated by LBCAT?

Public comments
Summary of comments on v3.1.4 batch 1:
The EX/EC/IS domains need a lot more thought so that information is not duplicated, and the quality of the material should be checked before it is released to users for review.
Non-standard variables

Update: Non-standard variables IG v3.1.4, batch 2
Biggest Concerns:
• Dataset Size
• Validation
• Cost Impact

NSVs in standard domains
Update: Non-standard variables IG v3.1.4, batch 2

• Permitting NSVs in the parent domain will likely produce datasets that are very wide and sparse in the NSV columns.
• SuppQual increases the overall complexity of SDTM. SDTM isn’t easy to understand and explain. Jettisoning SuppQual would make it a lot more appealing.
• While SAS v5 XPORT files remain the transport mechanism, sparse datasets will result in big files, which the US FDA openly complains about.
• Separating data that belongs together into two datasets is confusing. Given that SuppQual is transposed, it is even more confusing.
• Reviewing wide datasets is very unlikely to improve review efficiency as mentioned in section 3.
• Data transparency would be increased if extra variables were allowed.

There is also the issue of the additional file size that is sure to result from the NSVs being included in the datasets, which flies in the face of often-repeated comments from CDER that we should not have dataset files that are "too big". Most of the records will likely not even have data for many of these NSVs, yet they will have the variables and the character spacing for them. The increase in file size (for all files with Supplemental Variables) should be carefully weighed against the benefit of the "ease of use" for only some reviewers.

"Please develop this. This would make the SDTM datasets more operationally useful, which is likely to lead to better transparency. More information is needed. Varied examples in the IG would be helpful."

No. While this may address some challenges, it will create others, and we think that therapeutic area development is an overriding priority. The compliance-checking is a particularly worrisome prospect.
How can a dataset be standard if it isn't? If CDISC flips their fundamental constructs around as they did when moving from v2 to v3, especially when this is purely a matter of preference, I, and others, will throw up our hands in disgust and question the validity of the CDISC "standard".

Compiling NSVs into the SUPPXX domain structure requires complex programming and subsequent QC algorithms. Programming and data review are more cumbersome and time-consuming in the SUPPXX structure. Sponsors' programming teams spend much time removing NSV data from the original source parent domain where it is collected and moving it to the SUPPXX domain. Then, for internal data review, sponsors must put the data back into the parent domain, where it logically should be reviewed.

In work on studies, when reviewing data and confirming NSVs, it was found best to have them linked to the parent domain.

At GSK we have invested millions of dollars and several years into developing an SDTM-compliant submission data development system. The creation of SUPP datasets is embedded into this system and is an integral part of what defines the standard. Everyone knows that SUPP datasets can be transposed and merged for operational purposes. The existence of a standard is supposed to protect sponsors from having to adapt to various user preferences. Changing SDTM to make it "optional" to include NSVs in the main domain dataset will mean that reviewers will request it, so GSK would by necessity be forced to redesign our data creation and compliance-checking systems at great expense.

Companies with existing SuppQual infrastructure, or companies that derive revenue by managing SuppQual for other companies, will be advocating to keep the SuppQual paradigm.
The SuppQual paradigm favours larger companies that can provide the infrastructure and staff to handle the increased complexity associated with SuppQual operations. The current SuppQual convention ignores the fact that not all data is ultimately submitted to the FDA, so the SuppQual costs are incurred without benefit. Given the rate of failure in pharmaceutical development, this doesn’t seem optimal.

For organizations that have already invested significant resources in developing a system to handle raw data at the operational level and then create SDTM-compliant data, there would be an obvious negative impact. These organizations have already found a way to collect and clean the data, then separate the non-standard data into SUPPQUAL datasets. If this proposal eliminates SUPPQUAL, that investment of resources is lost. This is not, however, the only potential cost. Depending on how the proposal is developed, for all organizations the potential exists that adding NSV data to the standardized domains will require study-level custom programming to draw these into the ADaM data. (As SDTM stands, NSV data is 100% predictable: values always appear in QVAL, transposed variable names are always in QNAM, etc.) Creating standardized programming to build analysis datasets is easier in the current model.

Dropping the use of SuppQuals in favour of using NSVs is a very good idea for the following reasons:
• Maintaining the infrastructure to transpose and merge SuppQual variables for the operational, non-submission business use case is a substantial effort.
• With the current standard, extra costs are incurred no matter which paradigm a pharmaceutical company chooses.
• Converting operational, wide datasets to strict SDTM with SuppQual is expensive.
• Converting strict SDTM with SuppQual to operational, wide datasets is expensive.
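Several comments refer to transposing SuppQual records and merging them back onto the parent domain for operational use. A minimal sketch of that step is given here, assuming the usual SUPP-- layout in which QNAM holds the variable name, QVAL the value, and IDVAR/IDVARVAL identify the parent record. The function `merge_suppqual`, the QNAM value `NSVFLAG` and the sample rows are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def merge_suppqual(parent_rows, supp_rows):
    """Transpose SUPP-- rows (QNAM/QVAL) and attach them to parent rows as columns."""
    # Collect qualifier values per parent key: (USUBJID, IDVAR, IDVARVAL).
    extras = defaultdict(dict)
    for supp in supp_rows:
        key = (supp["USUBJID"], supp["IDVAR"], str(supp["IDVARVAL"]))
        extras[key][supp["QNAM"]] = supp["QVAL"]

    qnams = sorted({supp["QNAM"] for supp in supp_rows})
    idvars = sorted({supp["IDVAR"] for supp in supp_rows})

    merged = []
    for row in parent_rows:
        wide = dict(row)
        # Every parent row gets every NSV column, which is why the wide
        # layout tends to be sparse.
        for qnam in qnams:
            wide[qnam] = ""
        for idvar in idvars:
            key = (row["USUBJID"], idvar, str(row.get(idvar, "")))
            wide.update(extras.get(key, {}))
        merged.append(wide)
    return merged


# Hypothetical parent (AE) and supplemental (SUPPAE) records.
ae = [
    {"USUBJID": "SUBJ-001", "AESEQ": 1, "AETERM": "HEADACHE"},
    {"USUBJID": "SUBJ-001", "AESEQ": 2, "AETERM": "NAUSEA"},
]
suppae = [
    {"USUBJID": "SUBJ-001", "RDOMAIN": "AE", "IDVAR": "AESEQ",
     "IDVARVAL": "1", "QNAM": "NSVFLAG", "QVAL": "Y"},
]

merged = merge_suppqual(ae, suppae)
```

In this sketch only the first AE record carries a value for `NSVFLAG`, while the second receives an empty column. The example touches both sides of the debate: the SUPP-- layout is predictable enough to be merged generically, and the merged dataset is wide and sparse.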
We are not in favour of the proposed change. We think that there would be a heavy cost impact because of the work needed to change existing standards as implemented by industry. We feel that the change is not backwards compatible with existing work. The impact on validation via OpenCDISC or other tools should also be considered. The possibility that this will open the door to the inclusion of more derived variables in SDTM data is also a concern.

No. Roche's "operational" SDTM datasets include NSVs. Roche has a standard procedure to separate the operational datasets into parent and supplemental qualifier domains and create study metadata (define.xml). As SDTM exists today, SUPPQUAL supports collection of standardized data. Using SUPP-- to extend the standard requires additional effort. Sponsors are incentivized to look for existing variables that meet each purpose or to find related datasets (e.g., FA) to contain the data. Should a sponsor find no other way to meet the standards, it is more likely to create a review process to justify the deviation from accepted industry standards. (Properly building a SUPP-- domain for submission requires planning, programming resources, QC effort, etc.) By making it simpler to add non-standard data, this proposal increases the temptation to work outside the established standards and may negate the gains provided.

Yes. Roche has queried members of TBI and the consensus supports this proposal. Roche would like to see this proposal prioritized and expediently pushed through the SDS development process. Given the resource requirements of the TA standards development, Roche encourages CDISC to explore alternate resourcing methods. For example, could this proposal be formalized as part of an FDA/PhUSE project?
It will also be much easier to use the variables in ADaM. Consequently, we at BI support this proposal of keeping NSVs in the parent domain with clear rules such as standard variable prefixes (e.g. SP_NSV). This would help streamline SDTM dataset creation and subsequent data review.

Any comments?