Hello everyone, and welcome to today's webinar. Get ready in one click: clean SDTM datasets by using library backend Rules. Feel free to ask questions throughout the webinar by using the widget at the bottom of your screen. They will be answered at the end of the webinar. Let me introduce you to today's speaker, Glenn Borgmans. Glenn holds a master's degree in biology, and he obtained a PhD in biology from the University of Antwerp in 2019. Shortly afterwards he moved to the field of clinical research, and he has been working as a clinical data manager at SGS Life Sciences for the past year. The floor is yours, Glenn.
Thank you for the introduction, Jana. So let's get started. Welcome everyone, and thank you for joining my presentation. I will talk to you today about, get ready in one click: clean SDTM datasets by using library backend Rules. Here you can see the contents of my presentation today. I will start by giving a short introduction on the challenges when cleaning data, followed by an introduction into our All2One tool. Then I will use a practical showcase to explain the process of selecting backend Rules and creating new Rules. I will then move on to share with you some of the past advancements and future outlook of the All2One tool, and finally end with some conclusions. So first, a short introduction on data cleaning challenges.
One of the biggest challenges data managers must overcome during data cleaning on clinical trials is that there is no uniformity in data coming from various sources. Data can be obtained from different sources such as eCRF systems, external vendors, questionnaires, etc. This wide variety of data makes it hard to define, validate, and implement standard rules for data cleaning and reconciliation. An added degree of difficulty that was encountered during data cleaning came from the fact that for each study, the setup of the data cleaning process had to start all over again, which was a huge disadvantage when working on similar trials within a bigger project from the same client or trials on the same compound in different phases. With this in mind, we at SGS developed an in-house tool called All2One to tackle these issues. So what exactly is the All2One tool?
The All2One tool is an in-house developed application for data cleaning, which was first released in 2011. The system facilitates the data cleaning process for paper-based, eSource, and EDC trials by providing a master rule library. This library is populated with rules that have been validated by a controlled validation process. This approach ensures that no study-specific validation is needed for each new study. The All2One tool runs on the backend SDTM database. The tool can be used by both the clinical data manager (CDM) and the clinical data programmer (CDP). The CDM is able to select all relevant rules for a trial from the library using advanced search functionalities. In case a needed rule is not available in the library, the CDM is able to create new rules, and the programming code that the CDP writes for them is stored in All2One.
Both processes of selecting and creating rules will be explained in more detail throughout this presentation using a practical case. Before we continue, I wanted to ask you the following question: which system do you use to perform data cleaning activities? You can answer this question by clicking on any of the buttons, and please note that multiple answers are possible. Don't forget to press the Submit button in the lower right-hand corner to submit your answers. I will now give you some time to answer the question. Thank you for your answers. So now let's dive into our practical showcase. At the start of every new trial, a clinical data manager will follow the same process each time in order to initiate the data cleaning.
To determine the data cleaning needs of a given study, we will start from the protocol, the electronic case report form (eCRF), the annotated CRF, and all applicable data transfer agreements with external vendors. Using these study documents, the data cleaning package will be created in a specific order. First, all appropriate backend library rules will be selected. After this, the clinical data manager will create the new study-specific rules that are needed in the study. Next, the Help Tables will be updated according to the requirements of the study, and lastly, the manual rules will be created. The data cleaning package will be summarized in an automatically generated data validation plan. All these steps in creating the data cleaning package and the data validation plan will be explained in this presentation, and we will start by going over the selection of backend rules.
So let's take a look at the selection of backend rules. Before the selection process of backend rules can be explained, it is important to mention that all rules present in the library are divided into checks and listings. In general, checks are used to detect simple data issues. All output of a check needs to be investigated, as it will only generate output for records with issues. If all data that a specific check investigates is without issues, then the check will generate no output. The screenshot here shows an example of a check rule, and below we see an example of output without issues. Listings, on the other hand, will in general be used to investigate more complex issues. A listing rule will always generate output, and thus will always need to be looked at in detail.
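The difference between a check and a listing can be illustrated with a minimal sketch. It assumes a toy vital signs table in an in-memory SQLite database; the table layout, the two SQL statements, and the function names are invented for illustration and are not the actual All2One rules.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a toy SDTM-like VS table (hypothetical data)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE vs (usubjid TEXT, vstestcd TEXT, vsorres TEXT, vsdtc TEXT)")
    con.executemany(
        "INSERT INTO vs VALUES (?, ?, ?, ?)",
        [
            ("S-001", "SYSBP", "120", "2021-03-01"),
            ("S-002", "SYSBP", None, "2021-03-02"),  # missing result: a data issue
            ("S-003", "SYSBP", "135", "2021-03-02"),
        ],
    )
    return con

# A check: returns only records with an issue, so an empty result means clean data.
CHECK_MISSING_RESULT = """
    SELECT usubjid, vstestcd, vsdtc
    FROM vs
    WHERE vsorres IS NULL
    ORDER BY usubjid
"""

# A listing: returns every record in scope, so it always has output to review.
LISTING_ALL_RESULTS = """
    SELECT usubjid, vstestcd, vsorres, vsdtc
    FROM vs
    ORDER BY usubjid
"""

def run_rule(con, sql):
    """Run one rule and return its output rows."""
    return con.execute(sql).fetchall()

con = build_demo_db()
check_output = run_rule(con, CHECK_MISSING_RESULT)
listing_output = run_rule(con, LISTING_ALL_RESULTS)
print(check_output)         # only the record with the missing result
print(len(listing_output))  # all records, with or without issues
```

In this sketch, correcting the missing result for S-002 would make the check return no rows at all, while the listing would still return all three records.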
To facilitate the cleaning of listing rules, a system of indications is used, which will mark a specific record with data issues. The screenshot here shows an example of a listing rule, and at the bottom we see a screenshot of the output of a listing rule with indications. Here we see a screenshot of the starting view in the All2One tool, but let's zoom in a bit. In this zoomed-in screenshot, you can see that there is a lot of functionality present in the All2One tool, but today I will focus on the second tab, which is the rules tab. Clicking the rules tab will open a drop-down menu, as you can see here, and clicking on the highlighted section will open the rule library. Selecting appropriate backend rules can be done directly from this rule library. In this slide, we see a screenshot of the rule library view.
We will also zoom in a bit here. In the rule library, all applicable rules can be selected by a CDM by marking the checkbox in the Copy column indicated here with the orange box. After the CDM has selected all rules that are required for a specific study, they can be copied into the study-specific library by pressing the Copy Selected button in the lower right-hand corner. Rules that are already copied for that specific trial will be highlighted in green, as you can see at the bottom of this screenshot. The All2One tool also allows copying the entire study-specific library from a previous study, which can be done using the drop-down menu highlighted with the orange box. This screenshot shows a detailed view of the Copy From Trial drop-down menu.
It is possible to filter on a number of criteria such as client name, compound name, and even implementation guide-specific libraries. This feature allows the CDM to copy rules from a previous trial and thus drastically reduces the setup time. At all times, a CDM can view a rule in more detail by clicking on the Detail button, which is shown here in the leftmost column. This Detail button will open the view as shown in this screenshot. Let's zoom in a bit here. Besides the rule name, rule description, and if applicable, the Standard Query Message, the detailed view also shows the SQL coding for that specific rule. Certain rules will use Help Tables to facilitate the data cleaning. Help Tables are created for general rules of which the specific parameters often differ between studies, for example, vital signs ranges, visit windows, or required lab assessments.
By using these Help Tables, we are able to avoid creating overly specific rules that are only applicable to a limited number of trials. This screenshot shows an example of the visit window help table, which will be used to investigate if all study visits are done as required. Here we can see another example, which is the vital signs ranges help table. This help table can be used in rules that investigate vital signs results that are out of range. However, across studies, there is a difference in which vital signs assessments are required and which ranges are applicable. Therefore, updating the Help Tables with study-specific parameters allows us to use these general rules across studies. All of the rules, both checks and listings that are used in a specific study, together with the updated Help Tables, will be summarized in an automatically generated data validation plan or DVP.
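The idea of one general rule driven by study-specific Help Table values can be sketched as follows. This is only an illustration under assumptions: the table names (`vs`, `help_vs_ranges`), the column names, the function `out_of_range_vitals`, and the range values are all hypothetical, and the ranges are made-up numbers, not clinical reference limits.

```python
import sqlite3

def out_of_range_vitals(con, study_id):
    """General rule: return vital signs results outside the ranges configured for one study."""
    sql = """
        SELECT vs.usubjid, vs.vstestcd, vs.vsstresn
        FROM vs
        JOIN help_vs_ranges AS h
          ON h.studyid = vs.studyid AND h.vstestcd = vs.vstestcd
        WHERE vs.studyid = ?
          AND (vs.vsstresn < h.low OR vs.vsstresn > h.high)
        ORDER BY vs.usubjid, vs.vstestcd
    """
    return con.execute(sql, (study_id,)).fetchall()

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE vs (studyid TEXT, usubjid TEXT, vstestcd TEXT, vsstresn REAL)")
con.execute("CREATE TABLE help_vs_ranges (studyid TEXT, vstestcd TEXT, low REAL, high REAL)")

# Study-specific parameters live in the help table, not in the rule itself.
con.executemany(
    "INSERT INTO help_vs_ranges VALUES (?, ?, ?, ?)",
    [
        ("STUDY-A", "SYSBP", 90, 140),   # hypothetical range for study A
        ("STUDY-B", "SYSBP", 100, 160),  # hypothetical range for study B
    ],
)
con.executemany(
    "INSERT INTO vs VALUES (?, ?, ?, ?)",
    [
        ("STUDY-A", "A-001", "SYSBP", 150.0),
        ("STUDY-A", "A-002", "SYSBP", 120.0),
        ("STUDY-B", "B-001", "SYSBP", 150.0),
    ],
)

study_a = out_of_range_vitals(con, "STUDY-A")
study_b = out_of_range_vitals(con, "STUDY-B")
print(study_a)
print(study_b)
```

The same result of 150 is reported for study A but not for study B, because only the help table rows differ between the two studies while the rule SQL stays identical.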
As part of our internal review process, this DVP will be reviewed by a second CDM to ensure all study requirements are fulfilled. The DVP can be easily exported and sent to the client for review if required. Now I will go over the process of creating new rules. When a study has certain requirements that cannot be fulfilled with the rules that are currently present in the library, a CDM can initiate the creation of a new rule. This figure shows the entire rule creation flow, which I will explain in more detail. Important to note here is that all rules start as a study-specific rule until a library team member either accepts or rejects the rule. This is indicated here by the orange dotted line. To define a new rule, the CDM completes the rule name, description, and if applicable, the standard query message.
So if we take a look at the detailed rule view, it would be these sections that the CDM would complete. The SQL code is not present yet, as this is completed by the CDP in the second step. The SQL code will be completed in this section. After coding by the CDP, the rule is sent to a second experienced CDP who performs a code review on the SQL code. Once the SQL code is approved, the rule still requires validation with test data. New listing rules are validated to see if the SQL code meets the All2One criteria. A new check rule is validated in the same way, and in addition, predefined discrepancies in the test data are used to confirm that the output of the SQL code matches them. After a rule is validated, it is pending until a library team member accepts or rejects the rule.
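Validating a check rule against predefined discrepancies can be sketched roughly like this. The sketch assumes a toy adverse events table with complete ISO 8601 dates stored as text; the table, the rule SQL, and the helper `validate_check` are hypothetical and only illustrate the principle of planting known errors and comparing them with the rule output.

```python
import sqlite3

# Hypothetical check rule: the end date of an event falls before its start date.
# Complete ISO 8601 dates stored as text compare correctly as strings;
# partial dates would need extra handling that is not shown here.
CHECK_END_BEFORE_START = """
    SELECT usubjid, aeseq
    FROM ae
    WHERE aeendtc < aestdtc
    ORDER BY usubjid, aeseq
"""

def validate_check(con, rule_sql, expected):
    """Return True if the rule output is exactly the set of planted discrepancies."""
    actual = set(con.execute(rule_sql).fetchall())
    return actual == set(expected)

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE ae (usubjid TEXT, aeseq INTEGER, aestdtc TEXT, aeendtc TEXT)")
con.executemany(
    "INSERT INTO ae VALUES (?, ?, ?, ?)",
    [
        ("S-001", 1, "2021-01-10", "2021-01-12"),  # clean record
        ("S-002", 1, "2021-02-05", "2021-02-01"),  # planted discrepancy: end before start
        ("S-003", 1, "2021-03-01", None),          # clean record: no end date yet
    ],
)

# The discrepancies that were entered on purpose in the test data.
predefined_discrepancies = [("S-002", 1)]

validation_passed = validate_check(con, CHECK_END_BEFORE_START, predefined_discrepancies)
print(validation_passed)
```

Comparing sets in both directions means the validation fails both when the rule misses a planted discrepancy and when it fires on a clean record.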
Rejected rules will remain study-specific rules, and accepted rules will become library rules and are from then on available in the general library. Here we can see an example of an LB library rule that is used in almost all studies to check if the clinical significance is completed for out-of-range results. Here we see an example of a newly created study-specific rule in an early phase study. This rule was created for a study that had a high number of study drugs and therefore needed a rule to investigate the relationship to, and the action taken with, each of these study drugs. As this new rule will probably be usable in other studies, it will most likely be accepted by the library team and become a library rule later on. Besides library and study-specific rules, manual rules are also used in exceptional cases.
Manual rules are, for example, used in early phase studies where no SAEs are expected and thus SAE reconciliation is out of scope and not required in the standard rules. In this situation, a manual rule can be added as a proactive approach in order to be prepared in case an SAE does occur. Before we continue, I wanted to ask you the following question: on which database do you perform data cleaning activities of your clinical study? You can answer this question by clicking any of the buttons below. Please note multiple answers are possible, and please do not forget to click the Submit button at the bottom of this page. I will now give you some time to answer the question. Thank you for your answers. Now let's take a look at some of the past advancements and future outlook of the All2One tool.
Since the first release of the All2One tool, a lot of work has gone into improving the tool. One of the earliest improvements was the addition of the change request function, which made it possible to make changes to library rules. Without this possibility to update non-functional rules, there would be a high number of outdated rules in the library that could no longer be used. Another improvement that has been touched upon previously was the implementation of the Help Tables. Prior to the creation of the Help Tables, very study-specific rules were created that could not be used for other studies. The use of Help Tables provides us with more flexibility to use general rules in different studies. The merging of the different implementation guide libraries has been another past improvement.
Prior to this, rules from a specific implementation guide could not be used in studies with a different implementation guide, which led to identical rules needing to be validated for each different implementation guide. The merging of the implementation guide libraries allows for rules to be used in studies with different implementation guides. We are currently working on several updates to the All2One tool. An automated check on the rule naming conventions for new rules will be rolled out soon. This update will make sure that errors in naming new rules will no longer occur. Another improvement that is coming soon will be an automatic assignment flow in All2One for new rules, which will make the rule creation flow outlined previously in the presentation much more efficient.
To allow data cleaning in parallel with SDTM, for example, for cleaning on external vendor data, we are working on having a library of rules that can be used directly on the source data or the raw data. A last update in progress is the addition of tags to library rules to increase efficiency in proposing rules when selecting rules for a new study. Besides the updates that are currently in progress, we also have some plans that would be very interesting to be able to implement in the future. One of these ideas is to create a system that makes it possible to post standard queries automatically in the eCRF system, which would make the cleaning process much more efficient for CDMs. Additionally, we aim to keep expanding and streamlining the existing library.
To conclude, the main drawback of the All2One system is that the rule library will never be finished. The growing library can result in too much choice in rules, which could be a pitfall for new users. To tackle this, it is important to use the filters and search options, and regular maintenance is needed to keep the library as efficient as possible. In our opinion, the All2One system has many benefits. First, there is no need to start all over again for new studies, which results in a shorter setup. The rule creation process allows for immediate release and use of validated library rules, which brings us one step closer to clean data being only one click away. Additionally, the client-specific libraries make it much easier to select rules for a new study of the same client and allow us to provide consistent data cleaning.
Overall, the All2One tool allows more standardization and automation, which will result in less chance for human errors.
Thank you very much for this presentation, Glenn. It was very interesting. If you have any questions, please use the widget at the bottom of your screen and submit your questions now. We still have some time for a couple of questions. Let's see if there's already some coming in. Yes, we already have a first question for you, Glenn. You mentioned the use of predefined discrepancies when validating a check rule. Could you give an example of this, please?
Yeah, that's a great question. So, for example, for a rule that checks if a certain assessment date falls before another date, we would use test data that is incorrect on purpose. We do this in order to check if the rule works as expected.
So that's an example of some of the predefined discrepancies that I mentioned during my presentation. Okay, perfect.
Thank you very much. Another question just came in. What is your personal experience as a clinical data manager with the All2One tool?
Yeah, the All2One tool. So, as I mentioned in the presentation, the most difficult thing when first using the All2One tool is the choice in rules when drafting the cleaning package. There is a lot of choice, and luckily there are good filters and search options that facilitate navigating the library. A big advantage for me personally, as a clinical data manager who only started working in the field a year ago, is that cleaning on SDTM datasets with the All2One tool quickly provides us with an extensive knowledge of SDTM standards, which, as I said, is a huge advantage.
Yeah, okay, perfect.
Thank you very much. Let's see if there are more questions coming in. So far, I don't see any. We'll give the audience a couple of seconds to submit questions. No, I don't think we have any further questions coming in, so we can end the webinar here. Thank you very much for your presentation, Glenn, and thank you very much to the people who attended this webinar. Have a good day.
Thank you, you too, and thank you everyone for joining.