Archive for the ‘Skills’ Category

Practical SOPs – a quick format

Wednesday, November 2nd, 2011
How do you make your SOPs as practical as possible without omitting the regulatory requirements and the procedural helicopter view? After a decade of creating SOPs for several companies, I found a structure with which I can easily create SOPs within 6 weeks, and which works even better as a team effort. Its main features include the primary colours and the SOP/User Manual combination.

Use the 4 primary colours to distinguish between SOP, work instructions, forms and templates.
1. Clear blue for the SOP.
2. Green for the step-by-step work instructions.
3. Red for the approval forms.
4. Yellow (optionally with a grey background) for any templates.

Secondly, create 2 documents for every procedure.
1. One for the WHY, WHAT, WHO and WHEN questions: the blue SOP.
2. The other containing the practical work-out, the HOW question, in your organization, with your system(s) and allocated tasks: the User Manual, containing the green actions (steps to take in a certain order), the yellow templates and the red required forms for approval.


Any idea what that would result in?
SOPs that are well thought out, reflecting the team effort, and that provide an overview of the solid choices made, the input needed and the results produced.

User Manuals with screenshots and incorporated tracking, with which (new) team members can easily conduct the largest part of the work, and through which they can focus on fulfilling the more complex study requirements and questions.
Created by a team member who provides the practical actions, the practical steps, in the required order, and tested and reviewed by a colleague.

Both documents are equally important. But as the User Manual reflects the practical steps, it is subject to change more often than the SOP. Hence a separate User Manual.

Any idea what that would look like?
Example ProCDM SOP clinical study documentation
Example ProCDM User Manual clinical study documentation

Any help to get your team ready and enthusiastic to create or update their SOPs?
Contact ProCDM for the package to complete every clinical data management SOP you need. Contact information ProCDM.

Kind regards, Maritza


© 2011, Maritza Witteveen, ProCDM

You’re welcome to re-publish this newsletter if you add the following text to it. This is an article from ProCDM. Data management for clinical research. Receive tips and the free e-book ‘Five strategies to get reliable, quality clinical data’ by subscribing via


From CRFs to datasets – 5 examples

Tuesday, October 25th, 2011


One of the three main tasks of data management is to translate individual subject data to logically grouped datasets ready for analysis. Study data captured in a structured format with which the statistician can work. But with what datasets can the statistician work?
In fact, with everything, because he or she is capable of transforming your datasets into suitable datasets with the statistical software. So the better question is: with what datasets can the statistician comfortably work, without re-structuring the data delivered?

Well, first it is handy to know a bit more about the products statisticians deliver.

1. Tables with descriptive statistics describing the subject group under study. Overall descriptive stats for all subjects together or descriptive stats per treatment group or per gender.

Gender        Male 84 (40.4%)     Female 124 (59.6%)
Age (yrs)     Mean = 56.7         Min = 34  -  Max = 82
Weight (kg)   Mean = 78.8         Min = 51  -  Max = 111

Tables for safety outcomes. Numbers and percentages of adverse events that occurred. Overall and per treatment group.

Adverse events                    Medication A          Medication B
                                  (n=1205)              (n=1200)
                                  No. (%) of patients   No. (%) of patients
Gastrointestinal disorders        101 (8.4%)            113 (9.4%)
  Diarrhea                        67 (5.6%)             66 (5.5%)
  Nausea                          61 (5.1%)             57 (4.8%)
Musculoskeletal and connective
tissue disorders                  98 (8.1%)             89 (7.4%)
  Pain in extremity               45 (3.7%)             59 (4.9%)
  Back pain                       72 (6.0%)             62 (5.2%)

2. Graphs to visually compare the different intervention groups under study. E.g. survival rates, pharmacokinetics.

3. Statistical tests to compare the efficacy objectives between the different intervention groups. On which the conclusion of your clinical study report will be based.

“Subjects receiving the new medicine were significantly more likely to respond well on overall quality of life than were those who received the placebo (P < 0.05), whereas those walking within 24 hours after surgery, or with weight loss, were no more likely to respond well than those without these features.”

4. And last but not least, raw data listings if not already created by data management.
(The advantage of creating raw data listings for a study is that you get to know the individual study data. You are busy with all individual data records, instead of grouping them into a table, graph or analysis. It helps to get to know the individual drop-outs, the outliers and the missing measurements.)

Subject number   Visit date   Diastolic blood pressure   Heart rate
                              (mm Hg)                    (bpm)
1209             13AUG2011    127                        78
1210             15AUG2011    116                        89
1301             16JUN2011    104                        91

So much for the products statisticians deliver for a clinical study report. Next, some examples of datasets and why they are chosen as such:

1. A demography dataset, DEMO, is delivered with all demography data for all subjects, like gender, date of birth, but also subject number and date of screening. Only this demography dataset is needed to program a descriptive stats table for all subjects.

Subject number   Date of screening   Gender   Date of birth
1209             12JUN2011           1        17OCT1945
1210             13JUN2011           2        10FEB1961
1301             07JUL2011           1        04DEC1954

In- and exclusion criteria can be a separate dataset. Because these are only listed and checked for deviations.

2. Datasets contain subject numbers, and most of them also have visit dates. These so-called key data fields are used to combine data from different datasets. E.g. a dataset revealing the actual treatments can be merged with the demography dataset: using the subject number, both datasets can be combined, and a descriptive stats table of the subjects per treatment group can be programmed.

1209               A                     New medicine
1210               A                     New medicine
1301               B                     Placebo

With the exception of the key data (subject number, visit number), CRF data should exist in one dataset only: either in this one or in that one, but not in two or more datasets.
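The merge described in this example can be sketched in a few lines of Python. This is only an illustration, not the way any particular statistical package does it: the variable names DSCREEN, GENDER, DBIRTH, TRTGRP and TRT are made up for the sketch, and the gender coding (1 = male, 2 = female) is an assumption, since the coding is not stated here.

```python
from collections import Counter

# Demography rows: subject number, screening date, gender code, date of birth.
# Assumption: gender code 1 = male, 2 = female (hypothetical coding).
demo = [
    {"SUBJID": 1209, "DSCREEN": "12JUN2011", "GENDER": 1, "DBIRTH": "17OCT1945"},
    {"SUBJID": 1210, "DSCREEN": "13JUN2011", "GENDER": 2, "DBIRTH": "10FEB1961"},
    {"SUBJID": 1301, "DSCREEN": "07JUL2011", "GENDER": 1, "DBIRTH": "04DEC1954"},
]

# Treatment rows: subject number, treatment group, treatment name.
treatment = [
    {"SUBJID": 1209, "TRTGRP": "A", "TRT": "New medicine"},
    {"SUBJID": 1210, "TRTGRP": "A", "TRT": "New medicine"},
    {"SUBJID": 1301, "TRTGRP": "B", "TRT": "Placebo"},
]


def merge_on_subject(left, right):
    """Combine two datasets on the key field SUBJID (one row per subject in each)."""
    right_by_id = {row["SUBJID"]: row for row in right}
    merged = []
    for row in left:
        # Subjects without a matching row keep their own fields only.
        match = right_by_id.get(row["SUBJID"], {})
        merged.append({**row, **match})
    return merged


merged = merge_on_subject(demo, treatment)

# Count subjects per treatment and gender: the raw material for a
# descriptive stats table per treatment group.
gender_per_group = Counter((row.get("TRT"), row["GENDER"]) for row in merged)
for (trt, gender), n in sorted(gender_per_group.items()):
    print(trt, gender, n)
```

Because the subject number is the only field the two datasets share, nothing else is duplicated in the merged result, which is the point of keeping CRF data in one dataset only.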

3. Another example: blood and urine laboratory assessments for all visits combined in one dataset, to check for laboratory result shifts across visits.

SUBJID   DVISIT      LABP         LABR   UNIT     OUT   CS
1210     13JUN2011   ASAT         68     U/L      2     2
1210     20JUN2011   ASAT         123    U/L      1     1
1210     08AUG2011   ASAT         72     U/L      2     2
1210     15AUG2011   ASAT         52     U/L      2     2
1210     13JUN2011   Creatinine   69     umol/L   2     2

All measurements collected in one visit are not necessarily present in one dataset. On the contrary, it is more logical to have different measurements in separate datasets. Maybe a measurements dataset for small repeating measurements.

4. Datasets that needed normalization, which is often more convenient for medical history, in- and exclusion criteria and laboratory datasets, cannot be combined with non-normalized data in one dataset. Normalized datasets have additional key fields next to subject number and visit number. E.g. a criteria number for an in- and exclusion criteria dataset, or a specimen (blood/urine) field and a laboratory test field for a laboratory dataset.

SUBJID   DVISIT      CRITNO   CRIT                       INEX
1210     13JUN2011   4        BMI < 25                   Yes
1210     13JUN2011   4        BMI < 25                   Yes
1210     13JUN2011   5        Is the subject pregnant?   No

Thus the single outcome of the one-time measured pregnancy test at screening is often added to the demography dataset instead of added to the in- and exclusion criteria dataset.
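To make the idea of normalization a bit more tangible, here is a minimal Python sketch that turns one 'wide' laboratory record (one column per test) into normalized rows with the laboratory test as an additional key field. The wide layout and the helper name normalize_lab are assumptions made for this illustration; the field names SUBJID, DVISIT, LABP, LABR and UNIT follow the laboratory example above.

```python
# A wide (non-normalized) record: one column per laboratory test.
# This layout is hypothetical; the values are taken from the laboratory example.
wide_row = {"SUBJID": 1210, "DVISIT": "13JUN2011", "ASAT": 68, "Creatinine": 69}
units = {"ASAT": "U/L", "Creatinine": "umol/L"}

KEY_FIELDS = ("SUBJID", "DVISIT")


def normalize_lab(row, units):
    """Return one row per laboratory test, keyed by subject, visit and test."""
    long_rows = []
    for test, result in row.items():
        if test in KEY_FIELDS:
            continue  # key fields are repeated on every normalized row
        long_rows.append({
            "SUBJID": row["SUBJID"],
            "DVISIT": row["DVISIT"],
            "LABP": test,      # laboratory parameter: the additional key field
            "LABR": result,    # laboratory result
            "UNIT": units[test],
        })
    return long_rows


lab_rows = normalize_lab(wide_row, units)
for lab_row in lab_rows:
    print(lab_row)
```

A one-time, single-outcome measurement has no need for such an extra key field, which is why it fits better in a non-normalized dataset.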

5. For identification and search reasons, adverse event and concomitant medication datasets contain adverse event numbers and concomitant medication numbers, respectively.

SUBJID   CONM No.   Medication             Reason given    AE No.
1301     23         Atenolol               Prophylaxis
1301     24         Prednisone             Adverse event   3
1301     25         Acetylsalicylic acid   Adverse event   12

Do you get an idea of how to structure your CRF data in logically grouped datasets?
In practice, get the blank CRF and sit down with the statistician or statistical programmer to logically group all CRF data in datasets. The total number of datasets for a regular clinical study is around 20 to 30. Estimated time to map CRF data to grouped datasets: 30 minutes. And you will discover which structured format the statistician comfortably works with.

Kind regards, Maritza

© 2011, Maritza Witteveen, ProCDM

You’re welcome to re-publish this newsletter if you add the following text to it. This is an article from Maritza Witteveen of ProCDM. Data management for clinical research. Receive tips and the free e-book ‘Five strategies to get reliable, quality clinical data’ by subscribing via


How to write query texts – 6 template sentences

Thursday, October 6th, 2011

How to write queries unambiguously expressing what is asked for?
Using short, polite sentences?
Objectively explaining the underlying inconsistency?

First of all my general guidelines.

  1. My preference is to use no more capitals than needed. Capitals in the middle of a query text, e.g. for CRF fields or for tick box options, could distract from the actual question asked. E.g. compare the same query texts, with and without extra capitals.
    Please verify stop date. (Ensure that stop date is after or at start date and that stop date is not a future date.)
    Please verify Stop date. (Ensure that Stop date is after or at Start date AND that Stop date is not a future date.)
  2. Referring to CRF fields as they are shown on the CRF. To easily find the involved field(s).
  3. I prefer to leave any ‘the’ before a CRF field referral out of the query text. For more to-the-point query texts. E.g. compare the same query texts, with and without ‘the’ before data fields.
    Please verify stop date. (Ensure that stop date is after or at start date and that stop date is not a future date.)
    Please verify the stop date. (Ensure that the stop date is after or at the start date and that the stop date is not a future date.)
  4. Consistency in phrasing a query text can help to quickly write query texts or pre-program query texts in a structured, familiar way. That’s the thought behind the following 6 template sentences for query texts. Which you can use to help you write or program your queries.


The six ‘template’ sentences for query texts:

  1. Please provide…
    For asking the study site people to provide required data from patient care recordings. Examples:
    Please provide date of visit.
    Please provide date of blood specimen collection.
    Please provide platelet count.
    Please provide % plasma cells bone marrow aspirate.
    Please provide calcium result.

  2. Please complete…
    For asking the study site people to complete required data as required by the study CRF design. (Not necessarily required for patient care.) Examples:
    Please complete centre number.
    Please complete subject number.
    Other frequency is specified, please complete frequency drop-down list accordingly.

  3. Please verify…
    For asking the study site people to check date and time fields fulfilling expected timelines. Or for asking the study site people to check field formats. Examples:
    Please verify start date. (Ensure that start date is before date of visit.)
    Please verify stop date. (Ensure that stop date is after or at start date and that stop date is not a future date.)
    Please verify date of blood specimen collection. (Ensure that date of blood specimen collection is before or equal to date of visit and after date of previous visit.)
    Please verify date last pregnancy test performed.
    Please verify date of informed consent. (Ensure date of informed consent is equal to date of screening or prior to date of screening.)
    Please verify date as DDMMYYYY.

  4. …, please correct.
    For asking the study site people to correct a data recording inconsistent with another data recording. Example:
    Visit number should be greater than 2, please correct.

  5. …, please tick…
    For asking the study site people to complete required tick boxes. Examples:
    Gender, please tick male or female.
    Pregnancy test result, please tick negative or positive.
    Any new adverse events or changes in adverse events since the previous visit, please tick yes or no.
    Laboratory assessment performed since the previous visit, please tick yes or no.
    LDH, please tick normal, abnormal or not done.

  6. Please specify…
    For asking the study site people to specify the previous data recording. Examples:
    Please specify other dose.
    Please specify other frequency.
    Please specify other method used.
    Please specify other indication for treatment.

Finally, for query texts popping up during CRF data recording, it could be helpful to include location information. Like:
Page 12: Please verify start date. (Ensure that start date is after or at start date on page 11.)
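
If you pre-program your query texts, the template sentences can be turned into small helper functions. The following Python sketch is one possible way to do that; the function names and parameters are invented for this illustration and do not come from any particular clinical data management system.

```python
def provide_query(field):
    """Template 'Please provide…' for required data that is missing."""
    return f"Please provide {field}."


def verify_query(field, condition=None, page=None):
    """Template 'Please verify…', with an optional explanation and page location."""
    text = f"Please verify {field}."
    if condition:
        text += f" (Ensure that {condition}.)"
    if page is not None:
        text = f"Page {page}: {text}"
    return text


def tick_query(field, options):
    """Template '…, please tick…' for tick boxes with two or more options."""
    listed = ", ".join(options[:-1]) + " or " + options[-1]
    # Only the first letter of the sentence is capitalized, no extra capitals.
    return f"{field[0].upper()}{field[1:]}, please tick {listed}."


print(provide_query("date of visit"))
print(verify_query("start date",
                   "start date is after or at start date on page 11",
                   page=12))
print(tick_query("LDH", ["normal", "abnormal", "not done"]))
```

Keeping the field name as a parameter makes it easy to refer to CRF fields exactly as they are shown on the CRF.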

Good luck finding your way to structure query texts,
kind regards, Maritza

© 2011, Maritza Witteveen, ProCDM

You’re welcome to re-publish this newsletter if you add the following text to it. This is an article from Maritza Witteveen of ProCDM. Data management for clinical research. Receive tips and the free e-book ‘Five strategies to get reliable, quality clinical data’ by subscribing via

Smoothly running clinical study data collection – Five signs

Tuesday, May 31st, 2011

Which signs give me confidence that study data collection, data verification and data cleaning are running as they should? These are simple signs: signs I get from the data itself, signs from the people recording and delivering the data, and metrics (signs) from the clinical data management system. Most information is already measured by the clinical data management system and can be viewed through system (status) reports.
One sign, though, I experience in study meetings. However, this ‘sign’ is also documented: written down in meeting notes, e-mails and telephone reports.

What are these numbers, information, calculations that indicate data collection is doing fine?

  1. The fact that subjects are continuously enrolled per study site. Amongst others, this is an indication that the CRF used to collect data is clear and user friendly. People are not holding back from including subjects because of difficulties they have with the CRF, logistics and/or the queries to expect. The study is progressing and people are all working towards completion.
  2. The lag times (duration) between data receipt and (query) feedback to the study sites are short. Only recent data is handled. CRF and query focus is on what’s currently happening with the subjects at the study sites. Earlier data collection is completed and new subjects and visits can be handled.
  3. The study’s raw data listing is up to date. The number of subjects and visits listed in the study’s raw data listing reflects the current number of subjects and visits conducted at the study sites.
  4. Continuously, over 90% of data is clean. Reflecting an ongoing, up to date data verification and data cleaning process. This reveals that data is reviewed immediately after receipt for inconsistencies, and proper feedback (queries) to the study sites is communicated as soon as possible.
  5. Communication about clinical data is mainly about study results; about meeting the study objectives, safety and efficacy objectives. In fact, focus shifts more and more to study content, because study conduct is under control.
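
Two of these signs, the lag time and the percentage of clean data, are easy to calculate once you have a status report export. Below is a small Python sketch; the rows and the field names received, feedback and clean are purely hypothetical and only serve to show the calculation.

```python
from datetime import date

# Hypothetical status-report rows, one per CRF page received.
pages = [
    {"received": date(2011, 5, 2), "feedback": date(2011, 5, 4), "clean": True},
    {"received": date(2011, 5, 3), "feedback": date(2011, 5, 4), "clean": True},
    {"received": date(2011, 5, 9), "feedback": date(2011, 5, 12), "clean": False},
    {"received": date(2011, 5, 10), "feedback": None, "clean": True},
]


def percent_clean(pages):
    """Percentage of received pages without open inconsistencies."""
    return 100.0 * sum(1 for p in pages if p["clean"]) / len(pages)


def mean_lag_days(pages):
    """Average number of days between data receipt and feedback to the site."""
    lags = [(p["feedback"] - p["received"]).days
            for p in pages if p["feedback"] is not None]
    return sum(lags) / len(lags)


print(f"Clean: {percent_clean(pages):.1f}%")
print(f"Mean lag time: {mean_lag_days(pages):.1f} days")
if percent_clean(pages) <= 90:
    print("Below the 'over 90% clean' sign: data cleaning is lagging behind.")
```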

The information that indicates if your clinical data collection and verification is doing fine is already available. You only need to find out where to get it for your study, and how to read and use it!

© 2011, Maritza Witteveen, ProCDM

You’re welcome to re-publish this newsletter if you add the following text to it. This is an article from Maritza Witteveen of ProCDM. Helping clinical research professionals who struggle with data handling, to get reliable trial data, so they can work on their other study goals. Receive tips and the free e-book ‘Five strategies to get reliable, quality clinical data’ by subscribing via

Stuck in CRF design? – 2 causes, 2 solutions

Friday, April 29th, 2011

A clinical study is a structured way of collecting data on product safety and efficacy. … a structured way of performing research….
CRF design should follow this structure. However, sometimes you can get stuck during CRF design, as I found myself struggling to create a CRF: 14 years of clinical data management experience, and suddenly not being able to deliver a CRF for approval…
Hours and hours of work spent on CRF design. Each page modified at least five times. Updated for the data to be collected, the sequence of collection, as well as minor textual and lay-out updates.

Why didn’t CRF design progress?

1. A major baseline measurement differed in collected CRF fields (data) as compared to the comparative measurement at other (follow-up) visits. Start and end dates collected at baseline while recording modifications only for follow-up.

2. The CRF collected incomplete information for a measurement. Data collected during visits of which, at study discontinuation, no complete picture could be made, because end dates were not asked for.

These two causes had such an impact that, although everyone involved wanted, no one was able to declare the CRF ready for approval. Until, with external support, the incomplete information was revealed.

After discovery we then re-designed CRF pages to capture complete information. While doing that, it turned out that we even became able to re-use all repetitive measurements for visits. Which comforted and strengthened the spirit that the CRF could soon be signed for approval.

100% complete and re-usable data collection. Within four days after discovery of incomplete information, the CRF was finalized for the CRF fields (data captured) and lay-out. It took another week to adjust, test and document the data checks for correct queries popping up.

Had we noticed these incorrect basic CRF design requirements at the start, we could have saved a lot of time and energy. In fact, we did notice that something wasn’t right, but we couldn’t pinpoint it.

Solutions to progress CRF design:

1. This experience reminded me again that a CRF should ALWAYS have look-alike CRF pages for the same measurements captured at different time-points, no matter how complex the study design is. A clinical study is a structured way of doing research, and the CRF should reflect this structure, amongst others through repetitive measurements.

2. Secondly, I’ve seen that if a measurement is collected, the complete measurement should be collected. Even if it seems that only part of the measurement is needed to answer the clinical study objectives. For each measurement, collect at least a clear start and end date, the result, the collection method and the clinical outcome. If any of these are missing, you’ll get stuck in providing a complete picture of what happened at the study site(s).
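
The second solution can be expressed as a simple completeness check on a CRF design. This is a minimal Python sketch; the part names and the example measurement are hypothetical, chosen only to mirror the five parts listed above.

```python
# The five parts every collected measurement should have (names are hypothetical).
REQUIRED_PARTS = ("start_date", "end_date", "result", "method", "clinical_outcome")


def missing_parts(measurement):
    """Return the required parts that a measurement on the CRF does not capture."""
    return [part for part in REQUIRED_PARTS
            if measurement.get(part) in (None, "")]


# A measurement as designed on a draft CRF page: no end date, no clinical outcome.
draft = {"start_date": "13JUN2011", "result": 68, "method": "blood sample"}
print(missing_parts(draft))
```

An empty list means the measurement is captured completely; anything else is worth discussing before the CRF goes for approval.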

However, better to be stuck in CRF design than to capture real study data with a badly designed CRF…

© 2011, Maritza Witteveen, ProCDM

You’re welcome to re-publish this newsletter if you add the following text to it. This is an article from Maritza Witteveen of ProCDM. Helping Clinical Research Directors, who struggle with clinical data management, to get reliable, quality data successfully. Receive tips and the free e-book ‘Five strategies to get reliable, quality clinical data’ by subscribing via

Computerized system validation headaches? – Use the risk triangle approach

Friday, February 11th, 2011

Are you conducting clinical data management and do you need to (re-)validate your clinical data management system? It may be an unfamiliar, huge project on top of your clinical study tasks.

Then approach this topic from another perspective.
Where could your clinical study data possibly be at risk during the process? Ask this for your clinical data management process in general, or per clinical study. The latter is recommended because data flow differs per study. E.g. remote data capture, e-CRF, paper-based process, or combinations of these.

I call this the risk triangle approach.
Because there are three main areas that could cause your clinical study data to be at risk for incorrect transcription, inconsistent transfer, data loss etc. These main areas are:
- the system (features, electronic environment behaving as expected),
- the process (quality control steps, SOPs, structure, documented evidence)
- and the accessibility (people, roles, restricted access to the clinical study data).

List the possible risks regarding system use, access to the clinical study data and processing of the clinical study data. Then properly rule these risks out.

For your clinical data flow, you could ask yourself the following questions in the light of the risk triangle approach:

- Where in the clinical study data flow could data be at risk? At risk for incorrect transfer, transcription, storage, updates, entry etc.
- Are steps that need to be conducted one after the other prevented from being conducted in another sequence?
- Are all tasks captured in comprehensive checklists? CDM tasks as well as CSV tasks?
- Are solely the necessary people given access to the system? Is their access restricted to their role within the system?
- Are these people dedicated to their role? To its underlying goal? E.g. the Investigator that administers the subject’s clinical study data together with his/her delegates? Or the Data Manager verifying the data for correctness?
- Are the people granted access conscious of the meaning of their username and password, as being regarded equal to their handwritten signature?
- Are the people conducting a role within the system, trained to perform their tasks correctly?
- Are your processes logical and clear; easy to follow and easy to perform?
- Where in the clinical study data flow are electronic records (solely) relied upon? E.g. eCRF entries, eCRF query resolutions, database export to SAS datasets.
- Which clinical study data could get lost? E.g. paper CRFs, electronic records.
- Do your processes result in documented evidence for each step performed?
- Do you have controls, steps in your process with which you check for, for example, correct data storage, correct data transfer and correct database updates?

The risk triangle approach lists any possible failures you can think of with regard to the clinical study data flow under consideration, focussing on the capabilities of the system, the use of the system and the access to the system. For any possible risk, resolutions can be sought in using the system differently and/or changing the process and/or restricting the access to the clinical study data.

Wishing you insight into the risks within your own clinical study data flow, with focus on system, process and access.

© 2011, Maritza Witteveen, ProCDM

You’re invited to re-publish this newsletter if you add the following text to it. This is an ezine of Maritza Witteveen of ProCDM. For Clinical Research Directors who struggle with clinical data management to get reliable, quality clinical study data successfully. Receive tips and the free e-book ‘Five strategies to get reliable, quality clinical data’ by subscribing via

How to quickly define appropriate data checks for your study – Agenda

Friday, January 28th, 2011

Is defining edit checks, or data checks as I prefer to call them, time consuming? Is it difficult to get consensus about the ones required for the clinical study, despite templates and standard data checks?
Then this article containing the agenda for the Data Check Meeting is for you.

For more efficiency: less time, and the correct knowledge about the product, investigation and indication under study.

The idea is to invite people to actively participate in the Data Check Meeting: people responsible for the clinical study results, often already involved in creating the clinical study protocol and reviewing the CRF, and invited for their knowledge about the product, the investigation or the indication under study. Amongst others: Clinical Study Manager, Medical Reviewer, Research Director, CRAs and Investigators.


1-2 Weeks before Meeting: Invite the people responsible for the clinical study results. Send them the agenda and timelines for the Data Check Meeting as listed below.
1 Day before Meeting:
- Have a printed blank or dummy CRF ready, with all pages spread out, in sequence, on the table.
- Have sufficient markers (yellow, green, pink) and blue pens.
- Study the timelines of the meeting once more, to be able to stick to them during the meeting.

The agenda for the Data Check Meeting:
0-5 Minutes:
- Max. 5 minutes explanation about examples of data checks and how to ‘mark’ specific data checks
- Yellow, green and pink markers, and blue pens
‘yellow’ for ‘required’ fields
‘green’ for ‘logical’ checks (if Other, then a specification is required and vice versa). One- or two way directions can be drawn with arrows (IF start line …THEN arrow).
‘pink’ for other data checks
5-13 Minutes: Mark the first page together, with the whole group
14th Minute: Ready to start?
15th Minute: Pull back the chairs and start marking data checks on the CRF pages, alone or in pairs. Each individual or pair starts with a different part of the CRF.
15-30 Minutes: 15 minutes of marking.
31st Minute: Check whether we think this is useful, and decide to stop or to finish the data check definitions.
(A maximum of one hour for the total meeting, one and a half hours in case of a virtual meeting.)

The responsible Data Manager then creates the Data Checks Specification from these ‘marked’ CRFs.
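
To give an idea of what the ‘yellow’ and ‘green’ marks can become once they are programmed, here is a small Python sketch of a required-field check and a two-way ‘Other / specification’ check. The function names and the field names frequency and frequency_other are hypothetical; a real Data Checks Specification depends on your own system.

```python
def check_required(record, field):
    """'Yellow' check: the field must be filled in."""
    value = record.get(field)
    if value is None or str(value).strip() == "":
        return f"Please provide {field}."
    return None


def check_other_specify(record, choice_field, specify_field):
    """'Green' check, both directions: if Other, then a specification
    is required, and vice versa."""
    chose_other = record.get(choice_field) == "Other"
    has_specification = bool(str(record.get(specify_field) or "").strip())
    if chose_other and not has_specification:
        return f"Please specify other {choice_field}."
    if has_specification and not chose_other:
        return (f"Other {choice_field} is specified, "
                f"please complete {choice_field} accordingly.")
    return None  # consistent, no query needed


record = {"frequency": "Other", "frequency_other": ""}
print(check_other_specify(record, "frequency", "frequency_other"))
```

Each function returns a query text when the check fails and nothing when the data is consistent, so the marked arrows on the CRF translate one-to-one into checks.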

Just make a 15-minute start, working in pairs or alone to ‘mark’ the CRF pages. Again, remember to stick to the proposed time schedule, so that you meet people’s expectations and achieve the Data Check Meeting’s results.

In case the invited people cannot join the group physically, I also provide an addition for a virtual Data Check Meeting.

Addition Virtual Data Check Meeting:
Preparation: All locations print the latest, the same, blank CRF version.

Virtual Meeting: When all pages are defined once; are ‘marked’ with data checks, all locations scan their ‘marked’ pages and e-mail these to the central location.
The central location combines these scans (e.g. with Adobe Acrobat X Pro), while the other locations grab some ‘coffee’, and sends the completely defined CRF back to all locations. Each location prints this defined CRF and the pairs or individuals review the pages they didn’t ‘mark’ themselves. Checks are added, adjusted, deleted, where deemed necessary. A short wrap up of the findings is communicated to the group by each individual or pair.
After the virtual meeting: The adjusted pages are scanned again by each location and sent to the central location.

The central location (the responsible Data Manager) then creates the Data Checks Specification from the ‘marked’ CRFs.

When you and your clinical study team feel you gain experience with Data Check Meetings, you can ‘mark’ the common data checks on the CRFs in advance. So the Data Check Meeting focuses on the really study specific data checks and reviews/adjusts common checks where necessary for the study.

The Data Check Meeting facilitates immediate discussion about suitable data checks for the clinical study at hand with medical experts, clinical study experts as well as investigational product experts.
It provides a way to quickly define the data checks for the clinical study. And because the CRF is spread out on the table, you see (in)consistencies in the data checks defined. You see the number of checks, and it is easier to define the more complex checks across visits.

In summary, a Data Check Meeting gives you immediate assurance that you define the proper data checks for your study.
I wish you pleasant, eye-opening Data Check Meetings with medical and technical interactions and a lot of work done in a short time schedule.

© 2011, Maritza Witteveen, ProCDM

You’re welcome to re-publish this newsletter if you add the following text to it. This is an ezine of ProCDM. For Clinical Research Directors who struggle with clinical data management to get reliable, quality clinical data successfully. Receive tips and the free e-book ‘Five strategies to get reliable, quality clinical data’ by subscribing via

Handy to learn SAS for clinical research?

Friday, January 14th, 2011

Is it handy to learn about SAS for clinical research? Especially when you are involved in clinical data management? That’s a question I got from one of ProCDM’s ezine Subscribers. Thank you for your question.

First of all, you don’t need to learn SAS. Instead, I recommend that you experience how the statistics department handles the datasets they receive from data management, and how they perform analysis with them. This experience will give you a good understanding of what is done with your clinical data management results. Which datasets are easy to work with? How to name variables and choose datasets?

Because I worked with SAS myself, I have this software in mind when I provide you the following small assignments for your development. That is where specific terminology, like ‘proc contents’ or ‘MERGE’, comes from. Of course the assignments can be performed with other statistical software too.

The idea is to sit together with the Statistical Programmer and build a short program yourself with his/her software and his/her help. You touch the computer, the Statistical Programmer doesn’t type (program) a thing! Because you should experience the magic of ‘run’ yourself.

1. Experience opening the software package, viewing a blank screen in which you can program anything you want.

2. Perform a ‘proc print’ and a ‘proc contents’ on a dataset. For example on the adverse event (AE) dataset. View the log and note the information on numbers given (subject numbers, number of variables).
View the output. Note the variable information given per proc contents (short name, label, type, format, length). Note the record numbers in the proc print output. Notice missing data, if any.

3. Write a very small program (‘proc means’) that performs descriptive statistics on the demography data in the demography (DM) dataset. Write the program, view the log, view the output.

4. Combine (MERGE) the demography dataset with the randomization dataset. Perform a ‘proc print’ and view the output with the additional column showing the different treatments allocated to the subjects.

5. On the merged demography and randomization dataset, perform descriptive statistics again. But now, BY treatment.
View the output with a descriptive statistics table for each treatment.

6. You will calculate the age for all subjects by creating a new AGE variable that is the result in years of Informed Consent Date minus Date of Birth.

7. The Statistical Programmer helps you to create a readable print-out of a normalized dataset, that is, a dataset with more than one level of key variables. Examples are the medical history dataset with an additional key level for the body system, or the laboratory dataset with two additional key levels (highly normalized) for (1) sample (e.g. blood, urine) and (2) laboratory parameter (e.g. ASAT, RBC, Calcium).

8. Now it is time for the Statistical Programmer to take over and ‘run’ a program that creates a graph. Ask him/her to show you where the dataset(s) go in, where the graph is made, and with which variables from the dataset(s). Have a quick glance at the log and view the graph (output).
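To give an impression of what assignments 4 to 6 involve, here is a minimal sketch in Python rather than SAS (the assignments can be performed with other software too). All dataset contents and variable names, such as subjid, birth_date, consent_date and treatment, are made up for illustration. The age is calculated before the descriptive statistics so that there is a variable to summarize per treatment.

```python
from datetime import date
from statistics import mean

# Hypothetical demography (DM) dataset: one record per subject.
demography = [
    {"subjid": "001", "birth_date": date(1960, 5, 17), "consent_date": date(2011, 1, 10)},
    {"subjid": "002", "birth_date": date(1975, 1, 11), "consent_date": date(2011, 1, 10)},
    {"subjid": "003", "birth_date": date(1982, 12, 30), "consent_date": date(2011, 1, 12)},
    {"subjid": "004", "birth_date": date(1949, 7, 4), "consent_date": date(2011, 1, 14)},
]

# Hypothetical randomization dataset: the treatment allocated to each subject.
randomization = [
    {"subjid": "001", "treatment": "A"},
    {"subjid": "002", "treatment": "B"},
    {"subjid": "003", "treatment": "A"},
    {"subjid": "004", "treatment": "B"},
]


def merge_by_subject(dm_rows, rand_rows):
    """Assignment 4: add the treatment column to each demography record.

    A subject without a randomization record gets treatment None.
    """
    treatment_by_subject = {row["subjid"]: row["treatment"] for row in rand_rows}
    return [
        {**row, "treatment": treatment_by_subject.get(row["subjid"])}
        for row in dm_rows
    ]


def age_in_years(birth_date, consent_date):
    """Assignment 6: completed years between date of birth and informed consent."""
    years = consent_date.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in the consent year.
    if (consent_date.month, consent_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def describe_by_treatment(rows):
    """Assignment 5: descriptive statistics on age, one entry per treatment."""
    ages_by_treatment = {}
    for row in rows:
        ages_by_treatment.setdefault(row["treatment"], []).append(row["age"])
    return {
        treatment: {"n": len(ages), "mean": mean(ages), "min": min(ages), "max": max(ages)}
        for treatment, ages in sorted(ages_by_treatment.items(), key=lambda item: str(item[0]))
    }


merged = merge_by_subject(demography, randomization)
for row in merged:
    row["age"] = age_in_years(row["birth_date"], row["consent_date"])

summary = describe_by_treatment(merged)
for treatment, stats in summary.items():
    print(treatment, stats)
```

The key variable (here the subject identifier) is what makes the combination possible. That is why variable naming and the choice of datasets matter so much to your statistics Client.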

These small assignments will take you 1.5 to 3 hours in total, with the Statistical Programmer at his/her computer. Coffee included.

You will look from a statistical perspective at the datasets delivered.
You will notice what your statistics Client is capable of; combining, selecting, calculating.

And this will help you to discuss annotated CRFs, study database specifications (variable naming, datasets) and edit checks for clinical studies with Statistics.

If you don’t know how to ask Statistics for their time for your development, add this newsletter to your request. It lays out the idea and the mutual benefit in it.

Good luck and I think… you will enjoy this bit of programming magic.

© 2011, Maritza Witteveen, ProCDM

This is an article of Maritza Witteveen from ProCDM. Helping Clinical Research Directors who struggle with clinical data management to get reliable, quality clinical trial data successfully. Receive tips and the free e-book ‘Five strategies to get reliable, quality clinical data’ by subscribing via

The first step towards simple clinical research

Friday, December 24th, 2010

Clinical research, and clinical data management in particular, is difficult? Yes, if you see the bigger than big investments made to assure quality clinical data from clinical trials. Yes, as we constantly hire experts to help us set and reach our clinical study goals and to comply with applicable regulatory requirements. Yes, if we work with twenty-page procedures full of prose to assure standardization and 100% regulatory adherence. Yes, if we look at the overwhelming amount of clinical research information available. Yes, if we are struggling to find our specific steps to get quality clinical study results. Yes, with regard to the reputation that counts and the deadlines we face, and the fact that time in clinical research is money, literally. Yes, if we consider the clinical study initiative to contribute to improved (health)care.

A decade ago, I too was overwhelmed by the amount of information provided through various GCP trainings and the experts hired, combined with the two clinical data management books I acquired and the Good Clinical Data Management Practice. I felt intimidated by regulatory requirements in laws and guidances, especially because the products under investigation were subject to medical device regulations, regulations for biologics as well as drug regulations. The legal language, not my native tongue, was difficult to translate practically to my direct daily tasks and responsibilities. Did I cover everything? Wasn’t I overlooking a requirement? Was all generated documentation unambiguously clear? I had to find my way: what was relevant for my work and how to implement it.

Like me, people are trained in the Good Clinical Practice guideline, receive clinical data management user training, have lots of (online) support and help manuals available, read expert magazines and undergo SOP training. A lot of clinical research information is accessible via the internet, including more specific information about quality assurance and clinical data management, for example. There are associations you can join to get knowledge, certificates and connections with other people in the field.
Staying informed and up to date is quite a task next to your operational tasks and responsibilities. And while doing our best to stay on top of all clinical study ins and outs, it is easy to get lost in the available information and requirements. The result is overwhelming, intimidating and overcomplicated clinical research planning, conduct and reporting.

The misunderstanding that clinical research is difficult bears the risk of poor-quality clinical study results, as simply shown by inconsistencies in raw data listings of final datasets. If what’s planned and going on in a clinical study isn’t transparent, you won’t have time to focus on quality clinical data. On the contrary, you will find yourself planning over and over.

What can I do then?
Start to think about clinical research yourself. Think logically about your work. Which choices do you make? And why? What is your contribution to clinical studies?
You could start by giving your team and yourself a set of questions to answer.
Take 30 minutes to an hour to think about these questions and form your own answers.
For at least nine years, my own two most important questions regarding clinical data management have been:
Why does clinical data management exist? (If it were skipped, what would happen?)
What does clinical data management contribute to clinical studies?

To help you take off, I’ve made a mindmap with words to ask yourself and to give your team. This mindmap may also help you to set your goals for the next year and to celebrate your achievements from the last year. Sit down and take a pause. As 2011 is rapidly approaching, this last week is a moment for some reflection.


In summary, you can acquire the most expensive resources and the most knowledgeable people in their profession, but if you do not know yourself what to plan and why, how to practically carry out your plan and how to control performance, you will keep being confronted with ‘unexpected’ inconsistencies.
People planning, conducting and controlling the clinical study, including the clinical data, need to do the right job from their own principles. I think the Good Clinical Practice document starts with its 13 principles for a reason…

© 2010, Maritza Witteveen, ProCDM

You may re-publish this newsletter if you add the following text to it:
“This is an article of Maritza Witteveen from ProCDM. Helping Clinical Research Directors who struggle with clinical data management to control their clinical trial data successfully. Receive tips and the free e-book ‘Five strategies to get reliable, quality clinical data’ by subscribing via”

Clinical Data Management Professionals – 7 skills

Wednesday, December 8th, 2010

What strengths do clinical data managers add to clinical trials? What skills do they need to manage clinical study data? What educational background, what qualifications, what mindset?

Where do clinical data managers come from?
From biosciences, biology, health sciences, medical laboratories etc. In general, highly educated people, familiar with medical terminology. People who have had a career related to clinical research and discovered that the more administrative side, handling the clinical data treasure, attracted them. People who are quickly at ease using software that is new to them.
Myself, I have a background in BioMedical Health Sciences, majoring in Epidemiology. I was curiously attracted to clinical data management, by the somewhat unknown tasks of the job.

Adding something extra to clinical data management?
Extra to the organization and your own development?
The initiative to join the ACDM, SCDM, DIA SIAC CDM or another group where clinical data management contacts and information are provided.
People who go on to certify themselves as a Certified Clinical Data Manager, or who acquire other job-related certificates or training of interest.
Speaking for myself, I joined the DIA (great networking for me) and the SCDM. The SCDM, because I’ve definitely benefitted from the Good Clinical Data Management Practices they created and keep up to date.

Clinical data treasure

What skills for clinical data management?

Concentration; being able to work for hours, concentrated and focussed to define all variables, code lists, ranges and data groups to build the database. Or to create, draw the CRF. Or to program the data verification checks. Focus and concentration to get the task done in one go. Finished, before starting the next task.

Analytical; able to translate the individual patient story to logically grouped datasets. Knowing the consequences of chosen database and CRF design for the final datasets to deliver to the Statistician. But also seeing the way of the least effort. How to get from patient CRF to grouped datasets with the least amount of work and time? The least creating and checking?

Detailed; being triggered by each inconsistency to do something to solve it! Wanting to deliver a clean database ready for analysis. Preventing questions from the Statistician regarding the data verification process. Wanting to take care. E.g. delivering raw data listings with a consistent layout, consistent use of capitals and abbreviations, and a consistent sequence of key parameters.

Professional; proud of their function in the clinical trial process. Being able to explain the added value of clinical data management for the clinical study process.

Supporting; doing their utmost to make the data handling process as convenient as possible for all parties involved; the research nurses of the study sites, the data entry operators (if applicable), the Statistician.

Dedication; to the clinical study. By focussing on guaranteeing clinical data quality; complete clinical data that reflects what actually happened with the subjects.
Speaking for myself, my drive for more than a decade has been to make clinical data management transparent for the people depending on it. Giving them more insight into and understanding of the profession. Having them benefit from the clinical data treasure that data management is guarding.

Communication skills. Knowing when to inform people, how to explain choices and their consequences, when to get help, when to call, and when an e-mail is good enough to get the message across.

In summary, clinical data management people are experts who are dedicated to delivering a clean clinical study database that unambiguously reveals what happened with the product under investigation.
They do so in collaboration with the data owners (the Investigators) and all other people who contribute to correct interpretations and conclusions in the clinical study report.

© 2010 ProCDM, Maritza Witteveen