Hi, everybody. Thank you, Chris, for the nice introduction, and thank you to Twist for the invitation to do this webinar. The title of the webinar is DNA Methylation Biomarker Discovery for Complex Diseases Using Twist Human Methylome. Today on the webinar, we're gonna start with a small introduction on DNA methylation, then cover what biomarker discovery is, what the Human Methylome Panel is, and how at Diagenode we perform services using the Human Methylome. I will present to you a short case study where we compare data from the Human Methylome and from the EPIC array. I will explain how we can go a little bit further, going from biomarker discovery to biomarker validation, and then draw some conclusions. First of all, what is DNA methylation?
DNA methylation is an epigenetic mechanism that occurs by the addition of a methyl group to the DNA, mainly at cytosines. It can occur at different genomic locations such as gene bodies, promoters, or non-coding regions like repetitive regions, small RNAs, and pericentromeric or centromeric regions. It is very well known that methylation can have an impact on gene regulation, mainly by modifying gene function or affecting gene expression. It is also known that DNA methylation profiles are very specific to the cell type or to the tissue. It is also well known that aberrations of these DNA methylation profiles are a hallmark of cancer. In cancer, we can find two main mechanisms. One is a global loss of methylation, which is called hypomethylation, which can result in genomic instability.
There is also an increase in DNA methylation, which is called hypermethylation, specifically at some regions, mainly at CpG-rich regions, which usually induces transcriptional repression. Knowing that, researchers now know that aberrations of DNA methylation in cancer can be used as biomarkers for early disease detection. How can we study DNA methylation? The gold standard to study DNA methylation is the use of bisulfite conversion. What is bisulfite conversion? Bisulfite conversion is a chemical reaction which transforms unmethylated cytosine to uracil, which then, after PCR or library preparation, is read as thymine. The presence of a methyl group on the cytosine, on the other hand, will basically prevent that modification from happening.
After conversion, when we sequence the library and we know that on the reference genome we have a C and in our library we still have a C, that means that there was no conversion, so the C was methylated. When on the reference genome we have a C, but we actually sequenced a T, that means that the C was not methylated and was converted to a T. Over the last two years, another option has also come on the market, a parallel option to bisulfite conversion, which is enzymatic conversion, which is shown here. The principle is exactly the same, but instead of using a chemical reaction, we are using a series of enzymatic reactions to arrive at the same output.
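The read-out rule just described (reference C and read C means methylated, reference C and read T means unmethylated) applies to both bisulfite and enzymatic conversion, and can be written down as a minimal Python sketch. The function names are hypothetical, only the plus strand is handled (on the minus strand the same event appears as G versus A), and real methylation callers also deal with alignment, base quality, and incomplete conversion.

```python
from typing import Iterable, Optional


def call_cytosine(ref_base: str, read_base: str) -> str:
    """Classify one sequenced base at a plus-strand reference position."""
    if ref_base.upper() != "C":
        return "not_cytosine"
    read = read_base.upper()
    if read == "C":
        # Still a C after conversion: the cytosine was protected, so methylated.
        return "methylated"
    if read == "T":
        # Read as T: the cytosine was converted, so it was not methylated.
        return "unmethylated"
    # Any other base (A, G, N) says nothing about methylation.
    return "uninformative"


def methylation_level(read_bases: Iterable[str]) -> Optional[float]:
    """Fraction of informative reads that support methylation at one reference C."""
    calls = [call_cytosine("C", base) for base in read_bases]
    methylated = calls.count("methylated")
    unmethylated = calls.count("unmethylated")
    informative = methylated + unmethylated
    if informative == 0:
        return None  # no usable reads at this position
    return methylated / informative


print(call_cytosine("C", "C"))    # methylated
print(call_cytosine("C", "T"))    # unmethylated
print(methylation_level("CCCT"))  # 0.75
```

With four reads covering one cytosine, three of them still showing C, the position is called 75% methylated.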
However, while the chemical reaction is very harsh and damages the DNA, the enzymatic conversion is not damaging and will lead to a much reduced loss of DNA. Now that we have introduced DNA methylation and why it is important to study it, how can we use DNA methylation for biomarker discovery? What is biomarker discovery? Biomarker discovery is a pipeline in which we take different samples and look for a pattern that can differentiate one group from the other, to find a sort of signature.
To do that, of course, we want to use the most genome-wide solution, so we screen as many CpGs as possible to find the ones that are really important and can discriminate between the different groups, treatment versus control, healthy versus patient. To do that, we can of course use different options. We have the option of whole genome bisulfite sequencing, the Human Methylome, or the EPIC array. Those are widely used for screening a high number of biomarkers. Because of that, at this stage we should keep the number of samples low. Here today in this webinar, we will focus on biomarker discovery using the Human Methylome from Twist. What is the Human Methylome?
The Human Methylome is a targeted NGS-based technology that analyzes DNA methylation at single-nucleotide resolution, achieving a very high coverage. We are talking here about more than 150x. Just to compare, whole genome bisulfite sequencing is usually sequenced at 30x, so we really have a big difference here in coverage. How was the Human Methylome Panel designed? This panel was designed by Twist, and they designed it using recent databases like UCSC and ENCODE. The panel basically covers more than 30,000 genes, which are connected to roughly 4 million CpG sites. Those 4 million CpG sites cover more than 80% of the CpG islands in the human genome.
Very interestingly, the Human Methylome Panel overlaps more than 90% of the probes of the EPIC array. What are the applications for the Human Methylome Panel? Since it's very broad, it can be used for biomarker discovery and for monitoring in, for example, cancer, neurodegenerative diseases, cardiovascular diseases, metabolic diseases, fertility, or immunology. What is really nice about the Human Methylome Panel is that it's very flexible in terms of the sample types and the input that we can use with that technology. We can go down to really low-input clinical samples such as liquid biopsies and plasma cell-free DNA. We can use whole blood, PBMCs, fresh-frozen tissue, but also FFPE tissue or cell cultures in general.
How can we also use the Human Methylome in cancer research? If we cross the Human Methylome Panel with a database called the Network of Cancer Genes and Healthy Drivers, we find that, for example, the Human Methylome covers 277 genes that have already been reported to be important for breast cancer. It covers 150 genes that have been shown in this database to be important for colorectal cancer, 27 genes for ovarian cancer, and 190 genes for pancreatic cancer. It seems to be a very wide panel that can be used for many different types and subtypes of cancer.
At Diagenode, we have been working together with Twist for many years, and as soon as they launched the Human Methylome, we started a collaboration to be one of the first service providers to offer Human Methylome services. How does the Human Methylome services workflow work at Diagenode? We can start from very different types of sample input, as mentioned before. It could be genomic DNA, FFPE DNA, or cell-free DNA. We attach the adapters to the DNA. We perform the enzymatic conversion, and after that, we perform the library amplification. At that step, we have basically produced a genome-wide methyl-seq library, so it covers the whole genome. After that, there is the critical part.
We perform the capture using the Human Methylome Panel probes, and we are basically just selecting for the part of the DNA which contains the important CpGs that have been selected by Twist in the Human Methylome Panel. After a small reamplification, we go for sequencing and bioinformatics. The whole workflow at Diagenode has been optimized. For example, if you submit 96 samples, we can work in plate format, and we can do that quite rapidly, with a turnaround time of about six to eight weeks. Now I will present to you a case study where we have compared the data of the Human Methylome Panel versus the EPIC array here at Diagenode.
A U.S.-based company called Inherent Biosciences came to us with the idea of a pilot study to see whether it was possible to predict COVID hospitalization based on the DNA methylation profile of the patient, to see whether we can discriminate in advance between a high and a low risk of hospitalization. For that, they collected blood from patients, isolated PBMCs, and extracted genomic DNA. We received four genomic DNA samples from PBMCs of non-hospitalized COVID patients and four samples from hospitalized COVID patients. With the same DNA, we performed a pilot study with the EPIC array and with the Human Methylome.
The first thing that we did was to compare and see whether the probes of the EPIC array which are also included in the Human Methylome were giving the same DNA methylation values. We performed a Pearson correlation between the DNA methylation values obtained with these two technologies. Here I'm showing you all eight samples. As you can see, there is a very nice diagonal. The Pearson correlation is more than 95%, which strongly suggests that the two technologies are highly consistent and interchangeable, even though the EPIC array is bisulfite-based and array-based, while the Human Methylome is enzymatic-based and NGS-based.
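The comparison described here, a Pearson correlation between the methylation values that the two platforms report for the same CpGs in the same sample, can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function name is hypothetical, and the values in `epic_values` and `panel_values` are invented for the example.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length value lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Invented methylation fractions (0 = unmethylated, 1 = fully methylated)
# for five CpGs measured on both platforms in one sample.
epic_values = [0.05, 0.20, 0.50, 0.80, 0.95]
panel_values = [0.07, 0.18, 0.52, 0.78, 0.97]

r = pearson(epic_values, panel_values)
print(f"Pearson r = {r:.3f}")
```

In practice the two lists would hold the values for every CpG shared by the array and the panel, computed once per sample.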
We took a deep look at the EPIC array results and performed a pairwise comparison between hospitalized and non-hospitalized patients. Among the more than 700,000 CpGs that are analyzed by the EPIC array after filtering, we found no significant DMCs, that is, differentially methylated cytosines, and only two significant DMRs, which stands for differentially methylated regions. On the other hand, looking at the Human Methylome results, what we observed is that with the Human Methylome we were able to detect more than 18 million CpGs, and out of those, roughly 9 million CpGs were covered at high coverage.
Of course, we have to mention that when we say that the Human Methylome targets 4 million CpGs, the Twist technology captures both strands, and NGS will sequence both strands, so the plus and the minus. So when we say that the Human Methylome targets 4 million, we are actually doubling that number, because every CpG has a plus and a minus strand. We can measure the DNA methylation on both, and that's why we have here 9 million CpGs covered at high coverage.
When we performed the comparison between hospitalized and non-hospitalized on the Human Methylome results, among the 7.7 million CpGs that were common to all the samples, we found more than 12,000 differentially methylated Cs and more than 800 differentially methylated regions when we set the threshold at a 25% methylation difference with an adjusted p-value of 0.01. You can see this here in the volcano plots. Very interestingly, when we used those more than 800 DMRs to produce a heat map with hierarchical clustering, we nicely observed that, based on those 800 DMRs, we're able to discriminate and separate very well between the four samples that were not hospitalized and the four samples that were hospitalized.
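The two thresholds mentioned here, a 25% methylation difference and an adjusted p-value of 0.01, amount to a simple filter over per-CpG test results. The sketch below is an assumption-laden illustration: the record layout, the names, and the example values are hypothetical, the comparisons are taken as inclusive, and the adjusted p-values are assumed to come from an upstream statistical test that the talk does not name.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CpGResult:
    site: str                # hypothetical identifier, e.g. "chr1:12345"
    mean_meth_non_hosp: float  # mean methylation fraction, non-hospitalized group
    mean_meth_hosp: float      # mean methylation fraction, hospitalized group
    adj_p: float               # multiple-testing-adjusted p-value from an upstream test


def is_dmc(result: CpGResult, min_diff: float = 0.25, max_adj_p: float = 0.01) -> bool:
    """True if the CpG passes both the effect-size and the significance cutoff."""
    diff = abs(result.mean_meth_hosp - result.mean_meth_non_hosp)
    return diff >= min_diff and result.adj_p <= max_adj_p


def select_dmcs(results: List[CpGResult]) -> List[CpGResult]:
    """Keep only the CpGs that qualify as differentially methylated."""
    return [r for r in results if is_dmc(r)]


# Invented example records.
results = [
    CpGResult("site_a", 0.20, 0.60, 0.001),  # large difference, significant
    CpGResult("site_b", 0.20, 0.30, 0.001),  # significant but difference too small
    CpGResult("site_c", 0.20, 0.60, 0.050),  # large difference but not significant
]

print([r.site for r in select_dmcs(results)])  # ['site_a']
```

A volcano plot shows the same two quantities on its axes, with the methylation difference on x and the significance on y, so the selected CpGs are the points in its upper outer corners.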
Those data, even though they come from a pilot project on a very limited number of samples, strongly suggest that the Human Methylome Panel, thanks to its broader coverage, is able to more easily detect DNA methylation signatures that can occur in such peculiar or complex diseases. If we were to run that on a higher number of samples, to confirm that those 800 DMRs and 12,000 DMCs are actually a very strong signature to discriminate between non-hospitalized and hospitalized patients, then we would be able to move to a more custom targeted DNA methylation panel, moving from biomarker discovery to biomarker validation. Indeed, what is biomarker validation?
Biomarker validation is the second step of the biomarker discovery pipeline. Once we have established which CpGs are really important to discriminate between different groups, healthy or treatment, cancer or healthy patient, we're gonna reduce the number of biomarkers to only the ones that we are interested in, the critical ones, and we can dramatically increase the number of samples to validate the performance of the DNA methylation signature in a larger cohort, as shown here. How is this done in practice?
Once we have performed the screening with the Human Methylome, we will work together with Twist on a redesign of the probe panel, reducing from the 4 million CpGs to the specific number of CpGs that have been shown to be very important. In the case, for example, of the small pilot study shown before, we were talking about only 12,000 CpGs. Together with Twist, we will redesign the panel, and then, using a higher number of samples, we can go through the exact same workflow as for the Human Methylome, just changing the capture part.
Instead of using the Human Methylome Panel, we can use the custom panel, which will capture only the regions which contain the CpGs that have been shown to be important to discriminate for your specific biological question. Thanks to that, we can of course reduce the cost of the sequencing, because we're gonna reduce the number of regions that are captured and need to be sequenced, while increasing the coverage of those regions to be even more sensitive on them. Coming to the conclusion of the webinar. What I showed you today is that the Human Methylome is a targeted NGS-based DNA methylation assay, which is cost-effective to reach a high CpG coverage. In reality, we have more than 30x coverage of real targeted CpGs, and a broader coverage than the EPIC array.
The Human Methylome is, of course, ideal for biomarker discovery in cancer, but also in other complex diseases. We can apply the Human Methylome to very difficult clinical samples such as plasma cell-free DNA or FFPE. After the first biomarker discovery, of course, we can work with you, together with Twist, to reduce the panel from the Human Methylome to a more custom panel, to perform the subsequent biomarker validation on a high sample volume, a high number of patients, to validate and confirm that the signature is the right one to go on and screen many, many patients. In conclusion, why would you want to use Diagenode as your targeted DNA methylation service provider?
We are happy to be accredited by Twist as their NGS lab service provider. Of course, we have a huge experience in targeted DNA methylation projects from difficult samples using many different panels, the Human Methylome, but also many other custom panels. We offer end-to-end services from sample to analysis. We always design the project in a very customized and collaborative way with our clients. You will have someone from our services team who will discuss the project design with you to find the best option. You will have a dedicated in-house expert for your project who will follow up on all the steps of the project. We produce reliable and consistent bioinformatic read data with high quality ready for publication.
With that, I'm concluding today's webinar. I thank all of you for your attention, and I'm of course happy to take questions.
Matteo, thank you so much. That was a fascinating presentation. We're gonna go to questions now. Just so people know, if you do have questions for Matteo, you can ask those in the ask question box, and we'll try to get through as many as we can. I can already see a few coming in. While I let people ask, I wanted to ask a question of my own about your data set. It's quite striking data, is it not? You've got, with the EPIC array, no DMCs and only two DMRs, and then you move on to the Human Methylome data, and you've got over 12,000 DMCs or 879 DMRs. Why? Is it just that the Human Methylome gives you a higher sensitivity?
Is that something about the EPIC arrays? It just seems like there's such a massive gap there. What is your theory for that big difference?
Of course, the fact that when we have compared and find a very nice correlation between the data of the EPIC array and Human Methylome Panel
Is that why?
That has said to us that.
There's such a big difference.
Probably. The CpGs that are detected by the EPIC array are not the ones that are informative for such a complex disease like, for example, COVID.
Sorry.
So we know, for example-
Is my sound not working particularly well?
Yeah, I heard you not-
Okay.
Sometimes it.
My apologies. I'm going to ask my colleague, Larissa. Just give me a moment, and I'll try a different microphone.
Sorry for that. Chris' audio is really bad. You're gonna hear another voice from the background now. I'll just quickly ask the questions. I hope my audio is better. Matteo, please let me know.
I can hear you very well, Larissa.
Oh, perfect. Okay. Let's start from question one. Michael asked, because you're sequencing all four strands, is the coverage then 37.5x, or can you see every strand read as a new read?
When we talk about the 150x, this is the sequencing coverage, and this does not include the fact that we have this double strand, which is very specific for DNA methylation. That sequencing coverage, compared to the 30x in the whole genome, would then need to be divided basically by two, because we have a specific signal from the plus strand and a specific signal from the minus strand. However, in one of my last slides, I say that we actually reach more than 30x of actual CpG coverage, and that is really on a strand-specific CpG. The final coverage per targeted CpG, strand-specific, is more than 30x.
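The arithmetic behind this exchange can be made explicit. The sketch below assumes, as an idealisation, that sequencing coverage splits evenly across the strands being counted; the function name is hypothetical. It reproduces the 37.5x in the question (150x over four strands) and the divide-by-two case described in the answer. Real per-CpG coverage will be lower than these ideal figures once duplicate and otherwise discarded reads are removed, which is why the strand-specific figure quoted in the talk is "more than 30x" rather than a fixed number.

```python
def per_strand_coverage(sequencing_coverage: float, n_strands: int) -> float:
    """Ideal coverage per strand if reads split evenly across n_strands."""
    if n_strands <= 0:
        raise ValueError("n_strands must be positive")
    return sequencing_coverage / n_strands


print(per_strand_coverage(150, 4))  # 37.5, the figure in the question
print(per_strand_coverage(150, 2))  # 75.0, plus and minus strand only
```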
Okay. Thank you for that. We have more questions from Carla. Can you recommend an R package for analyzing data from a targeted panel designed by Twist?
Okay. In terms of bioinformatics analysis, there are many different options, of course. I'm not sure that there is a specific one. You can use the standard Bismark one, which is the most used in the scientific community. Right now, at least internally at Diagenode, we are also moving towards other software which is more powerful and less time-consuming than Bismark. There is no, let's say, specific software that you must use to analyze targeted DNA methylation data using either the Human Methylome Panel or another custom panel.
Thank you. Okay. Next question: Did you find DMCs? Is Twist good to look at methylation per CpG? Or do you tile a region with the probes, and individual CpG methylation measurements are not possible? That's complicated.
Yes. Can you maybe re-read the question?
Well, first question of this is, did you find DMCs?
Yeah, DMCs. Yeah.
And, uh-
So-
Yeah, just answer that, then we go to the next one.
I don't know if the question is about the case study. If the question is about the case study, again, we have found DMCs, not in the EPIC array, but in the Human Methylome.
The second question is: Is Twist good to look at methylation per CpG, or do you tile a region with the probes, and individual CpG methylation measurements are not possible?
Okay. I'm not sure that I completely got the question, but I think that basically what we do when we design a panel with Twist, or also what Twist has done with the Human Methylome Panel, is design probes. The probes are 120-mers, and those probes will capture the fragment of DNA which contains the important CpGs. You can have more than one CpG in each fragment, or captured by the same probe, and you can tile the probes at different levels. It can be done in a very customized way, especially for a custom panel, because the smaller the panel is, the more you want to capture as much as possible of the part that you're interested in.
Okay. Thank you so much. Next we have: Can you speed up turnaround from six to eight weeks to six to eight hours for clinical use?
Very good question. Yeah, no, the protocol requires quite some time because you have, of course, to extract DNA, you have to do the QC of the DNA, and after the QC, depending on the kind of sample that we are working with, we might or might not need some DNA shearing. That's again a new QC, and only then can you start with the DNA library preparation, which takes at least two days in the lab. After that, you have the capture, so you can add one more day minimum. Then there is the sequencing. The sequencing itself takes at least two days, 24, 40, 48 hours.
Within the hours of a working day, it is not possible. More in general, I guess, it is not possible to have NGS data in just one day.
Okay. Next question is: What is the minimum DNA amount for a cfDNA?
Okay. Here at Diagenode, we have pushed the methylation system, more in general using Twist probes for some customized panels, down to 10-20 ng of cell-free DNA. With that, we are very comfortable. We know that the library preparation can work, and then afterwards, the capture will work. However, of course, compared to the maximum input, which is around 200 ng, you would expect that you're gonna have some PCR duplicates or some reads that will be lost and discarded at the analysis.
We just saw we got a lot of questions, which is really, really nice, how engaged you are. The next question is: First, a big thank you to you. Very nice talk. Then, is the Human Methylome applicable to samples taken from saliva, mouth swab, or lung immune cells?
It is a great question. The straight answer is yes. Of course, when we are working with, for example, saliva or a mouth swab, we always need to remember that those samples are quite contaminated, so you're gonna have human cells, but also contamination from whatever the person had, for example, just eaten before. We can have some chicken or some meat, of course. You're gonna have contamination from that, and some bacteria. Of course, we can still prepare the libraries.
Thanks to the fact that we are doing the capture using probes that are specific only for human, at the sequencing level we're gonna sequence only what is coming from the human, which is a great advantage compared to, for example, doing whole genome bisulfite sequencing, for which, if you have 25% contamination, you're gonna lose 20 or 25% of reads to those non-human species.
We have a question, I guess, kind of related to that. How many milliliters of blood are required for this test?
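The point about contamination can be put in numbers. The sketch below assumes a simplification that is not stated in the talk, namely that in an untargeted library reads are spent in proportion to the fraction of non-human DNA in the sample. The function name and the read count are hypothetical.

```python
def usable_human_reads(total_reads: int, contamination_fraction: float) -> int:
    """Reads left for human DNA when contaminant DNA takes its proportional share."""
    if not 0.0 <= contamination_fraction <= 1.0:
        raise ValueError("contamination_fraction must be between 0 and 1")
    return round(total_reads * (1.0 - contamination_fraction))


# Untargeted sequencing of a sample with 25% non-human DNA:
# a quarter of a hypothetical 100 million reads goes to the contaminants.
print(usable_human_reads(100_000_000, 0.25))  # 75000000
```

With a human-specific capture step the contaminating fragments are largely not pulled down in the first place, so this loss is mostly avoided.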
Usually, if you're talking about cell-free DNA coming from blood, basically, if you do a blood draw of 10 ml, you perform the centrifugation to take out the plasma. The plasma is roughly half of that, so 4 ml-5 ml. From 4 ml-5 ml of plasma, we can reach something between 10 ng and 20 ng of cell-free DNA. That's why we have validated the whole pipeline here at Diagenode using the Twist probe panels in the range between 10 ng and 20 ng. If you're talking about whole blood, you do expect to have a lot of DNA from, for example, a PAXgene tube, which, if I remember right, is about 10 ml.
For that, you're probably reaching micrograms of DNA. The maximum input of the library preparation method is 200 ng, so we will definitely be fine with that.
We have a question regarding the EPIC array and not specifically the Human Methylome. Can we do the enzymatic conversion instead of the bisulfite conversion?
Okay. The Illumina array has been validated by Illumina using bisulfite conversion. It would be very interesting to check whether it is also possible to use the enzymatic conversion. It's something that could be tested. We haven't tested it yet, but it's something that could be very nice. We don't know whether there is something that can basically interfere with the Illumina steps after the conversion. In theory, it is feasible. If you are interested, please contact us and we can discuss that.
Thanks for that. More questions coming up. Have you looked at methylation profiles for long COVID since this affects tens of millions? Also for ME. ME stands for, oh my God, myalgic encephalomyelitis. Sorry for that. CFS. That can be quite similar.
Can you repeat the question, Larissa? I'm not sure if I got it right, so I want to answer it correctly.
Have you looked at methylation profiles for long COVID since this affects tens of millions?
Oh, okay. Now I understand. Diagenode being a services provider, we don't do R&D ourselves. We are always working together with biotech companies which come to us with a specific biological question. For the specific case study I'm showing here, the company was Inherent Biosciences, and they wanted to see whether there is a signature, whether there was some possibility of using that signature to predict something. After the first pilot, even though it was very interesting to use the Human Methylome for biomarker discovery, the project was stopped by them, and we haven't continued those analyses in more depth. No long COVID work was done.
This is something that might be very interesting, since with this pilot project that we performed, we have a strong suggestion that there was a clear DNA methylation signature that could discriminate between high and low risk. We can think that there might also be a DNA methylation signature that can discriminate the presence or absence of long COVID in people.
Okay. Thank you very much for everything. For every question we may not have answered now, if you put in your contact details, we're gonna reach out to you on them. We have your contacts, and Matteo is gonna answer them directly. Also, on the right side, you see Matteo's email address and LinkedIn, so please feel free to contact him directly if a question comes up later. A big thank you to Matteo for this great presentation. Thanks for doing it. It's lovely to work with you. For everyone else.
Thank you very much.
I wish you a lovely day. There's a survey that pops up when you close the window. It would be really great if you could take part so we get feedback on how the webinar went for you. I wish everyone a lovely day. Thanks for joining, and see you soon, hopefully.