Summary
Pancreatic cancer is a leading cause of cancer-related mortality worldwide. Because patients with early-stage pancreatic cancer typically have no symptoms, the disease is often diagnosed at an advanced stage and carries a poor prognosis. Deep neural network machine learning models, built from disease trajectories and the time sequence of clinical events, offer a path toward earlier and more accurate detection of pancreatic cancer.
What is Pancreatic Cancer?
Pancreatic cancer is the 11th most prevalent cancer and the seventh leading cause of cancer-related death worldwide. It is often diagnosed at an advanced stage and has a mortality rate of over 93%. However, when diagnosed early, pancreatic cancer can be cured by a combination of surgery, chemotherapy, and radiotherapy. Risk factors for pancreatic cancer include age, obesity, family history, and genetics. Treatment regimens depend on the stage of the cancer, its location, and the patient’s overall health.
Screening routines and recognized early symptoms for diagnosing pancreatic cancer are lacking. Recent studies have focused on machine learning (ML) approaches that use patient records, such as disease codes and trajectories, to predict the onset of pancreatic cancer. In healthcare, disease codes represent specific diseases, while disease trajectories describe the pattern and progression of disease over time. A major limitation of existing predictive ML studies is that they used only disease codes (that is, which diseases occurred) without the time sequence in which those disease states progressed.
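The difference between a codes-only view and a trajectory view of the same patient record can be sketched as follows. This is a minimal illustration, not the format used in any particular study: the `Diagnosis` class, the sample record, and the ICD-10-style codes are all hypothetical and chosen only to show the idea.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Diagnosis:
    code: str       # disease code (ICD-10 style, illustrative only)
    recorded: date  # date the code was entered in the record


# Hypothetical patient record, unordered as it might arrive from a registry.
record = [
    Diagnosis("K86.1", date(2018, 3, 2)),
    Diagnosis("E11", date(2015, 6, 10)),
    Diagnosis("R63.4", date(2019, 1, 20)),
]


def code_set(diagnoses):
    """Codes-only view: which diseases occurred, with no order or timing."""
    return {d.code for d in diagnoses}


def trajectory(diagnoses):
    """Trajectory view: codes in time order, with days since the first event."""
    ordered = sorted(diagnoses, key=lambda d: d.recorded)
    start = ordered[0].recorded
    return [(d.code, (d.recorded - start).days) for d in ordered]


codes_only = code_set(record)
timeline = trajectory(record)
```

Two patients with the same `code_set` can have very different `trajectory` outputs, which is the information a codes-only model cannot see.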
In a recent study published in Nature Medicine, Placido et al. developed a deep neural network (DNN) ML model that combines disease trajectories with the time sequence of clinical events to predict the occurrence of pancreatic cancer and to assess risk over incremental time intervals following the assessment.
What is a Deep Neural Network (DNN)?
Two types of neural networks are commonly used with clinical data: convolutional neural networks (CNNs) and artificial neural networks (ANNs). Strictly speaking, a CNN is itself a specialized kind of ANN; the distinction drawn here is by typical use. CNNs are used primarily for image-based AI, such as the analysis of histological data. ANNs, by contrast, are used for clinical data with categorical values. ANNs emulate the way neurons process information and can be trained to make accurate predictions for many applications, such as cancer diagnosis.
A key aspect of ANNs is the non-linear transformation of input data in the hidden layers to produce prediction scores. ANNs consist of three types of layers: an input layer, one or more hidden layers, and an output layer. The input layer receives the variables used to train the ANN or evaluate its performance. In the hidden layers, the input data is transformed into abstract representations, and the combined result of these transformations is used to produce the predictions, or output values. ANNs are classified as shallow or deep depending on the number of hidden layers. A deep neural network (DNN) is a type of ANN with more than one hidden layer.
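The layer structure described above can be illustrated with a small forward pass written in plain Python. This is a sketch under stated assumptions, not the architecture from any study: the layer sizes, the ReLU and sigmoid activations, the random untrained weights, and all function names are choices made here for illustration.

```python
import math
import random


def relu(x):
    """Non-linear activation used in the hidden layers (an assumed choice)."""
    return max(0.0, x)


def sigmoid(x):
    """Squashes the output unit to a score between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: weighted sum per unit, then the activation."""
    return [
        activation(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


def init_layer(n_in, n_out, rng):
    """Random (untrained) weights and zero biases for a layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(features, layers):
    """Pass features through the hidden layers, then a single output unit."""
    *hidden, output = layers
    activations = features
    for weights, biases in hidden:
        activations = dense(activations, weights, biases, relu)
    weights, biases = output
    return dense(activations, weights, biases, sigmoid)[0]


rng = random.Random(0)
# 4 input variables -> two hidden layers of 8 units -> 1 output score.
layers = [init_layer(4, 8, rng), init_layer(8, 8, rng), init_layer(8, 1, rng)]
score = forward([1.0, 0.0, 1.0, 0.0], layers)
```

Because this network has two hidden layers, it counts as deep under the definition above; removing one of them would make it shallow. Training, which adjusts the weights so the score becomes meaningful, is omitted here.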
Use of DNNs for Early Prediction of Pancreatic Cancer
To build their DNN model, Placido et al. used disease trajectories from the Danish National Patient Registry (DNPR) and demographic information from the Central Persons Registry (CPR). The DNPR contains disease codes, such as those for diabetes and other disease states, with explicit time stamps, which were used to train and test the DNN model for its ability to predict cancer. The model was considered in the context of a surveillance program for patients at elevated risk of pancreatic cancer, with risk assessed over intervals ranging from 3 to 60 months post-assessment. Using the DNN, the authors were able to identify high-risk patients as well as precancerous states in patients more accurately than the current standard of care.
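One way to picture risk assessment over several post-assessment intervals is to label each patient, for each interval, by whether a cancer diagnosis fell inside it. The sketch below is an assumption-laden illustration, not the study's procedure: only the 3- and 60-month endpoints come from the text above, while the intermediate interval values and the function names are hypothetical.

```python
import calendar
from datetime import date

# Endpoints (3 and 60) come from the text; intermediate values are assumed.
INTERVALS_MONTHS = (3, 6, 12, 36, 60)


def add_months(d, months):
    """Shift a date forward by whole months, clamping to the month's last day."""
    total = d.month - 1 + months
    year = d.year + total // 12
    month = total % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def interval_labels(assessed, cancer_diagnosed):
    """Map each interval to True if cancer was diagnosed within it.

    `cancer_diagnosed` is None for patients with no recorded diagnosis.
    """
    labels = {}
    for months in INTERVALS_MONTHS:
        horizon = add_months(assessed, months)
        labels[months] = (
            cancer_diagnosed is not None
            and assessed < cancer_diagnosed <= horizon
        )
    return labels


# Hypothetical patient assessed on 1 Jan 2015, diagnosed on 15 Oct 2015.
example = interval_labels(date(2015, 1, 1), date(2015, 10, 15))
```

Labels of this kind would let a model be scored separately for short and long horizons, since a diagnosis nine months out counts as a miss at 3 and 6 months but as a positive from 12 months onward.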
The development of a deep learning algorithm capable of predicting pancreatic cancer risk from disease trajectories represents a significant leap forward in cancer research. However, ongoing research is necessary to further improve the accuracy of the model and validate its performance across diverse patient populations. Leveraging the power of artificial intelligence offers hope for early detection, personalized treatment, and improved outcomes for patients with pancreatic cancer.
Outsourcing Bioinformatics Analysis: How Bridge Informatics Can Help
Groundbreaking studies like these are made possible by technological advances making biological data generation, storage, and analysis faster and more accessible than ever before. From pipeline development and software engineering to deploying existing bioinformatics tools, Bridge Informatics can help you on every step of your research journey.
As experts across data types from leading sequencing platforms, we can help you tackle the challenging computational tasks of storing, analyzing, and interpreting genomic and transcriptomic data. Bridge Informatics’ bioinformaticians are trained bench biologists, so they understand the biological questions driving your computational analysis. Schedule a free introductory call with a member of our team.
Haider M. Hassan, Data Scientist, Bridge Informatics
Haider is one of our premier data scientists. He provides bioinformatic services to clients, including high-throughput sequencing, data pre-processing, analysis, and custom pipeline development. Drawing on his rich experience with a variety of high-throughput sequencing technologies, Haider analyzes transcriptional (spatial and single-cell), epigenetic, and genetic landscapes.
Before joining Bridge Informatics, Haider was a Postdoctoral Associate at the London Regional Cancer Centre in Ontario, Canada. During his postdoc, he investigated the epigenetics of late-onset liver cancer using murine and human models. Haider holds a Ph.D. in biochemistry from Western University, where he studied the molecular mechanisms behind oncogenesis. Haider still lives in Ontario and enjoys spending his spare time visiting local parks.