Publications

Improving Predictive Process Monitoring Through Reachability Graph-Based Masking of Neural Networks

Published in IEEE Transactions on Computational Social Systems, 2022

Predicting the next event during process runtime is an objective of interest in predictive process monitoring (PPM). Decay replay mining is one of the few deep learning-based next event prediction approaches that are built upon process model notations. However, this algorithm does not fully intertwine its neural network with the available process knowledge contained in the process model. This work, which is an extended version of an earlier conference publication, investigates the reachability graphs of underlying Petri net process models for masking the neural network of decay replay mining to ultimately increase the quality of next event predictions. A more comprehensive set of experiments is performed to provide robust statistical evidence of the usefulness of the approach and to put earlier claims and hypotheses into perspective. In addition, the decay replay mining approach is applied with the suggested reachability graph-based masking extension to a healthcare use case of sepsis patients, facilitating decision-making for healthcare practitioners. The obtained results further underscore the validity of masking neural networks using knowledge contained in the reachability graph of a Petri net process model.

Recommended citation: J. Theis and H. Darabi, "Improving Predictive Process Monitoring Through Reachability Graph-Based Masking of Neural Networks," in IEEE Transactions on Computational Social Systems, 2022, doi: 10.1109/TCSS.2022.3220262. https://ieeexplore.ieee.org/abstract/document/9955412

A Process Mining- Deep Learning Approach to Predict Survival in a Cohort of Hospitalized COVID‐19 Patients

Published in BMC Medical Informatics and Decision Making, 2022

Various machine learning and artificial intelligence methods have been used to predict outcomes of hospitalized COVID-19 patients. However, process mining has not yet been used for COVID-19 prediction. We developed a process mining/deep learning approach to predict mortality among COVID-19 patients and updated the prediction in 6 h intervals during the first 72 h after hospital admission. Our proposed process mining/deep learning model performed significantly better than commonly used machine learning approaches that ignore time information. Thus, time information should be incorporated in models to predict outcomes more accurately.

Recommended citation: M. Pishgar, S. Harford, J. Theis, W. Galanter, J. M. Rodríguez-Fernández, L. H. Chaisson, Y. Zhang, A. Trotter, K. M. Kochendorfer, A. Boppana, and H. Darabi, "A process mining- deep learning approach to predict survival in a cohort of hospitalized COVID‐19 patients," BMC Medical Informatics and Decision Making 22, Article number: 194 (2022). doi: https://doi.org/10.1186/s12911-022-01934-2 https://bmcmedinformdecismak.biomedcentral.com/track/pdf/10.1186/s12911-022-01934-2.pdf

Prediction of Unplanned 30-Day Readmission for ICU Patients with Heart Failure

Published in BMC Medical Informatics and Decision Making, 2022

We presented a process mining/deep learning approach for the prediction of unplanned 30-day readmission of ICU patients with heart failure (HF). A patient’s health records can be understood as a sequence of observations called an event log, which is used to discover a process model. Time information was extracted using the DREAM (Decay Replay Mining) algorithm. Demographic information and severity scores upon admission were then combined with the time information and fed to a neural network (NN) model to further enhance the prediction efficiency. Additionally, several machine learning (ML) algorithms were developed as baseline models for comparison. The proposed approach was capable of modeling the time-related variables and incorporating the medical history of patients from prior hospital visits for prediction. Thus, our approach significantly improved the outcome prediction compared to that of other ML-based models and health calculators.

Recommended citation: M. Pishgar, J. Theis, M. Del Rios, A. Ardati, H. Anahideh, and H. Darabi, "Prediction of unplanned 30-day readmission for ICU patients with heart failure," BMC Medical Informatics and Decision Making 22, Article number: 117 (2022). doi: https://doi.org/10.1186/s12911-022-01857-y https://bmcmedinformdecismak.biomedcentral.com/track/pdf/10.1186/s12911-022-01857-y.pdf

Masking Neural Networks Using Reachability Graphs to Predict Process Events

Published in 2021 International Conference on Cyber-Physical Social Intelligence (ICCSI), 2021

Decay Replay Mining is a deep learning method that utilizes process model notations to predict the next event. However, this method does not intertwine the neural network with the structure of the process model to its full extent. This paper proposes an approach to further interlock the process model of Decay Replay Mining with its neural network for next event prediction. The approach uses a masking layer which is initialized based on the reachability graph of the process model. Additionally, modifications to the neural network architecture are proposed to increase the predictive performance. Experimental results demonstrate the value of the approach and underscore the importance of discovering precise and generalized process models.

Recommended citation: J. Theis and H. Darabi, "Masking Neural Networks Using Reachability Graphs to Predict Process Events," 2021 International Conference on Cyber-Physical Social Intelligence (ICCSI), 2021, pp. 1-6, doi: 10.1109/ICCSI53130.2021.9736237. https://ieeexplore.ieee.org/document/9736237
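
As a rough illustration of the masking idea described in the abstract above, the following sketch restricts a next event prediction to the events that are enabled in the current marking. It is not the authors' implementation: the event names, the tiny reachability graph (a dictionary from markings to enabled events), and the helper functions build_mask and masked_softmax are all hypothetical, and the mask is applied directly to output logits rather than as a trainable layer.

```python
import numpy as np

# Hypothetical event alphabet and reachability graph (illustrative assumptions).
EVENTS = ["register", "triage", "treat", "discharge"]
REACHABILITY = {
    "m0": {"register"},
    "m1": {"triage", "treat"},
    "m2": {"discharge"},
}


def build_mask(marking):
    """Return a 0/1 vector with 1 for every event enabled in the marking."""
    enabled = REACHABILITY[marking]
    return np.array([1.0 if event in enabled else 0.0 for event in EVENTS])


def masked_softmax(logits, mask):
    """Softmax over the events allowed by the mask; others get probability 0."""
    # Disallowed events are set to -inf so that exp() maps them to exactly 0.
    masked = np.where(mask > 0, logits, -np.inf)
    shifted = masked - masked.max()  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


# Raw network scores (made up) for the four events, evaluated in marking "m1".
logits = np.array([2.0, 0.5, 1.0, 3.0])
probs = masked_softmax(logits, build_mask("m1"))
```

In this example, "discharge" has the highest raw score but is not enabled in marking "m1", so its probability becomes zero and the remaining mass is shared between "triage" and "treat". The sketch assumes that at least one event is enabled in every marking.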

Predicting clinical outcomes among hospitalized COVID-19 patients using both local and published models

Published in BMC Medical Informatics and Decision Making, 2021

Background: Many published models predict outcomes in hospitalized COVID-19 patients. The generalizability of many is unknown. We evaluated the performance of selected models from the literature and our own models to predict outcomes in patients at our institution. Methods: We searched the literature for models predicting outcomes in inpatients with COVID-19. We produced models of mortality or criticality (mortality or ICU admission) in a development cohort. We tested external models which provided sufficient information and our models using a test cohort of our most recent patients. The performance of models was compared using the area under the receiver operating characteristic curve (AUC). Results: Our literature review yielded 41 papers. Of those, 8 were found to have sufficient documentation and concordance with features available in our cohort to implement in our test cohort. All models were derived from Chinese patients. One model predicted criticality and seven predicted mortality. Tested against the test cohort, internal models had an AUC of 0.84 (0.74–0.94) for mortality and 0.83 (0.76–0.90) for criticality. The best external model had an AUC of 0.89 (0.82–0.96) using three variables, another an AUC of 0.84 (0.78–0.91) using ten variables. AUCs ranged from 0.68 to 0.89. On average, models tested were unable to produce predictions in 27% of patients due to missing lab data. Conclusion: Despite differences in pandemic timeline, race, and socio-cultural healthcare context, some models derived in China performed well. For healthcare organizations considering implementation of an external model, concordance between the features used in the model and features available in their own patients may be important. Analysis of both local and external models should be done to help decide which prediction method is used to provide clinical decision support to clinicians treating COVID-19 patients as well as which lab tests should be included in order sets.

Recommended citation: Galanter, William, Jorge Mario Rodriguez-Fernandez, Kevin Chow, Samuel Harford, Karl M. Kochendorfer, Maryam Pishgar, Julian Theis, John Zulueta, and Houshang Darabi. "Predicting clinical outcomes among hospitalized COVID-19 patients using both local and published models." BMC Medical Informatics and Decision Making 21, no. 1 (2021): 1-18. https://link.springer.com/article/10.1186/s12911-021-01576-w

On the Performance Analysis of the Adversarial System Variant Approximation Method to Quantify Process Model Generalization

Published in 6th International Workshop on Process Querying, Manipulation, and Intelligence 2021, 2021

Process mining algorithms discover a process model from an event log. The resulting process model is supposed to describe all possible event sequences of the underlying system. Generalization is a process model quality dimension of interest. A generalization metric should quantify the extent to which a process model represents the observed event sequences contained in the event log and the unobserved event sequences of the system. Most of the available metrics in the literature cannot properly quantify the generalization of a process model. A recently published method called Adversarial System Variant Approximation leverages Generative Adversarial Networks to approximate the underlying event sequence distribution of a system from an event log. While this method demonstrated performance gains over existing methods in measuring the generalization of process models, its experimental evaluations have been performed under ideal conditions. This paper experimentally investigates the performance of Adversarial System Variant Approximation under non-ideal conditions such as biased and limited event logs. Moreover, experiments are performed to investigate the originally proposed sampling hyperparameter value of the method on its performance to measure the generalization. The results confirm the need to raise awareness about the working conditions of the Adversarial System Variant Approximation method. The outcomes of this paper also serve to initiate future research directions.

Recommended citation: Theis, J., Mokhtarian, I., Darabi, H. (2022). On the Performance Analysis of the Adversarial System Variant Approximation Method to Quantify Process Model Generalization. In: Munoz-Gama, J., Lu, X. (eds) Process Mining Workshops. ICPM 2021. Lecture Notes in Business Information Processing, vol 433. Springer, Cham. https://doi.org/10.1007/978-3-030-98581-3_21 https://link.springer.com/chapter/10.1007/978-3-030-98581-3_21

Process Mining Model to Predict Mortality in Paralytic Ileus Patients

Published in 2021 International Conference on Cyber-Physical Social Intelligence (ICCSI), 2021

Paralytic Ileus (PI) patients are at high risk of death when admitted to the intensive care unit (ICU), with mortality as high as 40%. There is minimal research concerning PI patient mortality prediction. There is a need for more accurate prediction modeling for ICU patients diagnosed with PI. This paper demonstrates performance improvements in predicting the mortality of ICU patients diagnosed with PI 24 hours after admission. The proposed framework, PMPI (Process Mining Model to predict mortality of PI patients), is a modification of the work used for prediction of in-hospital mortality for ICU patients with diabetes. PMPI demonstrates similar if not better performance, with an Area under the ROC Curve (AUC) score of 0.82, compared to the best results of the existing literature. PMPI uses patient medical history, the time related to the events, and demographic information for prediction. The PMPI prediction framework has the potential to help medical teams in making better decisions for treatment and care for ICU patients with PI to increase their life expectancy.

Improving the In-Hospital Mortality Prediction of Diabetes ICU Patients Using a Process Mining/Deep Learning Architecture

Published in IEEE Journal of Biomedical and Health Informatics, 2021

Diabetes intensive care unit (ICU) patients are at increased risk of complications leading to in-hospital mortality. Assessing the likelihood of death is a challenging and time-consuming task due to a large number of influencing factors. Healthcare providers are interested in the detection of ICU patients at higher risk, such that risk factors can possibly be mitigated. While such severity scoring methods exist, they are commonly based on a snapshot of the health conditions of a patient during the ICU stay and do not specifically consider a patient’s prior medical history. In this paper, a process mining/deep learning architecture is proposed to improve established severity scoring methods by incorporating the medical history of diabetes patients. First, health records of past hospital encounters are converted to event logs suitable for process mining. The event logs are then used to discover a process model that describes the past hospital encounters of patients. An adaptation of Decay Replay Mining is proposed to combine medical and demographic information with established severity scores to predict the in-hospital mortality of diabetes ICU patients. Significant performance improvements are demonstrated compared to established risk severity scoring methods and machine learning approaches using the Medical Information Mart for Intensive Care III dataset.

Recommended citation: J. Theis, W. L. Galanter, A. D. Boyd and H. Darabi, "Improving the In-Hospital Mortality Prediction of Diabetes ICU Patients Using a Process Mining/Deep Learning Architecture," in IEEE Journal of Biomedical and Health Informatics, vol. 26, no. 1, pp. 388-399, Jan. 2022, doi: 10.1109/JBHI.2021.3092969. https://ieeexplore.ieee.org/document/9466458
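
The first step named in the abstract above, converting health records into an event log, can be pictured with a minimal sketch. The record fields (patient_id, activity, timestamp), the sample rows, and the function to_event_log are assumptions made for illustration only; they do not reflect the actual dataset schema or preprocessing used in the paper.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical flat records; field names and values are illustrative assumptions.
RECORDS = [
    {"patient_id": "P1", "activity": "admission", "timestamp": "2020-01-03T10:00"},
    {"patient_id": "P2", "activity": "admission", "timestamp": "2020-01-02T08:30"},
    {"patient_id": "P1", "activity": "discharge", "timestamp": "2020-01-05T09:00"},
    {"patient_id": "P1", "activity": "lab_test", "timestamp": "2020-01-03T12:15"},
]


def to_event_log(records):
    """Group records into one time-ordered trace per case (here: per patient)."""
    traces = defaultdict(list)
    for record in records:
        when = datetime.fromisoformat(record["timestamp"])
        traces[record["patient_id"]].append((when, record["activity"]))
    # Sorting the (timestamp, activity) tuples orders each trace by time.
    return {case: sorted(events) for case, events in traces.items()}


event_log = to_event_log(RECORDS)
# Activity sequence of one case, as a process discovery algorithm would consume it.
p1_activities = [activity for _, activity in event_log["P1"]]
```

Each case identifier maps to a trace of timestamped activities. Choosing the patient as the case notion is itself a modeling decision; a hospital encounter could serve as the case just as well.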

Adversarial System Variant Approximation to Quantify Process Model Generalization

Published in IEEE Access, 2020

In process mining, process models are extracted from event logs using process discovery algorithms and are commonly assessed using multiple quality dimensions. While the metrics that measure the relationship of an extracted process model to its event log are well-studied, quantifying the level by which a process model can describe the unobserved behavior of its underlying system falls short in the literature. In this paper, a novel deep learning-based methodology called Adversarial System Variant Approximation (AVATAR) is proposed to overcome this issue. Sequence Generative Adversarial Networks are trained on the variants contained in an event log with the intention to approximate the underlying variant distribution of the system behavior. Unobserved realistic variants are sampled either directly from the Sequence Generative Adversarial Network or by leveraging the Metropolis-Hastings algorithm. The degree by which a process model relates to its underlying unknown system behavior is then quantified based on the realistic observed and estimated unobserved variants using established process model quality metrics. Significant performance improvements in revealing realistic unobserved variants are demonstrated in a controlled experiment on 15 ground truth systems. Additionally, the proposed methodology is experimentally tested and evaluated to quantify the generalization of 60 discovered process models with respect to their systems.

Recommended citation: J. Theis and H. Darabi, "Adversarial System Variant Approximation to Quantify Process Model Generalization," in IEEE Access, vol. 8, pp. 194410-194427, 2020, doi: 10.1109/ACCESS.2020.3033450. https://ieeexplore.ieee.org/document/9237923

Decay Replay Mining to Predict Next Process Events

Published in IEEE Access, 2019

In complex processes, various events can happen in different sequences. The prediction of the next event given an a priori process state is of importance in such processes. Recent methods have proposed deep learning techniques such as recurrent neural networks, developed on raw event logs, to predict the next event from a process state. However, such deep learning models by themselves lack a clear representation of the process states. At the same time, recent methods have neglected the time feature of event instances. In this paper, we take advantage of Petri nets as a powerful tool in modeling complex process behaviors considering time as an elemental variable. We propose an approach which starts from a Petri net process model constructed by a process mining algorithm. We enhance the Petri net model with time decay functions to create continuous process state samples. Finally, we use these samples in combination with discrete token movement counters and Petri net markings to train a deep learning model that predicts the next event. We demonstrate significant performance improvements and outperform the state-of-the-art methods on nine real-world benchmark event logs.

Recommended citation: Theis, Julian, and Houshang Darabi. "Decay Replay Mining to Predict Next Process Events." IEEE Access 7 (2019): 119787-119803. https://ieeexplore.ieee.org/document/8811455
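
To make the notion of a time decay function from the abstract above more concrete, here is a minimal sketch. It assumes a linear decay of the form max(beta - alpha * elapsed, 0) attached to each Petri net place; this functional form, the parameter values, the place names, and the PlaceDecay class are illustrative assumptions and should not be read as the paper's exact formulation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlaceDecay:
    """Assumed linear decay attached to one Petri net place."""

    alpha: float  # decay rate per time unit (assumed value)
    beta: float = 1.0  # value right after a token enters the place (assumed)
    last_token_time: Optional[float] = None

    def touch(self, t: float) -> None:
        """Record that a token entered the place at time t."""
        self.last_token_time = t

    def value(self, t: float) -> float:
        """Decayed activation at time t; 0 if no token has entered yet."""
        if self.last_token_time is None:
            return 0.0
        elapsed = t - self.last_token_time
        return max(self.beta - self.alpha * elapsed, 0.0)


def state_sample(places: dict, t: float) -> list:
    """Continuous process state: one decay value per place, ordered by name."""
    return [places[name].value(t) for name in sorted(places)]


# Two hypothetical places with different decay rates.
places = {"p_start": PlaceDecay(alpha=0.1), "p_wait": PlaceDecay(alpha=0.5)}
places["p_start"].touch(0.0)  # token arrives at t = 0
places["p_wait"].touch(2.0)   # token arrives at t = 2
sample = state_sample(places, 3.0)  # approximately [0.7, 0.5]
```

A vector like sample encodes how recently each place was active. In the approach described above, such samples are combined with token movement counters and the current marking before being passed to the neural network.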

Process Mining of Programmable Logic Controllers: Input/Output Event Logs

Published in 2019 IEEE 15th International Conference on Automation Science and Engineering (CASE), 2019

This paper presents an approach to model an unknown Ladder Logic based Programmable Logic Controller (PLC) program consisting of Boolean logic and counters using Process Mining techniques. First, we tap the inputs and outputs of a PLC to create a data flow log. Second, we propose a method to translate the obtained data flow log to an event log suitable for Process Mining. In a third step, we propose a hybrid Petri net (PN) and neural network approach to approximate the logic of the actual underlying PLC program. We demonstrate the applicability of our proposed approach on a case study with three simulated scenarios.

Recommended citation: J. Theis, I. Mokhtarian and H. Darabi, "Process Mining of Programmable Logic Controllers: Input/Output Event Logs," 2019 IEEE 15th International Conference on Automation Science and Engineering (CASE), Vancouver, BC, Canada, 2019, pp. 216-221. doi: 10.1109/COASE.2019.8842900 https://ieeexplore.ieee.org/document/8842900

Behavioral Petri Net Mining and Automated Analysis for Human-Computer Interaction Recommendations in Multi-Application Environments

Published in Proceedings of the ACM on Human-Computer Interaction, Volume 3, EICS, 2019

Process Mining is a well-known technique that is frequently applied to Software Development Processes but has been neglected in Human-Computer Interaction (HCI) recommendation applications. Organizations usually train employees to interact with required IT systems. Often, employees, or users in general, develop their own strategies for solving repetitive tasks and processes. However, organizations find it hard to detect whether employees interact efficiently with IT systems or not. Hence, we have developed a method which detects inefficient behavior assuming that at least one optimal HCI strategy is known. This method provides recommendations to gradually adapt users’ behavior towards the optimal way of interaction while taking user satisfaction into account. Based on users’ behavior logs tracked by a Java application suitable for multi-application and multi-instance environments, we demonstrate the applicability for a specific task in a common Windows environment utilizing realistic simulated behaviors of users.

Recommended citation: Julian Theis and Houshang Darabi. (2019). "Behavioral Petri Net Mining and Automated Analysis for Human-Computer Interaction Recommendations in Multi-Application Environments." Proc. ACM Hum.-Comput. Interact. 3, EICS, Article 13 (June 2019), 16 pages. DOI: https://doi.org/10.1145/3331155

Dr. Julian Theis
