Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.

Pages

Posts

Why Process Mining Is Important

3 minute read

Published:

You might have recently heard about tools and methods under the term Process Mining, which is supposedly a key component of digital transformation. While the term enjoys great interest in Europe, it remains relatively unknown to date, especially in the US. I tend to illustrate the motivation for Process Mining using the example of the development of a city like Chicago.

Potemkin Villages in Deep Learning

2 minute read

Published:

I have received a few messages asking me to clarify what exactly I meant by Potemkin village models (for all who missed it: Fundamental Research). The term Potemkin village originated in Russia, where Grigory Potemkin, a former minister, built fake portable settlements solely to impress his beloved Catherine II. Thus, the term Potemkin village means any construction built to deceive others into thinking that a situation is better than it really is. So are almost all Deep Learning models, no matter if we are talking about language translation systems, picture labeling applications, or voice assistants. With the discovery of Adversarial Attacks, it has been shown that current neural networks learn shallow decision boundaries instead of actual underlying truths. Research has also shown how easy it is to manipulate currently deployed, state-of-the-art image recognition and voice control systems. As a result, any Deep Learning model can be fooled by adding controlled noise to its input such that humans cannot recognize a difference. Let’s imagine a voice control assistant at your home. The vulnerabilities just mentioned can be exploited to embed commands in, for example, manipulated music files. You won’t even hear a difference, but your assistant will naturally process those hidden messages. Though we are aware of the presence of Adversarial Attacks, we are not able to protect models from being exploited. It might not be a huge security issue for many of our current applications, but what if we are talking about road sign detection in autonomous driving environments? What about healthcare applications? Or what about Google’s newly introduced voice assistant, which will make calls and manage appointments for you? Would you trust such a system, which is vulnerable in a way we cannot prevent yet?
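
To make the idea of "controlled noise" concrete, here is a minimal sketch of a sign-based perturbation against a toy linear classifier. It is an illustration only, not taken from the post: the weights, the input, and the helper names (`predict`, `adversarial_example`) are made up, and real attacks target deep networks rather than a four-feature linear model.

```python
def predict(weights, bias, x):
    """Return 1 if the linear score is positive, else 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0


def sign(value):
    """Return -1, 0, or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def adversarial_example(weights, x, label, epsilon):
    """Shift every feature by at most epsilon against the given label.

    For a linear score the gradient with respect to the input is the
    weight vector, so stepping along sign(weights) changes the score
    as quickly as possible under a per-feature budget of epsilon.
    """
    direction = -1 if label == 1 else 1
    return [xi + direction * epsilon * sign(w) for w, xi in zip(weights, x)]


# Hypothetical toy model and input (assumptions for illustration).
weights = [0.5, -0.25, 0.75, -0.5]
bias = 0.0
x = [0.2, 0.1, 0.1, 0.2]

original_label = predict(weights, bias, x)
x_adv = adversarial_example(weights, x, original_label, epsilon=0.05)
adversarial_label = predict(weights, bias, x_adv)
largest_change = max(abs(a - b) for a, b in zip(x, x_adv))
```

In this toy setting no feature moves by more than 0.05, yet the predicted class flips, which is the effect the post describes at a much larger scale.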

portfolio

DREAM-NAP

Decay Replay Mining to Predict Next Process Events

Activity Tracker

Application that tracks Human-Computer Interactions in Windows environments and stores them as event logs

PyDREAM

Python implementation of the Decay Replay Mining (DREAM) approach

AVATAR

Python implementation of the AdVersarial system vArianT AppRoximation (AVATAR) approach

publications

Behavioral Petri Net Mining and Automated Analysis for Human-Computer Interaction Recommendations in Multi-Application Environments

Published in Proceedings of the ACM on Human-Computer Interaction, Volume 3, EICS, 2019

Process Mining is a famous technique which is frequently applied to Software Development Processes, while being neglected in Human-Computer Interaction (HCI) recommendation applications. Organizations usually train employees to interact with required IT systems. Often, employees, or users in general, develop their own strategies for solving repetitive tasks and processes. However, organizations find it hard to detect whether employees interact efficiently with IT systems or not. Hence, we have developed a method which detects inefficient behavior assuming that at least one optimal HCI strategy is known. This method provides recommendations to gradually adapt users’ behavior towards the optimal way of interaction considering satisfaction of users. Based on users’ behavior logs tracked by a Java application suitable for multi-application and multi-instance environments, we demonstrate the applicability for a specific task in a common Windows environment utilizing realistic simulated behaviors of users.

Recommended citation: Julian Theis and Houshang Darabi. (2019). "Behavioral Petri Net Mining and Automated Analysis for Human-Computer Interaction Recommendations in Multi-Application Environments." Proc. ACM Hum.-Comput. Interact. 3, EICS, Article 13 (June 2019), 16 pages. DOI: https://doi.org/10.1145/3331155

Process Mining of Programmable Logic Controllers: Input/Output Event Logs

Published in 2019 IEEE 15th International Conference on Automation Science and Engineering (CASE), 2019

This paper presents an approach to model an unknown Ladder Logic based Programmable Logic Controller (PLC) program consisting of Boolean logic and counters using Process Mining techniques. First, we tap the inputs and outputs of a PLC to create a data flow log. Second, we propose a method to translate the obtained data flow log to an event log suitable for Process Mining. In a third step, we propose a hybrid Petri net (PN) and neural network approach to approximate the logic of the actual underlying PLC program. We demonstrate the applicability of our proposed approach on a case study with three simulated scenarios.

Recommended citation: J. Theis, I. Mokhtarian and H. Darabi, "Process Mining of Programmable Logic Controllers: Input/Output Event Logs," 2019 IEEE 15th International Conference on Automation Science and Engineering (CASE), Vancouver, BC, Canada, 2019, pp. 216-221. doi: 10.1109/COASE.2019.8842900 https://ieeexplore.ieee.org/document/8842900
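
As a rough illustration of the second step in the abstract above, the following sketch turns a sampled data flow log into an event log by emitting one event per signal change. This is only one plausible encoding and not necessarily the translation used in the paper; the signal names (`I0`, `Q0`), the `rise`/`fall` labels, and the function name are assumptions.

```python
def to_event_log(samples):
    """Convert (timestamp, {signal: bool}) samples into (timestamp, event) pairs.

    An event is emitted whenever a signal differs from its value in the
    previous sample; the first sample only sets the baseline.
    """
    events = []
    previous = None
    for timestamp, signals in samples:
        if previous is not None:
            for name in sorted(signals):
                if signals[name] != previous.get(name):
                    edge = "rise" if signals[name] else "fall"
                    events.append((timestamp, f"{name}_{edge}"))
        previous = signals
    return events


# Hypothetical data flow log: one input (I0) and one output (Q0).
samples = [
    (0, {"I0": False, "Q0": False}),
    (1, {"I0": True, "Q0": False}),
    (2, {"I0": True, "Q0": True}),
    (3, {"I0": False, "Q0": True}),
]
events = to_event_log(samples)
```

The resulting list of timestamped activity labels has the shape that process discovery algorithms generally expect as input.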

Decay Replay Mining to Predict Next Process Events

Published in IEEE Access, 2019

In complex processes, various events can happen in different sequences. The prediction of the next event given an a-priori process state is of importance in such processes. Recent methods have proposed deep learning techniques such as recurrent neural networks, developed on raw event logs, to predict the next event from a process state. However, such deep learning models by themselves lack a clear representation of the process states. At the same time, recent methods have neglected the time feature of event instances. In this paper, we take advantage of Petri nets as a powerful tool in modeling complex process behaviors considering time as an elemental variable. We propose an approach which starts from a Petri net process model constructed by a process mining algorithm. We enhance the Petri net model with time decay functions to create continuous process state samples. Finally, we use these samples in combination with discrete token movement counters and Petri net markings to train a deep learning model that predicts the next event. We demonstrate significant performance improvements and outperform the state-of-the-art methods on nine real-world benchmark event logs.

Recommended citation: Theis, Julian, and Houshang Darabi. "Decay Replay Mining to Predict Next Process Events." IEEE Access 7 (2019): 119787-119803. https://ieeexplore.ieee.org/document/8811455
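
The abstract above mentions time decay functions, token movement counters, and markings as inputs to the predictor. The sketch below shows how such a state sample could be assembled, assuming a linear decay of the form max(beta - alpha * elapsed, 0) per place. The decay form, the parameter values, and all names are assumptions for illustration and are not the PyDREAM API.

```python
def decay_value(now, last_token_time, alpha, beta=1.0):
    """Linearly decaying activation of a place since it last received a token.

    Returns 0.0 for places that have never held a token.
    """
    if last_token_time is None:
        return 0.0
    return max(beta - alpha * (now - last_token_time), 0.0)


def state_sample(now, last_token_times, token_counts, marking, alpha):
    """Concatenate decay values, token counters, and the marking into one vector."""
    decays = [decay_value(now, t, alpha) for t in last_token_times]
    return decays + list(token_counts) + list(marking)


# Hypothetical Petri net with three places, observed at time 5.0.
last_token_times = [0.0, 4.0, None]  # when each place last received a token
token_counts = [1, 1, 0]             # how many tokens have entered each place
marking = [0, 1, 0]                  # tokens currently in each place

sample = state_sample(5.0, last_token_times, token_counts, marking, alpha=0.125)
```

Under these assumptions, a recently visited place contributes a value close to `beta` and an older visit fades towards zero, so the vector carries timing information that a plain marking does not.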

Adversarial System Variant Approximation to Quantify Process Model Generalization

Published in IEEE Access, 2020

In process mining, process models are extracted from event logs using process discovery algorithms and are commonly assessed using multiple quality dimensions. While the metrics that measure the relationship of an extracted process model to its event log are well-studied, quantifying the level by which a process model can describe the unobserved behavior of its underlying system falls short in the literature. In this paper, a novel deep learning-based methodology called Adversarial System Variant Approximation (AVATAR) is proposed to overcome this issue. Sequence Generative Adversarial Networks are trained on the variants contained in an event log with the intention to approximate the underlying variant distribution of the system behavior. Unobserved realistic variants are sampled either directly from the Sequence Generative Adversarial Network or by leveraging the Metropolis-Hastings algorithm. The degree by which a process model relates to its underlying unknown system behavior is then quantified based on the realistic observed and estimated unobserved variants using established process model quality metrics. Significant performance improvements in revealing realistic unobserved variants are demonstrated in a controlled experiment on 15 ground truth systems. Additionally, the proposed methodology is experimentally tested and evaluated to quantify the generalization of 60 discovered process models with respect to their systems.

Recommended citation: J. Theis and H. Darabi, "Adversarial System Variant Approximation to Quantify Process Model Generalization," in IEEE Access, vol. 8, pp. 194410-194427, 2020, doi: 10.1109/ACCESS.2020.3033450. https://ieeexplore.ieee.org/document/9237923

Improving the In-Hospital Mortality Prediction of Diabetes ICU Patients Using a Process Mining/Deep Learning Architecture

Published in IEEE Journal of Biomedical and Health Informatics, 2021

Diabetes intensive care unit (ICU) patients are at increased risk of complications leading to in-hospital mortality. Assessing the likelihood of death is a challenging and time consuming task due to a large number of influencing factors. Healthcare providers are interested in the detection of ICU patients at higher risk, such that risk factors can possibly be mitigated. While such severity scoring methods exist, they are commonly based on a snapshot of the health conditions of a patient during the ICU stay and do not specifically consider a patient’s prior medical history. In this paper, a process mining/deep learning architecture is proposed to improve established severity scoring methods by incorporating the medical history of diabetes patients. First, health records of past hospital encounters are converted to event logs suitable for process mining. The event logs are then used to discover a process model that describes the past hospital encounters of patients. An adaptation of Decay Replay Mining is proposed to combine medical and demographic information with established severity scores to predict the in hospital mortality of diabetes ICU patients. Significant performance improvements are demonstrated compared to established risk severity scoring methods and machine learning approaches using the Medical Information Mart for Intensive Care III dataset.

Recommended citation: J. Theis, W. L. Galanter, A. D. Boyd and H. Darabi, "Improving the In-Hospital Mortality Prediction of Diabetes ICU Patients Using a Process Mining/Deep Learning Architecture," in IEEE Journal of Biomedical and Health Informatics, vol. 26, no. 1, pp. 388-399, Jan. 2022, doi: 10.1109/JBHI.2021.3092969. https://ieeexplore.ieee.org/document/9466458

Process Mining Model to Predict Mortality in Paralytic Ileus Patients

Published in 2021 International Conference on Cyber-Physical Social Intelligence (ICCSI), 2021

Paralytic Ileus (PI) patients are at high risk of death when admitted to the intensive care unit (ICU), with mortality as high as 40%. There is minimal research concerning PI patient mortality prediction. There is a need for more accurate prediction modeling for ICU patients diagnosed with PI. This paper demonstrates performance improvements in predicting the mortality of ICU patients diagnosed with PI after 24 hours of being admitted. The proposed framework, PMPI (Process Mining Model to predict mortality of PI patients), is a modification of the work used for prediction of in-hospital mortality for ICU patients with diabetes. PMPI demonstrates similar if not better performance, with an Area under the ROC Curve (AUC) score of 0.82, compared to the best results of the existing literature. PMPI uses patient medical history, the time related to the events, and demographic information for prediction. The PMPI prediction framework has the potential to help medical teams in making better decisions for treatment and care for ICU patients with PI to increase their life expectancy.

Recommended citation: M. Pishgar, M. Razo, J. Theis and H. Darabi, "Process Mining Model to Predict Mortality in Paralytic Ileus Patients," 2021 International Conference on Cyber-Physical Social Intelligence (ICCSI), 2021, pp. 1-6, doi: 10.1109/ICCSI53130.2021.9736217. https://ieeexplore.ieee.org/document/9736217

On the Performance Analysis of the Adversarial System Variant Approximation Method to Quantify Process Model Generalization

Published in 6th International Workshop on Process Querying, Manipulation, and Intelligence 2021, 2021

Process mining algorithms discover a process model from an event log. The resulting process model is supposed to describe all possible event sequences of the underlying system. Generalization is a process model quality dimension of interest. A generalization metric should quantify the extent to which a process model represents the observed event sequences contained in the event log and the unobserved event sequences of the system. Most of the available metrics in the literature cannot properly quantify the generalization of a process model. A recently published method called Adversarial System Variant Approximation leverages Generative Adversarial Networks to approximate the underlying event sequence distribution of a system from an event log. While this method demonstrated performance gains over existing methods in measuring the generalization of process models, its experimental evaluations have been performed under ideal conditions. This paper experimentally investigates the performance of Adversarial System Variant Approximation under non-ideal conditions such as biased and limited event logs. Moreover, experiments are performed to investigate the originally proposed sampling hyperparameter value of the method on its performance to measure the generalization. The results confirm the need to raise awareness about the working conditions of the Adversarial System Variant Approximation method. The outcomes of this paper also serve to initiate future research directions.

Recommended citation: Theis, J., Mokhtarian, I., Darabi, H. (2022). On the Performance Analysis of the Adversarial System Variant Approximation Method to Quantify Process Model Generalization. In: Munoz-Gama, J., Lu, X. (eds) Process Mining Workshops. ICPM 2021. Lecture Notes in Business Information Processing, vol 433. Springer, Cham. https://doi.org/10.1007/978-3-030-98581-3_21 https://link.springer.com/chapter/10.1007/978-3-030-98581-3_21

Predicting clinical outcomes among hospitalized COVID-19 patients using both local and published models

Published in BMC Medical Informatics and Decision Making, 2021

Background: Many models are published which predict outcomes in hospitalized COVID-19 patients. The generalizability of many is unknown. We evaluated the performance of selected models from the literature and our own models to predict outcomes in patients at our institution. Methods: We searched the literature for models predicting outcomes in inpatients with COVID-19. We produced models of mortality or criticality (mortality or ICU admission) in a development cohort. We tested external models which provided sufficient information and our models using a test cohort of our most recent patients. The performance of models was compared using the area under the receiver operator curve (AUC). Results: Our literature review yielded 41 papers. Of those, 8 were found to have sufficient documentation and concordance with features available in our cohort to implement in our test cohort. All models were from Chinese patients. One model predicted criticality and seven mortality. Tested against the test cohort, internal models had an AUC of 0.84 (0.74–0.94) for mortality and 0.83 (0.76–0.90) for criticality. The best external model had an AUC of 0.89 (0.82–0.96) using three variables, another an AUC of 0.84 (0.78–0.91) using ten variables. AUCs ranged from 0.68 to 0.89. On average, models tested were unable to produce predictions in 27% of patients due to missing lab data. Conclusion: Despite differences in pandemic timeline, race, and socio-cultural healthcare context, some models derived in China performed well. For healthcare organizations considering implementation of an external model, concordance between the features used in the model and features available in their own patients may be important. Analysis of both local and external models should be done to help decide on what prediction method is used to provide clinical decision support to clinicians treating COVID-19 patients as well as what lab tests should be included in order sets.

Recommended citation: Galanter, William, Jorge Mario Rodriguez-Fernandez, Kevin Chow, Samuel Harford, Karl M. Kochendorfer, Maryam Pishgar, Julian Theis, John Zulueta, and Houshang Darabi. "Predicting clinical outcomes among hospitalized COVID-19 patients using both local and published models." BMC Medical Informatics and Decision Making 21, no. 1 (2021): 1-18. https://link.springer.com/article/10.1186/s12911-021-01576-w

Masking Neural Networks Using Reachability Graphs to Predict Process Events

Published in 2021 International Conference on Cyber-Physical Social Intelligence (ICCSI), 2021

Decay Replay Mining is a deep learning method that utilizes process model notations to predict the next event. However, this method does not intertwine the neural network with the structure of the process model to its full extent. This paper proposes an approach to further interlock the process model of Decay Replay Mining with its neural network for next event prediction. The approach uses a masking layer which is initialized based on the reachability graph of the process model. Additionally, modifications to the neural network architecture are proposed to increase the predictive performance. Experimental results demonstrate the value of the approach and underscore the importance of discovering precise and generalized process models.

Recommended citation: J. Theis and H. Darabi, "Masking Neural Networks Using Reachability Graphs to Predict Process Events," 2021 International Conference on Cyber-Physical Social Intelligence (ICCSI), 2021, pp. 1-6, doi: 10.1109/ICCSI53130.2021.9736237. https://ieeexplore.ieee.org/document/9736237
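
The masking idea in the abstract above can be sketched as follows: events that the reachability graph does not allow from the current marking get probability zero, and the remaining scores are renormalized. This simplified sketch masks output scores only, whereas the paper describes a masking layer inside the network; the graph, the scores, and the function name are hypothetical.

```python
import math


def masked_softmax(scores, allowed):
    """Softmax over the allowed events only; all other events get probability 0."""
    exps = {event: math.exp(s) for event, s in scores.items() if event in allowed}
    total = sum(exps.values())
    if total == 0:
        # No event is enabled in this marking (for example a final state).
        return {event: 0.0 for event in scores}
    return {event: exps.get(event, 0.0) / total for event in scores}


# Hypothetical reachability graph: marking -> {enabled event: next marking}.
reachability = {
    "start": {"register": "registered"},
    "registered": {"approve": "done", "reject": "done"},
    "done": {},
}

# Hypothetical raw network scores for the next event.
scores = {"register": 2.0, "approve": 0.5, "reject": 0.5}
probabilities = masked_softmax(scores, reachability["registered"])
```

Although `register` has the highest raw score here, it cannot occur from the `registered` marking, so the entire probability mass is split between `approve` and `reject`.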

Prediction of Unplanned 30-Day Readmission for ICU Patients with Heart Failure

Published in BMC Medical Informatics and Decision Making, 2022

We presented a process mining/deep learning approach for the prediction of unplanned 30-day readmission of ICU patients with heart failure (HF). A patient’s health records can be understood as a sequence of observations called event logs, which are used to discover a process model. Time information was extracted using the DREAM (Decay Replay Mining) algorithm. Demographic information and severity scores upon admission were then combined with the time information and fed to a neural network (NN) model to further enhance the prediction efficiency. Additionally, several machine learning (ML) algorithms were developed to be used as the baseline models for the comparison of the results. The proposed approach was capable of modeling the time-related variables and incorporating the medical history of patients from prior hospital visits for prediction. Thus, our approach significantly improved the outcome prediction compared to that of other ML-based models and health calculators.

Recommended citation: M. Pishgar, J. Theis, M. Del Rios, A. Ardati, H. Anahideh, and H. Darabi, "Prediction of unplanned 30-day readmission for ICU patients with heart failure," BMC Medical Informatics and Decision Making 22, Article number: 117 (2022). doi: https://doi.org/10.1186/s12911-022-01857-y https://bmcmedinformdecismak.biomedcentral.com/track/pdf/10.1186/s12911-022-01857-y.pdf

A Process Mining- Deep Learning Approach to Predict Survival in a Cohort of Hospitalized COVID‐19 Patients

Published in BMC Medical Informatics and Decision Making, 2022

Various machine learning and artificial intelligence methods have been used to predict outcomes of hospitalized COVID-19 patients. However, process mining has not yet been used for COVID-19 prediction. We developed a process mining/deep learning approach to predict mortality among COVID-19 patients and updated the prediction in 6 h intervals during the first 72 h after hospital admission. Our proposed process mining/deep learning model performed significantly better than commonly used machine learning approaches that ignore time information. Thus, time information should be incorporated in models to predict outcomes more accurately.

Recommended citation: M. Pishgar, S. Harford, J. Theis, W. Galanter, J. M. Rodríguez-Fernández, L. H Chaisson, Y. Zhang, A. Trotter, K. M. Kochendorfer, A. Boppana, and H. Darabi, "A process mining- deep learning approach to predict survival in a cohort of hospitalized COVID‐19 patients," BMC Medical Informatics and Decision Making 22, Article number: 194 (2022). doi: https://doi.org/10.1186/s12911-022-01934-2 https://bmcmedinformdecismak.biomedcentral.com/track/pdf/10.1186/s12911-022-01934-2.pdf

Improving Predictive Process Monitoring Through Reachability Graph-Based Masking of Neural Networks

Published in IEEE Transactions on Computational Social Systems, 2022

Predicting the next event during process runtime is an objective of interest in predictive process monitoring (PPM). Decay replay mining is one of few deep learning-based next event prediction approaches that are built upon process model notations. However, this algorithm does not fully intertwine its neural network with the available process knowledge contained in the process model. This work, which is an extended version of an earlier conference publication, investigates the reachability graphs of underlying Petri net process models for masking the neural network of decay replay mining to ultimately increase the quality of next event predictions. A more comprehensive set of experiments is performed to provide robust statistical evidence of the usefulness of the approach and relativizes earlier made claims and hypotheses. In addition, the decay replay mining approach is applied with the suggested reachability graph-based masking extension to a healthcare use case of sepsis patients facilitating decision-making for healthcare practitioners. The obtained results further underscore the validity of the masking of neural networks using knowledge contained in the reachability graph of a Petri net process model.

Recommended citation: J. Theis and H. Darabi, "Improving Predictive Process Monitoring Through Reachability Graph-Based Masking of Neural Networks," in IEEE Transactions on Computational Social Systems, 2022, doi: 10.1109/TCSS.2022.3220262. https://ieeexplore.ieee.org/abstract/document/9955412

talks

Software-defined Networking

Published:

Guest lecture on Software-defined Networking as part of the graduate course Multimedia Networking at RheinMain University of Applied Sciences.

Secure Socket Layer Virtual Private Networks

Published:

Guest lecture on Secure Socket Layer Virtual Private Networks as part of the graduate course Secure Networking at RheinMain University of Applied Sciences.

Machine Learning Using Tensorflow

Published:

Guest lecture on Machine Learning using Tensorflow as part of the graduate course Data Science I (IE594) in the Mechanical and Industrial Engineering department at UIC.

Regression and Classification Algorithms

Published:

Python workshop on regression and classification algorithms as part of the graduate course Data Science I (IE594) in the Mechanical and Industrial Engineering department at UIC.

Hands-On: Artificial Intelligence

Published:

Introduction to Deep Learning and Artificial Intelligence for 30 incoming freshmen engineering students as part of the Summer Bridge Program organized by the UIC College of Engineering.

Artificial Intelligence Around Us

Published:

An Artificial Intelligence workshop and demonstration hosted by the University of Illinois at Chicago Institute of Industrial and Systems Engineers.

Neural Networks and Deep Learning

Published:

Neural Network and Deep Learning introduction seminar at the School of Public Health at the University of Illinois at Chicago.

PyData Meetup - Process Mining

Published:

Process Mining describes a set of data-driven techniques to discover process models from event logs, to check conformance and monitor processes, and to enhance and improve models. During the last ten years, these techniques have developed into a mature research field with many use cases, resulting in a strong industry uptake. Its main applications can be found in business process management to create actionable insights leveraging existing data from IT systems. However, recent research shows a broader range of Process Mining applications such as in Software Engineering, Human-Computer Interaction, and control of manufacturing processes. In this presentation, Julian Theis will introduce the fundamentals of Process Mining based on real-world examples. He will also talk about recent research projects in atypical fields and will discuss future directions.

Behavioral Petri Net Mining and Automated Analysis for Human Computer Interaction Recommendations in Multi-Application Environments

Published:

Process Mining is a famous technique which is frequently applied to Software Development Processes, while being neglected in Human-Computer Interaction (HCI) recommendation applications. Organizations usually train employees to interact with required IT systems. Often, employees, or users in general, develop their own strategies for solving repetitive tasks and processes. However, organizations find it hard to detect whether employees interact efficiently with IT systems or not. Hence, we have developed a method which detects inefficient behavior assuming that at least one optimal HCI strategy is known. This method provides recommendations to gradually adapt users’ behavior towards the optimal way of interaction considering satisfaction of users. Based on users’ behavior logs tracked by a Java application suitable for multi-application and multi-instance environments, we demonstrate the applicability for a specific task in a common Windows environment utilizing realistic simulated behaviors of users.

Hands-On: Artificial Intelligence

Published:

Introduction to Artificial Intelligence fundamentals with real-world examples for 13 incoming freshmen engineering students as part of the Summer Bridge Program organized by the UIC College of Engineering. This workshop included three hands-on parts on building a machine learning classifier, training deep neural networks, and building a Generative Adversarial Network (GAN).

Process Mining of Programmable Logic Controllers: Input/Output Event Logs

Published:

We present an approach to model an unknown Ladder Logic based Programmable Logic Controller (PLC) program consisting of Boolean logic and counters using Process Mining techniques. First, we tap the inputs and outputs of a PLC to create a data flow log. Second, we propose a method to translate the obtained data flow log to an event log suitable for Process Mining. In a third step, we propose a hybrid Petri net (PN) and neural network approach to approximate the logic of the actual underlying PLC program. We demonstrate the applicability of our proposed approach on a case study with three simulated scenarios.

On the Performance Analysis of the Adversarial System Variant Approximation Method to Quantify Process Model Generalization

Published:

Process mining algorithms discover a process model from an event log. The resulting process model is supposed to describe all possible event sequences of the underlying system. Generalization is a process model quality dimension of interest. A generalization metric should quantify the extent to which a process model represents the observed event sequences contained in the event log and the unobserved event sequences of the system. Most of the available metrics in the literature cannot properly quantify the generalization of a process model. A recently published method called Adversarial System Variant Approximation leverages Generative Adversarial Networks to approximate the underlying event sequence distribution of a system from an event log. While this method demonstrated performance gains over existing methods in measuring the generalization of process models, its experimental evaluations have been performed under ideal conditions. This paper experimentally investigates the performance of Adversarial System Variant Approximation under non-ideal conditions such as biased and limited event logs. Moreover, experiments are performed to investigate the originally proposed sampling hyperparameter value of the method on its performance to measure the generalization. The results confirm the need to raise awareness about the working conditions of the Adversarial System Variant Approximation method. The outcomes of this paper also serve to initiate future research directions.

Masking Neural Networks Using Reachability Graphs to Predict Process Events

Published:

Decay Replay Mining is a deep learning method that utilizes process model notations to predict the next event. However, this method does not intertwine the neural network with the structure of the process model to its full extent. This paper proposes an approach to further interlock the process model of Decay Replay Mining with its neural network for next event prediction. The approach uses a masking layer which is initialized based on the reachability graph of the process model. Additionally, modifications to the neural network architecture are proposed to increase the predictive performance. Experimental results demonstrate the value of the approach and underscore the importance of discovering precise and generalized process models.
