Classification of Anxiety Disorders using Machine Learning Methods: A Literature Review

This paper provides a comprehensive review of recent literature on the application of machine learning algorithms to the diagnosis of anxiety disorders, the prediction of treatment response, and the prediction of disorder onset. Clinical decision support systems based on data-driven classifier design have demonstrated a range of benefits for medical experts. The social media boom of the last decade and wearable sensor technology have also opened new opportunities to support clinical decisions. Still, there is considerable room to improve the quality of diagnosis, and new treatment scenarios could enhance the mental health of the general population and reduce the tendency of mental illness to end in severe outcomes such as suicide. Knowing the levels of depression, anxiety, and mood in the population can also support better decision making at the government level.


Introduction
Anxiety disorders are among the mental illnesses that impose the highest social and individual burden. According to the World Health Organization (WHO), anxiety disorders "are characterized by excessive constant fear and worrying" and cause physical symptoms such as chest pain, headaches, heart racing, and abdominal pain [1]. A 2018 epidemiological survey reported a worldwide prevalence of anxiety disorders ranging from 6 to 16% of the population [2]. According to the WHO, one in four persons is affected by a mental disorder at some time in their lives. In terms of cost, mental diseases rank second, behind only ischemic heart disease. Worldwide 23% of total deaths are due to depression and anxiety [3]. Also, the number of specialists in this field is not growing at the same pace as the number of people suffering from anxiety disorders.

Machine Learning Algorithms for Detection and Prediction of Anxiety Disorders
Machine learning algorithms can be applied to the area of anxiety disorder in different ways, as given below:
• Diagnosis or detection of anxiety disorder
• Prediction of risk of anxiety disorder in the future
• Prediction of the response of medical treatment to the anxiety disorder
Machine learning algorithms can be used to classify the presence or absence of a particular anxiety disorder, to predict risk levels, or to predict levels of response to treatment. The classification process is explained in Figure 1. Data can be collected from different sources including demographic data, health records, medical history, different measuring scales, etc. A large set of features can be obtained from the data, and important features may be selected using an appropriate feature selection algorithm. These features are divided into training and testing data sets for the classifier. The last step is the selection and parameter tuning of a good classifier on the training data set, guided by performance metrics.
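The classification process described above can be sketched end to end in a few lines. The example below is only an illustration under stated assumptions: the toy data, the function names, the min-max normalization, and the nearest-centroid classifier are all hypothetical choices made for brevity and are not taken from any of the reviewed studies.

```python
import random

def min_max_fit(rows):
    """Return per-feature (min, max) bounds computed on the training rows only."""
    return [(min(col), max(col)) for col in zip(*rows)]

def min_max_apply(rows, bounds):
    """Scale every feature with the training bounds (constant features map to 0)."""
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, (lo, hi) in zip(row, bounds)] for row in rows]

def fit_centroids(rows, labels):
    """Nearest-centroid classifier: store one mean feature vector per class."""
    sums, counts = {}, {}
    for row, y in zip(rows, labels):
        acc = sums.setdefault(y, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict(centroids, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(row, centroids[y])))

def run_pipeline(features, labels, test_fraction=0.25, seed=0):
    """Split, normalize, train, and return accuracy on the held-out test set."""
    idx = list(range(len(features)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(len(idx) * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    x_train = [features[i] for i in train_idx]
    y_train = [labels[i] for i in train_idx]
    x_test = [features[i] for i in test_idx]
    y_test = [labels[i] for i in test_idx]
    bounds = min_max_fit(x_train)
    centroids = fit_centroids(min_max_apply(x_train, bounds), y_train)
    preds = [predict(centroids, r) for r in min_max_apply(x_test, bounds)]
    return sum(p == y for p, y in zip(preds, y_test)) / len(y_test)

# Hypothetical, well-separated toy data: two features per subject,
# label 0 = control, label 1 = anxiety disorder.
features = ([[0.1 * i, 10.0 + i] for i in range(8)]
            + [[5.0 + 0.1 * i, 50.0 + i] for i in range(8)])
labels = [0] * 8 + [1] * 8
accuracy = run_pipeline(features, labels)
```

Computing the normalization bounds on the training rows only keeps information from the test set out of the model, which is the usual reason for splitting the data before any fitting step.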
In the following sub-sections, details of the figure blocks will be explained:

Data collection
Various types of data can be used to train machine learning algorithms, drawn from questionnaires, interviews, demographic data, medical health records, treatment history, and anxiety rating scales. Many scales are available and are used by clinical and health professionals. In the clinical setting, many tests such as functional magnetic resonance imaging (fMRI), electroencephalogram (EEG), electrocardio-(GAD), Suicidal Tendency (ST), Panic Disorder (PD), Social mental disturbance (SAD) and Agoraphobia (AG). Generalized Anxiety Disorder (GAD) develops in response to more than one object or situation; it creates fear and worry in the minds of sufferers, who become concerned even about routine matters of life. This type of mental disorder is characterized by worry along with fatigue, restlessness, muscle tension, and sleep disturbances [9,10]. Anxiety also plays a role in creating a tendency towards suicide [11]. Social mental disturbance (SAD) arises from a person's inner fear of public embarrassment or of failure in social interactions. This type of anxiety may also develop when negative public opinions are held about a person who fails to cope with such situations. People suffering from this type of anxiety try to avoid the sources of their anxiety. The problem can be worse in the younger generation, as they often fail to receive proper mental health services [12]. Social physique anxiety is also part of social mental disturbance and is more common in females than in males [13]. In Panic Disorder (PD), a person may experience short episodes of trembling, shaking, or confusion, and may also have difficulty breathing [14,15]. In Agoraphobia, people fear being in situations from which they could not escape if anything went wrong [16]. Different rating scales exist in the literature to assess various forms of anxiety disorders and their clinical treatment methods [17].
Children and adolescents with autistic spectrum disorders (ASD) are at higher risk of anxiety disorders [18]. Various studies address the treatment of anxiety disorders in children with ASD [19][20][21]. Various assessments also exist to diagnose anxiety in children [22].
Machine Learning [23] refers to algorithms able to analyse complex data with heterogeneous distribution [24]. Machine Learning algorithms provide automated solutions for decision making without human intervention. This technology has been applied in various sectors, for instance, education [25], economics [26], medical imaging [27], [28] and clinical decision support systems [29,30].
Detecting mental disorders at the appropriate stage is critical to treating them. A variety of models using medical data are used to assess the mental health of a person. Interviews with the doctor, analysis of symptoms, and physical examination may help to identify mental illness related to anxiety disorder. Questionnaire-based assessment is a popular method of diagnosing mental illness [31,32]. A variety of questionnaire-based scales exist and are used by medical staff [33][34][35]. Recently, with the spread of wearable and portable devices, new methods are being used to detect anxiety disorders [36]. The next section contains a comprehensive literature review about the risk assessment of anxiety disorder using machine learning techniques. Most of the literature in this paper was retrieved from the following databases: on dimensions of observable behavior and neurobiological measures. The RDoC framework highlights the importance of symptom modeling to establish the causal origin of mental illness [53], and the diagnosis of psychiatric disorders may be improved by precise clustering of symptoms.
Several scales exist for pediatric anxiety disorders, including the Anxiety Disorders Interview Schedule for DSM-IV (ADIS) [54], which is suited to GAD, Obsessive-Compulsive Disorder (OCD), Separation Anxiety Disorder, and Social Phobia. The Child and Adolescent Psychiatric Assessment (CAPA) [55] is a semi-structured interview-based scale that covers several psychiatric disorders. The Diagnostic Interview for Children and Adolescents (DICA) [56] is a highly structured interview-based scale with child and parent versions. It covers all pediatric anxiety disorders. A more comprehensive list and analysis of the pros and cons of different scales in pediatric psychiatry can be found in [57].
Brain imaging is also helping to characterize the pathophysiological mechanism in GAD. Various studies have shown impairment in certain brain areas. Madonna, et al. [58] described three types of models related to the functional anatomy of anxiety disorders. According to one model, some brain areas are either hyper- or hypo-activated depending on the severity of symptoms. In another model, specific brain areas are activated. Hence brain activity in fMRI can be a good source of data for training machine learning algorithms. Some structural changes in MRI may also be related to different anxiety disorders. The volume of white and gray matter in different brain regions can also serve as good training data for identifying anxiety disorders. Further detail about such data can be found in [58]. Wearable devices can be used to measure various parameters useful for training machine learning algorithms. A good literature review can be found in [36]. Heart rate variability (HRV) can be calculated as the change in the instantaneous heart rate measured by the RR interval from the ECG [59]. HRV is an important gram (ECG), etc are also being used for the diagnosis of the anxiety disorder. In addition to these types of data, researchers have recently begun using data recorded by wearable and non-wearable sensors without feedback from the patients. The tests mentioned above and raw data recorded from wearable and non-wearable sensors require pre-processing and feature extraction. DNA methylation signatures have also recently become prevalent for accurately distinguishing patients with mental disorders from healthy control subjects [37]. Methylation is the covalent attachment of a methyl group to cytosine, one of the four bases in DNA. Methylation has been linked to many mental disorders, including PTSD, MDD, and suicidal tendency [38][39][40].
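To make the heart rate variability measures referred to in this review more concrete, the following minimal sketch computes common time-domain HRV features (SDNN, RMSSD, NN50, pNN50) from a list of RR intervals. The function name, the use of milliseconds, and the sample (n - 1) form of the standard deviation are assumptions made for illustration; individual studies may use different conventions, recording lengths, and artifact handling.

```python
import math

def hrv_time_domain(rr_ms):
    """Time-domain HRV features from successive RR intervals in milliseconds."""
    if len(rr_ms) < 2:
        raise ValueError("at least two RR intervals are required")
    n = len(rr_ms)
    mean_rr = sum(rr_ms) / n
    # SDNN: standard deviation of the intervals (sample form, n - 1, assumed here).
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))
    # Successive differences between neighbouring intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    # RMSSD: root mean square of the successive differences.
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    # NN50: number of successive differences larger than 50 ms.
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    # pNN50: NN50 as a percentage of all successive differences.
    pnn50 = 100.0 * nn50 / len(diffs)
    return {
        "mean_hr_bpm": 60000.0 / mean_rr,
        "sdnn": sdnn,
        "rmssd": rmssd,
        "nn50": nn50,
        "pnn50": pnn50,
    }

# Hypothetical RR series alternating between 800 ms and 860 ms.
hrv = hrv_time_domain([800, 860, 800, 860])
```

Features such as these can be placed directly in the feature vector of a classifier alongside questionnaire scores or demographic data.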
The Hamilton Anxiety Rating Scale (HAM-A) [41] is a questionnaire used by clinical staff to assess the severity of anxiety. The Hospital Anxiety and Depression Scale (HADS) [42] is a seven-item scale for identifying anxiety and depression and is widely used by researchers. The Beck Anxiety Inventory (BAI) [43] contains 21 items and, based on the score, identifies three levels of anxiety: low, medium, and high. The Depression, Anxiety, and Positive Outlook Scale (DAPOS) [44] is an 11-item questionnaire-based scale for different levels of depression, anxiety, and positive outlook. The Generalized Anxiety Disorder Severity Scale (GADSS) [45] is an interview-based GAD symptom severity scale with six items. The Social Interaction Anxiety Scale (SIAS) [46] and the Social Phobia Scale (SPS) [47] each consist of 20 items that are used to measure the severity of social anxiety symptoms. Many scales are available for different types of anxiety and depression disorders. The Diagnostic and Statistical Manual of Mental Disorders, 5th ed. (DSM-5) [48] provides the diagnostic criteria of different mental disorders. The Clinician-Administered PTSD Scale for DSM-5 (CAPS-5) [49] is a diagnostic tool based on structured diagnostic interviews and is widely used for post-traumatic stress disorder (PTSD). The National Institute of Mental Health (NIMH) introduced the Research Domain Criteria (RDoC) [50][51][52] for investigating the mental disorders based imized. Feature selection methods are broadly divided into filter methods, wrapper methods, and embedded methods. Other feature selection methods may also exist. Good surveys of these methods can be found in [82][83][84]. Filter methods rank the features according to a suitable ranking criterion such as correlation or mutual information. A threshold value is then used to select the best features. The method is called a filter method because it reduces the feature set before classification is performed on it.
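The filter approach described above can be illustrated with a short sketch that ranks features by the absolute Pearson correlation with the class label and keeps those above a threshold. The feature names, the toy values, and the threshold of 0.5 are hypothetical; mutual information or another ranking criterion could be substituted without changing the structure.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def filter_select(rows, labels, names, threshold=0.5):
    """Rank features by |correlation with the label| and keep those >= threshold."""
    scores = []
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), name))
    scores.sort(reverse=True)  # best-scoring features first
    return [name for score, name in scores if score >= threshold]

# Hypothetical data: one informative feature, one uncorrelated, one constant.
names = ["worry_score", "noise", "constant"]
rows = [[1, 5, 3], [2, 1, 3], [8, 1, 3], [9, 5, 3]]
labels = [0, 0, 1, 1]
selected = filter_select(rows, labels, names)
```

Because the ranking is computed without training a classifier, the filter step is cheap, but it scores each feature in isolation and cannot detect features that are only useful in combination.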
Wrapper methods use an objective function (classification performance) and are broadly classified into sequential selection methods and heuristic search methods based on evolutionary algorithms. A wrapper method finds the subset of features that maximizes the objective function. Embedded methods perform feature selection during the training phase of classification; because training and feature selection are done at the same time, these methods are called embedded methods. Supervised feature selection methods evaluate the relevance of a feature to the class label, whereas unsupervised feature selection methods consider various properties of the data to establish the relevance of a feature. When only a small labeled feature set is available along with a large unlabelled feature set, semi-supervised feature selection methods can be used, in which feature relevance is calculated from both the labeled and unlabelled feature sets [27][84][85][86]. Ensemble feature selection methods are relatively new algorithms that improve the robustness and performance of feature selection. These methods are based on a voting aggregation scheme [87]. Information energies of hesitant fuzzy sets can also be used to combine ranking feature selection algorithms [88]. An ensemble of feature selection methods can also be built by creating different sub-samples of the data set and training different learners on them, after which a combination method selects the most relevant features [89].
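One simple way to picture a voting aggregation of several feature rankings is a Borda count, sketched below. This is an assumption chosen for illustration and is not necessarily the scheme used in [87]; the feature names and the three rankings are hypothetical.

```python
def borda_aggregate(rankings, top_k):
    """Combine several best-first feature rankings with a Borda count.

    Each ranking awards n points to its first feature, n - 1 to the second,
    and so on; features are then ordered by total points (ties by name).
    """
    points = {}
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            points[feature] = points.get(feature, 0) + (n - position)
    ordered = sorted(points, key=lambda f: (-points[f], f))
    return ordered[:top_k]

# Hypothetical rankings produced by three different selection methods.
rankings = [
    ["hrv", "bai", "age", "sleep"],
    ["bai", "hrv", "sleep", "age"],
    ["bai", "sleep", "hrv", "age"],
]
top_features = borda_aggregate(rankings, top_k=2)
```

Aggregating rankings in this way tends to favour features that several methods agree on, which is the robustness argument usually made for ensemble feature selection.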

Classifier selection and optimisation
There is no hard and fast rule for selecting an appropriate classifier for a classification problem. Researchers try different classification algorithms, based on their intuition or a literature review, to achieve better classification accuracy. Classifiers may be categorized as discriminative or generative. Discriminative classifiers try to find the mapping between input and output (the output can be classes). Some examples are logistic regression, SVM, and neural networks. Generative classifiers, by contrast, try to model how the data is generated based on some probability distribution. A few examples are Naive Bayes, Bayesian Belief Networks, and Boltzmann Machines. The lazy learner type of classifier does not learn any decision rule and hence does not require training. K-nearest neighbor is a lazy classifier that requires either all the data to be stored in memory for classification, or the data to be clustered and the cluster centers kept in memory. Ensemble classifiers [90] are based on a set of classifiers predicting the class. Boosting ensemble classifiers use weak learners that sequentially correct their predecessors. Some well-known boosting classifiers are Gradient Boosting, AdaBoost, XGBoost, and CatBoost. Bagging classifiers train classifiers on a random sample marker of mortality in general cardiac health [60][61][62]. HRV can also be studied as a possible indication of an anxiety disorder. EEG measurement of brain function can also be a good indicator of different types of anxiety disorders [63][64][65][66][67]. Various features including absolute and relative power, coherence, amplitude, and event-related potentials (ERP) can be extracted from sensorimotor rhythms in the range of delta, theta, alpha, beta, and gamma frequencies.
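The lazy-learner idea mentioned earlier in this subsection can be shown with a minimal k-nearest-neighbor sketch: nothing is fitted in advance, and the stored training data is consulted only when a query arrives. The two-feature toy data, the class names, and the choice of k = 3 are hypothetical.

```python
from collections import Counter

def knn_predict(train_rows, train_labels, query, k=3):
    """Classify a query by majority vote among its k nearest training rows."""
    # Squared Euclidean distance from the query to every stored training row.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), label)
        for row, label in zip(train_rows, train_labels)
    )
    # Vote among the labels of the k closest rows.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training set with two well-separated groups.
train_rows = [[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]]
train_labels = ["control"] * 3 + ["anxiety"] * 3
near_controls = knn_predict(train_rows, train_labels, [2, 2])
near_patients = knn_predict(train_rows, train_labels, [7, 9])
```

The cost of this simplicity is that every prediction scans the whole training set, which is why the text notes that all data (or a set of cluster centers) must be kept in memory.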
Not many papers have used EEG data and machine learning algorithms to diagnose or predict the anxiety disorder but research shows EEG as a promising candidate especially when wearable sensors to record EEG are easily available.
Text, audio, and video mining on social media can be very useful in identifying potential subjects suffering from different types of disorders. He, et al. After detecting the face in the video, various features related to eye and head movements and gaze can be useful in identifying the presence of stress or anxiety [79]. Multi-modal biosignals including EEG, photoplethysmography (PPG), electrodermal activity (EDA), and pupil size can also be used as data for the detection of anxiety [80]. Eye-tracking during video chat conversations can also be useful for extracting features related to social anxiety disorder [81].
Depending on the data collection protocol, data may comprise different types including Qualitative, Quantitative, Attribute, Discrete, Continuous, etc.

Feature extraction and selection
Once data has been collected, features may be extracted from it. Either the data itself is treated as features, or features must be extracted from the raw data collected from different sources. These features may require pre-processing or normalization. Whether normalization is needed depends on the classification algorithm being used; some classification algorithms do not require it. Before classification, it is sometimes essential to select the important features so that the dimension of the feature set may be reduced and computation efforts in collecting the test data, feature extraction, and classification may be min-predictive science of psychology. Some researchers have claimed that heart rate variability is reduced in patients with anxiety disorders. Chalmers, et al. [59] compared heart rate variability in 2294 control subjects and 2086 patients with anxiety disorders (447 PD subjects, 192 PTSD subjects, 90 SAD subjects, and 40 SAD without the obsessive-compulsive disorder). The authors studied various time and frequency domain measures and concluded that high-frequency HRV and time-domain HRV are reduced in anxiety disorder patients. Carmilla, et al. [96] measured two heart rate variability measures, namely the standard deviation of the normal-to-normal intervals (SDNN) and respiratory sinus arrhythmia (RSA), in 2059 subjects, who were classified as healthy control subjects (n = 616), subjects with an anxiety disorder earlier in life (n = 420), and subjects with a current anxiety disorder (n = 1059) based on the Diagnostic and Statistical Manual of Mental Disorders, 4th Edition (DSM-IV) and the Composite International Diagnostic Interview (CIDI). Current anxiety disorder patients had significantly lower SDNN and RSA than control subjects. They claimed that this reduction may be due to taking anti-depressant medication. Many other researchers have also reported a reduction in HRV in depression and anxiety disorders [97][98][99][100]. Gunduz, et al. 
[101] found increased SDNN in PD patients, whereas other HRV measures including duration of RMSSD, NN50, and pNN50 were reduced in the PD patients. A good systematic review can be found in [102]. Many people these days use social media applications such as Twitter, Facebook, and Instagram to communicate with each other frequently. Many like to post their daily life events online and look for social appreciation and approval. This kind of behavior can contribute to depression in users and places an unnecessary competitive burden on them. In the last five years, many research papers have been published on predicting depression, anxiety, and mental disorders from text messages [74]. Bartlett, et al. [37] used DNA methylation signatures to classify three types of mental disorders, namely schizophrenia, bipolar disorder, and major depressive disorder. Four classifiers, namely Neural Networks (NN), SVM, Naive Bayes (NB), and Random Forest, were tested on differentially methylated positions and regions. The authors reported good classification accuracy ranging from 93% to 96% (the best accuracy was for the NB classifier) and AUC ranging from 0.928 to 0.97 (the best AUC was for NN).
Another interesting area where machine learning is quite helpful is drug discovery and repositioning for the treatment of anxiety disorders. Drug repositioning can decrease the cost and shorten the development cycle of drugs. Zhao and So [103] applied various machine learning algorithms, including deep neural networks, random forest, and SVM, to predict drug indications based on expression profiles for Schizophrenia and Depression/Anxiety Disorders. SVM showed better predictive performance than the other machine learning algorithms. A good survey of the application of machine learning methods in drug discovery can be found in [104].
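Since many of the reviewed studies report the area under the ROC curve (AUC), a small sketch may help clarify what the number means. The function below uses the rank interpretation of AUC: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The scores and labels are hypothetical, and the pairwise loop is written for clarity rather than efficiency.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0      # positive correctly ranked above negative
            elif p == q:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores for three patients (1) and three controls (0).
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2]
labels = [1, 1, 1, 0, 0, 0]
example_auc = auc(scores, labels)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a frame of reference for values such as those quoted above.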
The following subsections summarize how various machine learning algorithms are used for various types of anxiety disorders: of the training data set and prediction of the classifiers is combined. Random forest and Bagged Decision Trees are examples of bagging classifiers. Voting classifiers build different models or classifiers, and a voting mechanism is used to decide the class of unknown data. Deep learning has become popular more recently [91]. Different deep learning architectures, such as deep neural networks, recurrent neural networks, and convolutional neural networks, have been applied successfully in various fields.
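The bagging and voting ideas mentioned above can be sketched with a deliberately small example: each base model is a one-feature decision stump trained on a bootstrap sample, and the ensemble predicts by majority vote. The stump learner, the single hypothetical feature, the number of models, and the seed are all assumptions chosen to keep the sketch short; practical bagging methods such as random forests use full decision trees over many features.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Choose the threshold with the fewest training errors.

    The stump predicts class 1 when x > threshold and class 0 otherwise.
    """
    best_t, best_err = None, None
    for t in sorted(set(xs)):
        err = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

def fit_bagging(xs, ys, n_models=25, seed=0):
    """Train one stump per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    thresholds = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        thresholds.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return thresholds

def predict_bagging(thresholds, x):
    """Majority vote over the predictions of all stumps."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]

# Hypothetical one-dimensional feature: low values = control (0), high = case (1).
xs = [1, 2, 3, 4, 10, 11, 12, 13]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
ensemble = fit_bagging(xs, ys)
low_prediction = predict_bagging(ensemble, 0)
high_prediction = predict_bagging(ensemble, 20)
```

Because each stump sees a slightly different resampled data set, the individual thresholds vary, and the vote averages out that variation.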

Literature Review of Machine Learning Algorithms for Anxiety Disorder
Investigating and diagnosing a patient with an anxiety disorder takes at least 20 to 30 minutes of interview time. Furthermore, diagnosis based on a given scale requires the patient to fill in a questionnaire and to understand its contents properly. Identifying the risks of an anxiety disorder at an early stage may lead to early intervention and simplify the treatment. Advances in machine learning and its successful application in various fields have therefore attracted researchers to use machine learning algorithms for the diagnosis of anxiety disorders.
In this paper, we review research from the last decade on machine learning algorithms applied to various types of anxiety disorders, and group the papers according to the type of anxiety disorder. Kessler, et al. [92] used baseline reports submitted by more than 1000 participants with lifetime major depressive disorder. They then used machine learning algorithms to predict outcomes assessed 10 to 12 years after the baseline reports. The machine learning models performed better than conventional logistic regression models, with an area under the curve (AUC) of 0.63 for chronicity and 0.71 to 0.76 for other outcomes. Meenal, et al. [93] integrated clinical and imaging features to predict late-life depression (LLD) in 68 subjects, including 33 subjects suffering from LLD. Demographics and cognitive ability scores were used along with features extracted from multi-modal magnetic resonance imaging. Various machine learning models, including support vector machines (SVM), optimized ADTree, and logistic regression, were used to estimate depression and treatment response. ADTree produced the best classification accuracy: 87% in diagnosis and 89% in treatment response. Panagiotakopoulos, et al. [94] used a contextual data mining approach to assist therapists in the treatment of anxiety disorders. Only ten subjects were used to evaluate the performance of the Apriori rule mining algorithm, and the results showed the efficacy of the proposed algorithm in assisting therapists.
Sau, et al. [3] used five machine learning algorithms, namely Logistic Regression, Naive Bayes, Random Forest, Support Vector Machine, and CatBoost, to identify risks of anxiety for early intervention and treatment. 740 subjects were interviewed, and a data set comprising 14 features was used for classification. The CatBoost model performed best, with a classification accuracy of 89.3% on the testing set. Yarkoni and Westfall [95] provided some guidelines about using machine learning algorithms in the predictions. Regional classifiers did not produce significant classification accuracy, but when their responses were integrated, good predictive performance with an accuracy of more than 80% was obtained. Ojeme and Mbogho [115] used a multi-dimensional Bayesian network to identify depression and co-occurring physical illness in 1090 Nigerian subjects with 82% exact match. Theodore, et al. [116] used seven classification methods, namely IB1 (nearest-neighbour classifier), J48 (C4.5 algorithm implementation), Random Forest, MLP (Multi-layer Perceptron), SMO (Support Vector Machines using Sequential Minimal Optimization), JRip (Repeated Incremental Pruning to Produce Error Reduction), and FURIA (Fuzzy Unordered Rule Induction Algorithm), on a data set of 103 students to classify them as having or not having an anxiety disorder based on Beck Anxiety Inventory data. The SMO algorithm performed better than the other algorithms on both the original data and the data pre-processed with the Iliou pre-processing method, with 92% and 98% classification accuracy respectively, whereas the FURIA method performed best with Principal Component Analysis (PCA) pre-processing, with 90% accuracy.

Social mental disturbance (SAD)
Social media has a major impact on an individual's life, and it is a tool through which participants share their feelings and emotions without revealing their identity. Data scientists use social media as a source for studying the psychological factors that affect people's lives. Reece, et al. [117] linked anxiety disorder with tragedy experienced in childhood and with Twitter posts. The study used the Random Forest algorithm with 234,000 posts, 63 anxiety participants, and 111 healthy participants. The study extracted predictive features from participants' tweets that quantify affect, linguistic style, and context (n = 279,951) and built models with supervised learning algorithms using these features. The Random Forest model identified anxiety symptoms for clinical diagnosis with 89% accuracy.
EEG-based classification of SAD was investigated in [118] on a small set of subjects comprising 32 SAD subjects and 32 healthy control subjects. Two models of features extracted from EEG were considered. In the first model, channels of five frequency bands are concatenated without considering the spatial configuration of the EEG electrodes, whereas in the second model the spatial configuration is also considered. Convolutional neural network (CNN), SVM, and KNN classifiers were trained on the data, and CNN performed best, with a classification accuracy of 87%.

Post-traumatic stress disorder (PTSD)
Traumatic incidents such as war, terrorist attacks, natural disasters, and accidents cause anxiety and mental problems. If PTSD is identified at an early stage, proper interventions can reduce its long-term effects. If symptoms persist for more than 30 days after a traumatic event, PTSD is diagnosed and treatment is required. Machine learning methods can be used to classify anxiety disorders in patients who have experienced a trauma [119]. The study uses text mining methods and natural language processing on an online survey and compares three machine learning algorithms, SVM, Decision Trees, and Naive Bayes, on the data

Generalized anxiety disorder (GAD)
The most common anxiety disorder is Generalized Anxiety Disorder (GAD). In clinical practice, there are various assessment and treatment methods for GAD [105]. Sribala [106] developed a neural network-based model for the prediction of GAD using general attributes and attributes related to the Diagnostic and Statistical Manual (DSM) IV standard questionnaire [107]. The reported result was 96.43% with sensitivity analysis and 90.32% without it [106]. Hussain, et al. [108] collected data from the Beck Depression Inventory (BDI) test, one of the most common psychometric tests for measuring the severity of depression [109]. Questionnaire results from 182 respondents (112 female and 70 male) were used to create the GAD data set. A random forest was applied, and the best result was obtained with balanced data (the same number of male and female respondents) and 100 trees, which gave an accuracy of 99.3% [108].
Kevin, et al. [110] applied a binary support vector machine to differentiate patients from healthy subjects, and also to differentiate between GAD and major depression (MD), on a small data set comprising 24 healthy subjects, 19 subjects with GAD, and 14 subjects with MD. The data included clinical scores, cortisol data, grey matter (GM), and white matter (WM) data; structural MRI was used to collect the GM and WM data. Based on the questionnaire alone, a high accuracy of 96.4% (p<.001) was achieved for case classification (GAD and MD as cases and healthy as non-cases), whereas classification between disorders was moderate (56% with p<0.22). Using all data types, overall classification accuracy was 90% for case classification and 67.4% for disorder classification.
Ball, et al. [111] applied random forest classification to fMRI data from individuals suffering from GAD (25 adults) and PD (23 adults) to predict their treatment outcomes. The model based on fMRI yielded 79% accuracy whereas clinical and demographic features gave 69.

Panic disorder (PD)
Panic disorder, or panic attack, is a condition in which a person feels sudden fear or discomfort for a short period that may last several minutes. Leuken, et al. [112] predicted comorbidity status in patients suffering from panic disorder with agoraphobia using functional MRI data. Prediction was done with an ensemble tree classifier (Random Under Sampling Boost algorithm) in fifty-nine patients, and comorbidity status was successfully predicted in 79% of the patients, with 73% sensitivity and 85% specificity. Sundermann, et al. [113] applied multivariate pattern analysis using soft-margin support vector machines to predict individual response to cognitive behavioral therapy (CBT) in panic disorder with agoraphobia. The authors achieved a moderate classification accuracy of slightly more than 50%. In another paper, Hahn, et al. [114] predicted the response to CBT in panic disorder with agoraphobia using fMRI brain assessment. The authors used regional Gaussian Process Classifiers (GPC) on 55 brain regions and converted the predictive probabilities to categorical el (HMM), and word shift graphs. The major achievements of the research are the classification of PTSD and depression from healthy subjects with high accuracy (90%), detection of the onset of depression several months in advance, and indication of PTSD almost immediately after trauma, many months before clinical diagnosis.
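Accuracy, sensitivity, and specificity figures such as those quoted above for comorbidity prediction are all derived from the same confusion matrix. The sketch below shows the arithmetic on hypothetical labels, where 1 marks a patient with the condition and 0 marks a subject without it; the function name and the toy predictions are assumptions for illustration.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # share of true cases that were detected
        "specificity": tn / (tn + fp),   # share of non-cases correctly rejected
    }

# Hypothetical ground truth and predictions for ten subjects.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, because a classifier can reach high accuracy while missing most of the minority class.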
EEG based classification of PTSD and major depressive disorder (MDD) from healthy control (HC) subjects showed a significant decrease in event-related potentials (P300) amplitudes and reduced source activities in PTSD as compared to MDD and healthy subjects on data set of total 157 subjects (51 PTSD, 67 MDD, 39 HC) [127]. P300 source activity was recorded in different regions of the brain. Both sensor level and source-level P300 features are used for the classification. A binary SVM classifier is used to classify various pairs of PTSD, MDD, and HC with classification accuracy in the range of 67% to 82.5%.

Anxiety disorders in adolescents and children
Studies in this regard show the importance of diagnosing anxiety at preschool age, which enables early interventions that can control and limit the development of anxiety disorders [128]. Other studies show that risk factors are increasing in this age group, for instance Attention Deficit Hyperactivity Disorder (ADHD), panic anxiety, bullying, and lack of social skills [129]. The ability to quickly diagnose and intervene in anxiety disorders while the child's brain is still developing may put the child at reduced risk of psychiatric disturbances later in life [130]. Generalized Anxiety Disorder (GAD) and Separation Anxiety Disorder (SAD) are highly prevalent among early childhood anxiety disorders [128]. Less than 15% of children are evaluated and treated for various anxiety disorders [131].
Ellen W. McGinnis, et al. [132] analyzed three-minute speech recordings of young children to detect internalizing disorders. Affected children were found to show especially low-pitched voices with repeatable speech inflections. Audio features were extracted from the sound, and different classifiers, including logistic regression (LR), SVM with a linear kernel, SVM with a Gaussian kernel, and random forest (RF), were used for binary classification, achieving 80% accuracy (54% sensitivity, 93% specificity) with LR and 80% accuracy (62% sensitivity, 89% specificity) with the linear-kernel SVM. Carpenter, et al. [133] collected Preschool Age Psychiatric Assessment (PAPA) data; PAPA is a valid diagnostic parent-report interview for assessing anxiety disorders in preschool children (2-5 years old). The authors used machine learning techniques to shorten the assessment for identifying GAD and separation anxiety disorder so that it may be feasible in pediatric clinics. Alternating decision tree (ADTree) and J48 algorithms were used to identify GAD and separation anxiety disorder. Ten ADTree nodes, corresponding to a minimum of 8 and a maximum of 17 individual PAPA items, can produce an accuracy of 95% for separation anxiety disorder, whereas 5 ADTree nodes for GAD include only 3 to 7 individual PAPA items and produce an accuracy of more than 95%. The maximum comprises of 150 healthy patients and 150 PTSD patients. The SVM model has the highest prediction of PTSD with an accuracy of 82%. Karstoft, et al. [120] tried to identify the risk indicators of PTSD in 561 Danish soldiers deployed in Afghanistan. A range of psychometric measures was recorded, including PTSD symptoms, psychiatric problems, previous trauma exposure, social support, intelligence, emotional analysis, and affection. 
Markov Boundary feature selection algorithm for Generalized Local Learning (GLL) is used to predict the risk indicators with high accuracy (AUC = 0.84 in pre-deployment and AUC = 0.88 in post-deployment).
Ge, et al. [121] used the XGBoost algorithm to predict PTSD three months after an earthquake in China, based on different features recorded two weeks after the earthquake. Prediction accuracy ranged from 66% to 80%. Features were recorded from different assessment methods including socio-demographic characteristics and earthquake-related experiences, sleep, mood, somatic symptoms, everyday functioning, and the Children's Revised Impact of Event Scale (CRIES). Wshah, et al. [122] collected a dataset of 90 subjects who experienced a traumatic event. The PTSD Checklist for the Diagnostic and Statistical Manual of Mental Disorders, 5th Edition (PCL-5) is a self-reported measure that includes 20 items to assess PTSD over the last month [123]. The authors selected 11 normalized features for binary classification of PTSD versus no PTSD. Various machine learning algorithms, namely logistic regression, Naive Bayes, SVM, random forest, and a voting classifier, are used. The AUC of these algorithms is in the range of 0.78 to 0.85, with the voting classifier (VC) found to be the best. A deep belief network (DBN) with deep transfer learning was used by Banerjee, et al. [124] for PTSD detection using features extracted from the audio recordings of 26 patients. Prosodic, vocal-tract, and excitation features are extracted from audio samples of PTSD patients and normal subjects, and an accuracy of 75% is achieved. Ensemble classifiers are used by Papini, et al. [125] for the prediction of PTSD in subjects admitted to the emergency department of a hospital after different types of trauma, including fall injuries, automobile collisions, motorcycle collisions, etc. The feature set included hospital routine features, demographic features, psychological features, and census features. The probability of PTSD after three months was predicted using the XGBoost ensemble algorithm with AUC = 0.85. Levy, et al. [126] used various machine learning algorithms, including SVM, random forest, AdaBoost, kernel ridge regression, and Bayesian binary regression, on 957 trauma survivors to predict PTSD status after fifteen months from a combination of emergency room features and 10-day post-event features. Out of these features, 16 are identified as potential predictors. They found that the SVM classifier can predict PTSD status with AUC = 0.78.
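Several of the PTSD studies above report AUC and one of them reports a voting classifier as the best model. The following is a minimal sketch of how AUC can be computed from predicted scores and how averaging the scores of two models (soft voting) can change it. The labels and scores are hypothetical, and soft voting is an assumption made for illustration; the reviewed papers do not state which voting scheme was used.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney statistic)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


def soft_vote(*model_scores):
    """Average the per-subject scores of several models (soft voting, assumed here)."""
    return [sum(column) / len(column) for column in zip(*model_scores)]


# Hypothetical data: 1 = PTSD, 0 = no PTSD
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1]
model_b = [0.6, 0.8, 0.3, 0.4, 0.5, 0.2]
ensemble = soft_vote(model_a, model_b)

for name, scores in [("model A", model_a), ("model B", model_b), ("ensemble", ensemble)]:
    print(name, round(auc_score(labels, scores), 3))
```

On this toy data the averaged scores rank every positive case above every negative case, which neither model does alone; on real data the gain from voting is usually smaller and is not guaranteed.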
Reece, et al. [117] used Twitter data and the depression history of 204 individuals (105 depressed) and extracted predictive features such as the number of tweets per day, average word count per tweet, retweet occurrence, lexicon features of the text, etc. A random forest classifier is found to be the best at discriminating between healthy and affected persons. The prediction models include random forest, Hidden Markov Mod-
The classifier evaluated using leave-one-subject-out (LOSO) produced the best discrimination with 81% classification accuracy (67% sensitivity, 88% specificity). This method is very fast compared to the standard Child Behaviour Checklist (CBCL), a questionnaire completed by parents, and clinical interviews with parents to assess child problem behaviors. Cyber-crimes like cyber-bullying may have psychological effects on children. Vimala, et al. [146] used machine learning algorithms, namely Naive Bayes, random forest, and J48, to classify tweets into four categories (bully, aggressor, spammer, normal). Features including the Big Five and Dark Triad models, sentiment, emotion, and Twitter-based features are used for training and testing the classifiers. J48 and random forest classifiers performed equally well, and J48 produced an AUC of 0.97. Murnion, et al. [147] analyzed messages in online games to detect cyber-bullying based on sentiment analysis. Detecting and stopping cyber-bullying is a hot topic these days, and many researchers are using text mining approaches and machine learning algorithms to detect cyber-bullying automatically. A good literature review on this topic can be found in.
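The leave-one-subject-out (LOSO) evaluation mentioned above keeps all samples of one subject out of training in each fold, so the classifier is always tested on an unseen child. A minimal sketch of the splitting step follows; the subject identifiers and the function name are hypothetical and are not taken from the reviewed papers.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical recordings: two samples from each of three children
subject_ids = ["c1", "c1", "c2", "c2", "c3", "c3"]
folds = list(leave_one_subject_out(subject_ids))

for held_out, train, test in folds:
    print(f"test on {held_out}: train={train} test={test}")
```

Splitting by subject rather than by sample avoids the optimistic bias that arises when samples of the same child appear in both the training and the test set.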

Suicidal tendency (ST)
Suicidal tendency is not an anxiety disorder according to DSM-5, but patients suffering from anxiety disorders are more likely to have suicidal tendencies and thoughts [148][149][150]. So a review of machine learning algorithms for diagnosing or predicting suicidal tendency is carried out in this sub-section. Suicide attempts, whether fatal or not, are a major mental health issue worldwide. Approximately 800,000 people die annually due to suicide [151]. It is difficult to predict non-fatal suicide attempts. Effective intervention is essential to prevent suicides and suicide attempts. Suicide attempts also put a heavy financial burden on governments and families. Hence, it is important to predict suicidal tendency in people suffering from mental illness. Although many tools have been developed to predict suicide risk, people may feel reluctant to participate in such tools, which are mainly based on questionnaires [152][153][154].
Six neural network learning algorithms, namely conjugate gradient back-propagation with Fletcher-Reeves updates (CGF), Levenberg-Marquardt (LM), Broyden-Fletcher-Goldfarb-Shanno (BFGS), scaled conjugate gradient (SCG), resilient back-propagation (RP), and variable learning rate (GDX), are used to predict the tendency for suicide [155]. The dataset was collected from demographic questionnaires distributed among 800 university students. Suspicious questionnaire papers were removed and the sample size was reduced to 698 students, of whom 557 were female and 141 were male. The Levenberg-Marquardt learning algorithm provided the best performance, with a 93.12% true acceptance rate (TAR). Jihoon Oh, et al. [156] recorded observations within one month and one year. A total of 573 patients submitted 31 self-report psychiatric scales and questionnaires. A neural network classifier is trained using 41 criteria (31 psychiatric scales and 10 socio-demographic elements). Their model accuracy was 93.7% for one month, 90.8% for one year, and 87.4% in discovering lifelong suicide attempts.
The maximum accuracy achieved by the ADTree algorithm is 97% and 99% for GAD and separation anxiety disorder, respectively, whereas the accuracy is 97% for both GAD and separation anxiety disorder in the case of the J48 algorithm. Another study [134] used the ADTree classifier as a screening tool for detecting anxiety disorders in preschool children.
Kim, et al. [135] used a support vector machine model to predict the treatment response to methylphenidate (MPH) administration for attention deficit hyperactivity disorder (ADHD) in young subjects. The authors claimed that SVM can produce 84.6% accuracy at stage 4 in predicting MPH response using age, weight, ADRA2A MspI and DraI polymorphisms, lead level, Stroop color-word test performance, and oppositional symptoms of the Disruptive Behaviour Disorder rating scale. Burke, et al. [136] used a large sample of pediatric primary care patients (13,325 with a mean age of 17 years) and studied a ridge regression model and two machine learning algorithms, decision trees and random forests, to classify suicide attempt history. A total of 53 Behavioural Health Screen (BHS) items are used as indicators for the predictive models. Random forests showed an accuracy of 92% for emergency department data (sensitivity 70%, specificity 94%) and an accuracy of 96% (sensitivity 70%, specificity 97%) for primary care data.
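Accuracy, sensitivity, and specificity are linked through the share of positive cases in the sample: accuracy = sensitivity x p + specificity x (1 - p). The sketch below uses this standard identity to show how the three figures quoted above relate to each other. It is only an arithmetic illustration on the rounded percentages given in this review, not a reanalysis of the original data, and the function names are hypothetical.

```python
def accuracy_from_rates(sensitivity, specificity, prevalence):
    """Overall accuracy implied by sensitivity, specificity, and positive-class share."""
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


def implied_prevalence(accuracy, sensitivity, specificity):
    """Solve accuracy = sens * p + spec * (1 - p) for the positive-class share p."""
    if sensitivity == specificity:
        raise ValueError("p is not identifiable when sensitivity equals specificity")
    return (accuracy - specificity) / (sensitivity - specificity)


# Rounded figures quoted above for the emergency department data
p = implied_prevalence(accuracy=0.92, sensitivity=0.70, specificity=0.94)
print(f"implied share of positive cases: {p:.3f}")
print(f"accuracy recomputed from the rates: {accuracy_from_rates(0.70, 0.94, p):.3f}")
```

Because the published percentages are rounded, the implied share is only indicative. The identity is a reminder that, for imbalanced samples, a high accuracy can coexist with a modest sensitivity, so the three measures are best read together.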
Internalizing disorders (ID) normally start in childhood and are first experienced in adolescence [137]. Hence it is important to identify ID risk at an early age. A lot of research has been done to predict the risk factors based on self-assessment reports or questionnaires completed by parents [138][139][140][141][142]. Rosellini, et al. [137] used waves 1 and 2 of the NESARC survey of US adults [143]. Wave 1 interviews were conducted in 2001-2002 (n = 43,093), whereas wave 2 re-interviewed wave 1 respondents (n = 34,653) in 2004-2005. The authors used 213 features from the information available in the data set. Nine classifiers (including logistic regression, least absolute shrinkage and selection operator penalized regression, generalized additive modeling, adaptive splines, k-nearest neighbors, linear SVM, and linear discriminant analysis) are tested individually or in an ensemble to differentiate five outcomes, namely GAD, PD, social phobia, depression, and mania. AUC was achieved in the range of 0.76 (depression) to 0.83 (mania) for different outcomes.
McGinnis, et al. [144] proposed a 90-second fear induction task during which the motion of children is recorded by a wearable belt containing acceleration and angular velocity sensors. The authors extracted time- and frequency-based features, and KNN is used as a binary classifier for diagnosing children with internalizing disorders. The best model produced 75% classification accuracy. In another paper [145], McGinnis, et al. extracted 147 features from 20-second time series of acceleration and angular velocity data obtained from the sensors. The total number of subjects is 63 children (57% female). A logistic regression (LR) model is used to classify internalizing diagnoses versus controls based on 10 features selected using the Davies-Bouldin index.
The classifiers separating persons with suicidal attempts from persons with no suicidal attempt showed accuracy ranging from 64% to 72%, with RVM showing the best performance. Cheng, et al. [160] used SVM to predict suicide risk levels based on a web-based survey of posts on Weibo, a Chinese social media platform. Natural language processing of these posts can help identify individuals who are at greater risk of a suicide attempt.
Ludwig, et al. [161] analyzed the concept of dividing suicides into violent and non-violent. They investigated the clusters present in various forms of suicide, such as poisoning, hanging, shooting, drowning, and jumping from high places, based on a large sample of 77,894 suicide cases using similarity measures. They found different clusters in male and female groups. In the Weibo study [160], features are extracted from Simplified Chinese-Linguistic Inquiry and Word Count (SC-LIWC) categories, and classifiers are trained to relate these features to five suicide risk factors.

Discussions, Challenges and Future Directions
The use of machine learning algorithms in the detection, treatment response, and prediction of anxiety disorders is becoming popular. Applications of machine learning algorithms can be broadly divided into three categories: clinical decision support systems, semi-automatic diagnosis, and fully automatic diagnosis. In Table 1, the performance of some classifiers is compared for different types of anxiety disorders. Most researchers have used SVM and random forest classifiers for the detection of different types of anxiety disorders, but there is still room for improvement in terms of classification accuracy. Hierarchical classification can be explored for the classification of a large number of anxiety disorders.
Walsh, et al. [157] applied a random forest algorithm to the electronic health records of more than 5167 subjects (3250 patients with a suicide attempt and 1917 healthy control subjects) to predict suicide risk. Data is generated based on demographic data, diagnoses based on claim data, past health care utilization, evidence of prior suicide attempts, socioeconomic data, and medication data. The dimension of the feature space is further reduced to capture only important details. The model is used to predict suicide attempts with different prediction windows, ranging from 7 days to 720 days before the suicide attempt. Results show high recall values (0.95) that remained the same for different prediction windows. Precision values are not as high, falling from 0.79 to 0.74 as the prediction window is changed from 7 days to 720 days. The authors suggested that a random forest classifier is a better prediction model than traditional regression models and offers a notable window of up to two years before the attempt in which to intervene and start treatment procedures. Calderon-Vilca, et al. [158] used simulated data to predict the suicidal tendency in the young generation. 
Then they tried different algorithms such as C5.4, Naive Bayes, and the JRIP rule-based algorithm to predict the suicidal tendency. The authors found high classification accuracy on this simulated data, but the paper lacks information about how representative the simulated data is of the young population facing different anxiety factors in their life. Passos, et al. [159] used clinical and demographic variables from 144 subjects, including patients with mood disorders who have attempted suicide. Three machine learning kernel-based methods, least absolute shrinkage and selection operator (LASSO), SVM, and relevance vector machine (RVM), are used as classifiers.

Table 1: Performance of some best classifiers for Anxiety Disorders.

Name of Disorders | Machine Learning Algorithms | Performance
GAD | RF [108,111]; SVM [110] | 99%, 79%; 96%
ST | NN [155,156]; RF [157]; RVM [159] | 93%, 90%; Recall 0.95, Precision 0.79; 72%
PD | Ensemble tree classifier [112]; Bayesian network [115]; SVM [116] | 79%; 82%; 92%
SAD | RF [117]; CNN [118] | 89%; 87%
PTSD | SVM [119,126,127]; GLL [120]; VC [120]; DBN [124]; XGBoost [125] | 82%
Pediatric | SVM [132,137]; ADTree [133]; J48 [133,146]; RF [135,146]; RL [144] | 80%

In addition to the classical system of deriving clinical perception about mental health, evidence-based diagnosis can be integrated in the form of sensing the brain anatomy and functionalities, clinical test reports, HRV, etc. Smart personalized medical treatment can be planned with the help of such a decision support system, and the response to medical treatment can be monitored by evidence analysis based on sensors' feedback, patients' feedback, and the clinical perception of medical practitioners. Such a clinical decision support system is also interactive and can make some decisions based on evidence. Panagiotakopoulos, et al. [94] used a contextual data mining approach to provide applications and personalized services to medical practitioners based on data collected through long-term monitoring. Contextual information consists of personal context (demographic data, medical history, clinical exams, and notes of a medical practitioner), stress context (stress level in a specific context), symptoms context (seven first-rank and five second-rank symptoms, including restlessness, fatigue, exaggerated response, muscle tension, sleep disturbance, inability to concentrate, irritability, nausea, perspiring, dry mouth, tachycardia, and tremor), and environmental context. They also provided four treatment support services to extract information that may not be easily visible to psychiatrists. The proposed system will help the therapist in cognitive behavioral therapy (CBT). The applications and services are built on the Apriori association rule mining algorithm in consultation with medical experts. For prediction services, a Bayesian network predictive classifier is used. We have not found many papers in which data mining approaches are used to discover the rule base or predictive behaviors and responses. Moreover, evidence-based reasoning models may also be developed to provide clinical support to medical experts at a personal or general level. Hence there is a lot of scope for research in this area.
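To make the rule-mining step concrete, the following is a small self-contained sketch of the Apriori idea: frequent itemsets are grown level by level, candidates with an infrequent subset are pruned, and rules are kept when their confidence exceeds a threshold. The monitoring records, item names, and thresholds are hypothetical and are not taken from the system of Panagiotakopoulos, et al. [94].

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent
        current = {c for c in candidates
                   if all(frozenset(sub) in level for sub in combinations(c, k - 1))}
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for rules above the threshold."""
    found = []
    for itemset, itemset_support in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                confidence = itemset_support / frequent[antecedent]
                if confidence >= min_confidence:
                    found.append((set(antecedent), set(itemset - antecedent), confidence))
    return found


# Hypothetical monitoring records: context items observed together
records = [
    {"high_stress", "sleep_disturbance", "tachycardia"},
    {"high_stress", "sleep_disturbance"},
    {"high_stress", "tachycardia"},
    {"sleep_disturbance"},
    {"high_stress", "sleep_disturbance", "fatigue"},
]
frequent = apriori(records, min_support=0.4)
rules = association_rules(frequent, min_confidence=0.7)
for antecedent, consequent, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent), round(confidence, 2))
```

In a clinical setting the mined rules would still need review by medical experts, as in the approach described above, before being used in any service.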

Fully automatic diagnosis and prediction of treatment response
Most people feel shy about consulting health professionals in the early stages of anxiety disorders and mental problems due to personal ego, social pressure, fear of job loss, etc. Data can be recorded via sensors without the intervention of health professionals.
Text-based anxiety disorder detection: These days almost everyone uses a smartphone and sends text messages and emails to their friends, relatives, colleagues, and even medical doctors.
Hoogendoorn, et al. [170] studied predictive modeling based on text written by patients. Different features may be extracted, such as the selection, usage, and frequency of words, topics of discussion, sentiment of the text, and writing style. Logistic regression, decision tree, and random forest algorithms are used as predictive models and obtained reasonable accuracy in terms of AUC in predicting social anxiety. Similar work has also been done on text from social media and messaging applications such as Twitter, WhatsApp, emails, Facebook, etc. to discover the presence of any type of anxiety disorder. Saha, et al. [171] called social media sensors for mental health.
In hierarchical classification, at the first level, a binary classifier can be used to separate patients having an anxiety disorder from healthy people. At the second level, different types of anxiety disorders can be classified.
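The two-level scheme described above (a binary classifier first, then a classifier over disorder types) could be sketched as follows. A simple nearest-centroid classifier is used at both levels purely for illustration; the class names, feature values, and labels are hypothetical, and any of the classifiers reviewed in this paper could take its place.

```python
def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    return [sum(column) / len(column) for column in zip(*rows)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroid:
    """Assigns a sample to the class whose mean feature vector is closest."""

    def fit(self, X, y):
        grouped = {}
        for row, label in zip(X, y):
            grouped.setdefault(label, []).append(row)
        self.centroids_ = {label: centroid(rows) for label, rows in grouped.items()}
        return self

    def predict_one(self, row):
        return min(self.centroids_,
                   key=lambda label: squared_distance(row, self.centroids_[label]))


class TwoLevelClassifier:
    """Level 1: healthy vs. anxiety. Level 2: which anxiety disorder."""

    def __init__(self, healthy_label="healthy"):
        self.healthy_label = healthy_label

    def fit(self, X, y):
        binary = [label if label == self.healthy_label else "anxiety" for label in y]
        self.level1_ = NearestCentroid().fit(X, binary)
        patients = [(row, label) for row, label in zip(X, y) if label != self.healthy_label]
        self.level2_ = NearestCentroid().fit([row for row, _ in patients],
                                             [label for _, label in patients])
        return self

    def predict_one(self, row):
        if self.level1_.predict_one(row) == self.healthy_label:
            return self.healthy_label
        return self.level2_.predict_one(row)


# Hypothetical two-dimensional features for three groups
X = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.2], [1.0, 0.3], [0.3, 0.9], [0.2, 1.0]]
y = ["healthy", "healthy", "GAD", "GAD", "PD", "PD"]
model = TwoLevelClassifier().fit(X, y)
print(model.predict_one([0.95, 0.25]), model.predict_one([0.12, 0.18]))
```

A practical advantage of this layout is that the second-level classifier is trained only on patients, so it can use features that separate disorder types without having to separate patients from healthy people.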

Clinical decision support system for mental health
Data collected in the clinical setup through standard protocols, including interviews, measurement scales, medical history, etc., can be used to train machine learning algorithms, and these algorithms can help health professionals in predicting the correct response to the data in the identification of anxiety disorders. Furthermore, data mining can be done more easily with machine learning algorithms, and new decision rules may also be discovered. Health professionals can interact with the clinical decision support system.
A clinical decision support system can provide recommendations on drug and dose selection according to the results of pharmacogenetic testing, so that the efficacy of therapy is increased and undesirable side-effects are minimized [162]. Evidence-based clinical practice [163] is data-driven and requires efficient and effective data mining protocols that use validated assessment tools instead of clinical judgment and diagnostic impressions [164]. Pharmacogenetic-based decision support systems can optimize medical treatment for mental illness. In [165], many commercially available pharmacogenetic tools for psychiatry practice are reviewed. Logistic regression is used to examine the predictors of decisions about treatment [166]. A clinical decision support system may also be helpful for patients and clinicians by providing them with many treatment options [167]. Clinical decision support systems may also be extremely helpful in rural communities where only primary healthcare workers are available in the primary health centers. Maulik, et al. [168] proposed the Systematic Medical Appraisal, Referral, and Treatment (SMART) system for rural areas of India where a large proportion of people do not receive adequate mental healthcare. Various opportunities for digital technology in clinical decision making are explored in [169]. In our view, smarter clinical decision systems can be built using machine learning algorithms that provide good and accurate support at the community level and the personal level by predicting a personalized treatment plan for optimal treatment response. It is worth mentioning that standard rule-based classifiers, fuzzy rule-based classifiers, and different variants of decision trees must be explored further. The classifier's ability to explain its decisions based on big data needs special attention in future research so that human knowledge may also benefit from decision support systems. We expect that big data analytical tools will be applied in designing better clinical decision support systems for mental health.

Semi-automatic diagnosis and prediction of mental disorders
This kind of clinical decision support system comprises self-assessments based on some scales, interviews by a health professional, and interactive applications that can converse with the subject, which help medical practitioners form a clinical perception about the anxiety disorder and mental health.
Information on mental illness and the emotional state of the population is also vital for decision making at the government level. A decision support system for government officials can be built that issues alerts, based on social media data analysis, about the location-aware emotional state of the population. Policymakers can make new policies to uplift the mood of the population by arranging various entertainment events or by taking appropriate measures to remove the cause of emotional distress in the population. Policymakers can also study the effect of their entertainment policies by analyzing the overall effect of entertainment events on the happiness or mood of the population. Hence, there can be a new era of big data analysis that helps the general population through preventive measures and decreases the burden on medical facilities related to mental health. We did not find many papers on clinical decision support systems that can be trained on multi-modal data, such as questionnaires, measuring scales, clinical/health history, medication data, and sensor data (fMRI scans, HRV, human activity monitoring, etc.), and that provide the diagnosis, optimize the treatment, and predict the response based on whatever data is available from the patients while properly handling missing data. We expect that researchers will focus on this area in the coming years and improve the quality of clinical decision support systems.
On social media, people discuss their problems, symptoms, and depression a lot, directly in words or indirectly through hidden emotions. So textual features can serve as cues to help in the diagnosis of various types of anxiety disorder and can predict the onset of the disorders or suicide attempts and tendencies.
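As an illustration of what such textual features might look like, the sketch below computes a few simple per-message features of the kind mentioned in the reviewed studies (word counts and lexicon hits). The lexicon, the feature names, and the example message are hypothetical; real studies use validated lexicons and richer language models.

```python
import re

# Hypothetical mini-lexicon, for illustration only
NEGATIVE_WORDS = {"worried", "afraid", "panic", "hopeless", "tired"}
FIRST_PERSON = {"i", "me", "my"}


def text_features(message):
    """Return simple word-level features for one message."""
    words = re.findall(r"[a-z']+", message.lower())
    n = len(words)
    if n == 0:
        return {"word_count": 0, "avg_word_length": 0.0,
                "first_person_ratio": 0.0, "negative_word_ratio": 0.0}
    return {
        "word_count": n,
        "avg_word_length": sum(len(w) for w in words) / n,
        "first_person_ratio": sum(w in FIRST_PERSON for w in words) / n,
        "negative_word_ratio": sum(w in NEGATIVE_WORDS for w in words) / n,
    }


features = text_features("I am worried and I cannot sleep, my heart keeps racing")
print(features)
```

Feature vectors of this kind, computed per message or aggregated per user, can then be passed to any of the classifiers discussed in this review.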
Audio-based anxiety disorder detection: Different features of speech audio, such as pitch, speaking rate, articulation, and specific spectral and timing properties, can give clues about a depressed person [172]. These features of the audio signal can also be used as anxiety predictors in speech [173,174]. Hence, audio discussions/chats on social media can be analyzed to learn the emotional status of a person, and these clues may be helpful in the early detection of anxiety and depression.
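Pitch is one of the audio features named above. The sketch below estimates it with a basic autocorrelation method on a synthetic tone. This is a generic textbook technique chosen for illustration, not the feature extraction used in the cited studies, and the sampling rate and search range are assumptions.

```python
import math


def estimate_pitch_hz(samples, sample_rate, fmin=50.0, fmax=400.0):
    """Estimate the fundamental frequency from the autocorrelation peak."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    if len(samples) <= max_lag:
        raise ValueError("signal is too short for the requested pitch range")
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with a copy of itself shifted by `lag` samples
        value = sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


# Synthetic 200 Hz tone, 0.1 s at an assumed 8 kHz sampling rate
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]
pitch = estimate_pitch_hz(tone, sample_rate)
print(f"estimated pitch: {pitch:.1f} Hz")
```

Real speech is noisier than a pure tone, so practical systems usually compute pitch frame by frame and smooth the resulting track before deriving features from it.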
Video-based anxiety disorder detection: Video chats, TikTok videos, etc. on social media can also help in the detection of anxiety or depression. Using face detection algorithms, faces in the video can be detected and tracked, and facial cues such as eye movements, head movements, etc. can be used as possible predictors of stress and anxiety. Some work has been done in this area recently [78,79].
Wearable sensor-based detection: Bio-signal measurements based on wearable sensors are becoming popular these days, and many smartphone applications provide users with good analysis of their health. Human activity monitoring using the motion sensors present in wrist bands and smartwatches, along with heart rate monitoring, helps people adopt a healthy lifestyle. Human activity monitoring, heart rate variability, and sleep profiles can also provide good information about the mental state of a person. Wearable EEG measurement has great potential for detecting the emotional state of a person during certain tasks [175][176][177].
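Heart rate variability is typically summarized by a few time-domain statistics computed from the intervals between successive heartbeats (RR intervals). The sketch below computes two standard ones, SDNN and RMSSD, together with the mean heart rate. The RR values are hypothetical, and the function name is chosen for illustration.

```python
import math


def hrv_features(rr_ms):
    """Time-domain HRV features from RR intervals given in milliseconds."""
    if len(rr_ms) < 2:
        raise ValueError("at least two RR intervals are needed")
    mean_rr = sum(rr_ms) / len(rr_ms)
    # SDNN: sample standard deviation of the RR intervals
    sdnn = math.sqrt(sum((rr - mean_rr) ** 2 for rr in rr_ms) / (len(rr_ms) - 1))
    # RMSSD: root mean square of successive RR differences
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {"mean_hr_bpm": 60000.0 / mean_rr, "sdnn_ms": sdnn, "rmssd_ms": rmssd}


# Hypothetical RR intervals from a wrist-worn sensor
rr_intervals = [800, 810, 790, 800]
hrv = hrv_features(rr_intervals)
print(hrv)
```

Such features, computed over sliding windows, could serve as inputs to the classifiers discussed earlier alongside activity and sleep measures.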
A multi-modal automatic diagnostic system can be built on a combination of features extracted from text, audio, video, and wearable or smartphone sensors to predict any type of anxiety disorder without the knowledge of the subjects. Early detection of anxiety disorders by such systems can be very useful for further investigation by medical experts and for proper treatment intervention.
Another possible future direction in the coming years is the treatment of mental illness by artificial intelligence. Artificially intelligent chatbots can be installed on social media that can not only diagnose and predict mental illness, including anxiety disorders, but also intervene and provide treatment for such disorders by engaging with patients positively and providing them with therapy sessions. After the sessions, the chatbots might monitor the patients' progress and modify their strategies for optimized treatment. In many cases, social media may worsen the mental illness of patients due to negative social contact and communications. These intelligent systems can intervene in such situations by breaking the avalanche of worsening mental illness and blocking social contacts, thus putting the patients in cyber-quarantine so that they can respond positively to clinical treatment. Privacy issues may need to be tackled properly. Integrated web-based therapy [178] and web-based support system [179] are a few terms already coined in the literature.