Data Driven Decision Making for Hospitals Part 5 – Natural Language Processing

In previous posts, we’ve covered the conceptual framework of data driven decision making. In this post, we explore how to apply computing concepts to solve difficult problems in decision making. All problem solving requires data. The constraint is usually the volume of information and the time required to filter or select the data needed for the decision. Advances in Natural Language Processing (NLP) and Unstructured Text Analytics (UTA) now allow large amounts of text to be filtered in search of facts or sorted based on content. Since machines never get tired or bored, these systems can present the user with organised, relevant data for use in decision making.

The concept of computers “reading” documents is not new. For many years the topic was grouped under the broader heading of artificial intelligence. Natural Language Processing uses probabilistic models to identify the words in a sentence from the sequences of characters that make them up. Two very common uses of NLP are automatic spelling correction and grammar checking. In any language, letters occur within words, and words occur next to one another in sentences (nouns, verbs, etc.), with measurable frequencies. Algorithms have been developed that read a series of tokens (the computer representation of a word) and use probabilistic modeling to determine the most likely spelling of each token. Splitting text into tokens in this way is referred to as lexical analysis. This form of NLP does not address the issue of context. Lack of context can result in those amusing and sometimes annoying autocorrect errors that occur while texting on your mobile phone.
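A minimal sketch of frequency-based spelling correction follows. It is an illustration only, in the spirit of the well-known edit-distance approach, and not a description of any particular product. The tiny corpus and the names CORPUS, edits1, and correct are assumptions made for this example.

```python
from collections import Counter

# Tiny illustrative corpus (an assumption); a real system would count
# words from a large body of reference text.
CORPUS = "the patient was admitted to the hospital and the patient was discharged".split()
WORD_COUNTS = Counter(CORPUS)
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """Return every string that is one edit away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    swaps = [left + right[1] + right[0] + right[2:]
             for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in ALPHABET]
    inserts = [left + c + right for left, right in splits for c in ALPHABET]
    return set(deletes + swaps + replaces + inserts)


def correct(word):
    """Return the most frequent known word within one edit, else the input."""
    if word in WORD_COUNTS:
        return word
    candidates = [w for w in edits1(word) if w in WORD_COUNTS]
    if not candidates:
        return word
    return max(candidates, key=lambda w: WORD_COUNTS[w])


print(correct("hospitl"))
print(correct("teh"))
```

Because the choice depends only on word frequency and not on the surrounding sentence, a sketch like this can confidently pick the wrong word, which is the autocorrect behaviour described above.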

Establishing context in machines that “read” requires algorithms that parse sentences. Parsing is the computational equivalent of diagramming a sentence. This includes extracting noun-verb pairs, subjects, and predicates in order to expose the relationships within a sentence. A parsing program in NLP produces its list of nouns, verbs, clauses, etc. based on a probabilistic model of the word combinations found in a corpus of documents. The tokenized and parsed document then undergoes either fact extraction or topic modeling in order to gain insights about its content.
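One simple way to picture a probabilistic model of word combinations is to count adjacent word pairs (bigrams) in a set of reference documents. The sketch below assumes a three-sentence corpus invented for illustration; the names REFERENCE_DOCS, bigram_model, and next_word_probability are hypothetical, and real parsers use far richer models.

```python
from collections import Counter, defaultdict

# Hypothetical reference documents; a real corpus would be much larger.
REFERENCE_DOCS = [
    "the nurse checked the chart",
    "the nurse checked the patient",
    "the doctor reviewed the chart",
]


def bigram_model(docs):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for doc in docs:
        tokens = doc.lower().split()
        for first, second in zip(tokens, tokens[1:]):
            counts[first][second] += 1
    return counts


def next_word_probability(counts, first, second):
    """Estimate P(second | first) from the observed pair counts."""
    total = sum(counts[first].values())
    return counts[first][second] / total if total else 0.0


model = bigram_model(REFERENCE_DOCS)
print(next_word_probability(model, "the", "nurse"))
print(next_word_probability(model, "nurse", "checked"))
```

In this toy corpus, “nurse” is always followed by “checked”, so that pairing receives the highest possible estimate, while a pairing never seen in the corpus receives zero.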

The following diagram illustrates the flow of a parsing process.

Figure: Sentence Parsing Algorithm

The challenge in NLP is establishing the context of the document being analyzed. This requires a high level of programming sophistication. Consider the following news headlines:

  • “Kids Make Nutritious Snacks”
  • “Dealers will Hear Car Talk at Noon”
  • “Drunk gets Nine Months in Violin Case”
  • “Red Tape Holds Up New Bridge”

These headlines are difficult for sentence parsing because each one can be interpreted in more than one way.

Algorithms for context are built on probabilistic models that use a series of reference documents to determine frequency distributions of word and phrase pairings. Computers and algorithms don’t take cognitive shortcuts. Humans have developed processes for filtering information and applying context that do not require the explicit instructions a computer needs. Conversely, humans will make errors by emphasizing concepts rather than details when viewing a list of activities.

This can be illustrated as:

Humans will read the above labels and often fail to notice the error in the phrases. This cognitive shortcut is absent in machines. The advantage of using the machine is that these types of errors won’t occur. It’s also why grammar checkers find double-typed words that a human misses, and why “Kids Make Nutritious Snacks” makes perfect sense to the human and not to the machine.
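As a small illustration of why a machine catches double-typed words, the check can be written as a single pattern that looks at every token without skipping ahead. The sample sentence and the function name find_repeated_words are made up for this sketch.

```python
import re

# A word, whitespace (including a line break), then the same word again.
REPEATED_WORD = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)


def find_repeated_words(text):
    """Return each word that is immediately repeated in the text."""
    return [match.group(1).lower() for match in REPEATED_WORD.finditer(text)]


# Hypothetical sample: the repeat is split across a line break,
# which is where human readers most often miss it.
sample = "Please review the\nthe discharge summary."
print(find_repeated_words(sample))
```

The pattern applies the same test to every pair of adjacent words, so it has no way to “read past” the repeat as a person skimming for meaning would.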

There are a number of other interesting challenges in using NLP. Consider the issue of reference resolution. Take the statement “John is a large man. He should play football.” The system must be able to associate “He” with “John”. The rules of grammar help to determine noun-verb pairs as well as the distance between candidate noun pairs, leading to a probabilistic determination of the co-reference.
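A toy version of reference resolution can be sketched with a distance rule: link each pronoun to the nearest preceding name that agrees with it. The lookup tables NAME_GENDER and PRONOUN_GENDER and the function resolve_pronouns are assumptions for illustration; production systems learn these associations statistically rather than from hand-written tables.

```python
# Hypothetical lookup tables; a real system would learn these from data.
NAME_GENDER = {"john": "male", "mary": "female"}
PRONOUN_GENDER = {"he": "male", "him": "male", "she": "female", "her": "female"}


def resolve_pronouns(text):
    """Return (pronoun, antecedent) pairs using the nearest agreeing name."""
    tokens = [t.strip(".,!?\"").lower() for t in text.split()]
    links = []
    for i, token in enumerate(tokens):
        gender = PRONOUN_GENDER.get(token)
        if gender is None:
            continue
        # Walk backwards: the closest agreeing name is treated as the
        # most probable antecedent in this simplified rule.
        for j in range(i - 1, -1, -1):
            if NAME_GENDER.get(tokens[j]) == gender:
                links.append((token, tokens[j]))
                break
    return links


print(resolve_pronouns("John is a large man. He should play football."))
```

The rule breaks down quickly, for example when two people of the same gender are mentioned, which is why real systems weigh several grammatical cues rather than distance alone.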

The tools of NLP can be used to apply context. Using Unstructured Text Analytics (UTA), it becomes possible to extract relevant information from unstructured text. This field of data science can significantly enhance decision making. UTA models built using data from a wide variety of sources can filter and extract the facts relevant to a decision. Combining UTA with structured data analytics further refines the user’s ability to target markets, leading to an understanding not only of “What has happened?” but also of “Why did it happen?”

The process of developing a UTA model parallels the critical thinking process as shown in the table below.

Step  Critical Thinking                                   UTA
1     Recognize the problem and break it down to subsets  Objective statement
2     Prioritize importance of solving subset problems    Develop dictionaries and rules based on objective statement
3     Collect data                                        Input text files
4     Examine data                                        Parse text
5     Examine relationships                               Create annotations (rules, dictionaries)
6     Draw conclusions                                    Display output
Parallels between Critical Thinking and UTA Processes

More advanced models for determining content and context without using rules are based on converting text to vectors. A vector has both direction and magnitude, so a word represented as a vector can be located in space relative to another word. If all the word vectors start from the same origin (for example, the point (0,0) in two dimensions), the similarity between two words or groups of words can be measured by the angle between their vectors.
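The angle comparison can be sketched in a few lines. The two-dimensional vectors below are invented for illustration (real word vectors typically have hundreds of dimensions), and the names cosine_similarity and VECTORS are assumptions for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors that share an origin."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical 2-D word vectors, chosen by hand for illustration.
VECTORS = {
    "nurse": (0.9, 0.8),
    "doctor": (0.8, 0.9),
    "invoice": (0.9, -0.7),
}

print(cosine_similarity(VECTORS["nurse"], VECTORS["doctor"]))
print(cosine_similarity(VECTORS["nurse"], VECTORS["invoice"]))
```

A value near 1 means the two vectors point in almost the same direction, while a value near 0 means they are close to perpendicular. The length of each vector does not affect the result, only its direction.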

Words that are used in similar ways have vectors separated by small angles, which is measured using cosine similarity. This is the basis of algorithms such as Word2Vec, which uses a shallow neural network to learn word vectors by predicting words from their surrounding context. In order to gain compute efficiency and reduce noise, a number of text pre-processing steps may be applied, such as removing irrelevant terms (stop-words), converting terms to lemmas, or dropping pronouns or verbs, depending on the intent.
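The pre-processing steps can be sketched as follows. The stop-word set and the lemma table are small hand-made assumptions for this example; in practice these would come from an NLP library or a curated word list.

```python
# Hypothetical stop-word list and lemma table, kept tiny for illustration.
STOP_WORDS = {"the", "a", "an", "of", "to", "was", "were", "is", "and", "in"}
LEMMAS = {"admitted": "admit", "patients": "patient", "beds": "bed"}


def preprocess(text):
    """Lowercase, strip punctuation, drop stop-words, and map terms to lemmas."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    return [LEMMAS.get(t, t) for t in tokens if t and t not in STOP_WORDS]


print(preprocess("The patients were admitted to the ward."))
```

What counts as noise depends on the question being asked, so lists like these would normally be adjusted for each analysis rather than fixed once.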

Bidirectional Encoder Representations from Transformers (BERT) has gained in popularity. This method builds its vectors with a Transformer neural network, which uses attention over the words on both sides of each term to capture context, and the resulting vectors can be used for classification. In these models, a series of vectors (transformed data) is derived from training documents (which provide negative and positive relationships) and compared to inputs. The system then determines which relationships are closest to the positive learned (trained) documents.

Other approaches to topic modelling and document classification include Latent Dirichlet Allocation (LDA), non-negative matrix factorization, and naive Bayes classifiers.

In summary, critical thinking requires a problem statement that is characterized using clearly defined terms. This provides the context and specificity needed for UTA. UTA must be provided with dictionaries (simply lists of terms) that provide context to the question. Breaking the primary question or decision down into a set of sub-questions follows the same pattern: rules that combine terms in UTA are analogous to the subsets of questions in critical thinking.

Data collection is required for both processes, but the volume of data collected can now be substantially expanded, since UTA provides a methodology for separating the signal from the noise. Relationships are developed as a set of rules in UTA by combining terms in a dictionary with other characteristics of grammar, resulting in an annotation when the conditions are met. Documents can be classified and relationships can be explored using topic modelling systems.
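To make the idea of dictionaries and rules concrete, here is a minimal sketch of a rule that combines two dictionaries to produce an annotation. The dictionary contents, the labels, and the function annotate are all hypothetical and are not taken from NoviLens or any specific UTA product.

```python
import re

# Hypothetical dictionaries: each is simply a list of terms.
DICTIONARIES = {
    "condition": {"infection", "sepsis", "pneumonia"},
    "negation": {"no", "denies", "without"},
}


def annotate(sentence, window=3):
    """Annotate each condition term, flagging it as absent when a
    negation term appears within `window` words before it."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    annotations = []
    for i, token in enumerate(tokens):
        if token not in DICTIONARIES["condition"]:
            continue
        preceding = tokens[max(0, i - window):i]
        negated = any(t in DICTIONARIES["negation"] for t in preceding)
        label = "condition_absent" if negated else "condition_present"
        annotations.append({"term": token, "label": label})
    return annotations


print(annotate("Patient shows signs of pneumonia."))
print(annotate("No evidence of infection."))
```

The rule fires only when its conditions are met, so a sentence with no dictionary terms yields no annotations. Changing the question being asked means changing the dictionaries and rules, not the surrounding code.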

NoviLens is an appliance for data driven decision making. NoviLens handles the tedium of the computing components of the process and frees up the decision maker to focus on their problem solving tasks.

Visit to learn more about NoviLens, and join our email list to get updates on new features and case studies. If you’d like to learn how NoviLens can help your organization, contact us for a 30 minute demo at