Datafying Vision Docs

One tool I would use in coaching engagements is the vision doc: a series of thought exercises to clarify values (values are the foundation for your goals). In these docs, you start by exploring the stories you tell yourself about a facet of your life: career, physical health, mental health, financial health, etc. With a better understanding of these stories, you then paint a picture of the ideal future. Finally, you explore why you want this future and strategize around how to get it. At the end of the exercise, you have a Word document. A coach can review it and give you feedback, ideas, and challenges. But we can also turn that Word doc into data and use that data to identify patterns that may not have been evident. The data helps us ask better questions and identify anomalies. It is in those anomalies that we learn, but most anomalies are difficult for people to recognize.

This is an example of the start of a vision doc. It’s my vision doc that I created several years ago. I started with the stories I was telling myself that were really holding me back. Don’t get me wrong, I still struggle with some of them… but putting them on the table helped me create the empowering stories that I continuously tell myself today.

You can explore the rest of the document here. Now, let’s turn it into data to find those hidden patterns or questions that I may have missed. As you read, remember that one of the strongest powers of data is generating instances where you say “hmm, that doesn’t make sense” or “umm, that’s not what I expected.” When that happens, there’s an opportunity for learning. Learning = growth. And accelerated personal growth is the whole point of a coaching engagement.

# Revealing Sentiments & Emotions

Sentiment analysis isn’t new, but it has become especially powerful recently. Matt Jockers created a really brilliant R package, Syuzhet, that will help us datafy the vision doc. Let us start with that first block pictured above: the Limiting Beliefs.

The get_sentiment function will create a sentiment score for each sentence in the text. We have 13 sentences, so we get 13 values. Negative values indicate negative sentiment, positive values positive sentiment.
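To make the idea of one score per sentence concrete, here is a minimal sketch in Python. It is not Syuzhet and not the lexicon the package uses: the `TOY_LEXICON` values, the naive sentence splitter, and the function names are all assumptions made up for illustration. It only shows the general shape of a word-lexicon approach, where each known word contributes a value and the sentence score is the sum.

```python
import re

# Hypothetical toy lexicon (word -> valence). These values are invented for
# illustration and are NOT taken from the Syuzhet lexicon.
TOY_LEXICON = {
    "support": 0.5,
    "fulfilled": 0.75,
    "never": -0.25,
    "late": -0.5,
}

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def sentence_score(sentence, lexicon=TOY_LEXICON):
    """Sum the valence of every known word; unknown words count as 0."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(lexicon.get(w, 0.0) for w in words)

def score_text(text, lexicon=TOY_LEXICON):
    """Return one score per sentence, in document order."""
    return [sentence_score(s, lexicon) for s in split_sentences(text)]

text = (
    "I never had a plan for my career and now it's too late. "
    "I go to work to support my family, not to be fulfilled."
)
scores = score_text(text)
mean_score = sum(scores) / len(scores)
```

Note that a plain word-sum like this ignores the “not” in the second sentence, so that sentence comes out positive in the toy example as well.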

I would anticipate that the overall sentiment is negative. After all, these are stories in my head that are holding me back. The average (mean) is, indeed, -0.12. It is not perfect, though. Oddly, the third sentence has a positive sentiment (“I go to work to support my family, not to be fulfilled”). By removing words and rerunning, I can see that the algorithm behind this output strongly associates the words “support” and “fulfilled” with positivity. To me, that indicates this data is directional, not prescriptive.

We can go a level deeper with the get_nrc_sentiment function and look at the emotions associated with the text (overall and in each sentence). There are eight emotions defined in Saif Mohammad’s NRC Emotion lexicon: anger, fear, anticipation, trust, surprise, sadness, joy, and disgust.
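The emotion step can be pictured as counting, for each sentence, how many words are associated with each emotion. The sketch below is a simplified Python stand-in, not the NRC lexicon itself: the word-to-emotion associations in `TOY_EMOTION_LEXICON` are hypothetical and chosen only so the example has something to count.

```python
import re

EMOTIONS = [
    "anger", "anticipation", "disgust", "fear",
    "joy", "sadness", "surprise", "trust",
]

# Hypothetical associations for illustration only; the real NRC lexicon
# covers thousands of words and may tag these words differently.
TOY_EMOTION_LEXICON = {
    "plan": {"anticipation"},
    "start": {"anticipation"},
    "late": {"sadness"},
    "family": {"trust", "joy"},
}

def emotion_counts(sentence, lexicon=TOY_EMOTION_LEXICON):
    """Count how many words in the sentence are tagged with each emotion."""
    counts = {emotion: 0 for emotion in EMOTIONS}
    for word in re.findall(r"[a-z']+", sentence.lower()):
        for emotion in lexicon.get(word, ()):
            counts[emotion] += 1
    return counts

counts = emotion_counts("I never had a plan for my career and now it's too late")
```

Summing these per-sentence counts over all sentences gives the overall emotion profile of a block of text.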

This mostly makes sense. Curiously, ‘anticipation’ is more prominent than I would expect. Bullet point 1 is a driver (“I never had a plan for my career and now it’s too late”). Words like ‘plan’ and ‘start’ seem to have a strong association with ‘anticipation’. But words alone neglect the broader context. As such, again, this data is directional, not prescriptive.

# Sentiment Through the Entire Document

When we segment the entire vision doc, there are 77 sentences. There are 38 mostly positive sentences, 31 neutral sentences, and 8 mostly negative sentences. That varies throughout the doc. We can plot the trends as we go from ‘stories you tell yourself’ to ‘painting a future picture’ to ‘understanding why’ and ‘developing a strategy.’
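A tally like this can be produced by bucketing each sentence score. The sketch below assumes the simplest possible rule (above zero is positive, below zero is negative, zero is neutral); the article does not say exactly how “mostly positive” was defined, so the `tol` parameter and the function names are assumptions, and the example scores are made up.

```python
from collections import Counter

def label(score, tol=0.0):
    """Bucket a sentence score; 'tol' widens the neutral band (assumed rule)."""
    if score > tol:
        return "positive"
    if score < -tol:
        return "negative"
    return "neutral"

def tally(scores, tol=0.0):
    """Count sentences per bucket, always returning all three keys."""
    counts = Counter(label(s, tol) for s in scores)
    return {k: counts.get(k, 0) for k in ("positive", "neutral", "negative")}

# Made-up scores, not the values from the vision doc.
example_scores = [0.5, 0.0, -0.25, 1.0, 0.0]
summary = tally(example_scores)
```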

In the plot below (created using simple_plot), 0.0 indicates neutral sentiment. There are a few types of smoothing we can use to identify trends; I think the rolling mean and loess smooth curves make the most sense. In the segment that explores the future pain I am trying to avoid (called ‘Avoid Pain’), the rolling mean plummets while the loess curve stagnates. One way to use this: segments with big swings up or down are probably areas you want to leverage when crafting goals and behaviors, because they contain a lot of energy.
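For intuition about the rolling mean, here is a small Python sketch of a centered moving average. It is a generic version written for illustration, not the smoothing that simple_plot performs internally; the window size and the shrinking-window edge handling are assumptions, and the scores are made up.

```python
def rolling_mean(values, window=3):
    """Centered rolling mean; the window shrinks near the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Made-up sentence scores with a dip in the middle.
scores = [1.0, 1.0, -2.0, -2.0, 1.0, 1.0]
smoothed = rolling_mean(scores, window=3)
```

A wider window flattens short dips and keeps only the longer trends, which is one reason two smoothers can disagree about the same stretch of a document.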

# Emotions Through the Entire Document

Another way to quantify this is to look at emotional ambiguity through the doc. Emotional entropy can be thought of as a measure of unpredictability and surprise based on the consistency or inconsistency of the emotional language in a given sentence. Cognitive dissonance slows us down. If we can identify it in our ideas… it is probably another good place to leverage when crafting goals and behaviors. In the plot below, the metric is normalized but, unlike above, the scale here is high to low; +1 indicates high entropy and -1 indicates low entropy. Here, emotional entropy appears high in the Limiting Beliefs section as well as the end of the Ideal Future section.
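One way to make “emotional entropy” concrete is Shannon entropy over the emotion counts of a sentence: if every emotional word points the same way, entropy is zero; if the words are split evenly across several emotions, entropy is high. The Python sketch below shows that idea plus a simple rescale to the -1 to +1 range. It is an assumption about the general technique, and may not match how the metric in the plot was actually computed.

```python
import math

def shannon_entropy(counts):
    """Shannon entropy (in bits) of a list of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0  # no emotional words: treat as no ambiguity (assumption)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)

def rescale(values):
    """Min-max rescale to [-1, 1]; constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [2 * (v - lo) / (hi - lo) - 1 for v in values]

# Made-up emotion counts for two hypothetical sentences.
consistent = shannon_entropy([4, 0, 0, 0])  # all words share one emotion
mixed = shannon_entropy([1, 1, 1, 1])       # evenly split across four emotions
normalized = rescale([0.0, 1.0, 2.0])
```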

Let’s take a look at the sentences that contain the most mixed messages:

I cringe at the existential dread baked into those sentences. This rings a bell for me. There is a mental model called inverse thinking (or inversion). It’s a technique where you take a question or idea, flip it on its head, and think about it from that inverted perspective. For example, if I were creating expectations for employees, I would start with “what makes a great team member?” Another way to approach it is to ask “what makes an awful team member?” Whatever I came up with, I would invert those items and add them to my list of characteristics for great team members.

Looking at those sentences above that contain the highest emotional entropy, I know I should formulate behaviors as soon as possible that represent the inverse of those things.

# Emotions in the Entire Document

We really want this document to be a repository of motivation, not dread. As such, I would expect that the emotions embedded in the doc are mostly positive and inspirational. The bar chart below shows me that, overall, this is true. ‘Trust’ and ‘anticipation’ create the lion’s share of emotion. For me, trust is synonymous with human connection. When I have a great, inspiring discussion with someone it makes my day.

# That’s It?

Nope. While I do find the above helpful, the most helpful analytics come from combining metrics or documents. For every sentence, let’s plot its sentiment (x-axis) against its entropy (y-axis).

Check out the orange points:

  • Highly positive, no ambiguity = “I love producing beautiful things – tools, newsletters, blogs, presentations, videos, groups of the right people.”
  • Highly positive, significant ambiguity = “Writing helps me think through the problems we are trying to solve.”
  • Highly negative, no ambiguity = “I hate recurring, worthless meetings.”
  • Highly negative, significant ambiguity = “I suck at managing people, which makes me a bad leader.”

Take that last one as an example. It appears there is some internal struggle happening. There is a series of next questions to explore. Bad leader compared to what or whom? What metrics are you using to arrive at that conclusion? What have you done already to become a better manager of people? Have you read Jim Collins or Reed Hastings lately? Do you even care – do you really want to change?

The reason I love this approach is that, as a teacher or coach, I have my biases. I could review your document and give feedback. However, the things I call out may or may not be the best things for you to spend more time on. This more quantitative approach (although still directional) uses a framework that is consistent from doc to doc and person to person.
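Part of what makes the framework consistent is that the four corners of the sentiment-versus-entropy plot can be picked out mechanically. The Python sketch below shows one way to do that, assuming both metrics have already been rescaled to the -1 to +1 range. The cut-offs, the `quadrant` function, and the example points are all hypothetical and are not values from my vision doc.

```python
def quadrant(sentiment, entropy, s_cut=0.0, e_cut=0.0):
    """Label a sentence by which corner of the plot it falls in."""
    tone = "positive" if sentiment >= s_cut else "negative"
    clarity = "ambiguous" if entropy >= e_cut else "unambiguous"
    return f"{tone}, {clarity}"

# Hypothetical (sentiment, entropy) pairs, both assumed to lie in [-1, 1].
points = {
    "sentence A": (0.9, -0.8),
    "sentence B": (0.8, 0.7),
    "sentence C": (-0.9, -0.9),
    "sentence D": (-0.7, 0.8),
}
labels = {name: quadrant(s, e) for name, (s, e) in points.items()}

# Sentences worth a follow-up conversation: negative tone, mixed emotions.
to_explore = [name for name, lab in labels.items() if lab == "negative, ambiguous"]
```

Because the same rule is applied to every sentence, two different documents (or two different people) get sorted in the same way.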

# Conclusion

Datafying a single vision doc is neat, but there are likely more insights when you datafy multiple vision docs together. Although it’s not a replacement for a great Socratic coach, it’s a powerful tool that can yield more questions. And asking the right questions is the precursor to setting and achieving your objectives. While robot AI Socrates may be further down the road (although maybe closer than you think), ML today can help us ask better questions that can propel us to the next level. We simply need to be creative in how we connect the purpose of coaching with modern data science methods.

Code: Check out the gist used to create the visualization in this article.