life2vec

Abstract, or what is our paper about?

We represent human lives in a way that shares structural similarity with language, and we exploit this similarity to adapt natural language processing techniques to examine the evolution and predictability of human lives based on detailed event sequences. We do this by drawing on a comprehensive registry dataset, which is available for Denmark across several years and which includes information about life-events related to health, education, occupation, income, address and working hours, recorded with day-to-day resolution.

Our model, life2vec, allows us to predict diverse outcomes ranging from early mortality to personality nuances. Using methods for interpreting deep learning models, we probe the algorithm to understand the factors that enable our predictions. Our framework allows researchers to discover potential mechanisms that impact life outcomes as well as the associated possibilities for personalized interventions.

Is your algorithm really able to predict people's day of death, age at death, or anything like that?

No! Let us explain. First, here is what the widely reported figure of 78.8% accuracy actually means.

We look at a subset of individuals aged between 35 and 65. We chose this group because it is particularly challenging to make survival predictions in this cohort: the vast majority of individuals who pass away are older, and young people have an extremely low probability of dying.
We split that dataset into two parts:
- Training data: Used to teach the model which correlations are in the data. The training data is the vast majority of the data.
- Test data: Used to understand how well the model is doing.
We then train the model on the training data.
In the training data, the model learns from information in the years 2008-2015 to tell the difference between actual life/death outcomes for people in the training data during 2016-2020.
The trained model is then run on the test data (100,000 individuals). Here the model sees the 2008-2015 data and makes a prediction. We then check against the actual outcomes to see whether it got it right.
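
The procedure above can be sketched in a few lines of Python. This is only an illustration, not the code used in the study: the `Person` record, the `(year, label)` event format, the function names, and the 20% test fraction are all assumptions made for the example. The only details taken from the description are the 2008-2015 input window, the 2016-2020 outcome window, and the idea of holding out test data.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Person:
    # Hypothetical record format, invented for this sketch.
    person_id: int
    events: List[Tuple[int, str]]   # (year, event label)
    death_year: Optional[int]       # None if the person did not die


def make_example(person, input_years=(2008, 2015), outcome_years=(2016, 2020)):
    """Return (input events, label) using only events from the input window."""
    first, last = input_years
    inputs = [label for year, label in person.events if first <= year <= last]
    died = (
        person.death_year is not None
        and outcome_years[0] <= person.death_year <= outcome_years[1]
    )
    return inputs, int(died)


def split_train_test(people, test_fraction=0.2, seed=0):
    """Shuffle and split into (train, test); test_fraction is an assumed value."""
    rng = random.Random(seed)
    shuffled = list(people)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


people = [Person(i, [(2010, "new_job"), (2017, "moved")], None) for i in range(10)]
train, test = split_train_test(people)
example_inputs, example_label = make_example(
    Person(99, [(2010, "new_job"), (2017, "moved")], 2018)
)
```

The point of `make_example` is that events after 2015 never reach the model as input; they only determine the label that the prediction is checked against.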

So far so good.

There is one final wrinkle. Accuracy is defined as (# correct guesses)/(total guesses). Because our cohort is very young, almost everyone survives (more than 95%).
- This means that if we created an algorithm that always predicted “survive”, it would get a very high accuracy (over 95%).
- To address this issue, we balance the dataset, giving the equivalent of 50,000 individuals with a survival outcome and 50,000 with a death outcome. In this balanced dataset a random guess would get 50% accuracy.
- When we run our algorithm on that balanced dataset, we get 78.8% accuracy.
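
To make the arithmetic concrete, here is a small Python sketch. The counts are illustrative (96 survivors out of 100 stands in for "more than 95%"), and the way the balanced set is built here is an assumption for the example, not a description of how the study balances its data.

```python
def accuracy(predictions, outcomes):
    """(# correct guesses) / (total guesses)."""
    correct = sum(p == o for p, o in zip(predictions, outcomes))
    return correct / len(outcomes)


# 0 = survive, 1 = death. Illustrative imbalanced cohort: 96 survive, 4 die.
imbalanced_outcomes = [0] * 96 + [1] * 4
always_survive = [0] * len(imbalanced_outcomes)
imbalanced_accuracy = accuracy(always_survive, imbalanced_outcomes)

# Balanced version: equal numbers of each outcome.
balanced_outcomes = [0] * 50 + [1] * 50
always_survive_balanced = [0] * len(balanced_outcomes)
balanced_accuracy = accuracy(always_survive_balanced, balanced_outcomes)
```

Here `imbalanced_accuracy` is 0.96 even though the "model" has learned nothing, while `balanced_accuracy` is 0.5. On the balanced data, 50% is the baseline to beat, which is the context for reading the 78.8% figure.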

Some important consequences.

We do not make predictions for everyone in Denmark, only the test data.
We are not predicting how long people will live. Rather, we test mortality over the next 4 years for a young cohort of individuals. A key theme of the paper is the factors that contribute to such early mortality.

Can you download the software and try this out?

No! The dataset and model contain sensitive data and both are safely stored at Statistics Denmark. They cannot be accessed via the internet. Some follow-ups:

There are websites that claim to implement life2vec (e.g. deathcalculator[dot]ai, life2vec[dot]io, life2vecai[dot]com). Those are fraudulent and have nothing to do with us and our work, so be careful.
We are working on ways to share the model with the wider research communities, but as LLMs are known to potentially leak data, we have to do further research before we can do this.
We have not yet studied how our results generalize to other countries/contexts, but are actively investigating this topic.

But if you're not obsessed with predicting death, what is the aim of the study as you see it?

Transformer models (the technology we use) were developed to find patterns in language. By structuring lives as sequences, we can identify very complicated patterns in life-events. Just as the ordering of words is very important in language, so is the ordering of events in human lives. In a US context, for example, it matters whether you get a job with health care and then get sick, or get sick first without having health care.
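
A toy Python sketch shows why treating a life as a sequence matters. The event labels and the `to_token_ids` helper are hypothetical and invented for this example; they are not the vocabulary or tokenization used in life2vec.

```python
from collections import Counter

# Two hypothetical lives containing the same events in a different order.
life_a = ["job_with_health_care", "illness"]
life_b = ["illness", "job_with_health_care"]

# Counting events ignores order, so the two lives look identical.
same_event_counts = Counter(life_a) == Counter(life_b)

# As sequences they differ, which is what a sequence model can pick up on.
same_sequence = life_a == life_b


def to_token_ids(events, vocab):
    """Map each event label to an integer id, extending vocab as needed."""
    return [vocab.setdefault(event, len(vocab)) for event in events]


vocab = {}
ids_a = to_token_ids(life_a, vocab)
ids_b = to_token_ids(life_b, vocab)
```

In this sketch `same_event_counts` is `True` and `same_sequence` is `False`: the two lives contain the same events, but the token sequences `ids_a` and `ids_b` are in opposite order.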

Those patterns and relationships between life-events are encoded in mathematical spaces (called embedding spaces). We learn the structure of those spaces by processing life-sequences.

We know it may sound strange, but we chose the topic of predicting death because it is a problem so many people have worked on (insurance companies, for example). That means that we know more about what to expect, and that if our model does very well, it does so in competition with many other algorithms.

The arguments in the paper come in the following order:

First, we show that the algorithm is very good at making diverse predictions (death, personality).
Since it is very good, we know that it is capturing interesting patterns in the data.
That means it makes sense to study the embedding spaces that capture those patterns in the data.

It is this last part that we are really excited about: working to understand what new things we can learn about human beings, human behavior, and societies based on the structure of the embedding spaces.

But having access to this information could be dangerous. What about discrimination, for example being rejected for a bank loan or insurance? Have you considered these ethical implications?

Yes. This work should never be used for insurance, for example. The whole idea of insurance is based on sharing risk across many people. If a million people get together, they don’t know who is going to be sick, so everyone can pay a small amount into a large shared pool, and the few who are unlucky enough to get seriously ill early on can draw on the pool to get help. Since we don’t know who is going to get sick, it’s a good deal. If we could tell who would get sick ahead of time, it would undermine the whole idea of insurance.

There are many other reasons our algorithm should not be used yet. For example, there are many issues related to privacy and bias that need to be worked out before it is used in practice.

That said, there are many places where this algorithm could be very helpful when applied (after additional work). It is most clear within healthcare and medicine. Earlier diagnoses could lessen the severity of many diseases.

There are also some areas in between healthcare and insurance where we are less sure. For those areas we need to have a public discussion about the use of such an algorithm. Should we identify people who are predicted to struggle in school to help them? Maybe it’s a good idea, maybe not – honest people could disagree.

What we hope is that our algorithm helps start a discussion about these technologies and how we should use them. Predictions like these are already happening inside large tech companies. There are reasons why Meta, Google, Microsoft, etc. collect so much data about us. But right now those predictions are happening behind closed doors with the intention of predicting (and sometimes manipulating) our behavior. For now, it’s mostly to make us stare longer at our screens or to sell products, but that will likely change. Either way, such predictions are happening and will likely become more and more common.

This is why we wanted to create something open and public to bring these topics out of the secret rooms inside billion-dollar corporations – to start a discussion around prediction of human behavior. Probably mostly within science to begin with, but hopefully soon in society more generally.

Using Sequences of Life-events to Predict Human Lives^[1]
