
What Should I Know About… Machine Learning?


In this fourth edition of the What Should I Know About… series we’re considering machine learning.

For background reading, we’d recommend the previous post on Artificial Intelligence (AI), but the important thing to know is that machine learning is a subcategory of AI concerned with making decisions from large amounts of data.

What is the goal?

In short, machine learning is the latest step on a long path of humans using machines to help decide whether something should be done or not.

We have always adopted specific rules for whether something should be done or not. For example, at theme parks you’d see a sign saying “You must be this tall to go on this ride”. Traditionally there would be a human attendant doing an eyeball test to decide Yes or No on whether someone could go on the ride. Now it is possible for a machine to detect the height of people standing in the queue.

This concept then developed in order to make simple decisions based purely on digital information that otherwise would have been difficult to collect. For example, by collecting the digital information associated with a purchase, you can decide on which types of adverts to suggest next.

In the case of an online purchase of baby clothes, you could infer that this person is a new parent, and as such suggest other services related to parenting.

The next phase

As we’ve discussed in the previous post, the increase in the number of computers has led to an even bigger increase in the volume of data that exists in the world.

This has meant two things:

  • More information to base decisions on
  • A need for more advanced ways to interpret this data

What does that mean?

Well, in our example of the baby clothes purchase, you can imagine a marketing department of a parental commerce store that recently went online (i.e. digitising their information) having a meeting to discuss the strategies they can now deploy with this wealth of information.

They suggest things such as sending campaigns to new customers based on when they buy a baby stroller, as well as tips for schooling based on the age range of the clothes being bought.

The problem, though, is that life is more complex than a simple statement of “if someone buys a baby stroller then they are a new mother wanting these other products”. This leads to predictive analytics failures, which can leave those making the predictions in a sticky situation.

There could be many reasons why a person is buying baby clothes: perhaps they are a relative getting clothes for their niece, or a business doing market research, or a designer looking for inspiration.

Whilst these might sound a little far-fetched, the point is that humans are looking at the data that exists and trying to come up with combinations of scenarios that allow them to make an accurate decision.

The fundamental issue is that humans’ cognitive power to codify the complexities of real life is limited, and doing so takes a lot of time.

In this example, the marketing department might go back and say that if the customer’s age is over 45 we’ll assume that’s a relative buying the product and as such offer a different version of an advert, but that again could cause issues.
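To make this concrete, here is a minimal sketch of what such hand-written rules might look like in code. The field names, advert labels and the exact shape of the rules are our own illustrative assumptions rather than anything a real store uses.

```python
# Hypothetical hand-written rules of the kind the marketing team might agree on.
# The field names, the age cut-off of 45 and the advert labels are illustrative only.

def choose_advert(purchase: dict) -> str:
    """Pick an advert using fixed, human-authored rules."""
    if purchase["item"] == "baby stroller":
        if purchase["customer_age"] > 45:
            # Rule added later: assume an older buyer is a relative.
            return "gift ideas"
        return "new parent essentials"
    return "general catalogue"


# The rule only sees age, so a 47-year-old first-time parent is treated
# as a relative, and a 30-year-old designer is treated as a new parent.
print(choose_advert({"item": "baby stroller", "customer_age": 47}))
print(choose_advert({"item": "baby stroller", "customer_age": 30}))
```

Every exception the team thinks of means another branch, and each branch brings its own wrong guesses.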

More data

Whilst the task of deciding which advert to serve as a result of a baby product purchase might seem quite low risk, there are other areas where the stakes are higher.

A classic example of this is: “Should this person get a loan or not?”

Historically this was done by humans (i.e. bank managers) who would meet you, understand what you want the loan for and then make a judgement from there. There would often be information collected (income, outgoings etc.) which used to be calculated by people, but then got crunched by computers. This reduced the time it took to approve a loan from the days required for the bank clerk to make her calculations, to seconds via a computer calculation.

Whilst making a decision based on income and outgoings is OK, there’s a lot more information on people which can inform whether they should get a loan or not.

With the advent of more data being collected and digitised we’re increasingly seeing more information that can be consulted in order to reach a decision.

Let’s say, for example, that in applying for a loan you also sign up from your phone and as a result submit lots of data based on your call usage. This can be used as the basis for a decision on whether you should receive a loan.

Drowning in data

What we’ve got here is lots more data, which humans may look at and use to decide on certain rules.

For example, if someone makes calls before 7am this shows that they get up early and means they’re a hard worker and so we should give them a loan.

The problem with this approach is that there are almost infinite possibilities and combinations of interpretation. The chances of landing on one which turns out to be predictively accurate are increasingly slim.
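A rough bit of arithmetic shows how quickly the possibilities pile up. The sketch below assumes a simplified world in which every signal is a yes/no fact (such as "makes calls before 7am") and every rule is a combination of such facts; real data is messier, so treat the numbers as a lower bound for illustration only.

```python
def count_candidate_rules(num_signals: int) -> int:
    """Count rules of the form 'signal A is yes AND signal B is no AND ...'.

    Each yes/no signal can be required to be yes, required to be no,
    or left out of the rule, giving 3 choices per signal. We subtract 1
    for the empty rule that mentions no signal at all.
    """
    return 3 ** num_signals - 1


for n in (5, 10, 20):
    print(n, "signals ->", count_candidate_rules(n), "possible rules")
```

With only 10 yes/no signals there are already 59,048 candidate rules for a human to consider, and every extra signal triples the total.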

The current paradigm of trying to use human cognitive power to solve these problems is at its limit.

Enter the machines

This is where “machine learning” comes in.

It essentially takes an ocean of data that exists and works through it automatically in order to make decisions.

The innovation, though, is that it needs only minimal direction in order to make its decision.

Machine learning

A core aspect is for the computer (i.e. machine) to learn about the data it is making a decision from. This takes two forms:

Supervised learning

This is where historical examples with known outcomes are given to a machine so that it can work out what may have caused those outcomes, e.g. checking whether airtime top-ups affect loan repayments.
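As a minimal sketch of the idea, the code below takes a handful of past loans where we already know whether each was repaid, and checks which signal is most strongly linked to repayment. The records and field names are invented for illustration, and comparing repayment rates is a deliberately simple stand-in for the statistical models used in practice.

```python
# Invented historical loans. Each record carries a known outcome ("repaid"),
# which is what makes this supervised learning.
history = [
    {"tops_up_airtime": True,  "calls_before_7am": False, "repaid": True},
    {"tops_up_airtime": True,  "calls_before_7am": True,  "repaid": True},
    {"tops_up_airtime": True,  "calls_before_7am": False, "repaid": True},
    {"tops_up_airtime": True,  "calls_before_7am": True,  "repaid": False},
    {"tops_up_airtime": False, "calls_before_7am": True,  "repaid": False},
    {"tops_up_airtime": False, "calls_before_7am": False, "repaid": False},
    {"tops_up_airtime": False, "calls_before_7am": True,  "repaid": True},
    {"tops_up_airtime": False, "calls_before_7am": False, "repaid": False},
]


def repayment_rate(records, feature, value):
    """Share of loans repaid among records where feature == value."""
    matching = [r for r in records if r[feature] == value]
    return sum(r["repaid"] for r in matching) / len(matching)


def repayment_gap(records, feature):
    """How much the repayment rate differs when the feature is True vs False."""
    return repayment_rate(records, feature, True) - repayment_rate(records, feature, False)


def most_informative_feature(records, features):
    """Pick the feature whose value changes the repayment rate the most."""
    return max(features, key=lambda f: abs(repayment_gap(records, f)))


features = ["tops_up_airtime", "calls_before_7am"]
for f in features:
    print(f, "gap in repayment rate:", repayment_gap(history, f))
print("Most informative:", most_informative_feature(history, features))
```

In this made-up data, airtime top-ups go with a much higher repayment rate, while early-morning calls make no difference at all. That is the reverse of what the "early riser" hunch would have assumed.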

Unsupervised learning

This is where a data set is given to the computer without a specific outcome to look for, and it essentially spots patterns that might be unrecognisable to the human eye, e.g. people have higher repayments if they top up their phone on Friday afternoons.
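Here is a small sketch of pattern-spotting without any outcomes attached. It uses a basic form of clustering (a one-dimensional k-means with two groups, which is our choice of technique for illustration) on invented top-up times, measured in hours since midnight on Monday. It assumes the data contains at least two different values.

```python
def two_means(values, iterations=10):
    """Split numbers into two groups with a basic one-dimensional k-means.

    Assumes `values` contains at least two distinct numbers.
    """
    centres = [min(values), max(values)]
    groups = ([], [])
    for _ in range(iterations):
        groups = ([], [])
        for v in values:
            # Assign each value to whichever centre is closer.
            nearest = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            groups[nearest].append(v)
        # Move each centre to the average of its group.
        centres = [sum(g) / len(g) for g in groups]
    return centres, groups


# Invented data: the hour of the week when each customer usually tops up.
# Hours 9 to 12 are Monday morning; hours 108 to 112 are Friday afternoon.
top_up_hours = [9, 10, 11, 12, 108, 110, 111, 112]

centres, groups = two_means(top_up_hours)
print("Group centres:", centres)
print("Groups:", groups)
```

Nobody told the computer that "Friday afternoon" was interesting. It simply noticed that the customers fall into two clumps, and it is then up to people (or a further supervised step) to see whether those clumps matter for repayments.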

Machine-based decisions

Once this learning has been done, the computer will have an algorithm (i.e. a set of instructions) for what should be done for future examples.

In the instance of a loan it may say that people with this set of conditions (e.g. receive an average of 5 calls a day and top up their phone on Friday afternoons) have a higher probability of repaying their loan than people who don’t.

As such, the decision can be made much more quickly and, in theory at least, more accurately than in instances where a human has tried to come up with the conditions.
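A sketch of how such a learned rule might then be applied to a new applicant is below. The historical records, the reading of "an average of 5 calls a day" as "5 or more", and the 0.6 approval threshold are all assumptions made for the example.

```python
# Invented history: (average calls per day, tops up on Friday afternoon, repaid).
history = [
    (5, True, True), (6, True, True), (5, True, True), (7, True, False),
    (2, False, False), (5, False, True), (1, True, False), (3, False, False),
]


def meets_conditions(avg_calls, friday_top_up):
    """The learned set of conditions (assumed here to mean 5 or more calls a day)."""
    return avg_calls >= 5 and friday_top_up


def learned_repayment_rates(records):
    """Estimate the repayment rate for people who do and don't meet the conditions."""
    rates = {}
    for flag in (True, False):
        outcomes = [repaid for calls, friday, repaid in records
                    if meets_conditions(calls, friday) == flag]
        rates[flag] = sum(outcomes) / len(outcomes)
    return rates


def approve(avg_calls, friday_top_up, rates, threshold=0.6):
    """Approve the loan if the estimated repayment rate clears the threshold."""
    return rates[meets_conditions(avg_calls, friday_top_up)] >= threshold


rates = learned_repayment_rates(history)
print("Estimated repayment rates:", rates)
print("Applicant with 6 calls/day, Friday top-ups:", approve(6, True, rates))
print("Applicant with 2 calls/day, Friday top-ups:", approve(2, True, rates))
```

Once the rates have been estimated, each new decision is a simple lookup, which is why it can happen in a fraction of a second.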

The benefits of using the data

This fits with the vogue business mantra of being “data driven”.

Namely, avoiding the foibles and irrationalities of human judgement and instead trusting the data. It can also be done far more quickly than by other means, which gets results sooner, i.e. a better customer experience.

The perils of data-driven decisions

Whilst the concept of using machines to make decisions objectively is appealing, the problem is that it’s almost impossible to completely remove subjectivity from the equation.

For example, a core tenet of some strands of machine learning is to learn from the past to make predictions about the future. What can happen in these instances is that human bias is encoded in the data and the computer “objectively” comes up with a biased judgement.

An example of this is in trying to automate and assist in the judgement of offenders in the US justice system. You can read more on the study here, but the essence is that because black Americans are arrested more, this racial bias was being translated into harsher criminal judgements.
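The mechanism can be shown with a toy sketch. The numbers below are entirely invented and are not taken from the study: two groups, labelled A and B, behave identically, but people in group B are twice as likely to be arrested when they offend. A model that learns "risk" from arrest records alone inherits that difference.

```python
def build_records(group, people, offenders, arrested):
    """Create one record per person; only some offenders end up arrested."""
    return [
        {"group": group, "offended": i < offenders, "arrested": i < arrested}
        for i in range(people)
    ]


# Invented numbers: both groups have 20 offenders per 100 people,
# but group B's offenders are arrested twice as often as group A's.
records = build_records("A", 100, 20, 10) + build_records("B", 100, 20, 20)


def rate(rows, group, field):
    """Share of people in a group for whom `field` is True."""
    in_group = [r for r in rows if r["group"] == group]
    return sum(r[field] for r in in_group) / len(in_group)


# What actually happened, which the model never sees.
true_rate = {g: rate(records, g, "offended") for g in ("A", "B")}
# What a model trained only on arrest records would learn.
learned_risk = {g: rate(records, g, "arrested") for g in ("A", "B")}

print("True offending rate:", true_rate)
print("Risk learned from arrests:", learned_risk)
```

The computer has done its sums correctly, yet it scores group B as twice as risky as group A, purely because of how the historical data was collected.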

For more on the negative outcomes that can come from machine learning, read the book Weapons of Math Destruction (and this podcast episode).

Further reading

To go deeper in these areas, look to the following articles:

Conclusion

This post has covered the conceptual side of machine learning rather than the scientific methodologies behind how it is applied. For those looking to read up on this, take a look at the links above.

The core concept to know about machine learning is that because there is now more and more data in the world, computer scientists are required to find more sophisticated means with which to interpret it.

If the world were happy making decisions with smaller amounts of data, machine learning would not be necessary.

As it is, though, the trend with data, and the demand for the insights it can bring, is only going one way, and as such advanced methodologies are necessary.

Whilst this no doubt enables greater possibilities, be particularly wary of how seductive the idea of basing your decisions on data can be: problems come when biased data goes into a black box and comes out as an “objective” decision which we should blindly follow.

 

We hope that you found this post informative and thought-provoking. If you’d like to be notified of the next post, or learn more about how Inspira UK can help your business grow, then you can sign up to the newsletter and contact us here.