The amount of data that companies generate is continuously growing – the term “Big Data” is commonly used to describe this, whether it is multiple financial transactions per second, thousands of emails per day or a million hours of video content on YouTube. This brings a host of new challenges that can no longer be addressed by systems where rules to process records must be written by hand, such as monitoring credit card transactions for fraud. A more modern approach to such tasks is to ‘teach’ computers to do them for us, letting the computer figure out those rules by itself.
This is where we enter the realm of machine learning. In recent years there has been a huge surge of interest in this field, driven by the growth in the kinds of data mentioned above and by improvements in computational power that allow us to process data at scale.
Machine learning has already changed many companies’ workflows and services, giving them a competitive edge. For example, Netflix’s recommendation system is regarded as one of their core and most valuable features: it helps them deliver better content to subscribers and retain customers. Trading and investment firms have also started looking to machine learning to improve their business and strategy.
In this post, I’ll try to give a high-level overview of how machine learning works, what the fundamentals are and why we at Sinara think it is important to invest in potential applications for this technology.
What exactly is machine learning?
Many people associate machine learning with the term AI, which has become something of a buzzword, but I can assure you it has nothing to do with robots that might take over the world! Machine learning grew out of mathematical and analytical methods that are translated into a program to perform a specific task – most frequently, predicting a variable such as a share price. It is firmly grounded in statistics, linear algebra and calculus. These are precise, quantitative sciences and have nothing esoteric about them.
At the core of machine learning is data. Many systems depend on specific rules and hardcoded business logic in order to work. One way of thinking about machine learning is as a way of uncovering those rules automatically in order to achieve a desired outcome. In other words, a machine learning model is a program that predicts output based on input. However, unlike most computer programs, the logic is not hardcoded. Instead, it is discovered by applying the chosen machine learning algorithm to existing data. The algorithm uses mathematical principles to judge whether the current model produces good predictions and to work out how to change its logic to improve them.
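To make the contrast concrete, here is a minimal sketch – entirely illustrative, with invented function names and thresholds – of a hardcoded rule versus a model whose ‘rule’ lives in parameters that a training algorithm would tune:

```python
def hardcoded_flag(amount):
    # Rules-based: a developer chose this threshold by hand
    return amount > 10_000

def model_flag(amount, weight, threshold):
    # Learned: weight and threshold would be set by training on past data,
    # not written by a developer
    return amount * weight > threshold
```

The two functions look similar, but changing the behaviour of the first means editing code, while changing the second only means feeding the learning algorithm more data.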
In the following sections, I’ll describe how this actually happens – a process influenced by the way humans learn to perform various tasks. However, computers can process large amounts of data much more quickly than we can!
It all starts with data
The first step is to gather data and set a goal for what we want to do with it. The data could come from any source, such as an SQL database, a CSV file or an XML feed. Each sample can consist of many features/variables, and the goal is usually to predict one of those variables when new data is given to the model. To do that, we need to find which variables are most useful for the prediction. For example, we might want to try to predict future stock prices based on a large dataset of previous prices. By giving a future date as input to the model, we would hope to obtain a meaningful forecast as output.
A common way of teaching the learner system is to have the data split into Input+Expected Output pairs. The expected output is the ‘right answer’ we are trying to predict. This will be used in finding the hidden patterns within the dataset. More on this later.
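As a sketch of what such pairs look like in code – the numbers are invented, and the convention that the last column holds the value we want to predict is just an assumption for the example:

```python
# Each raw row holds some feature values followed by the 'right answer'
raw_rows = [
    [1.2, 3.4, 10.0],
    [2.1, 0.5, 12.5],
    [0.7, 4.1, 9.8],
]

# Split every row into an (input features, expected output) pair
pairs = [(row[:-1], row[-1]) for row in raw_rows]
# pairs[0] is ([1.2, 3.4], 10.0)
```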
Although the model is the end goal of machine learning, defining it is a fairly small part of the overall workflow. Choosing which parts of the data will act as input, cleaning the data and tweaking the model’s settings is what really matters and takes the greater share of the time.
In an ideal world, the data we gather would have all the right values for all the right variables. However, that is never true of real-world data: there can be noise, missing values or plain wrong ones. A regular rules-based application would have a very hard time dealing with this, and a huge amount of a developer’s time would go towards refining those rules.
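One common cleaning step is to fill in missing values, for instance with the mean of the values that are present in each column. A minimal sketch, using `None` to stand for a missing reading:

```python
# Toy dataset: each row is a sample, None marks a missing value
rows = [[1.0, 2.0], [None, 4.0], [3.0, None]]

# Compute each column's mean over the values that are present
cols = list(zip(*rows))
means = [sum(v for v in col if v is not None) /
         sum(1 for v in col if v is not None) for col in cols]

# Replace each missing value with its column's mean
cleaned = [[v if v is not None else means[i] for i, v in enumerate(row)]
           for row in rows]
```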
A crucial step in the process is deciding which variables will be used as input to the system. Some experimentation and domain knowledge is required to find a set of inputs (also called features) that can be used to properly make predictions afterwards.
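One simple, if crude, way to screen candidate features is to rank them by their absolute correlation with the variable we want to predict. A sketch with made-up feature names and numbers:

```python
def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length lists
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

features = {
    "volume": [1.0, 2.0, 3.0, 4.0],   # moves closely with the target
    "noise":  [5.0, 1.0, 4.0, 2.0],   # essentially unrelated
}
target = [2.1, 3.9, 6.2, 7.8]

# Rank features by how strongly they track the target
ranked = sorted(features, key=lambda f: abs(pearson(features[f], target)),
                reverse=True)
```

In practice more robust techniques exist (correlation only catches linear relationships), but the idea of scoring and ranking candidate inputs carries over.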
How does a machine learn?
A specific type of machine learning problem is supervised learning. It involves building a model for a particular task and feeding data into it. Examples of supervised learning systems are Artificial Neural Networks (ANN), Decision Trees and k-nearest neighbours. Usually the choice depends on the problem we are trying to solve.
A very oversimplified scenario – good for illustrating how things work, though the real-world version is much more complex – is the stock price prediction problem mentioned earlier.
Let’s say we want a way of predicting the future of some company’s stock price. There are public records of the history of price movements, so we can use those as a starting point. We take last month’s data, for example, and present it to our machine learning system as pairs of [Input] -> [Expected output]. For example, if we take 5 consecutive days’ worth of price movements and feed them in as the input, then the price on the 6th day would be the expected output. The machine learning model then does a whole lot of number crunching to find patterns in the price movements. We then take another chunk of historic data, feed it to the system, and so on.
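The windowing described above can be sketched as follows (prices invented for the example):

```python
# A short run of daily closing prices (made-up numbers)
prices = [100.0, 101.5, 99.8, 102.3, 103.1, 104.0, 103.5, 105.2]

window = 5
# Each pair: 5 consecutive days as input, the 6th day's price as expected output
pairs = [(prices[i:i + window], prices[i + window])
         for i in range(len(prices) - window)]
# pairs[0] is ([100.0, 101.5, 99.8, 102.3, 103.1], 104.0)
```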
Initially, the model will not be very good and will output seemingly random values. However, by comparing the predicted and expected outputs, it slowly starts to learn the patterns between the variables. This process is called ‘training’ – running through existing data (pairs of input and output) and optimising the system’s internal parameters. After many training examples, the system should be able to look at today’s stock movement and give an educated guess as to where the price will go next. Keep in mind that this is an oversimplified way of defining the stock price prediction problem, because in reality market movements are influenced by many factors other than previous prices.
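That ‘compare, then adjust’ loop can be sketched with the simplest possible model – a straight line y = w·x + b – trained by gradient descent on an invented dataset. Real models have vastly more parameters, but the principle is the same:

```python
# (input, expected output) pairs, made up so that y is roughly 2x + 1
data = [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2), (4.0, 8.8)]

w, b = 0.0, 0.0          # internal parameters, initially arbitrary
learning_rate = 0.01

for _ in range(5000):
    grad_w = grad_b = 0.0
    for x, y in data:
        error = (w * x + b) - y     # predicted minus expected
        grad_w += 2 * error * x     # gradient of squared error w.r.t. w
        grad_b += 2 * error         # ...and w.r.t. b
    # Nudge the parameters in the direction that reduces the error
    w -= learning_rate * grad_w / len(data)
    b -= learning_rate * grad_b / len(data)

# After training, w and b settle near the best-fit line for this data
# (w close to 1.94, b close to 1.15)
```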
Unsupervised learning is another way of analysing data, used in scenarios where we have no idea what the output should be for each data sample. Usually, when this is the case, the model is used to group the data into clusters (k-means clustering is one algorithm that can achieve this). Those clusters are defined by common patterns in the attributes of the dataset. Afterwards, data not used for training can be classified as belonging to one of those clusters. To illustrate, think of a baby that doesn’t yet understand what it sees. It discovers the world by observing objects – how they look and sound – and after a while it can distinguish between them based on patterns like shape, texture and size.
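A hand-rolled sketch of k-means on one-dimensional data (k = 2, toy numbers; in practice you would use a library implementation rather than writing this yourself):

```python
# Six unlabelled readings that visibly fall into two groups
points = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9]
centres = [0.0, 10.0]              # crude initial guesses for the 2 centres

for _ in range(10):
    # Assignment step: attach each point to its nearest centre
    clusters = [[], []]
    for p in points:
        nearest = min(range(2), key=lambda i: abs(p - centres[i]))
        clusters[nearest].append(p)
    # Update step: move each centre to the mean of its cluster
    centres = [sum(c) / len(c) for c in clusters]

# The centres converge to roughly 1.0 and 8.07, the two natural groups
```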
What is a typical machine learning task?
There are two types of tasks that a machine learning model can perform – regression and classification. All problems can be defined as one or the other.
A regression task is when, given some input, we try to predict a specific value – for example, how many cups of coffee a shop will sell depending on the time of day, day of the week and month. A model could be trained to predict expected demand from those variables and tweak the amount of coffee or milk the shop orders from its suppliers, making it more cost-efficient.
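A toy version of that regression, fitting cups sold against the hour of day with a closed-form least-squares line (all numbers invented):

```python
# Morning sales observations: (hour of day, cups sold)
hours = [8, 9, 10, 11, 12]
cups = [30, 42, 48, 61, 70]

n = len(hours)
mean_h = sum(hours) / n
mean_c = sum(cups) / n

# Ordinary least-squares slope and intercept
slope = (sum((h - mean_h) * (c - mean_c) for h, c in zip(hours, cups))
         / sum((h - mean_h) ** 2 for h in hours))
intercept = mean_c - slope * mean_h

# Forecast demand for 1pm using the fitted line
predicted_13h = slope * 13 + intercept
```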
On the other hand, we have classification. This is when there is a discrete set of classes that we want to assign to new examples. For example, suppose we are analysing images of handwritten digits. We have a set of 10 classes – all digits from 0 to 9 and the input is an image. By giving many examples of pairs (handwritten image and digit) the model learns what shapes are associated with various digits. It is then able to look at images it hasn’t ‘seen’ before and guess what the digit is.
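As a sketch, here is the simplest classifier of that family – a one-nearest-neighbour rule, one of the supervised methods named earlier – over made-up two-dimensional features standing in for image data:

```python
# Labelled training pairs: (feature vector, class label), all invented
training = [
    ((0.1, 0.2), "0"),
    ((0.9, 0.8), "1"),
    ((0.2, 0.1), "0"),
    ((0.8, 0.9), "1"),
]

def classify(sample):
    # Predict the label of the training example closest to the sample
    def sq_dist(a):
        return sum((x - y) ** 2 for x, y in zip(a, sample))
    _, label = min(training, key=lambda pair: sq_dist(pair[0]))
    return label

# classify((0.85, 0.75)) lands near the "1" examples
```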
How to apply it to finance?
The possibilities are numerous, and people keep coming up with new ways of using machine learning. At Sinara, we mainly work with financial sector clients, where demand for smarter systems is steadily growing, and we aim to help meet it.
A very good example that companies already implement is monitoring for fraud. Some people try to find exploits when trading through exchanges, and it can be hard to spot those cases with rules-based systems, especially when there are hundreds or thousands of trades per second. That’s one example of how investing in a machine learning system can bring real value to a company.
Credit card risk is another popular application – predicting the likelihood of a person defaulting on their credit card or loan is surely valuable insight for a company. By analysing a customer’s credit history, a prediction system could uncover hidden patterns that drive the final decision.
A particularly interesting application of machine learning is in trading in various markets. Sentiment analysis of text within news feeds, tweets, etc, can potentially be used to determine the direction of a market. Smarter trading algorithms could be trained to find optimal trading strategies. Predicting when to buy or sell a security would have immense value for companies. Of course, if every trader starts using such a method, the way the market works would change as well. How exactly is a very exciting question!