In the constantly adapting industry of cyber security, how is machine learning being used in cyber defence strategies?
Cyber security is a rapidly evolving industry and it needs to be. As cyber criminals utilise new technology in their attacks, information security professionals must also adapt and implement new methods in their cyber defence. This game of cat and mouse means that cyber security is always at the cutting edge of technology. In recent years, machine learning has been used in cyber security to predict and identify attacks as they happen. However, cyber criminals are also utilising machine learning to hide their malware or launch the most convincing phishing campaigns. In this two-part blog, we will look at how machine learning plays a crucial role in cyber security and cyber attacks.
In this first part, we explore the ways in which machine learning is used in cyber defence. Although machine learning should play a crucial role in any cyber defence strategy, recent reports suggest that GCHQ is not making enough use of it.
What is machine learning?
Machine learning models learn the underlying patterns in a set of data so that they can predict what future data should look like or classify new data into groups. In the context of cyber security, this could mean classifying network traffic as malicious or normal.
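To make the idea of classification concrete, here is a minimal sketch of a nearest-centroid classifier, one of the simplest models there is. The two features (packets per second and mean packet size) and all of the numbers are invented for illustration; a real system would use far more data, more features and feature scaling.

```python
from math import dist

# Hypothetical training data: (packets per second, mean packet size in bytes).
# These values are made up purely for illustration.
TRAINING = {
    "normal": [(20.0, 800.0), (35.0, 750.0), (25.0, 900.0)],
    "malicious": [(900.0, 60.0), (1200.0, 64.0), (1000.0, 70.0)],
}


def centroid(points):
    """Average each feature across the points belonging to one class."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def train(training):
    """'Learning' here is simply computing one centroid per class."""
    return {label: centroid(points) for label, points in training.items()}


def classify(model, sample):
    """Assign the sample to the class whose centroid is closest."""
    return min(model, key=lambda label: dist(model[label], sample))


model = train(TRAINING)
print(classify(model, (30.0, 820.0)))    # a sample close to the normal traffic
print(classify(model, (1100.0, 62.0)))   # a sample close to the malicious traffic
```

The model never sees the two test samples during training. It labels them by how closely they resemble what it has already learned, which is the core idea behind the examples that follow.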
How is machine learning used in cyber security?
Networks generate a lot of traffic, far too much for any single team of security professionals to analyse meaningfully. Machine learning can be used to learn the underlying trends in that data, allowing predictions to be made about the future, such as how malware is likely to change. Machine learning models are also excellent at classifying threats and differentiating between malicious and normal network traffic. The following are some examples of machine learning in action.
Threat detection of malware
As a particular piece of malware becomes more familiar to the security industry, antivirus software is updated to recognise its signature. This heavy reliance on previous knowledge means that small variations in the malware can enable it to avoid detection. Machine learning models can take previously known malware and learn its underlying characteristics. From this, they are able to detect the malware even after it has been altered to avoid detection. Many antivirus solutions have implemented this approach, where it is known as heuristic detection. Researchers have reported accuracy of over 85% using this method.
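The difference between an exact signature and a similarity-based approach can be sketched in a few lines. The example below is not how any particular antivirus product works. It uses a simple comparison of byte n-grams as a stand-in for a learned model, and the "samples" are hypothetical strings rather than real binaries.

```python
import hashlib


def ngrams(data: bytes, n: int = 4) -> set:
    """Return the set of overlapping byte n-grams in a sample."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Share of n-grams that two samples have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical stand-ins for real files, invented for illustration.
KNOWN_MALWARE = b"connect(c2.example);download(payload);encrypt(files);demand(ransom)"
VARIANT = b"connect(c2.example);download(payload2);encrypt(files);demand(ransom)"
BENIGN = b"open(report.docx);spellcheck(text);save(report.docx);close()"

KNOWN_HASHES = {hashlib.sha256(KNOWN_MALWARE).hexdigest()}
KNOWN_NGRAMS = ngrams(KNOWN_MALWARE)


def signature_match(sample: bytes) -> bool:
    """Exact-hash signature: changing a single byte defeats it."""
    return hashlib.sha256(sample).hexdigest() in KNOWN_HASHES


def similarity_match(sample: bytes, threshold: float = 0.6) -> bool:
    """Flag samples that share most of their n-grams with known malware.
    The 0.6 threshold is an arbitrary assumption for this sketch."""
    return jaccard(ngrams(sample), KNOWN_NGRAMS) >= threshold


for name, sample in [("known", KNOWN_MALWARE), ("variant", VARIANT), ("benign", BENIGN)]:
    print(name, signature_match(sample), similarity_match(sample))
```

The altered sample no longer matches the stored hash, but it still shares almost all of its content with the original, so the similarity check flags it while leaving the benign sample alone.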
Phishing page and URL detection
Phishing attacks are an extremely common and successful way of stealing a victim’s credentials. A website is crafted to look like the target site, such as a fake banking application, with the aim of tricking a user into entering their credentials. Often, URLs leading to phishing sites are embedded in web applications, waiting for a user to click on them and enter their sensitive data. Machine learning algorithms are able to analyse a URL and classify it as malicious or benign. Other attributes such as geolocation, website contents and word analysis can improve the accuracy of the prediction.
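As a rough illustration of URL classification, the sketch below turns each URL into a handful of lexical features and trains a tiny logistic regression model on them. The feature choices, the six training URLs and the training settings are all assumptions made for this example. A real classifier would be trained on many thousands of labelled URLs and would include the richer attributes mentioned above.

```python
import ipaddress
import math
from urllib.parse import urlparse


def is_ip(host: str) -> bool:
    """True if the host is a raw IP address rather than a domain name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_features(url: str) -> list:
    """Turn a URL into a few simple lexical features."""
    host = urlparse(url).hostname or ""
    return [
        len(url) / 100.0,               # overall length, scaled down
        float(host.count(".")),         # number of dots in the host
        float(host.count("-")),         # hyphens, common in look-alike hosts
        1.0 if is_ip(host) else 0.0,    # raw IP address used as the host
        1.0 if "@" in url else 0.0,     # '@' can disguise the real host
    ]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, epochs=2000, lr=0.1):
    """Fit logistic regression weights with plain stochastic gradient descent."""
    weights, bias = [0.0] * len(samples[0][0]), 0.0
    for _ in range(epochs):
        for x, y in samples:
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def phishing_probability(model, url: str) -> float:
    weights, bias = model
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, url_features(url))))


# Hypothetical labelled URLs (1 = phishing, 0 = benign), invented for illustration.
TRAINING = [
    ("http://192.168.10.5/login/secure-update", 1),
    ("http://secure-login.bank-example.account-verify.example.net/signin", 1),
    ("http://bank.example.com@203.0.113.7/verify", 1),
    ("https://www.example.com/", 0),
    ("https://example.org/about", 0),
    ("https://docs.example.com/guide", 0),
]

model = train([(url_features(url), label) for url, label in TRAINING])
for url in ["http://198.51.100.23/account-update", "https://news.example.com/today"]:
    print(url, round(phishing_probability(model, url), 2))
```

Neither of the two URLs at the end appears in the training data. The model scores them using the patterns it picked up from the labelled examples, such as a raw IP address being used in place of a domain name.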
Bot detection
One of the most prolific and devastating DDoS attacks was carried out by the Mirai botnet, which used thousands of devices to perform large-scale attacks. One way of preventing such an attack from happening again is to analyse the traffic of all devices on a network. When a botnet attack occurs, the traffic will deviate from standard use. This anomaly-based detection has achieved over 99.9% accuracy using some machine learning models.
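A very simple form of anomaly-based detection can be sketched with a statistical baseline per device. This is a basic z-score check rather than one of the machine learning models behind the accuracy figure above, and the device names and packet counts are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical packets-per-minute readings recorded during normal use.
HISTORY = {
    "camera": [120, 118, 125, 122, 119, 121],
    "thermostat": [10, 12, 11, 9, 10, 11],
}


def build_baseline(history):
    """Learn each device's normal behaviour as a (mean, standard deviation) pair."""
    return {device: (mean(r), stdev(r)) for device, r in history.items()}


def is_anomalous(baseline, device, reading, z_threshold=3.0):
    """Flag a reading that sits more than z_threshold standard deviations
    away from the device's usual traffic level."""
    mu, sigma = baseline[device]
    if sigma == 0:
        return reading != mu
    return abs(reading - mu) / sigma > z_threshold


baseline = build_baseline(HISTORY)
print(is_anomalous(baseline, "camera", 123))    # within the usual range
print(is_anomalous(baseline, "camera", 5000))   # sudden flood of traffic
```

A device that has been recruited into a botnet and starts flooding a target stands out sharply against its own history, even if its traffic volume would look unremarkable for a different kind of device.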
User behaviour analysis
Similar to bot detection, the day-to-day traffic of a user can be monitored. If there is a large anomaly in a user’s activity, this may indicate a compromised account. A standard user’s network footprint will be varied and complex due to the wide range of applications in use, so the false positive rate may be high. Even so, this approach allows security teams to be notified of a potential threat and to act on it.
An issue frequently raised is how ethical the mass surveillance of employees is. Students in Australia have voiced privacy concerns over software intended to analyse their actions during examinations taken from home. Such software should strike a balance between user privacy and the effective detection of malicious activity. Data anonymisation may help to alleviate some of these concerns.
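The trade-off between catching compromised accounts and raising false alarms can be shown with a small sketch. It assumes some model has already given each user-day an anomaly score between 0 and 1. The scores below are invented for illustration.

```python
# Hypothetical anomaly scores produced by some upstream model.
NORMAL_DAYS = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.2, 0.5]   # legitimate activity
COMPROMISED_DAYS = [0.65, 0.8, 0.9]                       # known compromised accounts


def rates(normal_scores, compromised_scores, threshold):
    """Return (detection rate, false positive rate) for an alert threshold."""
    detected = sum(s >= threshold for s in compromised_scores)
    false_alarms = sum(s >= threshold for s in normal_scores)
    return detected / len(compromised_scores), false_alarms / len(normal_scores)


for threshold in (0.5, 0.75):
    detection, false_positive = rates(NORMAL_DAYS, COMPROMISED_DAYS, threshold)
    print(f"threshold={threshold}: detection={detection:.2f}, "
          f"false positives={false_positive:.2f}")
```

With the lower threshold every compromised account is caught, but a sizeable share of legitimate activity is flagged too. Raising the threshold removes the false alarms at the cost of missing one of the compromised accounts.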
Optimising the human analysis
Machine learning is not removing the need for security analysts. Instead, it completes the easier tasks and empowers analysts to draw on the highest quality data available. For example, machine learning models are able to generalise trends from logs and highlight points of interest for the analyst. Another common issue for security analysts is “alarm fatigue”. This phenomenon results from repeatedly seeing false positive threats, meaning that when a legitimate threat arises, analysts are not mentally prepared to deal with it. Giving the analyst higher quality data and reducing the noise can help to reduce this fatigue.
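One way of reducing the noise is to group duplicate alerts and rank what remains by a model's confidence score. The sketch below assumes each alert already carries such a score. The field names, addresses, rule names and the 0.5 cut-off are all hypothetical.

```python
from collections import defaultdict

# Hypothetical alerts: (source address, rule name, model score).
ALERTS = [
    ("10.0.0.5", "failed-login", 0.30),
    ("10.0.0.5", "failed-login", 0.35),
    ("10.0.0.9", "data-exfiltration", 0.92),
    ("10.0.0.7", "malware-beacon", 0.71),
    ("10.0.0.7", "malware-beacon", 0.64),
    ("10.0.0.5", "failed-login", 0.28),
]


def triage(alerts, min_score=0.5):
    """Collapse duplicate alerts, drop low-scoring groups and
    return the rest ordered from most to least urgent."""
    grouped = defaultdict(lambda: {"count": 0, "score": 0.0})
    for source, rule, score in alerts:
        group = grouped[(source, rule)]
        group["count"] += 1
        group["score"] = max(group["score"], score)   # keep the highest score seen
    queue = [
        {"source": source, "rule": rule, **info}
        for (source, rule), info in grouped.items()
        if info["score"] >= min_score
    ]
    return sorted(queue, key=lambda alert: alert["score"], reverse=True)


for alert in triage(ALERTS):
    print(alert)
```

Six raw alerts become two items in the analyst's queue, with the most serious at the top. In practice, the low-scoring alerts would usually be logged rather than discarded so that they can still be reviewed later.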
It is evident that machine learning is being used effectively in cyber security in a wide range of applications. But how are criminals using the same technology to their advantage? In the next article we will explore how machine learning is used in malicious contexts.
Tyler Sullivan, Security Consultant
Tyler is a Security Consultant at Informer. During his degree in computer science, Tyler’s main focuses were on cyber security and machine learning. This manifested itself in his dissertation topic, looking at how machine learning algorithms can identify infected bots on a network.