Cybersecurity and AI - A United Front for the Protection of Businesses and Personal Information

By Sarah Legendre Bilodeau, October 31, 2023
Cybersecurity
Data Privacy Protection
Data

Artificial Intelligence (AI) and cybersecurity are two areas of expertise that are on everyone's lips in 2023. This year, I have been approached several times to discuss how AI can support cybersecurity. They are two different subjects, but they share a common point: data. Conferences, podcasts, and support for cybersecurity companies adding AI components to their solutions are just a few examples of what kept us busy at Videns on the subject in 2023, as you will see at the end of this article. Before moving forward, it's important for me to mention that I am not a cybersecurity expert! However, as a data and AI specialist, I am pleased to provide my perspective on the subject, from a familiar angle.

On the one hand, several companies have demonstrated in recent years that AI can bring tangible benefits in a wide variety of areas, such as intelligent process automation, understanding consumer behavior, demand forecasting, and credit risk management. On the other hand, concerns related to cyber-attacks and cyber fraud continue to rise. It must be said that the intrusions suffered by several companies and government organizations have helped educate businesses about the real risks. An education that came at a very high price... It seems that the question to ask is not 'Will my company suffer an attack?' but rather 'When will my company suffer an attack?'. When we ask a question using the word 'when,' it's highly likely that AI can help. Indeed, various machine learning approaches allow for predictions, bringing more proactivity to different application domains. Can AI be used to anticipate cyber-attacks? How can AI contribute to reducing a company's cybersecurity risk? This article will focus on what AI can bring to cybersecurity, the risks concerning data in the case of cyber-attacks, and the challenges faced by data and AI specialists in a cybersecurity context.

What is the Contribution of AI in Cybersecurity?

Data lies at the heart of the cybersecurity theme. This is natural since in the digital realm, everything leaves a trail, and these trails are represented in the form of data. AI can contribute both to diagnostics and prevention in the field of cybersecurity.

To make a good diagnosis, it is necessary to understand what happens before, during, and after the incident. It is therefore advisable to use past data to better understand what led to the incident. Could certain digital pathways have provided clues to an upcoming attack? Do they stand out from normal digital pathways? Modern machine learning approaches can provide valuable insights. However, statistical approaches such as segmentation or regression could also be excellent tools! The key is to identify 'patterns,' predictors, or combinations of predictors for a computer security incident. In fact, this is notably what is done in the field of fraud detection within the financial domain.

Good prevention is intimately linked to the diagnostic aspect. Indeed, the same historical data is used to train predictive models that will be applied to current data. These predictive models can provide valuable information in the form of alerts, for example. The concept of real-time becomes particularly relevant here. After all, an alert that arrives a week late, simply because the data refresh processes run only once a week, would be pointless. The relevance of an alert is tied to its timeliness.

Analyzing the usage patterns of digital platforms helps understand what the vast majority of people tend to do. These patterns can be divided into different segments. When a detected pattern is significantly different from typical ones, investigation is necessary. Perhaps it's a legitimate behavior that might become more common in the future? But it could also be an indicator of an anomaly. In any case, a timely analysis by a human is a sound practice. Even if it turns out to be a false alert, understanding a new customer segment is likely to be appreciated.
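To make the idea of "significantly different from typical patterns" concrete, here is a minimal sketch using a simple z-score rule. It is only one of many possible approaches, not a description of any production system. The feature names (logins_per_day, mb_downloaded), the sample values, and the threshold of 3 standard deviations are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

# Hypothetical baseline: one record per session already judged "typical".
BASELINE_SESSIONS = [
    {"logins_per_day": 3, "mb_downloaded": 40},
    {"logins_per_day": 4, "mb_downloaded": 55},
    {"logins_per_day": 2, "mb_downloaded": 35},
    {"logins_per_day": 3, "mb_downloaded": 50},
    {"logins_per_day": 5, "mb_downloaded": 45},
]


def fit_baseline(sessions):
    """Return {feature: (mean, sample standard deviation)} for typical sessions."""
    features = sessions[0].keys()
    return {
        f: (mean([s[f] for s in sessions]), stdev([s[f] for s in sessions]))
        for f in features
    }


def anomaly_score(session, baseline):
    """Largest absolute z-score across features; higher means less typical."""
    scores = []
    for feature, (mu, sigma) in baseline.items():
        # A feature with no variation in the baseline cannot be scored this way.
        scores.append(abs(session[feature] - mu) / sigma if sigma > 0 else 0.0)
    return max(scores)


def needs_review(session, baseline, threshold=3.0):
    """Flag the session for human investigation (threshold is an assumption)."""
    return anomaly_score(session, baseline) > threshold


baseline = fit_baseline(BASELINE_SESSIONS)
for candidate in (
    {"logins_per_day": 3, "mb_downloaded": 48},
    {"logins_per_day": 4, "mb_downloaded": 900},
):
    print(candidate, "-> review" if needs_review(candidate, baseline) else "-> ok")
```

A flagged session is only a prompt for a person to look more closely. As noted above, it may turn out to be a new legitimate behavior rather than an attack.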

We often think of external threats when it comes to attacks on a company, but it seems that employees are the weakest link within a company in terms of vulnerabilities (https://www.lesaffaires.com/dossier/la-cybersecurite-un-imperatif-commercial/cyberattaque--ce-nest-plus-qui-mais-quand/636809). Hence, analyzing people's behaviors in the digital realm can be a highly rewarding approach. Regularly testing individuals with simulated attacks yields information that, once analyzed, can guide training initiatives, provide appropriate feedback, and, of course, continually raise awareness.

What Are the Risks to Data in the Event of a Cyber Attack?

In the event of a cyber attack, the negative impacts can be numerous and very significant. From my perspective as a business leader specialized in data valorization and AI, data lies at the heart of the issues to be considered!

A leak of personal information about our clients, employees, and partners can be very damaging. Indeed, the harm caused by an event of this nature can be significant for individuals and businesses. Identity theft, invasion of privacy, and disclosure of compromising information are just a few examples of the harm individuals can suffer. At the business level, privileged information can be disseminated, data can be corrupted or even deleted, and critical information for the company's operations can vanish. The affected company may face significant difficulties and even closure. And all this is without considering the loss of trust from clients and partners resulting from such an incident, which can persist for a very long time.

How to Limit and Contain the Risk for Our Data?

To protect our data, a well-deployed cybersecurity strategy that matches our business and operational context is essential. To achieve this, several firms offer consulting services in the field.

At the data level, a few best practices should be considered.

Is it necessary to keep all the data generated in the company's operations? From my point of view, no. I actually wrote an article on this topic on CScience (https://www.cscience.ca/chroniques/conservation-des-donnees-en-entreprise-le-prix-a-payer-quand-on-veut-tout-garder/). The more data we retain, the more significant the negative impact can be in case of an attack. Under the law on the protection of personal information in force in Quebec (Law 25), data lifecycle management is a relevant requirement in this context. I also emphasize the importance of developing a good data strategy and implementing good data governance practices.

In certain situations, the use of synthetic data generation, when done well, can allow for inferences very similar to those made with real data but with fewer risks and cybersecurity-related issues. However, it's essential to plan the project carefully to ensure the value of synthetic data is achieved.

Moreover, for an AI service company like Videns, which works with several dozen clients, it's critical not to have client data in our technological environments. Indeed, our main approach is to work within our clients' technological environments to avoid data transfers. Data movements increase the risk...

Finally, for the development of models, algorithms, and data pipelines, it is important to use secure technological environments. Modern encryption protocols secure data at rest or in transit. Some protocols and hardware even ensure this encryption during computation, thereby providing uninterrupted encryption for the most sensitive applications.

All the software developments that frame these data pipelines (APIs, frontends, backends) must also be designed and developed with the appropriate level of security.

What Are the Challenges Faced by Data Scientists in Cybersecurity?

The use of AI is an undeniable contribution to the field of cybersecurity. However, the challenges facing data and AI specialists are numerous!

The performance of solutions often goes hand in hand with a wide variety of data and the combination of multiple sources. And that is not unique to cybersecurity! It often takes a great deal of effort to combine data sources and to match records correctly across them. Depending on the data sources, there can be a great deal of irrelevant data for only a few relevant records. The expression 'looking for a needle in a haystack' makes complete sense here! In any case, significant efforts are required in data preparation. Relevant data is not always well-structured. Just think about software application logs, which are a great data source but not easy to use.
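As an illustration of the preparation work that application logs require, here is a minimal sketch that turns raw log lines into structured records. The log layout (an ISO timestamp, a level, then key=value pairs) is a hypothetical format assumed for this example. Real logs vary widely and usually need a parser tailored to each source.

```python
import re
from datetime import datetime

# Assumed layout: "<ISO timestamp> <LEVEL> key=value key="quoted value" ..."
LINE_RE = re.compile(r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<rest>.*)$")
PAIR_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_line(line):
    """Return a dict for a well-formed line, or None if it does not fit the layout."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    try:
        timestamp = datetime.fromisoformat(match.group("ts"))
    except ValueError:
        return None  # first token is not a timestamp: treat the line as noise
    record = {"timestamp": timestamp, "level": match.group("level")}
    for key, value in PAIR_RE.findall(match.group("rest")):
        record[key] = value.strip('"')
    return record


def parse_log(lines):
    """Keep only the lines that could be structured; the rest is the 'haystack'."""
    return [rec for rec in (parse_line(line) for line in lines) if rec is not None]


sample = [
    '2023-10-31T08:15:02 WARN user=alice ip=10.0.0.5 event="failed login"',
    "Stack trace follows:",
    "2023-10-31T08:15:09 INFO user=bob ip=10.0.0.7 event=login",
]
for rec in parse_log(sample):
    print(rec)
```

Once the lines are structured, they can be joined with other sources (for example on the user or IP field), which is where most of the matching effort mentioned above comes in.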

I often say that in AI, we constantly deal with error and uncertainty. There is no solution that provides consistently perfect recommendations or predictions. Moreover, the behavior patterns of cybercriminals are constantly evolving and becoming increasingly complex. This rapid evolution of practices requires constant vigilance and regular retraining of AI models.

And from the perspective of a data scientist, I would add that cyberattack incidents are rare events, which is good from a business standpoint, but poses an additional challenge in data science! Modeling something that isn't observed frequently is a problem in itself. That's why, unless there is a large amount of data within the company, it might be more interesting to develop solutions for a platform used by many clients rather than trying to rebuild independent solutions for each company. Interestingly, the NSERC / Desjardins / National Bank Industrial Research Chair in Cybersecurity was established by two financial institutions that compete in other contexts. On this subject, it can be said that unity is strength!
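A small worked example shows why rare events are difficult to model. The figures below (5 incidents among 1,000 observations) are invented for illustration. The sketch shows that a model which never raises an alert still looks excellent on accuracy, and it computes one common remedy, class weights inversely proportional to class frequency (the same heuristic as the "balanced" option found in libraries such as scikit-learn).

```python
from collections import Counter


def evaluate(y_true, y_pred):
    """Return accuracy, precision and recall for binary labels (1 = incident)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def balanced_class_weights(y_true):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(y_true)
    n_samples, n_classes = len(y_true), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical history: 995 normal observations and 5 incidents.
labels = [0] * 995 + [1] * 5
never_alerts = [0] * len(labels)  # a "model" that always predicts "no incident"

print(evaluate(labels, never_alerts))
print(balanced_class_weights(labels))
```

With these invented numbers, the always-negative model reaches 99.5% accuracy while detecting none of the incidents, which is why recall and precision matter more than accuracy here. The weights simply tell a learning algorithm to treat each incident as far more important than each normal observation. They do not replace the need for more incident data, which is the argument for shared platforms made above.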

Finally, preventing cyber incidents requires timely responsiveness. Therefore, solutions must be capable of providing real-time alerts and giving feedback at the right moment. From the perspective of solution engineering, this also poses a challenge.

All these challenges, in my opinion, argue for mobilizing experienced talent across the various data specialties, something that Videns provides, notably in the field of cybersecurity.

Right before finishing this article, I spent the week in Paris, where I had the opportunity to participate in a panel on AI and cybersecurity organized by a think tank called 'Les vendredis de la colline.' This group, composed of young professionals who graduated from major Parisian schools, takes it upon itself to reflect on and engage with current topics. The guest of the evening was Gaëlle Picard-Abezis (https://www.linkedin.com/in/gapjoa/). She had carefully developed the content and the theme and had suggested being joined by other speakers to enrich the exchanges. The discussions were of high caliber and highly relevant! I would like to close by opening a discussion on the question Mrs. Picard-Abezis raised about the term 'cybersecurity.' As the word 'security' refers more to the concept of defense, would it not be more appropriate to replace 'cybersecurity' with 'cybersafety,' which perhaps better encompasses the defensive and proactive components necessary for a good protection strategy? From the perspective of mobilizing AI for this subject, it's entirely consistent!