The threat intelligence company Recorded Future announced on Tuesday that it is rolling out a generative artificial intelligence tool that relies on a fine-tuned version of OpenAI’s GPT model to synthesize data.
Rapid advances in generative AI in recent months have led to a flurry of initiatives by companies to incorporate the technology into their offerings, and companies such as Recorded Future — with its massive trove of proprietary data — are showing how the technology is likely to be incorporated into products in the short term.
Over the course of nearly 15 years in business, Recorded Future has collected a huge amount of data on the activity of malicious hackers, their technical infrastructure and criminal campaigns. The company has used that data to tune a version of OpenAI’s deep learning models to build a tool that summarizes data and events for analysts and clients. By connecting the AI model to its intelligence graph, which collects data from across the web, the model will include near real-time information about commonly exploited vulnerabilities or recent breaches.
“This is something that for a human analyst can take several hours — reading all this material and then generating a summary,” Staffan Truvé, co-founder and chief technology officer of Recorded Future, told CyberScoop. “As you move through the information, you now have someone summarizing it in real time.”
Cybersecurity companies have broadly incorporated AI into their products over the past decade, but the next step of incorporating machine learning into corporate applications is figuring out how to build useful generative tools.
Companies such as Recorded Future with large internal data holdings have in recent months embraced deep learning technology to build generative AI tools. Late last month, Bloomberg rolled out BloombergGPT, a 50-billion-parameter model trained on financial data.
By taking large data holdings and feeding them into AI models, companies like Recorded Future and Bloomberg are attempting to build generative AI systems that are finely tuned to answering the questions that their clients rely on them to answer. Companies with large data holdings will likely look to generative AI to turn that data into a more productive resource.
But Bloomberg and Recorded Future also offer an example of how companies can take different approaches in building generative AI models with major implications for the broader industry. While Bloomberg has built its own bespoke model, Recorded Future relies on OpenAI’s foundational GPT model and pays the company based on how much it queries the model.
While Truvé would not comment on the financial terms of the relationship between Recorded Future and OpenAI, it is likely that these types of business-to-business deals represent a fairly lucrative business model for OpenAI, a company that faces a difficult road to profitability given the staggering computing costs of training its models.
It’s difficult to evaluate the quality of Recorded Future’s AI offerings. The company has not tested its model against standard AI benchmarking tools, instead relying on its in-house analysts to test and verify its accuracy. The company relies on OpenAI’s most advanced GPT models, but OpenAI has severely limited the amount of information it makes available about its top-of-the-line products.
In their eagerness to answer questions, advanced AI models are prone to hallucination — confidently stating a piece of information as fact that has no basis in reality. But Truvé said the company’s model manages to mostly avoid hallucinating in large part because its primary application is in summarizing a body of information returned as part of a query.
Indeed, the performance of Recorded Future’s AI is aided by the fact that its purpose is fairly straightforward. The company’s AI feature functions mainly as a summarizing tool, and Truvé sees the AI tool as something that will augment cybersecurity analysts.
“The challenge facing people in cybersecurity is that there is too much information and too few people to process it,” Truvé said. “This tries to solve the lack of time available to analysts and the rather acute lack of analysts.”