I’m told that every year RSA has a theme, and that this theme is predictive of the year to come for the information security industry. Sometimes that theme is hidden. Other times (such as last year), it is a race car engine with the words “Big Data” splattered all over it, jumping out at you on every corner.
At the Information Security Media Group RSA themes discussion on Day 3, Executive Editor of BankInfoSecurity Tracy Kitten remarked that the challenges she hears about today are decades old: “the challenge of siloed channels.” That is, institutions are still maintaining legacy infrastructure while investing in new technology. Couple her interpretation with the showroom-floor buzz around threats and the conference’s stated theme of sharing knowledge, and you have my take: this year’s theme was undoubtedly the proliferation and sharing of “Threat Intelligence.”
“Threat intelligence” means different things to different practitioners and vendors, so, since its gathering and dissemination runs through our company’s veins, I’ll share Risk I/O’s view.
Ryan Huber, our security architect, kicked things off with a talk at BSidesSF about his latest open-source project, Bouncer. Sharing this tool with the community under an open-source license is an excellent example of the kind of sharing of methodology and data that information security professionals need in order to stay ahead of adversaries.
I gave a talk the following day at BSides about the fruits of correlating disparate data sources and the kinds of insight this generates. My model of proper information security threat intelligence is a game-theoretic one, in which information about attackers’ potential actions, in-progress attacks, successful attacks, and near misses informs our decision making about vulnerability assessment and remediation. While still not a complete picture of the threat landscape, I used data from public vulnerability databases, the Metasploit Project, Exploit Database, and aggregated metadata from Risk I/O’s 2,000 enterprises, 1,500,000 live assets, and over 70 million live vulnerabilities to assess the effectiveness of CVSS as a remediation policy. Add to the mix data coming in every hour from the Open Threat Exchange (OTX), which looks for indicators of compromise across 20,000 enterprises, and the results were less than stellar. The best-case scenario (remediating CVSS 10 only) yielded a 3.5% predictive value positive for breaches ongoing in the last two months.
A better strategy is to look at Metasploit and Exploit Database: remediating vulnerabilities with entries in either yields almost a 30% predictive value positive while retaining over 40% sensitivity.
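To make those two metrics concrete: a remediation policy can be treated as a binary classifier over vulnerabilities, where a “positive” is a vulnerability the policy flags for remediation and ground truth is whether that vulnerability was actually observed in breach data. A minimal sketch, using made-up counts (not Risk I/O’s data) and a hypothetical helper function:

```python
# Treat a remediation policy as a binary classifier over vulnerabilities.
# tp = flagged AND observed in breaches, fp = flagged but never breached,
# fn = breached but not flagged by the policy.
def ppv_and_sensitivity(tp, fp, fn):
    """Return (predictive value positive, sensitivity) for a policy."""
    ppv = tp / (tp + fp)          # of the flagged vulns, fraction actually breached
    sensitivity = tp / (tp + fn)  # of the breached vulns, fraction the policy caught
    return ppv, sensitivity

# Illustrative numbers only: a policy flags 1,000 vulnerabilities, 300 of
# which appear in breach data, while 420 breached vulns go unflagged.
ppv, sens = ppv_and_sensitivity(tp=300, fp=700, fn=420)
print(f"PPV: {ppv:.1%}, sensitivity: {sens:.1%}")
# PPV: 30.0%, sensitivity: 41.7%
```

A high-PPV, low-sensitivity policy wastes little effort but misses many exploited vulnerabilities; the trade-off between the two is exactly what comparing CVSS-only remediation against exploit-database-driven remediation surfaces.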
Of note was the talk right before mine, by Trey Ford of Rapid7, which focused on the legislative realities of information security and why it is so difficult for vendors and businesses to come together in information-sharing efforts. He called for a cultural change wherein vendors, businesses, and federal agencies alike share both data and research in an effort to stay ahead of attackers. The kinds of partnerships we have forged are the beginning of such efforts taking shape, and staying ahead of attackers with data-driven insights is precisely what my talk followed up on.
Kymberlee Price from BlackBerry shared a similar story at Metricon 9, where she spoke about attempting to correlate public vulnerability data in order to baseline her incident response practice on the vendor side. In the discussion that followed, folks from academia and business alike debated the propensity of institutions to share data about vulnerabilities or malware, and we saw some great examples throughout Metricon.
My talk at Metricon centered on how we do data science operations here at Risk I/O, and how we intentionally simplify our process to allow for better transparency. We limit the complexity of the tools we use for data analysis so that the results can be reproduced back home. Everyone on the team, from the CEO to marketing to dev, helps generate data-driven insights. Everyone who works here is a data scientist.
You can get familiar with our data discovery and development process after the jump, where the transcript of the talk is posted, and, as always, reach out with any questions or comments right here or on Twitter.