Building a Threat Intelligence Program: Gathering TIBy Mike Rothman
[Note: We received some feedback on the series that prompted us to clarify what we meant by scale and context towards the end of the post. See? We do listen to feedback on the posts. - Mike]
We started documenting how to build a Threat Intelligence program in our first post, so now it’s time to dig into the mechanics of thinking more strategically and systematically about how to benefit from the misfortune of others and make the best use of TI. It’s hard to use TI you don’t actually have yet, so the first step is to gather the TI you need.
Defining TI Requirements
A ton of external security data available. The threat intelligence market has exploded over the past year. Not only are dozens of emerging companies offering various kinds of security data, but many existing security vendors are trying to introduce TI services as well, to capitalize on the hype. We also see a number of new companies with offerings to help collect, aggregate, and analyze TI. But we aren’t interested in hype – what new products and services can improve your security posture? With no lack of options, how can you choose the most effective TI for you?
As always, we suggest you start by defining your problem, and then identifying the offerings that would help you solve it most effectively. Start with your the primary use case for threat intel. Basically, what is the catalyst to spend money? That’s the place to start. Our research indicates this catalyst is typically one of a handful of issues:
- Attack prevention/detection: This is the primary use case for most TI investments. Basically you can’t keep pace with adversaries, so you need external security data to tell you what to look for (and possibly block). This budget tends to be associated with advanced attackers, so if there is concern about them within the executive suite, this is likely the best place to start.
- Forensics: If you have a successful compromise you will want TI to help narrow the focus of your investigation. This process is outlined in our Threat Intelligence + Incident Response research.
- Hunting: Some organizations have teams tasked to find evidence of adversary activity within the environment, even if existing alerting/detection technologies are not finding anything. These skilled practitioners can use new malware samples from a TI service effectively, then can also use the latest information about adversaries to look for them before they act overtly (and trigger traditional detection).
Once you have identified primary and secondary use cases, you need to look at potential adversaries. Specific TI sources – both platform vendors and pure data providers – specialize in specific adversaries or target types. Take a similar approach with adversaries: understand who your primary attackers are likely to be, and find providers with expertise in tracking them.
The last part of defining TI requirements is to decide how you will use the data. Will it trigger automated blocking on active controls, as described in Applied Threat Intelligence? Will data be pumped into your SIEM or other security monitors for alerting as described in Threat Intelligence and Security Monitoring? Will TI only be used by advanced adversary hunters? You need to answer these questions to understand how to integrate TI into your monitors and controls.
When thinking about threat intelligence programmatically, think not just about how you can use TI today, but also what you want to do further down the line. Is automatic blocking based on TI realistic? If so that raises different considerations that just monitoring. This aspirational thinking can demand flexibility that gives you better options moving forward. You don’t want to be tied into a specific TI data source, and maybe not even to a specific aggregation platform. A TI program is about how to leverage data in your security program, not how to use today’s data services. That’s why we suggest focusing on your requirements first, and then finding optimal solutions.
After you define what you need from TI, how will you pay for it? We know, that’s a pesky detail, but it is important, as you set up a TI program, to figure out which executive sponsors will support it and whether that funding source is sustainable.
When a breach happens, a ton of money gets spent on anything and everything to make it go away. There is no resistance to funding security projects, until there is – which tends to happen once the road rash heals a bit. So you need to line up support for using external data and ensure you have got a funding source that sees the value of investment now and in the future.
Depending on your organization security may have its own budget to spend on key technologies; in that case you just build the cost into the security operations budget because TI is be sold on a subscription basis. If you need to associate specific spending with specific projects, you’ll need to find the right budget sources. We suggest you stay as close to advanced threat prevention/detection as you can because that’s the easiest case to make for TI.
How much money do you need? Of course that depends on the size of your organization. At this point many TI data services are priced at a flat annual rate, which is great for a huge company which can leverage the data. If you have a smaller team you’ll need to work with the vendor on lower pricing or different pricing models, or look at lower cost alternatives. For TI platform expenditures, which we will discuss later in the series, you will probably be looking at a per-seat cost.
As you are building out your program it makes sense to talk to some TI providers to get preliminary quotes on what their services cost. Don’t get these folks engaged in a sales cycle before you are ready, but you need a feel for current pricing – that is something any potential executive sponsor needs to know.
While we are discussing money, this is a good point to start thinking about how to quantify the value of your TI investment. You defined your requirements, so within each use case how will you substantiate value? Is it about the number of attacks you block based on the data? Or perhaps an estimate of how adversary dwell time decreased once you were able to search for activity based on TI indicators. It’s never too early to start defining success criteria, deciding how to quantify success, and ensuring you have adequate metrics to substantiate achievements. This is a key topic, which we will dig into later in this series.
Selecting Data Sources
Next you start to gather data to help you identify and detect the activity of potential adversaries in your environment. You can get effective threat intelligence from a variety of different sources. We divide security monitoring feeds into five high-level categories:
- Compromised Devices: This data source provides external notification that a device is acting suspiciously by communicating with known bad sites or participating in botnet-like activities. Services are emerging to mine large volumes of Internet traffic to identify such devices.
- Malware Indicators: Malware analysis continues to mature rapidly, getting better and better at understanding exactly what malicious code does to devices. This enables you to define both technical and behavioral indicators to search for within your environment, as Malware Analysis Quant described in gory detail.
- IP Reputation: The most common reputation data is based on IP addresses and provides a dynamic list of known bad and/or suspicious addresses. IP reputation has evolved since its introduction, now featuring scores to compare the relative maliciousness of different addresses, as well as factoring in additional context such as Tor nodes/anonymous proxies, geolocation, and device ID to further refine reputation.
- Command and Control Networks: One specialized type of reputation often packaged as a separate feed is intelligence on command and control (C&C) networks. These feeds track global C&C traffic and pinpoint malware originators, botnet controllers, and other IP addresses and sites you should look for as you monitor your environment.
- Phishing Messages: Most advanced attacks seem to start with a simple email. Given the ubiquity of email and the ease of adding links to messages, attackers typically use email as the path of least resistance to a foothold in your environment. Isolating and analyzing phishing email can yield valuable information about attackers and tactics.
These security data types are available in a variety of packages. Here are the main categories:
- Commercial integrated: Every security vendor seems to have a research group providing some type of intelligence. This data is usually very tightly integrated into their product or service. Sometimes there is a separate charge for the intelligence, and other times it is bundled into the product or service.
- Commercial standalone: We see an emerging security market for standalone threat intel. These vendors typically offer an aggregation platform to collect external data and integrate into controls and monitoring systems. Some also gather industry-specific data because attacks tend to cluster around specific industries.
- ISAC: Information Sharing and Analysis Centers are industry-specific organizations that aggregate data for an industry and share it among members. The best known ISAC is for the financial industry, although many other industry associations are spinning up their own ISACs as well.
- OSINT: Finally open source intel encompasses a variety of publicly available sources for things like malware samples and IP reputation, which can be integrated directly into other systems.
The best way to figure out which data sources are useful is to actually use them. Yes, that means a proof of concept for the services. You can’t look at all the data sources, but pick a handful and start looking through the feeds. Perhaps integrate data into your monitors (SIEM and IPS) in alert-only mode, and see what you’d block or alert on, to get a feel for its value. Is the interface one you can use effectively? Does it take professional services to integrate the feed into your environment? Does a TI platform provide enough value to look at it every day, in addition to the 5-10 other consoles you need to deal with? These are all questions you should be able to answer before you write a check.
Many early threat intelligence services focused on general security data, identifying malware indicators and tracking malicious sites. But how does that apply to your environment? That is where the TI business is going. Both providing more context for generic data, and applying it to your environment (typically through a Threat Intel Platform), as well as having researchers focus specifically on your organization.
This company-specific information comes in a few flavors, including:
- Brand protection: Misuse of a company’s brand can be very damaging. So proactively looking for unauthorized brand uses (like on a phishing site) or negative comments in social media fora can help shorten the window between negative information appearing and getting it taken down.
- Attacker networks: Sometimes your internal detection capabilities fail, so you have compromised devices you don’t know about. These services mine command and control networks to look for your devices. Obviously it’s late if you find your device actively participating in these networks, but better find it before your payment processor or law enforcement tells you you have a problem.
- Third party risk: Another type of interesting information is about business partners. This isn’t necessarily direct risk, but knowing that you connect to networks with security problems can tip you to implement additional controls on those connections, or more aggressively monitor data exchanges with that partner.
The more context you can derive from the TI, the better. For example, if you’re part of a highly targeted industry, information about attacks in your industry can be particularly useful. It’s also great to have a service provider proactively look for your data in external forums, and watch for indications that your devices are part of attacker networks. But this context will come at a cost; you will need to evaluate the additional expense of custom threat information and your own ability to act on it. This is a key important consideration. Additional context is useful if your security program and staff can take advantage of it.
If you use multiple threat intelligence sources you will want to make sure you don’t get duplicate alerts. Key to determining overlap is understanding how each intelligence vendor gets its data. Do they use honeypots? Do they mine DNS traffic and track new domain registrations? Have they built a cloud-based malware analysis/sandboxing capability? You can categorize vendors by their tactics to make sure you don’t pay for redundant data sets.
This is a good use for a TI platform, aggregating intelligence and making sure you only see actionable alerts. As described above, you’ll want to test these services to see how they work for you. In a crowded market vendors try to differentiate by taking liberties with what their services and products actually do. Be careful not to fall for marketing hyperbole about proprietary algorithms, Big Data analysis, staff linguists penetrating hacker dens, or other stories straight out of a spy novel. Buyer beware, and make sure you put each provider through its paces before you commit.
Our last point on external data in your TI program concerns short agreements, especially up front. You cannot know how these services will work for you until you actually start using them. Many threat intelligence companies are startups, and might not be around in 3-4 years. Once you identify a set of core intelligence feeds that work consistently and effectively you can look at longer deals, but we recommend not doing that until your TI process matures and your intelligence vendor establishes a track record.
One of the things you have to keep in mind is the sheer number of indicators that come into play, especially when using multiple threat intelligence services. So you need to make sure you build a step into your TI process is to provide context for the threat intelligence feeds prior to operationalizing them. That means you want to tailor what gets fed into the TI platform (which we discuss in the next post), so that all searching and indexing is only done for intelligence sources that are relevant for your organization. To use a very simplistic example, if you only have Mac devices on your network, getting a bunch of TI indicators for Windows Vista attacks will just clutter up your system and impact performance.
It’s a typical funnel concept. There are millions of indicators that you can get via a TI service. Only hundreds may apply to your environment. You want to do some pre-processing of the TI as it comes into your environment to get rid of the data that isn’t relevant making any alerts more actionable and allowing you to prioritize efforts on the attacks that present real risk.
Now that you have selected threat intelligence feeds, you need to put it to work. Our next post will focus on what that means, and how TI can favorably impact your security program.