NSA is gathering data from social media – but so is everyone else

I previously posted about my intention to do a digest / teardown talk for the local security group on the ongoing NSA leaks and the public discussion around them. Unfortunately, this intention keeps getting sidetracked by new releases and articles. The difficulty and frustration in tracking, digesting, and distilling all these posts stems from the sensational nature of the prose coupled with the sketchy details of the documents released to support the various reporters’ narratives.

Take the Fantastico reporting, which Bruce Schneier also comments on. The screenshots and other materials these reports are based on seem to be primarily PowerPoint presentations; in this case, Fantastico is talking about a Man in the Middle (“MITM”) attack evidenced by a highly sophisticated PowerPoint slide (sarcasm). This crude drawing tells us nothing substantial about what is going on beyond a generic MITM attack. In fact, the slide doesn’t even indicate whether the traffic was encrypted. A MITM attack in which the NSA used a compromised or forged Google SSL certificate would be interesting, but one where they just added a route to the routing table for Google traffic is not interesting from a technical perspective. Sure, it is very interesting on a policy level, but it doesn’t even earn a C for effort on the technical side.
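To make the distinction concrete: a forged certificate is something a client can actually notice, because the certificate bytes differ from the genuine ones. Here is a minimal sketch of that check in Python, comparing a SHA-256 fingerprint against a known-good value (pinning). The function names and the idea that you already hold a trusted fingerprint are my assumptions for illustration; nothing in the leaked slide says anything about how the attack was detected or carried out. Note that this only catches a swapped certificate, not a stolen private key used with the genuine certificate.

```python
import hashlib
import hmac
import ssl


def fingerprint(der_bytes: bytes) -> str:
    """Return the SHA-256 fingerprint of a DER-encoded certificate as lowercase hex."""
    return hashlib.sha256(der_bytes).hexdigest()


def matches_pin(der_bytes: bytes, pinned_fingerprint: str) -> bool:
    """True if the certificate's fingerprint equals the pinned (known-good) value."""
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(fingerprint(der_bytes), pinned_fingerprint.lower())


def fetch_der(host: str, port: int = 443) -> bytes:
    """Fetch the certificate a server presents and convert it from PEM to DER.

    Hypothetical helper: it needs network access and performs no chain validation.
    """
    pem = ssl.get_server_certificate((host, port))
    return ssl.PEM_cert_to_DER_cert(pem)
```

In use, you would record the fingerprint over a connection you trust, then call `matches_pin(fetch_der(host), recorded_fingerprint)` from the network you are suspicious of. A mismatch means someone is presenting a different certificate, though legitimate certificate rotation causes mismatches too.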

There are other reports discussing SSL interception and/or potentially intentional weaknesses introduced into various encryption standards and/or their default settings. If those types of issues were implicated in the above MITM slide, that would be worth talking about – but therein lies my point. The media coverage is screaming that the sky is falling, but those holding the Snowden documents have been stingy about releasing enough of them to corroborate the technical merits of their assertions. Clearly, there is enough out there now to warrant an open public discussion on privacy, legal protections for civil liberties, and the internet generally in the context of the NSA’s work, but the echo chamber the media is currently screaming in does not make objective analysis easy (especially with so little primary source information).

Finally, some of the ‘revelations’ are not even new. The latest brouhaha is over the NSA gathering data from social media networks for data mining purposes and to aggregate with other data sources. This is neither a revelation nor a secret. We know that because they specifically asked for it in the 2013 Intelligence Community Postdoctoral Research Fellowship Program (Research Topic 12.9), which I actually applied for way back in December 2012! (Too bad my proposal wasn’t accepted, but if you three-letter types change your mind, give me a ring!)

12.9. Methods and Techniques for Big Data Analysis of Social Media
Proposed Research Project:
The goal of this proposal is to advance intelligence analysis by adapting existing methodologies/ practices and developing new, more effective quantitative methodologies for big data analysis of social media data. Possible research may include an examination, extension, and/ or implementation of the ideas and strategies already developed in government, industry and academia to include research findings from programs such as Social Media in Strategic Communication, DEFT, and XDATA (DARPA), Open Source Indicators (IARPA) and Social Media Dashboard (ONR).
Technical Objectives:
• Content – Advance the state-of-practice for mining social media for building an intelligence picture applicable to areas such as indications and warning; sentiment analysis; decision-making; strategic, operational, and tactical analysis; operations planning, etc. Potential areas of emphasis include the following: (1) Modeling diffusion of information across social media networks and geographic locations to project the rate and location of the diffusion over time. (2) Modeling to predict the future time and place of an individual or group of individuals using social media with varying degrees of geolocational accuracy.
• Materials – Develop a method to evaluate and compare available data sources and existing tools for sentiment/text mining of social media and fusion. Identify gaps where new tools could be used effectively, report and present findings and proposed way ahead to a working group of subject matter experts. Develop innovative solutions against the gaps. This may involve research in mining large data sets, human factors, machine and other techniques for translating social media from foreign languages, systems-of-systems and network analysis, and behavioral and cognitive psychology.
• Presentation – Discover optimal methods for creating and presenting a fused intelligence picture that better reflects ground truth for use by other analysts and potentially operators/decision-makers. This may involve research into understanding complex problems and environments and then breaking down identities and relationships into simplified components to provide clear, practical analysis.
Possible outcomes from this research include a white paper describing the research in these areas. Proposals should discuss potential approaches of research and application to actual data. For more information on topic 12.9, submit inquiries per the procedures in Section 5.2 of this Research Solicitation.
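For anyone wondering what area of emphasis (1) might look like in practice, here is a minimal sketch of one textbook approach, the independent cascade model, in Python. To be clear, this is my own illustration and not anything from the solicitation: the graph layout, the function name, and the single `share_probability` parameter are all assumptions, and real work would estimate per-edge probabilities from data.

```python
import random
from collections import deque


def independent_cascade(graph, seeds, share_probability, rng=None):
    """Simulate one run of information spreading over a follower graph.

    graph: dict mapping each account to the accounts that see its posts.
    seeds: accounts that post the item at step 0.
    share_probability: chance that a given exposure leads to a re-share
        (assumed uniform here purely for simplicity).
    Returns a dict mapping each account that shared the item to the step it did so.
    """
    rng = rng or random.Random()
    activated_at = {node: 0 for node in seeds}
    frontier = deque(seeds)
    while frontier:
        node = frontier.popleft()
        for follower in graph.get(node, ()):
            # Each sharer gets exactly one chance to convince each follower.
            if follower not in activated_at and rng.random() < share_probability:
                activated_at[follower] = activated_at[node] + 1
                frontier.append(follower)
    return activated_at
```

Running this many times and averaging how many accounts are reached by each step gives a crude projection of the rate of diffusion; attaching a location to each account would extend the same idea to the "where".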

That the intelligence community was, and is, pursuing this line of research should surprise no one, not least because they publicly solicit research proposals asking for exactly that.

