I was really excited for this particular session, as big data is an area that people are increasingly tackling, with fascinating results. The panelists included Janet Ramey, who is a Senior Director at Cisco; Moira Burke, who is a data scientist at Facebook; and Eva K. Lee, who holds a professorship at Georgia Tech. Monica Martinez-Canales, a Principal Engineer at Intel, hosted the panel.
We started with a broad overview of that question everyone’s asking: what is big data anyway?
Everyone’s asking because people are still trying to figure it out for themselves (see Cloud Computing). Answers ranged from “it might not actually be that big” to “collections of data over 100 terabytes” to “data that comes from two or more sources” to “anything that doesn’t fit on the servers.” However, the session’s conclusion was that big data is the intersection of variety, volume, value, and velocity, with viscosity (stickiness) and variability (inherent uncertainty) thrown in for good measure.
The Panelists
Janet Ramey kicked off the panel by introducing herself. Like me, her background is in liberal arts (go liberal arts!), and her transition to engineering went through intermediate careers, including tech writing. At Cisco, big data comes into play during support forecasting. They use data from customers, their engineers’ backgrounds, and previous cases to construct a model of what resources they’ll need, how they should train their employees, and what issues they’re likely to encounter.
Moira Burke went next, talking about how the data they gather at Facebook can help drive the product itself. She focused on whether online social interaction can actually improve users’ well-being. Using analysis of surveys (based upon the ISEL scale) and server logs, they were able to determine that the most beneficial form of online interaction is what she terms directed communication (that is, one-on-one interaction).
Eva K. Lee followed, delving into research and biomedical applications. Medical records are a prime example of big data, with high-resolution scans, lab results, doctors’ visits, and unstructured data making “electronically thick” patient files. Due to the vast quantity of data, most healthcare professionals simply cannot process all the information, so this is a natural place for technology to step in and help make sense of it all.
Takeaways
Big data is here, and we’re playing catch-up in handling it. Instead of pushing it off to the side and saying “Oh, I’ll deal with that later,” we should be actively investigating the best ways to make it work for us now. Career opportunities in big data abound, so if you’re looking for a space to get involved, you may have found it!
You can read Kate’s coverage of the session on her blog.