And back in college, I got this idea in my head that I wanted to be a director. That's where all the predictable guys end up once they realize they'll never be an athlete or a guitarist.
Excited by the prospect of producing the next Sneakers (best. film. ever.), I set about gathering equipment. I built my own Steadicam out of parts from Lowe's. I secured a wheelchair to use for tracking shots. I knew how I'd do the long, fluid takes that eventually would become synonymous with "Foreman."
The problem was that none of this really mattered. Because I didn't know what I wanted to shoot or why I wanted to shoot it. Creating even the bones of a story, much less a compelling narrative, didn't really interest me. No, all I cared about was the How. How I would shoot the film. I obsessed over the tools.
And ultimately, I never shot one scene.
This is exactly where we find ourselves in the world of big data today. The many vendors who drive the conversation at conferences and on tech blogs are concerned primarily with the How. That's their business: building technology and providing services to make your big data fantasy a reality. It's your job, not theirs, to articulate whatever that fantasy is.
Whenever I meet other data science practitioners, I listen carefully to how they introduce their work.
If they say something like, "We're using some cool technology to do X," and then they proceed to tell me about this X they're doing, then I know this person's project stands a fighting chance. They know what they're building.
But when I hear, "We're doing some cool stuff using X technology," and then they proceed to tell me about their stack, I get a little nervous. Can they even define "cool stuff" or are they just tinkering?
Now, I get that you need to choose wisely the technologies you use to solve a problem. But the exciting part should be the business that's being done. So many folks are being pressured into doing a project, ANY PROJECT WILL DO, that uses Hadoop. Because their boss's boss wants a report on how the company is "doing big data." This is a regrettable situation. Not every business needs to do Big Data, which is why I really appreciated Evan Miller's grounded post on predictive analytics last week.
If I can't clearly articulate to my peers what my analytics project is and why I'm doing it, then forget everything else. Hell, that's why I'm writing an analytics book completely in spreadsheets -- because I'm tired of the tool discussion. When you use the most vanilla tools, the business problems come back into view.
Since my failed movie venture, I've swung completely in the other direction. I obsess over what business problems I should be solving with analytics and why they need solving. What does it get my company (MailChimp), and how does it help our customers?
Can you articulate the business problem that you're throwing software, talent, and hardware at? Or are you just buying tools that are looking for a use?