It seems like everyone is writing about big data these days. Many invoke the 3Vs in their definition: velocity (the speed at which data accumulates), volume, and variety of information. That is a good literal definition of what makes data “big,” but it doesn’t really say why it’s important or what can be done with it.
Big data promises to improve efficiency and productivity, decision-making, consumer experiences, and much more. By collecting and analyzing data about past events, from consumer trends to weather patterns, big data can help forecast future demands and even events.
Imagine if grocery stores knew exactly how much fish to order and when, reducing waste and improving sustainability; if meteorologists could predict droughts years in advance; and if, when you shopped for new clothes, the store knew how to direct you to what you want. That is the promise of big data, and forecasting is but one of its possible applications.
In a sense, we have all been doing this for centuries. Chefs change restaurant menus seasonally, reflecting what they anticipate will be fresh and what customers want to eat in warm or cold weather. Ben Franklin’s “Poor Richard’s Almanack” gave farmers advice on weather patterns in the 18th century. And stores change over their inventory to anticipate customer demands: swimsuits and exercise equipment in the spring, coats and crockpots in autumn. It is a matter of common sense applied to everyday observations.
The difference is that big data does this on the scale of an entire system, whether it is a human society, global weather, or the financial markets. Goodbye crystal balls, hello data analytics.
Now, I’m not a data scientist and I’m not writing this to try to convince you that big data is good or important. Just about everyone seems to think it is both, anyway. I am an intellectual property attorney and a public policy specialist. So, when I read what is out there on big data, I see some important needs and policy questions.
Perhaps the most misleading aspect of the term big data is the implication that its mere existence will revolutionize the world. Data, no matter how massive, will not produce any results by itself. In the terminology of an age gone by, it is raw material. First, it needs to be collected and stored. Second (and equally important), in order to be useful, it must be analyzed and the results of that analysis interpreted and communicated in a meaningful way. The fruits of intellectual property make that happen.
Collecting and storing data involves patented hardware. Organizing and analyzing the data involves software that is probably both patentable and copyrightable. The computer processors that run those programs are likely patented. The data itself may be a proprietary trade secret. Reports and interpretations that are produced are copyrightable. And trademarks will help us identify the companies that produce the best analyses and forecasts.
Innovation doesn’t happen, and certainly doesn’t make it into the marketplace, unless people think they have a chance to sustain a business with it. Some innovations are hard to reverse engineer, access, or copy. Examples include the formula for a soft drink, a computer algorithm that drives a service, or a walled garden of online data. Businesses whose competitive advantage stems from that class of innovation are less likely to rely on the law for protection from unfair competition, but they are vulnerable to industrial espionage and may face calls to release or at least license their proprietary information.
Other innovations are easy to copy once they have been released into the market, like pharmaceutical, software, and entertainment products. To preserve their ability to recoup investments in developing such innovation, companies that produce these kinds of products must rely on patent, copyright, and trademark laws to fend off unfair competition and free riders. This sometimes subjects them to arguments that use of their products is unduly restricted.
As the use of big data grows and develops, policymakers and businesses will face fundamental questions. What are the right incentives to encourage both the collection of data and a market for that data? How can we maximize the development and implementation of tools to analyze and interpret big data? Who should make those decisions: the government or the private sector? At the conceptual level, these are very old questions. History teaches that vibrant intellectual property protection (much of which is already in place) is a critical piece of a policy structure that will support innovation and growth. What remains is to apply it to the specifics of the big data marketplace, not only as it stands now, but with an eye towards how it may develop.
There are many other related issues directly raised by the collection and use of big data, including privacy, cyber security, and competition. There are also more indirect (though no less critical) issues, such as education and immigration policies that can ensure we have properly qualified people for this burgeoning field.
Too often in Washington, issues don’t get attention until there is a crisis. Here we have an opportunity to get it right from early on. We can see this train coming down the line. Let’s try to steer it onto the right tracks through an open public discussion of these issues and the development of balanced policies now. The alternative is to wait for problems and inefficiencies to arise and then try to fix them after the fact, in the face of established interests trying to protect their investments and business models.
This blog was first posted at: http://www.uschamberfoundation.org/blog/2014/04/big-data-and-intellectual-property-go-hand-hand
I had the opportunity to discuss the importance of statutory damages in the Copyright Act on the first panel at a meeting convened by the U.S. Department of Commerce. The webcast of the panel can be found here (scroll down to the 7th video).