Charlie Cheng, President and CEO, Kilopass Technology Inc.

The demarcation of “Big Data” for large organizations and “Little Data” for individual companies seems simple enough until one examines the motivation behind the data. Then it becomes quite clear that the “Big Data” problem can be solved quite elegantly through traditional branches of computer science, riding on the shoulders of semiconductor advances. Equally clear, though, is how ill-defined the “Little Data” problem is, and how much more challenging it is to solve “the” problem. The “Little Data” problem is one of artificial intelligence, with the Internet, enriched (or cluttered) by the Internet of Things (IoT), wearables, etc., serving as the database for automated decision-making for us human beings.

To provide a proper contextual background and contrast, it’s useful to summarize the “big” idea behind “Big Data.” One doesn’t always need to understand causality, because sometimes correlations are enough to predict trends and provide useful insights. The example made famous by Viktor Mayer-Schönberger and Kenneth Cukier in “Big Data: A Revolution That Will Transform How We Live, Work, and Think” is how Google can use its vast database of the Internet to accurately predict the movement of the annual flu season well before the Centers for Disease Control and Prevention (CDC) can.

It turns out that winter sports like basketball bring people together and, as a result, spread germs and viruses, so the Internet actually holds a more complete record of people’s movement, gathered automatically, than anything the CDC can assemble after a flu outbreak. While traditional scientific doctrine would strongly disagree with the blind application of correlation, the truth is that organizational behavior, however irrational, is often predictable through observations of past data.
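The correlation-first approach described above can be sketched in a few lines of code: given a weekly count of flu-related searches and a weekly count of reported cases (both series invented here purely for illustration), a correlation coefficient near 1.0 is all that is needed to treat search volume as an early proxy for case counts, with no causal model of why people search.

```python
# Sketch of correlation-based prediction, in the spirit of the Google flu
# example above. All numbers below are made up for illustration only.

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Weekly volume of flu-related searches (available immediately)...
searches = [120, 135, 180, 260, 410, 520, 480, 390]
# ...and confirmed flu cases (reported weeks later; hypothetical data).
cases = [10, 14, 22, 35, 60, 75, 70, 52]

r = pearson(searches, cases)
# A coefficient near 1.0 lets search volume stand in for case counts as a
# leading indicator -- correlation, not causation, doing the work.
print(f"correlation: {r:.2f}")
```

The point of the sketch is how little machinery is needed once the data exists: no epidemiological model, just two time series and one statistic.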

If the purpose of “Big Data” is to help organizations make decisions through new insights, the “Big Data” problem for the foreseeable future can be solved by gathering enough data, applying analytics, and coming up with useful correlations to support decision-making.

The initial pivot to “Little Data” must have been one of asking how the correlation model can be applied to the individual, and surely there are plenty of examples of early success in this pivot. Google, for example, revolutionized the search engine business by using correlation to present user-specific search results. If a user’s “cookie jar” is full of shoe shopping receipts, then the Google search results may favor some brands, products, stores, etc. based on this correlated data. The “Little Data” movement gathers steam as IoT and wearables create the potential to generate much more individualized data and, therefore, more personalized insights about what to do next.

For sure, there is plenty of low-hanging fruit ripe for the taking. It’s easy to imagine running shoes reporting when it’s time to replace them, and even how to run correctly to avoid injury. By extrapolation, the entire cadre of household appliances (down to light bulbs and electric blinds) should generate a simple list of repairs needed well before anything breaks down. The possibilities are endless, and each helps to eliminate a mundane task that we humans have to do but don’t like doing or can’t remember to do. From these possibilities, it’s easy to imagine how the semiconductor industry has a long and rosy road ahead in volume and revenue growth.

However, unlike “Big Data,” the “Little Data” paradigm has three unique problems, not all of them solvable through technological advances or philosophical discoveries. The uncertainty of where to locate the “server,” who makes the decision and, indeed, how to conquer mercurial human whims are the three “Little Data” problems that distinguish it from “Big Data.”

First, the server problem. The personal computing paradigm started with a PC for every business person, evolved to one PC per household, then to one PC per person, one smartphone per person, and now several “smart” devices per person. The problem of which device receives what data, and which decisions are made from what data, at first seems like a small issue of simply too many “smart” computing devices. However, it will evolve into the question of who has access to what data, and then ultimately how we can have the device(s) with us at all times.

Whereas the “Big Data” server is always in the cloud datacenter, with rapidly scalable computing power, the “Little Data” server suffers from an identity crisis. Too many individual devices confuse the user and force additional decisions about how to allocate decisions to different devices. Ultimately, the “Little Data” server problem will be solved, but for the next five to 10 years, it will get worse before it gets better.

The second unique problem with “Little Data” is more challenging. While both “Little Data” and “Big Data” try to correlate data to present options and possibilities, there is a “user versus vendor” issue that’s unique to “Little Data.”

“Big Data” presents options to the organization from within the organization and, therefore, doesn’t face issues of authenticity and perspective. Taking the “Little Data” example of when to replace a pair of worn running shoes, it’s easy to see that what Nike says may not be taken nearly as seriously as one’s own gathered statistics (50 runs of 5 miles or longer, for example).
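The user’s own criterion in this example is trivial to express in code. The sketch below is purely illustrative; the function and field names are hypothetical, and the only thing taken from the text is the “50 runs of 5 miles or longer” threshold, the kind of rule a user would trust over a vendor’s push notification.

```python
# Illustrative sketch of a user-defined replacement rule for running shoes,
# as opposed to a vendor-pushed recommendation. All names are hypothetical;
# the 50-runs-of-5-miles threshold is the example criterion from the text.

def shoes_worn_out(run_log, min_runs=50, min_miles=5.0):
    """Return True once the user's own criterion is met:
    at least `min_runs` runs of `min_miles` miles or longer."""
    long_runs = [miles for miles in run_log if miles >= min_miles]
    return len(long_runs) >= min_runs

# A log of run distances in miles (made-up data): 80 runs, 60 of them
# at 5 miles or longer, so the user's rule says it's time to replace.
run_log = [6.2, 3.1, 5.0, 8.0] * 20
print(shoes_worn_out(run_log))
```

The design point is that the criterion lives with the user, not the vendor: Nike could suggest any threshold it likes, but the decision rule here is the runner’s own.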

With enough tools and effort, “Little Data” can be set up by the individual who is interested in making decisions based on his or her own criteria. However, current trends in the online/mobile Internet suggest that more “Little Data” ideas and options will come from vendors than from users.

It’s one thing for an organization to project possible trends in order to facilitate better decision-making, but quite another when a person is inundated with unsolicited suggestions based on “Little Data.” Imagine the sidebar advertisements alongside Facebook posts or Google searches, but an order of magnitude greater in volume and far more personal and urgent. On the other hand, setting up one’s own “Little Data” scenario will become more complicated and more involved. For example, however misguided some of TripAdvisor’s applications of “Little Data” have been, it’s amazing how persistent the website is in coming up with correlated suggestions about where to stay and what to do.

One can only imagine the level of effort it would take to set up correct search metadata for a prospective hotel stay, and how the travel sites make it their business to constantly gather user preference data and history to suggest options. Expand this into all aspects of a person’s business and private life, and the number of “Little Data” problems one can try to automate, versus the number of suggestions from vendors, will quickly get out of hand.

Assuming one could actually solve the first two problems and automate the search for “Little Data” solutions, the next unique problem becomes a philosophical one: can a computer actually anticipate a human being’s desires and whims? The correlated approach to anticipating the future, which forms the basis for both “Big Data” and “Little Data,” assumes that the subject of interest behaves rationally. This is largely true in the “Big Data” space, but seldom so in “Little Data.”

Even with the most competent assistant and years of travel experience, last-minute changes in the trip and in personal circumstances can turn the best travel itinerary upside down. Just about every e-commerce site has a bit of “Little Data” built in, and it is flexible enough to react to the changing whims of the buyer. But the evolution of “Little Data” is supposed to expand the depth of search to whittle the options down to a few high-likelihood results, based on a much larger sample of user inputs. Combing through in-depth user data to create a “Little Data” result, only to have the user change the objective of the mission, would cost far more than a few seconds of computing time and effort. And that doesn’t count the wastefulness of all the user data gathered for the exercise.

The discussion so far may seem more suitable as an essay for an ethics philosophy course than as an analysis of how “Little Data” might impact semiconductor design and production. So let’s take a brief tour of how “Little Data” might evolve in the next 10 years.

For sure, wearable devices will begin to invade our clothing, shoes, and even skin. They will be able to gather data not just about our health, but eventually about our mood. They will be connected to a personal device that crawls through personal and environmental data and creates a laundry list of “to-dos” and when they should be done, from when to renew a passport or take a flu shot, all the way down to which hotel room type and air conditioner setting are best suited to the person.

As to these three “Little Data” problems, the semiconductor industry will likely focus on the most power-efficient computing paradigm to enable the over-production of “Little Data.” It will also likely create the lowest-power devices for wearables and personal devices. Ultimately, semiconductor technology is just the enabler. The evolution of Internet companies amid the flood of “Little Data” available to them will be one of the most interesting evolutions yet to come.