Deloitte AI Institute
An octopus lives in my data lake
The evolution of more intelligent data access, processing, and insights
Searching for intelligent data
We are all guilty of generating huge amounts of data every day. IDC/Statista estimates that by 2024, data consumption will near 149 zettabytes (that’s 21 zeros after 149). And in every walk of life, we struggle to organize this data in a manner that allows us to access it in a smart fashion. Just think of the photos on your own mobile devices, let alone the files your company creates daily.
From a corporate perspective, we have moved from data warehouses to data swamps to data lakes, yet we are drowning, not swimming, in data. Accessing it the way we want to and need to is difficult at best. What we need is an octopus. Octopuses are known for their intelligence. That’s because an octopus has nine brains: a central brain and a mini brain on each of its eight arms that can act independently. Their cognitive abilities are impressive. The octopus is a model for how organizations can handle data more effectively and efficiently, both centrally in the cloud and locally at the edge.
How data warehouses evolved
We collect data from a variety of sources and put it into a central storage unit so we can access what we want, when we want it. The rationale for this traditional practice was that data stored locally by units throughout the enterprise was unknown or couldn’t be processed as is. Hence, it needed to be pulled into one central location or warehouse. Next, rules were created around managing the data and making sure it was all consistent. Technology had to be upgraded to ensure there was enough capacity to handle all the data and accompanying rules. Governance was then established so everyone who should be accessing the data could get to it.
This practice has led to huge demand for, and investment in, storage, software, computing power, networking, and more over the years. Results emerged at a trickle because most of the focus remained on simply managing the data. Data is not a strategic asset if people can’t effectively find and pull information from it the moment they need it.
It’s time to flip the data-storage equation so the majority of time and energy goes to processing the data and using it to drive results. To do that, data access should be intelligent. We should leave the data where it originated and only pull from it what we need, when we need it. Imagine how fresh the data would be if it were just pulled out of the ocean or plucked from the tree that same day, that same hour.
Is the investment in data worth insights gleaned?
Figure: the share of effort that has gone into creating huge data warehouses, compared with the share that has gone into analyzing insights from the data and generating value from it.
The octopus reach
Instead of a central warehouse, imagine data continues to lie in its native form, at the source where it is generated—a data lake. There are many such data lakes and there is an octopus that extends its tentacles into each of these lakes, pulling information from there as needed.
That information is pumped up to the main octopus brain and processed there for simple visualization and eventual insight generation.
What’s different from typical data warehousing? Quite a bit.
- First of all, because the data is not transferred to a central repository, it’s left in its natural state with all of its characteristics, as opposed to being forced to fit into a central warehouse where some of the characteristics could get lost or left behind.
- Second, the data is modular and free flowing. Parts of it can be extracted dynamically by a central brain in the cloud as needed. The rest can be left behind.
- Finally, some of the data can be processed right there in its own lake, at the edge, in a more focused and timely manner. As the main brain in the cloud and the mini brain on the ground share information, they both get smarter, bringing additional insights to light.
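To make the contrast with a central warehouse concrete, here is a minimal sketch of an octopus that leaves records in their lakes and pulls only what a question needs. The `Lake` and `Octopus` classes, the lake names, and the sample records are all hypothetical and invented for illustration; a real deployment would query actual storage systems rather than in-memory lists.

```python
from dataclasses import dataclass, field


@dataclass
class Lake:
    """A hypothetical data source that keeps its records in place."""
    name: str
    records: list = field(default_factory=list)

    def query(self, predicate):
        # Only records matching the predicate leave the lake.
        return [r for r in self.records if predicate(r)]


class Octopus:
    """Hypothetical central brain that reaches into many lakes on demand."""

    def __init__(self, lakes):
        self.lakes = lakes

    def pull(self, predicate):
        # Extend a "tentacle" into each lake and collect only what is needed.
        results = []
        for lake in self.lakes:
            for record in lake.query(predicate):
                results.append({"lake": lake.name, **record})
        return results


# Made-up sample data: two lakes, each holding salmon and tuna records.
lakes = [
    Lake("north", [{"species": "salmon", "kg": 4.2}, {"species": "tuna", "kg": 30.0}]),
    Lake("south", [{"species": "tuna", "kg": 28.5}, {"species": "salmon", "kg": 3.9}]),
]
octopus = Octopus(lakes)
salmon = octopus.pull(lambda r: r["species"] == "salmon")
```

Nothing is copied into a central store ahead of time. The tuna records never leave their lakes, and each lake still holds all of its original data after the pull.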
How it works: The octopus brain at work—in the cloud
Let’s say the octopus is looking for a very specific, rare type of salmon. The octopus brain can extract data from all over the world, not just from its own domain, to figure out what the salmon looks like and where it’s located. It can then use algorithms, or machine learning, to make sense of the data and provide a clear result. How?
The main octopus brain sorts out the three to five specific variables that are most significant to finding that rare type of salmon from the 20 or 200 variables that might be impactful. For example, it quickly determines where similar salmon were sold and disregards tuna sales. As it’s looking for information about the salmon, the central brain figures out and grabs just what relates to the salmon and leaves behind information that isn’t relevant. As this happens repeatedly, the octopus brain becomes very adept at learning from its experience. It has a better chance of finding the salmon again—faster than it did before.
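One simple way to picture this narrowing-down step is to score every candidate variable against the outcome and keep only the strongest few. The sketch below ranks variables by the absolute value of their Pearson correlation with a "salmon found" flag. Correlation ranking is our stand-in for illustration, not a method named in this article, and the variable names and numbers are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def top_variables(columns, target, k=3):
    """Rank candidate variables by |correlation| with the target and keep k."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Made-up observations: 1 means a rare salmon was found at that site.
found = [1, 0, 1, 0, 1, 0]
columns = {
    "water_temp_c": [8, 15, 9, 16, 7, 14],  # colder where salmon were found
    "salmon_sales": [5, 1, 6, 0, 7, 1],     # nearby salmon sales
    "tuna_sales": [3, 3, 4, 4, 3, 4],       # weakly related
    "day_of_week": [1, 1, 2, 2, 3, 3],      # unrelated in this sample
}
selected = top_variables(columns, found, k=2)
```

With this sample, water temperature and salmon sales are kept, while tuna sales and day of week are left behind. Real feature selection would use far more data and would guard against variables that merely duplicate one another.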
How it works: The tentacle brain at work—at the edge
This type of learning is transferred to the mini brain node in the tentacles, and eventually the irrelevant information is not even transferred to the main octopus brain because the tentacle nodes become smart. The data isn’t just analyzed in the main brain, but at the node, before the computed results are sent to the main brain. So when the main brain asks for rare salmon, the mini brain at the edge doesn’t bother pulling data about tuna. It knows what the big brain is looking for. Edge analytics and computing happen locally where the data originates.
This type of intelligent computing isn’t easy. Global companies are just starting to solve for it, but it’s fast approaching. Imagine this.
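The edge behavior described here can be sketched as a node that filters and summarizes locally, then sends only the computed result upstream. The `EdgeNode` class, the node name, and the records below are hypothetical and exist only for illustration.

```python
class EdgeNode:
    """Hypothetical mini brain that sits next to one data lake."""

    def __init__(self, name, records):
        self.name = name
        self.records = records  # raw data stays at the edge
        self.wanted = set()     # what the main brain has said it cares about

    def learn(self, species):
        # The main brain shares what it is looking for.
        self.wanted.add(species)

    def summarize(self):
        # Analyze locally and send only a computed result upstream.
        relevant = [r for r in self.records if r["species"] in self.wanted]
        return {
            "node": self.name,
            "count": len(relevant),
            "total_kg": round(sum(r["kg"] for r in relevant), 2),
        }


# Made-up sample data held at one edge location.
node = EdgeNode("harbor-7", [
    {"species": "salmon", "kg": 4.2},
    {"species": "tuna", "kg": 30.0},
    {"species": "salmon", "kg": 3.9},
])
node.learn("salmon")
summary = node.summarize()
```

The summary carries a count and a total weight for salmon only. The tuna record, and the raw salmon records themselves, never travel to the main brain.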
The main brain and the tentacles continue to share information back and forth, and the pre-processed intelligence for acting on fresh data is stored in the form of data models. The central brain becomes smart enough to tell the nodes about new parameters, and the nodes can feed the central brain new information that was not previously available. The more pictures of rare salmon the node sees, the faster it identifies them, counts them, and can pop up a chart indicating where the fisher should take the boat out. The information is immediate.
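The back-and-forth can be pictured as a small sync loop: the central brain pushes its current parameters to a node, and the node reports anything it has seen that the central brain did not ask about. The `Central` and `Node` classes and the species labels are hypothetical; this is a sketch of the exchange under those assumptions, not of any particular product.

```python
class Node:
    """Hypothetical tentacle node that counts what it observes locally."""

    def __init__(self):
        self.params = {}
        self.seen = {}  # label -> count, kept at the edge

    def update(self, params):
        # The central brain tells the node about new parameters.
        self.params.update(params)

    def observe(self, label):
        self.seen[label] = self.seen.get(label, 0) + 1

    def report(self):
        # The node surfaces labels the central brain did not ask for.
        targets = self.params.get("targets", set())
        return {label for label in self.seen if label not in targets}


class Central:
    """Hypothetical main brain that tracks targets and new discoveries."""

    def __init__(self, targets):
        self.targets = set(targets)
        self.discovered = set()

    def sync(self, node):
        node.update({"targets": set(self.targets)})
        self.discovered |= node.report()


central = Central({"salmon"})
node = Node()
central.sync(node)          # node learns the target; nothing new to report yet
node.observe("salmon")
node.observe("salmon")
node.observe("char")        # something the central brain did not know about
central.sync(node)
```

After the second sync, the central brain knows the node has encountered a label outside its targets, while the node keeps its own running count of salmon sightings.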
The result is smart business
In a business context, the right models and visualization are applied to the right kind of ingested data, and answers to business questions are delivered in a high-touch way at speeds we haven’t achieved before. Think about watching Netflix as one individual in a family of viewers. The suggestions for what to watch are different for each family member based on previous viewing history. It’s personalized.
Or, think of ads for credit card rewards that cater to the individual based on purchase history and their own digital footprints. In both cases, such high-touch personalization has been developed with machine learning and artificial intelligence (AI). Intelligence grows over time with experience, and this intelligence speeds up the processing and interpreting of fresh data. What’s starting to evolve is where it should happen.
Where will the octopus surface first?
The mini brain edge-processing technology is already being used in a variety of industries and applications. It will likely surface first for emergency situations or in really painful business cases where a service might fail. It may be best suited to situations where the node doesn’t have time to send the information to the big brain for processing before it’s too late, as in the case of self-driving vehicles and wearable health care devices.
What’s being further developed is the octopus’ central brain training the tentacles to create the right waves in the lakes—to take action on their own when necessary to prevent a problem from occurring. The evolution from data warehouses to data lakes is ongoing. New advancements and business value, even life-saving discoveries, are being made because two things merged: computing power and massive amounts of data.
And we can all be thankful for the intelligent octopuses in the cloud and at the edge that are not only making it possible, but bringing high-touch value to the table.