Let’s just face it: IoT settings are complex. With the vastness of data and all the different variables, it can certainly seem like making sense of the data is a long and difficult process…
We get to meet a lot of key players in interesting industries, many of which include IoT (Internet of Things) related functions. There seems to be a common misconception about what it takes to start utilizing the data and reaping the benefits it offers.
Well, this is about to change.
This post is a case study we’ve put together around a typical IoT setting. We’ll go over how the data is being used right now, what problems exist and how to solve them.
This post will act as a reminder that big and bloated is not always better.
Recent innovations in the field of mechatronics have made it affordable and easy to actually collect data, so many companies are indeed doing it. So, how do they typically do it?
Quite often we see something like 5–1,500 of the most important signals, metrics and gauges being tracked, and then streamed to some end destination once or several times per second.
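To make that concrete, here’s a hypothetical sketch in Python of what a single reading in such a stream might look like (the field names and values are our own illustration, not any particular vendor’s schema):

```python
import json
import time

# One hypothetical telemetry reading: a machine reports one signal value.
# Field names are illustrative only -- real deployments use whatever
# schema the on-board unit emits.
reading = {
    "vehicle_id": "unit-0042",       # which machine sent the reading
    "signal": "hydraulic_pressure",  # one of the tracked signals
    "timestamp": int(time.time()),   # seconds since epoch
    "value": 187.3,                  # the measured value
    "unit": "bar",
}

# Serialized, a reading like this is only on the order of 100 bytes --
# the volume comes from frequency times fleet size, not message size.
payload = json.dumps(reading)
print(len(payload), "bytes:", payload)
```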
So, if collecting data is easy nowadays (and it is), what’s the problem?
The problems seem to fall into three categories.
1) Lack of coordinated end-storage for all the data
2) Deep siloing with uncommunicative interfaces
3) Total lack of proactive analytics
In other words, here’s how a typical scenario plays out.
We have a big industrial player, which is collecting data. They then store little chunks of it in multiple end destinations. The data is then quite often used for transferring responsibility when something inevitably goes wrong. All this with superhuman levels of patience on the part of engineers and other professionals, whose time would certainly be better spent elsewhere.
Imagine being one of these engineers, desperately trying to put it all together with insufficient tools. It’s no wonder many have difficulty even believing that their data worries could be solved in a heartbeat.
Could they? To answer that question, let us construct a reality-based case study here (while of course protecting the identity of our clients).
This case study is valuable because it precisely brings home a point we’ve been making since the beginning: the biggest winners of our time are moving from reactive data usage to proactive process optimization.
In this case study we have a company that manufactures vehicles for heavy (and highly complex) industrial use. The actual market doesn’t really matter, as many of the same rules apply elsewhere.
Many such companies are already collecting data from their vehicles for various purposes, like error diagnostics, location data, maintenance etc. In this case we’re bringing in 17 streams of data at a bi-second frequency (one reading per stream every two seconds).
All of that adds up to 168 gigabytes of raw data on a weekly basis. The data is collected at factory level, and only the factory itself has access to it.
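As a quick back-of-envelope check on those numbers (a sketch assuming “bi-second” means one reading per stream every two seconds; the per-reading size it derives is an implication of the stated figures, and the fleet size in the comment is purely hypothetical):

```python
# Back-of-envelope: what do 17 streams at one reading per 2 seconds
# and 168 GB/week imply?
streams = 17
interval_s = 2
seconds_per_week = 7 * 24 * 3600            # 604,800

readings_per_week = streams * seconds_per_week / interval_s
weekly_bytes = 168e9                         # 168 GB, decimal gigabytes

print(f"{readings_per_week:,.0f} readings/week")               # ~5.1 million
print(f"{weekly_bytes / readings_per_week:,.0f} bytes each")   # ~33 KB

# ~33 KB per two-second tick is clearly fleet-level, not per-sensor:
# e.g. a hypothetical 1,000 vehicles each contributing a ~33-byte
# reading per tick would produce the same volume.
```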
So, we have 168 GB of data being generated on a weekly basis. How is this data currently used?
Currently, mostly for maintenance, support and other ad hoc purposes as the need arises. When something goes wrong (as inevitably happens every now and then), the data from a certain time period is sent to engineers for analysis. They then use whatever computing power they have, in combination with MATLAB and similar programs, to dig for answers.
Now, this certainly has some value to it, as digging through the error data can yield insights about the machines themselves at the point of failure.
However, this particular company has realized just how many more benefits its data could unlock.
Some of the obvious benefits this company is going to get include things like
1) They can let their professionals do what they’re best at, and save their nerves (and coffee expenses, once the pros can stop drowning their worries in caffeine)
2) Countless iterations with the data, which can reveal previously invisible correlations
3) Much higher value for customers and end users through process optimization, cost savings and a better value proposition
4) Replace hazy production decisions with data-driven ones (and see how everything fits into the big picture)
And so on. Well, you probably get the idea.
These benefits are probably believable. At least they should be, as more and more companies are grabbing them all the time. The UN-believable part quite often revolves around the time it takes to get access to them.
Biased as we are, we would now like to show why long and tedious plans, endless meetings and the like are quite unnecessary. If someone tells you it takes several months to even define the problem, they don’t know what they’re talking about, period. Not to talk trash about anyone, but Big Data is not some huge, complicated thing you need 18 consultants working on at all times.
So, we have the 168 GB of data flowing into each factory’s black box on a weekly basis. The current situation is much like a leaking water pipe (there isn’t enough storage for all the data), so the first thing we’ll do is capture all of that expensive water in one controlled end destination.
You’ve probably heard the old one… The pessimist sees the glass half empty. The optimist sees the glass half full. But get this: our IoT engineers see the glasses overflowing. Not only that, they can turn the water into gold and quench the thirst of the entire company! (Okay, we’re not the best of comedians, but you get the point.)
Moving data into one place gives us a tremendous reduction in hassle, and lets everyone at the company access the data immediately.
In practice, this is done through a very versatile API. It’s sort of like a secret, safe door through which the data is escorted into our platform. Since we receive the data with Hadoop, it doesn’t really matter what kind of format it’s in (or if it even has one): Hadoop stores the raw bytes as-is, and structure is applied only when the data is read.
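To give a feel for what escorting data through that door can look like, here is a minimal sketch using the open-source WebHDFS REST interface; the host, path, user and payload are placeholders for illustration, not our actual API:

```python
import requests

# Minimal sketch: land a raw, possibly schemaless payload in HDFS via
# WebHDFS. Host, port, path and user are placeholders for illustration.
NAMENODE = "http://namenode.example.com:9870"
PATH = "/data/iot/raw/week07/telemetry.jsonl"

payload = b'{"vehicle_id": "unit-0042", "signal": "hydraulic_pressure", "value": 187.3}\n'

# Step 1: ask the NameNode where to write; it redirects to a DataNode.
resp = requests.put(
    f"{NAMENODE}/webhdfs/v1{PATH}",
    params={"op": "CREATE", "user.name": "ingest", "overwrite": "true"},
    allow_redirects=False,
)
datanode_url = resp.headers["Location"]

# Step 2: write the bytes as-is -- Hadoop stores them without caring
# about the format; structure is applied later, at read time.
requests.put(datanode_url, data=payload)
```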
Once all the data has been transferred over to our Hadoop, we get to work with it.
With that, we have covered the first half of our service: data warehousing. All the data we could possibly need is nicely stored in a safe vault, where we can start working with it.
Now we come to grips with the second half of our service: the powerful analytics platform. If you read through our Twitter Pulse case study, you know that we base our analytics on Netezza, quite simply because it’s the best one there is.
But not everyone likes to do things the easy way (to our amazement). So let’s explain the “traditional” way through a metaphor.
You see, the way many companies understand Big Data is sort of like deadlifting 500 pounds. They perceive their data to be this 500-pound beast they have to have the power to lift every time they need something out of it.
The problem with this mindset is that it practically ENSURES the need for heavy consulting, bloated technology choices etc. (and thus, sky-high costs.)
Here’s the thing: why lift the entire 500 pounds, if all you need is the bar?
Iterating over the ENTIRE data set is pointless. Instead, you need to be able to cut out only the important parts, and then run analytics on that particular slice.
So, imagine you need to find an almond in a container of porridge. Here are the different ways of doing it:
Traditional way: dive into a container full of porridge.
The new way: grab a teaspoon of porridge with the almond already on it.
No wonder the first approach comes with steep starting costs, an army of consultants and other moving parts: it’s ineffective. With the latter, you’re practically guaranteed to win.
So, let’s continue with our 500-pound Olympic bar metaphor here. Below you’ll see a picture of where the magic happens.
Don’t worry, you don’t have to grasp everything in this picture. The whole point is that this is from a presentation we gave in the context of analyzing a population of billions of people. The purpose was to find out how many of the people were born before a certain date, among other queries.
You’ll be far ahead of the pack if you understand the following point. Big Data is difficult for many companies because their structure only allows for lifting the whole 500-pound stack every time they need to do something with the bar. In our system, we first empty the bar. Then, moving it from one place to another is fast, easy and effortless. This is exactly what the FPGA part of our analytics platform does: it filters out the waste as the data streams off disk, and lets you jump right into the juicy stuff with such speed that if there were speed limits and police officers in the Big Data world, you would lose your license immediately!
In the traditional way you need to hire a porridge diver to get into the container, whereas we grab a high-probability volume of porridge and just pick the almond from there.
Simple. Easy. Fast.
As it should.
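In Netezza’s case, that pruning happens in hardware: the FPGAs discard irrelevant rows and columns as the data streams off disk, so the query only ever touches the teaspoon. For readers who prefer the idea in open-source terms, here is a hypothetical PySpark sketch of the same principle (the path and column names are invented): by selecting only the needed columns and filtering to the interesting window, the engine pushes the work down to the scan instead of lifting all 168 GB.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("almond-from-porridge").getOrCreate()

# Hypothetical layout: a week's raw telemetry stored as Parquet.
telemetry = spark.read.parquet("/data/iot/raw/week07")

# Grab the teaspoon, not the container: only two signals, one vehicle,
# and a short window around a reported failure. With a columnar format,
# these projections and filters are pushed down to the scan itself,
# so untouched columns and rows are never read off disk.
slice_ = (
    telemetry
    .select("vehicle_id", "timestamp", "hydraulic_pressure", "motor_temp")
    .filter(F.col("vehicle_id") == "unit-0042")
    .filter(F.col("timestamp").between(1424256600, 1424258400))
)

slice_.show()
```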
So, we still have the 168 gigabytes of data coming in on a weekly basis. Now, however, the company doesn’t have to throw it away after one week due to running out of storage: it’s all there for any imaginable use.
Next, through our API, we keep receiving data constantly, not only to take care of legal formalities and other compliance, but also to do something really awesome with it.
So, imagine the possibilities with the data. You can now run countless iterations, and see correlations you never even thought of. It’s like putting on your secret 3D glasses and seeing the picture clearly, vividly and instantly.
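As a tiny illustration of one such iteration (a hypothetical sketch with pandas; the signal names and values are invented), checking a hunch about correlated signals on a small slice becomes a one-liner rather than a project:

```python
import pandas as pd

# Hypothetical: a small slice pulled from the warehouse, e.g. the
# PySpark result above converted with slice_.toPandas().
df = pd.DataFrame({
    "hydraulic_pressure": [185.1, 187.3, 190.2, 201.7, 214.9],
    "motor_temp":         [71.2, 71.8, 72.5, 75.9, 80.3],
    "vibration_rms":      [0.21, 0.22, 0.22, 0.35, 0.48],
})

# One cheap iteration: pairwise correlations over the slice.
# A strong, unexpected correlation is a lead worth a deeper look.
print(df.corr())
```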
And this is not some airy-fairy “let’s hope we find something” kind of rah-rah story. As an example, we see 7-figure profit and cost-savings potential just about everywhere in any decent-sized IoT setting.
Instead of wasting months on meetings and planning, we fully implemented the whole plan, with all the points detailed here, in less than two weeks (sometimes in days, but let’s be conservative here).
So, there are essentially two types of people in the Big Data market nowadays.
1) The technology folks
2) The consultants
Both are required, of course. However, the problem with consulting-only people is quite often a lack of real-world, hands-on implementation experience (which can translate into unnecessary delays). The tech folks, on the other hand, often miss the boat by focusing only on the hardware.
Again, it’s not about having one or the other. You need BOTH.
Not only do we have a solid background in some of the most demanding data projects known to man, but we also have our own technology (which we know like the back of our hands). Last but not least, Big Data gets us excited, which translates into a low tolerance for wasting time in our projects. Either do or do not, there’s no try.
We prefer the people we work with to see rapid returns on their investment. That’s why there has to be a decent volume of data lurking around somewhere in the process already.
PLANNING for data usage is a task for consultants. No porridge has any value after it gets cold. But when the porridge is hiding almonds that are rightfully yours, well, that’s where we can help.
Is this for you? If it sounds like it is, let us know by contacting us, and we’ll see if there’s a fit between us.
In any case, make a promise to yourself right now: no more waste, and no more leaving money on the table. If your office sits on top of a solid vein of gold, you have an obligation to get out there and dig it up.
COO, Service Delivery
+358 40 550 2524
Keilaranta 17, C-talo