The analytical challenges of IoT data


If data is indeed the new oil, then we’re still a long way off from mastering the science of extracting, refining and deploying it as a strategic enterprise asset. That, in short, is the conclusion that emerges from two separate studies. Part of the explanation may lie in the analytical challenges that come with IoT data.

The first study, from Gartner, classifies 87% of businesses as having low BI and analytics maturity. Organizations within this low-maturity group are further divided into two levels: a basic level characterized by spreadsheet-based analytics, and a higher, opportunistic level where data analytics is deployed piecemeal, without central leadership or guidance.

In the second study, from NewVantage Partners, a majority of C-suite executives conceded that they had yet to create either a data culture or a data-driven organization. More worryingly, the proportion of companies that self-identified as data-driven appeared to be declining.

Companies may not yet be data-driven but the data flow shows no signs of slowing down.

According to IDC, the global datasphere will grow to 175 zettabytes (ZB) in 2025, up from 23 ZB in 2017. Even as that happens, the consumer share of this data will drop from 47% in 2017 to 36% in 2025. This means that the bulk of this data surge will be driven by what IDC refers to as the sensorized world, or the IoT.
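As a rough back-of-the-envelope check, the sketch below works out what those figures imply: the compound annual growth rate of the datasphere and the volume of data that does not come from consumers. It uses only the IDC numbers quoted above; the function and variable names are illustrative.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1

# Figures quoted above (IDC): total datasphere in zettabytes, consumer share.
TOTAL_ZB = {2017: 23.0, 2025: 175.0}
CONSUMER_SHARE = {2017: 0.47, 2025: 0.36}

def non_consumer_zb(year: int) -> float:
    """Volume of data not attributed to consumers in a given year."""
    return TOTAL_ZB[year] * (1 - CONSUMER_SHARE[year])

growth = cagr(TOTAL_ZB[2017], TOTAL_ZB[2025], 2025 - 2017)
print(f"Implied annual growth: {growth:.1%}")
print(f"Non-consumer data 2017: {non_consumer_zb(2017):.1f} ZB")
print(f"Non-consumer data 2025: {non_consumer_zb(2025):.1f} ZB")
```

On these numbers the datasphere grows by roughly 29% a year, and the non-consumer portion rises from about 12 ZB to 112 ZB.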


Key challenges of IoT data

The surge of IoT data comes with a lot of economic value, estimated at around $11 trillion by 2025. But it also comes with some significant challenges in terms of aggregating data from disparate, distributed sources and applying analytics to extract strategic value. 

The primary challenge of IoT data is its real-time nature. By 2025, 30% of all data will be real-time, with IoT accounting for nearly 95% of it. In addition, 20% of all data will be critical and 10% will be hypercritical. Analytics will have to happen in real time for companies to benefit from these types of data.

Then there is the issue of time series data. This refers to any data that carries a time stamp, such as readings from a smart metering service in the IoT space, or even stock prices. A company’s IoT infrastructure must be capable of collecting, storing and analyzing huge volumes of time series data. The challenge here is that most conventional databases are not equipped to handle this type of data.
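To make the idea concrete, here is a minimal sketch of one typical time series operation: rolling time-stamped meter readings up into hourly totals. The readings and the function name are hypothetical, and a production system would more likely lean on a purpose-built time series store than on in-memory Python.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical smart-meter readings: (ISO 8601 timestamp, kWh since last reading).
READINGS = [
    ("2025-01-01T00:05:00+00:00", 0.42),
    ("2025-01-01T00:35:00+00:00", 0.38),
    ("2025-01-01T01:10:00+00:00", 0.51),
    ("2025-01-01T01:50:00+00:00", 0.47),
]

def hourly_totals(readings):
    """Group readings into hourly buckets and sum consumption per bucket."""
    buckets = defaultdict(float)
    for stamp, kwh in readings:
        ts = datetime.fromisoformat(stamp)
        # Truncate the timestamp to the start of its hour to get the bucket key.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour] += kwh
    return dict(sorted(buckets.items()))

totals = hourly_totals(READINGS)
for hour, kwh in totals.items():
    print(hour.isoformat(), round(kwh, 2))
```

Every query is keyed on time, which is why workloads like this tend to favor storage engines organized around timestamps.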

The distributed nature of IoT data, where most of the data is created outside enterprise data centers, also presents its own set of analytics challenges. Chief among them is the need to process at least some of this distributed data, especially hypercritical and critical data, outside the data center. IoT analytics itself, therefore, will have to become distributed, with some analytics logic shifting out of the cloud to the edge and spreading across devices, edge servers, gateways and central processing environments. In fact, Gartner predicts that half of all large enterprises will integrate edge computing principles into their IoT projects by 2020.
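The split between edge and cloud can be sketched roughly as follows. This is an illustrative model only: the `EdgeNode` class, the criticality levels and the shutdown action are assumptions made for the example, not part of any vendor's API.

```python
from dataclasses import dataclass, field
from enum import Enum
from statistics import mean

class Criticality(Enum):
    HYPERCRITICAL = 1
    CRITICAL = 2
    ROUTINE = 3

@dataclass
class Reading:
    sensor_id: str
    value: float
    criticality: Criticality

@dataclass
class EdgeNode:
    """Hypothetical edge gateway: acts locally on urgent data, batches the rest."""
    alarm_threshold: float
    local_actions: list = field(default_factory=list)
    cloud_batch: list = field(default_factory=list)

    def handle(self, reading: Reading) -> None:
        if reading.criticality is not Criticality.ROUTINE:
            # Urgent readings are evaluated at the edge, with no cloud round trip.
            if reading.value > self.alarm_threshold:
                self.local_actions.append(f"shutdown:{reading.sensor_id}")
        # Every reading is also queued so the cloud can analyze it later.
        self.cloud_batch.append(reading)

    def flush_summary(self) -> dict:
        """Return a compact summary for upload and clear the local batch."""
        values = [r.value for r in self.cloud_batch]
        summary = {"count": len(values), "mean": mean(values) if values else None}
        self.cloud_batch.clear()
        return summary

node = EdgeNode(alarm_threshold=90.0)
node.handle(Reading("pump-1", 72.0, Criticality.ROUTINE))
node.handle(Reading("pump-2", 95.0, Criticality.HYPERCRITICAL))
print(node.local_actions)
print(node.flush_summary())
```

The design choice is that time-sensitive decisions never wait on the network, while the cloud still receives enough data for fleet-wide analysis.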

These are just a few of the characteristics of IoT data that differentiate it from conventional data sets. Traditional data-analytics technologies and capabilities are not designed to handle the volume, variety and complexity of IoT data. Most companies will have to completely revamp their analytics capabilities to include IoT-specific capabilities such as streaming analytics, edge analytics, and the ability to identify and prioritize between different data types and formats.

CSPs take the lead in IoT data analytics

Many companies are turning to cloud-based IoT platforms that offer rich data services alongside their core IoT offerings. Customers are looking for real-time capabilities across data ingestion, transformation, storage, processing and analysis. Some cloud vendors are even offering their own hardware to enhance interoperability and performance between IoT devices and the data that is processed in the cloud.

According to a Bain & Company study, CSPs (cloud service providers) are seen as leaders in providing a comprehensive set of tools that address all the IoT data analytics needs of the enterprise. These CSPs, according to the same study, are also playing a key role in lowering barriers to IoT adoption, facilitating simpler implementations and enabling customers to design, deploy and scale new use cases as quickly as possible. 

AWS takes the lead among IoT CSPs

Among the big brand CSPs, Amazon AWS has consistently been ranked as the platform of choice, followed by Microsoft Azure and Google Cloud Platform, in the annual IoT Developer Survey conducted by the Eclipse IoT Working Group. 


With data collection and analytics remaining a top three concern among developers, Amazon AWS offers arguably the most robust cloud-based IoT analytics solution in the market today.

The AWS IoT Analytics platform is a managed service that eliminates the complexity of operationalizing sophisticated analytics for massive volumes of IoT data. With AWS Lambda, developers also have access to a function-based programming model that enables them to build and test IoT applications for both cloud and on-premises deployments.

In terms of data collection & analytics, Amazon offers two distinct services in the form of AWS IoT Analytics and Kinesis Data Analytics. 

AWS IoT Analytics has the capabilities required for a range of IoT applications, with built-in AWS IoT Core support to simplify the setup process. With AWS IoT Analytics, it becomes much easier to cleanse bad data and to enrich data streams with external sources. The service gives data scientists access to raw and processed data, the facility to save and retrieve specific subsets of data, and the flexibility of rule-based routing of data across multiple processes.

Kinesis Data Analytics is more suited to real-time data ingestion applications, like remote monitoring and process control, that require low-latency response times in the range of milliseconds. The service integrates with other AWS tools such as Amazon DynamoDB, Amazon Redshift, AWS IoT and Amazon EC2 to streamline the data analytics process. The Amazon Kinesis suite comprises a raft of services including Kinesis Data Streams, Kinesis Data Firehose, Kinesis Data Analytics and Kinesis Video Streams. Kinesis Data Streams enables the continuous capture of large volumes of real-time data feeds and events of different kinds. Raw data from Kinesis can then be cleaned and processed through AWS Lambda or Amazon ECS. Kinesis Data Firehose prepares and loads streaming data to S3, Redshift or Elasticsearch for near real-time processing and analytics.
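As an illustration of the cleaning step, here is a minimal sketch of a Python Lambda handler attached to a Kinesis data stream. It relies on the documented event shape, in which each record's payload arrives base64-encoded under `Records[].kinesis.data`. The payload fields (`device_id`, `temperature_c`), the plausible temperature range and the return value are assumptions made for the example.

```python
import base64
import json

def clean_record(payload: dict):
    """Return a normalized reading, or None if the payload is unusable.

    The field names (device_id, temperature_c) are hypothetical.
    """
    device_id = payload.get("device_id")
    temperature = payload.get("temperature_c")
    if not device_id or not isinstance(temperature, (int, float)):
        return None
    if not -50 <= temperature <= 150:  # assumed plausible sensor range
        return None
    return {"device_id": str(device_id), "temperature_c": round(float(temperature), 2)}

def handler(event, context):
    """Lambda entry point for a Kinesis Data Streams event source."""
    cleaned, dropped = [], 0
    for record in event.get("Records", []):
        try:
            # Kinesis delivers each record's data base64-encoded.
            raw = base64.b64decode(record["kinesis"]["data"])
            payload = json.loads(raw)
        except (KeyError, TypeError, ValueError):
            dropped += 1
            continue
        reading = clean_record(payload) if isinstance(payload, dict) else None
        if reading is None:
            dropped += 1
        else:
            cleaned.append(reading)
    # A real function would forward `cleaned` to a downstream store or stream.
    return {"cleaned": cleaned, "dropped": dropped}
```

The handler can be exercised locally by passing it a dictionary shaped like a Kinesis event, which makes the cleaning rules easy to unit test before deployment.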

While Kinesis offers developers more flexibility in development and integration, AWS IoT focuses on simplifying deployment using prebuilt components. It is possible to combine these two solutions to build a comprehensive IoT solution encompassing streaming as well as at-rest data.  

Late last year, Amazon AWS announced the launch of four new capabilities that would make it easier to ingest data from edge devices. AWS IoT SiteWise is a managed service that makes it easy to collect, structure, and search data from industrial equipment at scale. With AWS IoT Events, customers can now easily detect and respond to events from large numbers of IoT sensors and applications. AWS IoT Things Graph enables a no-code approach to IoT development with a visual drag-and-drop interface that allows developers to build IoT applications by simply connecting devices and services and defining their interactions. And finally there was AWS IoT Greengrass Connectors, a service that would enable developers to connect devices to third-party applications, on-premises software, and AWS services through cloud APIs.

Over and above all this, AWS has established a strong partner network of edge-to-cloud service providers and device manufacturers to offer customers the deep technical and domain expertise required to mitigate the complexity of IoT projects.

Apart from being a developer favorite, AWS IoT has also built up a client roster of some of the biggest brands in the industry including LG, Bayer, NASA, British Gas and Analog Devices, to name just a few. 

Notwithstanding the challenges of Big Data and analytics, there have been many successful IoT implementations across diverse sectors. Here are just a few success stories of how companies and their IoT partners were able to harness the power of big data analytics in IoT.

IoT data analytics success stories

Bayer & AWS IoT Core: Bayer Crop Science, a division of Bayer, provides a range of products and services that maximize crop production and enable sustainable agriculture for farmers worldwide. The company uses IoT devices on harvesting machines to monitor crop traits. Previously, this data was transmitted manually, over several days, to its data centers for analysis. The lack of real-time data collection and analytics meant that Bayer could not immediately address issues with equipment calibration, jamming, or deviations, or use the data to improve routing plans for subsequent runs.

Already an AWS customer, Bayer’s IoT team decided to move its data-collection and analysis pipeline to AWS IoT Core. The company built a new IoT pipeline to manage the collection, processing, and analysis of seed-growing data. 

The new solution captures multiple terabytes of data, at an average of one million traits per day during planting or harvest season, from the company’s research fields across the globe. This data is delivered to Bayer’s data analysts in near real-time. The AWS IoT solution also provides a robust edge processing and analytics framework that can be scaled across a variety of IoT use cases and IoT initiatives. 

Bayer is now planning to use AWS IoT Analytics to capture and analyze drone imagery and data from environmental IoT sensors in greenhouses for monitoring and optimizing growing conditions.

Microsoft Azure IoT Hub & ActionPoint: Many manufacturers still use paper checklists, manual processes, human observation and legacy closed-loop technologies to monitor and maintain their equipment. Even in modernized plants, manufacturers often do not have the right sensors in place to provide all the data required, or they have no analytics solutions to analyze the sensor data.

Custom software developer ActionPoint partnered with Microsoft and Dell Technologies to develop IoT-PREDICT, an industrial IoT solution for predictive maintenance that incorporates machine learning, data analytics, and other advanced capabilities. The solution is powered by the Microsoft Windows 10 IoT Enterprise operating system running on Dell Edge Gateway hardware, and combined with the Microsoft Azure tool set to provide state-of-the-art edge computing. 

The combination of Windows 10 IoT Enterprise and Azure delivers a highly effective IoT solution that customers can deploy in minutes. It also gives the IoT-PREDICT solution the flexibility and scalability that allows manufacturers to start small with IoT and grow at their own pace.

IoT-PREDICT helps manufacturers quickly reduce downtime, lower costs, and increase the overall efficiency of their equipment and operations. It helps maximize the impact of manufacturer data by using the Microsoft Azure IoT Hub to gather data and make it available to several Azure services, including Azure Time Series Insights and Azure Stream Analytics. Manufacturers can now explore the data using Time Series Insights, or use Stream Analytics to act on the data by setting up queries and alerts based on various performance thresholds.
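The logic behind a threshold alert of this kind can be sketched in a few lines. Note that this is plain Python, not the Stream Analytics query language, and it does not describe how IoT-PREDICT is actually built; the machine names, the vibration readings and the threshold are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

def tumbling_window_alerts(events, window_seconds, threshold):
    """Average each machine's readings per fixed window; flag windows above threshold.

    events: iterable of (epoch_seconds, machine_id, value) tuples.
    """
    windows = defaultdict(list)
    for ts, machine_id, value in events:
        # Assign each event to the non-overlapping window that contains it.
        window_start = int(ts // window_seconds) * window_seconds
        windows[(window_start, machine_id)].append(value)

    alerts = []
    for (window_start, machine_id), values in sorted(windows.items()):
        avg = mean(values)
        if avg > threshold:
            alerts.append({"window_start": window_start, "machine_id": machine_id, "avg": avg})
    return alerts

# Hypothetical vibration readings (mm/s) from two machines.
EVENTS = [
    (0, "press-1", 2.0), (20, "press-1", 2.4), (45, "press-1", 2.2),
    (65, "press-1", 6.0), (90, "press-1", 7.0),
    (10, "lathe-2", 1.0), (70, "lathe-2", 1.2),
]
alerts = tumbling_window_alerts(EVENTS, window_seconds=60, threshold=5.0)
print(alerts)
```

Averaging over a window rather than alerting on single readings is a common way to avoid reacting to one-off spikes.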

IoT data analytics has certain unique characteristics and challenges that cannot be addressed by conventional analytics technologies and capabilities. But as in any analytics operation, the primary objective remains the same: to generate actionable insights that deliver business value. It is not just about the choice of sensor, connectivity protocol or CSP. For every IoT project to lead to demonstrable business value, organizations have to ensure the integrity of what McKinsey defines as the insights value chain, from end to end.

Pentair & AWS: Pentair is a water treatment company that offers a comprehensive range of smart, sustainable water solutions to homes, businesses and industries around the world. The company relies on connected systems to monitor and manage its product installations, most of which are in remote locations. Traditionally, the company took the custom building route to develop its connected systems, which came with its own set of disadvantages. 

Pentair needed a powerful, flexible IoT platform, with high availability and scalability and a high degree of reuse across all lines of its business. Pentair also wanted a comprehensive solution that covered everything from IoT data ingestion to analysis and visualization.

The company teamed up with AWS Partner Network (APN) Advanced Technology Partner and IoT Competency Partner Bright Wolf to evaluate potential technology providers including Amazon, GE, IBM, Microsoft and others against a set of platform characteristics. These included data ingestion, infrastructure controls, deployment options, machine learning and visualization tools, development tools and the overall openness of each platform.

“AWS came out on top when it came to the raw scoring,” says Brian Boothe, the lead for Pentair’s Connected Products Initiative.

To date, Pentair has deployed three different connected solutions using the AWS IoT platform and a flexible, scalable, and reusable reference architecture developed by Bright Wolf. The benefits, according to Pentair, include accelerated time to market for value-added services, simpler integration, cost savings from deploying commodity edge devices on the open AWS IoT platform, and enterprise-grade scalability and availability.