Enabling Big Data in your IOT Strategy

IOT and big data are often considered two sides of the same coin. For some industry requirements, you won’t be able to tell whether a solution is an IOT solution or a big data solution. At the same time, some IOT solutions may not have any big data use case at all. In such cases, it is a tough call for the IOT strategist to decide whether to plan for big data in the system.

I suggest keeping your data big data friendly.

What does big data friendly mean?

Many current big data solutions spend a lot of effort cleaning up large amounts of complex, messy data. This happens because the data was not collected in a big data friendly way. We should learn from the mistakes of others! Your data may not be “big” now, but it is growing and may soon be large enough to need big data processing. The industry has already started adopting new approaches, moving from descriptive to predictive and prescriptive analytics. We should prepare our data for future big data analytics to improve its business value.

How can we make our IOT strategies big data friendly?

There are a few design considerations:

1. Finding ‘target-rich’ data :-

Not all data is worth saving for the future. Organizations need to focus on high-value ‘target-rich’ data. Business analysts and data scientists should work on identifying the high-value data that can become an asset for the organization in the near future.

2. Data saving strategies :-

How data is stored is a major decision point for the big data system.

2.1 Data Format

Storing data as text in formats like JSON becomes hard to handle when the data becomes huge: every record repeats its field names, and reading a single field means parsing whole records. We should also consider more big data friendly formats such as Parquet, Avro, and ORC.
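
To make the difference concrete, here is a minimal sketch using only the Python standard library. It does not write Parquet or ORC files; it only illustrates the row-oriented versus column-oriented layout idea behind such formats. The field names (device_id, ts, temp_c) and the helper rows_to_columns are made up for illustration.

```python
import json


def rows_to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    # Collect field names in first-seen order.
    fields = []
    for row in rows:
        for name in row:
            if name not in fields:
                fields.append(name)
    # Missing fields become None so every column has the same length.
    return {name: [row.get(name) for row in rows] for name in fields}


# Hypothetical sensor readings.
readings = [
    {"device_id": "d1", "ts": 1700000000, "temp_c": 21.5},
    {"device_id": "d2", "ts": 1700000060, "temp_c": 22.1},
    {"device_id": "d1", "ts": 1700000120, "temp_c": 21.7},
]

# Row-oriented text: every line repeats every field name.
json_lines = "\n".join(json.dumps(r) for r in readings)

# Column-oriented layout: each field name is stored once, and an
# aggregate over one field touches only that field's values.
columns = rows_to_columns(readings)
avg_temp = sum(columns["temp_c"]) / len(columns["temp_c"])
```

In a real system a library that writes one of the formats above would take the place of rows_to_columns; the point here is only that analytics over one field does not need to parse the others.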

2.2 Dynamic schema for version handling

Data variety increases as the system grows and different versions add, rename, or remove fields. Handling schemas manually is always an overhead. Self-describing data formats, which store the schema together with the data, are a suitable solution. We should consider data formats that support schema evolution (like Parquet).
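
As a rough illustration of version handling, the sketch below upgrades records written by older versions to one current shape when they are read. The version history, field names, and the normalize helper are all assumptions made for this example; a self-describing format would carry the schema for you, but the same mapping idea applies.

```python
CURRENT_VERSION = 3

# Hypothetical history: version 2 renamed "temp" to "temp_c",
# version 3 added "humidity".
RENAMES = {"temp": "temp_c"}
DEFAULTS = {"device_id": None, "ts": None, "temp_c": None, "humidity": None}


def normalize(record):
    """Return a copy of record in the current schema."""
    out = dict(DEFAULTS)
    for key, value in record.items():
        if key == "schema_version":
            continue
        key = RENAMES.get(key, key)  # map old names to current names
        if key in out:
            out[key] = value
        # Fields that no longer exist in the current schema are dropped.
    out["schema_version"] = CURRENT_VERSION
    return out


old_record = {"schema_version": 1, "device_id": "d1", "ts": 1, "temp": 20.0}
upgraded = normalize(old_record)
```

Tagging every record with a schema version at write time is what makes this kind of upgrade possible later, so it is worth doing even while the data is still small.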

2.3 Partitioning strategy of data

A data partitioning strategy should be defined so that data is segmented for easy processing, for example by tenant, device, or date.
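
One common convention is a key=value directory layout per partition. The sketch below assumes partitioning by tenant and UTC date; the tenant name, the field names, and the partition_path helper are hypothetical, and the right partition keys depend on how the data will be queried.

```python
from collections import defaultdict
from datetime import datetime, timezone


def partition_path(tenant, epoch_seconds):
    """Build a key=value partition path from a tenant and a UTC timestamp."""
    dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return (
        f"tenant={tenant}/year={dt.year:04d}/"
        f"month={dt.month:02d}/day={dt.day:02d}"
    )


def group_by_partition(records):
    """Group records by the partition they would be written to."""
    groups = defaultdict(list)
    for rec in records:
        groups[partition_path(rec["tenant"], rec["ts"])].append(rec)
    return dict(groups)


# Hypothetical records: two on 1970-01-01 and one on 1970-01-02 (UTC).
records = [
    {"tenant": "acme", "ts": 0, "temp_c": 21.5},
    {"tenant": "acme", "ts": 3600, "temp_c": 21.9},
    {"tenant": "acme", "ts": 86400, "temp_c": 22.4},
]
partitions = group_by_partition(records)
```

With a layout like this, a job that needs one tenant and one day reads only the matching directory instead of scanning everything.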

3. Security considerations :-

Data security is a major concern for both big data and IOT systems. We should implement privacy by design. Privacy impact assessments and data anonymization are a few of the things to consider.
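
As one small piece of this, identifiers can be replaced with keyed hashes before the data is stored for analytics. The sketch below uses HMAC-SHA256 from the Python standard library. Note that this is pseudonymization rather than full anonymization, since anyone holding the key can recompute the mapping. The field names and the example key are assumptions for illustration only; a real key would come from a secrets store.

```python
import hashlib
import hmac


def pseudonymize(value, secret_key):
    """Replace an identifier with a keyed SHA-256 digest (hex string)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymize_record(record, secret_key, sensitive_fields=("device_id", "owner_email")):
    """Return a copy of record with the sensitive fields pseudonymized."""
    out = dict(record)
    for field in sensitive_fields:
        if out.get(field) is not None:
            out[field] = pseudonymize(str(out[field]), secret_key)
    return out


# Example key for illustration only; never hard-code a real key.
key = b"example-key-for-illustration"
reading = {"device_id": "d1", "owner_email": "user@example.com", "temp_c": 21.5}
safe_reading = pseudonymize_record(reading, key)
```

Because the same input and key always give the same digest, records from one device can still be joined and counted without exposing the original identifier.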

In a multi-tenant environment, you don’t know how fast your data will grow and mature into big data. Preparing your data for the future will let you extract maximum value from it.

“Information is wealth.” Sequoia Applied Technologies will help you design solutions to capture it before you lose it!
