Aju Kuriakose & Anu P Raj had a fun and interactive Q&A session at the Department of Computational Biology, Kerala University. Thanks to Dr. Achuthsankar Nair, Dr. Biji Cl & the rest of the DCBians for making this possible.
With the advent of fast genome sequencing techniques, biological datasets worldwide have exploded to tremendous sizes. For instance, a single patient’s sample, after sequencing and several stages of data processing and analysis, could run to over a terabyte! Raw sequencing data that comes out of the sequencing machine holds potentially useful information only at an abstract level, and requires significant processing to be converted into a meaningful form that can drive genomics research.
Since some of the data conversion steps are highly computation intensive and/or require specialized bioinformatics algorithms, a large portion of the bio-informatics data processing pipeline is implemented in the cloud today. However, once the data resident in the “genomics cloud” reaches the hands of the researcher, it is only as good for research as the analytics and visualization capabilities available to explore it.
Visualization is a graphical representation of data intended to provide the user a qualitative understanding of information. Data visualization techniques greatly enhance the user’s understanding and interpretation of these massive data sets. A visualization-integrated bio-informatics pipeline provides researchers with the ability to explore genomics data and enables them to progressively iterate, backtrack or zero-in on their analysis steps, thereby enabling them to infer high-impact conclusions with an improved degree of confidence within a reasonable time.
The two essential attributes of a successful data visualization framework are:
1) High interactivity
2) Performance at the speed of analysis
Interactivity implies the ability to manipulate graphical entities to derive intuitive data representations. Interactive graphics involve detecting, measuring and comparing the points, lines, shapes and images being represented, in the interest of effective user interpretation, accurate quantitative evaluation, aesthetics and adaptability. Varying the views, labelling points so that the original data can be retrieved, zooming in to focus on a region of the data, exploring neighboring points and offering a user-adjustable mapping together create a good data exploration experience for the user.
Consequently, as the user continuously manipulates data (applies filters, adjusts thresholds, tunes parameters like scale and dynamic range of values) to make “research sense” out of the data, the visualization framework should permit
1) Discrete or continuously variable settings with user-friendly controls like text boxes, selection drop-downs, sliders, knobs etc. and
2) Quick redrawing of the updated graphical representation after every change is made in user settings.
General-purpose and traditional analytics software packages that have been adopted in bio-informatics often come with add-on packages for interactive visualization that offer a basic level of utility for research. With an easy non-programmer model that appeals very much to researchers, these packages provide interactive graphs and plots. Having an in-built web server eliminates the need to install any client applications; all that the user needs is a browser and a URL to point it to.
However, when it comes to enormous datasets that run into millions of data points, these in-built/add-on visualization frameworks are found to be incapable of giving the user acceptable (say, sub-second) performance each time a user setting is changed. Therefore, guaranteeing an analysis continuum to the users remains challenging. Besides, the rendering stability of these in-built/add-on packages is often found to be problematic when large data sets are thrown at them with statistical methods applied on the data. Rendering inaccuracies, including gross misrepresentations of data, are frequently encountered and expose the limits of their scalability.
Hence the need for evaluating, piloting and implementing visualization frameworks based on customized graphical libraries that leverage fast rendering techniques in a browser environment. As our experiments with multiple fast-visualization techniques showed, a customized visualization framework for bio-informatics was the only solution we found that matches the user’s speed of analysis and provides an enhanced time-to-insights experience.
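To make the idea of "fast rendering" on millions of points a little more concrete, here is a minimal sketch of one common technique, min-max decimation: the series is cut into as many chunks as there are horizontal pixels and only the extremes of each chunk are drawn. This is an illustration under our own assumptions (the function name and the bucket count are hypothetical), not a description of the framework used in the experiments mentioned above.

```python
from typing import List, Sequence, Tuple


def minmax_decimate(values: Sequence[float], buckets: int) -> List[Tuple[int, float]]:
    """Reduce a long series to at most 2 * buckets points.

    The series is split into `buckets` contiguous chunks (think: one chunk
    per horizontal pixel) and only the minimum and maximum of each chunk
    are kept, so peaks and dips stay visible after decimation.
    Returns (index, value) pairs in ascending index order.
    """
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    n = len(values)
    if n <= 2 * buckets:
        # Already small enough to draw as-is.
        return list(enumerate(values))

    kept: List[Tuple[int, float]] = []
    for b in range(buckets):
        start = b * n // buckets
        end = (b + 1) * n // buckets
        chunk = range(start, end)
        lo = min(chunk, key=lambda i: values[i])
        hi = max(chunk, key=lambda i: values[i])
        # A set avoids emitting the same point twice when min == max.
        for i in sorted({lo, hi}):
            kept.append((i, values[i]))
    return kept


if __name__ == "__main__":
    # One million synthetic points with a single spike hidden in them.
    signal = [float(i % 1000) for i in range(1_000_000)]
    signal[123_456] = 5000.0
    reduced = minmax_decimate(signal, buckets=800)
    print(len(signal), "->", len(reduced), "points to draw")
```

Because only a few thousand points reach the drawing layer, a redraw after every slider or threshold change stays cheap, while outliers such as the spike above are still rendered.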
In conclusion, a bio-informatics visualization framework needs to be highly interactive and lightning fast to handle data sets in the millions. Further, from the bioinformatics pipeline provider’s perspective, scalability for a large number of concurrent users and security of data are the other key attributes to be satisfied by the visualization framework, just as they apply to other pipeline modules such as data transformation and analytics.
Today, 99.6% of all smartphones run on either iOS or Android. Mobile apps have increasingly gained significance as a way not only to conduct business but also to raise brand awareness. Hundreds of new applications are launched every day. In the last few years, the concept of cross-platform mobile app development has taken off in a big way. It allows the developer to write the code once and employ it across all platforms – Android, iOS or Windows. Some of the advantages of developing cross-platform apps.
Cross-platform vs Native apps:
Native apps are written in languages that the platform accepts natively. For example, Swift or Objective-C is used to write native iOS apps, Java is used to write native Android apps, and C# for the most part for Windows Phone apps.
Apple and Google offer app developers their own development tools, interface elements and standardized SDKs, through Xcode and Android Studio respectively. This allows any professional developer to develop a native app relatively easily.
- Since native apps work with the device’s built-in features, they are easier to work with and also perform faster on the device.
- Native apps get full support from the concerned app stores and marketplaces. Users can easily find and download apps of their choice from these stores.
- Because these apps have to get the approval of the app store they are intended for, the user can be assured of complete safety and security of the app.
- Native apps work out better for developers, who are provided the SDK and all other tools to create the app with much more ease.
Cross-platform development tools that do not use WebView and communicate with the platform directly aren’t united in any subgroup. Existing under the general term of cross-platform development, they are sometimes called native development tools, which just makes it all even more confusing. For the sake of convenience, we’ll refer to these tools as ‘near-native’ here and will explain why they deserve such praise.
In an ideal scenario, cross-platform apps work on multiple operating systems with a single code base. There are 2 types of cross-platform apps:
- Native Cross-Platform Apps
- Hybrid ‘HTML 5’ Cross-Platform Apps
Native Cross-platform Apps
Native cross-platform apps are created when you use APIs that are provided by the Apple or Android SDK but implement them in other programming languages that aren’t supported by the operating system vendor. Generally, a third-party vendor provides an integrated development environment that handles the process of creating the native application bundle for iOS and Android from a single cross-platform codebase. In this case, the final product is an app that still uses native APIs, and cross-platform native apps can achieve almost native performance without any lag visible to the user. NativeScript, Xamarin, and React Native are the most common examples of native cross-platform frameworks.
Hybrid HTML 5 cross-platform apps
Mobile app development tools
Xamarin apps are built with standard, native user interface controls. Built with C# and .NET, Xamarin allows developers to re-use code and simplifies the process of creating dynamic layouts in iOS. Apps not only look the way the end user expects, they behave that way too. Xamarin apps have access to the full spectrum of functionality exposed by the underlying platform and device, including platform-specific capabilities like iBeacons and Android Fragments. Xamarin apps leverage platform-specific hardware acceleration and are compiled for native performance. This can’t be achieved with solutions that interpret code at runtime.
Apache Cordova comes with a set of pre-developed plugins which provide access to the device’s camera, GPS, file system etc. As mobile devices evolve, adding support for additional hardware is simply a matter of developing new plugins.
The React Native framework was created by Facebook, and its development started as a result of a hackathon back in 2013. React Native is an example of a technology that the developer community created for itself when developers were looking for a tool that would combine the advantages of mobile development with the power and agility of the native React environment. React Native’s genesis resulted in a huge, enthusiastic community investing in the framework’s development, and there are catalogs of freely available components that go with it.
React Native provides development tools for debugging and application packaging, which saves time.
Which One to Choose
So, if you want to impress users with a lightning fast interface, rich functionality, and overall performance, native apps are what you need. In addition, you get better security and stability. The price for this is that you’ll most likely need to hire two dedicated teams, one for each platform. A small business may not be able to afford to develop an application for both platforms.
Cross-platform apps, on the other hand, can be developed for both iOS and Android from one codebase. Plus, cross-platform apps are much easier in terms of maintenance and deployment, so you can spend more time and money on marketing and attracting new customers. However, their biggest disadvantage is lower performance, which may be especially crucial if you’re developing an application with features that require deep hardware integration.
Big data as a technology has passed through various stages of evolution during the last few years, which still keeps it hot in the list of tech buzzwords! Starting with handling the 3 V’s of data – Volume (of data to be handled), Velocity (of data generated) and Variety (of data generated), it has spread wings to more V’s – Veracity (to ensure data integrity and reliability), Vulnerability (to address privacy and confidentiality concerns) and Value (of information)!
As Google showed the way, collection and collation of huge volumes of data and applying the right analytics to gain valuable insights into the business and optimization possibilities is the key to extracting the full potential of the data-driven industry. Today Chief Data Officers are building strategies to organize their data and to derive business intelligence from it to drive radical transformation of businesses in many sectors such as industrial, retail, logistics, healthcare etc.
BDaaS (Big Data-as-a-Service) is gaining momentum, enabling external experts to take the company’s customer data to the cloud and to provide analytical insights for decision making. Offered as a managed service, it frees up the customer from substantial initial investment and helps offer RoI-driven spending. This article focusses on BDaaS, describing the potential that enables our customers to conceptualize and launch new business models.
Large corporations with structured and centralized ERP systems wouldn’t benefit as much from BDaaS as unorganized sectors comprising diverse players, each with their own fragmented IT infrastructure. For instance, unorganized retail is a heterogeneous sector with a geographically distributed supply chain that spans medium and small players with considerable differences in their levels of process maturity. Stand-alone islands of software applications are often encountered, and so are ad-hoc (or legacy) structures of data storage and archival. B2B companies providing services to geographically spread out customers in many traditional supply chains, like chemicals/reagents for laboratory use, petrochemical (non-fuel) derivatives and medical drugs, could benefit from the transformational potential of BDaaS.
Suppose you are a B2B player in one of these or similar sectors. Let us take a closer look at your business and customer data! Could your expertise in the industry be leveraged to identify a new data-driven model by “integrating your customer data” to offer new intelligence gleaned from it? This integration gives you the data at the ‘sector level’ rather than the ‘individual’ customer level. You will be able to identify sector-level intelligence and provide it to all your customers, to the mutual benefit of all.
To accomplish this outcome, you will most often need external expertise in big data to work collaboratively with you (or your domain consultants) to build a BDaaS platform to offer your customers. The value of business intelligence that the platform brings helps them win in their businesses and their patronage in turn helps your business model succeed.
So has been our experience working with a world leader in the pharmacy supply chain across North America. Besides supplying medicines and medical equipment to their customers, they also provide inventory and patient management software to their customers. The software installed in each of the numerous hospitals gathers transactional data over time. We worked closely with the customer’s consultants on the feasibility of data integration and created a centralized control center using big data technologies such as Spark and Kafka. Hosted on the cloud, the platform captures streaming data from different hospitals and pushes them to the centralized system that offers a metered BDaaS service to end-customers, the analytics insights helping them to optimize their businesses.
The path to big data implementation, however, was filled with several challenges, a few of which are:
With the regulatory requirements concerning medical information like the HIPAA standards, compliance is mandatory. Only non-sensitive data at a lower level of granularity is collected, which respects the individual hospitals’ concerns about exposing their patients’ sensitive information. This is the key factor in the success of the project from both the customer buy-in and regulatory compliance points of view. The collected data is pushed to the cloud securely with transport layer security.
Variety of data
The data being heterogeneous and scattered is the foremost challenge while implementing big data solutions. Even though most hospitals use our customer’s software, a few others use their own legacy software. Data could be isolated even across departments in the same organization! We built data collector modules which can be customized easily to collect data from various sources and push it to the cloud. Rationalizing the relevant data fields from these diversified sources and integrating them provides a lot of insight into the possibilities of analytics.
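As a rough illustration of what "rationalizing the relevant data fields" can look like inside a collector module, the sketch below maps each source's field names onto one common schema and forwards only the mapped fields, so anything sensitive is dropped by default. The source names, field names and schema are hypothetical assumptions made for this example; the actual collectors and the step that pushes records to the cloud are not shown.

```python
from typing import Any, Dict

# Hypothetical mapping from each source system's field names to one
# common schema. All names here are illustrative assumptions.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "vendor_software": {
        "item_code": "drug_code",
        "qty_on_hand": "stock_units",
        "site": "hospital_id",
    },
    "legacy_system_a": {
        "DRUGID": "drug_code",
        "STOCK": "stock_units",
        "HOSP": "hospital_id",
    },
}


def normalize_record(source: str, raw: Dict[str, Any]) -> Dict[str, Any]:
    """Translate one raw record into the common schema.

    Only fields listed in the mapping are forwarded (a whitelist), so
    unmapped fields such as patient identifiers never leave the site.
    """
    mapping = FIELD_MAPS[source]
    record = {common: raw[src] for src, common in mapping.items() if src in raw}
    record["source"] = source  # keep provenance for later troubleshooting
    return record


if __name__ == "__main__":
    legacy_row = {"DRUGID": "D-100", "STOCK": 42, "HOSP": "H-9", "PATIENT_NAME": "redacted"}
    print(normalize_record("legacy_system_a", legacy_row))
```

Supporting a new legacy system then means adding one more entry to the mapping table rather than writing a new collector.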
Time to market and initial investment
Since this is a metered service, we had to make sure that the customer’s cost stays linear with usage. The Databricks big data platform with a reliable open source Kafka data injector gives us a balanced and scalable framework to meet this objective.
After data was made available from all sources centrally for analysis it was discovered that information on the availability of particular medicines in each hospital along with demand predictability has the potential to reduce the associated transportation costs by around 20%. Data-driven drill down revealed for instance that for a particular area with a prevalence of influenza but with shortage of the corresponding medicine, the system can identify the best possible area (nearest, where there is enough stock but no demand currently) from which this medicine can be arranged. Supply chain demand mitigation by coordinating drug supply between customers can significantly save inventory and transportation cost for customers. More importantly, it saves precious reaction time for their end-users, which would not have been possible without the magic of BDaaS.
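The "best possible area" lookup described above can be sketched in a few lines. This is a simplified illustration, not the production logic: sites are placed on a flat kilometre grid, "enough stock but no demand" is generalized to "stock minus forecast demand covers the shortfall", and all names and numbers are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    name: str
    x_km: float           # assumed flat-grid coordinates, for illustration
    y_km: float
    stock: int            # units of the medicine on hand
    forecast_demand: int  # units expected to be needed locally


def find_donor(needy: Site, sites: List[Site], units_needed: int) -> Optional[Site]:
    """Return the nearest other site whose surplus covers the shortfall."""
    candidates = [
        s for s in sites
        if s.name != needy.name and s.stock - s.forecast_demand >= units_needed
    ]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda s: math.hypot(s.x_km - needy.x_km, s.y_km - needy.y_km),
    )


if __name__ == "__main__":
    a = Site("Area A", 0, 0, stock=5, forecast_demand=200)      # flu outbreak, short of stock
    b = Site("Area B", 30, 40, stock=500, forecast_demand=0)    # 50 km away, idle stock
    c = Site("Area C", 6, 8, stock=300, forecast_demand=290)    # close, but needs its own stock
    donor = find_donor(a, [a, b, c], units_needed=150)
    print(donor.name if donor else "no donor available")
```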
In your own strategy to connect your fragmented customer data centrally to provide mutually beneficial information, the role of an experienced big data partner is indeed crucial. Combine the power of your domain expertise with big data specialists to create new data-driven business models which besides increasing your revenues could make you the hub to all customers thereby increasing the bonding of existing ones and attracting new ones.
IoT is changing the world around us. This change is affecting every walk of life, including the maintenance industry. Maintenance management used to depend on the troubleshooting skills of maintenance managers and was hardly data-driven, as they had very limited data to fall back on when it came to machine health. However, this is rapidly changing: maintenance is becoming more data-driven than skill-driven. Advances in wireless communications and data processing enable maintenance managers to gauge the health of the factory in an instant.
We can tell that it is no longer hype but a reality, and the proof is in the fact that a leading organization, the OPC Foundation, is spending time developing the Unified Architecture (UA) Specification for IIoT in the manufacturing environment. The standard is being developed to enable IIoT devices to easily pass information between sensors, machines, monitoring devices and the cloud in a secure and open way. Also, OPC, AMT & OMAC have jointly developed Packaging Machine Language (PackML) and MTConnect, combining OPC UA with existing industry standards to lower the cost of predictive maintenance.
The low cost of IIoT sensors is making it a no-brainer to predict failure or measure the remaining useful life (RUL) of a tool, enabling maximum uptime at optimum cost. As an example, a drill will start to suffer wear over the course of its use. As we continue using it regularly, at some point it becomes unusable, either because the precision of the job falls below the required parameter or because the drill bit breaks off. With the combination of Industrial IoT sensors and AI techniques, today we can easily predict the remaining useful life of the tool.
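To give a feel for what an RUL estimate involves, here is a deliberately simple sketch: fit a straight line through the wear readings and extrapolate to the point where wear crosses the usable limit. Real deployments typically use richer models than this; the linear trend, the function name and the sample readings are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def estimate_rul(cycles: Sequence[float], wear: Sequence[float], wear_limit: float) -> Optional[float]:
    """Estimate remaining cycles until `wear_limit` is reached.

    Fits wear = slope * cycles + intercept by ordinary least squares and
    extrapolates. Returns None when no upward wear trend is visible yet.
    """
    n = len(cycles)
    if n < 2 or n != len(wear):
        raise ValueError("need at least two paired readings")
    mean_x = sum(cycles) / n
    mean_y = sum(wear) / n
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    if sxx == 0:
        raise ValueError("readings must be taken at different cycle counts")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, wear)) / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    cycles_at_limit = (wear_limit - intercept) / slope
    # Never report a negative life if the limit has already been passed.
    return max(0.0, cycles_at_limit - cycles[-1])


if __name__ == "__main__":
    # Hypothetical drill-bit wear (mm) sampled every 100 drilling cycles.
    cycle_counts = [0, 100, 200, 300]
    wear_mm = [0.00, 0.10, 0.20, 0.30]
    print("estimated cycles left:", estimate_rul(cycle_counts, wear_mm, wear_limit=0.50))
```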
Any maintenance professional will agree with me that predictive maintenance is a journey they have to take, but IIoT makes the journey easier. Retrofitting an existing machine with a sensor to measure machine health becomes very easy. One of the companies we work with to enable this transition is OPA By Design. Their smart device can be tagged onto any existing machine at a very minimal cost to measure 8 different parameters and report them to maintenance supervisors via a mobile app and the cloud. Since the machine is constantly being monitored, any sign of degradation in its health raises an alert instantly.
IIoT is also helping to drive down inventory holding costs, as maintenance supervisors now have better predictability of machine failure and hence need to stock fewer spares. It also results in fewer emergency inventory orders and less downtime due to out-of-stock inventory.
IIoT is not changing anything for the maintenance professional except the fact that he can now listen to his assets and make informed decisions based on actual data on the health of the asset. IIoT is not going to fix the problem for him. He will still have to depend on his best technician to fix it reliably.
Automotive electronics has been making steady gains as a percentage of total vehicle cost world-wide. Consequently, it has been facing some of the same challenges that were faced earlier (and mostly solved by automated tests) in other areas of automobile mass-manufacturing – fabrication, mechanical assembly, electrical components and hydraulic systems.
A typical example is the Electronic Control Unit (ECU) that has become the heart (or brain!) of the modern automobile. An ECU receives inputs from various sensors and sends outputs to multiple actuators, in addition to communicating with other ECUs of related subsystems in the vehicle. Some ECUs implement performance critical functions such as fuel injection, ignition timing etc., whereas others control safety critical systems such as the Anti-lock Braking System (ABS), Electronic Stability Control (ESC) etc. Therefore an automated manufacturing test station for the ECU is significantly complex in design, involving several pieces of instrumentation, simulation of sensors and multiple automotive communication protocols.
Let’s see if some real-world figures could lend a quantitative perspective to this mass-manufacturing challenge. For instance, let’s take the case of a mid-size automotive OEM that sells over 100,000 vehicles annually, with production in 2 plants of identical capacity. Taking engine control alone, that would mean at least an equal number of ECUs supplied annually by their Tier-1 ECU Manufacturer, who needs to manufacture around 8 ECUs in an hour, assuming full 3-shift operations. Assuming 4 parallel assembly lines, that gives less than 30 minutes to manufacture an ECU! The time practically available for testing ECUs at the End-of-Line (EoL) is even shorter. Assuming 2 parallel test stations, the operator would typically have less than a minute to test an ECU – to load it on the test station, execute the automated tests, know if it passed or failed, print a bar code and affix it to the passed piece (or dump the failed piece into the reject bin), unload the ECU and be ready to load the next one! Added to this is the complexity of different versions of the same ECU being in production simultaneously. Since batches with different versions of the ECU come to the same test station, the operator needs to reconfigure the station for a different set of tests each time. The reconfiguration must typically be completed within 4 to 5 minutes before loading the next ECU type.
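The production-rate figures above can be reproduced with a short calculation. The sketch assumes that the 8-per-hour rate is per plant and that a plant runs 260 working days a year around the clock; neither assumption is stated in the text, but together they land on roughly 8 ECUs an hour and just under 30 minutes per ECU on each of 4 lines. The sub-minute test budget is a practical figure from the shop floor and is not derived here.

```python
def ecus_per_hour(annual_volume: int, plants: int, working_days: int, hours_per_day: float) -> float:
    """Average hourly output one plant must sustain to meet the annual volume."""
    return annual_volume / plants / (working_days * hours_per_day)


def minutes_between_units(rate_per_hour: float, parallel_units: int) -> float:
    """Time each of `parallel_units` lines has per ECU at the given overall rate."""
    return 60.0 * parallel_units / rate_per_hour


if __name__ == "__main__":
    # 260 working days and 24 hours a day are assumptions for illustration.
    rate = ecus_per_hour(annual_volume=100_000, plants=2, working_days=260, hours_per_day=24)
    print(f"required rate per plant: {rate:.2f} ECUs/hour")
    print(f"time per ECU on each of 4 lines: {minutes_between_units(rate, 4):.2f} minutes")
```

Changing the working-day assumption shifts the numbers slightly, but the order of magnitude, and hence the pressure on the test station, stays the same.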
Now let’s review how this challenge applies (or doesn’t apply!) to different segments in the automotive industry. It’s a no-brainer that any Tier-1 Manufacturer (or OEM) in the business would have all of this covered in their factory floors already, if not they would hardly be selling! However it is no longer the steady-state in the case of a newly introduced ECU design, be it part of a new brand of vehicle the OEMs plan to introduce to the market, or be it related to an additional feature, like adaptive cruise control, that’s being introduced for a new model variant. Does the Tier-1 Manufacturer have the required engineering bandwidth to design the test station themselves? In the case of technology transfer for ECU design from a global principal, does the Tier-1 Manufacturer have in-house expertise in the early stages to develop a test station on time before pilot production starts? In the case of in-house development of the ECU, does the Tier-1 Manufacturer really have the resources, bandwidth and simply the time to get the test station ready before the ECU design passes all type tests and hits production?
Alternatively, do existing test station vendors for other components, like starter motors, tiltable mirror assemblies or instrument clusters, have the necessary expertise to design such a complex test station? What about ECUs for Electric Vehicles (and hybrids) that are predicted to transform the entire motoring landscape forever! Not to forget the two-wheeler (and three-wheeler) segments, which under the rapidly closing time window of emission control regulations (Bharat Stage-VI in India although behind Euro-VI by a few years, has a 2020 deadline currently!) will be forced to switch to ECU based fuel-injection etc. in a few years’ time in order to legally sell in the market.
Here’s where a little foresight into accelerating the design of manufacturing test solutions could benefit the relevant stakeholders. At Deep Thought Systems, we have designed and developed a reliable, modular and generic platform called TestMate for building manufacturing test stations specifically for ECUs. We have successfully customized TestMate to supply EoL test stations for ECUs to Indian Tier-1 Manufacturers and OEMs in a very short turnaround.
The Human Machine Interface (HMI) of the TestMate, the main part that the operator sees and operates on a continuous basis, is a very generic requirement that consists of a rugged enclosure, controls and indications built for long years of reliable performance on an assembly floor. They say, and we’ve witnessed it ourselves, that routine use of test stations by factory operators indeed constitutes a really harsh environment! The mounting, orientation, peripherals for viewing and printing, display properties etc. are all ergonomically designed for continuous usage by an operator over an 8-hour shift (or longer!). We have successfully installed the test station on factory floors where it has been in continuous use for years, with zero support calls.
We work with the customer on the ECU connector type to design a custom cable harness and test fixture that includes the mating connector, with a locking arrangement. The fixture design ensures proper contact between the pins of the ECU connector and the mating connector over months of continuous loading and unloading. We equip the customer with a spare cable harness to handle the unlikely event of damage due to exceedingly rough/careless usage by operators; it can be easily replaced onsite without having to depend on a service engineer.
Built on the same principles as our other automotive offerings for vehicle diagnostics, testing and simulation, TestMate is capable of communicating with various ECU designs over multiple automotive communication protocols like CAN, K-Line and LIN and messaging standards like J1979, J1939, UDS, KWP2000 etc. We work with the customer to customize it for the ECU’s communication specification. Apart from testing continuous engine parameters, Diagnostic Trouble Codes defined for the ECU can also be tested. Since it contains many building blocks of an actual ECU, for many communication tests the test station appears to the ECU as a peer ECU (sometimes multiple) of the related sub-system(s)!
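As a small illustration of what checking a continuous engine parameter and a Diagnostic Trouble Code involves at the message level, the sketch below decodes two standard OBD-II (SAE J1979) encodings: the engine speed reply to mode 01, PID 0x0C, and the two-byte DTC format. It shows the public encoding only, with transport framing over CAN or K-Line left out, and is not a description of TestMate’s firmware; the function names are our own.

```python
def decode_engine_rpm(response: bytes) -> float:
    """Decode a J1979 mode 01, PID 0x0C reply: 0x41 0x0C A B -> rpm.

    Engine speed is encoded in quarter-rpm steps: (256 * A + B) / 4.
    """
    if len(response) < 4 or response[0] != 0x41 or response[1] != 0x0C:
        raise ValueError("not a mode 01 / PID 0x0C response")
    return (response[2] * 256 + response[3]) / 4.0


def decode_dtc(high: int, low: int) -> str:
    """Convert the two bytes of an OBD-II trouble code to text, e.g. P0133.

    Top two bits select the system (P, C, B, U), the next two bits give
    the first digit, and the remaining three nibbles are hex digits.
    """
    system = "PCBU"[(high >> 6) & 0x03]
    first_digit = (high >> 4) & 0x03
    return f"{system}{first_digit}{high & 0x0F:X}{low:02X}"


if __name__ == "__main__":
    print(decode_engine_rpm(bytes([0x41, 0x0C, 0x1A, 0xF8])), "rpm")
    print(decode_dtc(0x01, 0x33))
```

A test step would compare the decoded value against the speed the station is simulating on the crankshaft input, and check that the expected trouble code appears when a fault is injected.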
TestMate can reliably simulate inputs to the ECU, ranging from the simplest ignition key switch to the complex crankshaft position waveform that is a critical input for many engine control functions. It also measures the ECU’s outputs, ranging from discrete voltages or timed pulses to PWM waveforms driving actuators, and evaluates them against defined limits for pass or fail. In addition to functional tests, power supply and other electrical (negative) tests can be performed to test how well the ECU hardware responds to abnormal conditions, like reversed polarity of the power supply, under voltage etc. The I/O instrumentation is completely custom-designed as per the interface specification of the ECU.
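Evaluating measured outputs against defined limits can be pictured with the following sketch. The measurement names, units and limit values are hypothetical, and a real station would load them from the test configuration for the ECU version on the fixture; the point is only that every failure message names the test and the violated limit.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Limit:
    name: str   # measurement identifier (hypothetical names below)
    low: float
    high: float
    unit: str


def evaluate(measurements: Dict[str, float], limits: List[Limit]) -> List[str]:
    """Return a list of failure messages; an empty list means PASS."""
    failures: List[str] = []
    for lim in limits:
        value = measurements.get(lim.name)
        if value is None:
            failures.append(f"{lim.name}: no measurement")
        elif not (lim.low <= value <= lim.high):
            failures.append(f"{lim.name}: {value} {lim.unit} outside [{lim.low}, {lim.high}]")
    return failures


if __name__ == "__main__":
    limits = [
        Limit("injector_pulse_ms", 2.0, 3.0, "ms"),
        Limit("fuel_pump_pwm_duty", 45.0, 55.0, "%"),
    ]
    measured = {"injector_pulse_ms": 2.4, "fuel_pump_pwm_duty": 61.0}
    problems = evaluate(measured, limits)
    print("PASS" if not problems else "FAIL: " + "; ".join(problems))
```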
The HMI software supports multiple levels of users, with differential permissions defined for each login level, like running tests, modifying test parameter limits, changing the sequence of tests, error message text, test calibration and troubleshooting. All tests are logged for later review by supervisors or managers. For failed tests clear troubleshooting assistance is displayed/logged as to which specific test failed and how exactly, so that the defective unit can be repaired. An ECU may come in twice for tests, once after bare assembly without the enclosure, and once again after the enclosure is fitted.
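The login levels described above amount to a small permission table. The sketch below shows one way such a table could be expressed; the three level names and the action-to-level assignments are illustrative assumptions, not the actual TestMate configuration.

```python
from enum import IntEnum


class Level(IntEnum):
    # Hypothetical login levels, ordered from least to most privileged.
    OPERATOR = 1
    SUPERVISOR = 2
    ENGINEER = 3


# Minimum level required for each action (illustrative assignments).
REQUIRED_LEVEL = {
    "run_tests": Level.OPERATOR,
    "review_logs": Level.SUPERVISOR,
    "modify_limits": Level.ENGINEER,
    "change_sequence": Level.ENGINEER,
    "calibrate": Level.ENGINEER,
}


def is_allowed(user_level: Level, action: str) -> bool:
    """Allow an action only if it is known and the user's level is high enough."""
    required = REQUIRED_LEVEL.get(action)
    return required is not None and user_level >= required


if __name__ == "__main__":
    print(is_allowed(Level.OPERATOR, "run_tests"))
    print(is_allowed(Level.OPERATOR, "modify_limits"))
```

Denying unknown actions by default keeps a newly added feature locked until someone deliberately assigns it a level.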
Finally it all comes together in the hands of the operator, who after loading an ECU has less than a minute to run the automated tests to know if it is a pass or a fail. Pass is good news always, the ECU gets a bar-coded label stuck on it and moves forward to the next stage. However a fail is hardly the end of the road because in order to keep the rejection costs low failed units need to be repaired, with the test station providing precise troubleshooting information to get it repaired quickly. In this context a few pertinent questions for relevant Tier-1 Manufacturers and OEMs are:
1) How much of ECU test station design could be generic, versus how much of it should essentially remain ECU design specific?
2) Is it justified, in terms of engineering effort, cost or timelines, for their business to completely reinvent a unique solution to their challenge, when large parts of the challenge retain a commonality that a generic test platform such as TestMate has not only abstracted, but has also customized for specific ECUs and proven on the factory floor?
At Deep Thought Systems, we clearly understand the generic and reusable parts of the TestMate platform which help accelerate the design of EoL Test Stations. A high-performance hardware platform, powered by a real-time operating system and sound embedded firmware design practices ensures fast test execution and that all timing considerations in vehicle communication protocols are taken care of. Thanks to our expertise in digital and mixed signal hardware design, we are able to quickly customize other parts of the test station like I/O interfaces, ECU fixture and HMI software as per the customer’s specification and needs with total assurance of the customer’s Intellectual Property.
Another closely related area for production where we could work with customers to provide a quick solution is the design and supply of ECU Flashing units. Operators use the flashing units to flash the firmware into ECUs after assembly. The design of the ECU flashing unit is greatly accelerated by our generic ECU flashing framework, where the only input required from the customer is the seed generation algorithm for unlocking the ECU, which could be imported into our firmware as a library (in binary form) to protect the customer’s (or principal’s) confidentiality. In conclusion, our expertise and track record of supplying and installing EoL test stations on factory floors and supporting production personnel in the usage and fine-tuning of these systems will ensure an efficient and trouble-free operation for the customer for the entire production lifecycle.
These days we see quite a few technology companies going the crowdfunding route(Indiegogo/Kickstarter) to get to market sooner rather than wait for the traditional way of raising money to build the product. It appears as a beautiful idea if co-founders do not want to give out equity but raise money to get to market. But I personally feel that this is a double edge sword and entrepreneurs have to be very careful with the choices as it may end up hurting more than helping in the long run.
What I have noticed is companies look forward to crowd sourcing mostly for either one of the following reasons
- Raise money to help them accelerate the engineering cycle time and help them reach market faster with confirmed orders – The challenge with this usually is if you are not far enough into your engineering /product cycle with most problems solved the money raised through these campaigns are in most cases is not enough to get to production and delivery.
- Create a sales and marketing buzz which then later helps them to get leverage with retailers and opens up many channels – This is a fantastic model because getting into some of the traditional channels to sell a product is not easy. But these days Bestbuy/Amazon etc. have a separate focus on successful products from these campaigns. So this will enable the startups to get into shelf faster if they are successful. This also gives a better chance of getting picked up by some distributors.
- Show the demand the product has in the market to convince tradition VC’s to put in money into the company – This is a good idea only if you are convinced that your product is going to be a runaway success else the chances are that it can do more harm. Any thing less than a runaway success is going to raise more questions and challenges when startups try to raise money than help.
From the few statistics I have read and from our own experience, most of these projects do not ship on time. This ends up damaging credibility with the very customer base that supported the product. And if the product then turns out to be below par after a long wait, we also have a very unhappy customer to deal with.
What I have noticed is that many of these companies fail to deliver on time because:
- They are trying to solve some really challenging engineering problems that need far more money than they can get from a Kickstarter/Indiegogo campaign, so they start falling behind on development goals and delivery deadlines.
- They have the right idea and concept but limited experience in delivering products end to end. Once they start dealing with it, they realize the unknowns far outnumber the knowns, and they start slipping.
- They are fighting battles on many fronts, and crowdfunding is just one of the avenues. They tend to get carried away in their engineering cycle when they see greener pastures, which ends up adding delays.
My personal view has always been that crowdfunding is a good platform if you are done with 80% of your engineering. As I mentioned earlier, this is because the money you raise from pre-selling the product is usually enough to pay for your production needs only. If you are planning to do your core engineering and deliver the product on this money, the likelihood of failure and delays is very high. The only exception I can think of is a company with a reliable partner or team in a country like India or China, where the bulk of the engineering is being done; in that case the money does tend to help even if the company is behind in its product lifecycle.
I think backers should check with the company how much of the engineering is already done, and ask to see working prototypes, actual industrial design mockups, software demos etc., before putting money in. It may also help to ask how the money collected is going to be spent, because the answer gives you an idea of the readiness of the product you are backing.
Link to article on LinkedIn
‘Platform as a Service’ (PaaS) in the distributed systems arena is gaining wide adoption as the cloud gains more customer confidence. The latest IDC forecast states: “By 2020, IDC forecasts that public cloud spending will reach $203.4 billion worldwide”. IDC also predicts fast growth in the PaaS segment, with a Compound Annual Growth Rate (CAGR) of 32.2% over the next five years, which is very promising. PaaS Solutions for distributed systems have captured the serious attention of big players such as Amazon (AWS EMR), Google (Google Cloud Platform), Microsoft (HDInsight) and Databricks (Unified Analytics Platform), and the count is growing by the day. The same is the case for IoT, with platforms from Amazon (AWS IoT), IBM (Bluemix) and Cisco (Cloud Connect) being the major ones in a growing list.
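To put the 32.2% CAGR figure in perspective, the short sketch below works out what that rate compounds to over five years. It uses only the rate and the period quoted above; the function names are illustrative, and no base-year market size is assumed because the forecast excerpt does not give one for the PaaS segment.

```python
def growth_multiplier(cagr: float, years: int) -> float:
    """Total growth factor after compounding `cagr` once per year."""
    return (1.0 + cagr) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Inverse calculation: the annual rate that takes `start` to `end`."""
    return (end / start) ** (1.0 / years) - 1.0


# 32.2% per year, compounded over the five-year forecast window.
multiplier = growth_multiplier(0.322, 5)
print(f"{multiplier:.2f}x")  # prints 4.04x
```

In other words, a segment growing at that rate would be roughly four times its starting size at the end of the five years.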
The explosive growth of PaaS Solutions is boosted by the complexity of DevOps and administration nightmares encountered in distributed systems; we still remember the Apache Hadoop version upgrades that always led to sleepless nights!
PaaS Solutions absorb a lot of the complexity of distributed systems, which means that:
1. You can evaluate platforms straight away. You no longer need to wait for cost approvals or deployment completion, as you would with on-premises or IaaS deployments.
2. IoT enabling becomes as fast as plugging an agent into your device.
3. Version upgrades of open source distributed platforms like Apache Spark, Apache Hadoop, Apache Kaa etc. are automated and become just configuration changes.
4. You can enjoy additional features provided by the vendors, such as notebook integration and REST API service support.
All fine! But are there any hidden factors in PaaS Solutions that need to be considered? From my experience of the past few years, it is a big YES! Especially for IoT and Big Data applications.
A ready-made dress may still need alterations!!!
PaaS solutions allow us to remain focused on the application use case by making it possible to spin up any platform with a few clicks. Moving to another platform configuration is as easy as changing a few parameters and restarting. Major configurations and optimizations inside the platform are completely transparent to the user (that is, hidden from view), which is an advantage most of the time.
However, complete transparency of the system is not always helpful. You may need to play around with platform configurations to tune the application running on top of it, for example by trying a few customized or new plugins that can give extra muscle to your application. With open source incubations growing rapidly and a lot of innovative new tools for distributed systems being released every month, you need the flexibility to use them on the platform. Debugging or performance benchmarking an application that runs over a totally transparent underlying platform is not good news for system designers. So when a platform is said to be transparent, we should also check the level of control we have over it.
For instance, while working with a major US healthcare player to collect their large data streams for predictive and descriptive analytics, we were using Kafka for data ingestion and an Apache Spark Streaming PaaS for data landing and processing. The initial evaluation and selection of the platform went well with standard architectural considerations, and we were happy with the platform choice. Once the development of the application’s functionality was over and alpha tests were completed, we started looking at a few optimizations and tuning as part of the refactoring, for which access to the platform cluster nodes became essential. We asked the platform vendor for access to the cluster nodes, but their reply was disappointing. Their customer support said “It’s completely transparent to user and we do not recommend any access or modification of the platform configurations”. We were stuck!!!
In another case, a Smart Battery IoT project, we were pushing status info from the smart device to an IoT PaaS platform for self-tuning. The data was being stored internally in the PaaS system. Things were working great, and we were able to view the data using the vendor’s custom tools and limited REST API based query access. However, our project strategy called for a raw data lake in AWS S3 for future analytics. To our surprise, we found that there was no option for data export! Since this is a very basic yet important feature, we contacted the IoT platform’s technical support. Their response was “Yeah, it is a simple feature, but it is not in our ‘Business priority list’ of features. So, it may take us some more time to do it”. How much more time was unknown! We were stuck again, and had to review the raw data lake policy.
In both cases, our development plans were seriously impacted, and we were forced to skip or postpone major use cases, or to start looking for an alternative platform to migrate to, even that late in the project. Let’s look closely at the responses from technical support in these two cases for a few interesting facts.
Case 1: “It’s completely transparent to user and we do not recommend any access or modification of the platform configurations”
Transparency of platform complexities is definitely an important motivation for opting for PaaS, as it gives a quick, efficient and cost-effective way of building a distributed system. But it is important to have insight into the platform internals and, in a few cases, some control as well. As system designers, we don’t like to swallow things as they are!!! After all, “platform limitations” is definitely not the story we want to tell our customers! In this specific case, we were looking to try out external monitoring tools that need a few agents to be installed on the cluster nodes. Eventually, supporting a third-party BI tool took us roughly two months of coordination between the technical teams at the PaaS vendor and the BI vendor. This is simply not acceptable to the customer in terms of time or budget.
Case 2: “Yeah, it is a simple feature, but it is not in our ‘Business priority list’ of features. So, it may take us some more time to do it”
Not just disappointing, this is alarming!!! Technical interoperability for the customer’s data should not be restricted for the sake of business priority. Unfortunately, the so-called “business priority” often loses focus on retaining customers, which reminds one of a “My way or the highway” strategy! No customer wants their data stuck in a specific platform. We need the flexibility to move it across multiple platforms, as business data holds latent insights that could be extracted by different systems today or in the future.
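As a rough illustration of why export matters, the sketch below shows what a stop-gap export could look like when a platform offers only paged query access: it drains a paged source into newline-delimited JSON, a format that most data lake tooling can read. Everything platform-specific here is an assumption made for illustration. `fetch_page`, its cursor convention and the record fields are hypothetical and do not describe any vendor’s API, and the upload to object storage is left out.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, List, Optional, Tuple

# Hypothetical contract: fetch_page(cursor) returns (records, next_cursor).
# The first call passes None; next_cursor is None on the last page.
Record = Dict[str, object]
FetchPage = Callable[[Optional[str]], Tuple[List[Record], Optional[str]]]


def iter_records(fetch_page: FetchPage) -> Iterator[Record]:
    """Yield every record by following cursors until the source is drained."""
    cursor: Optional[str] = None
    while True:
        records, cursor = fetch_page(cursor)
        yield from records
        if cursor is None:
            return


def to_json_lines(records: Iterable[Record]) -> str:
    """Serialize records as newline-delimited JSON, one record per line."""
    return "".join(json.dumps(r, sort_keys=True) + "\n" for r in records)


# Stand-in for the platform's query API (made-up data, two pages).
_PAGES = {
    None: ([{"device": "bat-1", "soc": 81}], "p2"),
    "p2": ([{"device": "bat-2", "soc": 64}], None),
}


def fake_fetch_page(cursor: Optional[str]) -> Tuple[List[Record], Optional[str]]:
    return _PAGES[cursor]


payload = to_json_lines(iter_records(fake_fetch_page))
# In a real export the payload would now be written to the data lake
# (for example an S3 bucket); that step is omitted from this sketch.
print(payload, end="")
```

Whether such a workaround is viable depends entirely on how limited the query access is, which is exactly why a proper export feature should not be optional.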
To sum up, apart from the traditional architectural considerations for choosing between on-premises, IaaS, PaaS or SaaS, we should be vigilant about these hidden factors when selecting distributed platforms, especially for IoT and Big Data applications, where large amounts of data are generated. The hidden factors are tricky in the sense that they may not be visible at first glance.
Some of the architectural considerations that help mitigate these hidden factors are given below.
1. Create a proper migration plan – This may not be a short-term goal, but it becomes very important because, as the data grows, you may end up in a world of restrictions.
2. Make sure you have enough control over the platform internals – Although you want to avoid administration overheads as much as possible, you still need good control of the platform for development, refactoring and analysis. Using a distributed system without platform control is painful in the long run. Telnet or SSH access to the cluster nodes, the privilege to install custom tools and configuration-level flexibility are a few items to verify in general.
3. Third-party integration flexibility – Most of the time, the system that we develop is part of a pipeline and may need integration with customized systems such as monitoring tools or custom logging methods, which makes the integration hooks critical.
4. Platform vendor’s willingness to provide functionality on demand – Platform vendors should be able to handle custom functionality requests on demand. We cannot wait indefinitely for the platform to support them in due course. Make sure that a quick and efficient response is covered in your Service Level Agreement (SLA).
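One way to make these considerations harder to overlook during platform selection is to record them as an explicit checklist. The sketch below is only an illustration: the field names are my own shorthand for the four items above, the data export field is an added assumption drawn from Case 2, and the example answers describe a made-up candidate platform rather than any real vendor.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class PaasChecklist:
    """One yes/no answer per item; field names are illustrative shorthand."""
    migration_plan_defined: bool          # consideration 1
    raw_data_export_available: bool       # consideration 1 (assumed, from Case 2)
    ssh_access_to_nodes: bool             # consideration 2
    can_install_custom_tools: bool        # consideration 2
    config_level_flexibility: bool        # consideration 2
    third_party_integration_hooks: bool   # consideration 3
    custom_requests_covered_by_sla: bool  # consideration 4


def unmet_items(checklist: PaasChecklist) -> List[str]:
    """Return the names of all checklist items answered 'no', in order."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]


# Made-up answers for a hypothetical candidate platform.
candidate = PaasChecklist(
    migration_plan_defined=True,
    raw_data_export_available=False,
    ssh_access_to_nodes=False,
    can_install_custom_tools=False,
    config_level_flexibility=True,
    third_party_integration_hooks=True,
    custom_requests_covered_by_sla=False,
)

for item in unmet_items(candidate):
    print("Open question for the vendor:", item)
```

Any item that comes back unmet is a question to settle with the vendor before committing, not after the alpha tests.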
Distributed Platform as a Service is definitely growing rapidly, and customers will continue to invest heavily for the combined advantages of reduced CapEx, reduced time to market and reduced maintenance and administrative complexity. But I hope the quality and competitiveness of PaaS Solutions also mature fast for the benefit of investing customers, like our IoT and Big Data customers at Sequoia AT. Let’s hope the day comes soon when platform vendors start advertising their platforms by throwing out an open challenge: “Hey, try out our PaaS solution and if you don’t like it, migrate to any other PaaS solution in 24 Hours or 1 Week!!!”
Link to article on LinkedIn