Containers Expo Blog: Article

Data Mining and Data Virtualization

Extending Data Virtualization Platforms

Data Mining helps organizations to discover new insights from existing data, so that predictive techniques can be applied towards various business needs. The following are the typical characteristics of data mining.

  • Extends Business Intelligence, beyond Query, Reporting and OLAP (Online Analytical Processing)
  • Data Mining is a cornerstone for assessing customer risk, market segmentation and prediction
  • Data Mining is about performing computationally complex analysis techniques on very large volumes of data
  • It combines the analysis of historical data with modeling techniques to make future predictions; it turns operations into performance

The following are the use cases that can benefit from the application of data mining:

  • Manufacturing / Product Development: Turning defect and customer complaint data into a model that provides insight into customer satisfaction and helps enterprises build better products.
  • Consumer Payments: Understanding the payment patterns of consumers to predict market penetration and set discount guidelines.
  • Consumer Industry: Customer segmentation to understand the customer base and support targeted advertisements and promotions.
  • Consumer Industry: Campaign effectiveness can be gauged with customer segmentation coupled with predictive marketing models.
  • Retail Industry: Supply chain efficiencies can be gained by mining supply and demand data.

‘In Database’ Data Mining
Data Mining is typically a multi-step process.

  1. Define the Business Issue to Be Addressed, e.g., Customer Attrition, Fraud Detection, Cross Selling.
  2. Identify the Data Model / Define the Data / Source the Data (Data Sources, Data Types, Data Usage, etc.)
  3. Choose the Mining Technique (Discovery Data Mining, Predictive Data Mining, Clustering, Link Analysis, Classification, Value Prediction)
  4. Interpret the Results (Visualization Techniques)
  5. Deploy the Results (CRM Systems)

Initially, Data Mining was implemented with a combination of multiple tools and systems, which resulted in latency and a long cycle before results were realized.

Sensing this issue, major RDBMS vendors have implemented Data Mining as part of their core database offering. This offering has the following key features:

  • Data Mining engine resides inside the traditional database environment facilitating easier licensing and packaging options
  • Eliminates data extraction and data movement, avoiding a costly ETL process
  • Major Data Mining models are available as pre-built SQL functions which can be easily integrated into the existing database development process.

The following describes the data mining features of two popular databases:

Built as DB2 data mining functions, the Modeling and Scoring services directly integrate data mining technology into DB2. This leads to faster application performance. Developers want integration and performance, as well as any facility to make their job easier. The model can be used within any SQL statement. This means the scoring function can be invoked with ease from any application that is SQL aware, either in batch, real time, or as a trigger.

Oracle Data Mining, a component of the Oracle Advanced Analytics Option, delivers a wide range of cutting edge machine learning algorithms inside the Oracle Database. Since Oracle Data Mining functions reside natively in the Oracle Database kernel, they deliver unparalleled performance, scalability and security. Because the data and the data mining functions never leave the database, the result is a comprehensive in-database processing solution.
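The idea common to both descriptions, a model exposed as a function that any SQL statement can call, can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in database, not the DB2 or Oracle interfaces, and the `churn_score` function, its coefficients and the `customers` table are all hypothetical.

```python
import math
import sqlite3

# Hypothetical coefficients of an already-trained logistic model (assumed values).
INTERCEPT = -2.0
W_COMPLAINTS = 0.8
W_TENURE = -0.05


def churn_score(complaints, tenure_months):
    """Return a churn probability between 0 and 1 for one customer row."""
    z = INTERCEPT + W_COMPLAINTS * complaints + W_TENURE * tenure_months
    return 1.0 / (1.0 + math.exp(-z))


conn = sqlite3.connect(":memory:")
# Register the scoring function so that SQL statements can call it by name.
conn.create_function("churn_score", 2, churn_score)

conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, complaints INTEGER, tenure_months INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 0, 48), (2, 5, 3), (3, 2, 12)],
)

# Scoring happens inside the query; no rows are exported to a separate tool.
rows = conn.execute(
    "SELECT id, churn_score(complaints, tenure_months) AS score "
    "FROM customers ORDER BY score DESC"
).fetchall()

# The same function can also be used as a filter in a WHERE clause.
at_risk = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM customers "
        "WHERE churn_score(complaints, tenure_months) > 0.5 ORDER BY id"
    )
]
```

Here the scoring function is plain Python registered with the database connection; in the commercial products described above the model runs inside the database engine itself, which is where the performance benefit comes from.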

Data Virtualization: Data Virtualization is a newer concept that allows enterprises to access the information contained in disparate data sources in a seamless way. As mentioned in my earlier articles, vendors such as Composite Software, Denodo Technologies, IBM, Informatica and Microsoft have developed specialized data virtualization engines. An earlier article of mine compares Data Virtualization using middleware versus an RDBMS.

Data virtualization solutions provide a virtualized data services layer that integrates data from heterogeneous data sources and content in real time, near-real time, or batch as needed to support a wide range of applications and processes. The Forrester Wave: Data Virtualization, Q1 2012 puts data virtualization in the following perspective: "In the past 24 months, we have seen a significant increase in adoption in the healthcare, insurance, retail, manufacturing, eCommerce, and media/entertainment sectors. Regardless of industry, all firms can benefit from data virtualization."

Data Mining Inside Data Virtualization Platforms?
The increase in data sources, especially integration with Big Data and unstructured data, has made the Data Virtualization platform an important part of the enterprise data access strategy. Data virtualization provides the following attributes for efficient data access across the enterprise.

  • Abstraction: Provides access to data that is independent of location, API, language and storage technology
  • Federation: Converges data from multiple disparate data sources
  • Transformation: Enriches the quality and quantity of data as needed
  • On-Demand Delivery: Provides the consuming applications the required information on-demand
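The four attributes above can be illustrated with a small sketch. This is not the API of any data virtualization product: the two sources (an in-memory SQLite table and a CSV export), the reader functions and the `customer_spend_view` name are hypothetical stand-ins chosen so the example is self-contained.

```python
import csv
import io
import sqlite3

# Source 1: a relational store (in-memory SQLite used as a stand-in).
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
crm.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Asha"), (2, "Ben")])

# Source 2: a flat-file export (CSV text used as a stand-in).
ORDERS_CSV = "customer_id,amount_cents\n1,1250\n1,4000\n2,999\n"


def read_customers():
    """Abstraction: callers receive plain dicts and never see SQL."""
    query = "SELECT customer_id, name FROM customers ORDER BY customer_id"
    for customer_id, name in crm.execute(query):
        yield {"customer_id": customer_id, "name": name}


def read_orders():
    """Abstraction: callers receive plain dicts and never see CSV parsing."""
    for row in csv.DictReader(io.StringIO(ORDERS_CSV)):
        yield {
            "customer_id": int(row["customer_id"]),
            "amount_cents": int(row["amount_cents"]),
        }


def customer_spend_view():
    """Federation and transformation: join both sources into one virtual view.

    On-demand delivery: this is a generator, so nothing is read from either
    source until a consumer iterates over the view.
    """
    totals = {}
    for order in read_orders():
        key = order["customer_id"]
        totals[key] = totals.get(key, 0) + order["amount_cents"]
    for customer in read_customers():
        cents = totals.get(customer["customer_id"], 0)
        # Transformation: convert cents to currency units for consumers.
        yield {
            "customer_id": customer["customer_id"],
            "name": customer["name"],
            "total_spend": cents / 100,
        }


view = list(customer_spend_view())
```

The consuming application only sees `customer_spend_view`; whether the rows came from a database, a file or some other store is hidden behind the reader functions.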

With the above benefits of the Data Virtualization Platform in mind, it is evident that enterprises will find it more useful if Data Virtualization platforms are built with Data Mining Models and Algorithms, so that effective Data Mining can be performed on top of Data Virtualization platform.

Because an important part of Data Mining is identifying the correct data sources and the associated events of interest, Data Mining is more effective when disparate data sources are brought under the scope of a Data Virtualization Platform than when it is confined to a single database engine.
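As a rough sketch of this idea, the example below runs a simple clustering step (plain k-means, chosen here only for illustration) over features that are federated from two sources at read time. The sources, field names and values are all hypothetical, and the features are left unscaled to keep the example short; a real segmentation would normally normalize them first.

```python
# Two hypothetical sources keyed by customer id (assumed data).
crm_source = {
    1: {"tenure_months": 48},
    2: {"tenure_months": 3},
    3: {"tenure_months": 40},
    4: {"tenure_months": 6},
}
billing_source = {
    1: {"monthly_spend": 90.0},
    2: {"monthly_spend": 20.0},
    3: {"monthly_spend": 80.0},
    4: {"monthly_spend": 25.0},
}


def federated_features():
    """Yield (customer_id, feature_vector) built from both sources."""
    for customer_id in sorted(crm_source):
        if customer_id in billing_source:
            tenure = float(crm_source[customer_id]["tenure_months"])
            spend = billing_source[customer_id]["monthly_spend"]
            yield customer_id, (tenure, spend)


def kmeans(points, k, iterations=10):
    """Plain k-means with a deterministic start (the first k points)."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


ids, points = zip(*federated_features())
labels, centroids = kmeans(list(points), k=2)
segments = dict(zip(ids, labels))
```

The mining step never needs to know where tenure or spend are stored; it consumes the federated view, which is the arrangement the paragraph above argues for.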

The following extended view of Data Virtualization Platform signifies how Data Mining can be part of Data Virtualization Platform.

Summary
Data Virtualization is becoming part of the mainstream enterprise data access strategy, mainly because it abstracts multiple data sources, avoids complex ETL processing, and facilitates a single version of the truth, data quality and a zero-latency enterprise.

If value-added capabilities such as a Data Mining engine can be built on top of the existing Data Virtualization platform, enterprises will benefit further.

More Stories By Srinivasan Sundara Rajan

Highly passionate about utilizing Digital Technologies to enable next generation enterprise. Believes in enterprise transformation through the Natives (Cloud Native & Mobile Native).
