Predictive Analytics and Data Mining: Why Most Projects Fail and What Really Works

Predictive Analytics and Data Mining (PADM) has been a slowly but steadily growing practice within the broader field of Business Intelligence (BI). The “machine learning” technologies that drive PADM were developed many decades ago. The last two decades have brought enough desktop computing power and software automation to make PADM a practical and highly valuable BI function in every medium and large business.

There are two significant opposing forces influencing the pace of broad PADM adoption. Fortunately, the forces for progress prevail and are gaining momentum. This momentum will continue to grow for two reasons. First, the majority of organizations are steadily proceeding down the Business Intelligence development path, from data acquisition to storage, structure, quality, retrieval, exploration, visualization, dashboarding, and retrospective analysis. Second, an increasing flow of powerful case studies and industry reports reveals the significant rewards enjoyed by those who properly approach the practice.

Why Most PADM Projects Fail

In a business environment, practitioners are rewarded for business impact, not for technical accuracy. Unfortunately, many practitioners choose to focus on building technically accurate models. As a result, the models answer the wrong questions. The results are neither interpretable nor adapted to the operational environment. They are not reasonably implementable, are not understood or appreciated by leadership, and don’t meet the organizational metrics that define success or failure.

It’s no wonder, then, that a corresponding number of projects (again over 50% when adding the first three rows in the illustration below) did not reach completion, could not calculate ROI, or didn’t achieve positive ROI.
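
The gap between technical accuracy and business impact can be made concrete with a small sketch. The Python below compares two hypothetical response models for a campaign of 1,000 customers. The confusion-matrix counts, the value of 100 per responder, and the cost of 5 per contact are all invented for illustration and do not come from any study cited here. The point is only that the more accurate model is not necessarily the more profitable one.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Outcome counts for a yes/no model (all figures here are hypothetical)."""
    tp: int  # flagged by the model, and would respond
    fp: int  # flagged by the model, but would not respond
    fn: int  # not flagged, but would have responded
    tn: int  # not flagged, and would not respond

    @property
    def accuracy(self) -> float:
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total


def net_value(c: ConfusionCounts, value_per_responder: float,
              cost_per_contact: float) -> float:
    """Business value: every flagged customer is contacted (a cost),
    but only true responders generate revenue."""
    contacted = c.tp + c.fp
    return c.tp * value_per_responder - contacted * cost_per_contact


# Assumed scenario: 1,000 customers, 50 of whom would respond.
model_a = ConfusionCounts(tp=5, fp=5, fn=45, tn=945)     # cautious model
model_b = ConfusionCounts(tp=40, fp=160, fn=10, tn=790)  # aggressive model

for name, m in [("A", model_a), ("B", model_b)]:
    value = net_value(m, value_per_responder=100, cost_per_contact=5)
    print(f"Model {name}: accuracy {m.accuracy:.2f}, net value {value:.0f}")
# Model A: accuracy 0.95, net value 450
# Model B: accuracy 0.83, net value 3000
```

Under these assumed numbers, Model A wins on accuracy while Model B delivers several times the net value. A project judged only on accuracy would ship the wrong model.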

What Really Works in Predictive Analytics

The vast majority of BI professionals do not realize that there are industry standard processes for PADM available in the public domain. Two of the most popular are SEMMA (Sample, Explore, Modify, Model, and Assess, by SAS Institute) and CRISP-DM (CRoss-Industry Standard Process for Data Mining). CRISP-DM is vendor-neutral and more broadly adopted.

CRISP-DM

CRISP-DM applies emphasis evenly across project-level strategic implementation and tactical model development through six phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment. The first two phases, Business Understanding and Data Understanding, concentrate on conducting a comprehensive assessment of the overall environment, situation, team members, resources, and objectives. Any organization interested in building more than an independent or pilot model should not overlook the importance of implementing every stage within these first two critical phases.

Allow me to distill the experiences and recommendations imparted so far into a low-risk/high-impact approach to PADM.

  1. Start with Training: Whether you are a practitioner or a leader, training is essential for appreciating the unique nuances of this particular BI practice. Even those planning to outsource the effort will interact more effectively with their consulting team after understanding PADM’s risks, rewards, capabilities, limitations, and standard process.
  2. Conduct a Data Mining Project Assessment: While most organizations have the capability to own, operate, and maintain a PADM practice internally with existing staff, it is recommended to have a seasoned data mining expert play the role of assessor and architect of the project design.
  3. Implement Internally with External Mentorship: There is no reason that your existing business practitioners cannot ultimately build, operate, and maintain a PADM practice. However, they will benefit from the guidance and oversight of a seasoned PADM project leader and architect.

You Can Do It, Own It, and Reap Residual Rewards

If you are in a medium or large business that has not made PADM a highly profitable part of your BI process, then the valuable prospective insights left hidden within your glut of data are no different than large denominations of money left hanging from low branches. It won’t be long until PADM is a standard function within your BI practice. Why delay the rewards? Like any other BI practice, PADM is not a one-time implementation, but an ongoing process. As such, it would be highly inefficient to perpetually outsource the function. Not only is the talent involved far more expensive than nearly any other IT role, but once the process is designed and established, it’s straightforward for general business practitioners to maintain the end-to-end process.


Excerpted from Predictive Analytics and Data Mining: Why Most Projects Fail and What Really Works
