Big Data is Big again

I anticipate we will hear the phrase “Big Data” more in the near future. It stands for massive volumes of data used to create big change. A recent report by McKinsey discussed this in some detail.

They conclude that the computer and electronic products and information sectors are poised to gain substantially from the use of Big Data. Finance, insurance and government are also positioned to benefit strongly, as long as the decision makers develop a data-driven mindset. The biggest barriers will remain around privacy and security. Legal issues will need to be addressed before data can be copied and combined with other data sources. The question of ‘fair use’ will also need to be addressed.

The report points out that the sectors that achieved a leap in productivity around the turn of the century shared three broad characteristics in their approach to IT. First, they tailored their IT investment to sector-specific business processes and linked it to key performance levers. Second, they deployed IT sequentially, building capabilities over time. Third, IT investment evolved simultaneously with managerial and technical innovation.

To give an example of the value that can be realized: in the retail sector alone, marketing levers enabled by Big Data can affect 10 to 30 percent of operating margin; merchandising levers can affect 10 to 40 percent; and supply chain levers can have a 5 to 35 percent impact.

Of note, and this is going to be critical to policy makers: the pool of graduates with capability for deep analytical thinking is strongest in the five countries listed below. I am sure there is more to success in the knowledge economy than having the biggest pool of talent, but these countries seem well positioned to dominate the space in the foreseeable future.

  1. USA
  2. China
  3. India
  4. Russia
  5. Brazil

The full report can be downloaded from here.

In my opinion there are a few enablers that are key to realizing the potential of Big Data.

  1. Political and managerial will: trust and understanding of insights from Big Data for use in driving change
  2. Insights visualization: delivering insights from Big Data truthfully and meaningfully for building organizational will for change
  3. Data fusion: Big Data stands to grow exponentially with integration of image, audio and social data.
  4. Open source technologies for data storage and processing: I feel that the high cost of technology is a barrier. Mass adoption of open source technologies such as R, Cassandra and Hadoop will build a technological workforce capable of delivering on the promise of Big Data.

 

Bump and grind with Fedex

A few weeks ago we shipped a package to California by Fedex. Nothing out of the ordinary, except we put a BlackBox sensor tag inside the package. The package was supposed to arrive on the Friday, but for whatever reason it got delayed. From Indianapolis it was sent to Memphis and then finally routed to Fountain Valley, California. Here’s what Fedex told us about the package delay.

Figure 1: Screen shot of Fedex package tracker

That would have been it, except we downloaded the data on the tags to see what happened to the package en route. The results were an eye opener.

Each impact over 8 G received by the package was logged with a timestamp by the BlackBox sensor. We mapped these events back to the Fedex tracker, and this is what we found.


Figure 2: Impact events recorded on package shipped from Toronto to California

Here’s what one of the dots in the chart above represents: the time series of the impact recorded on three axes. Note that this event probably represents the package dropping from one conveyor belt to another and then being given a little sideways bump by a moving arm. Pretty cool, eh?

Figure 3: Impact detail on x-y-z axes for Event #38 recorded on sensor
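For readers who want to try the same kind of mapping, here is a rough sketch in Python of how impacts could be lined up against tracker scans. It is not the BlackBox software or our actual analysis. The function names, the scan labels and every number in the sample data are made up for illustration, and it assumes the 8 G threshold applies to the combined magnitude of the three axes, which may not be how the tag itself decides what to log.

```python
from bisect import bisect_right
from datetime import datetime
from math import sqrt

THRESHOLD_G = 8.0  # threshold from the post; applying it to the resultant is an assumption


def magnitude(x, y, z):
    """Resultant acceleration (in G) from the three axis readings."""
    return sqrt(x * x + y * y + z * z)


def assign_to_scans(impacts, scans):
    """Attach each impact above the threshold to the latest scan at or before it.

    impacts: list of (timestamp, x, y, z) tuples
    scans:   list of (timestamp, label) tuples, sorted by timestamp
    Returns a dict mapping scan label -> list of impact magnitudes.
    """
    scan_times = [t for t, _ in scans]
    by_scan = {label: [] for _, label in scans}
    for t, x, y, z in impacts:
        g = magnitude(x, y, z)
        if g <= THRESHOLD_G:
            continue  # below the logging threshold
        i = bisect_right(scan_times, t) - 1
        if i < 0:
            continue  # impact happened before the first tracker scan
        by_scan[scans[i][1]].append(round(g, 1))
    return by_scan


# Hypothetical sample data, NOT the readings from our package.
scans = [
    (datetime(2011, 1, 3, 8, 0), "Picked up"),
    (datetime(2011, 1, 3, 20, 0), "Arrived at hub"),
    (datetime(2011, 1, 4, 6, 0), "Out for delivery"),
]
impacts = [
    (datetime(2011, 1, 3, 7, 0), 0.0, 10.0, 0.0),   # before the first scan, skipped
    (datetime(2011, 1, 3, 9, 30), 6.0, 6.0, 3.0),   # 9.0 G after pickup
    (datetime(2011, 1, 3, 21, 15), 0.0, 0.0, 12.0), # 12.0 G at the hub
    (datetime(2011, 1, 3, 22, 0), 3.0, 4.0, 0.0),   # 5.0 G, below threshold
]

result = assign_to_scans(impacts, scans)
for label, hits in result.items():
    print(label, len(hits), hits)
```

Counting the impacts between consecutive scans is what lets you say which leg of the journey was roughest, and the gaps between scan times give the sorting and routing durations.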

It was fascinating to map out what happened at every point of transfer, and the amount of time it took for the sorting and routing to take place. When we looked up the sorting process at Fedex’s Memphis center, it was quite amazing to see what we had deduced. Fascinating stuff. Take a look at the video.

Of course the BlackBox is used to monitor product abuse, but this experiment suggests another application: monitoring the treatment of goods shipped on a particular channel by a specific carrier. It would be interesting to compare the performance of UPS vs Fedex vs DHL vs the postal service, to see who subjects packages to the most abuse.

Insurance fraud and the election platform

Today Ontarians go to the polls to elect a new provincial government. This election was interesting (to us) for the attention the candidates paid to auto insurance fraud in the lead-up. The present government also devoted some space to the subject in the 2011 budget.

The reforms also directly targeted abuse and fraud in the auto insurance system, which increase costs and lead to higher premiums. The government will build on these reforms by taking further immediate steps to reduce fraud. These include:

  • working with the industry to use the newly established Health Claims for Auto Insurance (HCAI) database to detect potentially fraudulent activity. Use of the HCAI database by Ontario health care facilities or providers to transmit auto insurance claim forms to insurers was made mandatory on February 1, 2011; 
  • introducing new rules to ensure that treatments are provided as invoiced;
  • establishing an auto insurance anti-fraud taskforce to determine the scope of auto insurance fraud in Ontario and make recommendations regarding detection, investigation, enforcement and consumer education. The government is committed to fully investigating the problem of auto insurance fraud and will establish appropriate working groups of stakeholders to develop collaborative approaches and solutions;

I think this is an important initiative and ClaimsGator can play a critical role in it. I am hopeful that the proposed reforms continue under the newly elected government.

A Hat Tip to Autonomy

To be candid, I had never heard of Autonomy before the recent acquisition by HP. The more I hear of it, the more I am in awe of the founders and their vision, quoted below.

“Autonomy was founded upon a vision to dramatically change the way in which we interact with information and computers, ensuring that computers map to our world, rather than the other way round.”

They are correct in identifying that the value locked in unstructured text, voice, images and video far exceeds the value tapped through the formatted content in databases.

The analysts are still jousting over the merit of the deal. In my opinion, the company vision translated to a $10B acquisition and the founder’s stake is worth close to $900M today. That’s a win imho and deserves a hat tip.

It will be interesting to see how the story plays out for OpenText, which plays in the same space as Autonomy.

As a footnote, another company of note in this space is Idee Inc, for its image search engine called Tineye. I am not sure so far if there are any business applications for this product for any of our clients, but it’s definitely something I keep an eye on.