One of the best features of Pentaho Data Integration is the ability to create your own transformation step, also known as a “Pentaho Data Integration plug-in”.

For example, I want to collect Twitter messages containing specific keywords so I can analyze them afterwards. Pentaho itself does not have a standard plugin for this data input, so I created a plugin called “Twitter Search”, which retrieves all messages containing the specified search terms. This builds up a database of all matching messages since the start of the project.

Pentaho Data Integration Plugin for Twitter

With this data I now have an extra data source for my analytics environment, giving me additional insight into the chosen search terms.

The only real prerequisite for creating such a custom plugin is a Java API that can connect to the data source, retrieve the data, and transform it into a format Pentaho can use in its ETL flow.
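To make that requirement a bit more concrete, here is a minimal sketch of the idea. It is not the actual plugin code. `MessageSource`, `Message` and `MessageRows` are hypothetical names made up for illustration, and the sketch deliberately leaves out the Pentaho step classes themselves. It only shows the part described above: ask a Java API for messages per search term and flatten each result into a row of plain field values, which is roughly the shape a step would hand on to the rest of the transformation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Java API that connects to the data source.
// A real plugin would wrap an actual client library here.
interface MessageSource {
    List<Message> search(String searchTerm);
}

// Hypothetical message type: one message as returned by the source.
final class Message {
    final long id;
    final String author;
    final String text;

    Message(long id, String author, String text) {
        this.id = id;
        this.author = author;
        this.text = text;
    }
}

final class MessageRows {
    // Assumed field order for every row produced below.
    static final String[] FIELD_NAMES = {"id", "author", "text", "search_term"};

    // Queries the source once per search term and flattens every message
    // into one row, keeping track of which term matched it.
    static List<Object[]> toRows(MessageSource source, List<String> searchTerms) {
        List<Object[]> rows = new ArrayList<>();
        for (String term : searchTerms) {
            for (Message m : source.search(term)) {
                rows.add(new Object[] {m.id, m.author, m.text, term});
            }
        }
        return rows;
    }
}
```

Keeping the data source behind a small interface like this means the row-building logic can be tried out with a fake source before it is wired into a real step.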

Pentaho Data Integration is part of the Pentaho BI Suite, which contains everything you need for a business intelligence project: data integration, reporting, analysis and data mining. Check their site for more information.

Feel free to contact me if you want more information regarding this subject.

3 Comments

  1. This is a nice start Bram!

  2. Tellervo Warelius

    Good post! This is the kind of information that should be distributed on the online community. I would like to read more of this.

  3. A cool blog post there mate . Thanks for that .
