Talend - Data Integration Tout de Suite
Thursday, August 21st, 2008
If you’ve talked to me in the past year, chances are you’ve heard me mention Talend.
A year ago, Jeff and I were creating a proposal for a project that would combine different web services with several ETL jobs and business rules. The end result needed to be an application that could be maintained by the client with very little to no coding needed to make changes to the business rules and data. And, of course, the client wanted to keep the upfront costs down - no unnecessary software licenses, please!
Our research led us to Talend Open Studio. Free, open source, built on Eclipse, metadata-driven, with a graphical interface and connectors to every database imaginable…we were hooked! We quickly put together a proof of concept that took data from a SQL Server database, transformed it into XML, called a .NET web service application and logged the responses. Talend generated Java code that we then ran by calling a .bat file from Windows Scheduler. The development cost savings from using this code generation tool? It reduced the project cost by 50%.
We do a lot of work in the real estate sector, and aggregation is the movement of the moment, with syndication being a close second. MLSs, state associations, brokers, and third-party vendors are all dealing with not only ETL (Extract, Transform, Load) but workflow issues as well. We work to provide business integration solutions at points in the transaction for all of these different parties.
In keeping with our agile development methodology, our design goals are:
- The solution should be adaptable to allow for changes in the direction of the application.
- Each application process should be loosely coupled and composable.
- The base framework, once implemented, should allow for the rapid development of additional services.
- The solution should consist of standardized components that can be reused and combined to address the client’s changing goals and priorities.
Talend allowed us to fulfill these goals for a variety of projects over the past year. I don’t want to sound as if it is a golden hammer; it’s not. But it is a valuable tool in our development toolkit.
Let me give you a real-world example of data syndication using Talend to create a series of jobs that:
- Call an external RETS client program, passing it parameters, and retrieve listing data from an MLS in RETS COMPACT format.
- Load the RETS data as internal XML metadata and pass it into a Talend mapper object.
- Create three different output maps and build the corresponding output XML files - one for Zillow, one for Trulia, one for Yahoo.
- Transform the single RETS feed into three syndication formats.
- Call the built-in FTP client to put each XML file on the appropriate server and directory expected by the syndicators.
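For readers who want to see the shape of that flow outside of Talend, here is a rough Python sketch of the middle steps: parse a RETS COMPACT-style response, then apply three output maps to produce three XML feeds. This is not the code Talend generates, and it is not our production job. The sample payload is simplified, and the field names (ListingID, ListPrice, City), the output element names, and the functions parse_compact and build_feed are all hypothetical; the real syndicator schemas are different and much larger.

```python
import xml.etree.ElementTree as ET

# Simplified RETS COMPACT-style payload. The column names are made up
# for illustration; a real MLS feed carries many more fields.
SAMPLE_RETS = (
    '<RETS ReplyCode="0" ReplyText="Success">'
    '<DELIMITER value="09"/>'
    '<COLUMNS>\tListingID\tListPrice\tCity\t</COLUMNS>'
    '<DATA>\t1001\t250000\tAustin\t</DATA>'
    '<DATA>\t1002\t410000\tDallas\t</DATA>'
    '</RETS>'
)

# Hypothetical output maps: one per syndicator. Each map renames the
# source columns to the element names that target is assumed to expect.
OUTPUT_MAPS = {
    "zillow": {"root": "Listings", "item": "Listing",
               "fields": {"ListingID": "Id", "ListPrice": "Price", "City": "City"}},
    "trulia": {"root": "properties", "item": "property",
               "fields": {"ListingID": "id", "ListPrice": "price", "City": "city"}},
    "yahoo": {"root": "Feed", "item": "Item",
              "fields": {"ListingID": "Key", "ListPrice": "Amount", "City": "Locality"}},
}


def parse_compact(rets_xml):
    """Turn a COMPACT-style response into a list of {column: value} dicts."""
    root = ET.fromstring(rets_xml)
    delim_el = root.find("DELIMITER")
    # The delimiter is given as a hex character code ("09" is a tab).
    delimiter = chr(int(delim_el.get("value"), 16)) if delim_el is not None else "\t"
    # COLUMNS and DATA rows start and end with the delimiter, so drop
    # the empty first and last pieces after splitting.
    columns = root.findtext("COLUMNS", default="").split(delimiter)[1:-1]
    rows = []
    for data in root.findall("DATA"):
        values = (data.text or "").split(delimiter)[1:-1]
        rows.append(dict(zip(columns, values)))
    return rows


def build_feed(rows, output_map):
    """Apply one output map to the parsed rows and return an XML string."""
    root = ET.Element(output_map["root"])
    for row in rows:
        item = ET.SubElement(root, output_map["item"])
        for source, target in output_map["fields"].items():
            ET.SubElement(item, target).text = row.get(source, "")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    listings = parse_compact(SAMPLE_RETS)
    for name, output_map in OUTPUT_MAPS.items():
        print(name, build_feed(listings, output_map))
```

The point of the sketch is the one-to-many mapping: the feed is parsed once, and each syndicator is just another entry in the map table. The RETS client call and the FTP upload are left out; in a script like this, the upload could be handled with Python's standard ftplib module.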
Talend generates stand-alone code in either Java or Perl, so you can run this job on any platform, in the scheduler of your choice, as often as you want.
I’m excited about tools like this that make life easier. I believe in open source and open standards. I want to work with companies that are building those kinds of tools with that kind of philosophy. And that’s why we recently became Talend partners. I’m looking forward to seeing what this innovative French company does next, and being a part of things.