Talend - Data Integration Tout de Suite

Thursday, August 21st, 2008

If you’ve talked to me in the past year, chances are you’ve heard me mention Talend.

A year ago, Jeff and I were creating a proposal for a project that would combine different web services with several ETL jobs and business rules. The end result needed to be an application that could be maintained by the client with very little to no coding needed to make changes to the business rules and data. And, of course, the client wanted to keep the upfront costs down - no unnecessary software licenses, please!

Our research led us to Talend Open Studio. Free, open source, built on Eclipse, with a graphical interface, metadata-driven, and with connectors to every database imaginable…we were hooked! We quickly put together a proof of concept that took data from a SQL Server database, transformed it into XML, called a .NET web service application and logged the responses. Talend generated Java code that we then ran by calling a .bat file from Windows Scheduler. The development cost savings from using this code generation tool? It reduced the project cost by 50%.

We do a lot of work in the real estate sector, and aggregation is the movement of the moment, with syndication being a close second. MLSs, state associations, brokers, and third-party vendors are all dealing with not only ETL (Extract-Transform-Load) but workflow issues as well. We work to provide business integration solutions at points in the transaction for all of these different parties.

In keeping with our agile development methodology, our design goals are:

  • The solution should be adaptable to allow for changes in the direction of the application.
  • Each application process should be loosely coupled and composable.
  • The base framework, once implemented, should allow for the rapid development of additional services.
  • The solution should consist of standardized components that can be reused and combined to address the client’s changing goals and priorities.

Talend allowed us to fulfill these goals for a variety of projects over the past year. I don’t want to sound as if it is a golden hammer; it’s not. But it is a valuable tool in our development toolkit.

Let me give you a real-world example of data syndication using Talend to create a series of jobs that:

  1. Call an external RETS client program, passing it parameters, and retrieve listing data from an MLS in RETS COMPACT format.
  2. Load the RETS data as internal XML metadata and pass it into a Talend mapper object.
  3. Create three different output maps and build the corresponding output XML files - one for Zillow, one for Trulia, one for Yahoo.
  4. Transform the single RETS feed into three syndication formats.
  5. Call the built-in FTP client to put each XML file on the appropriate server and directory expected by the syndicators.

Talend generates stand-alone code in either Java or Perl, and you can run this job from any platform in the scheduler of your choice as often as you want.
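To make the shape of that job concrete, here is a minimal hand-written sketch in Python of steps 2 through 4. It is not the code Talend generates. The sample reply only follows the general layout of a RETS COMPACT response (a delimited COLUMNS row followed by DATA rows), and every field name, output tag and map below is a made-up placeholder, not the real Zillow, Trulia or Yahoo feed schema. Fetching the data and the FTP upload are left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the general shape of a RETS COMPACT reply.
# The listing fields are invented for illustration.
SAMPLE_COMPACT = (
    '<RETS ReplyCode="0" ReplyText="Success">'
    '<DELIMITER value="09"/>'
    "<COLUMNS>\tListingID\tListPrice\tCity\t</COLUMNS>"
    "<DATA>\tA100\t250000\tColumbus\t</DATA>"
    "<DATA>\tA101\t189900\tDublin\t</DATA>"
    "</RETS>"
)

# One output map per syndicator. Root, item and field tags are assumptions,
# NOT the real feed formats of these companies.
OUTPUT_MAPS = {
    "zillow": {"root": "Listings", "item": "Listing",
               "fields": {"ListingID": "id", "ListPrice": "price", "City": "city"}},
    "trulia": {"root": "properties", "item": "property",
               "fields": {"ListingID": "mls-id", "ListPrice": "list-price", "City": "town"}},
    "yahoo": {"root": "feed", "item": "entry",
              "fields": {"ListingID": "ref", "ListPrice": "amount", "City": "locality"}},
}


def parse_compact(reply_xml):
    """Turn the COLUMNS/DATA rows into a list of dicts keyed by column name."""
    root = ET.fromstring(reply_xml)
    # The delimiter is given as a hex character code, e.g. "09" for a tab.
    delimiter = chr(int(root.find("DELIMITER").get("value"), 16))
    # Each row starts and ends with the delimiter, so drop the empty edges.
    columns = root.find("COLUMNS").text.split(delimiter)[1:-1]
    rows = []
    for data in root.findall("DATA"):
        values = data.text.split(delimiter)[1:-1]
        rows.append(dict(zip(columns, values)))
    return rows


def build_feed(rows, output_map):
    """Apply one output map to the shared rows and return an XML string."""
    root = ET.Element(output_map["root"])
    for row in rows:
        item = ET.SubElement(root, output_map["item"])
        for source_field, target_tag in output_map["fields"].items():
            ET.SubElement(item, target_tag).text = row.get(source_field, "")
    return ET.tostring(root, encoding="unicode")


rows = parse_compact(SAMPLE_COMPACT)
feeds = {name: build_feed(rows, output_map) for name, output_map in OUTPUT_MAPS.items()}
```

The point of the design is that the feed is parsed once and each syndicator is just another entry in the map table, which is roughly what the three output maps in the Talend mapper gave us.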

I’m excited about tools like this that make life easier. I believe in open source and open standards. I want to work with companies that are building those kinds of tools with that kind of philosophy. And that’s why we recently became Talend partners. I’m looking forward to seeing what this innovative French company does next, and being a part of things.

Good Stuff at No Fluff

Wednesday, August 20th, 2008

No Fluff Just Stuff

Summer in Ohio brings with it one of my favorite traditions - the annual Central Ohio No Fluff Just Stuff Software Symposium.

Started in 2002, the No Fluff seminars are three solid days of focused technical discussions on the new developments in the Agile/Java space. They keep the conference size small and the pace intense - no snoozing in the back of the room here! Being the teacher’s pet that I am, I enjoy sitting up front and participating.

This is our third year attending No Fluff, and I followed my usual method for selecting which sessions to attend: namely, on the first day I find a speaker who is engaging and totally enthusiastic about his subject, and I attend as many of his sessions as possible. All of the presenters at No Fluff are good, but at each conference I look for a special speaker who is really passionate about a technology and makes it come alive. This year that person was Scott Davis.

Feelin’ Groovy

Serendipitously, Scott’s first sessions were about topics near to my heart: Groovy and Grails. We started using Groovy and Grails on projects this spring and they lived up to the hype - it was possible to deliver a project in one-third of the time using Groovy on top of our legacy Java libraries. For me the key is the flexibility of a dynamic language while not losing all the Java goodness underneath.

Scott spent 90 minutes delving into some of the cooler parts of Groovy that I had only skimmed the surface of: metaprogramming, which includes operator overloading and ExpandoMetaClass. Basically, ExpandoMetaClass allows me to dynamically add methods, constructors and properties. This means that if I’m using an API that I don’t have the source to, but I need to add or change a method…I can do this via Groovy! This is really cool - but, according to both Scott and Jeff, definitely dangerous in the hands of evil or careless programmers.

Operator overloading maps operators to method calls on Java and Groovy objects. I had basically just been working with the left shift (<<) for Lists, but Scott took us through examples from the rest of the list, including how to add overloading to our own classes.
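As a rough illustration of the idea, here is the same trick in Python rather than Groovy, so the method names differ: Groovy routes << to a method named leftShift and + to one named plus, while Python routes them to __lshift__ and __add__. The ListingBatch class is hypothetical and only exists to show an operator landing on a method of our own class.

```python
class ListingBatch:
    """A tiny container that supports << to append and + to merge."""

    def __init__(self, listings=None):
        self.listings = list(listings or [])

    def __lshift__(self, listing):
        # batch << "A100" appends and returns the batch so calls can chain.
        self.listings.append(listing)
        return self

    def __add__(self, other):
        # batch_a + batch_b builds a new batch and leaves both inputs alone.
        return ListingBatch(self.listings + other.listings)


morning = ListingBatch()
morning << "A100" << "A101"

afternoon = ListingBatch(["B200"])
combined = morning + afternoon
```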

Besides providing me with additional Groovy tips and tricks, Scott had other sessions that had direct relevance to many of our clients in Real Estate.

Mappin’ It

Integrating maps - or building a bigger, better GIS - is on every client’s wish list. So is saving money. These are not mutually exclusive, if you know your way around free data and open source GIS. Scott’s discussion was not just about URLs for free data. He actually tackled moving between different projections, overlaying data, integrating free viewers, and optimizing queries in spatial databases.

Back in my banking days I did a lot of work with some very expensive desktop mapping packages: Atlas, Tactician and MapInfo. At Realty One, we built a sophisticated mapping system using MapInfo and an Oracle Spatial database. We had a really great CMA that had the option to filter on different radii around properties. All of this was back in 2000, and it still required very pricey software. Fast forward to 2008 and suddenly it appears there’s finally an open source spatial database alternative: PostgreSQL+PostGIS.
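For anyone who has not built one, the radius filter behind a CMA like that boils down to a distance test. Here is a minimal Python sketch using the haversine great-circle formula. The coordinates and listing records are made up, and a spatial database would normally run this kind of filter inside the query, with an index, rather than in application code.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def comparables_within(subject, listings, radius_miles):
    """Return the listings that fall inside the radius around the subject."""
    return [
        listing for listing in listings
        if haversine_miles(subject["lat"], subject["lon"],
                           listing["lat"], listing["lon"]) <= radius_miles
    ]


# Hypothetical subject property and candidate comparables.
subject = {"id": "subject", "lat": 39.9612, "lon": -82.9988}
listings = [
    {"id": "near", "lat": 39.9712, "lon": -82.9988},  # under a mile north
    {"id": "mid", "lat": 40.0612, "lon": -82.9988},   # about seven miles north
    {"id": "far", "lat": 41.4993, "lon": -81.6944},   # a different city
]

one_mile = [listing["id"] for listing in comparables_within(subject, listings, 1)]
ten_miles = [listing["id"] for listing in comparables_within(subject, listings, 10)]
```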

And then, to make everything even better, Scott tied PostGIS together with some web service standards from the OGC (Open Geospatial Consortium), the organization that develops and maintains standards for geospatial and location-based web services.

Working with related standards in the real estate industry, I had the opportunity to meet with representatives from the OGC at the 2007 Spring MISMO meeting. They were really interested in working together with the various real estate standards to share their geospatial standards and also their experience in managing the development of standards. (At one point someone from OGC had read the RETS2 service document and was very complimentary to me regarding its composition, vision, and use of existing standards - pretty high praise.)

To recap: 90 minutes spent in the GIS session yielded a wealth of notes and ways to save many clients lots of money. This alone paid for the price of the conference.

Have you met JSON?

“Real World JSON” turned out to be an eye-opener. JSON (JavaScript Object Notation) is such a simple concept you wonder how anyone can talk about it for 90 minutes, right? The beauty is in the coding. We walked through remoting JSON, building JSON in Groovy, converting XML to JSON and vice versa, JSON + XSS, and a few really great libraries (like YUI) that make using JSON as ridiculously simple as explaining JSON.

There are so many utilities to build and consume JSON that the barrier to entry is really low. For much of what our clients do to exchange bulk real estate data in key/value pairs, JSON makes a lot of sense.

One of the demonstrations included the benefits gained by converting a fairly sophisticated XML document into JSON. Given the JSON utilities available in nearly every programming language to convert XML to JSON, the thought crossed my mind that representing RETS’ XML documents as JSON is not a big leap.
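To show how small that leap is, here is a minimal Python sketch of an XML-to-JSON conversion using only the standard library. The sample document is a made-up listing fragment, not an actual RETS schema, and the converter is deliberately naive: it ignores attributes and mixed content, and it turns repeated child elements into a JSON array.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical listing XML, invented for this example.
SAMPLE_XML = """
<Listings>
  <Listing><ListingID>A100</ListingID><ListPrice>250000</ListPrice></Listing>
  <Listing><ListingID>A101</ListingID><ListPrice>189900</ListPrice></Listing>
</Listings>
"""


def element_to_dict(element):
    """Convert an element to nested dicts, lists and strings."""
    children = list(element)
    if not children:
        # Leaf element: keep its text as a string.
        return (element.text or "").strip()
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # Same tag seen again: collect the values in a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


root = ET.fromstring(SAMPLE_XML)
document = {root.tag: element_to_dict(root)}
as_json = json.dumps(document, indent=2)
```

One catch with this naive approach: a document with a single Listing would produce an object instead of a one-element array, so a real converter needs a rule for which tags are always lists.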

In summary, this year’s No Fluff Just Stuff was Groovy. I learned some tips for putting the “where” into my clients’ applications without taking all the green out of their wallets. I have a notebook full of ideas. And I’m excited and re-energized about work. If you have a No Fluff coming to a city near you, check it out - I’d love to hear what you discover there!