Kayak.com is the ultimate travel comparison site – users can find the cheapest flight between two points, compare hotel rates, find car rentals, and even buy complete packages for their getaways. It’s all made possible thanks to this firm’s grip on big data.
Despite the name, Stamford, Ct.-based Kayak.com has nothing to do with kayaking and it’s not a travel company. It describes itself as a technology company and it’s in the business of giving its users the best possible tools to manage data around their travel needs. Recently, the web service has evolved beyond comparisons of different travel options and forged into the area of predictive analytics.
When’s the best time to book that flight to save the most money? Where can you travel this holiday season to get the best rates on hotel rooms? Thanks to its mastery of travel-related information and a growing history of user data, Kayak.com can help you answer those questions. It even turns its data into useful content on its sites, writing articles advising its users about where they might want to look for a deal.
To get to the bottom of how Kayak.com’s technology machine works and how it’s able to use that data for great marketing impact, we did a Google Hangouts on Air with Giorgos Zacharia, the chief technology officer at Kayak.com.
You can watch the conversation by playing the video above. Or read my edited transcript of the conversation below:
Brian Jackson: Can you begin by telling us briefly what Kayak.com does?
Giorgos Zacharia: We’re a travel search engine. Our Web site allows people to compare hundreds of packages at once with flights, hotels, cars, and vacation packages. We give the customer the choice of where to book. We’ve also introduced the mobile booking ability, so you can book on another web site without having to leave Kayak.com to a non-mobile friendly experience.
BJ: You became the chief technology officer at Kayak.com after being in charge of product there. What perspective does that give you in your new role?
GZ: At Kayak, product and technology are paramount – 70 per cent of our staff are technical staff and designers. The transition was very smooth as I worked with the founding CTO for about five years and my day-to-day priorities haven’t changed much, it’s just the scope of the job. We’re trying to build the best web site and mobile site to deliver the best user experience and save people time and money.
BJ: Kayak.com can compare the prices of countless flights, the room fares of more than half a million hotels, and even get rates on car rentals all around the world. You could describe that as a “big data” operation, so give us a peek behind the curtain. How do you run your platform?
GZ: We receive the data from many different third parties and we have to spend time cleaning it. For example, we might get a record about one hotel from Priceline.com and a different version from Travelocity. We need to rationalize that data, clean it up and create one unique record for every hotel. This is done with a lot of machine learning: we train machine learning models on how human experts would compare this data. So most of these tasks are done automatically, and the comparisons the model is not confident about are kicked off to a human processing loop for a human to make the final call. The technologies behind this are mostly Hadoop for data access and a lot of Python for dashboards and machine learning, and at the end of the day we’re a Java shop, so whatever is deployed has to be written in Java.
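To illustrate the routing Zacharia describes (automatic decisions when confidence is high, a human loop otherwise), here is a minimal Python sketch. It is not Kayak.com’s code: the field names, the thresholds and the use of plain string similarity in place of a trained model are all assumptions made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds; a production system would use a model trained
# on pairs that human experts have already labelled.
AUTO_MERGE = 0.90
AUTO_DISTINCT = 0.50


def similarity(a: dict, b: dict) -> float:
    """Average string similarity of name and address, in the range 0 to 1."""
    fields = ("name", "address")
    scores = [
        SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio()
        for f in fields
    ]
    return sum(scores) / len(scores)


def route(a: dict, b: dict) -> str:
    """Decide whether two provider records describe the same hotel."""
    score = similarity(a, b)
    if score >= AUTO_MERGE:
        return "merge"          # confident match: create one record
    if score <= AUTO_DISTINCT:
        return "distinct"       # confident non-match: keep both
    return "human_review"       # uncertain: a person makes the final call
```

Only the middle band of scores reaches a reviewer, which is what keeps the human workload small as the automatic decisions improve.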
BJ: How do you train your data model to think like a human?
GZ: Let’s say one of our data providers thinks a hotel is a five-star hotel, another one thinks it’s a four-star hotel, and another one thinks it’s a three-star hotel. What do we show on our web site? A human expert can decide it’s a three- or four-star hotel based on the amenities. Maybe each one of our providers was correct at the time of the decision. So looking at the age of the information, a human can make that decision. We can train the machine learning models on these examples of how to do the comparison. If there are repeatable patterns that happen with high confidence, the machine learning picks them up and you no longer need humans in that process.
BJ: Kayak.com has been doing predictive analytics too, informing customers of when it might be cheapest to buy that plane ticket. How do you determine what type of predictions are most useful to your customers and what degree of accuracy is required before you present them?
GZ: We are big users of our product ourselves; we have a lot of very active travelers. So the development of our product is done in a very organic way. Any engineer can come up with an idea and go to production for A/B testing. Our users can vote with their clicks – the features that our users find useful will get emphasis for production and the features that don’t get used are eliminated. So flight price prediction was a feature like this. Our analytics teams thought they could do something reasonably accurate and they did bring back a nice and accurate machine learning model. We deployed it and our users used it, so we expanded it. The way we do the forecasting is to combine all the data from our providers, and it’s done in a crowdsourcing model. The trips that are more popular for our users will give you better forecasting, so during Spring break you’ll get better predictions for trips to the Caribbean. It’s an input for the users’ final decision, and we tell the user how good we’ve been at that prediction in the past so they can decide if they will buy now or wait until later.
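One way to picture the “age of the information” reasoning is a recency-weighted vote, sketched below in Python. This is a hand-written rule standing in for the learned model, not Kayak.com’s method; the function name, the 180-day half-life and the input format are assumptions chosen for illustration.

```python
from collections import defaultdict


def reconcile_stars(reports, half_life_days=180.0):
    """Pick a star rating from (stars, age_days) reports.

    Each report's vote is halved for every `half_life_days` of age,
    so fresh information outweighs stale information.
    """
    weight_by_stars = defaultdict(float)
    for stars, age_days in reports:
        weight_by_stars[stars] += 0.5 ** (age_days / half_life_days)
    if not weight_by_stars:
        raise ValueError("no reports to reconcile")
    # The rating with the largest total decayed weight wins.
    return max(weight_by_stars, key=weight_by_stars.get)
```

With reports of five stars (900 days old), four stars (400 days old) and three stars (30 days old), the newest report dominates and the function returns 3. Two moderately recent reports that agree can still outvote a single newer one.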
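Telling users how good past predictions have been amounts to scoring old buy-or-wait advice against what fares later did. The Python sketch below shows one simple way to compute that figure; the `PastAdvice` record and its fields are hypothetical, and the interview does not say how Kayak.com actually measures accuracy.

```python
from dataclasses import dataclass


@dataclass
class PastAdvice:
    advice: str          # "buy" or "wait"
    price_then: float    # fare when the advice was shown
    lowest_later: float  # lowest fare observed afterwards


def was_correct(p: PastAdvice) -> bool:
    """'wait' is right if the fare later dropped; 'buy' is right if it did not."""
    price_dropped = p.lowest_later < p.price_then
    return price_dropped if p.advice == "wait" else not price_dropped


def historical_accuracy(history: list[PastAdvice]) -> float:
    """Fraction of past advice that turned out to be correct."""
    if not history:
        raise ValueError("no past advice to score")
    return sum(was_correct(p) for p in history) / len(history)
```

Popular routes accumulate more history, so a score like this would be more stable there, which is consistent with the point about better forecasts for popular trips.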
BJ: How do you use the data from your site to create content?
GZ: It’s very organic. It may be an engineer or a marketing person that comes up with an idea. An analytics person can take a look and if we have something interesting, a marketing person can report it. Sometimes we may have a data scientist reporting on it.
BJ: How do you create that environment where anyone can come up with an idea and execute on it?
GZ: If it’s a feature, it’s very easy: you set up an A/B experiment. Over time, different engineers, marketers and analytics people have built their reputation on how good their ideas have been for Kayak, so based on how good their ideas have been in the past, they get higher priority for the new ones. For reporting pieces, it’s the same process. If we get enough pickup for an idea on Twitter or other followers, we’ll do more of that type of reporting in the future.
BJ: What does Kayak.com learn from the way customers use its site and how do you modify your product based on that information?
GZ: Any part of the interface is measured for popularity and over time we might change the interface based on its popularity among users. For example, we might change the order of the filters based on how users change their behaviour. We’re not afraid of A/B experiments, so we’ll test how much detail we show in a results page. So at any time, we’ll be testing different aspects of the web site and the mobile site. Our own users teach us how to build a better website.
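For readers unfamiliar with how “voting with clicks” turns into a decision, a common approach is a two-proportion z-test on the click rates of the two variants. The Python sketch below assumes that approach; the interview does not say which statistical test Kayak.com uses, and the function name and inputs are made up for the example.

```python
import math


def two_proportion_z_test(clicks_a, users_a, clicks_b, users_b):
    """Compare click rates of variants A and B.

    Returns (z, p_value), where p_value is two-sided.
    """
    rate_a = clicks_a / users_a
    rate_b = clicks_b / users_b
    # Pooled rate under the assumption that both variants perform the same.
    pooled = (clicks_a + clicks_b) / (users_a + users_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all, so no evidence of a difference
    z = (rate_b - rate_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A small p-value suggests the difference in clicks is unlikely to be chance, which is the kind of evidence that would justify expanding a feature or reordering filters.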
Watch the video to find out how Kayak.com users surprised Zacharia, and what his goals are as CTO.