Wednesday, November 13, 2013

Presto :: SQL Query Engine From Facebook For Hadoop



Presto is a low-latency, SQL-compliant query system for Hadoop developed at Facebook. It is yet another fast query option for Hadoop. Facebook runs one of the largest data warehouses in the world, at more than 300 petabytes. The huge amount of structured and unstructured data at Facebook has driven it to develop its own tools that work at very large scale. Presto was up and running in early 2013 and is now actively used by more than 1,000 employees, who run more than 30,000 queries each day against a petabyte-scale database.

Facebook had previously developed the Hive interface to support SQL-like querying of unstructured data, and released it as an open source product. It is now a popular tool used by most companies that use Hadoop. Facebook has also decided to contribute Presto to the open source community.

Facebook reports that Presto is "10X better than Hive/MapReduce in terms of efficiency and latency for most queries". It also supports a large subset of ANSI SQL, including subqueries and joins.
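
To give a feel for the kind of query that "subqueries and joins" refers to, here is a minimal sketch in Python. It does not talk to Presto at all: an in-memory SQLite database stands in as the engine so the example runs anywhere, and the `users` and `page_views` tables, their columns, and the sample rows are all made up for illustration. The query itself sticks to plain standard SQL (a join, an `IN` subquery, `GROUP BY` and `HAVING`).

```python
import sqlite3

# Plain standard SQL: a join plus an IN subquery.
# Table and column names are hypothetical, chosen only for this sketch.
QUERY = """
SELECT u.country, COUNT(*) AS views
FROM page_views AS v
JOIN users AS u ON u.user_id = v.user_id
WHERE v.user_id IN (
    -- subquery: keep only users with at least two page views
    SELECT user_id
    FROM page_views
    GROUP BY user_id
    HAVING COUNT(*) >= 2
)
GROUP BY u.country
ORDER BY views DESC, u.country
"""


def run_example():
    """Load a few sample rows into in-memory SQLite and run QUERY."""
    conn = sqlite3.connect(":memory:")  # SQLite is a stand-in, not Presto
    conn.executescript("""
        CREATE TABLE users (user_id INTEGER PRIMARY KEY, country TEXT);
        CREATE TABLE page_views (user_id INTEGER, url TEXT);
        INSERT INTO users VALUES (1, 'US'), (2, 'IN'), (3, 'US');
        INSERT INTO page_views VALUES
            (1, '/a'), (1, '/b'), (1, '/c'),
            (2, '/a'), (2, '/b'),
            (3, '/a');
    """)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for country, views in run_example():
        print(country, views)
```

User 3 has only one page view, so the subquery filters that user out before the counts per country are taken. Whether a given engine accepts every construct here depends on which subset of ANSI SQL it implements.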
