Hee'World

[1004jonghee] Tajo???

Jonghee Jeon 2013. 8. 19. 15:07

Introduction

Tajo is a relational and distributed data warehouse system for Hadoop. Tajo is designed for low-latency and scalable ad-hoc queries, online aggregation, and ETL on large data sets by leveraging advanced database techniques. It supports SQL standards. Tajo uses HDFS as its primary storage layer and has its own query engine, which allows direct control of distributed execution and data flow. As a result, Tajo has a variety of query evaluation strategies and more optimization opportunities. In addition, Tajo will have a native columnar execution engine and its own optimizer.

Features

  • Fast and low-latency query processing on SQL queries including projection, filter, group-by, sort, and join.
  • Rudimentary ETL that transforms one data format into another.
  • Supports various file formats, such as CSV, RCFile, RowFile (a row store file), and Trevni.
  • Command line interface to allow users to submit SQL queries.
  • Java API to enable clients to submit SQL queries to Tajo

 

-http://tajo.incubator.apache.org/

Tajo was developed in Korea. It is a Hadoop-based data warehouse system that lets you run SQL queries against data stored in HDFS. It looks a lot like Hive, doesn't it?
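
To make the query types from the feature list concrete (projection, filter, group-by, sort, and join), here is a minimal sketch of one ad-hoc SQL query that uses all five. It runs against Python's built-in sqlite3 purely as a stand-in so it can be tried anywhere. It is not Tajo, and the `orders` and `customers` tables, their columns, and the `run_demo` helper are made up for illustration. Whether Tajo accepts exactly this syntax is an assumption to check against its documentation.

```python
import sqlite3


def run_demo():
    """Run one query combining projection, filter, group-by, sort, and join.

    sqlite3 is only a local stand-in for a SQL engine; the tables and data
    below are hypothetical and have nothing to do with Tajo or HDFS.
    """
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Hypothetical sample tables.
    cur.execute("CREATE TABLE customers (customer_id INTEGER, region TEXT)")
    cur.execute(
        "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL)"
    )
    cur.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Seoul"), (2, "Busan"), (3, "Seoul")],
    )
    cur.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(10, 1, 30.0), (11, 2, 5.0), (12, 3, 20.0), (13, 2, 40.0), (14, 1, 8.0)],
    )

    query = """
        SELECT c.region, SUM(o.amount) AS total             -- projection
        FROM orders o
        JOIN customers c ON o.customer_id = c.customer_id   -- join
        WHERE o.amount >= 10                                -- filter
        GROUP BY c.region                                   -- group-by
        ORDER BY total DESC                                 -- sort
    """
    rows = cur.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for region, total in run_demo():
        print(region, total)
```

In a system like Tajo or Hive the same kind of statement would be submitted through the command line interface or a client API, and the engine would distribute the work across the cluster instead of running it in a single process.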
