learning catalog: Working with Hive Tables

HIVE is a data warehousing infrastructure on hadoop which provides capabilities for data storage in hadoop filesystem. It is designed for easy data summarization,ad-hoc analysis of larger volume of data from hadoop . It has own SQL-style language to work on this hive database and we call it HQL.

Here are some HQL commands to work with hive database.

CREATE HIVE Table

CREATE TABLE employee ( empid int, emp_name string,salary string, hire_dt timestamp)

ROW FORMAT DELIMITED

FIELDS TERMINATED BY ‘\t’

LINES TERMINATED BY ‘\n’

STORED AS TEXTFILE;

Browse HIVE Tables

SHOW TABLES; --To show all tables

SHOW TABLES ‘emp*’; --To show tables start with letters emp

DESCRIBE employee; --To show columns and those types

ALTER HIVE Tables

RENAME table name:

ALTER TABLE employee RENAME TO new_empl;

REPLACE all columns with new list of columns in a table:

ALTER TABLE new_empl REPLACE COLUMNS (empid int, emp_name string,salary string);

ADD new columns:

ALTER TABLE new_empl ADD COLUMNS (hire_dt timestamp);

INSERT Data into HIVE Table

In HIVE, we have LOAD DATA command to insert data into tables.

Load data from local file-system file

LOAD DATA LOCAL INPATH '/home/cloudera/sample_emp_data.txt' OVERWRITE INTO TABLE

employee;

Load data from hadoop file-system file

LOAD DATA INPATH “/user/Hadoop_dir/sample_emp_data.txt” OVERWRITE INTO TABLE

employee;

Note: OVERWRITE is optional to overwrite the table data .

DROP HIVE Tables

DROP TABLE new_empl;

learning catalog

Labels

Thursday, February 9, 2017

Working with Hive Tables

No comments:

Post a Comment