Thursday, February 9, 2017

Working with Hive Tables

HIVE is a data warehousing infrastructure on hadoop which provides capabilities for data storage in hadoop filesystem. It is designed for easy data summarization,ad-hoc analysis of larger volume of data from hadoop . It has own SQL-style language to work on this hive database and we call it HQL.

Here are some HQL commands to work with hive database.

CREATE HIVE Table

CREATE TABLE employee ( empid int, emp_name string,salary string, hire_dt timestamp)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ‘\t’
LINES TERMINATED BY ‘\n’
STORED AS TEXTFILE;

Browse HIVE Tables 

SHOW TABLES; --To show all tables

SHOW TABLES ‘emp*’; --To show tables start with letters emp 

DESCRIBE employee; --To show columns and those types

ALTER HIVE Tables

RENAME table name:

ALTER TABLE employee RENAME TO new_empl;

REPLACE all columns with new  list of columns in a table:
ALTER TABLE new_empl REPLACE COLUMNS (empid int, emp_name string,salary string);

ADD new columns:
ALTER TABLE new_empl ADD COLUMNS (hire_dt timestamp);


INSERT Data into HIVE Table

In HIVE, we have LOAD DATA command to insert data into tables.

Load data from local file-system file
LOAD DATA LOCAL INPATH '/home/cloudera/sample_emp_data.txt' OVERWRITE INTO TABLE
employee;

Load data from hadoop file-system file
LOAD DATA INPATH “/user/Hadoop_dir/sample_emp_data.txt” OVERWRITE INTO TABLE 
employee;

Note: OVERWRITE is optional to overwrite the table data .

DROP HIVE Tables

DROP TABLE new_empl;

.

No comments:

Post a Comment