Length function in Hive with examples
Contents
Length function in Hive
The length function returns the length of a specified string. In other words, it gives the number of characters in the string. It is available in every relational database systems. In Hive, Length function include the leading and trailing whitespaces when calculating the length of the string.
Syntax of Length function in Hive
1 |
length(string) |
The function is able to find the length for different values such as string, varchar, int, decimal, date & timestamp. To calculate length, either the value or column name should be passed as argument inside the length function.
If the value is NULL, the length function also returns NULL in Hive.
Example
Lets show the length function with few examples. For that, we have a table called “student_info” in Hive.
As we can see that the table contains the student details such as student_id, name, email_id, date_of_joining and last_login_time. Also please find the data types of those columns in the below screenshot.
Lets find the length of student_id column from student_info table. We have used length function on the student_id and it returns the length as 7. Even this column data type is bigint, the function returns the length of the value.
Next lets find the length of email_id column. The data type of the column is String and it contains the NULL value for one student.
As we can see above, the length function returned the number of characters in the email_id column. Since the email_id is NULL for one student, the length is also NULL.
Substring with Length function in Hive
We can use the Length function with other String function such as SubString. Lets use both length and Substring function on the email_id column to find the user name without domain name.
1 2 3 4 |
select email_id, SUBSTRING(email_id, 1, LENGTH(email_id) - INSTR(REVERSE(email_id),'@')) as user_name from student_info; |
In the email_id, the character ‘@’ is splitting the user name and domain name. So we need to find the index/location of that character in email_id column. For that we have used INSTR and REVERSE function.
The REVERSE function is used to reverse the entire email_id string as below. In that String, INSTR function finds the index of ‘@’ character.
Then we are subtracting the index of ‘@’ character from the length of email_id column. So the starting index for the substring is 1 and ending index of the substring is the index value before the ‘@’ character.
1 |
LENGTH(email_id) - INSTR(REVERSE(email_id),'@') |
As we have given the required index value in the substring function, it returned the user name from the email_id column as below.
Recommended Articles
Your Suggestions