hiveql - NULL column names in Hive query result -
i have downloaded weather .txt files noaa, looks like:
wban,date,time,stationtype,skycondition,skyconditionflag,visibility,visibilityflag,weathertype,weathertypeflag,drybulbfarenheit,drybulbfarenheitflag,drybulbcelsius,drybulbcelsiusflag,wetbulbfarenheit,wetbulbfarenheitflag,wetbulbcelsius,wetbulbcelsiusflag,dewpointfarenheit,dewpointfarenheitflag,dewpointcelsius,dewpointcelsiusflag,relativehumidity,relativehumidityflag,windspeed,windspeedflag,winddirection,winddirectionflag,valueforwindcharacter,valueforwindcharacterflag,stationpressure,stationpressureflag,pressuretendency,pressuretendencyflag,pressurechange,pressurechangeflag,sealevelpressure,sealevelpressureflag,recordtype,recordtypeflag,hourlyprecip,hourlyprecipflag,altimeter,altimeterflag 00102,20150101,0001,0,ovc043, ,10.00, , , ,27, ,-2.8, ,26, ,-3.1, ,25, ,-3.9, , 92, , 0, ,000, , , ,30.05, , , , , ,30.36, ,aa, , , ,30.23, 00102,20150101,0101,0,ovc045, ,10.00, , , ,27, ,-2.8, ,26, ,-3.1, ,25, ,-3.9, , 92, , 6, ,080, , , ,30.07, , , , , ,30.37, ,aa, , , ,30.25, 00102,20150101,0201,0,ovc047, ,10.00, , , ,26, ,-3.3, ,25, ,-3.7, ,24, ,-4.4, , 92, , 6, ,090, , , ,30.08, , , , , ,30.39, ,aa, , , ,30.26, 00102,20150101,0301,0,ovc049, ,10.00, , , ,26, ,-3.3, ,25, ,-3.7, ,24, ,-4.4, , 92, , 7, ,100, , , ,30.09, , , , , ,30.40, ,aa, , , ,30.27,
then have created following table:
create external table weather(wban string, `date` string, time string, stationtype int, skycondition string, skyconditionflag string, visibility int, visibilityflag string, weathertype string, weathertypeflag string, drybulbfarenheit int, drybulbfarenheitflag string, drybulbcelsius decimal, drybulbcelsiusflag int, wetbulbfarenheit int, wetbulbfarenheitflag int, wetbulbcelsius decimal, wetbulbcelsiusflag int, dewpointfarenheit int, dewpointfarenheitflag int, dewpointcelsius decimal, dewpointcelsiusflag int, relativehumidity int, relativehumidityflag int, windspeed int, windspeedflag int, winddirection int, winddirectionflag int, valueforwindcharacter int, valueforwindcharacterflag int, stationpressure decimal, stationpressureflag int, pressuretendency int, pressuretendencyflag int, pressurechange int, pressurechangeflag int, sealevelpressure decimal, sealevelpressureflag int, recordtype string, recordtypeflag string, hourlyprecip decimal, hourlyprecipflag int, altimeter decimal, altimeterflag int) comment 'our weather table in hive!' row format delimited fields terminated ',' lines terminated '\n' location '/data/weather';
now if try simple query like:
hive> select * weather limit 10;
i result following, , null replacing column's names!
wban date time null skycondition skyconditionflag null visibilityflag weathertype weathertypeflag null drybulbfarenheitflag null null null null null null null null null null null null null null null null nullnull null null null null null null null null recordtype recordtypeflag null null null null 00102 20150101 0001 0 ovc043 10 27 -3 null 26 null -3 null25 null -4 null null null null null 0 null null null 30 null null null null null 30 null aa null null 30 null 00102 20150101 0101 0 ovc045 10 27 -3 null 26 null -3 null25 null -4 null null null null null 80 null null null 30 null null null null null 30 null aa null null 30 null 00102 20150101 0201 0 ovc047 10
as may noticed, fourth , seventh columns (and many after those) tilted null when should stationtype, , visibility... etc respectively!
even if tried:
hive> select visibility weather limit 10;
i correct result, null column title/name!!!
why null column names/titles?!
interesting question, took me minute realize going on right knowledge of hive obvious!
- the first thing note here null values occur in columns not of type string.
- the second thing realize hive (unlike beeline example) not print column headers above selection.
so, putting 1 , 2 together:
- the column names fine, see query
describe weather
. - the file use datasource, appears have had column names on first row. these making first row of hive table. of course columns of type string have no problem dealing data, columns of type int show null when asked handle strings cannot cast int properly.
suggestion:
try rid of first row, preferably before creating external table.
Comments
Post a Comment