Hi there --
I've been doing some analysis using the raw pageviews table in Hive, in
order to try to understand the effect that adding a sitemap to
it.wikipedia.org had on traffic[1]. As part of this analysis, I created
three temporary tables. But, of course, those tables only exist within the
context of my own session, which is sub-optimal since I'm not the only one
trying to understand this.
What's the best way to go about persisting these tables? I could copy the
data into non-temporary tables with CREATE TABLE ... AS SELECT (Hive has no
SELECT INTO), but I don't want to do so willy-nilly.
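
To make the question concrete, here is a minimal sketch of the kind of copy
I have in mind. It uses Python's sqlite3 module purely as a self-contained
stand-in for Hive (that substitution is an assumption, as are the table and
column names, which are hypothetical). In Hive itself the corresponding
statement would be CREATE TABLE ... AS SELECT run against the temporary
table.

```python
import sqlite3


def persist_temp_table(conn, temp_name, persistent_name):
    """Copy a session-scoped temp table into a regular table and return the row count."""
    # Identifiers cannot be bound as query parameters, so they are quoted here.
    # The names are hypothetical and assumed to come from trusted code.
    conn.execute(
        f'CREATE TABLE "{persistent_name}" AS SELECT * FROM temp."{temp_name}"'
    )
    conn.commit()
    return conn.execute(f'SELECT COUNT(*) FROM "{persistent_name}"').fetchone()[0]


# In-memory database standing in for the Hive session (assumption).
conn = sqlite3.connect(":memory:")

# A temp table shaped like the ones described: one string and two ints.
conn.execute(
    "CREATE TEMP TABLE sitemap_tmp "
    "(page TEXT, views_before INTEGER, views_after INTEGER)"
)
conn.executemany(
    "INSERT INTO sitemap_tmp VALUES (?, ?, ?)",
    [("Roma", 120, 150), ("Milano", 80, 95)],
)

copied = persist_temp_table(conn, "sitemap_tmp", "sitemap_persisted")
```

The copy is a one-off snapshot, so the persisted table would not pick up any
later changes to the temporary one.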
They'll probably need to stick around for about two weeks, I would guess.
Each of the three tables in question has about 5 million rows and three
columns (one string and two ints).
Thanks!
- Ian