Hello,
I'd like to know the best way to gather tweets from a specific
date up to the present (‘time.now’).
I have a sqlite3 database into which I dumped the whole user tweet
history. The db fields are tweet.created_at, tweet.text and tweet.id,
plus an integer as key.
I use tweet.id to check for a match before accepting new tweets into
the database.
However, the script currently tries to download every available
tweet from Twitter’s API on each run, does the match and adds the ones
that are missing (which are, of course, the new ones). As you can
imagine, this procedure causes big delays.
The created_at date string is like this: “Tue Jul 06 10:08:23 +0000
2010”
The time of day matters; I can’t work with dates alone.
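To make the format concrete, here is a minimal Python sketch of parsing that string while keeping the time. The names TWITTER_FORMAT, parse_created_at and to_sortable are only illustrative, and the sketch assumes English day and month names (the default C locale).

```python
from datetime import datetime, timezone

# Layout of strings such as "Tue Jul 06 10:08:23 +0000 2010".
# %a and %b are locale dependent; English names are assumed here.
TWITTER_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value):
    """Return a timezone-aware datetime for a created_at string."""
    return datetime.strptime(value, TWITTER_FORMAT)


def to_sortable(value):
    """Return a UTC 'YYYY-MM-DD HH:MM:SS' string that sorts chronologically."""
    parsed = parse_created_at(value).astimezone(timezone.utc)
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    print(to_sortable("Tue Jul 06 10:08:23 +0000 2010"))
```

Text in this layout compares correctly as plain strings, so it can be stored in a TEXT column and used with ordinary comparison operators.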
I have a couple of solutions in mind, but I’d like to know from more
experienced users which way to approach this:
- Convert the ‘created_at’ string to a YYYY-MM-DD date? This could be
  tricky because there’s also the exact time of the tweet to consider.
  (I haven’t tried it yet.)
- Use sqlite3’s “id integer primary key”, which assigns the biggest
  number to the latest entry, and extract the date from there?
- Any smarter way?
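
For illustration, the options above could be combined roughly as in the following Python sketch. It is not a definitive answer: the table layout, the column name tweet_id, and the helpers open_db, latest_stored, store_new and tweets_since are all hypothetical, incoming tweets are assumed to be dicts with "id", "created_at" and "text" keys, and tweet ids are assumed to grow over time.

```python
import sqlite3
from datetime import datetime, timezone

# Layout of strings such as "Tue Jul 06 10:08:23 +0000 2010" (English names assumed).
TWITTER_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def to_sortable(created_at):
    """Convert a created_at string to a UTC 'YYYY-MM-DD HH:MM:SS' string."""
    parsed = datetime.strptime(created_at, TWITTER_FORMAT)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) tweets table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "id INTEGER PRIMARY KEY, "
        "tweet_id INTEGER UNIQUE, "
        "created_at TEXT, "
        "text TEXT)"
    )
    return conn


def latest_stored(conn):
    """Return (newest tweet id, newest timestamp), or (None, None) if empty."""
    query = "SELECT MAX(tweet_id), MAX(created_at) FROM tweets"
    return conn.execute(query).fetchone()


def store_new(conn, tweets):
    """Insert only tweets newer than the newest stored id; return how many."""
    newest_id, _ = latest_stored(conn)
    fresh = [t for t in tweets if newest_id is None or t["id"] > newest_id]
    rows = [(t["id"], to_sortable(t["created_at"]), t["text"]) for t in fresh]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO tweets (tweet_id, created_at, text) "
            "VALUES (?, ?, ?)",
            rows,
        )
    return len(fresh)


def tweets_since(conn, since):
    """Return (created_at, text) rows from 'since' (sortable UTC string) onward."""
    query = (
        "SELECT created_at, text FROM tweets "
        "WHERE created_at >= ? ORDER BY created_at"
    )
    return conn.execute(query, (since,)).fetchall()
```

With the timestamp stored in sortable form, "from a specific date till now" becomes a single WHERE clause that keeps the time of day. If the API call in use accepts something like a since_id parameter, passing the newest stored id would avoid downloading old tweets in the first place; that is worth checking in the API documentation.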
Thanks