Twitter is a social networking site that lets users communicate with each other through short messages known as tweets. A tweet was originally limited to 140 characters; the limit was raised to 280 characters on November 7, 2017 (https://en.wikipedia.org/wiki/Twitter). Like other social networking sites, Twitter holds a wealth of user information that can be used for social, political, economic and even scientific research. In this post we are going to use a popular Twitter library known as Tweepy to extract Twitter user data. This post assumes that you have basic knowledge of Python and that you have a Twitter account.

Scraping Twitter Data With Tweepy

Tweepy is an open source Python library for accessing the Twitter APIs from Python. For more details, check the Tweepy documentation.

Installing Tweepy

Before we can use the Tweepy package we need to install it. If you are using the Anaconda platform, run the following command: conda install -c conda-forge tweepy

Alternatively, you can install the Tweepy library with pip: pip install tweepy

Creating a Twitter App

To scrape data from Twitter we need to create an app and generate a consumer key, consumer secret, access token and access token secret. To do this, go to apps.twitter.com and click the Create New App button. This brings up the form for creating your new app.

[Screenshot: the form for creating a new app]

After creating your new app, click on it and open the Keys and Access Tokens tab to generate your keys and access tokens.

[Screenshot: the app's Keys and Access Tokens tab]

Now let's begin extracting data from Twitter using our app's keys and tokens.
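
A minimal sketch of the authentication step, assuming the four values are stored in environment variables rather than pasted into the script. The variable names and the helper names load_credentials and make_api are hypothetical; tweepy.OAuthHandler, set_access_token and tweepy.API are the Tweepy calls used for OAuth 1a user authentication.

```python
import os

# Hypothetical environment variable names; use whatever names suit your setup.
REQUIRED_KEYS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_SECRET",
)


def load_credentials(env=None):
    """Return the four credentials from a mapping, failing early if any is missing."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError("missing credentials: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}


def make_api(creds):
    """Build an authenticated tweepy.API object from the credentials."""
    import tweepy  # imported here so load_credentials works without Tweepy installed

    auth = tweepy.OAuthHandler(
        creds["TWITTER_CONSUMER_KEY"], creds["TWITTER_CONSUMER_SECRET"]
    )
    auth.set_access_token(
        creds["TWITTER_ACCESS_TOKEN"], creds["TWITTER_ACCESS_SECRET"]
    )
    # wait_on_rate_limit makes Tweepy sleep instead of failing when a limit is hit.
    return tweepy.API(auth, wait_on_rate_limit=True)


# Typical use:
#   api = make_api(load_credentials())
```

Keeping the keys out of the source file makes it harder to leak them by accident when the script is shared.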

Reading Twitter Timeline
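
Reading the home timeline can be sketched as follows, assuming api is an authenticated tweepy.API object and that your app's access level still permits this endpoint. The helper name timeline_lines is hypothetical; home_timeline is the Tweepy call that returns the most recent tweets from the accounts you follow.

```python
def timeline_lines(api, count=10):
    """Return one 'user: text' line per tweet on the home timeline."""
    lines = []
    for status in api.home_timeline(count=count):
        # Each status carries the author in .user and the tweet body in .text.
        lines.append("@{}: {}".format(status.user.screen_name, status.text))
    return lines


# Typical use, with `api` built from your own keys and tokens:
#   for line in timeline_lines(api, count=10):
#       print(line)
```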

Creating a JSON Dump From Twitter Timeline
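
One way to dump the timeline to disk is to write each tweet's raw JSON on its own line. This is a sketch under assumptions: api is an authenticated tweepy.API object, the file name timeline.json and the helper names are hypothetical, and the raw payload is read from the _json attribute that Tweepy attaches to each status.

```python
import json


def status_to_json_line(status):
    """Serialize the raw API payload of one tweet as a single JSON line."""
    return json.dumps(status._json)


def dump_timeline(api, path="timeline.json", count=20):
    """Write the home timeline to `path`, one JSON document per line."""
    statuses = api.home_timeline(count=count)
    with open(path, "w", encoding="utf-8") as fh:
        for status in statuses:
            fh.write(status_to_json_line(status) + "\n")
    return len(statuses)


# Typical use:
#   written = dump_timeline(api, "timeline.json", count=20)
```

The one-document-per-line layout keeps the file easy to read back with json.loads on each line.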

Creating a DataFrame From Twitter Timeline
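
For analysis it is often handier to keep only a few fields per tweet and load them into a pandas DataFrame. The sketch below assumes api is an authenticated tweepy.API object and that pandas is installed; the helper names and the choice of columns are illustrative, not the only sensible ones.

```python
def statuses_to_rows(statuses):
    """Flatten tweets into plain dicts, one per tweet."""
    rows = []
    for status in statuses:
        rows.append(
            {
                "id": status.id,
                "created_at": status.created_at,
                "user": status.user.screen_name,
                "text": status.text,
                "retweets": status.retweet_count,
                "favorites": status.favorite_count,
            }
        )
    return rows


def timeline_dataframe(api, count=50):
    """Build a DataFrame with one row per tweet on the home timeline."""
    import pandas as pd  # imported here so statuses_to_rows works without pandas

    return pd.DataFrame(statuses_to_rows(api.home_timeline(count=count)))


# Typical use:
#   df = timeline_dataframe(api, count=50)
#   print(df.head())
```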

Reading Personal Tweets
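
Your own tweets can be read with user_timeline, which returns the authenticating user's tweets when no other user is specified. This is a sketch assuming api is an authenticated tweepy.API object; the helper names are hypothetical. Retweets are detected by the retweeted_status attribute, which is only present on a status that is a retweet.

```python
def own_tweets(api, count=20):
    """Return the most recent tweets posted by the authenticated user."""
    return list(api.user_timeline(count=count))


def is_retweet(status):
    """A status is a retweet when it carries the original in .retweeted_status."""
    return hasattr(status, "retweeted_status")


def original_texts(statuses):
    """Keep only the text of tweets you wrote yourself, skipping retweets."""
    return [status.text for status in statuses if not is_retweet(status)]


# Typical use:
#   for text in original_texts(own_tweets(api, count=20)):
#       print(text)
```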

Searching For Tweets
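
Searching for recent tweets that match a query could look like the sketch below, assuming api is an authenticated tweepy.API object. The search method is named search in Tweepy 3.x and search_tweets in Tweepy 4.x, so the hypothetical helper find_tweets picks whichever one exists. The example query "liverpool" is also an assumption.

```python
def find_tweets(api, query, count=15, lang="en"):
    """Return (screen_name, text) pairs for recent tweets matching `query`."""
    # Tweepy 4.x renamed API.search to API.search_tweets; support both.
    search = getattr(api, "search_tweets", None) or api.search
    results = search(q=query, count=count, lang=lang)
    return [(status.user.screen_name, status.text) for status in results]


# Typical use:
#   for user, text in find_tweets(api, "liverpool", count=15):
#       print(user, text)
```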

Streaming Tweets
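
A streaming sketch that appends every incoming tweet to liverpool.json. It assumes Tweepy 3.x, where a tweepy.StreamListener is passed to tweepy.Stream (Tweepy 4.0 folded the listener into Stream itself), and that your app still has access to the streaming endpoint. The class and function names and the tracked keyword "liverpool" are assumptions.

```python
class TweetFileWriter:
    """Appends each raw tweet payload to a file, one JSON document per line."""

    def __init__(self, path):
        self.path = path

    def on_data(self, raw_data):
        if isinstance(raw_data, bytes):
            raw_data = raw_data.decode("utf-8")
        line = raw_data.strip()
        if line:  # skip blank keep-alive messages
            with open(self.path, "a", encoding="utf-8") as fh:
                fh.write(line + "\n")
        return True  # returning True keeps the stream open


def start_stream(auth, keywords, path="liverpool.json"):
    """Stream tweets matching `keywords` into `path` until interrupted."""
    import tweepy  # Tweepy 3.x assumed

    class Listener(TweetFileWriter, tweepy.StreamListener):
        def __init__(self, path):
            tweepy.StreamListener.__init__(self)
            TweetFileWriter.__init__(self, path)

    stream = tweepy.Stream(auth, Listener(path))
    stream.filter(track=keywords)  # blocks while tweets arrive


# Typical use, with `auth` being the OAuthHandler built from your keys:
#   start_stream(auth, ["liverpool"], "liverpool.json")
```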

Output

This creates a liverpool.json file in your working directory and continuously appends the live tweets to it as they arrive. You can watch the size of liverpool.json grow over time.

More on Streaming Tweets

Continuously stream live tweets as they are posted.
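
A stream left running forever will fill the disk and may hit rate limits, so a handler usually needs a stopping rule. The sketch below is one possible approach under Tweepy 3.x conventions, where returning False from on_data or on_error disconnects the stream. The class name, the max_tweets parameter and the start function are hypothetical.

```python
class BoundedTweetWriter:
    """Saves tweets to a file and stops after `max_tweets` or on rate limiting."""

    def __init__(self, path, max_tweets=100):
        self.path = path
        self.max_tweets = max_tweets
        self.seen = 0

    def on_data(self, raw_data):
        if isinstance(raw_data, bytes):
            raw_data = raw_data.decode("utf-8")
        line = raw_data.strip()
        if line:  # skip blank keep-alive messages
            with open(self.path, "a", encoding="utf-8") as fh:
                fh.write(line + "\n")
            self.seen += 1
        # False tells Tweepy 3.x to close the connection.
        return self.seen < self.max_tweets

    def on_error(self, status_code):
        # HTTP 420 means the client is being rate limited; disconnect
        # rather than keep retrying.
        return status_code != 420


def start_bounded_stream(auth, keywords, path, max_tweets=100):
    """Stream tweets matching `keywords` until `max_tweets` have been saved."""
    import tweepy  # Tweepy 3.x assumed

    class Listener(BoundedTweetWriter, tweepy.StreamListener):
        def __init__(self):
            tweepy.StreamListener.__init__(self)
            BoundedTweetWriter.__init__(self, path, max_tweets)

    tweepy.Stream(auth, Listener()).filter(track=keywords)
```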

Conclusion

In this post we have seen how to create a Twitter app and use its access keys and tokens to extract data from the Twitter APIs using the Tweepy library. In addition, we have seen how to stream live tweets, which is a fundamental task in many analysis use cases. It is worth noting that you should keep the access keys and tokens secure. You should also read the Twitter Privacy and Terms to understand the extent to which Twitter data can be used.

What’s Next

We have looked at the important task of scraping data from Twitter. In the next post we are going to do simple analyses and visualizations on the data we have extracted.
