In this post, I will call the Twitter APIs from the Postman client to pull live feeds (tweets) from Twitter. The output is JSON text, which you can format or transform to suit your requirements.
Soon I will write another post to demonstrate how you can ingest this data in real time with Kafka and process it using Spark. Alternatively, you can stream and process the data directly in real time with Spark Streaming.
For now, let's connect to the Twitter API using Postman.
Twitter developer account
Postman Client Installation
There are basically two ways to install Postman: you can either add the Postman extension to your browser (Chrome in my case) or install the native Postman application. I installed the native application to write this post.
Step 1. Google "Install Postman" and go to the official Postman site to download the application.
Step 2. On the Postman download page, select your operating system to start the download. Postman is available for Mac, Linux and Windows. The download link keeps changing, so if it doesn't work, just Google it as described in Step 1.
Step 3. Once the installer is downloaded, run it to complete the installation. The application is approximately 250 MB (for Mac).
Step 4. Sign up. After signing in, you can save your preferences right away or do it later, as shown below.
Step 5. Your workspace will look like the one below.
Twitter Developer Account
I hope you all have a Twitter developer account; if not, please create one.
Then go to Twitter Developer (developer.twitter.com) and sign in with your Twitter account. Click Apps > Create an app at the top right corner of your screen.
Note: Earlier, developer.twitter.com was known as apps.twitter.com.
Fill out the form to create an application: specify the Name, Description and Website details as shown below. This screen has changed slightly with the new Twitter developer interface, but the overall process is still similar.
If you have any questions, please feel free to ask in the comment section at the end of this post.
Please provide a properly formatted website URL such as https://example.com, otherwise you will get an error while creating the application. A sample is shown above.
Once you successfully create the app, you will see the page below.
Make sure the access level is set to Read and Write as shown above.
Now go to the Keys and Access Token tab and click Create Access Token.
At this point, you will be able to see the 4 keys which will be used in the Postman client:
Consumer Key (API Key)
Consumer Secret (API Secret)
Access Token
Access Token Secret
The new interface looks like this.
Calling Twitter API with Postman Client
Open the Postman application and click on the Authorization tab.
Select OAuth 1.0 as the authorization type.
Choose to add the authorization data to Request Headers. This is a very important step; otherwise you will get an error.
After setting the authorization type and the request header option, fill out the form carefully with the 4 keys (just copy-paste) which we generated in the Twitter app: Consumer Key (API Key), Consumer Secret (API Secret), Access Token and Access Token Secret.
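Postman builds the OAuth 1.0 Authorization header for you from these 4 keys. If you later want to send the same request from code, here is a minimal sketch of how such a header could be put together with only the Python standard library. It assumes the HMAC-SHA1 signature method and request parameters passed in the query string; the function name build_oauth1_header and the placeholder key values are my own for illustration, not part of any Twitter library. Parameters sent in a form-encoded POST body would also have to be signed, which this sketch does not handle.

```python
import base64
import hashlib
import hmac
import secrets
import time
from urllib.parse import parse_qsl, quote, urlsplit


def percent_encode(value):
    # RFC 3986 encoding: only unreserved characters stay unescaped.
    return quote(str(value), safe="-._~")


def build_oauth1_header(method, url, consumer_key, consumer_secret,
                        token, token_secret, nonce=None, timestamp=None):
    """Return an OAuth 1.0 Authorization header value signed with HMAC-SHA1."""
    oauth_params = {
        "oauth_consumer_key": consumer_key,
        "oauth_nonce": nonce or secrets.token_hex(16),
        "oauth_signature_method": "HMAC-SHA1",
        "oauth_timestamp": str(timestamp or int(time.time())),
        "oauth_token": token,
        "oauth_version": "1.0",
    }

    # The signature covers the query parameters as well as the oauth_* ones.
    parts = urlsplit(url)
    all_params = parse_qsl(parts.query, keep_blank_values=True)
    all_params += list(oauth_params.items())
    encoded = sorted((percent_encode(k), percent_encode(v)) for k, v in all_params)
    param_string = "&".join(f"{k}={v}" for k, v in encoded)

    # Signature base string: METHOD & encoded URL (no query) & encoded params.
    base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"
    base_string = "&".join(
        [method.upper(), percent_encode(base_url), percent_encode(param_string)]
    )
    signing_key = f"{percent_encode(consumer_secret)}&{percent_encode(token_secret)}"

    digest = hmac.new(signing_key.encode(), base_string.encode(), hashlib.sha1).digest()
    oauth_params["oauth_signature"] = base64.b64encode(digest).decode()

    fields = ", ".join(
        f'{percent_encode(k)}="{percent_encode(v)}"'
        for k, v in sorted(oauth_params.items())
    )
    return "OAuth " + fields


# Placeholder credentials: replace them with the 4 keys from your Twitter app.
header = build_oauth1_header(
    "GET",
    "https://api.twitter.com/1.1/statuses/user_timeline.json?screen_name=snap",
    "YOUR_CONSUMER_KEY", "YOUR_CONSUMER_SECRET",
    "YOUR_ACCESS_TOKEN", "YOUR_ACCESS_TOKEN_SECRET",
)
print(header)
```

The resulting string is what goes into the Authorization request header, which is the same thing Postman does when you pick Request Headers.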
Now let's pull the Twitter statuses from the timeline of the user snap.
Copy-paste the request URL https://api.twitter.com/1.1/statuses/user_timeline.json?screen_name=snap as shown below.
You can refer to the API reference index to find the various Twitter APIs.
To GET some tweets, hit the Send button. You will get a response as shown below.
Twitter has very nice API documentation on accounts, users, tweets, media, trends, messages, geo, ads etc., and there is a huge variety of data which you can pull. I am invoking a few APIs just for demonstration purposes.
Accounts and users
Let's say you want to search for the user name "Elon". You can do it like this:
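Here is a minimal sketch of how the request URL for a user search could be built in Python. It assumes the v1.1 users/search.json endpoint with its q parameter; the helper name user_search_url is mine, for illustration only. The URL can be pasted into Postman with the same OAuth 1.0 settings as before.

```python
from urllib.parse import urlencode

API_BASE = "https://api.twitter.com/1.1"


def user_search_url(query):
    # q is the search term; urlencode escapes spaces and special characters.
    return f"{API_BASE}/users/search.json?" + urlencode({"q": query})


print(user_search_url("Elon"))
```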
Now suppose you want to get the friend list of Elon Musk. You can do it like this:
The input user_id is the same as the id in the previous output. You can also change the response display between Pretty, Raw and Preview.
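To make the link between the two calls concrete, here is a small sketch that takes the id from a user search result and builds the friend list request from it. It assumes the v1.1 friends/list.json endpoint with its user_id and count parameters; the sample user record and the helper name friends_list_url are made up for illustration.

```python
from urllib.parse import urlencode

# Made-up record shaped like one entry of a user search response.
sample_user = {"id": 12345, "screen_name": "example_user", "name": "Example User"}


def friends_list_url(user_id, count=20):
    # friends/list.json returns the accounts that user_id follows.
    query = urlencode({"user_id": user_id, "count": count})
    return "https://api.twitter.com/1.1/friends/list.json?" + query


# The id from the previous output becomes the user_id of this request.
print(friends_list_url(sample_user["id"]))
```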
You can pull the top 50 trending global topics with id = 1, for example:
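As a rough sketch, the request and a way to read the topic names out of the response could look like this in Python. It assumes the v1.1 trends/place.json endpoint, where id is a WOEID and 1 stands for worldwide; the sample response is made up and only mimics the shape of the real one, and the helper names are mine.

```python
from urllib.parse import urlencode


def trends_url(woeid=1):
    # WOEID 1 is the "worldwide" location for trends/place.json.
    return "https://api.twitter.com/1.1/trends/place.json?" + urlencode({"id": woeid})


def trend_names(response):
    # The response is a one-element list whose "trends" key holds the topics.
    return [trend["name"] for trend in response[0]["trends"]]


# Made-up sample in the shape of a trends/place.json response.
sample_response = [{
    "trends": [
        {"name": "#Example", "tweet_volume": None},
        {"name": "Sample Topic", "tweet_volume": 1200},
    ],
    "locations": [{"name": "Worldwide", "woeid": 1}],
}]

print(trends_url())
print(trend_names(sample_response))
```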
You can also POST something, just as you would tweet from your Twitter web account. For example, if you want to tweet Hello, you can do it like this:
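Here is a minimal sketch of what such a POST request consists of, assuming the v1.1 statuses/update.json endpoint with its status parameter. The helper name status_update_request is mine; it only assembles the method, URL and form body, and the request still needs the OAuth 1.0 authorization and the Read and Write access level set up earlier.

```python
from urllib.parse import urlencode


def status_update_request(text):
    # Returns (method, url, form_body) for a statuses/update.json call.
    url = "https://api.twitter.com/1.1/statuses/update.json"
    body = urlencode({"status": text})
    return "POST", url, body


method, url, body = status_update_request("Hello")
print(method, url, body)
```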
You can verify the same in your Twitter account. Yeah, that's me! I rarely use Twitter.
Cursoring is used for pagination when you have a large result set. Let's say you want to pull all statuses which mention "Elon"; obviously there will be a good number of tweets, and that response can't fit in one page.
To navigate through the pages, cursoring is needed. For example, if you want to pull 5 results per page, you can do it like this:
Now, to navigate to the next 5 records, you have to use next_results from the search_metadata section of the response, like this:
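The two requests can be sketched in Python as follows. This assumes the v1.1 search/tweets.json endpoint with its q and count parameters, and that next_results comes back as a ready-made query string beginning with "?". The sample response is made up and the helper names are mine.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json"


def first_page_url(query, per_page=5):
    # count sets how many tweets each page returns.
    return SEARCH_URL + "?" + urlencode({"q": query, "count": per_page})


def next_page_url(response):
    # next_results is appended as-is; it is absent when no further page exists.
    next_results = response.get("search_metadata", {}).get("next_results")
    return SEARCH_URL + next_results if next_results else None


# Made-up sample of a response with its search_metadata section.
sample = {
    "statuses": [],
    "search_metadata": {
        "next_results": "?max_id=1000&q=Elon&count=5&include_entities=1"
    },
}

print(first_page_url("Elon"))
print(next_page_url(sample))
```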
To get the next set of results, again use next_results from the search_metadata of this result set, and so on.
Now, obviously you can't do this manually each time. You need to write a loop to get the result sets programmatically, for example:
    cursor = -1
    api_path = "https://api.twitter.com/1.1/endpoint.json?screen_name=targetUser"
    # perform_http_get_request_for_url is a placeholder for your own OAuth-signed GET helper
    while cursor != 0:  # next_cursor is 0 once the last page has been read
        url_with_cursor = api_path + "&cursor=" + str(cursor)
        response_dictionary = perform_http_get_request_for_url(url_with_cursor)
        cursor = response_dictionary["next_cursor"]
In our case, next_results plays the role of next_cursor: a pointer to the next page. The field may differ between endpoints such as tweets, users, accounts and ads, but the logic to loop through each result set is the same.
Refer to this for complete details.
That's it! You have successfully pulled data from Twitter.
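For the search case, the same loop can follow next_results instead of next_cursor. Below is a minimal runnable sketch, not a definitive implementation: the function iter_statuses and the fake_fetch stand-in are hypothetical names of mine, and fake_fetch serves two canned pages so the loop can be tried without calling Twitter. In real use you would pass a function that performs an OAuth-signed GET and returns the parsed JSON.

```python
SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json"


def iter_statuses(fetch, first_query, max_pages=10):
    """Yield tweets page by page, following search_metadata.next_results."""
    query = first_query
    for _ in range(max_pages):  # hard cap so a bug cannot loop forever
        page = fetch(SEARCH_URL + query)
        yield from page.get("statuses", [])
        query = page.get("search_metadata", {}).get("next_results")
        if not query:  # no next_results means this was the last page
            break


# Stand-in for a real OAuth-signed GET: serves two made-up pages.
FAKE_PAGES = {
    SEARCH_URL + "?q=Elon&count=2": {
        "statuses": [{"id": 4}, {"id": 3}],
        "search_metadata": {"next_results": "?max_id=2&q=Elon&count=2"},
    },
    SEARCH_URL + "?max_id=2&q=Elon&count=2": {
        "statuses": [{"id": 2}, {"id": 1}],
        "search_metadata": {},
    },
}


def fake_fetch(url):
    return FAKE_PAGES[url]


ids = [tweet["id"] for tweet in iter_statuses(fake_fetch, "?q=Elon&count=2")]
print(ids)  # [4, 3, 2, 1]
```

Passing the fetch function in as an argument keeps the paging logic separate from authentication, so the loop can be tested with canned data first.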