Using the Facebook API to Extract Group Discussion
The first step of my sulfite allergy data project is to get the data, which consists of posts and discussions made in a Facebook sulfite allergy support group since 2012.
Get a token
In my past experience working with APIs, they often had a developer key: an alphanumeric code you signed up to receive and included in every request. This lets the site track how often each user makes requests and limit them. Signing up was a simple "here's my e-mail," and they sent you a key.
Facebook is a little more complicated. You have to register your account as a developer, and create an app.
Tangent: I always thought it was weird which things required me to give an "app" permission. Why do I need an app to figure out which Golden Girl I would be? It's just a link! But I guess that's where they get your data, though I'm not sure what they actually learn from having my age and gender.
Once that's done, you can use the FB Graph API Explorer, found under the Tools menu on the app's dashboard, to request a token and try different API calls to check that they return what you need before you write any code.
To get group posts, type /{group_id}/feed into the Graph API Explorer's "Get" bar, where group_id is the number you see in the address bar while visiting the group's page (leave out the braces).

A small batch of results is returned below the bar.
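Since only a small batch comes back per request, here is a minimal sketch of how the same call might look outside the Explorer, and how you could walk through the remaining batches. It assumes the response has the usual Graph API shape (a "data" list plus a "paging" object whose "next" field holds the URL of the following page). The function names, the "v3.2" version string, and the fetch_json callable are placeholders of mine, not anything Facebook provides.

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.facebook.com"  # Graph API host


def build_feed_url(group_id, access_token, version="v3.2", limit=25):
    """Build the URL for the /{group_id}/feed edge.

    The version and limit defaults are assumptions; adjust them for your app.
    """
    query = urlencode({"access_token": access_token, "limit": limit})
    return f"{GRAPH_BASE}/{version}/{group_id}/feed?{query}"


def iter_posts(first_url, fetch_json):
    """Yield every post, following paging.next until there are no more pages.

    fetch_json is any callable that takes a URL and returns the decoded JSON
    response as a dict, e.g. lambda u: requests.get(u).json().
    """
    url = first_url
    while url:
        page = fetch_json(url)
        yield from page.get("data", [])
        # A missing "next" means this was the last page.
        url = page.get("paging", {}).get("next")
```

Passing the fetcher in as an argument keeps the paging logic separate from the HTTP library, so it can be tried out on canned responses before spending any real API calls.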
This works if you're an admin of the group you are extracting data from, but if you want an app that others can use for any group they are in, you first have to submit the app for review (a post-Cambridge Analytica requirement).
Note: You may need to go to developers.facebook.com/tools/accesstoken and click 'debug' to extend the token past a couple of hours.
Filter for dates, retrieve comments
Well, this seems simple enough! Except that when it shows a feed, it only shows posts, not their comments. This is because Facebook stores its data as a graph of nodes, which limits how much data it has to access to load specific things (for instance, taking you straight to the replies to your comment instead of loading the original post and all of its comments). That makes page loads faster, but it's cumbersome when you're trying to retrieve everything. To retrieve the comments and all of their replies, I'll have to run another API call with the object ID for each comment. Ugh. First I'll retrieve the posts, then assess how best to get the comments while working within Facebook's limit of 200 calls an hour.
I also need to filter by date, which I can do simply by adding since/until to the end of the request:
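As a rough sketch of that plan: with a budget of 200 calls an hour, spacing calls evenly works out to one every 18 seconds. The helper below paces one comments request per post. fetch_comments is a hypothetical callable of mine (it would wrap a GET on /{post_id}/comments), and even spacing is just one simple way to stay under the limit, not something Facebook prescribes.

```python
import time


def seconds_between_calls(calls_per_hour=200):
    """Smallest even spacing that stays within the hourly call budget."""
    return 3600 / calls_per_hour


def fetch_all_comments(post_ids, fetch_comments, delay=None, sleep=time.sleep):
    """Return {post_id: [comments]}, making one paced call per post.

    fetch_comments is a hypothetical callable: post_id -> list of comment
    dicts. sleep is injectable so the pacing can be tested without waiting.
    """
    if delay is None:
        delay = seconds_between_calls()
    comments = {}
    for i, post_id in enumerate(post_ids):
        if i:  # no need to wait before the very first call
            sleep(delay)
        comments[post_id] = fetch_comments(post_id)
    return comments
```

Replies to a comment would need the same treatment one level down, using each comment's own ID, so the number of calls grows quickly with busy threads.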
.since(01/01/2019).until(01/27/2019)
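When the request is built in code rather than typed into the Explorer, the same date range can be passed as since and until query parameters, which the Graph API uses for time-based paging. A small sketch follows. The helper names are mine, and treating each date as midnight UTC is an assumption you may want to change for your own time zone.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def to_unix(date_string):
    """Convert an MM/DD/YYYY date to a Unix timestamp at midnight UTC."""
    day = datetime.strptime(date_string, "%m/%d/%Y").replace(tzinfo=timezone.utc)
    return int(day.timestamp())


def date_filter_query(since, until):
    """Return a since/until query string to append to a feed request."""
    return urlencode({"since": to_unix(since), "until": to_unix(until)})


# Same range as above: January 1 through January 27, 2019
print(date_filter_query("01/01/2019", "01/27/2019"))
```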
The next step is using the facebook-sdk library for Python, which I'll talk about in the next entry.