Meetup
NB: The meetup pipeline will not work until this issue has been resolved.
Collection of Meetup data. The procedure starts with a single country and Meetup category. All of the groups within the country are discovered, from which all members are subsequently retrieved (no personal information!). In order to build a fuller picture, all other groups to which the members belong are retrieved, which may be in other categories or countries. Finally, all group details are retrieved.
The code should be executed in the following order, which reflects the procedure above:
- country_groups.py
- groups_members.py
- members_groups.py
- groups_details.py
Each script generates a list of dictionaries which can be ingested by the following script.
Country \(\rightarrow\) Groups
Start with a country (and Meetup category) and end up with Meetup groups.
- generate_coords(x0, y0, x1, y1, n)
Generate \(\mathcal{O}\left(\left(\frac{n}{2}\right)^2\right)\) coordinates in the bounding box \((x0, y0), (x1, y1)\), such that overlapping circles of equal radii (centred on each coordinate) entirely cover the area of the bounding box. The longitude and latitude are treated as Euclidean variables, although the radius (derived from the smallest side of the bounding box divided by \(n\)) is calculated correctly. To ensure that the circles fully cover the region, the radius is inflated by an unjustified factor of 10%. Feel free to do the maths and work out a better strategy for covering a geographical area with circles.
The circles (centred on each X) are staggered as follows (a single vertical line or four underscores corresponds to one circle radius):
____X____ ____X____
|
X________X________X
|
____X____ ____X____
This configuration corresponds to \(n=4\).
Parameters: - x0, y0, x1, y1 (float) – Bounding box coordinates (lat/lon)
- n (int) – The divisor used to calculate the Meetup API radius parameter from the smallest side of the country’s shape bbox. This will generate \(\mathcal{O}\left(\left(\frac{n}{2}\right)^2\right)\) separate Meetup API radius searches. The total number of searches scales with the ratio of the bbox sides.
Returns: The radius and coordinates for the Meetup API request
Return type: float, list of tuple
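The staggered layout described above can be sketched as follows. This is a minimal illustration and not the library's implementation: the name `staggered_coords` is hypothetical, the radius is returned in the same units as the coordinates (degrees) rather than being converted to a geographic distance, and the number of points generated is not guaranteed to match that of `generate_coords`.

```python
import math


def staggered_coords(x0, y0, x1, y1, n):
    """Hypothetical sketch of a staggered circle covering of a bounding box.

    Returns (radius, coords), where radius is in coordinate units and
    already includes the 10% safety factor mentioned in the text.
    """
    xmin, xmax = min(x0, x1), max(x0, x1)
    ymin, ymax = min(y0, y1), max(y0, y1)
    width, height = xmax - xmin, ymax - ymin
    if n <= 0 or min(width, height) == 0:
        raise ValueError("n must be positive and the bbox must have non-zero area")

    # Nominal radius: smallest side of the bounding box divided by n
    step = min(width, height) / n

    # Rows are one radius apart; points within a row are two radii apart
    n_rows = math.ceil(height / step) + 1
    n_cols = math.ceil(width / (2 * step)) + 1

    coords = []
    for row in range(n_rows):
        # Every other row is shifted by one radius to stagger the circles
        offset = step if row % 2 else 0.0
        for col in range(n_cols):
            coords.append((xmin + offset + 2 * step * col, ymin + step * row))
    return 1.1 * step, coords
```

With this layout every point of the box lies within one nominal radius of a circle centre, so the extra 10% acts purely as a safety margin in this Euclidean sketch.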
- get_coordinate_data(n)
Generate the radius and coordinate data (see generate_coords) for each shape (country) in the shapefile pointed to by the environment variable WORLD_BORDERS.
Parameters: n (int) – The divisor used to calculate the Meetup API radius parameter from the smallest side of the country’s shape bbox. This will generate \(\mathcal{O}\left(\left(\frac{n}{2}\right)^2\right)\) separate Meetup API radius searches. The total number of searches scales with the ratio of the bbox sides.
Returns: Coordinate and radius data for each country.
Return type: pd.DataFrame
- class MeetupCountryGroups(country_code, coords, radius, category, n=10)
Bases: object
Extract all meetup groups for a given country.
- country_code
ISO2 code
Type: str
- params
GET request parameters, including lat/lon.
Type: dict
- groups
List of meetup groups in this country, assigned after calling get_groups.
Type: list of str
Groups \(\rightarrow\) Members
Start with Meetup groups and end up with Meetup members.
- get_members(params)
Hit the Meetup API for the members of a specified group.
Parameters: params (dict) – https://api.meetup.com/members/ parameters
Returns: Meetup member IDs
Return type: list of str
- get_all_members(group_id, group_urlname, max_results, test=False)
Get all of the Meetup members for a specified group.
Parameters: - group_id (int) – The Meetup ID of the group.
- group_urlname (str) – The URL name of the group.
- max_results (int) – The maximum number of results to return per API query.
- test (bool) – For testing.
Returns: A matchable list of Meetup members
Return type: list of dict
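Retrieving all members with a fixed number of results per query implies paging through the API. The loop below is a minimal sketch of that pattern, assuming a hypothetical `fetch_page(offset, page_size)` callable in place of the real API request; the stopping rule (a page shorter than `max_results`) is also an assumption and not taken from the library.

```python
def collect_all_pages(fetch_page, max_results):
    """Hypothetical pagination loop: keep requesting pages until one comes back short.

    fetch_page(offset, page_size) is assumed to return a list of records
    for the zero-based page number `offset`.
    """
    results = []
    offset = 0
    while True:
        page = fetch_page(offset, max_results)
        results.extend(page)
        # A short (or empty) page means there is nothing left to fetch
        if len(page) < max_results:
            break
        offset += 1
    return results


def make_fake_api(n_members):
    """Build an in-memory stand-in for the API, for demonstration only."""
    members = [{"member_id": i} for i in range(n_members)]

    def fetch_page(offset, page_size):
        start = offset * page_size
        return members[start:start + page_size]

    return fetch_page
```

For example, `collect_all_pages(make_fake_api(7), max_results=3)` issues three requests and returns all seven records.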
Members \(\rightarrow\) Groups
Start with Meetup members and end up with Meetup groups.
- exception NoMemberFound(member_id)
Bases: Exception
Raised when no member is found by the Meetup API.
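As an illustration of how an exception of this kind might be defined and raised, here is a minimal sketch. The class `MemberNotFoundSketch` and the helper `lookup_member` are hypothetical stand-ins; the message format and the stored attribute are assumptions, not the library's behaviour.

```python
class MemberNotFoundSketch(Exception):
    """Hypothetical stand-in for an exception such as NoMemberFound."""

    def __init__(self, member_id):
        super().__init__(f"No member found with ID {member_id}")
        # Keep the ID so that callers can log or skip the missing member
        self.member_id = member_id


def lookup_member(member_id, known_members):
    """Return the record for member_id, or raise if it is unknown."""
    try:
        return known_members[member_id]
    except KeyError:
        raise MemberNotFoundSketch(member_id) from None
```

A caller can then catch the exception and continue with the next member instead of aborting the whole collection run.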
Groups \(\rightarrow\) Group details
Start with Meetup groups and end up with Meetup group details.
- exception NoGroupFound(group_urlname)
Bases: Exception
Raised when no group is found by the Meetup API.
- get_group_details(group_urlname, max_results, avoid_exception=True)
Hit the Meetup API for the details of a specified group.
Parameters: - group_urlname (str) – A Meetup group urlname.
- max_results (int) – Total number of results to return per API request.
Returns: Meetup API response data
Return type: list of dict
Utils
Common tools between the different data collection points.
- get_api_key()
Get a random API key from those listed in the environment variable MEETUP_API_KEYS.
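A random key can be drawn from an environment variable along the following lines. This is a sketch only: the function name `pick_api_key` is hypothetical, and the comma separator is an assumption, since the format of MEETUP_API_KEYS is not specified here.

```python
import os
import random


def pick_api_key(env_var="MEETUP_API_KEYS", sep=","):
    """Hypothetical sketch: choose one key at random from a delimited list.

    Assumes the environment variable holds keys separated by `sep`.
    """
    raw = os.environ[env_var]
    keys = [key.strip() for key in raw.split(sep) if key.strip()]
    if not keys:
        raise ValueError(f"No API keys found in {env_var}")
    return random.choice(keys)
```

Rotating between several keys at random spreads requests across them, which can help when each key is rate limited separately.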
- save_sample(json_data, filename, k)
Dump a sample of k items from row-oriented JSON data json_data into a file with name filename.
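Sampling rows and writing them to disk can be sketched with the standard library. The name `save_sample_sketch` is hypothetical, and capping `k` at the number of available rows is an assumption made here rather than documented behaviour.

```python
import json
import random


def save_sample_sketch(json_data, filename, k):
    """Hypothetical sketch: write k randomly chosen rows of json_data to filename."""
    # Assumption: never ask for more rows than exist
    sample = random.sample(json_data, min(k, len(json_data)))
    with open(filename, "w") as f:
        json.dump(sample, f)
```

A small sample file of this kind is convenient for inspecting the shape of the data without loading the full output.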
- flatten_data(list_json_data, keys, **kwargs)
Flatten nested JSON data from a list of JSON objects, by a list of desired keys. Each element in keys may also be an ordered iterable of keys, such that subsequent keys describe a path through the JSON to the desired value. For example, in order to extract key1 and key3 from:
{'key1': <some_value>, 'key2': {'key3': <some_value>}}
one would specify keys as:
['key1', ('key2', 'key3')]
Parameters: - list_json_data (json) – Row-orientated JSON data.
- keys (list) – Mixed list of either: individual str keys for data values which are not nested; or sublists of str, as described above.
- **kwargs – Any constants to include in every flattened row of the output.
Returns: Flattened row-orientated JSON data.
Return type: json
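The key-path idea can be illustrated with a short sketch. The function `flatten_rows` is hypothetical, and two details are assumptions made for the example: nested paths are named by joining their keys with an underscore, and missing values are filled with None. The library may behave differently on both points.

```python
def flatten_rows(list_json_data, keys, **kwargs):
    """Hypothetical sketch: flatten each row by following plain keys or key paths."""
    flattened = []
    for row in list_json_data:
        # Start each output row with the constants passed as keyword arguments
        flat = dict(kwargs)
        for key in keys:
            # Treat a plain string as a path of length one
            path = (key,) if isinstance(key, str) else tuple(key)
            value = row
            for part in path:
                value = value.get(part) if isinstance(value, dict) else None
                if value is None:
                    break
            # Assumption: name the output column after the joined path
            flat["_".join(path)] = value
        flattened.append(flat)
    return flattened
```

Called with `keys=['key1', ('key2', 'key3')]` and `country='GB'`, each output row holds `key1`, `key2_key3` and the constant `country`.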
- get_members_by_percentile(engine, perc=10)
Get the number of meetup group members for a given percentile from the database.
Parameters: - engine – A SQLAlchemy connectable.
- perc (int) – A percentile to evaluate.
Returns: The number of members corresponding to this percentile.
Return type: members (float)
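To make the percentile calculation concrete, the sketch below evaluates a percentile over an in-memory list of member counts. It does not query a database, the name `percentile_sketch` is hypothetical, and linear interpolation between neighbouring values is an assumption about the method used.

```python
def percentile_sketch(values, perc):
    """Hypothetical sketch: percentile of values with linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= perc <= 100:
        raise ValueError("perc must be between 0 and 100")
    ordered = sorted(values)
    # Fractional position of the percentile within the sorted values
    rank = (len(ordered) - 1) * perc / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return float(ordered[lower] + (ordered[upper] - ordered[lower]) * fraction)
```

With group sizes `[10, 20, 30, 40, 50]`, the 50th percentile evaluates to `30.0`.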
- get_core_topics(engine, core_categories, members_limit, perc=99)
Get the most frequent topics from a selection of meetup categories, from the database.
Parameters: - engine – A SQLAlchemy connectable.
- core_categories (list) – A list of category_shortnames.
- members_limit (int) – Minimum number of members required in a group for it to be considered.
- perc (int) – A percentile to evaluate the most frequent topics.
Returns: The set of most frequent topics.
Return type: topics (set)
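One possible reading of this selection is sketched below on in-memory data. Everything here is an assumption for illustration: the function name `core_topics_sketch`, the record fields `members` and `topics`, and the rule that a topic is kept when its frequency reaches the given percentile (nearest-rank) of all topic frequencies. The library itself works from the database and may apply a different rule.

```python
import math
from collections import Counter


def core_topics_sketch(groups, core_categories, members_limit, perc=99):
    """Hypothetical sketch: most frequent topics among sufficiently large core groups."""
    counts = Counter(
        topic
        for group in groups
        if group["category_shortname"] in core_categories
        and group["members"] >= members_limit
        for topic in group["topics"]
    )
    if not counts:
        return set()
    ordered = sorted(counts.values())
    # Nearest-rank percentile of the topic frequencies
    index = max(0, math.ceil(perc / 100 * len(ordered)) - 1)
    threshold = ordered[index]
    return {topic for topic, count in counts.items() if count >= threshold}
```

Lowering `perc` relaxes the threshold, so more topics are returned.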