Incoming bill actions now have either <committee> or <committees> nodes
On Dec. 13 I began seeing "committees" instead of "committee" in the actions of 689 bills (in the 116th Congress) in the incoming
GPO bulk data XML files. (Example: [HR 3](https://www.govinfo.gov/bulkdata/BILLSTATUS/116/hr/BILLSTATUS-116hr3.xml).) It looks like
1,897 bills currently have this change. It appears that the data format change described at usgpo/bill-status#147 was published
early, and only in some bills.

Thankfully, our JSON data format for bills already held a list of committee codes, so no change to our data format was needed.

Once GPO has refreshed all of the bulk data files with the new <committees/> format, this code can be simplified to remove the <committee> case.
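
As a rough illustration of the two cases handled here, the following is a minimal standalone sketch, not the code in tasks/bill_info.py itself. The function name `committee_codes` and the sample dicts are hypothetical, the system codes are illustrative sample values, and the sketch assumes the parsed XML always exposes the children of `<committees>` as a list under "item".

```python
def committee_codes(action_item):
    """Return a list of upper-cased committee codes for one action, or None if there are none."""
    if action_item.get('committee'):
        # Older format: a single <committee/> node.
        nodes = [action_item['committee']]
    elif action_item.get('committees'):
        # Newer format: <committees/> wrapping zero or more <item/> nodes.
        # Assumption: "item" is always a list here.
        nodes = action_item['committees'].get('item', [])
    else:
        # Whichever node was present was empty, or neither was present.
        nodes = []

    # Drop the last two characters of the system code and upper-case the rest.
    codes = [node['systemCode'][0:-2].upper() for node in nodes]
    return codes or None


# Hypothetical sample inputs for each shape.
old_style = {'committee': {'systemCode': 'hswm00'}}
new_style = {'committees': {'item': [{'systemCode': 'hswm00'}, {'systemCode': 'hsif00'}]}}
empty = {'committees': None}

print(committee_codes(old_style))  # ['HSWM']
print(committee_codes(new_style))  # ['HSWM', 'HSIF']
print(committee_codes(empty))      # None
```

Once only the `<committees/>` form remains, the first branch in a helper like this could presumably be dropped.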

Fixes #245.
JoshData committed Dec 20, 2019
1 parent 1892eb6 commit 78d3215
Showing 1 changed file with 13 additions and 1 deletion.
14 changes: 13 additions & 1 deletion tasks/bill_info.py
@@ -477,12 +477,24 @@ def action_for(item):

references.append({'type': type, 'reference': reference})

+ # extract committee IDs
+ if item.get('committee'):
+     # Data format through Dec. 13, 2019 had only one <committee/> (though node could be empty).
+     committee_nodes = [item['committee']]
+ elif item.get('committees'):
+     # Starting on Dec. 13, 2019, and with a slow rollout, multiple committees could be specified.
+     # Thankfully our JSON output format allowed it already.
+     committee_nodes = item['committees'].get("item", [])
+ else:
+     # <committee/> or <committees/>, whichever was present, was empty
+     committee_nodes = []
+
  # form dict

  action_dict = {
      'acted_at': acted_at,
      'action_code': item.get('actionCode', ''),
-     'committees': [item['committee']['systemCode'][0:-2].upper()] if item['committee'] else None,
+     'committees': [committee_item['systemCode'][0:-2].upper() for committee_item in committee_nodes] if committee_nodes else None, # if empty, store None
      'references': references,
      'type': 'action', # replaced by parse_bill_action if a regex matches
      'text': text,