I then spent a few more hours poking through more data and testing out some APIs. One thing I realized is that it’s impractical for me to pull all my data from an API. The only case in which I might use an API is if I attempt to add a real-time element to my viz, in which case I’ll use Sunlight’s new Real Time Congress API.
As for my other data sources, I’ve decided to pull the dumps from Transparency Data and Sunlight and then modify those CSV files to suit my needs. I will then put those into a database, most likely querying the data with SQL from PHP, transforming it into JSON, and then plugging it into my Processing code. As for historical data, if I decide to go that route, it appears I have no choice but to use GovTrack.us, but that’s a huge amount (16 GB!) of XML data, so I will probably try to avoid it if possible.
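To make the pipeline concrete, here is a minimal sketch of the CSV-dump-to-database-to-JSON step. The real build will use PHP and a SQL database as described above; this is just a Python/SQLite stand-in for the same idea, and the `contributions` table and its column names are hypothetical, not Transparency Data’s actual schema.

```python
import csv
import io
import json
import sqlite3

# A stand-in for a (modified) CSV dump from Transparency Data.
# The recipient/amount columns are hypothetical, for illustration only.
csv_dump = io.StringIO(
    "recipient,amount\n"
    "Smith,500\n"
    "Jones,250\n"
    "Smith,1000\n"
)

# Load the CSV rows into a database table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contributions (recipient TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO contributions VALUES (:recipient, :amount)",
    csv.DictReader(csv_dump),
)

# Aggregate in SQL, then serialize as JSON for the front-end viz to consume.
rows = conn.execute(
    "SELECT recipient, SUM(amount) AS total FROM contributions "
    "GROUP BY recipient ORDER BY total DESC"
).fetchall()
payload = json.dumps([{"recipient": r, "total": t} for r, t in rows])
print(payload)
```

In the actual stack, the last step would be a PHP script emitting that JSON for the Processing front end to load.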
One nice thing about this approach is that I will be able to use a “two-pronged attack” going forward. I will spend half my time getting the back-end set up, massaging data, etc. The other half of my time will be spent working on the front-end interface, using placeholder data. Then, sometime soon, I will have the front-end and back-end “meet in the middle.” After tying the front- and back-end together, I can move on to testing, evaluation, and refinement.