Description
This issue was created to separate the two issues that are being reported in issue #27, to make it easier to manage ongoing issues in this project.
Summary:
@DEQrporter (Ryan Porter) has reported that the following call in pyaqsapi takes much longer than the equivalent call in RAQSAPI:
df = pd.DataFrame(aqs.bystate.monitors("88101", datetime.date(1950, 1, 1), datetime.date(2024, 12, 31), "41"))
@mccroweyclinton-EPA responded with the following probable cause as to why this call is taking so long to finish in pyaqsapi:
It is easy to see why your call in pyaqsapi takes so much longer than it does in RAQSAPI: you are requesting 74 years of data. I have never requested that many years of data in a single call during all of my testing. In fact, it is probably better not to overload the AQS DataMart API server with a request spanning such a large time frame in a single call.
Anyway, to address the issue at hand, pyaqsapi has a function __aqs_ratelimit() that is called after each API call to the server; each year of requested data requires 1 API call. By default __aqs__ratelimit makes a 5 second call to time.sleep() as a simple pause between API calls. So 74 years * 5 second delay/year results in a 370 second delay added by the rate limit, and 370 seconds ≈ 6.17 minutes. The equivalent call in pyaqsapi therefore adds at least 6.17 minutes of delay (minus the throttling action in RAQSAPI). This delay does not account for the time that it takes the API server to return data, which in itself can take quite some time. The rate limit in RAQSAPI is managed differently because the R package httr2 has internal functions that handle rate limiting. It seems that I may have to implement the rate limit differently in pyaqsapi so that API calls are on par with what is achievable in RAQSAPI. Hopefully it can be done in a way that is similar to how the httr2::req_throttle function does it in R.
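To illustrate the difference discussed above, here is a minimal Python sketch, not the actual pyaqsapi or httr2 implementation. It contrasts the overhead of an unconditional 5 second sleep per call with a simple minimum-interval throttle that only sleeps for whatever part of the interval has not already elapsed (for example, while waiting on the server). The names MinIntervalThrottle and fixed_sleep_overhead are hypothetical, and the minimum-interval approach is an assumption about one possible design; httr2::req_throttle may work differently internally.

```python
import time


def fixed_sleep_overhead(n_calls, delay=5.0):
    """Total seconds added by an unconditional sleep after every API call."""
    return n_calls * delay


class MinIntervalThrottle:
    """Hypothetical helper: keep at least `min_interval` seconds between calls.

    Unlike a fixed time.sleep(), time already spent elsewhere (such as
    waiting for the server response) counts toward the interval.
    """

    def __init__(self, min_interval=5.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the behaviour can be tested
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Sleep only as long as needed; return the number of seconds slept."""
        now = self._clock()
        slept = 0.0
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
                now = self._clock()
        self._last_call = now
        return slept


if __name__ == "__main__":
    # 74 yearly calls with a fixed 5 second pause each
    print(fixed_sleep_overhead(74), "seconds of added delay")
```

In this sketch, wait() would be called immediately before each API request. If the server itself takes longer than the minimum interval to respond, the throttle adds no extra delay at all, whereas the fixed sleep always adds the full 5 seconds per call.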