We are randomly hitting the error below in our CI jobs, for example [1][2]:
Failed to allocate nodes: Rate limit exceeded: 6 sessions in the last 5 minutes
Was rate limiting imposed recently, or was it always there and only became visible to end users with [3]? We are currently using "cico node get --arch x86_64 --release 8-stream --count 1 --retry-count 6 --retry-interval 60 -f csv". What is the recommended retry-interval to avoid the rate limit?
[1] https://jenkins-cloudsig-ci.apps.ocp.ci.centos.org/job/tripleo-quickstart-promote-wallaby-current-tripleo-delorean-minimal/69/console
[2] https://jenkins-cloudsig-ci.apps.ocp.ci.centos.org/job/tripleo-quickstart-promote-train-current-tripleo-delorean-minimal/43/console
[3] https://github.com/CentOS/python-cicoclient/pull/26/commits/f22f8d1a
The PR you mentioned was merged so that, instead of just seeing "string indices must be integers", users get the actual Duffy API answer. So the limit was always there, but it was sometimes not returned through cicoclient (it was returned if you used a plain API call).
afaics you're retrying 6 times and waiting 60s between attempts, so if previous (still running) jobs are still consuming Duffy nodes, you should probably increase that?
What's the average time needed for your test[s]? That should give you a hint. Or just put some logic in your Jenkins so that jobs are queued and only [x] parallel jobs are allowed on the executor.
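To illustrate the "increase the wait" suggestion, here is a minimal Python sketch that retries with a growing delay instead of a fixed 60s. It is only an illustration, not part of python-cicoclient: the helper names (`run_cico`, `get_node`) and the delay values are made up for this example, and it assumes the rate-limit text from the console log shows up somewhere in the command's stdout or stderr. The cico flags are the ones quoted in the ticket, minus the built-in retry options.

```python
import subprocess
import time

# Command copied from the ticket, without cico's own retry flags;
# retries are handled by the wrapper below instead.
CICO_CMD = ["cico", "node", "get", "--arch", "x86_64",
            "--release", "8-stream", "--count", "1", "-f", "csv"]

# Text seen in the CI console output when the limit is hit.
RATE_LIMIT_MARKER = "Rate limit exceeded"


def run_cico():
    """Run the cico command and return its combined stdout and stderr."""
    proc = subprocess.run(CICO_CMD, capture_output=True, text=True)
    return proc.stdout + proc.stderr


def get_node(run=run_cico, attempts=6, base_delay=60, max_delay=600,
             sleep=time.sleep):
    """Call run() until its output no longer reports a rate limit.

    The wait doubles after each rate-limited attempt (60s, 120s, 240s, ...),
    capped at max_delay. Returns the output of the first call that was not
    rate limited, or raises RuntimeError if every attempt was.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        output = run()
        if RATE_LIMIT_MARKER not in output:
            return output
        if attempt < attempts:
            sleep(delay)
            delay = min(delay * 2, max_delay)
    raise RuntimeError(f"still rate limited after {attempts} attempts")
```

Note that this only looks for the rate-limit message; any other failure output is returned to the caller as-is and would need its own handling.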
What's the rate limit, btw? How many requests are allowed per minute, so that we can at least stay under it and avoid the rate limit error? Or does that error mean the retry limit was exceeded?
I checked: we currently have concurrency set to 15. The average time for most jobs is 1.5 hours, except for one job which takes longer, 3-4 hours. Yes, we can lower the concurrency, but we need to see what a better setting would be in this case.
I think nothing changed, but it's worth confirming with @siddharthvipul1. The code should still be https://github.com/CentOS/duffy/blob/stale/master/duffy/api_v1/views.py#L57
So that would mean you can't have more than 5 requests/sessions (still active) in the last 5 minutes. You only recently started getting the real message instead of the wrong one because python-cicoclient merged this PR https://github.com/CentOS/python-cicoclient/commit/f22f8d1accbae87e3b4fca7bb71427f4c47647ca and it made it into the 0.4.7 release.
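To get a feel for what "5 sessions in the last 5 minutes" means with 15 concurrent jobs, here is a rough Python model of such a sliding-window limit. This is a simplified sketch and not the actual Duffy code: it assumes that only accepted requests count towards the limit and that a session stops counting exactly 300 seconds after it was created. The function name `simulate` and the sample timings are made up for the example.

```python
from collections import deque

LIMIT = 5      # sessions allowed per window, as described above
WINDOW = 300   # window length in seconds (5 minutes)


def simulate(request_times, limit=LIMIT, window=WINDOW):
    """Return the request times that would be rejected.

    A request is accepted only if fewer than `limit` accepted requests
    fall within the preceding `window` seconds.
    """
    recent = deque()   # timestamps of accepted requests still in the window
    rejected = []
    for t in sorted(request_times):
        # Drop accepted requests that have aged out of the window.
        while recent and recent[0] <= t - window:
            recent.popleft()
        if len(recent) < limit:
            recent.append(t)
        else:
            rejected.append(t)
    return rejected


# 15 jobs all asking for a node at the same moment: only 5 get through.
burst = [0] * 15
print(len(simulate(burst)))      # 10

# The same 15 jobs staggered 60 seconds apart: none are rejected.
staggered = [i * 60 for i in range(15)]
print(len(simulate(staggered)))  # 0
```

Under these assumptions, spacing node requests about 60 seconds apart (5 per 300 seconds) stays within the limit, which is one way to pick a starting point for staggering or lowering the concurrency.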
Metadata Update from @arrfab: - Issue tagged with: need-more-info
No feedback on this ticket, but normally we can close it, as the behaviour is the same as before; it's just that newer python-cicoclient now outputs the error from the Duffy API. @ykarel so safe to close?
Thanks @arrfab for the explanation, somehow I missed that. Yes, this can be closed.
Metadata Update from @ykarel: - Issue close_status updated to: Invalid - Issue status updated to: Closed (was: Open)