Just loading the Jenkins page, I sometimes get "504 Gateway Time-out".
Also, jobs take so long to run each step that they time out.
https://osci-jenkins-2.ci.fedoraproject.org/job/fedora-scratch-build-pipeline/
instance: https://osci-jenkins-2.ci.fedoraproject.org
Metadata Update from @siddharthvipul1: - Issue assigned to dkirwan - Issue tagged with: centos-ci-infra, groomed, high-gain
I couldn't find anything in the Jenkins log that could explain the problem.
Updating the ticket with the information from the IRC chat.
(16:40:18) siddharthvipul: hmm, so from what I understand, that's a NFS export issue (we have a fix but that requires an outage that we are planning)
(09:36:21) siddharthvipul: jbair, bgoncalv hey, sorry nowadays I have started closing my laptop at hard stop of 10pm :) I was away
(09:36:40) siddharthvipul: and re: the outage.. fabian is in discussion with the person in datacenter and they need to sync
(09:37:05) siddharthvipul: It would be somewhere in the last week of September (graceful shutdown of OCP4 would be needed)
(15:09:43) siddharthvipul: bgoncalv, hey, sorry I was away for some time.. update would be that we need to upgrade hardware and for that there would be a need of outage.. It's likely to be scheduled at the end of this month.. regarding the issue itself, it's a limitation of network band. From the monitoring it doesn't look too bad but it's a collective of all other jobs as well and I am not sure if there is something passing the openshift monitoring
(15:10:23) siddharthvipul: on Monday "out openshift expert" will be back and will be assigned to work on it (while I work on #4) :)
Now Jenkins seems to be completely down:
Application is not available
On the OpenShift pod I see events like:
Pod: osci-jenkins-2-b894b756-7jd74, Namespace: fedora-ci-jenkins-prod
4 minutes ago, generated from kubelet on kempty-n9.ci.centos.org, 1478 times in the last 6 days
Readiness probe failed: Get http://10.128.2.189:8080/login: dial tcp 10.128.2.189:8080: connect: connection refused

Pod: osci-jenkins-2-b894b756-7jd74, Namespace: fedora-ci-jenkins-prod
Sep 15, 4:14 pm, generated from kubelet on kempty-n9.ci.centos.org, 15 times in the last 2 days
Readiness probe failed: HTTP probe failed with statuscode: 503

Pod: osci-jenkins-2-b894b756-7jd74, Namespace: fedora-ci-jenkins-prod
Sep 15, 4:09 pm, generated from kubelet on kempty-n9.ci.centos.org, 2104 times in the last 2 days
Back-off restarting failed container
We think we understand the issue; see https://pagure.io/centos-infra/issue/53#comment-686574 for more information on when we expect it to be resolved.
Metadata Update from @dkirwan: - Issue marked as depending on: #53
Metadata Update from @dkirwan: - Issue untagged with: groomed - Issue priority set to: None (was: Needs Review) - Issue tagged with: medium-trouble
Should be resolved now.
Metadata Update from @dkirwan: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)