#806 Unstable pufty nodes in the CentOS CI infra
Closed: Fixed by arrfab. Opened by mrc0mmand.

Since yesterday evening (~10 PM CET) I've been noticing frequent timeouts when accessing the Duffy nodes and when reaching network resources from them, e.g.:

Today @ 09:44 CET

...
09:40:48 Switched to branch 'pr'
09:40:48 man/org.freedesktop.login1.xml
09:40:48 src/login/logind-session-dbus.c
09:40:48 src/login/org.freedesktop.login1.conf
09:40:49 Cloning into 'systemd-centos-ci'...
09:43:51 ssh: connect to host n61.pufty port 22: Connection timed out

Today @ 09:23 CET

...
Warning: Permanently added 'n63.pufty,172.19.3.127' (ECDSA) to the list of known hosts.
...
35 files removed
CentOS Stream 8 - AppStream                     3.6 MB/s |  23 MB     00:06    
CentOS Stream 8 - BaseOS                         59 MB/s |  23 MB     00:00    
CentOS Stream 8 - Extras                        1.7 MB/s |  18 kB     00:00    
CentOS Stream 8 - Extras common packages        419 kB/s | 4.3 kB     00:00    
CentOS Stream 8 - PowerTools                     46 MB/s | 4.7 MB     00:00    
Copr repo for nm-build-deps owned by nmstate    0.0  B/s |   0  B     00:49    
Errors during downloading metadata for repository 'copr:copr.fedorainfracloud.org:nmstate:nm-build-deps':
  - Curl error (7): Couldn't connect to server for https://download.copr.fedorainfracloud.org/results/nmstate/nm-build-deps/epel-8-x86_64/repodata/repomd.xml [Failed to connect to download.copr.fedorainfracloud.org port 443: No route to host]
Error: Failed to download metadata for repo 'copr:copr.fedorainfracloud.org:nmstate:nm-build-deps': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried
Extra Packages for Enterprise Linux 8 - x86_64  0.0  B/s |   0  B     05:51    
Errors during downloading metadata for repository 'epel':
  - Curl error (7): Couldn't connect to server for https://mirrors.fedoraproject.org/metalink?repo=epel-8&arch=x86_64&infra=stock&content=centos [Failed to connect to mirrors.fedoraproject.org port 443: No route to host]
  - Curl error (28): Timeout was reached for https://mirrors.fedoraproject.org/metalink?repo=epel-8&arch=x86_64&infra=stock&content=centos [Connection timed out after 30000 milliseconds]
  - Curl error (7): Couldn't connect to server for https://mirrors.fedoraproject.org/metalink?repo=epel-8&arch=x86_64&infra=stock&content=centos [Failed to connect to mirrors.fedoraproject.org port 443: Connection timed out]
Error: Failed to download metadata for repo 'epel': Cannot prepare internal mirrorlist: Curl error (7): Couldn't connect to server for https://mirrors.fedoraproject.org/metalink?repo=epel-8&arch=x86_64&infra=stock&content=centos [Failed to connect to mirrors.fedoraproject.org port 443: No route to host]

After going through 10+ such reports, I noticed that all of them are from the pufty nodes, so it looks like it's the same issue as https://pagure.io/centos-infra/issue/771.
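
A minimal sketch of how such a tally could be automated, assuming the job logs are available as plain text and that node hostnames follow the `n<number>.<chassis>` form seen in the excerpts above (`n61.pufty`, `n63.pufty`). The function name `chassis_counts` and the regular expression are illustrative only and not part of any CentOS CI tooling:

```python
import re
from collections import Counter

# Assumed hostname shape: node name, a dot, then the chassis name,
# e.g. "n61.pufty". Inferred from the two hosts quoted in this ticket.
NODE_RE = re.compile(r"\b(n\d+)\.([a-z]+)\b")


def chassis_counts(logs):
    """Count, per chassis, how many logs mention at least one of its nodes."""
    counts = Counter()
    for text in logs:
        # Use a set so a chassis is counted once per log, not once per mention.
        chassis_seen = {m.group(2) for m in NODE_RE.finditer(text)}
        counts.update(chassis_seen)
    return counts


if __name__ == "__main__":
    sample_logs = [
        "09:43:51 ssh: connect to host n61.pufty port 22: Connection timed out",
        "Warning: Permanently added 'n63.pufty,172.19.3.127' (ECDSA) "
        "to the list of known hosts.",
    ]
    for chassis, hits in chassis_counts(sample_logs).most_common():
        print(f"{chassis}: {hits} failing job(s)")
```

If nearly all failing jobs land on a single chassis, that points at the chassis rather than at the jobs themselves.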


Metadata Update from @zlopez:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: centos-ci-infra, high-gain, medium-trouble

Metadata Update from @arrfab:
- Issue assigned to arrfab

I confirmed that the pufty chassis had issues, so I'm putting it "aside" for now. I also discovered that the gusty chassis is now in the same state and that I'm unable to remotely attach to its management port, so for that one I'll ask the DC people to do a hardware/cold reset :/

Got confirmation from the DC people that they were able to proceed with a cold reset of the gusty chassis, so we should be good to go for now. Closing.

Metadata Update from @arrfab:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)
