#782 [cbs] failing to create listening socket on 127.0.0.1 during build
Closed: Fixed with Explanation by arrfab. Opened by chantra.

I am trying to rebuild librepo-1.14.2-1.el8 for hyperscale SIG and it is failing while running unittests, more specifically when trying to connect to a mock server. Those build work just fine when using mock on my laptop.

It seems that on cbs.centos.org the tests are not able to set up a server while they could on koji.mbox.centos.org (ast leasst back in 2021-11-24 when librepo-1.14.2-1.el8 was built).

The merge build that is failing is https://cbs.centos.org/koji/taskinfo?taskID=2811734

the specific failure in build.log:

Test project /builddir/build/BUILD/librepo-1.14.2/x86_64-redhat-linux-gnu
    Start 1: test_main
    Start 2: test_python
1/2 Test #1: test_main ........................   Passed    0.15 sec
2/2 Test #2: test_python ......................***Failed   15.84 sec
.
(process:705): librepo-WARNING **: 16:47:56.664: WARNING! Deprecated LRO_MIRRORLIST used
(process:705): librepo-WARNING **: 16:47:56.664: WARNING! Deprecated LRO_MIRRORLIST used
(process:705): librepo-WARNING **: 16:47:56.664: WARNING! Deprecated LRO_MIRRORLIST used
/builddir/build/BUILD/librepo-1.14.2/x86_64-redhat-linux-gnu/librepo/python/librepo/__init__.py:1505: DeprecationWarning: Using string value for LRO_URLS is deprecated, use list of strings instead
  "use list of strings instead", DeprecationWarning)
.
(process:705): librepo-WARNING **: 16:47:56.665: WARNING! Deprecated LRO_MIRRORLIST used
(process:705): librepo-WARNING **: 16:47:56.665: WARNING! Deprecated LRO_MIRRORLIST used
(process:705): librepo-WARNING **: 16:47:56.665: WARNING! Deprecated LRO_MIRRORLIST used
........EEE/builddir/build/BUILD/librepo-1.14.2/tests/python/tests/test_yum_repo_locating.py:28: ResourceWarning: unclosed file <_io.BufferedReader name='/builddir/build/BUILD/librepo-1.14.2/tests/test_data/key.pub'>
  self.ctx.op_import(open(PUB_KEY, 'rb'))
.......
(process:705): librepo-WARNING **: 16:48:12.231: WARNING! Deprecated LRO_MIRRORLIST used
..
(process:705): librepo-WARNING **: 16:48:12.267: WARNING! Deprecated LRO_MIRRORLIST used
.
======================================================================
ERROR: setUpClass (tests.test_yum_package_downloading.TestCaseYumPackageDownloading)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builddir/build/BUILD/librepo-1.14.2/tests/python/tests/base.py", line 94, in setUpClass
    raise Exception("Server didn't start even after 5 seconds.")
Exception: Server didn't start even after 5 seconds.
======================================================================
ERROR: setUpClass (tests.test_yum_package_downloading.TestCaseYumPackagesDownloading)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builddir/build/BUILD/librepo-1.14.2/tests/python/tests/base.py", line 94, in setUpClass
    raise Exception("Server didn't start even after 5 seconds.")
Exception: Server didn't start even after 5 seconds.
======================================================================
ERROR: setUpClass (tests.test_yum_repo_downloading.TestCaseYumRepoDownloading)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builddir/build/BUILD/librepo-1.14.2/tests/python/tests/base.py", line 94, in setUpClass
    raise Exception("Server didn't start even after 5 seconds.")
Exception: Server didn't start even after 5 seconds.
----------------------------------------------------------------------
Ran 20 tests in 15.604s
FAILED (errors=3)
50% tests passed, 1 tests failed out of 2
Total Test time (real) =  15.85 sec
The following tests FAILED:
      2 - test_python (Failed)
RPM build errors:
Errors while running CTest
error: Bad exit status from /var/tmp/rpm-tmp.CWkjKu (%check)
    Bad exit status from /var/tmp/rpm-tmp.CWkjKu (%check)
Child return code was: 1
EXCEPTION: [Error()]
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/mockbuild/trace_decorator.py", line 93, in trace
    result = func(*args, **kw)
  File "/usr/lib/python3.6/site-packages/mockbuild/util.py", line 600, in do_with_status
    raise exception.Error("Command failed: \n # %s\n%s" % (command, output), child.returncode)
mockbuild.exception.Error: Command failed: 
 # /usr/bin/systemd-nspawn -q -M 7b7cd6fbe8d24e0eaf9813a59574977b -D /var/lib/mock/hyperscale8s-packages-experimental-el8-build-138797-914754/root -a -u mockbuild --capability=cap_ipc_lock --rlimit=RLIMIT_NOFILE=10245 --bind=/tmp/mock-resolv.t99q2enp:/etc/resolv.conf --bind=/dev/loop-control --bind=/dev/loop0 --bind=/dev/loop1 --bind=/dev/loop2 --bind=/dev/loop3 --bind=/dev/loop4 --bind=/dev/loop5 --bind=/dev/loop6 --bind=/dev/loop7 --bind=/dev/loop8 --bind=/dev/loop9 --bind=/dev/loop10 --bind=/dev/loop11 --setenv=TERM=vt100 --setenv=SHELL=/bin/bash --setenv=HOME=/builddir --setenv=HOSTNAME=mock --setenv=PATH=/usr/bin:/bin:/usr/sbin:/sbin --setenv=PROMPT_COMMAND=printf "\033]0;<mock-chroot>\007" --setenv=PS1=<mock-chroot> \s-\v\$  --setenv=LANG=C.UTF-8 --setenv=http_proxy=http://cache.rdu2.centos.org:8080 --setenv=https_proxy=http://cache.rdu2.centos.org:8080 --resolv-conf=off bash --login -c /usr/bin/rpmbuild -bb --target x86_64 --nodeps /builddir/build/SPECS/librepo.spec

To rule out the merge being the source of the issue, I tried to rebuild using c8s commit: https://cbs.centos.org/koji/taskinfo?taskID=2811762 and the same issue happens, so I am kind of leaning toward some difference between koji.mbox and cbs.


Link to the code failing to listen: https://github.com/rpm-software-management/librepo/blob/3b9a8d17188ba602d139f79b6e61305030f21109/tests/python/tests/base.py#L78

Metadata Update from @arrfab:
- Issue tagged with: cbs, medium-gain, medium-trouble

@arrfab any idea what could be the issue? I don't have access to CentOS builders so I can't check if this would also happen there, and hence would be a rather generic problem, or if it is isolated to CBS.

myself have no time to investigate this at the moment but : when you say that you're able to rebuild locally, is that using systemd-nspawn or legacy isolation type ?
koji.mbox.centos.org builders are centos 7 but all cbs.centos.org were migrated to centos stream 8 (that's a difference) and maybe we can force old isolation/chroot type to see if that fixes the issue for you ?

Thanks @arrfab

I believe mock on F36 uses systemd-nspawn by default but I will go and
verify that.
As much as it should not have network access. It should be able to bind to
loop back.

On Mon, May 23, 2022 at 10:58 PM Fabian Arrotin pagure@pagure.io wrote:

arrfab added a new comment to an issue you are following:
myself have no time to investigate this at the moment but : when you say that you're able to rebuild locally, is that using systemd-nspawn or legacy isolation type ? koji.mbox.centos.org builders are centos 7 but all cbs.centos.org were migrated to centos stream 8 (that's a difference) and maybe we can force old isolation/chroot type to see if that fixes the issue for you ?

To reply, visit the link below or just reply to this email
https://pagure.io/centos-infra/issue/782

@arrfab

TL;DR it seems http_proxy env var have been introduced into the build steps env variable, which is causing connection to 127.0.0.1 to go over the proxy which fails (and the proxy itself is not reachable either).

Longer debugging logs below.

ok, so the reason it fails is because the build env has http{,s}_proxy env set up. See first lines of https://cbs.centos.org/kojifiles/work/tasks/8605/2828605/build.log .
When the tests are performed against 127.0.0.1, the proxy is used (see end of https://cbs.centos.org/kojifiles/work/tasks/8605/2828605/build.log):

ERROR:root:Connecting to http://127.0.0.1:5000/
ERROR:root:Connection failed
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 162, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw)
  File "/usr/lib/python3.6/site-packages/urllib3/util/connection.py", line 57, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/lib64/python3.6/socket.py", line 745, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 354, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib64/python3.6/http/client.py", line 1273, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib64/python3.6/http/client.py", line 1319, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib64/python3.6/http/client.py", line 1268, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib64/python3.6/http/client.py", line 1044, in _send_output
    self.send(msg)
  File "/usr/lib64/python3.6/http/client.py", line 982, in send
    self.connect()
  File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 184, in connect
    conn = self._new_conn()
  File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 171, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f99cf9ecf60>: Failed to establish a new connection: [Errno -2] Name or service not known
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/lib/python3.6/site-packages/urllib3/util/retry.py", line 399, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='cache.rdu2.centos.org', port=8080): Max retries exceeded with url: http://127.0.0.1:5000/ (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f99cf9ecf60>: Failed to establish a new connection: [Errno -2] Name or service not known',)))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/builddir/build/BUILD/librepo-1.14.2/tests/python/tests/base.py", line 90, in setUpClass
    requests.get(cls.MOCKURL, timeout=0.1)
  File "/usr/lib/python3.6/site-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/lib/python3.6/site-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.6/site-packages/requests/adapters.py", line 510, in send
    raise ProxyError(e, request=request)
requests.exceptions.ProxyError: HTTPConnectionPool(host='cache.rdu2.centos.org', port=8080): Max retries exceeded with url: http://127.0.0.1:5000/ (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f99cf9ecf60>: Failed to establish a new connection: [Errno -2] Name or service not known',)))

The centos build does not have the proxy env set up: https://koji.mbox.centos.org/pkgs/packages/librepo/1.14.2/1.el8/data/logs/x86_64/build.log

I am not sure what is the right solution here. NO_PROXY could be used, but it sounds like a whack-a-mole game. Is it intended that the build have access to the network? Actually, looking at the logs, I am not sure there is any network, so maybe those proxy envs have been introduced lately by mean of propagation but are not really needed.

librepo-1.14.0-2.1.hsx.el8 was successfully built on July 2021: https://cbs.centos.org/koji/buildinfo?buildID=33351
and the build log don't show http_proxy env var being set:
https://cbs.centos.org/kojifiles/packages/librepo/1.14.0/2.1.hsx.el8/data/logs/x86_64/build.log

the https_proxy variable was added to let SIG be able to build from gitlab : as cbs builders are isolated, we had to just force outgoing traffic through specific proxy

You're right that adding the no_proxy env variable would be needed for such tests so ideally we can confirm on a dev kojid instance that it would work and then add that option on cbs koji builders

I believe no_proxy is pretty much interpreted by different tools differently. But probably a good start would be something along the line of:

no_proxy=127.0.0.1,::1,localhost,localhost.localdomain,localhost4,localhost4.localdomain4,localhost6,localhost6.localdomain6

WDYT?

so ideally we can confirm on a dev kojid instance that it would work and then add that option on cbs koji builders

ok, is this something I can help you with? if so, any pointers on how to use that dev kojid instances?

If needs be, you should be able to get the src.rpm file from https://cbs.centos.org/koji/taskinfo?taskID=2811733

Metadata Update from @arrfab:
- Issue assigned to arrfab

Had a quick look (between other tasks) and I gave it a try and it worked : I then merged and pushed commit to specify this and then ansible pushed the change.
I just wanted to confirm that it was also working on a public koji instance (dev isn't public) and so just resubmitted your job : https://cbs.centos.org/koji/taskinfo?taskID=2831447

It's now working with the added no_proxy option and so I'll close this ticket

Metadata Update from @arrfab:
- Issue close_status updated to: Fixed with Explanation
- Issue status updated to: Closed (was: Open)

Realizing that build itself succeeded but not tag-build, just because of needed permissions, but that's unblocking you, so you can now tag-build pkg yourself to -candidate (and -testing/-release)

Thanks! I was able to tag the build and it got promoted to the mirrors!

Log in to comment on this ticket.

Metadata