Hello,
We started having trouble with some of our CI jobs failing to validate resource-agents-4.10.0-17.el9.x86_64.rpm.
Upon further investigation, we have mirrored a file with hash b24e6a8a70066918658ffc390d3dc4e3f91eb19eead295db9f14b7af988d8796 from our upstream.
As you can see in the small script and results at the following link - https://paste.opendev.org/show/bfACXyFK1sL6CS8E5Ts4/ - some mirrors appear to have a corrupt version of this file, while others have one that agrees with mirror.stream.centos.org (3a4a37810d503f5eefb6b114d3b9f65d82ede7b9983dd1b3d62f975f8081e94b). e.g.
3a4a37810d503f5eefb6b114d3b9f65d82ede7b9983dd1b3d62f975f8081e94b - http://mirror.karneval.cz/pub/linux/centos-stream/9-stream/HighAvailability/x86_64/os
b24e6a8a70066918658ffc390d3dc4e3f91eb19eead295db9f14b7af988d8796 - http://mirror.shastacoe.net/centos-stream/9-stream/HighAvailability/x86_64/os
3a4a37810d503f5eefb6b114d3b9f65d82ede7b9983dd1b3d62f975f8081e94b - https://mirror1.hs-esslingen.de/pub/Mirrors/centos-stream/9-stream/HighAvailability/x86_64/os
b24e6a8a70066918658ffc390d3dc4e3f91eb19eead295db9f14b7af988d8796 - http://mirror.net.cen.ct.gov/centos-stream/9-stream/HighAvailability/x86_64/os
3a4a37810d503f5eefb6b114d3b9f65d82ede7b9983dd1b3d62f975f8081e94b - http://ftp.fi.muni.cz/pub/linux/centos-stream/9-stream/HighAvailability/x86_64/os
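For anyone who wants to repeat this kind of check without the paste, here is a minimal sketch of the idea: hash the same package from each mirror and compare against a reference digest. It is not the script from the paste. The function names are hypothetical, and the exact package path under each mirror's os/ directory is an assumption you would need to fill in yourself.

```python
import hashlib
import urllib.request
from typing import Dict, Iterable, Iterator, List, Tuple


def sha256_of_chunks(chunks: Iterable[bytes]) -> str:
    """Hash a stream of byte chunks without holding the whole file in memory."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def fetch_chunks(url: str, chunk_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield the response body at `url` in chunks (needs network access)."""
    with urllib.request.urlopen(url) as resp:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            yield chunk


def split_by_reference(
    digests: Dict[str, str], reference: str
) -> Tuple[List[str], List[str]]:
    """Split mirrors into those matching the reference digest and those that do not."""
    good = sorted(m for m, d in digests.items() if d == reference)
    bad = sorted(m for m, d in digests.items() if d != reference)
    return good, bad
```

With a list of mirror base URLs and the package's relative path, something like `digests = {m: sha256_of_chunks(fetch_chunks(m + "/" + rel_path)) for m in mirrors}` followed by `split_by_reference(digests, reference)` would give the good/bad split shown above.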
The two files are exactly the same size. They only start to differ towards the end. The full diff is too big to paste, but in this dump comparison you can see that the corrupt file (on the left) has what looks like null-separated hashes that continue to EOF - https://paste.opendev.org/show/bKxa7j6Yi2lqzBOxc3Q0/
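To find where two same-size copies diverge without dumping both files in full, a small sketch like the following should do. The function name is hypothetical and this is not how the comparison in the paste was produced; it just reports the first differing byte offset.

```python
from typing import Optional


def first_difference(path_a: str, path_b: str, chunk_size: int = 1 << 20) -> Optional[int]:
    """Return the byte offset where the two files first differ, or None if identical.

    If one file is a strict prefix of the other, the offset returned is the
    length of the shorter file.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Locate the first mismatching byte inside this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No mismatch in the common part: one chunk is shorter.
                return offset + min(len(a), len(b))
            if not a:
                # Both files hit EOF at the same point with equal content.
                return None
            offset += len(a)
```

Feeding the returned offset to a hex dump tool then shows only the region of interest.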
I'm wondering if there's been a corrupt file pushed, and some mirrors have picked it up, but are now not refreshing themselves because the file size (and timestamp) has remained the same?
I believe there were some hardware issues around the time we started to notice this (https://pagure.io/centos-infra/issue/812). This may be related.
Metadata Update from @arrfab: - Issue assigned to arrfab
As discussed on IRC (#centos-devel on irc.libera.chat), I confirm that we had a bad package. The main cause was discussed in the other ticket you mentioned.
I then ensured that the bad copies were replaced with the correct version, but one thing we can't control is how (or whether) third-party mirrors pick it up. Even within the CentOS infra, "downstream" mirrors weren't updating it, even after being pointed to another origin/pull-from server. This is because rsync, if not called with --checksum (really I/O intensive, so not used by default), will just skip the file, as it only looks at timestamp and file size (see "-c, --checksum skip based on checksum, not mod-time & size" in man rsync).
That specific file should be ok now for you on all mirror.stream.centos.org
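The rsync behaviour described in this comment can be illustrated with a rough sketch. This is only an approximation of the two comparison modes, not rsync's actual code: the function names are hypothetical, the size-and-mtime test is compared at whole-second granularity as an assumption, and SHA-256 stands in for whatever checksum rsync really uses.

```python
import hashlib
import os


def quick_check_same(src: str, dst: str) -> bool:
    """Approximate the default test: equal size and equal mtime (whole seconds)."""
    s, d = os.stat(src), os.stat(dst)
    return s.st_size == d.st_size and int(s.st_mtime) == int(d.st_mtime)


def file_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file's content, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_same(src: str, dst: str) -> bool:
    """Approximate the --checksum test: equal size and equal content digest."""
    if os.stat(src).st_size != os.stat(dst).st_size:
        return False
    return file_digest(src) == file_digest(dst)
```

A corrupt copy with the same size and timestamp as the good one passes `quick_check_same` (so it would be skipped) but fails `checksum_same`, which is the situation the mirrors were in.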
Metadata Update from @arrfab: - Issue tagged with: centos-stream, high-gain, medium-trouble
Two thoughts: if you have access to the underlying storage, you could "touch" the file to update its modification time. Alternatively, we could ask the package owner to do a no-op version bump to get a fresh build with a new, later version in (and then release it to the mirrors as soon as possible). Leaving some subset of mirrors silently broken for an extended period seems like a really bad fallback case.
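The "touch" idea amounts to changing only the modification time so that a size-and-timestamp comparison no longer matches and the file gets transferred again. A minimal sketch of that, with a hypothetical helper name and the assumption that the file already exists on the origin server:

```python
import os
import time
from typing import Optional


def touch(path: str, when: Optional[float] = None) -> None:
    """Set atime and mtime of an existing file to `when` (default: now).

    The file content is left untouched; only the timestamps change.
    """
    ts = time.time() if when is None else when
    os.utime(path, (ts, ts))
```

This only helps mirrors that sync again from a fixed origin; it does nothing for a mirror that never re-runs its sync.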
This is also what was done: http://mirror.stream.centos.org/9-stream/HighAvailability/x86_64/os/Packages/?C=M;O=D
So if updating the timestamp hasn't made the new package flow onto all mirrors, then perhaps the package version bump and mirror update is the best solution?
Can you confirm that it's OK from your side now?
Our upstream has synced the correct file, and so have the metalinks I get using the script above. I guess this type of corruption is pretty much an edge case. Thanks
Metadata Update from @iwienand: - Issue close_status updated to: Fixed with Explanation - Issue status updated to: Closed (was: Open)