#1327 Duffy fails to provision C9S nodes
Closed: Fixed with Explanation by arrfab. Opened by mrc0mmand.

Hey,

An hour or so ago I noticed that a couple of our C9S jobs are waiting for a suitable C9S node to get provisioned, but after watching all the C9S pools for a while, it looks like that's never going to happen. For example, the provisioning field in the virt-ec2-t2-centos-9s-x86_64 pool keeps jumping to 7 (or whatever the current fill_level - ready number is), but falls back to 0 a couple of seconds later.

I noticed this with most of the C9S pools (virt-ec2-c6g-centos-9s-aarch64, metal-ec2-c5n-centos-9s-x86_64, virt-ec2-t2-centos-9s-x86_64), so it looks like something is going wrong right at the start of the provisioning phase. Unfortunately, I can't see what's going on; all I have is the numbers from Duffy.

/cc @nphilipp


Metadata Update from @arrfab:
- Issue assigned to arrfab

Metadata Update from @arrfab:
- Issue tagged with: centos-ci-infra, high-gain, medium-trouble

It seems some CentOS Stream 9 images were removed from AWS:

2023-12-11 14:01:43,915 p=2895669 u=duffy n=ansible | An exception occurred during task execution. To see the full traceback, use -vvv. The error was: botocore.exceptions.ClientError: An error occurred (InvalidAMIID.NotFound) when calling the RunInstances operation: The image id '[ami-032c467025553090b]' does not exist
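For anyone hitting the same thing, here is a minimal sketch for pulling the missing AMI IDs out of log lines like the one above. The regular expression is only modelled on that single message (including the brackets around the ID), and the function name `missing_ami_ids` is hypothetical, not part of Duffy or Ansible.

```python
import re

# Assumption: the pattern mirrors the botocore message quoted above,
# where the AMI ID appears as '[ami-...]' after InvalidAMIID.NotFound.
AMI_NOT_FOUND = re.compile(r"InvalidAMIID\.NotFound.*?'\[?(ami-[0-9a-f]+)\]?'")


def missing_ami_ids(log_lines):
    """Return the distinct AMI IDs reported as missing, in first-seen order."""
    seen = []
    for line in log_lines:
        match = AMI_NOT_FOUND.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

Feeding it the provisioning log line by line gives the list of IDs to replace in the inventory.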

Let me update it and push to the Duffy Ansible inventory (git).

I pushed the updated AMI IDs (retrieved from https://www.centos.org/download/aws-images/ for the us-east-1 region) and Duffy was then able to instantiate EC2 instances:

{
  "action": "get",
  "pool": {
    "name": "virt-ec2-t2-centos-9s-x86_64",
    "fill_level": 8,
    "levels": {
      "provisioning": 0,
      "ready": 8,
      "contextualizing": 0,
      "deployed": 8,
      "deprovisioning": 0
    }
  }
}
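The symptom from the original report (a pool that never fills up) can be checked from pool output like the above. This is a rough sketch that assumes the JSON layout shown here; `pool_shortfall` and `looks_stalled` are hypothetical helper names, and the "stalled" rule is just a heuristic, not how Duffy itself decides anything.

```python
import json


def pool_shortfall(payload):
    """Return how many nodes the pool still needs to reach fill_level."""
    pool = json.loads(payload)["pool"]
    return max(pool["fill_level"] - pool["levels"]["ready"], 0)


def looks_stalled(payloads):
    """Heuristic: True if, across several samples taken over time,
    the pool is always short of fill_level and 'ready' never grows."""
    pools = [json.loads(p)["pool"] for p in payloads]
    if len(pools) < 2:
        return False  # a single sample cannot show a trend
    readies = [p["levels"]["ready"] for p in pools]
    always_short = all(p["fill_level"] > p["levels"]["ready"] for p in pools)
    return always_short and max(readies) <= readies[0]
```

With the output above, `pool_shortfall` gives 0, since ready matches fill_level. A run of samples where ready stays below fill_level while provisioning keeps bouncing back to 0 would be flagged by `looks_stalled`.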

Metadata Update from @arrfab:
- Issue close_status updated to: Fixed with Explanation
- Issue status updated to: Closed (was: Open)
