We finally got an ISO produced out of last night's pungi run in Fedora 26. The bad news is that there a bunch of failures on boot and you never get a login prompt and the machine never gets an IP address so you can't access it from traditional methods. Asking colin to take a look at this.
ISO image: https://kojipkgs.fedoraproject.org/compose/branched/Fedora-26-20170315.n.0/compose/Atomic/x86_64/iso/Fedora-Atomic-ostree-x86_64-26-20170315.n.0.iso
Metadata Update from @dustymabe: - Issue assigned to walters - Issue tagged with: F26, host
Booted the ISO via virt-manager and collected the output from the console using:
virt-manager
virsh console 16 | tee console.log
Formatting got a bit mangled, but I put the majority of it into this paste
https://paste.fedoraproject.org/paste/17KsuOmnzXqs4bfU6Thy3F5M1UNdIGYhyRLivL9gydE=
Some notable failures:
[[0;1;31mFAILED[0m] Failed to start Setup Virtual Console. See 'systemctl status systemd-vconsole-setup.service' for details.
[[0;1;31mFAILED[0m] Failed to start Create Volatile Files and Directories. See 'systemctl status systemd-tmpfiles-setup.service' for details.
Clean install of this ISO to qemu-kvm, and then boot it with systemd debug enabled, and use virsh console to capture the boot and startup. Attaching this console.log.
This is the short version. There's 13.8 MiB of additional lines like below, so I chopped them out. Event source bus-time (type monotonic) returned error, disabling: Connection timed out Event source bus-time (type monotonic) returned error, disabling: Connection timed out Event source bus-time (type monotonic) returned error, disabling: Connection timed out Event source bus-time (type monotonic) returned error, disabling: Connection timed out
Service failures in order:
[ 1.554768] dracut-pre-udev[337]: rpcbind: /run/rpcbind/rpcbind.lock: No such file or directory Received SIGCHLD from PID 359 (rpc.statd). Child 359 (rpc.statd) died (code=exited, status=1/FAILURE) dracut-pre-udev.service: Child 359 belongs to dracut-pre-udev.service Child 636 (systemd-vconsol) died (code=exited, status=1/FAILURE) systemd-vconsole-setup.service: Child 636 belongs to systemd-vconsole-setup.service systemd-vconsole-setup.service: Main process exited, code=exited, status=1/FAILURE systemd-vconsole-setup.service: Changed start -> failed systemd-vconsole-setup.service: Job systemd-vconsole-setup.service/start finished, result=failed Enabling showing of status. Received SIGCHLD from PID 706 (systemd-tmpfile). Child 706 (systemd-tmpfile) died (code=exited, status=1/FAILURE) systemd-tmpfiles-setup.service: Child 706 belongs to systemd-tmpfiles-setup.service systemd-tmpfiles-setup.service: Main process exited, code=exited, status=1/FAILURE systemd-tmpfiles-setup.service: Changed start -> failed systemd-tmpfiles-setup.service: Job systemd-tmpfiles-setup.service/start finished, result=failed [FAILED] Failed to start Create Volatile Files and Directories. systemd-logind.service: Got notification message from PID 746 (STOPPING=1, STATUS=Shutting down...) Received SIGCHLD from PID 746 (systemd-logind). Child 746 (systemd-logind) died (code=exited, status=1/FAILURE) systemd-logind.service: Child 746 belongs to systemd-logind.service systemd-logind.service: Main process exited, code=exited, status=1/FAILURE systemd-logind.service: Changed start -> failed systemd-logind.service: Job systemd-logind.service/start finished, result=failed [FAILED] Failed to start Login Service.
But I'm not really sure about why these are failing.
OK this seems bad
plymouth-start.service: Executing: /usr/sbin/plymouthd --mode=boot --pid-file=/var/run/plymouth/pid --attach-to-session [ 3.372107] general protection fault: 0000 [#1] SMP [ 3.372654] Modules linked in: virtio_console(+) parport snd_timer qemu_fw_cfg snd soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace 8139too qxl drm_kms_helper ttm crct10dif_pclmul crc32_pclmul crc32c_intel drm ghash_clmulni_intel serio_raw virtio_pci virtio_ring 8139cp mii virtio ata_generic pata_acpi sunrpc scsi_transport_iscsi [ 3.374009] CPU: 1 PID: 667 Comm: systemd-udevd Not tainted 4.11.0-0.rc1.git0.1.fc26.x86_64 #1 [ 3.374009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014 [ 3.374009] task: ffffa0b875210000 task.stack: ffffaf5a40758000 [ 3.374009] RIP: 0010:vp_modern_find_vqs+0x39/0x70 [virtio_pci] [ 3.374009] RSP: 0018:ffffaf5a4075ba68 EFLAGS: 00010282 [ 3.374009] RAX: ffffaf5a403ed000 RBX: 2d306f6974726976 RCX: 0000000000000000 [ 3.374009] RDX: 00000000000000fc RSI: ffffaf5a403ed01c RDI: 0000000000000001 [ 3.374009] RBP: ffffaf5a4075ba88 R08: 000000000001c8e0 R09: ffffffff9341ea29 [ 3.374009] R10: ffffe08981dd0c40 R11: 0000000000000000 R12: ffffa0b834bda708 [ 3.374009] R13: 0000000000000000 R14: ffffa0b834bda400 R15: 000000000000001f [ 3.374009] FS: 00007fd9211438c0(0000) GS:ffffa0b87fd00000(0000) knlGS:0000000000000000 [ 3.374009] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3.374009] CR2: 000055f2eeb94728 CR3: 000000007610c000 CR4: 00000000003406e0 [ 3.374009] Call Trace: [ 3.374009] init_vqs+0x1a0/0x2e0 [virtio_console] [ 3.374009] virtcons_probe+0xb9/0x360 [virtio_console] [ 3.374009] virtio_dev_probe+0x144/0x1e0 [virtio] [ 3.374009] driver_probe_device+0x106/0x450 [ 3.374009] __driver_attach+0xa4/0xe0 [ 3.374009] ? driver_probe_device+0x450/0x450 [ 3.374009] bus_for_each_dev+0x6e/0xb0 [ 3.374009] driver_attach+0x1e/0x20 [ 3.374009] bus_add_driver+0x1d0/0x270 [ 3.374009] ? virtio_cons_early_init+0x1d/0x1d [virtio_console] [ 3.374009] driver_register+0x60/0xe0 [ 3.374009] ? virtio_cons_early_init+0x1d/0x1d [virtio_console] [ 3.374009] register_virtio_driver+0x20/0x30 [virtio] [ 3.374009] init+0x9f/0xfe3 [virtio_console] [ 3.374009] do_one_initcall+0x50/0x1a0 [ 3.374009] ? free_hot_cold_page+0x19a/0x300 [ 3.374009] ? kmem_cache_alloc_trace+0x15f/0x1c0 [ 3.374009] ? do_init_module+0x27/0x1e6 [ 3.374009] do_init_module+0x5f/0x1e6 [ 3.374009] load_module+0x22b7/0x2820 [ 3.374009] ? __symbol_put+0x60/0x60 [ 3.374009] SYSC_init_module+0x16f/0x1a0 [ 3.374009] SyS_init_module+0xe/0x10 [ 3.374009] do_syscall_64+0x67/0x170 [ 3.374009] entry_SYSCALL64_slow_path+0x25/0x25 [ 3.374009] RIP: 0033:0x7fd91fda53da [ 3.374009] RSP: 002b:00007ffdf7f18d38 EFLAGS: 00000246 ORIG_RAX: 00000000000000af [ 3.374009] RAX: ffffffffffffffda RBX: 000055f2eeb6f6a0 RCX: 00007fd91fda53da [ 3.374009] RDX: 00007fd9208da9c5 RSI: 000000000000b37b RDI: 000055f2eeb893a0 [ 3.374009] RBP: 00007fd9208da9c5 R08: 000055f2eeb74e80 R09: 0000000000000078 [ 3.374009] R10: 00007fd92005fb00 R11: 0000000000000246 R12: 000055f2eeb893a0 [ 3.374009] R13: 000055f2eeb6c140 R14: 0000000000020000 R15: 000055f2ede5dfca [ 3.374009] Code: 54 53 49 89 fe e8 78 0d 00 00 85 c0 41 89 c5 75 44 49 8b 9e 08 03 00 00 4d 8d a6 08 03 00 00 4c 39 e3 74 31 49 8b 86 38 03 00 00 <0f> b7 7b 28 48 8d 70 16 e8 3a e6 15 d3 49 8b 86 38 03 00 00 bf [ 3.374009] RIP: vp_modern_find_vqs+0x39/0x70 [virtio_pci] RSP: ffffaf5a4075ba68 [ 3.410405] ---[ end trace a110b926d7e8d96b ]---
For what it's worth, this gpf and trace don't happen when booting the install media.
more worthy information:
I rebased like so:
[vagrant@vanilla-f25atomic ~]$ cat /etc/ostree/remotes.d/f26.conf [remote "f26"] url=https://kojipkgs.fedoraproject.org/compose/atomic/26/ gpg-verify=false [vagrant@vanilla-f25atomic ~]$ sudo rpm-ostree rebase f26:fedora/26/x86_64/atomic-host
I do see some failures, but was ultimately able to log in:
[vagrant@vanilla-f25atomic ~]$ sudo systemctl --failed UNIT LOAD ACTIVE SUB DESCRIPTION ● systemd-vconsole-setup.service loaded failed failed Setup Virtual Console LOAD = Reflects whether the unit definition was properly loaded. ACTIVE = The high-level unit activation state, i.e. generalization of SUB. SUB = The low-level unit activation state, values depend on unit type. 1 loaded units listed. Pass --all to see loaded but inactive units, too. To show all installed unit files use 'systemctl list-unit-files'.
atomic host cloud images from last night:
https://kojipkgs.fedoraproject.org/compose/branched/Fedora-26-20170316.n.0/compose/CloudImages/x86_64/images/
Journal output from a booted system. Removed a ton of Looping too fast. Throttling execution a little. lines.
Looping too fast. Throttling execution a little.
This is because:
$ guestmount --ro -m /dev/atomicos/root:/ -a /srv/libvirt/images/fedora-VAGRANTSLASH-26-atomic-host-test_vagrant_box_image_0.img ~/tmp/mnt/ $ grep ^passwd: ~/tmp/mnt/ostree/deploy/fedora-atomic/deploy/31be42e470a888fc0256fa436f33e54e74e8a180e3f85b0d090e81d2a8d3416b.0/etc/nsswitch.conf passwd: sss files systemd
what should it be? - should we open an issue upstream somewhere?
F25 Atomic Host has:
$ grep passwd /etc/nsswitch.conf #passwd: db files nisplus nis passwd: files altfiles sss
https://github.com/projectatomic/rpm-ostree/issues/685
what should it be? - should we open an issue upstream somewhere? F25 Atomic Host has: $ grep passwd /etc/nsswitch.conf passwd: db files nisplus nis passwd: files altfiles sss
F25 Atomic Host has: $ grep passwd /etc/nsswitch.conf
passwd: files altfiles sss
I can confirm that fixing those lines allows me to get to a login prompt.
Another issue here is:
Mar 16 11:19:24 localhost.localdomain systemd[1]: sysinit.target: Found ordering cycle on sysinit.target/start Mar 16 11:19:24 localhost.localdomain systemd[1]: sysinit.target: Found dependency on cloud-init.service/start Mar 16 11:19:24 localhost.localdomain systemd[1]: sysinit.target: Found dependency on basic.target/start Mar 16 11:19:24 localhost.localdomain systemd[1]: sysinit.target: Found dependency on sockets.target/start Mar 16 11:19:24 localhost.localdomain systemd[1]: sysinit.target: Found dependency on iscsiuio.socket/start Mar 16 11:19:24 localhost.localdomain systemd[1]: sysinit.target: Found dependency on sysinit.target/start Mar 16 11:19:24 localhost.localdomain systemd[1]: sysinit.target: Breaking ordering cycle by deleting job cloud-init.service/start Mar 16 11:19:24 localhost.localdomain systemd[1]: cloud-init.service: Job cloud-init.service/start deleted to break ordering cycle starting with sysinit.target/start
Which is because cloud-init.service has:
cloud-init.service
Before=sysinit.target
Commenting that out does allow me to ssh with credentials provided using cloud-init. We'll need to hunt down why that was added. I don't see such a dependency in F25.
@jlebon Possibly related: https://bugzilla.redhat.com/show_bug.cgi?id=1400249
@jlebon @mattdm - this is known bug - the fix is currently being pushed to stable - next image build should have fix
Should be fixed by https://bodhi.fedoraproject.org/updates/rpm-ostree-2017.3-3.fc26 - I don't think we need another human to test it manually, just a sanity check that https://github.com/projectatomic/rpm-ostree/pull/686 was pulled in etc.
images now boot. closing this.
Metadata Update from @dustymabe: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)
Log in to comment on this ticket.