openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: [WIP] add buster tests https://review.openstack.org/649497 | 02:10 |
---|---|---|
*** hwoarang has quit IRC | 03:01 | |
*** hwoarang has joined #openstack-dib | 03:06 | |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: debian-minimal buster support https://review.openstack.org/649496 | 05:10 |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: [WIP] add buster tests https://review.openstack.org/649497 | 05:10 |
*** hwoarang has quit IRC | 06:39 | |
*** hwoarang has joined #openstack-dib | 06:41 | |
cgoncalves | ianw, hi! do you happen to be still around? | 09:38 |
cgoncalves | ianw, I proposed a work-around to https://bugzilla.redhat.com/show_bug.cgi?id=1666612 in https://review.openstack.org/#/c/643749/ | 09:39 |
openstack | bugzilla.redhat.com bug 1666612 in systemd "Rules "uname -p" and "systemd-detect-virt" kill the system boot time on large systems" [High,On_qa] - Assigned to jsynacek | 09:39 |
cgoncalves | apparently the work-around is not helping much, as you can see in https://review.openstack.org/#/c/643752/ | 09:40 |
cgoncalves | the centos7 job passed, twice, but only barely avoided the job timeout. the tempest jobs still take a long time. the job is expected to run in ~1h30 like in ubuntu jobs | 09:41 |
cgoncalves | neither with nor without the work-around do I see the expected output from the journalctl example | 09:43 |
cgoncalves | so I am wondering why... I ran a centos7 VM with and without nested virtualization enabled | 09:43 |
*** mjturek has joined #openstack-dib | 13:53 | |
*** altlogbot_2 has quit IRC | 13:54 | |
*** Vorrtex has joined #openstack-dib | 15:58 | |
Vorrtex | Hey, I come to you guys with a small problem. I've been working on building an octavia amphora on power with a centos base, and I'm seeing an error that I don't quite understand. Any help you can provide would be *fantastic*. https://gist.github.com/tvardema/5bb00d0d327b6a026edbddd0b5953d3d | 16:02 |
Vorrtex | As someone who is almost completely unfamiliar with the process here, I'm very confused why DIB would look for a compressed image tarball as the base, and then immediately re-pack the tarball, then complain about the tarred tar being not a qcow2 image? Which seems "accurate" to me? I'm 90% sure I'm missing an argument somewhere that I should use. | 16:03 |
clarkb | Vorrtex: This is from memory but I think it takes the image and converts it to a file stream (tarball basically) so that it can write it out onto its own image file (which allows it to do partitioning and the like) | 16:09 |
clarkb | I want to say this may break if the upstream image has multiple partitions itself? | 16:10 |
clarkb | johnsom: may have more insight on this as it relates to octavia | 16:10 |
johnsom | That is a pure DIB error. The image is xz compressed and DIB expects it to be an uncompressed qcow2 | 16:11 |
johnsom | clarkb He is setting file:///root/CentOS-7-ppc64le-GenericCloud.qcow2.xz as the file path, but DIB wants just a qcow2 | 16:12 |
clarkb | oh I see. So it needs to be uncompressed first | 16:12 |
Vorrtex | clarkb the problem is I'm not telling it to look for CentOS-7-ppc64le-GenericCloud.qcow2.xz specifically, that's what DIB is looking for by default | 16:24 |
Vorrtex | I tried changing my base image's name to that just to see if it would work regardless, but it fails with a separate error. | 16:25 |
clarkb | https://git.openstack.org/cgit/openstack/diskimage-builder/tree/diskimage_builder/elements/centos7/root.d/10-centos7-cloud-image is the source of the filename | 16:29 |
johnsom | Vorrtex Did you unxz it? it's in xz compression format. You can check by running "file <filename>" | 16:29 |
clarkb | looks like the cloud images url doesn't work either (centos isn't publishing ppc images anymore?) | 16:29 |
johnsom | He is overriding the image location with a command line export | 16:30 |
clarkb | I don't see that in the paste? | 16:30 |
clarkb | but ya DIB_LOCAL_IMAGE is how you should set that I think | 16:30 |
johnsom | It's the "file://" URL that points me that way | 16:31 |
johnsom | Plus DIB doesn't know anything about xz compressed files as far as I know | 16:32 |
clarkb | the extract-image script seems to at least try | 16:33 |
* clarkb gets a link | 16:33 | |
*** yolanda has quit IRC | 16:33 | |
clarkb | https://git.openstack.org/cgit/openstack/diskimage-builder/tree/diskimage_builder/elements/redhat-common/bin/extract-image#n53 | 16:33 |
Vorrtex | johnsom I compressed it because the error message said it was looking for a .xz file. Like I mentioned above, I also took the original image and changed its name to have ".xz" just to try it out, and it fails anyway | 16:34 |
clarkb | Vorrtex: where can we find the original source file? | 16:35 |
Vorrtex | It's on a VM that I have on Red Hat internal servers. | 16:35 |
Vorrtex | Had to build it ourselves | 16:35 |
clarkb | qcow2 has had incompatible format changes in the past, it is theoretically possible the qcow2 is too new for your version of qcow2. | 16:35 |
clarkb | er too new for qemu-img | 16:36 |
clarkb | Vorrtex: are you able to boot that image or otherwise verify it is a valid qcow2 image? | 16:36 |
clarkb | other debugging steps would be to run through the commands at https://git.openstack.org/cgit/openstack/diskimage-builder/tree/diskimage_builder/elements/redhat-common/bin/extract-image#n53 to do the uncompression and convert yourself just so that we can narrow down where it might be failing | 16:38 |
Vorrtex | clarkb the command run to create the image was "qemu-img create -f qcow2 clouimg.qcow2 60G", followed by a virt-install using this location: http://mirror.centos.org/altarch/7/os/ppc64le/ | 16:39 |
clarkb | ya I would start by double checking that image is actually happy with qemu-img then run through the steps that dib is taking to see if we can get any more info out of it | 16:41 |
clarkb | (assuming something doesn't pop up before hand) | 16:41 |
clarkb | `qemu-img check` | 16:41 |
Vorrtex | "No errors were found on the image" | 16:42 |
Vorrtex | followed by some extra output | 16:42 |
clarkb | ok then run through the steps dib is taking at https://git.openstack.org/cgit/openstack/diskimage-builder/tree/diskimage_builder/elements/redhat-common/bin/extract-image#n53 | 16:43 |
clarkb | you know I wonder if you are running out of disk? | 16:43 |
clarkb | and the tmp image is getting truncated? (just an idea I have no evidence of that) | 16:43 |
Vorrtex | 32GB available on the VM | 16:43 |
clarkb | but your image is 60GB | 16:43 |
Vorrtex | the image file isn't that big. | 16:44 |
Vorrtex | I guess it could be that bad, right? | 16:44 |
clarkb | Vorrtex: if you read the dib script it is converting it to raw | 16:44 |
clarkb | the raw image will be that big | 16:44 |
Vorrtex | alright, then I'll have to rebuild the image smaller... johnsom how "small" can the amp image be? | 16:45 |
johnsom | That is up to your image. Centos is usually pretty fat and requires at least 3GB | 16:45 |
Vorrtex | I'll make it 5 GB then. Just to see what's what. | 16:46 |
clarkb | you can probably just resize it | 16:46 |
clarkb | qemu-img resize | 16:47 |
clarkb | (not sure if that supports shrinking) | 16:47 |
Vorrtex | Oh? Hmm... I'll try that. | 16:47 |
clarkb | sometimes filesystems get bad about that | 16:47 |
clarkb | and i think centos defaults to xfs now which doesn't shrink | 16:47 |
clarkb | but worth a try if it is a quick fix :) | 16:47 |
Vorrtex | This seems to have worked: qemu-img resize --shrink clouimg.qcow2 -25G | 16:49 |
Vorrtex | RIP, my image was 30 GB, not the one generated for the base image... had to shrink it more (bad maths). Retrying again, just in case. | 17:13 |
Vorrtex | alright, size shouldn't be a problem at all. Still complaining about the format. | 17:25 |
Vorrtex | I'll run through the steps of that script now. | 17:25 |
Vorrtex | clarkb I'm not sure if I got farther or not, but I stopped trying to fuss with the compressed image, and am trying to use this other environment variable: DIB_LOCAL_IMAGE="/root/clouimg.qcow2" | 18:11 |
Vorrtex | Image is valid, for sure, but it failes with: | 18:11 |
Vorrtex | device-mapper: reload ioctl on loop1p2 failed: Invalid argument | 18:11 |
Vorrtex | fails*** rather | 18:12 |
clarkb | could be that the loopbacks are confused? an easy way to clear them out is a reboot but you can manually manage with losetup iirc | 18:12 |
Vorrtex | I'll try a reboot real quick | 18:12 |
Vorrtex | clarkb same error. | 18:16 |
clarkb | I'm not sure then. Are you able to paste the list of commands and output? | 18:27 |
Vorrtex | list of commands of what? | 18:27 |
Vorrtex | I'm not even through the script just yet (got back from lunch, sorry) | 18:28 |
Vorrtex | I have an environment variable set (DIB_LOCAL_IMAGE, mentioned above) and the error is : | 18:29 |
Vorrtex | 2019-04-04 18:28:52.719 | device-mapper: reload ioctl on loop1p2 failed: Invalid argument │./diskimage_builder/elements/redhat-common/bin/extract-image: echo "Working in $WORKING" | 18:29 |
Vorrtex | 2019-04-04 18:28:52.724 | create/reload failed on loop1p2 | 18:29 |
Vorrtex | lol, it grabbed my other screen... dammit | 18:29 |
Vorrtex | lemme get a better copy. | 18:29 |
Vorrtex | device-mapper: reload ioctl on loop1p2 failed: Invalid argument | 18:29 |
Vorrtex | create/reload failed on loop1p2 | 18:29 |
Vorrtex | I'm having trouble even finding where in that script I would start on device-mapper, since that doesn't seem to be in there as failure text. | 18:30 |
Vorrtex | I believe I see the output from this line: https://git.openstack.org/cgit/openstack/diskimage-builder/tree/diskimage_builder/elements/redhat-common/bin/extract-image#n67 | 18:32 |
Vorrtex | but can't seem to find exactly what fails next, really. | 18:32 |
Vorrtex | Alright, looks like I might just have to walk this more slowly than I thought. Will report back in a little bit. | 18:34 |
clarkb | ok. Also ianw should be coming on line soon and may be more help | 18:34 |
Vorrtex | clarkb the erring line is https://git.openstack.org/cgit/openstack/diskimage-builder/tree/diskimage_builder/elements/redhat-common/bin/extract-image#n71 | 18:52 |
clarkb | Vorrtex: fails on the kpartx? | 18:55 |
clarkb | can you paste that and the output without the awk? | 18:55 |
Vorrtex | Lemme see. (I was just using the script, and commenting out bits of it. | 18:55 |
Vorrtex | ) | 18:55 |
Vorrtex | https://gist.github.com/tvardema/5bb00d0d327b6a026edbddd0b5953d3d | 18:58 |
Vorrtex | If it "helps" at all, it keeps incrementing my loops... first run is loop0, and now I'm on loop4. | 18:58 |
Vorrtex | clarkb ^^ | 19:05 |
clarkb | Vorrtex: does /dev/loop4p1 exist? | 19:08 |
Vorrtex | nope. How could it, nothing created it. | 19:10 |
Vorrtex | It goes from "make /dev/loop#" exist, and then assumes "/dev/loop#p1" or whatever exists... | 19:11 |
clarkb | well kpartx is trying to add it if I am reading that correctly | 19:13 |
clarkb | and was curious if it succeeded despite the error | 19:13 |
Vorrtex | sudo losetup -f isn't apparently creating "loop#p1" or whatever, it just makes a new "loop#". | 19:13 |
clarkb | ya then kpartx reads the partition table and creates those subdevices | 19:14 |
Vorrtex | You mean when it does the awk portion for line 71? | 19:14 |
Vorrtex | so technically line 72? because I didn't read that right myself lol | 19:16 |
clarkb | no kpartx itself | 19:18 |
Vorrtex | Alright, well, I'm obviously not familiar with how that's supposed to work, but it's definitely *not* working as written. Any ideas, or is this still us waiting for maybe ianw to help out? | 19:20 |
clarkb | I think the next thing may be checking the kernel log for more error info | 19:21 |
Vorrtex | checking the host machine's kernel log or is there a log associated with DIB? | 19:24 |
clarkb | would be the kernel log on the machine | 19:25 |
clarkb | ioctl is in the kernel so hopefully it logged more info | 19:25 |
Vorrtex | Hmm: device-mapper: table: 253:8: loop4 too small for target: start=10240, len=125818880, dev_size=10485760 | 19:26 |
Vorrtex | I don't really know what that means. | 19:26 |
clarkb | possible the shrink didn't update the partition table properly | 19:29 |
Vorrtex | Alright, building new image... lol thanks for the help so far, maybe a build from scratch will remove all these issues :( | 19:29 |
Vorrtex | I'm also going to fix that cloud image name... cuz it hurts me. | 19:30 |
Vorrtex | rebuilt the image, restarted the host machine, and it seems to have gone much farther.... Looks like it failed on a repofile, though | 20:22 |
Vorrtex | Thanks again for the help, I'll be looking into it more tomorrow or early next week. | 20:23 |
Vorrtex | clarkb ^^ | 20:23 |
ianw | Ran: 15 tests in 2499.0931 sec. | 20:32 |
ianw | Ran: 15 tests in 4967.0000 sec. | 20:32 |
ianw | http://logs.openstack.org/52/643752/1/check/octavia-v2-dsvm-scenario-centos-7/56ef7da/job-output.txt.gz | 20:32 |
ianw | http://logs.openstack.org/88/634988/6/check/octavia-v2-dsvm-scenario-ubuntu-bionic/c1db89d/job-output.txt.gz | 20:33 |
ianw | does seem like it's a fairly apples-to-apples comparison, and much slower | 20:35 |
clarkb | what is the difference between the two? | 20:34 |
johnsom | ianw I am out of the loop, but bionic usually runs faster than centos | 20:34 |
johnsom | Confused by those results | 20:34 |
johnsom | Ah, they are backward, ok | 20:34 |
ianw | sorry, the longer run is centos yes; this was in response to a cgoncalves question above | 20:35 |
johnsom | It's roughly double the run time on centos. Not just tests either, even the boot time is slower | 20:36 |
ianw | hrm, it would be nice if the time was in the testr_results.html | 20:36 |
ianw | yeah, when we last deep dived into this we debugged the entire thing to find it was the same thing as reported @ https://bugzilla.redhat.com/show_bug.cgi?id=1666612 | 20:36 |
openstack | bugzilla.redhat.com bug 1666612 in systemd "Rules "uname -p" and "systemd-detect-virt" kill the system boot time on large systems" [High,On_qa] - Assigned to jsynacek | 20:36 |
johnsom | It's also pretty hard to judge using the zuul gates as instance provider performance varies greatly. | 20:37 |
ianw | it ended up forking uname for like every single udev rule | 20:37 |
cgoncalves | hi Ian :) | 20:38 |
johnsom | The issue cgoncalves is trying to track down is why the centos based job is double the run time of the ubuntu job. It's so far out it ends up timing out the centos jobs regularly. | 20:38 |
cgoncalves | yeah. I believe I applied the workaround suggested but still it doesn't reduce the times | 20:39 |
cgoncalves | I also didn't get the journalctl messages that confirm the bug so I wonder if it's a different issue | 20:40 |
ianw | yeah, fix applied at http://logs.openstack.org/52/643752/1/check/octavia-v2-dsvm-scenario-centos-7/56ef7da/controller/logs/dib-build/amphora-x64-haproxy.qcow2_log.txt.gz#_2019-03-17_19_41_05_348 | 20:41 |
ianw | it ran for ~40 seconds which seems consistent with rebuilding the initramfs | 20:41 |
*** Vorrtex has quit IRC | 20:46 | |
*** mjturek has quit IRC | 21:08 |
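
The kernel message quoted at 19:26 ("loop4 too small for target") can be checked by hand. The following is a minimal sketch of that arithmetic, assuming device-mapper's usual 512-byte sector units; the function names `target_fits` and `sectors_to_gib` are hypothetical and exist only for illustration.

```python
SECTOR_BYTES = 512  # assumption: device-mapper sizes are in 512-byte sectors


def target_fits(start, length, dev_size):
    """Return True if a mapping of `length` sectors starting at `start`
    fits inside a device that is `dev_size` sectors long."""
    return start + length <= dev_size


def sectors_to_gib(sectors):
    """Convert a sector count to GiB."""
    return sectors * SECTOR_BYTES / 2**30


# Values copied from the kernel log line quoted in the channel.
start, length, dev_size = 10240, 125818880, 10485760

print(target_fits(start, length, dev_size))  # False
print(sectors_to_gib(dev_size))              # 5.0
print(sectors_to_gib(start + length))        # 60.0
```

Read this way, the loop device is 5 GiB while the partition table still describes a partition ending at 60 GiB, which is consistent with clarkb's guess that the shrink did not update the partition table.
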