ianw | johnsom: the other thing is, you can just manually remove the .egg-info file and that's basically what pip<10 did anyway | 00:08 |
---|---|---|
ianw | it's not perfect, but if it was working, it would continue | 00:08 |
johnsom | I'm all over this venv now... grin | 00:12 |
*** mjturek has quit IRC | 00:14 | |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: Remove installed packages before pip install https://review.openstack.org/561479 | 00:22 |
clarkb | ianw: ^ that update lgtm I've +2'd | 00:31 |
*** andreas_s has joined #openstack-dib | 01:00 | |
ianw | good news, i've built a xenial with most elements and pip10 doesn't seem to give issues | 01:04 |
ianw | $ pip --version | 01:04 |
ianw | pip 10.0.0 from /usr/local/lib/python2.7/dist-packages/pip (python 2.7) | 01:04 |
ianw | $ pip3 --version | 01:04 |
ianw | pip 10.0.0 from /usr/local/lib/python3.5/dist-packages/pip (python 3.5) | 01:04 |
ianw | now ... see if we can diagnose this rax boot issue :/ | 01:04 |
*** andreas_s has quit IRC | 01:05 | |
johnsom | Looks like I have a good fix for Octavia with the venv. CentOS is still broken, but I think your patch will fix that (virtualenv). | 03:07 |
*** andreas_s has joined #openstack-dib | 03:45 | |
*** andreas_s has quit IRC | 03:50 | |
*** pbourke has quit IRC | 04:05 | |
*** pbourke has joined #openstack-dib | 04:06 | |
*** esha2 has joined #openstack-dib | 04:42 | |
ianw | http://logs.openstack.org/79/561479/8/check/nodepool-functional-py35-gentoo-src/aefeaed/controller/logs/builds/gentoo-17-0-systemd-0000000002_log.txt.gz#_2018-04-17_01_50_46_833 | 05:29 |
ianw | emerge: there are no ebuilds to satisfy "yum-utils". | 05:29 |
ianw | prometheanfire: ^ can you take a look | 05:29 |
prometheanfire | hi | 05:32 |
*** esha2 has quit IRC | 05:32 | |
prometheanfire | oh, just a pkg-map thing? | 05:33 |
*** esha2 has joined #openstack-dib | 05:33 | |
prometheanfire | * sys-apps/yum | 05:33 |
prometheanfire | Available versions: ~3.4.3_p20170619 {test PYTHON_TARGETS="python2_7"} | 05:33 |
prometheanfire | Homepage: http://yum.baseurl.org/ | 05:33 |
prometheanfire | that seem right? | 05:33 |
prometheanfire | not sure what binary/file you want installed | 05:33 |
ianw | urgh, i guess that comes from the epel element, which shouldn't install anything for !centos/rhel | 05:36 |
prometheanfire | :D | 05:37 |
prometheanfire | notmyfault? :D | 05:38 |
ianw | ok, no mea culpa | 05:39 |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: Remove installed packages before pip install https://review.openstack.org/561479 | 05:44 |
*** esha1 has joined #openstack-dib | 06:04 | |
*** esha2 has quit IRC | 06:07 | |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: Remove installed packages before pip install https://review.openstack.org/561479 | 06:10 |
*** andreas_s has joined #openstack-dib | 06:17 | |
*** hashar has joined #openstack-dib | 06:53 | |
*** andreas_s has quit IRC | 13:52 | |
*** hashar has quit IRC | 14:02 | |
*** johnsom has quit IRC | 14:11 | |
*** johnsom has joined #openstack-dib | 14:12 | |
*** brault_ has quit IRC | 14:18 | |
*** mjturek has joined #openstack-dib | 14:21 | |
*** hashar has joined #openstack-dib | 14:31 | |
*** esha1 has quit IRC | 15:19 | |
*** esha1 has joined #openstack-dib | 15:43 | |
*** hashar is now known as hasharAway | 16:05 | |
johnsom | So here is the question of the day: Are we going to pin pip for the stable branches? | 16:49 |
clarkb | johnsom: probably a question for the qa team? personally I'd like to not pin anything | 16:50 |
clarkb | that has the potential for putting pbr in a weird position in particular | 16:51 |
johnsom | I ask, because my workaround venv of the octavia image uses too much disk space during the build for the file system size we used previously. Changing the stable branch file system size default up to 3GB doesn't seem right, nor backporting our ubuntu-minimal changes that allow it to fit. | 16:51 |
clarkb | johnsom: this is to boot under the nova flavor? dib builds the image successfully? | 16:54 |
*** esha1 has quit IRC | 16:54 | |
johnsom | With the venv DIB fails to build the image at 2GB | 16:55 |
openstackgerrit | Paul Belanger proposed openstack/diskimage-builder master: Revert "debootstrap: Call update-initramfs explicitly" https://review.openstack.org/562004 | 16:55 |
clarkb | johnsom: sure but dib doesn't care does it? it will just make a bigger image? | 16:55 |
*** esha1 has joined #openstack-dib | 16:55 | |
clarkb | (infra's images are like 22GB or something raw) | 16:55 |
johnsom | No, it crashes out with out of disk space errors. We can tell DIB to use a 3GB filesystem, which builds, but then right, nova flavors would have to be changed, etc. I guess we could do something really ugly like resize it after DIB finishes, but that is unpleasant. | 16:58 |
clarkb | oh this is the tmpfs | 16:58 |
clarkb | I think if you just tell it to not tmpfs it may work (though unsure of how slow that will make the build) | 16:58 |
clarkb | johnsom: was the old image right on the border of 2GB? I wouldn't have expected a major difference in size but virtualenvs will be bigger | 16:59 |
johnsom | It must have been during build, it's got room in the finished image. | 16:59 |
johnsom | The qcow2 is just 7MB larger with the venv | 17:00 |
clarkb | ya I think it uses a 2GB tmpfs | 17:00 |
clarkb | my hunch is building stuff from source makes it balloon a bit but then after the build it falls back down in size | 17:00 |
johnsom | Right, plus all the deb cache, etc | 17:02 |
*** esha1 has quit IRC | 17:27 | |
*** eshas has joined #openstack-dib | 17:28 | |
*** eshas has quit IRC | 17:47 | |
*** esha1 has joined #openstack-dib | 17:47 | |
*** esha2 has joined #openstack-dib | 18:36 | |
*** esha1 has quit IRC | 18:40 | |
*** sean-k-mooney has joined #openstack-dib | 18:43 | |
sean-k-mooney | o/ | 18:46 |
sean-k-mooney | quick question. anyone know why cleanup_build_dir is called here https://github.com/openstack/diskimage-builder/blob/master/diskimage_builder/lib/disk-image-create#L528 and not on line 545 | 18:47 |
clarkb | sean-k-mooney: probably so that if the image conversions fail you don't have to also trap and handle the cleanup there? (though that may explain why we sometimes leak build dirs if we aren't trapping for that properly) | 18:48 |
clarkb | sean-k-mooney: but once the image content is written the build dir isn't needed any longer | 18:48 |
sean-k-mooney | clarkb: well in my case this is causing the image build to fail because it cleans up a mount dir that is needed to create the image | 18:49 |
clarkb | sean-k-mooney: everything should be copied over to the image at that point though right? | 18:50 |
clarkb | sean-k-mooney: and just above that is the remove all mounts step | 18:50 |
sean-k-mooney | i thought so. i'm going to reproduce again and copy the relevant log section | 18:50 |
sean-k-mooney | this might be related to the custom caching element i'm creating | 18:51 |
sean-k-mooney | basically i am copying the pip-cache element and creating a gradle-cache element and the image is failing right on the cleanup_build_dir call | 18:52 |
clarkb | sean-k-mooney: you may need to have a post-install.d or finalise.d step (something that happens late) to unmount properly | 18:54 |
sean-k-mooney | clarkb: the pip cache element and my gradle cache both bind mount a host dir into the chroot in root.d | 18:54 |
sean-k-mooney | clarkb: ya i was just going to ask should both of these have a finalise.d to remove the bind mount | 18:54 |
sean-k-mooney | the pip-cache element does https://github.com/openstack/diskimage-builder/blob/master/diskimage_builder/elements/pip-cache/root.d/01-pip-cache but there is no unmount in the element | 18:55 |
sean-k-mooney | clarkb: here is the relevant error http://paste.openstack.org/show/719407/ | 18:58 |
sean-k-mooney | we do 2018-04-17 18:55:33.188 | + /home/centos/dib-elements/diskimage-builder/diskimage_builder/lib/common-functions:cleanup_build_sudo rm -rf /tmp/dib_build.Jx4vnUno/mnt | 18:58 |
sean-k-mooney | then something tries to use it on the last line 2018-04-17 18:55:33.521 | *** /tmp/dib_build.Jx4vnUno/mnt is not a directory | 18:59 |
*** esha2 has quit IRC | 18:59 | |
sean-k-mooney | looking back through the full logs i can see that unmount_image which is called here https://github.com/openstack/diskimage-builder/blob/master/diskimage_builder/lib/disk-image-create#L347 unmounts my cache dir so the issue i have looks unrelated to my element. | 19:34 |
ianw | sean-k-mooney: 2018-04-17 18:55:33.260 | Killing chroot process: 'unknown(26384)' | 19:58 |
ianw | that's weird, doesn't it basically do a lsof and kill? i guess that is racy but i haven't seen it too often | 19:58 |
sean-k-mooney | ianw: ya i think this is what is failing https://github.com/openstack/diskimage-builder/blob/f3d58d9042f82805ac6c944c8ea360e88b3cca4d/diskimage_builder/lib/common-functions#L184-L190 | 19:59 |
sean-k-mooney | ianw: what is probably happening is i have a parent process and a child and we kill the parent first and then when we go to kill the child it's dead | 19:59 |
ianw | oohh, yeah *that* would explain it | 20:00 |
sean-k-mooney | i was thinking of adding an "|| /bin/true" here https://github.com/openstack/diskimage-builder/blob/f3d58d9042f82805ac6c944c8ea360e88b3cca4d/diskimage_builder/lib/common-functions#L189 | 20:00 |
sean-k-mooney | ianw: that or sort the pids in descending order | 20:00 |
sean-k-mooney | a parent can't have a higher pid than its children right? | 20:01 |
ianw | pid's can wrap-around, so it's possible | 20:01 |
sean-k-mooney | ... ok ill try the ||/bin/true then | 20:02 |
sean-k-mooney | i'm running the image build in a vm so i'm not sure if it will make the race more likely than on a physical host due to vm timing being weird/slow | 20:02 |
ianw | i'm thinking the ignore is probably best ... what can we really do if the kill didn't work? it must mean the process isn't there which was the effect we were looking for anyway | 20:04 |
sean-k-mooney | ianw: yep that is what i was thinking | 20:05 |
sean-k-mooney | ianw: thinking about it however if kill did not work it could also mean the process is still running and we should have used kill -9 | 20:10 |
ianw | sean-k-mooney: that won't cause the "kill" to exit 1 though, which is what is triggering the halt | 20:11 |
ianw | it is a bit of a symptom of another issue though ... do you know what's still running that's being killed? if we can stop it before we get to this point, it's usually better | 20:12 |
ianw | we've talked about dumping some more info in that loop to aid debugging iirc | 20:12 |
sean-k-mooney | ianw: so with the ignore the image build completes http://paste.openstack.org/show/719410/ | 20:13 |
sean-k-mooney | ill hack a ps -aux into the loop and grep for the current pid and see if that shows anything | 20:14 |
ianw | i'm guessing some daemon that has started | 20:15 |
sean-k-mooney | oh i know what it is | 20:16 |
sean-k-mooney | i'm building an image with gradle | 20:16 |
sean-k-mooney | gradle starts a build daemon | 20:16 |
sean-k-mooney | i probably need to kill it after gradle build | 20:16 |
sean-k-mooney | that probably would have worked better if i did the ps command before the kill ... | 20:19 |
ianw | sounds likely ... i have wondered if we're better off without this loop, as it probably does make us look closer at failures like this | 20:22 |
ianw | but then again, it probably doesn't matter if we kill an unused build-daemon either | 20:22 |
ianw | probably no 100% correct answer | 20:22 |
sean-k-mooney | ianw: well we probably should not fail but have a very big warning in the log output that we had to kill something that should not be running | 20:23 |
sean-k-mooney | in my case it was postgres which was also started by the gradle build script when it installed the deb backage ... | 20:24 |
sean-k-mooney | *package | 20:24 |
ianw | ahh, yeah, so i guess postgres handles signals nicely, but it is the type of thing where you might end up with a corrupted final image if a process doesn't | 20:26 |
sean-k-mooney | ya i guess that is true. i'm going to put a post-install action in my element to stop postgres cleanly | 20:27 |
sean-k-mooney | ianw: one other thing i noticed is that the package-install element does not handle unicode well | 20:27 |
sean-k-mooney | i got an ascii decode error from the console output of having it install java on ubuntu | 20:28 |
ianw | :/ i feel like we've "fixed" unicode there a bunch of times | 20:29 |
sean-k-mooney | ianw: is there a reason we dont use oslo.processutils instead of POPEN directly | 20:29 |
sean-k-mooney | maybe we would get the same error but just a thought | 20:30 |
clarkb | please no oslo | 20:31 |
clarkb | ianw: I thought tobiash had figured out the reason | 20:31 |
ianw | probably related to https://review.openstack.org/#/c/548958/ but i couldn't fully replicate that | 20:31 |
clarkb | it's because we don't init the locale stuff before using it | 20:31 |
clarkb | so it defaults to super conservative procedures (I think it may be a race) | 20:31 |
* clarkb digs up python docs | 20:31 | |
sean-k-mooney | ianw: yep that is exactly the issue i hit | 20:32 |
sean-k-mooney | ianw: or at lest that is the exact line that was raising the exception | 20:32 |
clarkb | ianw: https://docs.python.org/3/library/locale.html#locale.getpreferredencoding universal newlines uses that function and we don't call setlocale | 20:33 |
clarkb | iirc tobiash managed to test it and calling setlocale first fixed it? | 20:33 |
clarkb | I think in the end tobiash decided to fix it that other way because package names won't necessarily come out in your current locale? | 20:34 |
clarkb | something like that, it is an annoying problem | 20:34 |
ianw | clarkb: so universal_newlines is invalid unless you've called setlocale() first? that may be what i'm not grokking | 20:34 |
clarkb | ianw: "on some platforms" and we seemed to be able to determine that tobiash's linux was one of them | 20:35 |
clarkb | I don't know what determines that though | 20:35 |
ianw | sean-k-mooney: if you're triggering it, what's your platform? | 20:36 |
sean-k-mooney | ianw: if you want to reproduce just create an element that installs default-jdk or gradle on ubuntu. | 20:36 |
sean-k-mooney | ubuntu xenial guest on a centos 7.4 host | 20:36 |
sean-k-mooney | my centos host vm is also just an unmodified cloud image. this is the locale info from the host http://paste.openstack.org/show/719415/ if that helps but i guess the issue is with the guest locale | 20:39 |
clarkb | more specifically its locale as loaded by python in the guest | 20:39 |
clarkb | subprocess can't call setlocale due to it not being thread safe I think | 20:40 |
clarkb | but we can call setlocale prior | 20:40 |
sean-k-mooney | clarkb: ianw: i'm just heading home but if i can create an example that reproduces reliably i'll upload it to github and send it to ye tomorrow. | 20:40 |
sean-k-mooney | clarkb: ianw thanks for your help. | 20:41 |
ianw | yeah, that it's inside the newly built guest and locale settings might not be quite sane may be starting to gel as a root cause | 20:42 |
*** hasharAway has quit IRC | 21:11 | |
openstackgerrit | Michael Johnson proposed openstack/diskimage-builder master: Add pip cache cleanup to pip-and-virtualenv https://review.openstack.org/562055 | 21:26 |
openstackgerrit | Michael Johnson proposed openstack/diskimage-builder master: Add pip cache cleanup to pip-and-virtualenv https://review.openstack.org/562055 | 21:30 |
openstackgerrit | Paul Belanger proposed openstack/diskimage-builder master: Revert "debootstrap: Call update-initramfs explicitly" https://review.openstack.org/562004 | 21:47 |
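
The kill race discussed around 19:59-20:11 (a parent is killed first, its child then exits by itself, and the later kill on the child's PID fails and halts the build) can be illustrated with a small sketch. This is a Python analogue written for illustration only, not the bash loop in diskimage-builder's common-functions; the helper name `kill_ignoring_missing` and the sleeping child process are hypothetical. It assumes a POSIX host and treats "no such process" as success, in the spirit of the proposed `|| /bin/true`, while letting other errors (for example a permission error) propagate.

```python
import os
import signal
import subprocess
import sys


def kill_ignoring_missing(pids, sig=signal.SIGTERM):
    """Signal every PID in pids; return those that had already exited."""
    already_gone = []
    for pid in pids:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            # No such process: it exited between being listed and being
            # signalled, which is the outcome we wanted anyway.
            already_gone.append(pid)
    return already_gone


if __name__ == "__main__":
    # Stand-in for a leftover daemon (hypothetical; any long-lived child works).
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    print(kill_ignoring_missing([child.pid]))  # child is alive: nothing reported
    child.wait()  # reap it so the PID no longer exists
    print(kill_ignoring_missing([child.pid]))  # now reported as already gone
```

Returning the list of vanished PIDs rather than silently swallowing the error leaves room for the "very big warning in the log" suggested at 20:23, since the caller can log which processes had to be cleaned up.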