*** swest has quit IRC | 01:50 | |
*** swest has joined #zuul | 02:04 | |
*** evrardjp has quit IRC | 04:36 | |
*** evrardjp has joined #zuul | 04:36 | |
*** etp has quit IRC | 06:46 | |
*** dpawlik has joined #zuul | 06:50 | |
*** etp has joined #zuul | 07:26 | |
*** dpawlik has quit IRC | 07:33 | |
openstackgerrit | Albin Vass proposed zuul/zuul master: Update to create-react-app 3.4.1 https://review.opendev.org/716305 | 07:33 |
*** avass has joined #zuul | 07:34 | |
*** tosky has joined #zuul | 08:43 | |
avass | mordred: ooh, the error only occurs if you build it. I'm able to reproduce it locally now | 08:53 |
zbr | clarkb: mordred: any chance to finally get https://review.opendev.org/#/c/690057/ in? | 11:06 |
zbr | i rebased that change for more than 6mo, every time other changes were more important :p | 11:07 |
avass | zbr: right, I was supposed to take another look at that | 11:43 |
zbr | avass: thanks. also, please ping me when you think that the react-app bumping is ready. | 11:46 |
avass | zbr: sure, I've tracked it down to the NotificationDrawer.Toggle returning undefined so far | 11:48 |
avass | I think that's it at least | 11:49 |
*** tumble has joined #zuul | 12:50 | |
mordred | avass: ooh! so a difference between prod and debug builds perhaps | 13:33 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Add new non-npm specific javascript jobs https://review.opendev.org/726547 | 13:53 |
avass | mordred: yeah must be, fixing peer dependencies doesn't seem to solve it either | 13:56 |
mordred | avass: I wonder if somehow it's optimizing something out that it thinks is unused, and then when there is an error it's trying to use it without initializing it *waves hands wildly* | 13:59 |
avass | mordred: I'm trying to see if it's something like that by comparing to master | 14:01 |
avass | mordred: looks like it's missing some components from patternfly | 14:14 |
avass | mordred: I think that's it, NotificationDrawerToggle.js is served on master and when running a debug build but must be optimized away when building | 14:22 |
mordred | avass: aha! | 14:24 |
mordred | avass: I'm reading through webpack stuff to see what we can do about it | 14:31 |
*** tumble has quit IRC | 14:35 | |
mordred | avass: what about doing this: http://paste.openstack.org/show/793359/ | 14:42 |
avass | mordred: tried it, didn't seem to change anything | 14:44 |
avass | mordred: oh, except for NotificationDrawer.Toggle -> NotificationDrawerToggle, I'll try that | 14:44 |
mordred | hah: Attempted import error: 'NotificationDrawerToggle' is not exported from 'patternfly-react'. | 14:45 |
avass | mordred: interesting, looks like anything to do with NotificationDrawer crashes the webapp | 14:51 |
avass | mordred: this also works with a debug build but crashes a production build: http://paste.openstack.org/show/793360/ | 14:53 |
avass | mordred: since all of those files, Panel, Toggle etc are missing | 14:54 |
avass | mordred: and this works: http://paste.openstack.org/show/793361/ :) | 14:55 |
openstackgerrit | Merged zuul/zuul master: Don't reconfigure the tenant on tag creation https://review.opendev.org/726213 | 15:05 |
avass | mordred: I wonder if we should be using StatefulToggleNotificationDrawerWrapper or something like that instead | 15:25 |
*** evrardjp has quit IRC | 16:36 | |
*** evrardjp has joined #zuul | 16:36 | |
*** tumble has joined #zuul | 17:00 | |
openstackgerrit | Merged zuul/zuul master: executor: Catch error when reading cpu_times https://review.opendev.org/726545 | 17:17 |
tumble | I don't understand how to prepare an image for nodepool. I currently just configure a naked debian image in the provider section's cloud-images attribute. But how do I know what needs to be installed in such an image beforehand, and where does the nodepool-builder come into play? Or is it optional? | 17:29 |
*** armstrongs has joined #zuul | 17:32 | |
fungi | nodepool's builder daemon creates disk images and then uploads them to providers | 17:36 |
fungi | it uses software called diskimage-builder to create the images | 17:36 |
fungi | diskimage-builder is configured with a set of "elements" which consist of arbitrary scripts/executables run either inside or outside chroot trees in various phases and a defined priority order | 17:37 |
fungi | dib can do something as simple as consume a distro's published cloud image and add some custom configuration or copy some cached files into it | 17:38 |
fungi | and also convert to image formats specific to different providers, in case your providers have different image format requirements | 17:38 |
fungi | the builder recreates and uploads new images at a configurable frequency (daily, weekly, whatever you like) to keep them fresh | 17:39 |
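(For reference, a minimal nodepool.yaml wiring the builder into an OpenStack provider might look roughly like the sketch below; the cloud name, label, and element list are illustrative, not taken from this discussion.)

```yaml
# Rough sketch of a nodepool.yaml that uses nodepool-builder with
# diskimage-builder; cloud, label, and element names are illustrative.
labels:
  - name: debian-buster

diskimages:
  - name: debian-buster
    formats:
      - qcow2
    elements:            # diskimage-builder elements run during the build
      - debian-minimal
      - vm
      - simple-init
    release: buster

providers:
  - name: my-cloud
    cloud: my-cloud      # matches an entry in clouds.yaml
    diskimages:
      - name: debian-buster   # the builder uploads this image to the provider
    pools:
      - name: main
        max-servers: 5
        labels:
          - name: debian-buster
            diskimage: debian-buster
            min-ram: 2048
```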
tumble | ty, it sounds like it's just a useful helper and I could just as well provide some image within my cloud for nodepool to use, right? But I would need to know what tools zuul expects to find in it, like ansible probably? | 17:40 |
fungi | ansible is "agentless" so you just need to be able to ssh into the server once booted from the image | 17:41 |
*** armstrongs has quit IRC | 17:41 | |
fungi | oh, you do need a python interpreter in the image, i guess | 17:41 |
tumble | I thought the zuul-executor sshs into such node and executes ansible there, running playbooks etc. | 17:41 |
fungi | ansible actually builds ephemeral python modules at the initiating end, copies them over an ssh connection and executes them in-place on the remote side | 17:42 |
tumble | ah okay so the executor is the host running the plays and the nodepool nodes are just the target hosts | 17:42 |
fungi | so you don't need ansible installed on a remote system, the ansible run by the executor is installed on the executor | 17:43 |
fungi | yep | 17:43 |
fungi | the nodepool nodes are your ansible "inventory" | 17:43 |
tumble | okay, cool. I'm still struggling because I don't really see why my zuul builds are failing. No log output whatsoever. Was just my wild theory that the node images lack some tooling, but sounds like that shouldn't be the case. | 17:44 |
fungi | the zuul scheduler figures out from your job's specified nodeset how many of what types of nodes to request from the nodepool launchers, and then when they return that set of node ids zuul creates an ansible inventory on the executor for that job | 17:45 |
fungi | so check that the ssh bootstrapping key zuul is using on your executor(s) can successfully log into the configured account name on one of your nodepool nodes | 17:45 |
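(A concrete, made-up illustration of that flow: a job asking for one node by nodepool label, which then becomes the single host in the inventory Zuul generates on the executor.)

```yaml
# Illustrative only: job, label, and playbook names are made up.
- job:
    name: my-first-job
    nodeset:
      nodes:
        - name: debian-node      # hostname in the generated Ansible inventory
          label: debian-buster   # must match a nodepool label
    run: playbooks/my-first-job.yaml
```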
tumble | ok, good point, will check that once more. | 17:46 |
fungi | also the executor log should contain output from any attempted ansible invocations | 17:48 |
fungi | you can set debugging on if you want verbose ansible output too | 17:48 |
fungi | and you can even set the executor to not clean up the workspaces it creates, in case you want to inspect one | 17:48 |
fungi | however, be careful to turn it back off again, it can quickly eat a lot of disk on an active deployment | 17:49 |
tumble | guess I'll look for the verbose option then, because no errors appear in the executor and ssh into the nodes works | 17:51 |
tumble | it just recreates the nodes a few times until the retry limit is reached | 17:51 |
fungi | do you see evidence of any builds getting initiated at all? | 17:51 |
fungi | if not, may be better to start with the scheduler's logs to see if it's not assigning any work to the executors | 17:52 |
tumble | http://dpaste.com/0F29FZJ <- this is what the executor says and it looks like it's running jobs to me | 17:53 |
tumble | http://dpaste.com/295HV8Z <- and that's the scheduler. Also looks "okay" apart from some warnings which seem to be expected | 17:56 |
fungi | based on the executor log, it looks like it thinks the job simply isn't configured to do anything | 18:03 |
tumble | oh, well, that's possible since all it does is use the debug module to say hi | 18:04 |
tumble | maybe I should try to do more than that. But why would it report back to gerrit as FAILURE then? :/ | 18:05 |
fungi | probably the next thing to work on is log collection | 18:05 |
tumble | yeah I saw the upload-logs thingy. I hope the job run gets far enough to reach that | 18:05 |
fungi | well, the executor seems to indicate the job completed | 18:06 |
fungi | but also, turning on debug logging for both the executor and scheduler might help in working out what it's doing exactly | 18:07 |
fungi | figuring out why zuul isn't doing something is a lot harder than figuring out why it is doing something | 18:07 |
tumble | :D | 18:07 |
tumble | I also prefer big stack traces yelling at me that stuff is broken | 18:07 |
fungi | (since there's an essentially infinite number of possible things it *could* do) | 18:07 |
fungi | but yeah, from the log snippets you pasted, it looks like it believes it's doing whatever you asked of it | 18:08 |
fungi | debug level logs will provide some additional detail as to what it reports, why it reported it, et cetera | 18:09 |
fungi | also it records the entire decision-making process about why it did or didn't run a particular job | 18:12 |
fungi | though in this case it seems to be running a job, you presumably just want to know why it reports a failure for builds of that job | 18:13 |
fungi | debug logging on the executor will get you all the ansible stdout/stderr too | 18:14 |
tumble | activated that now, indeed very talkative. Is there a smarter way of retriggering a build from a gerrit review than changing something, amending and using git review again? | 18:14 |
fungi | in opendev we configure our check pipeline to additionally trigger on any review comment starting with the word "recheck" | 18:15 |
fungi | but zuul also comes with a zuul-enqueue utility which can enqueue a change via its rcp interface | 18:15 |
fungi | s/rcp/rpc/ | 18:15 |
tumble | since I followed the tutorial with that, I just remembered it's indeed configured to react to comments | 18:15 |
tumble | let's see | 18:15 |
fungi | er, i guess it's actually the "enqueue" subcommand of the "zuul" rpc client utility | 18:16 |
fungi | `zuul enqueue --help` to get context help and find out the parameters it needs | 18:16 |
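(The "recheck" behaviour fungi describes is plain pipeline configuration; a simplified sketch, not OpenDev's exact trigger regex, looks something like this.)

```yaml
# Simplified sketch of a check pipeline that also re-triggers on "recheck" comments.
- pipeline:
    name: check
    manager: independent
    trigger:
      gerrit:
        - event: patchset-created
        - event: comment-added
          comment: (?i)^\s*recheck\s*$
    success:
      gerrit:
        Verified: 1
    failure:
      gerrit:
        Verified: -1
```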
tumble | Ansible output: b"bwrap: No permissions to creating new namespace, likely because the kernel does not allow non-privileged user namespaces. On e.g. debian this can be enabled with 'sysctl kernel.unprivileged_userns_clone=1'." | 18:23 |
tumble | this looks like it could be the culprit | 18:23 |
tumble | is it trying to wrap the execution in a kind of container? | 18:24 |
fungi | yes, the executor sandboxes ansible invocations with a tool called bubblewrap, which is like a lightweight container essentially | 18:25 |
fungi | this is an added layer of protection against jobs coercing ansible into doing something untoward on the executor and being able to influence or compromise other builds | 18:26 |
tumble | so nodes are recycled for several builds? I thought it would create a fresh one for each job | 18:27 |
tumble | s/job/pipeline/ | 18:27 |
fungi | job nodes are recreated not recycled (unless you're using the static provider driver) | 18:27 |
tumble | using the openstack driver | 18:28 |
fungi | but jobs can also run playbooks locally on the executor by specifying "localhost" as the host | 18:28 |
fungi | however only in a very constrained environment | 18:28 |
fungi | playbooks run on the executor are generally things like copying git repositories to remote nodes, collecting logs from remote nodes for publication, and sometimes lightweight actions like hitting a rest api somewhere | 18:29 |
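(An executor-side playbook of the lightweight kind fungi mentions might look roughly like this; the endpoint and task are made up for the example.)

```yaml
# Illustrative executor-side playbook: runs on "localhost" (the executor,
# inside its bubblewrap sandbox), not on the nodepool nodes.
- hosts: localhost
  tasks:
    - name: Notify an external service that the build finished
      uri:
        url: https://example.com/hooks/build-complete   # made-up endpoint
        method: POST
        body_format: json
        body:
          build: "{{ zuul.build }}"
```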
tumble | it looks like I won't be able to avoid reading about the diskimage builder ;D | 18:30 |
fungi | tumble: well, the error you mentioned is specific to the server where the zuul-executor service is running, and has nothing to do with nodepool or diskimage-builder | 18:42 |
fungi | it's saying your executor's server needs kernel.unprivileged_userns_clone enabled in the kernel, or possibly that you're trying to run the executor service itself in an unprivileged container | 18:43 |
fungi | i think also the bubblewrap tool may expect to be setuid 0 if it's going to be run by unprivileged users | 18:48 |
tumble | oh, good point, I didn't realize it's failing on the executor. Well, I'm using the "official" (but undocumented) containers. That might be the reason | 18:50 |
fungi | the executor container may need to be run privileged | 18:51 |
tumble | will try that | 18:51 |
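(In a Compose-based deployment that change amounts to something like the snippet below; the image name and volume layout are assumptions, not taken from the log.)

```yaml
# docker-compose sketch: run the executor privileged so bwrap can create
# the user namespaces it needs to sandbox Ansible. Image/volume are assumed.
services:
  executor:
    image: zuul/zuul-executor
    privileged: true
    volumes:
      - ./etc_zuul:/etc/zuul:ro
```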
fungi | our deployment (opendev) is running the scheduler, web and merger services in regular containers but still running the executor service outside containers | 18:51 |
fungi | in part because of the complexity of essentially having to do nested containers | 18:52 |
tumble | I could do the same when scaling out, but since it's still an all-in-one deployment and I'm still playing around with all the different pieces, it was very inviting to go with containers that don't pollute the VM | 18:53 |
tumble | privileged did the trick | 18:56 |
tumble | "everything" seems to be working now | 18:56 |
tumble | thank you very much. Would have taken me forever to figure it out alone. :) | 18:57 |
*** avass has quit IRC | 18:59 | |
fungi | you're welcome! sounds like we could stand to improve those parts of the documentation | 19:06 |
AJaeger | mordred, ianw, so https://review.opendev.org/726413 passes now in opendev but still fails in Vexxhost CI | 19:06 |
AJaeger | mordred: https://review.opendev.org/#/c/726413/7/roles/ensure-pip/tasks/workarounds.yaml : the probe for venv says it exists - but that looks wrong.. | 19:06 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Revert "Revert "ensure-tox: use venv to install"" https://review.opendev.org/726413 | 19:08 |
AJaeger | mordred: let's enable logs to see why Vexxhost CI thinks venv is installed in the test but complains later about it missing ^ | 19:09 |
mnaser | sorry for inflicting that much work :) but uh yeah, nothing 'fancy' about those images | 19:11 |
mnaser | they're dib built and i think i posted the config | 19:11 |
mnaser | yikes https://www.irccloud.com/pastebin/6xgxu5BG/ | 19:12 |
mnaser | AJaeger: ^ | 19:13 |
AJaeger | mnaser: thanks. Let's see what the logs show from this run. | 19:15 |
AJaeger | mnaser: btw. could you help get https://review.opendev.org/#/c/725898 merged so that https://review.opendev.org/#/c/724067/2 can merge, please? (that's ansible-hardening) | 19:16 |
AJaeger | mnaser: is VEXXHOST CI down? I'm used to quick reports but haven't seen anything on 726413 yet... | 19:22 |
mnaser | AJaeger: 3 jobs x 3 retry_limit :) | 19:23 |
mnaser | AJaeger: https://zuul.vexxhost.dev/t/opendev/status :) | 19:23 |
mnaser | it just reported | 19:23 |
AJaeger | mnaser: ah, great - thanks | 19:23 |
mnaser | (that issue we're troubleshooting fails in pre so it retries it 3 times) | 19:23 |
AJaeger | sorry for my impatience | 19:23 |
AJaeger | Ah! | 19:23 |
AJaeger | mordred, mnaser, ianw, I'm puzzled https://zuul.vexxhost.dev/t/opendev/build/d3320ba9ce4f461ba6c420674e908d66/console shows that python3 -m venv --help works and venv is installed (ensure-pip: Probe for venv) and then fails with venv not installed | 19:25 |
mnaser | AJaeger: maybe we need to look into how debian does its "replacement" of venv | 19:32 |
mnaser | maybe it replaces it with a stripped-down module and doesn't remove --help and stuff like that | 19:33 |
AJaeger | yeah, might be needed | 19:35 |
mnaser | AJaeger: oh... | 19:36 |
mnaser | AJaeger: we're trying to create a venv using py3 in the tox-py27 job? | 19:36 |
mnaser | https://packages.debian.org/buster/amd64/python3-venv/filelist | 19:41 |
mnaser | maybe we can check for pyenv ? | 19:41 |
mnaser | pyvenv* | 19:41 |
ianw | ... this was supposed to be simple! :) | 22:35 |
ianw | AJaeger/mnaser: the other thing about that message is i think it's something that's thrown on any exception, maybe it's not actually related to python3-venv | 22:38 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: [wip] Revert "Revert "ensure-tox: use venv to install"" https://review.opendev.org/726413 | 22:44 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: [wip] Revert "Revert "ensure-tox: use venv to install"" https://review.opendev.org/726413 | 22:49 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: [wip] Revert "Revert "ensure-tox: use venv to install"" https://review.opendev.org/726413 | 22:53 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: [wip] Revert "Revert "ensure-tox: use venv to install"" https://review.opendev.org/726413 | 22:59 |
*** tumble has quit IRC | 23:10 | |
*** tosky has quit IRC | 23:13 | |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: [wip] Revert "Revert "ensure-tox: use venv to install"" https://review.opendev.org/726413 | 23:17 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: [wip] Revert "Revert "ensure-tox: use venv to install"" https://review.opendev.org/726413 | 23:22 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: [wip] Revert "Revert "ensure-tox: use venv to install"" https://review.opendev.org/726413 | 23:27 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: [wip] Revert "Revert "ensure-tox: use venv to install"" https://review.opendev.org/726413 | 23:30 |
ianw | ^ /bin/sh: 4: /usr/local/bin/python3: not found ... so there is *no* python in /usr/local/bin on these images | 23:32 |
ianw | but there *is* a pip installation there it seems | 23:33 |
fungi | if python3 is installed from distro packages but ensurepip is used to bootstrap pip, i believe you'll get that combination | 23:37 |
ianw | yeah, basically the problem here is that we have assumed "if pip3 command exists, then python -m venv must work" | 23:47 |
ianw | i think this is true if you install python3-pip package, which (again i think) will bring in python3-venv on debuntu | 23:48 |
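(One way to stop relying on that assumption, sketched here as Ansible tasks rather than whatever the role actually ends up doing: probe venv by creating a throwaway environment, since on Debian `python3 -m venv --help` can succeed while actual creation fails for lack of python3-venv/ensurepip.)

```yaml
# Sketch only: probe whether "python3 -m venv" can really create an
# environment, not just whether pip3 or "--help" exists.
- name: Probe venv by creating a scratch environment
  command: python3 -m venv /tmp/zuul-venv-probe
  register: venv_probe
  failed_when: false
  changed_when: false

- name: Clean up the probe environment
  file:
    path: /tmp/zuul-venv-probe
    state: absent

- name: Record whether venv is usable
  set_fact:
    venv_usable: "{{ (venv_probe.rc | default(1)) == 0 }}"
```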