clarkb | corvus: ok a few comments on https://review.opendev.org/c/zuul/zuul/+/788988 but I +2'd it | 00:20 |
---|---|---|
clarkb | the majority are testing things. The other two are not critical at this stage I don't think. Just callouts for things that might need changing as we move forward | 00:20 |
clarkb | but please double check | 00:20 |
clarkb | and now I must figure out dinner | 00:21 |
opendevreview | James E. Blair proposed zuul/zuul master: Remove ExecutorApi.update() call from tests https://review.opendev.org/c/zuul/zuul/+/798778 | 00:39 |
corvus | clarkb: thanks! | 00:39 |
pabelanger[m] | So we've found a regression in 4.6.0 I think | 02:33 |
pabelanger[m] | related to https://opendev.org/zuul/zuul/commit/be50a6ca42c41c0608dd02930a01123afd4e6064 | 02:33 |
pabelanger[m] | the runAnsibleFreeze function doesn't take into account an inventory file | 02:34 |
pabelanger[m] | where an ssh connection is not SSH | 02:34 |
pabelanger[m] | we have images that set https://zuul-ci.org/docs/nodepool/openstack.html#attr-providers.[openstack].diskimages.connection-type to network-cli | 02:35 |
pabelanger[m] | but now we get timeout while scanning for facts | 02:35 |
pabelanger[m] | https://pastebin.com/raw/cgR6H4JS | 02:36 |
pabelanger[m] | the change to https://opendev.org/zuul/zuul/src/branch/master/zuul/executor/server.py#L794 is why | 02:49 |
pabelanger[m] | we properly skip them for setup, but no longer properly filter it on freeze playbook | 02:50 |
opendevreview | Paul Belanger proposed zuul/zuul master: Don't use blacklisted connections for freeze playbook https://review.opendev.org/c/zuul/zuul/+/798784 | 03:03 |
pabelanger[m] | corvus: tobiash: clarkb: tristanC: ^my attempt to fix broken freeze jobs for zuul.a.c | 03:05 |
pabelanger[m] | happy to make the code better if people suggest another way | 03:05 |
*** sshnaidm is now known as sshnaidm|afk | 04:39 | |
*** marios is now known as marios|ruck | 05:08 | |
*** marios is now known as marios|ruck | 07:16 | |
*** jpena|off is now known as jpena | 07:39 | |
*** marios is now known as marios|ruck | 08:09 | |
opendevreview | Benjamin Schanzel proposed zuul/zuul-jobs master: Fix a bug in s3 log uploader with .gz files https://review.opendev.org/c/zuul/zuul-jobs/+/798802 | 08:15 |
opendevreview | Simon Westphahl proposed zuul/zuul master: Stop jobs on gearman disconnect https://review.opendev.org/c/zuul/zuul/+/714722 | 09:35 |
*** sshnaidm|afk is now known as sshnaidm | 10:53 | |
*** jpena is now known as jpena|lunch | 11:27 | |
*** marios|ruck is now known as marios|ruck|call | 12:01 | |
*** jpena|lunch is now known as jpena | 12:27 | |
*** marios|ruck|call is now known as marios|ruck | 12:57 | |
corvus | pabelanger: can we set gather_facts to false in the freeze playbook instead? | 13:45 |
pabelanger[m] | I am trying to remember which host set_fact run on | 14:14 |
pabelanger[m] | because we cannot access the node via SSH connection right now | 14:15 |
corvus | pabelanger: can you test that to help confirm it works? | 15:56 |
corvus | pabelanger: hrm, it looks like we don't gather facts in the setup playbook? | 16:03 |
*** marios|ruck is now known as marios|out | 16:12 | |
opendevreview | Merged zuul/zuul master: Put zuul vars back in debug inventory.yaml https://review.opendev.org/c/zuul/zuul/+/798092 | 16:24 |
corvus | pabelanger: i'm confused -- why can't network_cli handle fact gathering? | 16:48 |
*** jpena is now known as jpena|off | 16:49 | |
corvus | i just tested kubectl, and it does... which makes me wonder why we don't run the setup module for that | 16:49 |
corvus | ah, we don't run the setup module for that because we don't put it in the inventory at all | 16:51 |
pabelanger[m] | it can, but it is a blacklisted connection. So up until now, we would not scan the nodes | 16:51 |
pabelanger[m] | in these nodes right now, don't have interfaces online | 16:51 |
corvus | pabelanger: so this isn't an issue with network_cli -- this issue is that you're adding nodes to the inventory which aren't online? | 16:52 |
corvus | pabelanger: and i guess you do something out of band to make them be online? | 16:53 |
corvus | (also, i was wrong, we do add kubectl nodes to the inventory; so it's unclear to me why we don't run setup on them) | 16:54 |
pabelanger[m] | yes, we do https://github.com/ansible-network/windmill-config/blob/master/nodepool/nl01.sjc1.vexxhost.zuul.ansible.com.yaml#L347 which means we don't keyscan the node, and setup playbook would not run. Then we have a pre-run playbook that will scan the private IP and setup our public interface properly: https://github.com/ansible/ansible-zuul-jobs/blob/master/playbooks/ansible-network-appliance-base/pre.yaml#L19 | 16:56 |
pabelanger[m] | but now, we have no way to skip the job freeze playbook | 16:56 |
pabelanger[m] | from zuul-executor we do not access these nodes | 16:56 |
pabelanger[m] | only from the controller node | 16:56 |
corvus | pabelanger: okay, that sounds like a pretty unusual situation, and tbh, i think the support in zuul needs to change a bit. because if it's the case that in normal circumstances, a networkcli host can actually run setup or gatherfacts, that's probably what zuul should be doing. you should probably rework both zuul and your use of it to use the "resources" system that is used for k8s namespaces for what you're doing (that's a concept of "here is | 17:02 |
corvus | a resource from nodepool that zuul shouldn't use directly, but the job will"). | 17:02 |
corvus | pabelanger: that's a long-term issue, but it would be great if you're interested in working on that | 17:03 |
corvus | pabelanger: for the short term, my guess is you may be the only user in the world using networkcli, so if you are okay with basically losing access to all job variables for networkcli hosts, i think we could merge your patch as-is | 17:03 |
corvus | pabelanger: it sounds like if you're not using the hosts from zuul anyway, then you're probably not going to care about not having job vars | 17:04 |
avass[m] | Does anyone know if it's possible to stop element from displaying the homeserver part of usersnames? Irc style layout doesn't really handle that very well | 17:06 |
pabelanger[m] | corvus: yes, we only force network_cli today, because it is a blacklisted connection. It seems now, we will support it for job freeze playbooks, is that right? | 17:06 |
pabelanger[m] | yes, I don't believe we use any host vars today on these nodes | 17:06 |
corvus | pabelanger: can you update your patch with a release note? | 17:08 |
pabelanger[m] | yes, I can do that shortly | 17:09 |
fungi | avass[m]: my guess is that would lead to ambiguity if two users at different homeservers had the same local usernames | 17:09 |
corvus | avass: it may only be doing it for folks who have a collision; have an example? | 17:10 |
avass[m] | fungi : I could live with that not being completely unambiguous in the chat since it could also be possible to mouseover/open the user profile | 17:10 |
* avass[m] uploaded an image: (9KiB) < https://matrix.org/_matrix/media/r0/download/vassast.org/dvDfKPpmekrVlWvnSaxfAxBU/image.png > | 17:10 | |
avass[m] | corvus: yes: :) | 17:10 |
corvus | hrm, that's different than what i see. | 17:11 |
avass[m] | maybe a version difference? | 17:11 |
* corvus uploaded an image: (8KiB) < https://matrix.org/_matrix/media/r0/download/acmegating.com/rnGvemIrLORJUeYmfbqlvRQn/image.png > | 17:11 | |
corvus | yeah maybe | 17:11 |
avass[m] | oh, it's only on my windows machine... | 17:12 |
corvus | Element version: 3a67dc18e9e3-react-3a67dc18e9e3-js-3a67dc18e9e3 lol that's helpful | 17:12 |
fungi | blame windows | 17:12 |
clarkb | the collision between my irc and matrix names is useful bceaus matrix sends me notifications even if I primarily consume this channel via irc :) | 17:13 |
avass[m] | Version: 1.7.31 on windows, but 1.7.20 is fine on manjaro. But I'm gonna guess it's a windows thing | 17:13 |
corvus | avass: using element desktop vs browser? | 17:15 |
avass[m] | corvus: desktop on all three machines, but I think it's an electron app anyway | 17:15 |
corvus | avass: yeah, but i'm wondering how os makes a difference | 17:16 |
avass[m] | corvus: that's a good question in that case, maybe it has different css engines? | 17:17 |
avass[m] | or something like that | 17:17 |
corvus | pabelanger: wait a minute -- i think i just now understood that you're saying you're not using network_cli at all -- you just found a connection type in zuul that happened to not run the setup playbook and you used that? | 17:19 |
corvus | pabelanger: so we have no idea how network_cli behaves with the freeze playbook -- it could work for all we know, and your change could break it? | 17:20 |
pabelanger[m] | yes, that is right | 17:21 |
pabelanger[m] | in 3.0 we didn't support network_cli as a connection type | 17:21 |
pabelanger[m] | because of security reasons | 17:21 |
pabelanger[m] | but now, it seems freeze playbook doesn't restrict connections | 17:21 |
corvus | pabelanger: 3.x would use network_cli connections in playbooks, it only didn't run setup | 17:22 |
corvus | also 4.x through 4.5.1 | 17:22 |
pabelanger[m] | maybe I am not understanding the idea of BLACKLISTED_ANSIBLE_CONNECTION_TYPES | 17:23 |
pabelanger[m] | I don't think we ever got network_cli to work from zuul-executor in 3.0 | 17:23 |
pabelanger[m] | if we could, that would actually save a lot of work for us | 17:23 |
pabelanger[m] | however, it's been a while since I've had to think about this process | 17:24 |
corvus | pabelanger: this is where it was added: https://review.opendev.org/530265 | 17:24 |
pabelanger[m] | yes, that is true. Only recently gather facts for network_cli support was added | 17:25 |
pabelanger[m] | we also have specific facts gathers, like vyos_facts or ios_facts. I need to check if gather_facts will map to them in 2.8 and 2.9 ansible | 17:26 |
corvus | pabelanger: okay, i'm hesitant to change the behavior for network_cli here -- but i left a suggestion on https://review.opendev.org/798784 for what i think you can do, and i think it's about the same amount of work (basically, add a new resource type). you'll want to add a test for it too, and a release note. | 17:28 |
pabelanger[m] | corvus: gather_facts true on network_cli for 2.8 wouldn't work, but should be okay in 2.9: https://docs.ansible.com/ansible/devel/porting_guides/porting_guide_2.9.html#improved-gather-facts-support-for-network-devices | 17:30 |
pabelanger[m] | so it depends on the version of ansible zuul jobs use | 17:30 |
pabelanger[m] | and network_cli is only used for network appliances | 17:31 |
pabelanger[m] | so you are likely right we are the only ones doin git | 17:31 |
pabelanger[m] | it* | 17:31 |
corvus | pabelanger: okay, i think it would also be okay to avoid running freeze on 2.8 in that case, but we should run it on 2.9 | 17:31 |
pabelanger[m] | yes | 17:32 |
pabelanger[m] | that still means, we have an issue of trying to scan a node that isn't online yet | 17:32 |
pabelanger[m] | and not sure how to deal with that | 17:32 |
corvus | pabelanger: yeah, i think the best solution for your case is the new resource type | 17:33 |
pabelanger[m] | Hmm, okay. I was hoping to avoid writing a new feature to support this | 18:01 |
opendevreview | James E. Blair proposed zuul/zuul master: Add some jitter to apscheduler interval cleanup jobs https://review.opendev.org/c/zuul/zuul/+/798953 | 20:20 |
opendevreview | James E. Blair proposed zuul/zuul master: Add comment for test method https://review.opendev.org/c/zuul/zuul/+/798954 | 20:20 |
clarkb | I just want to say kudos to everyone who has already reviewed the zookeeper v5 changes. They are often not short and also dense on details | 21:36 |
fungi | and double-plus thanks from me as i have not found time to review any of them | 21:39 |
opendevreview | Merged zuul/zuul master: Add ExecutorApi https://review.opendev.org/c/zuul/zuul/+/770902 | 22:58 |
opendevreview | Merged zuul/zuul master: Change zone handling in ExecutorApi https://review.opendev.org/c/zuul/zuul/+/787833 | 22:59 |
opendevreview | Merged zuul/zuul master: Switch to string constants in BuildRequest https://review.opendev.org/c/zuul/zuul/+/791849 | 23:04 |
opendevreview | Merged zuul/zuul master: Clean up Executor API build request locking and add tests https://review.opendev.org/c/zuul/zuul/+/788624 | 23:08 |
opendevreview | Merged zuul/zuul master: Fix race with watches in ExecutorAPI https://review.opendev.org/c/zuul/zuul/+/792300 | 23:09 |
clarkb | corvus: I finally got through https://review.opendev.org/c/zuul/zuul/+/784195 can you check my comments on it? | 23:18 |
corvus | lookin | 23:19 |
corvus | clarkb: thanks; i provided an explanation on #1 and revised my vote to a -1 for #2 | 23:30 |
clarkb | cool. I'll take a look once I've gotten through the next in the list | 23:30 |
opendevreview | James E. Blair proposed zuul/zuul master: Simplify queue traversal of fileschangescompletedevent processing https://review.opendev.org/c/zuul/zuul/+/798966 | 23:51 |
opendevreview | James E. Blair proposed zuul/zuul master: Switch from global to tenant event queues https://review.opendev.org/c/zuul/zuul/+/797440 | 23:53 |
opendevreview | James E. Blair proposed zuul/zuul master: Remove unused addResultEvent method https://review.opendev.org/c/zuul/zuul/+/797542 | 23:53 |
opendevreview | James E. Blair proposed zuul/zuul master: Lock/unlock nodes on executor server https://review.opendev.org/c/zuul/zuul/+/774610 | 23:53 |
opendevreview | James E. Blair proposed zuul/zuul master: Remove ExecutorApi.update() call from tests https://review.opendev.org/c/zuul/zuul/+/798778 | 23:53 |
corvus | clarkb: ^ that should address that; rebase incoming | 23:53 |
opendevreview | James E. Blair proposed zuul/zuul master: Add some jitter to apscheduler interval cleanup jobs https://review.opendev.org/c/zuul/zuul/+/798953 | 23:53 |
opendevreview | James E. Blair proposed zuul/zuul master: Add comment for test method https://review.opendev.org/c/zuul/zuul/+/798954 | 23:53 |
Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!