*** zbr has quit IRC | 00:33 | |
*** zbr has joined #zuul | 00:45 | |
*** rlandy has quit IRC | 01:24 | |
*** zbr7 has joined #zuul | 02:23 | |
*** zbr has quit IRC | 02:23 | |
*** zbr7 is now known as zbr | 02:23 | |
*** saneax has joined #zuul | 03:36 | |
*** saneax has quit IRC | 04:29 | |
*** vishalmanchanda has joined #zuul | 04:36 | |
*** hamalq has quit IRC | 04:41 | |
*** saneax has joined #zuul | 04:48 | |
*** evrardjp has quit IRC | 05:33 | |
*** evrardjp has joined #zuul | 05:33 | |
*** jfoufas1 has joined #zuul | 05:58 | |
*** zbr has quit IRC | 06:13 | |
*** zbr has joined #zuul | 06:14 | |
*** mnaser has quit IRC | 06:32 | |
*** mnaser has joined #zuul | 06:34 | |
*** jcapitao|off has joined #zuul | 07:16 | |
*** jcapitao|off is now known as jcapitao | 07:19 | |
*** piotrowskim has joined #zuul | 07:26 | |
*** rpittau|afk is now known as rpittau | 08:12 | |
*** cloudnull has quit IRC | 08:59 | |
*** jpena|off is now known as jpena | 08:59 | |
*** cloudnull has joined #zuul | 09:02 | |
*** tosky has joined #zuul | 09:14 | |
*** tosky_ has joined #zuul | 09:20 | |
*** tosky has quit IRC | 09:24 | |
*** tosky_ is now known as tosky | 09:25 | |
*** nils has joined #zuul | 09:28 | |
*** cloudnull has quit IRC | 09:37 | |
*** cloudnull has joined #zuul | 09:40 | |
*** evrardjp has quit IRC | 09:46 | |
*** evrardjp has joined #zuul | 09:48 | |
*** saneax has quit IRC | 10:36 | |
*** saneax has joined #zuul | 11:02 | |
*** sshnaidm has quit IRC | 11:15 | |
*** sshnaidm has joined #zuul | 11:18 | |
*** jcapitao is now known as jcapitao_lunch | 11:26 | |
*** harrymichal has joined #zuul | 12:15 | |
*** fdegir5 is now known as fdegir | 12:24 | |
*** jpena is now known as jpena|lunch | 12:30 | |
*** rlandy has joined #zuul | 12:31 | |
*** jcapitao_lunch is now known as jcapitao | 13:07 | |
*** jpena|lunch is now known as jpena | 13:23 | |
*** ikhan has quit IRC | 13:24 | |
openstackgerrit | Felix Edel proposed zuul/zuul master: Simplify ZooKeeper client initialization https://review.opendev.org/c/zuul/zuul/+/754360 | 13:55 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Improve typings in context of builds via ZooKeeper https://review.opendev.org/c/zuul/zuul/+/753578 | 13:56 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Make ZooKeeper mandatory for Scheduler https://review.opendev.org/c/zuul/zuul/+/756716 | 13:56 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Make ConnectionRegistry mandatory for Scheduler https://review.opendev.org/c/zuul/zuul/+/757095 | 13:56 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Improve typings in context of 756716 and 757095 https://review.opendev.org/c/zuul/zuul/+/757148 | 13:56 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Instantiate executor client, merger, nodepool and app within Scheduler https://review.opendev.org/c/zuul/zuul/+/757149 | 13:56 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Improve typings in context of lock nodes on executor https://review.opendev.org/c/zuul/zuul/+/757097 | 13:56 |
openstackgerrit | Felix Edel proposed zuul/zuul master: DNM: Reduce number of jobs for SOS development https://review.opendev.org/c/zuul/zuul/+/775081 | 13:56 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Component Registry in ZooKeeper https://review.opendev.org/c/zuul/zuul/+/759187 | 13:56 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Move management and result events to model https://review.opendev.org/c/zuul/zuul/+/761163 | 13:56 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Allow (de-)serialization of management events https://review.opendev.org/c/zuul/zuul/+/761164 | 13:56 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Allow (de-)serialization of result events https://review.opendev.org/c/zuul/zuul/+/761165 | 13:56 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Add and fix fields in driver trigger event models https://review.opendev.org/c/zuul/zuul/+/761166 | 13:56 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Allow (de-)serialization of trigger events https://review.opendev.org/c/zuul/zuul/+/761167 | 13:56 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Interface to get a driver's trigger event class https://review.opendev.org/c/zuul/zuul/+/761168 | 13:56 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Increase default test wait timeout to 120s https://review.opendev.org/c/zuul/zuul/+/763754 | 13:57 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Implementation of Zookeeper backed event queues https://review.opendev.org/c/zuul/zuul/+/761170 | 13:57 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Implementation of Zookeeper event watcher https://review.opendev.org/c/zuul/zuul/+/761171 | 13:57 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Switch to Zookeeper backed trigger event queues https://review.opendev.org/c/zuul/zuul/+/761172 | 13:57 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Switch to Zookeeper backed management event queues https://review.opendev.org/c/zuul/zuul/+/761738 | 13:57 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Improve test output by using named queues https://review.opendev.org/c/zuul/zuul/+/775620 | 13:57 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Avoid race when task from queue is in progress https://review.opendev.org/c/zuul/zuul/+/775621 | 13:57 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Implement Zookeeper backed connection event queue https://review.opendev.org/c/zuul/zuul/+/775622 | 13:57 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Dispatch Pagure webhook events via Zookeeper https://review.opendev.org/c/zuul/zuul/+/775623 | 13:57 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Dispatch Github webhook events via Zookeeper https://review.opendev.org/c/zuul/zuul/+/775624 | 13:58 |
openstackgerrit | Felix Edel proposed zuul/zuul master: Dispatch Gitlab webhook events via Zookeeper https://review.opendev.org/c/zuul/zuul/+/775625 | 13:58 |
tristanC | tobiash: looking at scheduler log only, by replacing ansible-playbook with an `exit 0`, it seems like i get a consistent 10sec end-to-end benchmark for a single pipeline/job execution | 14:00 |
tobiash | tristanC: nodeless do-nothing jobs? | 14:01 |
tristanC | which makes me wonder if the log file timestamps are enough to get accurate measurement of the process performance | 14:01 |
tobiash | for most things probably yes, however we monitor and alert on some key metrics like event queue sizes | 14:02 |
tobiash | (which we get from statsd atm) | 14:02 |
tristanC | tobiash: this is currently using a static label, but i guess nodepool could be removed using a nodeless job | 14:04 |
tristanC | tobiash: yeah, we do also monitor queue size, but my goal here is to get some synthetic benchmark to measure before and after scheduler ha performance | 14:05 |
tristanC | and using the scheduler log to measure the performance seems convenient , e.g. to collect timing between new-change, build-start and build-report | 14:07 |
*** sanjay__u has joined #zuul | 14:08 | |
*** saneax has quit IRC | 14:08 | |
tobiash | tristanC: you get those data also via the mqtt reporter afaik | 14:08 |
tobiash | we use those for kpi's of regular running reference jobs | 14:09 |
*** sanjay__u has quit IRC | 14:10 | |
tristanC | does it cover the time it takes between a new change event and when the build starts? | 14:10 |
tristanC | if i understand correctly, scheduler ha uses zookeeper to store new change event before starting a build, and that is an overhead i'd like to measure | 14:11 |
tristanC | in other words, what would be the best strategy to measure the overhead of zookeeper in the scheduler ha feature? | 14:17 |
avass | tristanC: is it the gerrit event -> zookeeper znode creation you want to measure? | 14:25 |
tristanC | avass: that would be good to know, but to compare with v4 i'd like to get `event -> build start` (for example) | 14:27 |
tristanC | so i was hoping i could collect overall performance for a thousand events, and measure individual event performance using the scheduler log | 14:29 |
avass | yeah I was just thinking that you could use a zookeeper watcher somehow. but using zuul logs would make it a bit easier and maybe more accurate. | 14:40 |
tobiash | tristanC: I think we have there the time it took from enqueue until the job starts, the time from job start to first ansible run and also the full time the job took | 14:48 |
tristanC | tobiash: you mean in statsd? | 14:49 |
tobiash | tristanC: I mean in the mqtt reporter | 14:51 |
*** jhesketh has quit IRC | 14:52 | |
corvus | those seem like easy statsd things to add if they're not there already | 14:52 |
avass | yeah looks like there's a trigger time and enqueue time: https://zuul-ci.org/docs/zuul/reference/drivers/mqtt.html#attr-%3Cmqtt%20schema%3E.buildset.builds.result | 14:52 |
tobiash | tristanC: we get something like this out of those data: https://paste.pics/ac3a15cd892374e9a12454739ec3e9c3 | 14:52 |
tobiash | we run short reference jobs in every region every 30min and graph queuing time, job preparation time and job runtime | 14:53 |
avass | I was starting to get jealous at those job durations :) | 14:54 |
tobiash | well, that's just a short reference job that uses most of our infrastructure but as short as possible | 14:55 |
tobiash | the real jobs mostly take much longer ;) | 14:55 |
tristanC | tobiash: and time_to_start is execute_time - enqueue_time ? | 14:55 |
tobiash | yes, I think so | 14:55 |
tristanC | tobiash: alright thanks, i'll work on a benchmark tool and see if i can reproduce consistent measure with the current master | 14:58 |
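The `time_to_start` calculation discussed above can be sketched as follows. This is only an illustration: the two dictionaries stand in for a decoded MQTT report, and the exact nesting of `enqueue_time` and `execute_time` in the real payload is an assumption here, as is the helper name.

```python
from typing import Mapping


def time_to_start(buildset: Mapping[str, float], build: Mapping[str, float]) -> float:
    """Seconds between the item being enqueued and the build executing.

    Assumes both values are epoch timestamps in seconds, taken from
    hypothetical dicts that mirror the fields named in the conversation.
    """
    return build["execute_time"] - buildset["enqueue_time"]


if __name__ == "__main__":
    # Made-up timestamps, only to show the arithmetic.
    print(time_to_start({"enqueue_time": 100.0}, {"execute_time": 112.5}))
```

Keeping the subtraction in one helper makes it easy to swap in other pairs of timestamps (for example trigger to enqueue) without touching the rest of a benchmark script.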
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: mqtt: document the trigger and enqueue time attribute https://review.opendev.org/c/zuul/zuul/+/776212 | 15:08 |
openstackgerrit | Oleksandr Kozachenko proposed zuul/zuul-jobs master: Update upload-logs-swift and upload-logs-gcs https://review.opendev.org/c/zuul/zuul-jobs/+/774650 | 15:46 |
*** saneax has joined #zuul | 15:54 | |
*** jfoufas1 has quit IRC | 16:15 | |
*** hashar has joined #zuul | 16:29 | |
*** piotrowskim has quit IRC | 16:32 | |
*** hashar has quit IRC | 16:35 | |
zbr | mordred: avass: ok to bypass no-tabs rule for fields named 'regex'? | 16:35 |
mordred | zbr: I'm not sure what that means? | 16:36 |
mordred | (all the words are words I understand - just not in that sequence ;) ) | 16:36 |
zbr | https://review.opendev.org/c/zuul/zuul-jobs/+/773245/3/roles/build-docker-image/tasks/main.yaml#19 | 16:37 |
avass | mordred: the no-tabs rule for ansible-lint: yaml loads \t as a tab, and ansible picks that up and errors because tabs aren't allowed | 16:37 |
mordred | AH | 16:37 |
avass | zbr: yes tabs should probably be allowed inside fields | 16:37 |
*** hashar has joined #zuul | 16:38 | |
mordred | gotit | 16:38 |
mordred | that explains why the noqa no-tabs was on that - thanks! | 16:38 |
tristanC | would it be better to use \s to also match spaces? | 16:38 |
zbr | i will patch the linter only for regex, to minimize behavior change. | 16:38 |
avass | I don't think it should be limited to 'regex' fields, for example if someone wants to dump a makefile they also need tabs | 16:39 |
zbr | that would be a trick to avoid that error. | 16:39 |
mordred | yeah | 16:39 |
avass | I mean, there are legit reasons to have tabs in fields | 16:39 |
mordred | content in fields that ansible is managing can very reasonably contain tabs | 16:39 |
zbr | at the same time, i've seen accidental pastes of tabs. | 16:39 |
zbr | in that case adding no-tabs obliterates the problem anyway. | 16:40 |
zbr | adding no-tabs to the skip-list | 16:40 |
zbr | if chucknorris uses the in-line templates to generate makefiles, i cannot help them ;) | 16:41 |
clarkb | opendev actually hit this recently. We wanted a tab in a literally quoted ini file in an ansible yaml context | 16:41 |
clarkb | that was an error. I think we just turned off the rule | 16:42 |
fungi | or turned off ansible-lint | 16:42 |
zbr | considering that gerrit is smart enough to highlight tabs, the risk is low and skip-list is ok | 16:42 |
avass | yeah I vote for turning that off. If tabs were a problem they should hopefully get picked up by testing the role anyway :) | 16:43 |
zbr | doing it right now | 16:44 |
zbr | avass: while zuul-jobs has decent test coverage, I cannot say the same about other repos. | 16:49 |
openstackgerrit | Sorin Sbârnea proposed zuul/zuul-jobs master: Upgrade ansible-lint to 5.0 https://review.opendev.org/c/zuul/zuul-jobs/+/773245 | 16:58 |
zbr | wow, this review took Time: 0h:01m:36s to upload, that is not good. | 16:58 |
*** jpena is now known as jpena|brb | 17:14 | |
avass | zbr: any reason why the loop var rule is staying? | 17:18 |
corvus | zuul-maint: we found 2 more things we said we would do before 4.0: remove zuul-migrate, and make tls required for zookeeper | 17:19 |
corvus | i'm going to start working on those patches | 17:20 |
zbr | avass: nope, i forgot that i adopted it as a core rule. | 17:20 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Remove zuul-migrate https://review.opendev.org/c/zuul/zuul/+/776245 | 17:23 |
clarkb | I didn't realize that tool had such extensive testing | 17:26 |
clarkb | anyway that first one is approved | 17:27 |
openstackgerrit | Sorin Sbârnea proposed zuul/zuul-jobs master: Upgrade ansible-lint to 5.0 https://review.opendev.org/c/zuul/zuul-jobs/+/773245 | 17:28 |
*** jcapitao has quit IRC | 17:33 | |
*** jpena|brb is now known as jpena | 17:36 | |
*** rpittau is now known as rpittau|afk | 17:41 | |
openstackgerrit | James E. Blair proposed zuul/zuul master: Require TLS for zookeeper connections https://review.opendev.org/c/zuul/zuul/+/776249 | 17:48 |
corvus | and i think we need a corresponding nodepool change | 17:48 |
*** cloudnull has quit IRC | 17:49 | |
*** cloudnull has joined #zuul | 17:49 | |
avass | did [database] become a thing and is required or are sql connections still fine? | 17:50 |
corvus | avass: database is a thing; you should migrate to it. both will be supported in 4.0 | 17:52 |
avass | Oh nvm, read the releasenotes. I guess they're fine but [database] is preferred | 17:52 |
corvus | yeah. 4.X (or 5.0 at the latest) will probably drop sql connections. | 17:52 |
corvus | there's lots of warning messages though, so migrating sooner is better :) | 17:53 |
avass | should be easy to update that, just want to double check that we're ready for v4 | 17:53 |
corvus | yeah: zk-tls everywhere, and at least one sql connection are the requirements from 3->4 (and that sql connection can be driver=sql or [database]) | 17:55 |
avass | yeah we enabled zk-tls and that's supposed to warn if you're not using it right? | 17:55 |
clarkb | I assume we'll restart opendev on that zk change to sanity check it? | 17:57 |
clarkb | that should also give people a reference for what a working config looks like | 17:57 |
clarkb | (maybe not entirely ideal, but at least workable) | 17:57 |
corvus | avass: i expect that if tls is configured but the server isn't using it then it would error, but i don't know that i've tested that | 17:57 |
corvus | clarkb: yeah, i think we should restart just to make sure the server init is fully exercised (i have manually tested locally but better safe than sorry there) | 17:58 |
avass | oh that case would be even better :) | 17:58 |
corvus | avass: i'll try that out real quick | 17:58 |
avass | oh actually, I don't think we allow non-tls connections in our zookeeper event. so it should be setup | 18:00 |
avass | even* | 18:00 |
corvus | if i configure zuul to use tls but connect to a zk on the non-ssl port, the connection fails; it just loops over: 2021-02-17 10:00:52,323 WARNING zuul.zk.base.ZooKeeperClient: Retrying zookeeper connection | 18:02 |
corvus | not the best error message, but at least it doesn't work | 18:02 |
corvus | so at least we're unlikely to have folks who think they configured tls but aren't actually using it | 18:02 |
clarkb | zbr: ansible really doesn't like the ensure-terraform checksum string changes in 773245. I'm not sure why because it seems valid. Does ansible maybe parse the {{ var }} strings in files before deserializing? | 18:09 |
*** nils has quit IRC | 18:10 | |
*** jpena is now known as jpena|off | 18:12 | |
clarkb | I guess if that were the case the old code is unlikely to work either | 18:12 |
clarkb | that is an interesting situation | 18:12 |
avass | clarkb: no that should work, it could be that it needs a '>-' or that '\' doesn't need to be escaped in the regex | 18:13 |
clarkb | avass: ya >- seems possible to remove the trailing newline | 18:14 |
clarkb | oh yup yaml doesn't allow escaped sequences in a block quoted string | 18:15 |
clarkb | the \\g -> to \g seems like another likely candidate | 18:15 |
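The `\\g` versus `\g` difference can be illustrated in plain Python. This sketch assumes the replacement string ends up, unchanged, as the template of a Python `re.sub` call, which is what would happen if a YAML block scalar leaves both backslashes in place. The function names and sample input are made up for the example.

```python
import re


def with_single_backslash(text: str) -> str:
    # "\g<1>" is a group reference, so the matched digits are kept.
    return re.sub(r"(\d+)", r"\g<1>!", text)


def with_double_backslash(text: str) -> str:
    # "\\g<1>" is an escaped backslash followed by the literal text "g<1>",
    # so the group is never substituted.
    return re.sub(r"(\d+)", r"\\g<1>!", text)


if __name__ == "__main__":
    print(with_single_backslash("v12"))
    print(with_double_backslash("v12"))
```

In a double-quoted YAML string `\\g` would be unescaped to `\g` before the regex engine sees it; in a block scalar it is passed through as written, which is consistent with the behaviour described above.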
avass | there's also a weird combination of yaml/jinja/filters with backslashes that just doesn't work. Easy to stumble into that one when working on windows and replacing paths with backslashes :) | 18:18 |
fungi | these days windows lets you use / as a path separator too though, right? | 18:25 |
avass | yeah but I believe some tools still don't like that | 18:25 |
fungi | i thought i heard that somewhere (thankfully i never have to touch the stuff) | 18:25 |
avass | tbh windows is getting a lot better but there are still some things that are a lot of pain to work with, most notably files being locked by reads/writes | 18:32 |
clarkb | once upon a time I remember the joys of trying to delete files and then playing "who has this file open" | 18:32 |
openstackgerrit | Merged zuul/zuul master: Remove zuul-migrate https://review.opendev.org/c/zuul/zuul/+/776245 | 18:34 |
fungi | clarkb: you can play that on linux/bsd too, just less often | 18:35 |
fungi | but also i have no idea what the windows equivalent of lsof would be | 18:35 |
avass | oh there are very nice gui programs that do that for you :) | 18:36 |
mordred | of course there are | 18:38 |
corvus | clarkb, mordred: any reason not to upgrade to focal for zuul/nodepool unit tests? | 18:38 |
clarkb | corvus: the biggest thing would be the lower python bound I think | 18:38 |
clarkb | bionic has 3.6 but focal will only have 3.8 | 18:39 |
corvus | ah | 18:39 |
clarkb | another option might be centos-8(-stream) for python3.6 but I think bionic is expected to have a longer shelf life at this point | 18:39 |
corvus | unsure if it's necessary at this point | 18:40 |
mordred | yeah - but other than that - I would say use focal by default and use bionic for 3.6 | 18:40 |
avass | doesn't focal come with snapd by default as well? that can be a bit evil with automatic updates and apt automatically installing a snap instead I believe | 18:40 |
clarkb | avass: our images shouldn't have snapd in them, but yes if you grab a proper ubuntu image you get snapd | 18:40 |
clarkb | (I think bionic did that too though) | 18:40 |
mordred | yah - benefit of building our own images | 18:40 |
clarkb | mordred: ++ we should be able to default to focal for everything but old python use cases | 18:41 |
corvus | just mapping out zookeeper versions | 18:41 |
avass | clarkb: hmm maybe it did | 18:42 |
mordred | clarkb: ++ | 18:43 |
tobiash | speaking about py36, how/when shall we handle its deprecation? It will be end of life end of this year. | 18:45 |
mordred | tobiash: it's a good question. I think we've historically ignored the upstream python EOL and looked instead at the support lifespan of the linux distro | 18:46 |
mordred | like - bionic has a longer support lifespan than 3.6 itself - so people could reasonably expect to be running bionic and using 3.6 | 18:47 |
mordred | but - bionic also has newer python I think | 18:47 |
mordred | which is me saying "I don't have a clear answer in my head" | 18:47 |
mordred | centos has frequently been the long-tail in terms of needing to support older things - but with centos getting killed off maybe that's not a thing we have to care about as strongly? | 18:48 |
clarkb | while that's true, rhel8 will continue to be 3.6 for almost a decade more | 18:48 |
mordred | yeah | 18:49 |
mordred | otoh - the python community seems to be aggressively dropping support for python versions in libraries once the version gets EOLd | 18:49 |
clarkb | bionic does have newer python packages available as well, but rhel8/centos8 don't currently aiui | 18:49 |
clarkb | so the question is likely to be about the rhel/centos support side of things | 18:49 |
mordred | so it might become harder to keep supporting 3.6 once it's EOL'd | 18:49 |
fungi | but not necessarily a guarantee, after all they released rhel 8 with python 2.7 and said they would drop it before the distro itself reaches eol | 18:49 |
fungi | also red hat has a clear history of backporting newer bits of the interpreter into their "old" version | 18:50 |
mordred | yeah. also - there's always the container story for people on rhel/centos if we do decide that zuul's python support position does not immediately overlap with rhel's | 18:50 |
clarkb | my hunch is that its something we'd want to deprecate and not just drop | 18:51 |
mordred | while I don't think we want to mandate everyone use containers - since there are non-container options, and rhel+container options - maybe ensuring we support non-container deployments on slow-moving platforms like rhel is too much? | 18:51 |
clarkb | so that any existing users on say centos can plan ahead (which they are likely already doing but potentially to another 3.6 distro) | 18:51 |
mordred | ++ | 18:52 |
Open10K8S | corvus: Hi there. Can you check this plz? https://review.opendev.org/c/zuul/zuul-jobs/+/774650 | 18:53 |
clarkb | it does feel like maybe python2.7 has spoiled us in a way. It would be nice if there was a largely agreed upon minimum version of python3 most things should support | 18:54 |
clarkb | of course with python2 that also happened to be the max value :) | 18:54 |
corvus | Open10K8S: probably tomorrow | 18:55 |
Open10K8S | ok, thank you for your test and review :) | 18:55 |
corvus | Open10K8S: you're welcome, thank you :) | 18:55 |
clarkb | another thought: I've been on python3.8 locally for a while now and have yet to have it break on projects that do older python3. To me that means that the biggest questions are likely to be "Will deps continue to support these older versions" and "Are there any new python3 features we would really like to see in the code base" | 18:58 |
tobiash | f strings have been introduced in 3.6 right? | 18:59 |
avass | clarkb: walrus operators, pattern matching, better f-strings | 18:59 |
corvus | i'm having trouble getting bionic's zookeeper doing tls connections, which is required for nodepool's unit tests (and i think we're probably going to require them for zuul's soon too, but right now, due to a quirk of how we set up the connection, zuul's tests don't need to use tls even if zuul does) | 18:59 |
clarkb | yes f strings were added in 3.6 | 18:59 |
clarkb | avass: pattern matching is nice but we don't even do 3.9 yet :) | 18:59 |
clarkb | or at least not when I last checked | 19:00 |
tobiash | corvus: which zk version does that have? | 19:00 |
tristanC | clarkb: if i understand correctly, the default python3.6 in rhel8 is named platform-python and it is used for dnf and other system services, regular pythons are distributed with streams which have different lifecycles, and rhel-8.2 has python3.8 | 19:00 |
corvus | tobiash: 3.4.10. it technically should be present, but it wasn't fully supported | 19:00 |
corvus | tobiash: i get: Exception in thread "main" java.lang.NoClassDefFoundError: org/jboss/netty/channel/group/ChannelGroup | 19:00 |
clarkb | tristanC: Oh I didn't realize they had updated the non platform python | 19:00 |
corvus | (even though netty jar is in classpath) | 19:00 |
clarkb | corvus: as a workaround I think ensure-zookeeper has an install from tarball option or similar. Maybe that would work? | 19:01 |
corvus | clarkb: yeah... so have test-setup.sh just run zookeeper from a jar? | 19:01 |
clarkb | corvus: ya (I do that locally because suse doesn't package zk) | 19:02 |
clarkb | the jar comes with an init like script you can use to start and stop the service | 19:02 |
clarkb | you just need a java to be present | 19:02 |
clarkb | tristanC: hrm all of the internet guides for installing new python3 on rhel/centos 8 say to compile it from source :/ | 19:03 |
clarkb | at least the several top results I've clicked on so far | 19:03 |
openstackgerrit | Merged zuul/zuul master: Require TLS for zookeeper connections https://review.opendev.org/c/zuul/zuul/+/776249 | 19:05 |
tristanC | clarkb: the command should be `dnf install python38` , from https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_basic_system_settings/using-python3_configuring-basic-system-settings | 19:07 |
tristanC | and you can give this a try by using this free image: registry.access.redhat.com/ubi8/ubi | 19:07 |
clarkb | neat, in that case ya I think we probably don't need to support 3.6 for centos/rhel 8 | 19:07 |
clarkb | (since you are supposed to update to the latest point release as a centos or rhel user the old 8.0 and 8.1 python versions aren't super valid anymore) | 19:08 |
fungi | also ubuntu does make newer python available on older lts releases too | 19:09 |
tobiash | corvus: don't zuul's unit tests do tls? I'm confused because the py-36 jobs of the zuul change worked. | 19:10 |
fungi | debian stable already has 3.7, and debian oldstable was 3.5 anyway so we've already dropped that | 19:11 |
fungi | and ubuntu bionic has 3.7 (and 3.8) packages available | 19:12 |
clarkb | fungi:the 3.8 is an alpha version though right? | 19:12 |
fungi | yeah, but the 3.7 is not | 19:12 |
fungi | 3.7.3 in the base distro for bionic, and 3.7.5 in bionic-updates | 19:13 |
corvus | tobiash: no, zuul doesn't use zk tls in unit tests | 19:13 |
tobiash | ok, that explains things | 19:13 |
corvus | tobiash: so whatever we do for nodepool, we're going to need to do for zuul if we move the connection handling into the components. | 19:13 |
fungi | i would say if we want to be able to drop python 3.6 support but still want people to be able to use zuul on ubuntu bionic, then we can test 3.7 there via the optional python3.7 package (or on debian buster where 3.7 is the default python3) | 19:16 |
tobiash | corvus: maybe we could also use test-setup-docker.sh instead if zk is too old | 19:16 |
corvus | tobiash: i kind of like that idea | 19:16 |
tobiash | I prefer that during development anyway | 19:17 |
corvus | i think i'm going to try porting that over to nodepool; then if that works, we can port it back to zuul | 19:17 |
tobiash | ++ | 19:17 |
corvus | yeah, the whole idea was to try to make it easy for devs to match the unit test env | 19:17 |
*** sduthil has quit IRC | 20:07 | |
*** hashar has quit IRC | 20:23 | |
corvus | is there a bindep entry for docker-compose anywhere? | 21:19 |
corvus | looks like there's some in codesearch for docker, not seeing compose | 21:20 |
clarkb | I am not aware of one | 21:20 |
corvus | i'm thinking that with a [test] is the best way to get those installed for test-setup.sh to use | 21:20 |
clarkb | you can pip install it too | 21:21 |
clarkb | (if it is running in a tox context that may make sense) | 21:21 |
corvus | oh, that sounds neat but i think test-setup.sh is run before tox | 21:21 |
corvus | (and there's a nice benefit for devs in that it reduces tox startup time if they can just leave it running) | 21:22 |
tristanC | tobiash: here are the measures i gather from mqtt events: https://github.com/TristanCacqueray/zuul-nix/blob/master/benchmark.py#L23 | 21:24 |
tristanC | re-running the script on a fresh vm consistently gives these results: for 100 builds, it takes 104 seconds, Enqueue time: mean 0.028 std dev 0.044, Report time : mean 0.053 std dev 0.040 | 21:25 |
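A minimal sketch of how figures like "mean 0.028 std dev 0.044" could be derived from collected events. It assumes each event is a dict carrying `trigger_time` and `enqueue_time` as epoch seconds, and it uses the sample standard deviation; which estimator the linked benchmark script actually uses is not stated here, and the function names are hypothetical.

```python
import statistics
from typing import Iterable, List, Mapping, Tuple


def enqueue_delays(events: Iterable[Mapping[str, float]]) -> List[float]:
    """Per-event delay between the trigger and the enqueue timestamps."""
    return [e["enqueue_time"] - e["trigger_time"] for e in events]


def summarize(samples: List[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation); needs at least two samples."""
    return statistics.mean(samples), statistics.stdev(samples)


if __name__ == "__main__":
    # Made-up events, only to show the shape of the calculation.
    events = [
        {"trigger_time": 10.0, "enqueue_time": 11.0},
        {"trigger_time": 10.0, "enqueue_time": 12.0},
        {"trigger_time": 10.0, "enqueue_time": 13.0},
    ]
    mean, stdev = summarize(enqueue_delays(events))
    print(f"Enqueue time: mean {mean:.3f} std dev {stdev:.3f}")
```

The same `summarize` helper can be reused for the report-time samples (build completion to report).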
corvus | tristanC: are those metrics emitted by statsd? | 21:25 |
corvus | er to statsd | 21:26 |
tristanC | corvus: it doesn't seem to be emitted to statsd | 21:26 |
corvus | tristanC: they look useful and should be easy to add -- could we go ahead and do that? i think it'd be better overall if mqtt wasn't required in order to get stats like that | 21:28 |
tristanC | corvus: on the other hand, it's easier to setup mqtt than statsd | 21:28 |
*** cloudnull has quit IRC | 21:28 | |
corvus | tristanC: for the benchmark or for a production system? | 21:28 |
tristanC | well i'm not familiar with statsd, but for mqtt, mosquitto is simple to setup | 21:30 |
tristanC | i can have a look to duplicate that metrics in statsd if you prefer | 21:31 |
corvus | i mean, mqtt isn't really a metrics database; yes, a production system needs a backing database for that (graphite/influx/whatever), but a sizable production system is going to have that, so just adding an extra metric is easy | 21:31 |
corvus | but for a benchmark you don't need a backing database | 21:31 |
corvus | so using the statsd *protocol* can actually be even easier than using mqtt | 21:31 |
corvus | gimme a sec to dig up a link | 21:32 |
*** cloudnull has joined #zuul | 21:33 | |
corvus | tristanC: a statsd server doesn't have to be much more than this: https://opendev.org/zuul/zuul/src/branch/master/tests/base.py#L2783-L2812 in the tests, we parse the data in the test functions instead of on the server, but a simple split there could do it. | 21:33 |
corvus | tristanC: i think this is worthwhile because i really like the metrics you're looking at and the idea of benchmarking, and i'd love to use the same mechanism in prod as the benchmark | 21:34 |
corvus | i can help flesh out that statsd server if you want | 21:34 |
tristanC | i understand that statsd is more lightweight, but i find the mqtt or prometheus ecosystem more active and easier to integrate | 21:35 |
tristanC | anyway, it's already in place so we might as well add this enqueue and report time metric | 21:36 |
*** vishalmanchanda has quit IRC | 21:37 | |
tristanC | note that my goal was to limit deviation, so i'm measuring trigger to enqueue and completed build to report, so that ansible execution is not taken into account | 21:38 |
tristanC | and to make things go faster, i've replaced the ansible-playbook command by an `exit 0` script | 21:38 |
corvus | those will obviously produce different values in benchmark and prod :) | 21:38 |
corvus | but we can still measure processes before and after the build | 21:39 |
dmsimard | o/ heads up: ansible==3.0.0 releasing tomorrow | 21:45 |
corvus | dmsimard: thanks! | 21:45 |
avass | dmsimard: oh interesting | 21:45 |
corvus | tristanC: i think this should handle counters, and almost handle timers: http://paste.openstack.org/show/802753/ | 21:45 |
corvus | tristanC: i figure you might want to do your stdev calcs, etc, so i stopped there.. want me to finish that up, or you want to take it from here? | 21:46 |
*** gmann is now known as gmann_afk | 21:51 | |
tristanC | corvus: i'm looking at adding the enqueue_time and report_time in Manager.reportStats | 21:52 |
tristanC | or somewhere else, not sure what's the most appropriate yet | 21:55 |
corvus | tristanC: yeah, that's tricky; it might be better to do it in reportEnqueue etc... | 22:00 |
*** sduthil has joined #zuul | 22:06 | |
corvus | tristanC: http://paste.openstack.org/show/802755/ should plug into your script fairly easily | 22:08 |
corvus | tristanC: should be able to do something like mean(statsd.timers['stats.gauges.zuul.tenant.foo.enqueue_time']) or whatever the metric name ends up being. | 22:09 |
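The idea of a tiny statsd collector that a benchmark script can query could look roughly like this. It is not the code from the pastes above; it is a sketch that parses the plain statsd line format (`name:value|type`) and keeps timers as lists so they can be fed to `mean()`. Sample rates and signed gauge deltas are deliberately ignored, and the class name and metric names are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List


class StatsCollector:
    """Accumulates statsd metrics from raw datagrams (simplified)."""

    def __init__(self) -> None:
        self.counters: DefaultDict[str, float] = defaultdict(float)
        self.timers: DefaultDict[str, List[float]] = defaultdict(list)
        self.gauges: Dict[str, float] = {}

    def feed(self, datagram: bytes) -> None:
        # One datagram may carry several newline-separated metrics.
        for line in datagram.decode("utf-8").splitlines():
            name, sep, rest = line.partition(":")
            fields = rest.split("|")
            if not sep or len(fields) < 2:
                continue  # not a well-formed metric line
            value, kind = float(fields[0]), fields[1]
            if kind == "c":
                self.counters[name] += value
            elif kind == "ms":
                self.timers[name].append(value)
            elif kind == "g":
                self.gauges[name] = value  # treated as absolute, not a delta


if __name__ == "__main__":
    collector = StatsCollector()
    collector.feed(b"example.enqueue_time:28|ms\nexample.events:1|c")
    print(dict(collector.timers), dict(collector.counters))
```

In a benchmark, `feed` would be called with whatever a UDP socket's `recvfrom` returns; the parsing is kept separate from the socket so it can be exercised without a network.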
openstackgerrit | James E. Blair proposed zuul/nodepool master: WIP: Require TLS https://review.opendev.org/c/zuul/nodepool/+/776286 | 22:22 |
corvus | that will almost certainly fail the openshift job; we'll have to figure out how to do zk tls there | 22:23 |
*** jhesketh has joined #zuul | 22:32 | |
openstackgerrit | James E. Blair proposed zuul/nodepool master: WIP: Require TLS https://review.opendev.org/c/zuul/nodepool/+/776286 | 22:34 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: scheduler: add statsd metric for enqueue time https://review.opendev.org/c/zuul/zuul/+/776287 | 22:35 |
tristanC | corvus: i can have a look, but why do you think it will fail? | 22:37 |
corvus | tristanC: it uses ensure-zookeeper to run zk, so we'll need to figure out how to set up tls certs for that (or maybe see if we can switch to using test-setup.sh which will run zk in podman) | 22:38 |
tristanC | corvus: can I add a 'use_tls' ensure-zookeeper role attribute for that? | 22:40 |
tristanC | this would only be the third time i add tls configuration to a zookeeper confmgmt :-) | 22:41 |
corvus | tristanC: yeah, and i set the nodepool test framework to accept cert paths via ENV variables | 22:41 |
tristanC | ok, i can have a look now | 22:42 |
corvus | tristanC: what should we do for generating certs? i also am adding zk-ca.sh to the nodepool repo, so we can run that if we want | 22:42 |
tristanC | i guess i'll copy the script in zuul-jobs if that's ok | 22:43 |
corvus | wfm | 22:43 |
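A rough sketch of how a test framework might accept ZooKeeper cert paths via environment variables, as corvus mentions above. The variable names (`EXAMPLE_ZK_CA`, `EXAMPLE_ZK_CERT`, `EXAMPLE_ZK_KEY`) and the `zk_tls_options` helper are hypothetical and are not taken from the nodepool change; the returned keys are chosen to resemble a ZooKeeper client's TLS keyword arguments, which is also an assumption.

```python
import os
from typing import Mapping, Optional

# Hypothetical environment variable names, for illustration only.
ENV_NAMES = {
    "ca": "EXAMPLE_ZK_CA",
    "certfile": "EXAMPLE_ZK_CERT",
    "keyfile": "EXAMPLE_ZK_KEY",
}


def zk_tls_options(env: Mapping[str, str] = os.environ) -> Optional[dict]:
    """Return TLS options built from env, or None if TLS is not configured."""
    paths = {key: env.get(var) for key, var in ENV_NAMES.items()}
    if not any(paths.values()):
        # Nothing set: the caller falls back to a plaintext connection.
        return None
    missing = [ENV_NAMES[key] for key, path in paths.items() if not path]
    if missing:
        # A partial configuration is almost certainly a mistake.
        raise ValueError("missing TLS settings: " + ", ".join(missing))
    return {"use_ssl": True, **paths}


example_env = {
    "EXAMPLE_ZK_CA": "/tmp/ca/certs/cacert.pem",
    "EXAMPLE_ZK_CERT": "/tmp/ca/certs/client.pem",
    "EXAMPLE_ZK_KEY": "/tmp/ca/keys/clientkey.pem",
}
options = zk_tls_options(example_env)
print(options)
```

Rejecting a partial configuration up front makes a missing cert path fail at setup time instead of surfacing later as a connection error.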
corvus | where does the k8s test get zookeeper? | 22:48 |
tristanC | it's using the default zookeeper image from docker, see https://opendev.org/zuul/zuul-operator/src/branch/master/conf/zuul/components/ZooKeeper.dhall#L37 | 22:49 |
corvus | tristanC: nodepool-functional-k8s doesn't use that does it? | 22:50 |
tristanC | i don't think so | 22:50 |
corvus | that's the job i meant | 22:50 |
tristanC | oh i misread then | 22:51 |
corvus | (likewise nodepool-functional-openshift for openshift) | 22:51 |
tristanC | well it would make sense to do k8s integration test with the operator | 22:51 |
corvus | i think i prefer the simpler approach -- let's keep building up | 22:51 |
corvus | oh | 22:56 |
corvus | it must have just used zookeeper from bindep | 22:56 |
corvus | tristanC: for the k8s job we can probably switch to using ensure-zookeeper as well | 22:58 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: ensure-zookeeper: add use_tls role var https://review.opendev.org/c/zuul/zuul-jobs/+/776290 | 23:02 |
openstackgerrit | James E. Blair proposed zuul/nodepool master: WIP: Require TLS https://review.opendev.org/c/zuul/nodepool/+/776286 | 23:04 |
tristanC | corvus: what about regular jobs, should they also use ensure-zookeeper then? | 23:04 |
corvus | tristanC: i've got test-setup.sh setting up zk with tls, so we can use the regular tox jobs | 23:05 |
corvus | (it's also how i plan on running tests on my workstation from now on too) | 23:05 |
corvus | that uses docker or podman | 23:05 |
corvus | tristanC: i'll update my patch to depends-on yours | 23:06 |
tristanC | corvus: ok, i got to go, i was adding the zookeeper_use_tls role vars, making a symlink from /opt/zookeeper/ca to tools/ca and skipping the compose when the dir exists in test-setup.sh | 23:08 |
tristanC | though file permissions probably need to be adjusted in zk-ca too | 23:09 |
*** rlandy is now known as rlandy|bbl | 23:11 | |
openstackgerrit | James E. Blair proposed zuul/nodepool master: WIP: Require TLS https://review.opendev.org/c/zuul/nodepool/+/776286 | 23:13 |
corvus | tristanC: since you're afk i'm going to push up a fix to your change | 23:13 |
openstackgerrit | James E. Blair proposed zuul/zuul-jobs master: ensure-zookeeper: add use_tls role var https://review.opendev.org/c/zuul/zuul-jobs/+/776290 | 23:15 |
openstackgerrit | James E. Blair proposed zuul/nodepool master: WIP: Require TLS. https://review.opendev.org/c/zuul/nodepool/+/776286 | 23:15 |
*** gmann_afk is now known as gmann | 23:22 | |
*** jkt has quit IRC | 23:22 | |
*** jkt has joined #zuul | 23:23 | |
clarkb | corvus: would it be helpful to review either of those two changes at this point? | 23:28 |
corvus | clarkb: only if you're bored; i'm still bashing it into shape | 23:29 |
clarkb | well I do think I could use a break from staring at scala | 23:29 |
clarkb | I'll start with the one that doesn't say WIP on it as it is also passing testing | 23:30 |
corvus | clarkb: k; i'm looking at unit test failures right now | 23:30 |
corvus | cool | 23:30 |
fungi | i lost an hour to kernel wireless firmwares moving from brcm to cypress without sufficient symlinks. that was a break indeed | 23:30 |
corvus | clarkb: as tristanC was alluding to, i think we need a solution to the ownership issue with the use_tls patch | 23:34 |
corvus | it's going to run the ca as root, but the tests need to read the client cert and key as the zuul user | 23:34 |
corvus | should we just run the ca as the zuul user? | 23:34 |
clarkb | corvus: for test purposes that seems fine and practical | 23:35 |
clarkb | as an alternative you could chown all the client and server stuff to zuul but keep the ca as root? | 23:36 |
openstackgerrit | James E. Blair proposed zuul/nodepool master: WIP: Require TLS. https://review.opendev.org/c/zuul/nodepool/+/776286 | 23:37 |
corvus | i'll adapt the zuul-jobs change to run the ca without become | 23:38 |
clarkb | corvus: couple of thoughts on the existing ps but nothing that really needs changing | 23:39 |
openstackgerrit | James E. Blair proposed zuul/zuul-jobs master: ensure-zookeeper: add use_tls role var https://review.opendev.org/c/zuul/zuul-jobs/+/776290 | 23:44 |
corvus | clarkb: okay let's try that ^ | 23:44 |
* clarkb refreshes | 23:44 | |
openstackgerrit | James E. Blair proposed zuul/nodepool master: WIP: Require TLS https://review.opendev.org/c/zuul/nodepool/+/776286 | 23:45 |
clarkb | that role update looks like it should work | 23:47 |