Wednesday, 2021-02-17

*** zbr has quit IRC00:33
*** zbr has joined #zuul00:45
*** rlandy has quit IRC01:24
*** zbr7 has joined #zuul02:23
*** zbr has quit IRC02:23
*** zbr7 is now known as zbr02:23
*** saneax has joined #zuul03:36
*** saneax has quit IRC04:29
*** vishalmanchanda has joined #zuul04:36
*** hamalq has quit IRC04:41
*** saneax has joined #zuul04:48
*** evrardjp has quit IRC05:33
*** evrardjp has joined #zuul05:33
*** jfoufas1 has joined #zuul05:58
*** zbr has quit IRC06:13
*** zbr has joined #zuul06:14
*** mnaser has quit IRC06:32
*** mnaser has joined #zuul06:34
*** jcapitao|off has joined #zuul07:16
*** jcapitao|off is now known as jcapitao07:19
*** piotrowskim has joined #zuul07:26
*** rpittau|afk is now known as rpittau08:12
*** cloudnull has quit IRC08:59
*** jpena|off is now known as jpena08:59
*** cloudnull has joined #zuul09:02
*** tosky has joined #zuul09:14
*** tosky_ has joined #zuul09:20
*** tosky has quit IRC09:24
*** tosky_ is now known as tosky09:25
*** nils has joined #zuul09:28
*** cloudnull has quit IRC09:37
*** cloudnull has joined #zuul09:40
*** evrardjp has quit IRC09:46
*** evrardjp has joined #zuul09:48
*** saneax has quit IRC10:36
*** saneax has joined #zuul11:02
*** sshnaidm has quit IRC11:15
*** sshnaidm has joined #zuul11:18
*** jcapitao is now known as jcapitao_lunch11:26
*** harrymichal has joined #zuul12:15
*** fdegir5 is now known as fdegir12:24
*** jpena is now known as jpena|lunch12:30
*** rlandy has joined #zuul12:31
*** jcapitao_lunch is now known as jcapitao13:07
*** jpena|lunch is now known as jpena13:23
*** ikhan has quit IRC13:24
openstackgerritFelix Edel proposed zuul/zuul master: Simplify ZooKeeper client initialization  https://review.opendev.org/c/zuul/zuul/+/75436013:55
openstackgerritFelix Edel proposed zuul/zuul master: Improve typings in context of builds via ZooKeeper  https://review.opendev.org/c/zuul/zuul/+/75357813:56
openstackgerritFelix Edel proposed zuul/zuul master: Make ZooKeeper mandatory for Scheduler  https://review.opendev.org/c/zuul/zuul/+/75671613:56
openstackgerritFelix Edel proposed zuul/zuul master: Make ConnectionRegistry mandatory for Scheduler  https://review.opendev.org/c/zuul/zuul/+/75709513:56
openstackgerritFelix Edel proposed zuul/zuul master: Improve typings in context of 756716 and 757095  https://review.opendev.org/c/zuul/zuul/+/75714813:56
openstackgerritFelix Edel proposed zuul/zuul master: Instantiate executor client, merger, nodepool and app within Scheduler  https://review.opendev.org/c/zuul/zuul/+/75714913:56
openstackgerritFelix Edel proposed zuul/zuul master: Improve typings in context of lock nodes on executor  https://review.opendev.org/c/zuul/zuul/+/75709713:56
openstackgerritFelix Edel proposed zuul/zuul master: DNM: Reduce number of jobs for SOS development  https://review.opendev.org/c/zuul/zuul/+/77508113:56
openstackgerritFelix Edel proposed zuul/zuul master: Component Registry in ZooKeeper  https://review.opendev.org/c/zuul/zuul/+/75918713:56
openstackgerritFelix Edel proposed zuul/zuul master: Move management and result events to model  https://review.opendev.org/c/zuul/zuul/+/76116313:56
openstackgerritFelix Edel proposed zuul/zuul master: Allow (de-)serialization of management events  https://review.opendev.org/c/zuul/zuul/+/76116413:56
openstackgerritFelix Edel proposed zuul/zuul master: Allow (de-)serialization of result events  https://review.opendev.org/c/zuul/zuul/+/76116513:56
openstackgerritFelix Edel proposed zuul/zuul master: Add and fix fields in driver trigger event models  https://review.opendev.org/c/zuul/zuul/+/76116613:56
openstackgerritFelix Edel proposed zuul/zuul master: Allow (de-)serialization of trigger events  https://review.opendev.org/c/zuul/zuul/+/76116713:56
openstackgerritFelix Edel proposed zuul/zuul master: Interface to get a driver's trigger event class  https://review.opendev.org/c/zuul/zuul/+/76116813:56
openstackgerritFelix Edel proposed zuul/zuul master: Increase default test wait timeout to 120s  https://review.opendev.org/c/zuul/zuul/+/76375413:57
openstackgerritFelix Edel proposed zuul/zuul master: Implementation of Zookeeper backed event queues  https://review.opendev.org/c/zuul/zuul/+/76117013:57
openstackgerritFelix Edel proposed zuul/zuul master: Implementation of Zookeeper event watcher  https://review.opendev.org/c/zuul/zuul/+/76117113:57
openstackgerritFelix Edel proposed zuul/zuul master: Switch to Zookeeper backed trigger event queues  https://review.opendev.org/c/zuul/zuul/+/76117213:57
openstackgerritFelix Edel proposed zuul/zuul master: Switch to Zookeeper backed management event queues  https://review.opendev.org/c/zuul/zuul/+/76173813:57
openstackgerritFelix Edel proposed zuul/zuul master: Improve test output by using named queues  https://review.opendev.org/c/zuul/zuul/+/77562013:57
openstackgerritFelix Edel proposed zuul/zuul master: Avoid race when task from queue is in progress  https://review.opendev.org/c/zuul/zuul/+/77562113:57
openstackgerritFelix Edel proposed zuul/zuul master: Implement Zookeeper backed connection event queue  https://review.opendev.org/c/zuul/zuul/+/77562213:57
openstackgerritFelix Edel proposed zuul/zuul master: Dispatch Pagure webhook events via Zookeeper  https://review.opendev.org/c/zuul/zuul/+/77562313:57
openstackgerritFelix Edel proposed zuul/zuul master: Dispatch Github webhook events via Zookeeper  https://review.opendev.org/c/zuul/zuul/+/77562413:58
openstackgerritFelix Edel proposed zuul/zuul master: Dispatch Gitlab webhook events via Zookeeper  https://review.opendev.org/c/zuul/zuul/+/77562513:58
tristanCtobiash: looking at scheduler log only, by replacing ansible-playbook with an `exit 0`, it seems like i get a consistent 10sec end-to-end benchmark for a single pipeline/job execution14:00
tobiashtristanC: nodeless do-nothing jobs?14:01
tristanCwhich makes me wonder if the log file timestamps are enough to get accurate measurement of the process performance14:01
tobiashfor most things probably yes, however we monitor and alert on some key metrics like event queue sizes14:02
tobiash(which we get from statsd atm)14:02
tristanCtobiash: this is currently using a static label, but i guess nodepool could be removed using a nodeless job14:04
tristanCtobiash: yeah, we do also monitor queue size, but my goal here is to get some syntetic benchmark to measure before and after scheduler ha performance14:05
tristanCand using the scheduler log to measure the performance seems convenient , e.g. to collect timing between new-change, build-start and build-report14:07
*** sanjay__u has joined #zuul14:08
*** saneax has quit IRC14:08
tobiashtristanC: you get those data also via the mqtt reporter afaik14:08
tobiashwe use those for kpi's of regular running reference jobs14:09
*** sanjay__u has quit IRC14:10
tristanCdoes it covert the time it takes between a new change event and when the build start?14:10
tristanCif i understand correctly, scheduler ha uses zookeeper to store new change event before starting a build, and that is an overhead i'd like to measure14:11
tristanCin other word, what would be the best strategy to measure the overhead of zookeeper in the scheduler ha feature?14:17
avasstristanC: is it the gerrit event -> zookeeper znode creation you want to measure?14:25
tristanCavass: that would be good to know, but to compare with v4 i'd like to get `event -> build start` (for example)14:27
tristanCso i was hoping i could collect overall performance for a thousand events, and measure individual event performance using the scheduler log14:29
avassyeah I was just thinking that you could use a zookeeper watcher somehow. but using zuul logs would make it a bit easier and maybe more accurate.14:40
tobiashtristanC: I think we have there the time it took from enqueue until the job starts, the time from job start to first ansible run and also the full time the job took14:48
tristanCtobiash: you mean in statsd?14:49
tobiashtristanC: I mean in the mqtt reporter14:51
*** jhesketh has quit IRC14:52
corvusthose seem like easy statsd things to add if they're not there already14:52
avassyeah looks like there's a trigger time and enqueue time: https://zuul-ci.org/docs/zuul/reference/drivers/mqtt.html#attr-%3Cmqtt%20schema%3E.buildset.builds.result14:52
tobiashtristanC: we get something like this out of those data: https://paste.pics/ac3a15cd892374e9a12454739ec3e9c314:52
tobiashwe run short reference jobs in every region every 30min and graph queuing time, job preparation time and job runtime14:53
avassI was starting to get jealous at those job durations :)14:54
tobiashwell, that's just a short reference job that uses most of our infrastructure but as short as possible14:55
tobiashthe real jobs mostly take much longer ;)14:55
tristanCtobiash: and time_to_start is execute_time - enqueue_time ?14:55
tobiashyes, I think so14:55
tristanCtobiash: alright thanks, i'll work on a benchmark tool and see if i can reproduce consistent measure with the current master14:58
openstackgerritTristan Cacqueray proposed zuul/zuul master: mqtt: document the trigger and enqueue time attribute  https://review.opendev.org/c/zuul/zuul/+/77621215:08
openstackgerritOleksandr Kozachenko proposed zuul/zuul-jobs master: Update upload-logs-swift and upload-logs-gcs  https://review.opendev.org/c/zuul/zuul-jobs/+/77465015:46
*** saneax has joined #zuul15:54
*** jfoufas1 has quit IRC16:15
*** hashar has joined #zuul16:29
*** piotrowskim has quit IRC16:32
*** hashar has quit IRC16:35
zbrmordred: avass: ok to bypass no-tabs rule for fields named 'regex'?16:35
mordredzbr: I'm not sure what that means?16:36
mordred(all the words are words I understand - just not in that sequence ;) )16:36
zbrhttps://review.opendev.org/c/zuul/zuul-jobs/+/773245/3/roles/build-docker-image/tasks/main.yaml#1916:37
avassmordred: the no-tabs for ansible-lint, yaml load \t as a tab and ansible picks that up and errors because of tabs not being allowed16:37
mordredAH16:37
avasszbr: yes tabs should probably be allowed inside fields16:37
*** hashar has joined #zuul16:38
mordredgotit16:38
mordredthat explains why the noqa no-tabs was on that - thanks!16:38
tristanCwould it be better to use \s to also match spaces?16:38
zbri will patch the linter only for regex, to minimize behavior change.16:38
avassI don't think it should be limited to 'regex' fields, for example if someone wants to dump a makefile they also need tabs16:39
zbrthat would be a trick to avoid that error.16:39
mordredyeah16:39
avassI mean, there are legic reasons to have tabs in fields16:39
mordredcontent in fields that ansible is managing can very reasonably contain tabs16:39
zbrat the same time, i seen accidental pase of tabs.16:39
zbrin that case adding no-tabs obliterates the problem anyway.16:40
zbradding no-tabs to the skip-list16:40
zbrif chucknorris uses the in-line templates to generate makefiles, i cannot help them ;)16:41
clarkbopendev actually hit this recently. We wanted a tab in a literally quoted ini file in an ansible yaml context16:41
clarkbthat was an error. I think we just turned off the rule16:42
fungior turned off ansible-lint16:42
zbrconsidering that gerrit is smart enough to highlight tabs, the risk is low and skip-list is ok16:42
avassyeah I vote for turning that off. If tabs were a problem they should hopefully get picked up by testing the role anyway :)16:43
zbrdoing it right now16:44
zbravass: while zuul-jobs has decent test coverage, I cannot say the same about other repos.16:49
openstackgerritSorin Sbârnea proposed zuul/zuul-jobs master: Upgrade ansible-lint to 5.0  https://review.opendev.org/c/zuul/zuul-jobs/+/77324516:58
zbrwow, this review took Time: 0h:01m:36s to upload, that is not good.16:58
*** jpena is now known as jpena|brb17:14
avasszbr: any reason why the loop var rule is staying?17:18
corvuszuul-maint: we found 2 more things we said we would do before 4.0: remove zuul-migrate, and make tls required for zookeeper17:19
corvusi'm going to start working on those patches17:20
zbravass: nope, i forgot that i adopted it as a core rule.17:20
openstackgerritJames E. Blair proposed zuul/zuul master: Remove zuul-migrate  https://review.opendev.org/c/zuul/zuul/+/77624517:23
clarkbI didn't realize that tool had such extensive testing17:26
clarkbanyway that first one is approved17:27
openstackgerritSorin Sbârnea proposed zuul/zuul-jobs master: Upgrade ansible-lint to 5.0  https://review.opendev.org/c/zuul/zuul-jobs/+/77324517:28
*** jcapitao has quit IRC17:33
*** jpena|brb is now known as jpena17:36
*** rpittau is now known as rpittau|afk17:41
openstackgerritJames E. Blair proposed zuul/zuul master: Require TLS for zookeeper connections  https://review.opendev.org/c/zuul/zuul/+/77624917:48
corvusand i think we need a corresponding nodepool change17:48
*** cloudnull has quit IRC17:49
*** cloudnull has joined #zuul17:49
avassdid [database] become a thing and is required or are sql connections still fine?17:50
corvusavass: database is a thing; you should migrate to it.  both will be supported in 4.017:52
avassOh nvm, read the releasenotes. I guess they're fine but [database] is preferred17:52
corvusyeah.  4.X (or 5.0 at the latest) will probably drop sql connections.17:52
corvusthere's lots of warning messages though, so migrating sooner is better :)17:53
avassshould be easy to update that, just want to double check that we're ready for v417:53
corvusyeah: zk-tls everywhere, and at least one sql connection are the requirements from 3->4 (and that sql connection can be driver=sql or [database])17:55
avassyeah we enables zk-tls and that's supposed to warn if you're not using it right?17:55
clarkbI assume we'll restart opendev on that zk change to sanity check it?17:57
clarkbthat should also give people a reference for what a working config looks like17:57
clarkb(maybe not entirely ideal, but at least workable)17:57
corvusavass: i expect that if tls is configured but the server isn't using it then it would error, but i don't know that i've tested that17:57
corvusclarkb: yeah, i think we should restart just to make sure the server init is fully exercised (i have manually tested locally but better safe than sorry there)17:58
avassoh that case would be even better :)17:58
corvusavass: i'll try that out real quick17:58
avassoh actually, I don't think we allow non-tls connections in our zookeeper event. so it should be setup18:00
avasseven*18:00
corvusif i configure zuul to use tls but connect to a zk on the non-ssl port, the connection fails;  it just loops over: 2021-02-17 10:00:52,323 WARNING zuul.zk.base.ZooKeeperClient: Retrying zookeeper connection18:02
corvusnot the best error message, but at least it doesn't work18:02
corvusso at least we're unlikely to have folks who think they configured tls but aren't actually using it18:02
clarkbzbr: ansible really doesn't like the ensure-terraform checksum string changes in 773245. I'm not sure why beacuse it seems valid. Does ansible maybe parse the {{ var }} strings in files before deserializing?18:09
*** nils has quit IRC18:10
*** jpena is now known as jpena|off18:12
clarkbI guess if that were the case the old code is unlikely to work either18:12
clarkbthat is an interesting situation18:12
avassclarkb: no that should work, it could be that it needs a '>-' or that '\' doesn't need to be escaped in the regex18:13
clarkbavass: ya >- seems possible to remove the trailing newline18:14
clarkboh yup yaml doesn't allow escaped sequences in a block quoted string18:15
clarkbthe \\g -> to \g seems like another likely candidate18:15
avassthere's also a weird combination of yaml/jinja/filters with backslashes that just doesn't work. Easy to stumble into that one when working on windows and replacing paths with backslashes :)18:18
fungithese days windows lets you use / as a path separator too though, right?18:25
avassyeah but I believe some tools still doesn't like that18:25
fungii thought i heard that somewhere (thankfully i never have to touch the stuff)18:25
avasstbh windows is getting a lot better but there are still some things that are a lot of pain to work with, most notably files being locked by reads/writes18:32
clarkbonce upon a time I remember the joys of trying to delete files and then playing "who has this file open"18:32
openstackgerritMerged zuul/zuul master: Remove zuul-migrate  https://review.opendev.org/c/zuul/zuul/+/77624518:34
fungiclarkb: you can play that on linux/bsd too, just less often18:35
fungibut also i have no idea what the windows equivalent of lsof would be18:35
avassoh there are very nice gui programs that does that for you :)18:36
mordredof course there are18:38
corvusclarkb, mordred: any reason not to upgrade to focal for zuul/nodepool unit tests?18:38
clarkbcorvus: the biggest thing would be the lower python bound I think18:38
clarkbbionic has 3.6 but focal will only have 3.818:39
corvusah18:39
clarkbanother option might be centos-8(-stream) for python3.6 but I think bionic is expected to have a logner shelf life at this point18:39
corvusunsure if it's necessary at this point18:40
mordredyeah - but other than that - I would say use focal by default nad use bionic for 3.618:40
avassdoesn't focal come with snapd by default as well? that can be a bit evil with automatic updates and apt automatically installing a snap instead I believe18:40
clarkbavass: our images shouldn't have snapd in them, but yes if you grab a proper ubuntu image you get snapd18:40
clarkb(I think bionic did that too though)18:40
mordredyah - benefit of building our own images18:40
clarkbmordred: ++ we should be able to default to focal for everything but old python use cases18:41
corvusjust mapping out zookeeper versions18:41
avassclarkb: hmm maybe it did18:42
mordredclarkb: ++18:43
tobiashspeaking about py36, how/when shall we handle its deprecation? It will be end of life end of this year.18:45
mordredtobiash: it's a good question. I think we've historically ignored the upstream python EOL and looked instead at the support lifespan of the linux distro18:46
mordredlike - bionic has a longer support lifespan than 3.6 itself - so people could reasonably expect to be running bionic and using 3.618:47
mordredbut - bionic also has newer python I think18:47
mordredwhich is me saying "I don't have a clear answer in my head"18:47
mordredcentos has frequently been the long-tail in terms of needing to support older things - but with centos getting killed off maybe that's not a thing we have to care about as strongly?18:48
clarkbwhile thats true rhel8 will continue to be 3.6 for almost a decade more18:48
mordredyeah18:49
mordredotoh - the python community seems to be aggressively dropping support for python versions in libraries once the version gets EOLd18:49
clarkbbionic does have newer python packages available as well, but rhel8/centos8 don't currently aiui18:49
clarkbso the question is likely to be about the rhel/centos support side of things18:49
mordredso it might become harder to keep supporting 3.6 once it's EOL'd18:49
fungibut not necessarily a guarantee, after all they released rhel 8 with python 2.7 and said they would drop it before the distro itself reaches eol18:49
fungialso red hat has a clear history of backporting newer bits of the interpreter into their "old" version18:50
mordredyeah. also - there's always the container story for people on rhel/centos if we do decide that zuul's python support position does not immediately overlap with rhel's18:50
clarkbmy hunch is that its something we'd want to deprecate and not just drop18:51
mordredwhile I don't think we want to mandate everyone use containers - since there are non-container options, and rhel+container options - maybe ensuring we support non-container deployments on slow-moving platforms like rhel is too much?18:51
clarkbso that any existing users on say centos can plan ahead (which they are likely already doing but potentially to another 3.6 distro)18:51
mordred++18:52
Open10K8Scorvus: Hi there. Can you check this plz? https://review.opendev.org/c/zuul/zuul-jobs/+/77465018:53
clarkbit does feel like maybe python2.7 has spoiled us in a way. It woudl be nice if there was a largely agreed upon minium version of python3 most things should support18:54
clarkbof course with python2 that also happened to be the max value :)18:54
corvusOpen10K8S: probably tomorrow18:55
Open10K8Sok, thank you for your test and review :)18:55
corvusOpen10K8S: you're welcome, thank you :)18:55
clarkbanother thought: I've been on python3.8 locally for a while now and have yet to have it break on projects that do older python3. To me that means that hte biggest questions are likely to be "Will deps continue to support these older versions" and "Are there any new python3 features we would really like to see in the code base"18:58
tobiashf strings have been introduced in 3.6 right?18:59
avassclarkb: walrus operators, pattern matching, better f-strings18:59
corvusi'm having trouble getting bionic's zookeeper doing tls connections, which is required for nodepool's unit tests (and i think we're probably going to require them for zuul's soon too, but right now, due to a quirk of how we set up the conneciton, zuul's tests don't need to use tls even if zuul does)18:59
clarkbyes f strings were added in 3.618:59
clarkbavass: pattern matching is nice but we don't even do 3.9 yet :)18:59
clarkbor at least not when I last checked19:00
tobiashcorvus: which zk verson does that have?19:00
tristanCclarkb: if i understand correctly, the default python3.6 in rhel8 is named platform-python and it is used for dnf and other system service, regular python are distributed with streams which have different lifecycle, and rhel-8.2 has python3.819:00
corvustobiash: 3.4.10.  it technically should be present, but it wasn't fully supported19:00
corvustobiash: i get: Exception in thread "main" java.lang.NoClassDefFoundError: org/jboss/netty/channel/group/ChannelGroup19:00
clarkbtristanC: Oh I didn't realize they had updated the non platform python19:00
corvus(even though netty jar is in classpath)19:00
clarkbcorvus: as a workaround I think ensure-zookeeper has an install from tarball option or similar. Maybe that would work?19:01
corvusclarkb: yeah... so have test-setup.sh just run zookeeper from a jar?19:01
clarkbcorvus: ya (I do that locally because suse doesn't package zk)19:02
clarkbthe jar comes with an init like script you can use to start and stop the service19:02
clarkbyou just need a java to be present19:02
clarkbtristanC: hrm all of the internet guides from installing new python3 on rhel/centos 8 say to compile it from source :/19:03
clarkbat least the several top results I've clicked on so far19:03
openstackgerritMerged zuul/zuul master: Require TLS for zookeeper connections  https://review.opendev.org/c/zuul/zuul/+/77624919:05
tristanCclarkb: the command should be `dnf install python38` , from https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_basic_system_settings/using-python3_configuring-basic-system-settings19:07
tristanCand you can give this a try by using this free image: registry.access.redhat.com/ubi8/ubi19:07
clarkbneat, in that case ya I think we probably don't need to support 3.6 for centos/rhel 819:07
clarkb(since you are supposed to update to the latest point release as a centos or rhel user the old 8.0 and 8.1 python versiosn aren't super valid anymore)19:08
fungialso ubuntu does make newer python available on older lts releases too19:09
tobiashcorvus: don't do zuul's unit tests tls? I'm confused because the py-36 jobs of the zuul change worked.19:10
fungidebian stable already has 3.7, and debian oldstable was 3.5 anyway so we've already dropped that19:11
fungiand ubuntu bionic has 3.7 (and 3.8) packages available19:12
clarkbfungi:the 3.8 is an alpha version though right?19:12
fungiyeah, but the 3.7 is not19:12
fungi3.7.3 in the base distro for bionic, and 3.7.5 in bionic-updates19:13
corvustobiash: no, zuul doesn't use zk tls in unit tests19:13
tobiashok, that explains things19:13
corvustobiash: so whatever we do for nodepool, we're going to need to do for zuul if we move the connection handling into the components.19:13
fungii would say if we want to be able to drop python 3.6 support but still want people to be able to use zuul on ubuntu bionic, then we can test 3.7 there via the optional python3.7 package (or on debian buster where 3.7 is the default python3)19:16
tobiashcorvus: maybe we could also use test-setup-docker.sh instead if zk is too old19:16
corvustobiash: i kind of like that idea19:16
tobiashI prefer that during development anyway19:17
corvusi think i'm going to try porting that over to nodepool; then if that works, we can port it back to zuul19:17
tobiash++19:17
corvusyeah, the whole idea was to try to make it easy for devs to match the unit test env19:17
*** sduthil has quit IRC20:07
*** hashar has quit IRC20:23
corvusis there a bindep entry for docker-compose anywhere?21:19
corvuslooks like there's some in codesearch for docker, not seeing compose21:20
clarkbI am not aware of one21:20
corvusi'm thinking that with a [test] is the best way to get those installed for test-setup.sh to use21:20
clarkbyou can pip install it too21:21
clarkb(if it is running in a tox context that may make sense)21:21
corvusoh, that sounds neat but i think test-setup.sh is run before tox21:21
corvus(and there's a nice benefit for devs in that it reduces tox startup time if they can just leave it running)21:22
tristanCtobiash: here are the measures i gather from mqtt events: https://github.com/TristanCacqueray/zuul-nix/blob/master/benchmark.py#L2321:24
tristanCre-running the script on a fresh vm consistently gives these results: for 100 builds, it takes 104 seconds, Enqueue time: mean 0.028 std dev 0.044, Report time : mean 0.053 std dev 0.04021:25
corvustristanC: are those metrics emitted by statsd?21:25
corvuser to statsd21:26
tristanCcorvus: it doesn't seem to be emitted to statsd21:26
corvustristanC: they look useful and should be easy to add -- could we go ahead and do that?  i think it'd be better overall if mqtt wasn't required in order to get stats like that21:28
tristanCcorvus: on the other hand, it's easier to setup mqtt than statsd21:28
*** cloudnull has quit IRC21:28
corvustristanC: for the benchmark or for a production system?21:28
tristanCwell i'm not familiar with statsd, but for mqtt, mosquitto is simple to setup21:30
tristanCi can have a look to duplicate that metrics in statsd if you prefer21:31
corvusi mean, mqtt isn't really a metrics database; yes, a production system needs a backing database for that (graphite/influx/whatever), but a sizable production system is going to have that, so just adding an extra metric is easy21:31
corvusbut for a benchmark you don't need a backing database21:31
corvusso using the statsd *protocol* can actually be even easier than using mqtt21:31
corvusgimme a sec to dig up a link21:32
*** cloudnull has joined #zuul21:33
corvustristanC: a statsd server doesn't have to be much more than this: https://opendev.org/zuul/zuul/src/branch/master/tests/base.py#L2783-L2812  in the tests, we parse the data in the test functions instead of on the server, but a simple split there could do it.21:33
corvustristanC: i think this is worthwhile because i really like the metrics you're looking at and the idea of benchmarking, and i'd love to use the same mechanism in prod as the benchmark21:34
corvusi can help flesh out that statsd server if you want21:34
tristanCi understand that statsd is more lightweigh, but i find the mqtt or prometheus ecosystem more active and easier to integrate21:35
tristanCanyway, it's already in place so we might as well add this enqueue and report time metric21:36
*** vishalmanchanda has quit IRC21:37
tristanCnote that my goal was to limite deviation, so i'm measuring trigger to enqueue and completed build to report, so that ansible execution is not taken into account21:38
tristanCand to make things go faster, i've replaced the ansible-playbook command by an `exit 0` script21:38
corvusthose will obviously produce different values in benchmark and prod :)21:38
corvusbut we can still measure processes before and after the build21:39
dmsimardo/ heads up: ansible==3.0.0 releasing tomorrow21:45
corvusdmsimard: thanks!21:45
avassdmsimard: oh interesting21:45
corvustristanC: i think this should handle counters, and almost handle timers: http://paste.openstack.org/show/802753/21:45
corvustristanC: i figure you might want to do your stdev calcs, etc, so i stopped there.. want me to finish that up, or you want to take it from here?21:46
*** gmann is now known as gmann_afk21:51
tristanCcorvus: i'm looking at adding the enqueue_time and report_time in Manager.reportStats21:52
tristanCor somewhere else, not sure what's the most appropriate yet21:55
corvustristanC: yeah, that's tricky; it might be better to do it in reportEnqueue etc...22:00
*** sduthil has joined #zuul22:06
corvustristanC: http://paste.openstack.org/show/802755/ should plug into your script fairly easily22:08
corvustristanC: should be able to do something like mean(statsd.timers['stats.gauges.zuul.tenant.foo.enqueue_time']) or whatever the metric name ends up being.22:09
openstackgerritJames E. Blair proposed zuul/nodepool master: WIP: Require TLS  https://review.opendev.org/c/zuul/nodepool/+/77628622:22
corvusthat will almost certainly fail the openshift job; we'll have to figure out how to do zk tls there22:23
*** jhesketh has joined #zuul22:32
openstackgerritJames E. Blair proposed zuul/nodepool master: WIP: Require TLS  https://review.opendev.org/c/zuul/nodepool/+/77628622:34
openstackgerritTristan Cacqueray proposed zuul/zuul master: scheduler: add statsd metric for enqueue time  https://review.opendev.org/c/zuul/zuul/+/77628722:35
tristanCcorvus: i can have a look, but why do you think it will fail?22:37
corvustristanC: it uses ensure-zookeeper to run zk, so we'll need to figure out how to set up tls certs for that  (or maybe see if we can switch to using test-setup.sh which will run zk in podman)22:38
tristanCcorvus: can I add a 'use_tls' ensure-zookeeper role attribute for that?22:40
tristanCthis would only be the third time i add tls configuration to a zookeeper confmgmt :-)22:41
corvustristanC: yeah, and i set the nodepool test framework to accept cert paths via ENV variables22:41
tristanCok, i can have a look now22:42
corvustristanC: what should we do for generating certs?  i also am adding zk-ca.sh to the nodepool repo, so we can run that if we want22:42
tristanCi guess i'll copy the script in zuul-jobs if that's ok22:43
corvuswfm22:43
corvuswhere does the k8s test get zookeeper?22:48
tristanCit's using the default zookeeper image from docker, see https://opendev.org/zuul/zuul-operator/src/branch/master/conf/zuul/components/ZooKeeper.dhall#L3722:49
corvustristanC: nodepool-functional-k8s doesn't use that does it?22:50
tristanCi don't think so22:50
corvusthat's the job i meant22:50
tristanCoh i missread then22:51
corvus(likewise nodepool-functional-openshift for openshift)22:51
tristanCwell it would make sense to do k8s integration test with the operator22:51
corvusi think i prefer the simpler approach -- let's keep building up22:51
corvusoh22:56
corvusit must have just used zookeeper from bindep22:56
corvustristanC: for the k8s job we can probably switch to using ensure-zookeeper as well22:58
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: ensure-zookeeper: add use_tls role var  https://review.opendev.org/c/zuul/zuul-jobs/+/77629023:02
openstackgerritJames E. Blair proposed zuul/nodepool master: WIP: Require TLS  https://review.opendev.org/c/zuul/nodepool/+/77628623:04
tristanCcorvus: what about regular job, should they also use ensure-zookeeper then?23:04
corvustristanC: i've got test-setup.sh setting up zk with tls, so we can use the regular tox jobs23:05
corvus(it's also how i plan on running tests on my workstation from now on too)23:05
corvusthat uses docker or podman23:05
corvustristanC: i'll update my patch to depends-on yours23:06
tristanCcorvus: ok, i got to go, i was adding the zookeeper_use_tls role vars, making a symlink from /opt/zookeeper/ca to tools/ca and skipping the compose when the dir exists in test-setup.sh23:08
tristanCthough file permission probably needs to be adjusted in zk-ca too23:09
*** rlandy is now known as rlandy|bbl23:11
openstackgerritJames E. Blair proposed zuul/nodepool master: WIP: Require TLS  https://review.opendev.org/c/zuul/nodepool/+/77628623:13
corvustristanC: since you're afk i'm going to push up a fix to your change23:13
openstackgerritJames E. Blair proposed zuul/zuul-jobs master: ensure-zookeeper: add use_tls role var  https://review.opendev.org/c/zuul/zuul-jobs/+/77629023:15
openstackgerritJames E. Blair proposed zuul/nodepool master: WIP: Require TLS.  https://review.opendev.org/c/zuul/nodepool/+/77628623:15
*** gmann_afk is now known as gmann23:22
*** jkt has quit IRC23:22
*** jkt has joined #zuul23:23
clarkbcorvus: would it be helpful to review either of those two changes at this point?23:28
corvusclarkb: only if you're bored; i'm still bashing it into shape23:29
clarkbwell I do think I could use a break from staring at scala23:29
clarkbI'll start with the one that doesn't say WIP on it as it is also passing testing23:30
corvusclarkb: k; i'm looking at unit test failures right now23:30
corvuscool23:30
fungii lost an hour to kernel wireless firmwares moving from brcm to cypress without sufficient symlinks. that was a break indeed23:30
corvusclarkb: as tristanC was alluding to, i think we need a solution to the ownership issue with the use_tls patch23:34
corvusit's going to run the ca as root, but the tests need to read the client cert and key as the zuul user23:34
corvusshould we just run the ca as the zuul user?23:34
clarkbcorvus: for test purposes that seems fine and practical23:35
clarkbas an alternative you could chown all the client and server stuff to zuul but keep the ca as root?23:36
openstackgerritJames E. Blair proposed zuul/nodepool master: WIP: Require TLS.  https://review.opendev.org/c/zuul/nodepool/+/77628623:37
corvusi'll adapt the zuul-jobs change to run the ca without become23:38
clarkbcorvus: couple of thoughts on the existing ps but nothing that really needs changing23:39
openstackgerritJames E. Blair proposed zuul/zuul-jobs master: ensure-zookeeper: add use_tls role var  https://review.opendev.org/c/zuul/zuul-jobs/+/77629023:44
corvusclarkb: okay let's try that ^23:44
* clarkb refreshes23:44
openstackgerritJames E. Blair proposed zuul/nodepool master: WIP: Require TLS  https://review.opendev.org/c/zuul/nodepool/+/77628623:45
clarkbthat role update looks like it should work23:47

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!