openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: Add API endpoint to get frozen jobs https://review.openstack.org/607077 | 00:07 |
---|---|---|
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: Get executor job params https://review.openstack.org/607078 | 00:07 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: Separate out executor server from runner https://review.openstack.org/607079 | 00:10 |
tristanC | ^ just fixing yet another rebase conflict | 00:10 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: runner: implement prep-workspace https://review.openstack.org/607082 | 00:11 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: runner: add configuration schema https://review.openstack.org/640672 | 00:11 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: runner: add execute sub-command https://review.openstack.org/630944 | 00:11 |
*** irclogbot_3 has joined #zuul | 00:18 | |
*** jamesmcarthur has quit IRC | 00:30 | |
*** jamesmcarthur has joined #zuul | 01:46 | |
*** jamesmcarthur has quit IRC | 02:03 | |
*** bjackman has joined #zuul | 02:34 | |
*** jamesmcarthur has joined #zuul | 03:51 | |
*** jamesmcarthur has quit IRC | 04:12 | |
*** daniel2 has quit IRC | 04:30 | |
*** wxy-xiyuan has quit IRC | 04:31 | |
*** dcastellani has quit IRC | 04:31 | |
*** wxy-xiyuan has joined #zuul | 04:31 | |
*** spsurya has quit IRC | 04:31 | |
*** maxamillion has quit IRC | 04:31 | |
*** PrinzElvis has quit IRC | 04:31 | |
*** gundalow has quit IRC | 04:32 | |
*** kmalloc has quit IRC | 04:32 | |
*** hogepodge has quit IRC | 04:32 | |
*** jbryce has quit IRC | 04:33 | |
*** spsurya has joined #zuul | 04:34 | |
*** PrinzElvis has joined #zuul | 04:34 | |
*** gundalow has joined #zuul | 04:34 | |
*** hogepodge has joined #zuul | 04:34 | |
*** jbryce has joined #zuul | 04:38 | |
*** daniel2 has joined #zuul | 04:38 | |
*** dcastellani has joined #zuul | 04:39 | |
*** kmalloc has joined #zuul | 04:40 | |
*** jamesmcarthur has joined #zuul | 04:52 | |
*** jamesmcarthur has quit IRC | 04:57 | |
*** raukadah is now known as chandankumar | 05:06 | |
*** saneax has joined #zuul | 06:09 | |
*** swest has joined #zuul | 06:13 | |
*** bjackman has quit IRC | 06:41 | |
*** bjackman has joined #zuul | 06:41 | |
*** pcaruana has joined #zuul | 07:27 | |
*** gtema has joined #zuul | 08:22 | |
*** jpena|off is now known as jpena | 08:51 | |
*** gtema has quit IRC | 09:00 | |
openstackgerrit | Merged openstack-infra/nodepool master: Update docs for provider removal. https://review.openstack.org/645220 | 09:27 |
arxcruz|pto | hey guys, can we have some love on https://review.openstack.org/#/c/607077/ ? | 09:28 |
*** bjackman has quit IRC | 09:42 | |
*** saneax has quit IRC | 10:01 | |
*** saneax has joined #zuul | 10:03 | |
*** hashar has joined #zuul | 10:23 | |
openstackgerrit | Luigi Toscano proposed openstack-infra/zuul-jobs master: DNM Debug stage-output, change archival mechanism https://review.openstack.org/645239 | 10:42 |
*** dcastellani has quit IRC | 10:59 | |
*** spsurya has quit IRC | 10:59 | |
*** dcastellani has joined #zuul | 11:00 | |
*** spsurya has joined #zuul | 11:01 | |
openstackgerrit | Fabien Boucher proposed openstack-infra/zuul master: Elasticsearch Zuul reporter https://review.openstack.org/644927 | 11:12 |
*** arxcruz|pto is now known as arxcruz | 11:31 | |
*** pcaruana has quit IRC | 11:53 | |
*** Guest12731 has joined #zuul | 12:08 | |
*** rlandy has joined #zuul | 12:14 | |
*** logan- has quit IRC | 12:14 | |
*** Guest12731 is now known as logan- | 12:14 | |
openstackgerrit | Luigi Toscano proposed openstack-infra/zuul-jobs master: DNM Debug stage-output, change archival mechanism https://review.openstack.org/645239 | 12:19 |
*** jpena is now known as jpena|lunch | 12:36 | |
*** pcaruana has joined #zuul | 12:54 | |
*** hashar has quit IRC | 12:58 | |
*** altlogbot_2 has quit IRC | 13:01 | |
*** irclogbot_3 has quit IRC | 13:01 | |
*** altlogbot_0 has joined #zuul | 13:03 | |
*** irclogbot_0 has joined #zuul | 13:03 | |
openstackgerrit | Luigi Toscano proposed openstack-infra/zuul-jobs master: DNM Debug stage-output, change archival mechanism https://review.openstack.org/645239 | 13:08 |
openstackgerrit | Fabien Boucher proposed openstack-infra/zuul master: Elasticsearch Zuul reporter https://review.openstack.org/644927 | 13:09 |
openstackgerrit | Luigi Toscano proposed openstack-infra/zuul-jobs master: DNM Debug stage-output, change archival mechanism https://review.openstack.org/645239 | 13:26 |
*** jpena|lunch is now known as jpena | 13:36 | |
fbo | Hi, we have a Zuul user that was looking for a way to send build results into elasticsearch (then explore/graph via kibana). I did a reporter proposal here https://review.openstack.org/644927, do you think that is something relevant to have into Zuul ? | 14:26 |
*** smyers has quit IRC | 14:27 | |
corvus | arxcruz: the zuul-runner stack is our next priority -- we're finishing up the multi-ansible stuff this week (i think we need to make one more point release). see the most recent project update email. | 14:28 |
arxcruz | corvus: no problem, i wait until now, i can wait a little bit more :) | 14:29 |
arxcruz | just wanted to ensure it's not missed :) | 14:29 |
corvus | arxcruz: so close! :) thanks! | 14:29 |
*** smyers has joined #zuul | 14:37 | |
corvus | fbo: i'll give it a quick look :) | 14:37 |
fbo | corvus: thanks :) | 14:44 |
corvus | fbo: that seems like a fine idea -- i left one quick question, and will give it a more detailed review later. we'll definitely want clarkb to review that too :) | 14:46 |
mordred | fbo: yeah - I saw that review come by yesterday - the concept seems like a potentially neat option for zuul users | 14:57 |
mordred | fbo: (although I was on an airplane and haven't actually, you know, looked at it) | 14:57 |
pabelanger | does anybody have thoughts on graylog? that came up in discussion recently for log mgmt | 15:08 |
*** jamesmcarthur has joined #zuul | 15:12 | |
fbo | corvus: mordred yes and we are excited to build nice dashboard on top of that data. We tried before based on the log artifacts exported to logstash/elastic but had to find a unique line of log like (Job console starting...) and that was cumbersome | 15:18 |
fbo | having both: build/buildset data + log artifacts in elk is a nice have | 15:20 |
*** altlogbot_0 has quit IRC | 15:21 | |
fbo | pabelanger: problem with elk is the authentication (rely on x-pack extension (not free)) and it seems graylog have the support | 15:24 |
*** altlogbot_2 has joined #zuul | 15:26 | |
clarkb | fbo: corvus: two things I notice really quickly are that we only allow for a single uri? you may want to take a list so that you can have fallback nodes. Also I think you want indexes to rollover on some period for management purposes | 15:28 |
clarkb | if an index becomes corrupt being able to delete a day's worth of data is worthwhile. At least for log data when you are talking terabytes of data. The zuul data may be small enough that isn't a huge concern | 15:29 |
*** irclogbot_0 has quit IRC | 15:30 | |
pabelanger | fbo: yah, I haven't looked too much at graylog myself, aside from looking at the website. | 15:31 |
openstackgerrit | Luigi Toscano proposed openstack-infra/zuul-jobs master: stage-output: fix the archiving of all files https://review.openstack.org/645239 | 15:31 |
*** irclogbot_0 has joined #zuul | 15:33 | |
*** saneax has quit IRC | 15:33 | |
*** irclogbot_0 has quit IRC | 15:36 | |
fbo | clarkb: yes good points the driver should take a comma separated list of el nodes (I kept it simple for the first implementation). Also yes the index should not grow as fast as with logs but splitting index might be useful for the reason you gave. We can think of having a strategy based on the number of docs in the index or simply by date to split the indexes. | 15:37 |
*** irclogbot_3 has joined #zuul | 15:38 | |
SpamapS | fbo: regarding parsing the console.. did you try sending the json in as a document? | 15:49 |
SpamapS | I guess it's a bunch of lists.. might not be useful in elastic | 15:50 |
SpamapS | but.. there should be no reason to parse the text. You have everything you need in the json. | 15:50 |
SpamapS | And really I don't know that you need a reporter. You could make a post-run playbook that feeds into elastic pretty easily. | 15:50 |
*** hashar has joined #zuul | 15:52 | |
fbo | SpamapS: that's not about parsing the console. You can see it as the same as the sql reporter but for elastic so only related to build and buildset data. In a post-run playbook some info will be missing like a SKIPPED job result. | 16:07 |
SpamapS | fbo: there are skipped job results? | 16:12 |
SpamapS | Oh I guess if parents fail | 16:13 |
*** smyers has quit IRC | 16:14 | |
SpamapS | Anyway, IMO you can get pretty far just scraping the database and shoving it into elastic. | 16:14 |
SpamapS | The problem with reporters is they tax the scheduler, which is already really, really busy. | 16:14 |
*** smyers has joined #zuul | 16:15 | |
SpamapS | (of course, we could fix that with some gearman or zk job farming.. but... moar components?) | 16:15 |
pabelanger | could you not get data from mqtt reporter? then offload that to some other publisher? | 16:19 |
fbo | SpamapS: yes skipped child job due to parent failure. Yes more load but still that is configurable/activable by pipeline | 16:20 |
fbo | pabelanger: yes that's possible but a bit more complicated to setup for a Zuul operator imo. | 16:22 |
pabelanger | agree, would put more work on deployers | 16:22 |
pabelanger | might be a good way to scale out too | 16:23 |
clarkb | its not necessarily more work on deployers if it is automatically forked like the geard process | 16:30 |
pabelanger | oh, yah. In my brain, I was thinking about how openstack logstash workers were setup | 16:32 |
fbo | ok you mean still part of zuul, like a zuul-sql-reporter or zuul-elastic-reporter | 16:32 |
fbo | reading gearman report jobs from their own geard | 16:33 |
*** maxamillion has joined #zuul | 16:37 | |
clarkb | well if the main scheduler process always ran an (internal) mqtt reporter then you could fork off reporters for the other backends | 16:40 |
clarkb | then the cpu required to report is limited to mqtt (or whatever other internal bus is chosen) | 16:40 |
clarkb | I don't know if that is actually worthwhile, but is one approach that could be taken | 16:46 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul master: Update component diagram to show statsd https://review.openstack.org/645798 | 16:58 |
pabelanger | would love to get some eyes on ^ for statsd integration. Looking at code, I could only see executors and scheduler sending data to statsd. | 16:58 |
*** chandankumar is now known as raukadah | 17:02 | |
*** jamesmcarthur has quit IRC | 17:05 | |
corvus | tobiash: shall i tag efae4deec5b538e90b88d690346a58538bd5cfff as 3.7.1 ? | 17:10 |
tobiash | corvus: ++ | 17:12 |
tobiash | corvus: but I found another bug, default_ansible_version seems to be ignored in zuul.conf | 17:12 |
tobiash | but that might be less critical | 17:13 |
corvus | tobiash: think you might have a fix soon? if so, we could wait for it; if not, we could do 3.7.1 today and 3.7.2 next week | 17:16 |
tobiash | pabelanger: merger should send data to statsd too | 17:16 |
tobiash | corvus: define 'soon' | 17:17 |
corvus | tobiash: 3 hours? :) | 17:17 |
tobiash | challenge accepted | 17:18 |
corvus | tobiash: cool... would you mind adding a release note about that and also the uri module fix? doesn't have to be much, but i just realized we don't have any release notes and it seems weird to have a release without at least one. | 17:18 |
tobiash | k, will do | 17:19 |
tobiash | was uri only broken for 2.7? | 17:19 |
corvus | tobiash: yes | 17:19 |
pabelanger | tobiash: Hmm, I didn't see a code path for that, but also could just be blind :) | 17:22 |
pabelanger | looking again | 17:22 |
*** jamesmcarthur has joined #zuul | 17:22 | |
tobiash | pabelanger: the merger queue | 17:23 |
tobiash | oh, the merger queue probably comes from the scheduler | 17:24 |
pabelanger | tobiash: http://git.zuul-ci.org/cgit/zuul/tree/zuul/scheduler.py#n396 | 17:24 |
pabelanger | yah | 17:24 |
tobiash | then I think you're right | 17:24 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add fetch-sphinx-tarball role https://review.openstack.org/645346 | 17:26 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add download artifact role https://review.openstack.org/645384 | 17:26 |
*** jamesmcarthur has quit IRC | 17:27 | |
*** jamesmcarthur has joined #zuul | 17:34 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Fix ignored default ansible version https://review.openstack.org/645819 | 17:47 |
tobiash | corvus: ^ | 17:47 |
Shrews | clarkb: you will be happy (or sad?) to hear that i have found multiple issues around the nodepool build process, none of which i have solutions for atm | 17:49 |
tobiash | Shrews: you mean the image build process? | 17:49 |
Shrews | tobiash: yes | 17:49 |
Shrews | tobiash: this is causing some image files to be left unmanaged on the builders | 17:50 |
tobiash | we're leaking massively images in our clouds btw | 17:52 |
tobiash | need to look into that as well | 17:52 |
tobiash | every few weeks I need to delete 10-20TB of images from our cloud :-/ | 17:53 |
Shrews | tobiash: the leak seems mostly related to losing the ZK session during the image build process | 17:53 |
pabelanger | tobiash: wow | 17:54 |
Shrews | the other problem is that we are creating two different image build znodes for a single image build | 17:54 |
Shrews | that one seems more easily fixed, but not sure its impact on the leak yet | 17:54 |
Shrews | if any | 17:54 |
tobiash | Shrews: do you think it's viable to fail and delete the image if we lost the lock? | 17:55 |
tobiash | sure, it can be very expensive, but that image is probably lost anyway? | 17:55 |
clarkb | if dib has been killed externally then there isn't really a good way to recover | 17:56 |
Shrews | tobiash: it's supposed to do that already (state is BUILDING but no build lock). that might be due to the 2-znode problem | 17:56 |
tobiash | ah ok | 17:56 |
Shrews | i still see issues with that though | 17:57 |
Shrews | i think we can force kick the cleaning process when the build finishes, rather than let the cleanup thread do it | 17:58 |
fungi | though is the cleanup thread also not working? | 18:03 |
Shrews | fungi: it works, but it's a timing thing. the lost zk session causes the cleanup thread to begin a build that may still be in process (thus new files may appear after we think we've deleted them) | 18:05 |
Shrews | begin to cleanup* a build, that is | 18:05 |
*** jpena is now known as jpena|off | 18:13 | |
corvus | tobiash: thanks! | 18:18 |
corvus | i think we can issue the release after that lands without an openstack-infra burn-in | 18:19 |
*** jamesmcarthur has quit IRC | 18:19 | |
*** jamesmcarthur has joined #zuul | 18:20 | |
corvus | SpamapS, fbo: it's true that reporters are run in the scheduler main thread, however, as long as we keep them simple, i don't think they should have too much of an impact -- a pipeline usually doesn't have too many, they don't run that often, and usually they're just 'fire and forget' -- should only take a few hundred ms. i think we'll have scale-out schedulers before reporter cpu-time becomes a significant | 18:21 |
corvus | problem. | 18:21 |
corvus | SpamapS, fbo: it's also true that many 'reporting' actions can be handled in jobs (indeed, we do that in openstack-infra with our logstash processor), there is a difference in the data available to them -- jobs necessarily report about themselves, whereas reporters have the big picture of a buildset. so it's worth considering which one is right for a given application. | 18:23 |
*** jamesmcarthur has quit IRC | 18:24 | |
SpamapS | corvus: scaling out reporters would be another use case for the "cleanup job" concept.. have the cleanup job look at the whole tree and do all of the reporter work. | 18:24 |
SpamapS | but alas, the time. | 18:25 |
corvus | SpamapS: yes... the infinite amount of time in the universe which we are unable to access... :( | 18:25 |
* SpamapS shakes fist at time and space | 18:26 | |
*** pcaruana has quit IRC | 18:45 | |
pabelanger | Question: If you don't have zuul-fingergw running, is zuul-web smart enought to try to connect to zuul-executors streaming port directly? Or does it need to connect via zuul-fingergw? | 18:48 |
pabelanger | enough* | 18:49 |
pabelanger | I know the finger client wouldn't work, because the port is not 79/tcp | 18:49 |
Shrews | pabelanger: iirc, it should get the streaming info (server and port), via gearman. it wouldn't go through the zuul-fingergw | 18:56 |
Shrews | zuul-fingergw gets the info the same way and should just be for finger clients | 18:57 |
pabelanger | Shrews: ack, thanks! | 18:58 |
corvus | pabelanger: any reason you can't run fingergw? | 19:07 |
pabelanger | corvus: nope, booting it now. Was mostly curious if executors were public, if it was really needed | 19:09 |
pabelanger | was also trying to see flow for firewalls | 19:09 |
corvus | pabelanger: it's needed so that "finger uuid@zuul" works, which i think is a useful feature :) but it's just like web -- only the web and fingergw services need to be publicly accessible; executors never do -- they only need to be accessible by the web and fingergw processes. | 19:11 |
pabelanger | great, thats how I remembered it | 19:12 |
pabelanger | thanks | 19:12 |
*** saneax has joined #zuul | 19:17 | |
tobiash | corvus: do you think it makes sense to increase the default wait_timeout further to 90s? | 19:19 |
tobiash | we're currently rechecking all the time :( | 19:20 |
corvus | tobiash: maybe so | 19:21 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul master: Add web / fingergw connections for components graph https://review.openstack.org/645852 | 19:21 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Increase default wait_timeout https://review.openstack.org/645853 | 19:22 |
SpamapS | BTW, I failed to notice before so c'est la vie, but did we break config file compatibility again? I think we did by requiring the user to be set in fingergw | 20:02 |
corvus | SpamapS: yes, we batched a couple of those changes up and made sure to highlight them in the release upgrade notes -- so 3.7.0 is the "you may need to pay attention to your deployment settings" release | 20:19 |
corvus | 3 changes total -- multi-ansible, zookeeper connection default, and fingergw user | 20:20 |
SpamapS | Yeah that's ok. Just making sure I understand. | 20:20 |
corvus | maybe we should name our releases like that :) | 20:20 |
SpamapS | I don't use fingergw, so it doesn't affect me | 20:20 |
SpamapS | (why would I need to use fingergw?) | 20:20 |
corvus | SpamapS: because it's awesome? :) "finger uuid@zuul | grep -i error" | 20:20 |
SpamapS | corvus: we should look for quotes from Ghostbusters to adequately convey that need. ;) | 20:21 |
corvus | SpamapS: nice :) | 20:21 |
corvus | i'll, um, get right on that research project :) | 20:21 |
SpamapS | 3.7.0 - "I'm warning you, turning off these machines would be extremely hazardous." | 20:21 |
SpamapS | 4.0.0 can be "Many Shubs and Zuuls knew what it was to be roasted in the depths of a Sloar that day, I can tell you!" | 20:24 |
corvus | gold | 20:25 |
SpamapS | We should rename nodepool to Sloar. | 20:27 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul master: Don't assume secrets are text in encrypt_secret https://review.openstack.org/645888 | 21:22 |
*** saneax has quit IRC | 21:32 | |
*** mgoddard has quit IRC | 21:47 | |
*** mgoddard has joined #zuul | 21:47 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Minor improvements to docker-image doc structure https://review.openstack.org/645897 | 21:49 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Organize documentation by subject area https://review.openstack.org/645955 | 22:52 |
corvus | AJaeger: ^ your feedback especially sought on that one | 22:52 |
clarkb | corvus: is the idea there that around the autojob loads you can write the narrative? | 22:55 |
corvus | clarkb: yep, sort of like how i did the container images documentation for opendev/base-jobs (but probably with many fewer words) | 22:56 |
corvus | so you could say "these next 3 roles are all about dealing with python releases", and put that in a subsection heading | 22:56 |
corvus | that way if a user is trying to find out what's available to help with a python project, they have a better tool than 'grep' :) | 22:57 |
corvus | (or ctrl-f in browser) | 22:58 |
*** rlandy has quit IRC | 23:11 |