manjeets | corvus, ping (if you're still there, I've got a question) regarding the config project | 00:38 |
---|---|---|
manjeets | I forked zuul-config here in https://github.com/manjeetbhatia/zuul-config and defined a project and modified pipeline to just have check https://github.com/manjeetbhatia/zuul-config/blob/master/zuul.d/projects.yaml | 00:39 |
manjeets | and I modified zuul-ci.org part http://paste.openstack.org/show/734385/ in zuul_conf | 00:41 |
manjeets | and I'm trying to see a comment here https://review.openstack.org/#/c/612818/ which never gets posted; I saw the event for that change being streamed on my setup | 00:43 |
openstackgerrit | Ian Wienand proposed openstack-infra/nodepool master: Use openstacksdk submit_task https://review.openstack.org/616387 | 00:51 |
ianw | hrm, anyone know anything about the nodepool kubernetes job? | 01:03 |
ianw | 2018-11-08 00:59:04.366651 | ubuntu-xenial | [ERROR SystemVerification]: unsupported docker version: 18.09.0 | 01:03 |
tristanC | ianw: seems like this version of minikube doesn't work anymore on that node. | 01:08 |
ianw | i guess ubuntu bumped a version? | 01:08 |
tristanC | ianw: or minikube, it seems like we are installing the latest | 01:09 |
ianw | oh, that makes sense, minikube stopped supporting that | 01:10 |
ianw | there hasn't been a release | 01:10 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul-jobs master: install-kubernetes: pin minikube version by default https://review.openstack.org/616388 | 01:12 |
ianw | 18.09 | 01:13 |
ianw | 2018-11-08 ... docker | 01:13 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: DNM: test older minikube version https://review.openstack.org/616389 | 01:13 |
ianw | tristanC: ^^ commented in above, don't we need to pin docker? | 01:15 |
tristanC | ianw: can we do that? | 01:18 |
ianw | ... ? :) | 01:19 |
openstackgerrit | Ian Wienand proposed openstack-infra/zuul-jobs master: [wip] add docker package pin option https://review.openstack.org/616391 | 01:52 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul-jobs master: install-kubernetes: enable minikube version pinning https://review.openstack.org/616388 | 01:56 |
tristanC | ianw: i meant, since it comes from distro, isn't the version we need going to be removed eventually? | 01:58 |
ianw | tristanC: well it comes from the upstream repos i think, but i'm not sure what they keep around in there, as you say it may not be there | 01:59 |
ianw | anyway, good excuse to get back to the docker role stuff i've had on the backburner | 01:59 |
ianw | oh, haha i just realised i rewrote that anyway | 02:40 |
openstackgerrit | Ian Wienand proposed openstack-infra/nodepool master: [dmn] test pinning docker version https://review.openstack.org/616398 | 02:50 |
openstackgerrit | Ian Wienand proposed openstack-infra/zuul-jobs master: [wip] add docker package pin option https://review.openstack.org/616391 | 02:56 |
openstackgerrit | Ian Wienand proposed openstack-infra/nodepool master: [dmn] test pinning docker version https://review.openstack.org/616398 | 03:04 |
*** fungi has quit IRC | 03:06 | |
*** fungi has joined #zuul | 03:09 | |
*** fungi has quit IRC | 03:10 | |
openstackgerrit | Ian Wienand proposed openstack-infra/zuul-jobs master: install-docker: add package version option https://review.openstack.org/616391 | 03:18 |
*** mrhillsman has joined #zuul | 03:19 | |
openstackgerrit | Ian Wienand proposed openstack-infra/nodepool master: Pin docker for k8s test https://review.openstack.org/616398 | 03:21 |
openstackgerrit | Ian Wienand proposed openstack-infra/nodepool master: Use openstacksdk submit_task https://review.openstack.org/616387 | 03:21 |
openstackgerrit | Ian Wienand proposed openstack-infra/nodepool master: Revert "Pin docker for k8s test" https://review.openstack.org/616404 | 03:21 |
ianw | tristanC: ^ i think that little stack gets us back on track ... | 03:21 |
openstackgerrit | Ian Wienand proposed openstack-infra/nodepool master: Pin docker for k8s test https://review.openstack.org/616398 | 03:24 |
openstackgerrit | Ian Wienand proposed openstack-infra/nodepool master: Use openstacksdk submit_task https://review.openstack.org/616387 | 03:24 |
openstackgerrit | Ian Wienand proposed openstack-infra/nodepool master: Revert "Pin docker for k8s test" https://review.openstack.org/616404 | 03:24 |
tristanC | ianw: nice! | 03:38 |
*** fungi has joined #zuul | 03:40 | |
*** fungi has quit IRC | 03:41 | |
openstackgerrit | Ian Wienand proposed openstack-infra/zuul-jobs master: install-docker: add package version option https://review.openstack.org/616391 | 03:43 |
ianw | tristanC: heh, not yet ... fiddling with the jinja string manipulations ... but i think it will work at least | 03:43 |
*** fungi has joined #zuul | 03:45 | |
tristanC | ianw: perhaps we should try using minikube with cri-o instead of dealing with these docker-ce version issues. | 03:50 |
tristanC | every time i try k8s, i keep getting weird errors related to docker... | 03:51 |
*** rlandy|bbl is now known as rlandy | 03:59 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Remove unused nodepool-k8s-functional role https://review.openstack.org/616409 | 04:05 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Remove nodepool-k8s-functional and install-nodepool roles https://review.openstack.org/616409 | 04:24 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Implement an OpenShift resource provider https://review.openstack.org/570667 | 04:29 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Implement an OpenShift resource provider https://review.openstack.org/570667 | 04:49 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Implement an OpenShift Pod provider https://review.openstack.org/590335 | 04:52 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool master: Support node caching in the nodeIterator https://review.openstack.org/604648 | 06:26 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool master: Support node caching in the nodeIterator https://review.openstack.org/604648 | 06:29 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool master: Rate limit updateNodeStats https://review.openstack.org/613680 | 06:35 |
tristanC | tobiash: what do you think about the unified stat interface to support prometheus proposed in the last comment of https://review.openstack.org/#/c/599209/ ? | 06:46 |
tobiash | tristanC: are you in berlin next week? | 06:47 |
tristanC | tobiash: yes, but only tuesday/wednesday | 06:47 |
tobiash | maybe we should discuss this there | 06:48 |
tristanC | that works for me | 06:49 |
tobiash | I think a monitoring driver would be a little bit more complex as it would need to abstract different methodologies (path based mapping vs labels) | 06:49 |
tristanC | tobiash: i suggested to use a path based string format by default to keep statsd support, and prometheus would use the kwargs as labels | 06:51 |
tristanC | but perhaps i'm missing some corner cases, i haven't looked at all the current statsd metrics | 06:53 |
tristanC | i've added the interface argument documentation in a new comment on 599209 | 07:01 |
*** quiquell|off is now known as quiquell | 07:03 | |
openstackgerrit | Andreas Jaeger proposed openstack-infra/zuul-jobs master: install-docker: add package version option https://review.openstack.org/616391 | 07:10 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/zuul-jobs master: install-kubernetes: enable minikube version pinning https://review.openstack.org/616388 | 07:11 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Implement a Runc driver https://review.openstack.org/535556 | 07:23 |
*** pcaruana has joined #zuul | 07:34 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Implement a Runc driver https://review.openstack.org/535556 | 07:47 |
*** quiquell is now known as quiquell|brb | 07:49 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Implement an OpenShift resource provider https://review.openstack.org/570667 | 08:02 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Implement an OpenShift Pod provider https://review.openstack.org/590335 | 08:02 |
*** themroc has joined #zuul | 08:10 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool master: Update node during lockNode https://review.openstack.org/616450 | 08:14 |
tobiash | corvus: this implements your idea of updating a node in lockNode ^ | 08:15 |
*** quiquell|brb is now known as quiquell | 08:17 | |
*** jpena|off is now known as jpena | 08:36 | |
*** hashar has joined #zuul | 08:37 | |
*** sshnaidm|afk is now known as sshnaidm|rover | 08:54 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool master: Update node during lockNode https://review.openstack.org/616450 | 09:07 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool master: Add extra safety belt when reusing a node https://review.openstack.org/616465 | 09:07 |
*** CrayZee has joined #zuul | 09:48 | |
*** CrayZee is now known as Guest31212 | 09:48 | |
*** Guest31212 has quit IRC | 09:49 | |
*** ssbarnea has joined #zuul | 09:50 | |
*** snapiri- has joined #zuul | 09:52 | |
openstackgerrit | Simon Westphahl proposed openstack-infra/nodepool master: wip: Try optimizing node lock check https://review.openstack.org/616484 | 09:58 |
*** CrayZee- has joined #zuul | 09:59 | |
*** CrayZee- has quit IRC | 09:59 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool master: Support node caching in the nodeIterator https://review.openstack.org/604648 | 10:01 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool master: Update node during lockNode https://review.openstack.org/616450 | 10:01 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool master: Add extra safety belt when reusing a node https://review.openstack.org/616465 | 10:01 |
*** electrofelix has joined #zuul | 10:10 | |
*** snapiri- has quit IRC | 10:19 | |
*** snapiri- has joined #zuul | 10:20 | |
openstackgerrit | Merged openstack-infra/zuul-jobs master: install-docker: add package version option https://review.openstack.org/616391 | 10:23 |
*** pbrobinson has joined #zuul | 10:25 | |
*** CrayZee has joined #zuul | 10:29 | |
openstackgerrit | Merged openstack-infra/zuul-jobs master: install-kubernetes: enable minikube version pinning https://review.openstack.org/616388 | 10:30 |
*** panda|off is now known as panda | 10:31 | |
*** snapiri- has quit IRC | 10:32 | |
*** hashar has quit IRC | 10:51 | |
*** CrayZee_ has joined #zuul | 11:00 | |
*** CrayZee has quit IRC | 11:02 | |
*** CrayZee_ has quit IRC | 11:31 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool master: Rate limit updateNodeStats https://review.openstack.org/613680 | 11:38 |
*** goern has joined #zuul | 11:41 | |
goern | heya, has anyone exported zuul stats to prometheus? | 11:42 |
tobiash | yes | 11:42 |
goern | :) | 11:42 |
tobiash | use statsd_exporter | 11:42 |
tobiash | (one per service instance if possible) | 11:42 |
tobiash | the required mappings for the statsd_exporter are linked in the comments of https://review.openstack.org/#/c/599209/ | 11:43 |
* goern never used statsd, but I see the mappings, should be straightforward | 11:45 | |
goern | @tobiash++ | 11:45 |
goern | thanks for the pointer | 11:46 |
tobiash | no problem | 11:46 |
*** jpena is now known as jpena|lunch | 12:03 | |
*** quiquell is now known as quiquell|lunch | 12:24 | |
*** rlandy has joined #zuul | 12:51 | |
*** AJaeger_ has joined #zuul | 12:52 | |
*** AJaeger has quit IRC | 12:55 | |
openstackgerrit | Merged openstack-infra/nodepool master: Only set basepython once https://review.openstack.org/615942 | 12:57 |
*** panda is now known as panda|off | 12:59 | |
mordred | tobiash: did we ever discuss putting those in a file in tree somewhere? | 13:05 |
tobiash | mordred: I'd like to discuss that in berlin | 13:05 |
tobiash | (if we want to support statsd and/or prometheus) | 13:06 |
mordred | cool | 13:06 |
tobiash | if we decide for only statsd I plan to add and maintain the mappings in the zuul tree | 13:06 |
tobiash | but I'd like to wait for a decision about that first | 13:07 |
mordred | I really wish prometheus supported a broadcast mechanism like statsd uses instead of just http polling | 13:07 |
tristanC | tobiash: why wouldn't we also support "native" prometheus protocol? | 13:07 |
mordred | tristanC: because it's super heavy-weight and requires an http endpoint | 13:07 |
mordred | as opposed to processes being able to just emit udp packets | 13:07 |
tobiash | I think that is the most important point for being discussed in berlin as there has not been a final decision about that | 13:08 |
mordred | yah | 13:08 |
mordred | there hasn't been a final decision - that's just what the basis of the pushback is | 13:08 |
mordred | it was clearly designed for a world where everything is a web service | 13:09 |
mordred | but we have things like the nodepool launchers - which neither have nor need an http endpoint | 13:09 |
mordred | but ... prometheus is also clearly a very important part of the k8s ecosystem | 13:09 |
goern | it not the new default | 13:10 |
mordred | yeah. my understanding is that people doing k8s basically are automatically going to be doing prometheus | 13:11 |
tristanC | mordred: adding a periodic http request doesn't sound "super heavy-weight" compared to the rest of zuul threads, and it would simplify management a lot as it will act as a monitoring probe with simple alerting/trigger action | 13:13 |
mordred | tristanC: right- but also the stats have to be stored in memory so that they can be reported when polls | 13:14 |
mordred | polled | 13:14 |
goern | and one thing to keep in mind: prometheus is more like metrics and alerting, not really for long term storage of metrics, so if you want to preserve 5 years of metric history.... | 13:15 |
mordred | right. which we do | 13:15 |
tristanC | mordred: it seems like "just emit udp packets" is also tricky and may require manual in-memory tricks to keep load under control... | 13:16 |
mordred | tristanC: not really - we've been using it at scale for like 6 years with zero management | 13:16 |
mordred | and we have all these lovely graphs: http://grafana.openstack.org/d/T6vSHcSik/zuul-status?orgId=1 | 13:16 |
goern | thats just grafana.. it doesnt care if it reads from prom or statsd | 13:17 |
mordred | goern: yah - that's right | 13:17 |
mordred | just saying - we keep and hold all sorts of nice historical data - and we get it for extremely low overhead in each process | 13:18 |
mordred | but this is just me grumping - it's definitely a topic we should discuss in berlin - because clearly the prometheus side of the equation isn't going away any time soon | 13:19 |
tristanC | mordred: i brought the topic back up after reading through 613680, which seems to suggest there is also an overhead with the statsd protocol | 13:20 |
mordred | tristanC: sure - there's an overhead ... but there isn't a _development_ overhead | 13:20 |
mordred | we don't have to write an http metrics endpoint that can be polled, and add it to every process, and add in-memory metrics storage for it to read from, and maintain that system over time, and add it to every process we use regardless of whether it's a process that otherwise would be looking for in-bound traffic | 13:21 |
mordred | like - I have statsd metrics reporting instrumented now inside of openstacksdk -- it happens at the point of out-bound http request/response. how in the world are those metrics supposed to be intercepted and reported through an application layer that does not interact with those http requests at all? | 13:23 |
*** jpena|lunch is now known as jpena | 13:24 | |
*** hashar has joined #zuul | 13:25 | |
mordred | (hopefully this ranting provides good background for folks for our discussion in berlin :) ) | 13:27 |
tristanC | mordred: perhaps detect if the prometheus_client is running and use it to store the openstacksdk metrics too. | 13:29 |
mordred | tristanC: perhaps. it's solvable, for sure | 13:30 |
tristanC | fwiw, you don't actually have to write anything, the python client takes care of everything. it's different for sure, but it sounds like a good stack and it doesn't seem difficult to support both statsd and prometheus | 13:32 |
mordred | tristanC: o_O | 13:32 |
tristanC | mordred: https://github.com/prometheus/client_python#three-step-demo | 13:35 |
goern | the component exporting the metrics (to an http endpoint) is easy, running the http server might be a little bit heavy and something that you dont have with statsd, prometheus server itself is just scraping (and maybe long term storing) from the metrics exporter... it is different, and it introduces a httpd for each component willing to export metrics | 13:37 |
tobiash | what I have to note is that statsd_exporter oom crashes here once a day (and it has a 400MB limit) | 13:39 |
mordred | tobiash: that's good to know | 13:40 |
*** quiquell|lunch is now known as quiquell | 13:41 | |
*** _cryptosignal_me has joined #zuul | 13:42 | |
*** AJaeger_ has quit IRC | 13:54 | |
*** AJaeger has joined #zuul | 13:57 | |
*** goern has quit IRC | 14:02 | |
*** ssbarnea has quit IRC | 14:02 | |
*** themroc has quit IRC | 14:02 | |
*** edleafe_ has quit IRC | 14:02 | |
*** irclogbot_3 has quit IRC | 14:02 | |
*** bstinson has quit IRC | 14:02 | |
*** pleia2 has quit IRC | 14:02 | |
*** dmellado has quit IRC | 14:02 | |
*** tobiash has quit IRC | 14:02 | |
*** jlk has quit IRC | 14:02 | |
*** bstinson_ has joined #zuul | 14:02 | |
*** themroc has joined #zuul | 14:04 | |
*** themroc has quit IRC | 14:04 | |
*** dmellado has joined #zuul | 14:04 | |
*** themroc has joined #zuul | 14:04 | |
*** tobiash has joined #zuul | 14:05 | |
*** ssbarnea has joined #zuul | 14:07 | |
*** pleia2 has joined #zuul | 14:07 | |
mordred | tristanC: thanks for the pointer - you are right, from an openstacksdk perspective it's not much different to report to prometheus | 14:29 |
mordred | tristanC: I updated the stats patch there to support both: https://review.openstack.org/#/c/614834/ | 14:30 |
tristanC | it seems like we need to figure out an abstract function to report metrics, then adding prometheus (or something else in the future) should be fairly simple | 14:41 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: WIP: Report tenant and project specific resource usage stats https://review.openstack.org/616306 | 14:45 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool master: Add resource metadata to nodes https://review.openstack.org/616262 | 14:47 |
tobiash | corvus, mordred: are you aware of merger/depends-on issues? | 14:49 |
tobiash | change 616306 resulted in 'This change depends on a change that failed to merge'. However the dependency has no merge conflict. | 14:50 |
*** bstinson_ is now known as bstinson | 14:52 | |
mordred | tobiash: it's a timing thing | 14:54 |
tobiash | ? | 14:54 |
tobiash | it's the check pipeline, not gate | 14:54 |
mordred | yah - one sec - lemme double check ... | 14:55 |
mordred | (I'm going to use my local times for this next thing - sorry) | 14:55 |
*** _cryptosignal_me has quit IRC | 14:55 | |
mordred | you uploaded ps 2 of 616306 at 8:45, then ps 3 of its depends-on 616262 at 8:47 - that is an update to the depends-on patch while the tests of 616306 were running | 14:56 |
mordred | so zuul invalidated the ps2 run | 14:56 |
mordred | with a merge conflict error | 14:56 |
mordred | because the dependency it calculated at the time the tests started was on ps2 - which isn't going to be the thing that merges | 14:57 |
mordred | tobiash: I hit this issue all the time over in sdk-land and it drives me a bit batty, but I don't have a better suggestion for what zuul should do differently yet | 14:57 |
mordred | I kind of wish zuul would just restart the tests on ps2 of 616306 when its depends-on is updated while it's in the gate ... that would make my personal life better ... but I don't know that it's the correct behavior | 14:58 |
tobiash | mordred: ah got it | 14:58 |
tobiash | thx | 14:58 |
SpamapS | mordred: I think that is correct behavior, because Depends-On always pulls in the latest patch in a depended on change. | 15:06 |
mordred | SpamapS: yah, I agree. it's just a correct behavior that's annoying to an end user and leads to "git review" ; "recheck" when updating patch series that bounce back and forth between projects | 15:14 |
mordred | SpamapS: like, project a patch 1 <- project b patch 2 <- project a patch 3 <- project b patch 4 | 15:14 |
mordred | there's no way to push updates to the whole stack that doesn't result in needing to go run recheck comments | 15:15 |
mordred | but maybe that's just life and I shouldn't spend as much of my time rebasing interleaved patch stacks :) | 15:15 |
*** hashar has quit IRC | 15:27 | |
corvus | tristanC: regarding 613680, i think you may have misunderstood the change -- that inefficiency has nothing to do with statsd, that's due to the way we count the nodes (by querying zk) to produce the data. the problem would be the same with prometheus. (cc: mordred) | 15:37 |
mordred | corvus: ++ | 15:38 |
corvus | tobiash: i just had another idea regarding 613680; i left a comment on the change. | 15:45 |
JosefWells | I'm still having problems with ansible... I should probably look at your docker quick-start now, but I feel like I'm close. Added ansible to PATH, but I get: | 15:45 |
JosefWells | zuul-executor_1 | 2018-11-08 15:43:07,848 DEBUG zuul.AnsibleJob: [build: 46dcd167e1844c41bd8a84fccafc2d3f] Ansible output: b"ImportError: No module named 'ansible'" | 15:45 |
JosefWells | but if I run python3 in the container, and do 'import ansible' it works | 15:45 |
corvus | JosefWells: can you enable verbose ansible logging by running "zuul-executor verbose", repeat that, and then paste the full traceback which it will output? | 15:46 |
JosefWells | sure, sounds good, thanks corvus | 15:47 |
corvus | JosefWells: (you can run 'zuul-executor verbose' while the executor is running to turn verbose logging on; running 'zuul-executor unverbose' will turn it off) | 15:47 |
JosefWells | also, when I comment 'recheck' it doesn't seem to put the PR back into the check pipeline, it maybe ?caches? the results, so I have to update my PR to get this to actually show up.. is that intended? | 15:49 |
JosefWells | my zuul.yaml has a pr comment trigger that matches "recheck" | 15:49 |
corvus | JosefWells: that depends on the pipeline definition -- there must be a comment trigger on the pipeline for that to work; if you paste your pipeline definition we can take a look | 15:49 |
JosefWells | - pipeline: name: check description: | Newly opened pull requests enter this pipeline to receive an initial verification manager: independent trigger: qc_github: - event: pull_request action: - opened - changed - reopened - event: pull_request action: comment comment: (?i)^\s*recheck\s*$ | 15:50 |
JosefWells | ack, terrible, sorry | 15:50 |
corvus | JosefWells: you may want to use paste.openstack.org for that :) | 15:50 |
JosefWells | indeed | 15:50 |
JosefWells | http://paste.openstack.org/show/734428/ | 15:51 |
corvus | JosefWells: that certainly looks right -- at least, it matches what we have -- you might be able to look at scheduler debug logs to see if there are any clues there. that's the part that processes those events and decides what to do with them | 15:53 |
tobiash | corvus: that's a great idea | 15:56 |
tobiash | Rate limited stats update triggered by cache updates | 15:56 |
tobiash | I'll try that when I have time | 15:56 |
corvus | tobiash: i'm trying to drop the rate limiting :) | 15:58 |
corvus | tobiash: granted, maybe that should wait until after the Node objects are cached to avoid the json parsing. but once we have the right data in memory, it should be almost no cost to emit an update in real time. | 15:59 |
tobiash | corvus: there will be still a cost iterating over a thousand nodes which I think isn't needed several times a second | 16:01 |
corvus | tobiash: sure, but computers can count to a thousand *very* fast these days :) to me, it's worth it to have the real-time info. | 16:03 |
tobiash | except if we can manage to cache the states itself and just increment/decrement | 16:03 |
corvus | tobiash: that would be fine too, but i think when we get this down to 0.001 seconds, it's not going to matter that much | 16:03 |
tobiash | ok, let's try this out and maybe we should also measure how much time it spends during stats updating | 16:06 |
tobiash | if we get these events per node we can do that incrementally without looping over all nodes (maybe loop over all every x updates to be sure) | 16:06 |
tobiash | that would be fast enough probably | 16:07 |
corvus | tobiash: yeah. and if we need to do periodic, that's fine. i'm just trying to keep the complexity down. :) | 16:07 |
JosefWells | Here is the chunk of logs after setting verbose | 16:09 |
JosefWells | http://paste.openstack.org/show/734431/ | 16:09 |
JosefWells | I could set PYTHONPATH in the container, but that seems odd as /usr/bin/python can see it | 16:10 |
corvus | JosefWells: can you tell us more about how you installed zuul (and whether you installed ansible separately?) | 16:11 |
JosefWells | my docker files are here: https://github.com/josefwells/zuul-docker although I have updated them a bit | 16:12 |
JosefWells | basically just pip install zuul | 16:12 |
JosefWells | oh, wow yeah, those dockers are before I understood how to docker | 16:12 |
JosefWells | but the install hasn't changed just how I run, docker-compose, etc | 16:13 |
JosefWells | I should just look at the 'official' one and stop pestering you guys, just want to make sure I'm not missing something fundamental | 16:13 |
*** pcaruana has quit IRC | 16:14 | |
JosefWells | I only have zuul and friends installed in the containers, the static nodepool I'm using should be assumed to have nothing modern or useful installed other than ssh | 16:14 |
JosefWells | I don't need ansible or zuul anywhere else right? | 16:14 |
corvus | JosefWells: well, one thing you could do is use our new upstream images (we publish them on every commit); that's what the example docker-compose does | 16:17 |
corvus | i'm having trouble connecting to github so i can't see the dockerfile right now | 16:17 |
JosefWells | better if you don't, they don't reflect what I've got now | 16:18 |
JosefWells | I should be able to execute that ansible command on the executor and explore more | 16:19 |
*** panda|off is now known as panda|rover | 16:19 | |
JosefWells | Hmm, I guess the ansible.cfg is no longer there | 16:20 |
corvus | JosefWells: you can run 'zuul-executor keep' and zuul will keep the temporary build dirs | 16:20 |
JosefWells | cool | 16:21 |
JosefWells | ansible_python_interpreter: /usr/bin/python2 | 16:24 |
JosefWells | seems wrong in inventory.yaml | 16:24 |
JosefWells | python2 doesn't have ansible importable | 16:25 |
corvus | JosefWells: i believe that just tells ansible to run python2 on the remote side (which is apparently necessary if you run ansible under python3 on the local side) | 16:25 |
JosefWells | hmm ok | 16:25 |
*** themroc has quit IRC | 16:36 | |
mordred | corvus: it may not be necessary any longer now that we're on newer ansible- we should really investigate whether we need to keep doing that | 16:38 |
*** quiquell is now known as quiquell|off | 16:39 | |
*** sshnaidm|rover is now known as sshnaidm|notrove | 16:46 | |
corvus | JosefWells: i don't have a lot of experience with 'pip3 install --user' zuul. i know that installing it in a virtualenv should work, as well as installing it globally as root. | 16:48 |
corvus | JosefWells: if you want to continue to dig into this, it might be "fun". :) if you just want to get something working, you might consider switching to the upstream images on dockerhub, or updating your dockerfile to install zuul and nodepool as root in your image. | 16:49 |
corvus | mordred: ^ | 16:49 |
*** sshnaidm|notrove is now known as sshnaidm|wknd | 16:53 | |
tobiash | corvus: re cache triggered stats. We might want to elect a 'stats leader' in order to avoid cluttered graphs caused by multiple nodepools updating the same stats with slightly different versions of the data | 17:00 |
tobiash | Fortunately there is also a kazoo recipe for that :) | 17:01 |
corvus | they think of everything | 17:01 |
*** panda|rover is now known as panda|rover|off | 17:02 | |
openstackgerrit | Fabien Boucher proposed openstack-infra/zuul master: WIP - Pagure driver https://review.openstack.org/604404 | 17:09 |
tobiash | This is needed to unbreak nodepool: https://review.openstack.org/#/c/616398 | 17:14 |
tobiash | (k8s functional job) | 17:14 |
clarkb | tobiash: approved | 17:14 |
tobiash | Thx | 17:15 |
*** edleafe_ has joined #zuul | 17:36 | |
openstackgerrit | Merged openstack-infra/nodepool master: Pin docker for k8s test https://review.openstack.org/616398 | 17:38 |
*** sshnaidm|wknd is now known as sshnaidm|off | 17:42 | |
*** jlk has joined #zuul | 17:43 | |
openstackgerrit | James E. Blair proposed openstack-infra/nodepool master: Convert kubernetes config docs to zuul-sphinx https://review.openstack.org/616643 | 17:45 |
corvus | tristanC, Shrews, ianw: ^ | 17:45 |
*** jpena is now known as jpena|off | 17:51 | |
openstackgerrit | James E. Blair proposed openstack-infra/nodepool master: Alter the metasyntactic variable in driver docs https://review.openstack.org/616646 | 17:58 |
openstackgerrit | James E. Blair proposed openstack-infra/nodepool master: Normalize sidebar in docs https://review.openstack.org/616649 | 18:01 |
*** irclogbot_3 has joined #zuul | 18:02 | |
*** pcaruana has joined #zuul | 18:10 | |
SpamapS | corvus: Ok I think I get what you were trying to tell me yesterday | 18:19 |
SpamapS | corvus: so, zuul-base-jobs, and zuul-jobs, it sounds like, need to be always be in a connection that uses git.zuul-ci.org, not git.openstack.org/openstack-infra? | 18:19 |
SpamapS | be always be | 18:19 |
* SpamapS needs more coffee | 18:19 | |
corvus | SpamapS: zuul-jobs doesn't care, but zuul-base-jobs does; so if you use zuul-base-jobs directly, yes. (or you can fork it and modify it). | 18:27 |
*** jimi|ansible has joined #zuul | 18:29 | |
SpamapS | Indeed, that fixed it | 18:33 |
SpamapS | corvus: will there be some kind of stable tag or branch I can attach to for zuul-base-jobs? This broke me "some time between september and last week" | 18:37 |
SpamapS | I can of course run a local mirror and submit PR's that run my tests, just wondering if there will be something lighter weight than that. | 18:38 |
clarkb | I don't think zuul can pin its configs to a specific ref? | 18:40 |
clarkb | (so even if there were tags it would roll forward as there were updates) | 18:40 |
SpamapS | yeah, I wasn't sure how it could be done either. | 18:42 |
SpamapS | and I was riding the lightning a little setting up zuul-base-jobs and "run from master" zuul anyway, so I'll also accept like "we won't break it again" as an answer. :) | 18:42 |
SpamapS | I *think* though, the right thing is going to be to have a local mirror of zuul-base-jobs whose sync job runs some generic "can *this* zuul use these jobs" tests before merging. | 18:43 |
corvus | we should try not to break it again. :) i think there probably at least should have been an announcement about that; i think the reviewers must have overlooked the possibility of breakage there. | 18:43 |
SpamapS | It's possible I missed the announcement in the whole "finding and starting a new job" chaos between September and Now. :) | 18:44 |
SpamapS | so, my question is more meta... just wondering what the strategy for zuul-base-jobs will be for less-involved users. | 18:44 |
corvus | fungi, AJaeger, tobiash: ^ https://review.openstack.org/599607 probably warranted an announcement | 18:45 |
SpamapS | One thing I was thinking is that if you *could* make zuul-base-jobs use a tag somehow in the tenant config, it would allow users to do the traditional "read the release notes, update the version" flow. | 18:45 |
tobiash | corvus: oops, you're right | 18:47 |
tobiash | SpamapS: sorry for that | 18:47 |
corvus | SpamapS: i think we could implement that without too much difficulty, but i think there would be a large cognitive overhead to making releases of zuul-jobs which i don't think will be worth it. we've actually been pretty good about keeping zuul-jobs CD-able -- i think we can manage to do that. | 18:48 |
*** caphrim007 has joined #zuul | 18:52 | |
SpamapS | corvus: cool that's great news. | 18:52 |
tobiash | corvus: I just had a crazy idea about these cross references | 18:53 |
tobiash | corvus: what if we would be able to attach some kind of uuid that zuul understands to the repo so we can reference such public repos via uuid | 18:54 |
tobiash | then this wouldn't depend on how connections are named or if this is a private but unpatched fork | 18:55 |
SpamapS | I was thinking the opposite. Not a UUID, but symbolic names, with defaults that point to the physical location. | 18:55 |
*** electrofelix has quit IRC | 18:55 | |
tobiash | aliases in the tenant config? | 18:55 |
tobiash | yes, probably easier | 18:56 |
SpamapS | So, upstream:zuul-jobs could be mapped to https://git.openstack.org/openstack-infra/zuul-jobs or https://git.mylocalmirror.local/foo and both be consumed in jobs as upstream:zuul-jobs.. allowing me to decouple my deployment choices from zuul upstream jobs' layout. | 18:56 |
SpamapS | anyway, ideas are a dime a dozen. I'll think about this a bit. For now I think being careful with the base jobs and paths should suffice. | 18:57 |
openstackgerrit | Doug Hellmann proposed openstack-infra/zuul-jobs master: enable setting python version for ensure-twine https://review.openstack.org/616673 | 19:08 |
*** smyers_ has joined #zuul | 19:26 | |
*** smyers has quit IRC | 19:26 | |
*** smyers_ is now known as smyers | 19:26 | |
*** rlandy is now known as rlandy|brb | 19:57 | |
*** hashar has joined #zuul | 20:08 | |
*** rlandy|brb is now known as rlandy | 20:19 | |
*** hashar has quit IRC | 20:21 | |
openstackgerrit | Ian Wienand proposed openstack-infra/nodepool master: Use openstacksdk submit_task https://review.openstack.org/616387 | 20:29 |
openstackgerrit | Ian Wienand proposed openstack-infra/nodepool master: Revert "Pin docker for k8s test" https://review.openstack.org/616404 | 20:30 |
clarkb | ianw: ^ heh that was quick | 20:30 |
ianw | clarkb: oh, sorry not sure if it's fixed, i just rebased it on master | 20:31 |
clarkb | ah | 20:31 |
ianw | it was previously stacked ontop that stats fix (which is ready for review ;) | 20:31 |
dkehn | tobiash: or clarkb: another question- the executor is getting a key error when it runs and I’m wondering what keyfile it's using? see https://pastebin.com/Mktaryz4 | 20:47 |
tobiash | dkehn: I think this should help you: https://review.openstack.org/608453 | 20:48 |
fungi | corvus: ahh, i didn't think about announcing it because i was told it was already broken and unusable directly since it needed a connection to an undocumented remote | 20:49 |
fungi | didn't think anyone was (or could) directly consume the zuul-base-jobs examples from that repo | 20:50 |
fungi | sorry about that SpamapS! | 20:50 |
dkehn | tobiash: ah yes that should | 20:50 |
openstackgerrit | Merged openstack-infra/nodepool master: Convert kubernetes config docs to zuul-sphinx https://review.openstack.org/616643 | 20:57 |
openstackgerrit | Merged openstack-infra/nodepool master: Alter the metasyntactic variable in driver docs https://review.openstack.org/616646 | 20:57 |
openstackgerrit | Merged openstack-infra/nodepool master: Normalize sidebar in docs https://review.openstack.org/616649 | 20:58 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: enable setting python version for ensure-twine https://review.openstack.org/616673 | 21:16 |
*** manjeets has quit IRC | 22:00 | |
corvus | clarkb, tobiash, jlk: i think the github connector in openstack's zuul has been stuck on an http request for several days. here's the stacktrace for the thread: http://paste.openstack.org/show/734445/ | 22:08 |
jlk | hrm. | 22:08 |
corvus | i think we may need to look into timeouts... though not sure where in github3/requests/urllib we need to be looking :) | 22:09 |
jlk | we definitely had some hiccups on our end recently. | 22:09 |
clarkb | corvus: can we try to kill the socket and if we did any idea if that would force it to reconnect? | 22:09 |
jlk | yeah, I would have thought there was a built in timeout :( | 22:09 |
corvus | clarkb: oh, er, maybe? though i mostly want to prevent it from happening again | 22:09 |
clarkb | ++ | 22:09 |
corvus | (as for fixing this, we need to restart zuul anyway for new features, so not too worried about that) | 22:09 |
clarkb | rgr | 22:10 |
corvus | i know requests has a timeout option to its .get .post etc methods | 22:10 |
jlk | yeah | 22:12 |
jlk | I don't see anything in github3.py for timeout | 22:12 |
jlk | looking at the trace it does not provide a way to pass through the timeout | 22:14 |
clarkb | guessing that we never got a FIN or RST so our end thinks the tcp connection is still open but github likely has no idea about it? | 22:14 |
corvus | looks like https://github.com/sigmavirus24/github3.py/blob/master/src/github3/models.py#L191 is where the action is | 22:14 |
clarkb | in which case timeouts in the lib would be good | 22:15 |
corvus | requests advocates using this: http://docs.python-requests.org/en/master/user/quickstart/#timeouts | 22:16 |
jlk | https://github.com/sigmavirus24/github3.py/issues/156 | 22:16 |
jlk | at one point requests may have been doing a default timeout | 22:16 |
corvus | i wouldn't mind a global setting -- like an argument to the constructor that set a timeout and then the _request method in models.py just uses that | 22:17 |
jlk | yeah, would need to come up with an easy reproducer | 22:19 |
corvus | jlk: it uses betamax, right? i wonder if there's a way to tell betamax "don't ever return any data for this http request" | 22:22 |
jlk | it does use betamax | 22:24 |
clarkb | kmalloc: ^ would probably know? | 22:24 |
jlk | I'm not super worried about testing requests functionality in our integration tests. We can simply test that the timeout param is passed through | 22:25 |
jlk | but I would like to show easily the current malfunction of indefinite hang | 22:25 |
clarkb | jlk: I think for that what you want is for the remote to close its socket without sending a reset or fin | 22:26 |
clarkb | jlk: like if the power goes out | 22:26 |
kmalloc | oh hai | 22:26 |
kmalloc | can scroll back as soon as i walk doggo to corner | 22:26 |
corvus | jlk: fwiw, what happens is exactly what the requests docs say will happen if the parameter is omitted :) but any kind of non-response will do i think. | 22:26 |
jlk | nod | 22:26 |
jlk | we may not need the test, if we're just following docs | 22:27 |
corvus | jlk: what do you think of this approach? https://github.com/jeblair/github3.py/commit/8a3c8a58f25a4b8ba327133917401c56a1bee3d2 | 22:31 |
jlk | not bad. couple things jump out | 22:40 |
jlk | 1) we'll have to accept the argument for the GitHubEnterprise class too | 22:41 |
jlk | oh. 2 was me not really understanding the GitHubCore session thing, but now I think I understand it | 22:43 |
corvus | i *think* i've got that right, but i didn't run tests or anything :) | 22:43 |
jlk | nod | 22:43 |
corvus | i'll clean that up and see if i can run tests real quick | 22:46 |
* kmalloc is back | 22:53 | |
kmalloc | looking at backscroll | 22:53 |
kmalloc | so the key with betamax is that if you want to not respond... you'd just have a null response entry | 22:54 |
kmalloc | it's not super magical... that said, i'm super not into betamax where requests mock can be used | 22:55 |
*** manjeets has joined #zuul | 22:55 | |
corvus | sounds like we may not need to figure it out anyway. i hope. :) | 22:55 |
kmalloc | cool | 22:55 |
kmalloc | that is good news | 22:56 |
*** JosefWells has quit IRC | 23:13 | |
corvus | jlk: good news and bad news. good news: it's very easy to verify the parameter is passed all the way to requests -- all the tests mock out the session object and assert that the request method was called with specific arguments. bad news: all the tests assert that the request method was called with specific arguments so the patch would change at least 358 assert calls to add a timeout parameter. | 23:29 |
jlk | hahahah ooof. | 23:29 |
jlk | I mean, it's the right thing to do, but that's a lot of churn | 23:29 |
jlk | if you want to start the PR and hand off that grunt work I'm fine w/ that. | 23:29 |
corvus | yeah, i was also thinking that's probably the right thing. i'll push up a branch with 2 commits to see the change and a single test fix -- maybe get review/buy-in on that before proceeding? | 23:30 |
jlk | yeah that seems reasonable | 23:30 |
jlk | this isn't gerrit, so not every commit has to pass tests | 23:31 |
jlk | blessing/curse | 23:31 |
corvus | yep, we can use that to our advantage here :) | 23:31 |
corvus | jlk: how's this look? https://github.com/jeblair/github3.py/pull/1/commits -- if that looks okay, i'll open a pr against upstream | 23:36 |
jlk | That looks great to me as an opening gambit | 23:39 |
corvus | i'll be sure to indicate that in the pr message | 23:40 |
corvus | (that the test fixes are waiting on buy-in on the approach) | 23:40 |
corvus | jlk, clarkb, tobiash: https://github.com/sigmavirus24/github3.py/pull/904 | 23:46 |
jlk | Nice. Fwiw you can use markdown in the PR body, like: The [requests documentation](http://url) recommends | 23:47 |
jlk | not that it detracts from the PR at all | 23:48 |
jlk | corvus: I'm really afraid that all the betamax recordings have specific requests data in them such as the timeout= call, which would mean re-recording all of them. | 23:49 |
jlk | which, I can certainly do, it'd just be a slog | 23:49 |
corvus | oh boy that sounds fun! | 23:52 |
corvus | well, hopefully we'll have some good chat on the pr and come up with the best thing to do before anyone wastes days on it :) | 23:53 |
jlk | yeah, or several :D | 23:53 |
*** caphrim007 has quit IRC | 23:59 |
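
The timeout approach discussed above (a single setting given to the constructor, applied by the internal request method, and checked in tests by mocking the session) could look roughly like the sketch below. This is not the github3.py code or the patch in the linked PR: the class name `DefaultTimeoutClient`, its `default_timeout` argument, and the example URL are hypothetical, and a mock object stands in for a requests session so nothing touches the network.

```python
from unittest import mock


class DefaultTimeoutClient:
    """Hypothetical wrapper: stores one timeout and applies it to every request."""

    def __init__(self, session, default_timeout=10.0):
        # `session` is assumed to expose a requests-style
        # request(method, url, **kwargs) method.
        self.session = session
        self.default_timeout = default_timeout

    def _request(self, method, url, **kwargs):
        # Fill in the timeout only when the caller did not choose one.
        kwargs.setdefault("timeout", self.default_timeout)
        return self.session.request(method, url, **kwargs)


# A mocked session replaces the real HTTP session for the check below.
session = mock.Mock()
client = DefaultTimeoutClient(session, default_timeout=5.0)
client._request("GET", "https://api.example.invalid/user")

# Verify the timeout was passed all the way through to the session.
session.request.assert_called_once_with(
    "GET", "https://api.example.invalid/user", timeout=5.0
)
```

Using `setdefault` keeps a per-call `timeout` override possible while still guaranteeing that no request goes out without one, which is the indefinite-hang case described in the log.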