openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Support IPv6 with zuul_stream https://review.openstack.org/500401 | 00:02 |
---|---|---|
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Use afs_publisher_target for afs-docs https://review.openstack.org/500406 | 00:15 |
*** xinliang has quit IRC | 02:22 | |
*** xinliang has joined #zuul | 02:34 | |
*** xinliang has quit IRC | 02:34 | |
*** xinliang has joined #zuul | 02:34 | |
jeblair | mordred: http://zuulv3.openstack.org/static/stream.html?uuid=68504b89c2454917865fbe2dc1d6b5a6&logfile=console.log hit the stop-on-etcd3 problem | 02:59 |
jeblair | it appears that the script continued | 02:59 |
jeblair | the next several steps are to write a sha256sum file, then untar the download; both of those appear to have happened (i logged into the node and inspected) | 02:59 |
jeblair | this is the non-boring part of the process list: http://paste.openstack.org/show/620283/ | 03:00 |
jeblair | pstree: http://paste.openstack.org/show/620284/ | 03:02 |
jeblair | better pstree: http://paste.openstack.org/show/620285/ | 03:02 |
jeblair | it looks like the script has moved on all the way to the pip install | 03:03 |
jeblair | which has stopped at write(1, "Successfully built uwsgi\n", 25 | 03:03 |
jeblair | so it's probably stuck at output buffering | 03:04 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: DNM: test wget https://review.openstack.org/500426 | 03:07 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: DNM: test wget https://review.openstack.org/500426 | 03:22 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: sql: normalize logger name https://review.openstack.org/500431 | 03:42 |
*** persia has quit IRC | 04:20 | |
*** bhavik1 has joined #zuul | 04:53 | |
*** piccobit has joined #zuul | 05:03 | |
piccobit | hi, we're using submodules in our projects and currently we're trying to figure out how we can trigger a build in a project if something gets changed in an embedded submodule. any hints? | 05:06 |
piccobit | ah, forget my question, just saw the discussion regarding submodules some days ago! | 05:13 |
*** piccobit has quit IRC | 05:13 | |
*** hashar has joined #zuul | 06:18 | |
*** hashar is now known as hasharAway | 06:18 | |
openstackgerrit | Jamie Lennox proposed openstack-infra/zuul-jobs master: Allow overriding the workspace directory in prepare-workspace https://review.openstack.org/500466 | 07:25 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: upload-logs/bindep: fix indentation https://review.openstack.org/500339 | 07:32 |
* tobiash is back from vacation | 08:17 | |
openstackgerrit | Jamie Lennox proposed openstack-infra/zuul feature/zuulv3: Print SIGTERM logging to debug https://review.openstack.org/500476 | 08:23 |
*** hasharAway is now known as hashar | 08:42 | |
*** electrofelix has joined #zuul | 08:56 | |
*** bhavik1 has quit IRC | 11:01 | |
tobiash | woot, rebased my zuul deployment after more than 3 weeks and nothing broke :) | 11:48 |
*** jkilpatr has joined #zuul | 11:50 | |
mordred | tobiash: seriously? are you sure you rebased it properly? I'm sure we MUST have broken something | 12:59 |
mordred | jeblair: yes - this is EXACTLY the same as what happened when I looked at it | 13:00 |
mordred | jeblair: as in, I saw the sha and the untar and the python was stopped on something related to uwsgi | 13:00 |
mordred | tobiash: welcome back! | 13:01 |
* tobiash thinks he still knows how to use git rebase | 13:03 | |
tobiash | ;) | 13:03 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: sql: normalize logger name https://review.openstack.org/500431 | 13:30 |
pabelanger | morning | 13:34 |
*** hashar is now known as hasharAway | 13:49 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add python backport PPA during stream test https://review.openstack.org/500549 | 13:50 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Set ansible_python_interpreter in stream test https://review.openstack.org/500550 | 13:50 |
electrofelix | is there a known issue, where a change in the gate can be en-queued into the check pipeline through a 'recheck' command and result in a spurious 'merge failure' due to zuul resetting the merge attribute on the change object? | 13:55 |
electrofelix | saw it with a GitHub Enterprise PR | 13:56 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Move ara output generation to post playbook https://review.openstack.org/500403 | 14:16 |
clarkb | electrofelix: are you sure it isn't a valid merge failure? | 14:22 |
clarkb | every job run comes with its own constructed git tree | 14:22 |
electrofelix | clarkb: positive, I went through the log and I could see when the merger succeeded and the corresponding 'complete, merged: True, updated: False, commit: None' in the scheduler | 14:32 |
electrofelix | clarkb: but then the reporter had "Reported change 38,9a54b8475f906256d012e250753870ce25b2cc91 status: all-succeeded: True, merged: False" | 14:32 |
electrofelix | which stumped me for quite a while until I spotted 'getItemForChange(self, change)' can return an item that is already queued on any of the other pipelines | 14:33 |
electrofelix | and since it got added to the check pipeline without it actually being a new change, it didn't trigger the code that would kick out the existing change in the gate queue | 14:34 |
electrofelix | The sequence was: run check, run gate, trigger check due to 'recheck', less than 10 seconds later the gate finished and reported merge failed, subsequently the merger for the triggered check ran | 14:35 |
electrofelix | Actually that's not quite right | 14:42 |
electrofelix | PR38 queued in check queue & passes, queued in gate queue, merger runs and passes, 'recheck' issued on PR38, merger runs and passes at "17:11:42,209", gate jobs complete report merge failure at "2017-09-01 17:11:46,207", recheck completes, change is requeued into gate and passes | 14:42 |
electrofelix | How it went from 'merged: True' to 'merged: False' is stumping me, but I can only assume it was some form of race with the same change going into the check queue at the same time | 14:47 |
electrofelix | I would point out that if the change had really failed to merge, I would have expected a message containing 'Unable to merge change ...' to appear in the log based on the scheduler code | 14:48 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: DNM: test wget https://review.openstack.org/500426 | 14:52 |
electrofelix | Is it possible that after emitting the log message from merger client, that part of the call to self.sched.onMergeCompleted(..) only completed after part of the reportItem had already retrieved the value False using 'self.sched.onMergeCompleted()' ? | 14:52 |
electrofelix | Fundamentally it seems like there is a problem where 'recheck' can result in the same change appearing in the check and gate queues at the same time, and reuse of the same item object definitely has the potential for a race to occur | 14:54 |
electrofelix | I'm just not quite sure what the fix for this scenario should be, make the comment trigger be ignored if the same change is in the gate (how?), or abort the change in the gate (again how to know that this is correct?) | 14:55 |
electrofelix | is there a need to provide some kind of link between pipelines to say a change should be unique between the two? | 14:56 |
jeblair | electrofelix: they don't use the same item object. the 'merged: False' from the gate message isn't related to the zuul merger. that means that the change was not merged into the upstream repo. | 14:58 |
jeblair | electrofelix: aside from this case, is the gate pipeline generally able to merge pull-requests? is there branch protection that could have prevented the merge? is there anything in the logs about why the merge failed? | 15:01 |
electrofelix | jeblair: no there is nothing to stop it, and the subsequent run in the gate after the recheck comment worked without anyone having done anything to the original repo | 15:02 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: DNM: test wget https://review.openstack.org/500426 | 15:02 |
electrofelix | so the branches were not out of sync, and the merger run returned success each time | 15:03 |
jeblair | electrofelix: can you paste all the log lines between "Reporting change 38,9a54b8475f906256d012e250753870ce25b2cc91" and "Reported change 38,9a54b8475f906256d012e250753870ce25b2cc91 ..." ? | 15:04 |
jeblair | electrofelix: (ideally debug level if you have 'em) | 15:05 |
electrofelix | jeblair: don't have debug level but I can put the chunk into a paste service | 15:06 |
electrofelix | jeblair: http://paste.openstack.org/show/620349/ | 15:09 |
electrofelix | we have two check pipelines, an old one called check-zing-github, which is obsolete and will be removed shortly and a newer one called check-github, just in case you're wondering what's going on | 15:10 |
jeblair | electrofelix: do you have any log lines that match the regex "zuul.Github"? | 15:16 |
electrofelix | jeblair: last one was "2017-08-08 15:11:58,502 ERROR zuul.GithubWebhookListener: Exception when handling event:", we've turned on debug this morning, so none around when this occurred | 15:20 |
jeblair | electrofelix: hrm, i expected a warning log line from zuul.GithubConnection. | 15:25 |
jeblair | electrofelix: well, turning on debug logging was my next suggestion; i think we'll need that to proceed further; so let me know if it happens again and we can comb through the logs then | 15:26 |
electrofelix | before we even get there, if there is a concept of check & gate pipelines, it does seem like a recipe for surprise if a change in the gate can also simultaneously be added to the check queue through a comment | 15:32 |
electrofelix | wondering if there should be something to prevent this case from occurring rather than worrying about debugging how it managed to think the merge failed | 15:33 |
pabelanger | so, feature request. if anybody js wizards could also add finger:// URLs to zuulv3.o.o/status, it would make me so happy :D | 15:45 |
pabelanger | I like the webUI, but my CPU and browser don't | 15:47 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Make validate-host read from site-variables https://review.openstack.org/500592 | 16:05 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Make validate-host read from site-variables https://review.openstack.org/500592 | 16:06 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: DNM: test wget https://review.openstack.org/500426 | 16:11 |
fungi | pabelanger: maybe as an icon next to the websockets url? | 16:24 |
pabelanger | finger emoji wfm :D | 16:26 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: DNM: test wget https://review.openstack.org/500426 | 16:28 |
electrofelix | jeblair: I think I know what caused the behaviour we saw, the github project has required checks enabled for the check & gate queues | 16:42 |
electrofelix | jeblair: in triggering a recheck, the status of the check was reset so zuul request to merge would fail, so this looks like it's really just a problem of a poor error message | 16:42 |
electrofelix | or a failure in the connection code to detect that the merge request was rejected | 16:45 |
jeblair | electrofelix: ah, i thought you had said there was no github branch protection. so yes, that makes sense. | 16:45 |
jeblair | electrofelix: yeah, there's supposed to be a warning level log message if the merge fails | 16:45 |
electrofelix | jeblair: sorry, meant there was nothing preventing the zuul user from merging, should have been more clear | 16:45 |
electrofelix | jeblair: I think there is something missing with the version we're running, we might be missing a commit or two | 16:46 |
electrofelix | jeblair: or because the log is only at debug for when that occurs | 16:47 |
electrofelix | or rather on the branch protections, they were supposed to be set up to allow the zuul user to do merges, but the setting to require the status checks (to prevent users from forgetting and clicking the merge button themselves) was enabled | 16:49 |
electrofelix | anyway, I know far now about zuul internals than I did yesterday | 16:49 |
electrofelix | s/far/far more/ | 16:49 |
electrofelix | jeblair: is there room for another failure message in zuul pipelines? it seems that zuul merger failure should be considered different to the failure to submit a Gerrit change/merge a PR | 16:53 |
jeblair | electrofelix: yes, though first thing is to find out why there was no warning message from zuul.GithubConnection, since that would be the source of what we'd report back there. | 16:54 |
electrofelix | jeblair: our tree is out of date, the code at https://github.com/openstack-infra/zuul/blob/feature/zuulv3/zuul/driver/github/githubreporter.py#L142-L144 is using debug level in our codebase | 16:57 |
jeblair | ah ok | 16:57 |
*** jkilpatr has quit IRC | 17:17 | |
*** electrofelix has quit IRC | 17:42 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Make validate-host read from site-variables https://review.openstack.org/500592 | 18:25 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Rework upload-logs to enable running on localhost https://review.openstack.org/500611 | 18:25 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add zuul base roles to ease sharing base job content https://review.openstack.org/500612 | 18:25 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add zuul base roles to ease sharing base job content https://review.openstack.org/500612 | 18:27 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Contract back to a single upload-logs tasks file https://review.openstack.org/500618 | 18:27 |
*** kmalloc_ has joined #zuul | 18:30 | |
*** eventingmonkey_ has joined #zuul | 18:36 | |
*** leifmadsen_ has joined #zuul | 18:36 | |
*** eventingmonkey has quit IRC | 18:37 | |
*** leifmadsen has quit IRC | 18:37 | |
*** kmalloc has quit IRC | 18:37 | |
*** kmalloc_ is now known as kmalloc | 18:37 | |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul feature/zuulv3: Reduce debug output for repo https://review.openstack.org/500622 | 18:49 |
pabelanger | jeblair: mordred: ^ reduce some logging to prevent secrets from showing in debug logs | 18:49 |
pabelanger | heh, I should test that first | 18:51 |
*** hasharAway is now known as hashar | 18:51 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: DNM: test wget https://review.openstack.org/500426 | 18:57 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul feature/zuulv3: Reduce debug output for repo https://review.openstack.org/500622 | 18:57 |
mordred | jeblair: WHY DOES wget WORK IN YOUR TEST BUT NOT IN DEVSTACK?????? | 19:15 |
*** pabelanger has quit IRC | 19:17 | |
*** pabelanger has joined #zuul | 19:17 | |
jeblair | mordred: hrm; i may still be missing weird things that devstack does with output or file handles, etc.... i was trying to start simple and add them one by one | 19:56 |
jeblair | i'm turning on keep to find out why the post playbook doesn't copy logs when the devstack job times out | 20:06 |
pabelanger | ack | 20:08 |
pabelanger | jeblair: mordred: https://review.openstack.org/500622/ is green now. Would be great to land and reset executor for cleaner debug logs | 20:09 |
jeblair | lgtm | 20:10 |
pabelanger | right now, logs are not pastebin friendly | 20:10 |
*** hashar has quit IRC | 20:44 | |
mordred | pabelanger: +3 | 21:03 |
mordred | jeblair: there's a weird thing going on with zuulv3 and shade that I thought might be worth pointing you at, in case it's a thing you're worried about | 21:06 |
mordred | jeblair: if you look at https://review.openstack.org/#/c/499357 | 21:06 |
mordred | jeblair: you'll see that zuul has voted -2 on the change | 21:07 |
mordred | even though there are no jobs for shade in the gate pipeline | 21:07 |
mordred | oh - actually - https://review.openstack.org/#/c/499357/ is the base change, and it just got the same thing | 21:08 |
mordred | jeblair: I'm going to go look through scheduler logs to see if I can figure out why zuulv3 thinks it should leave -2 there | 21:08 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Reduce debug output for repo https://review.openstack.org/500622 | 21:12 |
mordred | jeblair: 2017-09-04 19:28:14,600 INFO zuul.DependentPipelineManager: Resetting builds for change <Change 0x7fb05ff854e0 499357,7> because the item ahead, <QueueItem 0x7fb066205d68 for <Change 0x7fb06446bda0 499345,4> in gate>, failed to merge | 21:20 |
mordred | jeblair: "499345" was the previous top of the queue, and zuul v2 actually merged it - I'm wondering if there's just a race-condition between the 2 zuuls, with v3 trying to do a merge check or something, and v2 merges it so v3 attempting to merge the patch on top of remote master errors | 21:22 |
mordred | jeblair: other than not being sure why it would be considering the change during gate queue processing | 21:22 |
mordred | jeblair: so - it's the gate queue processing that makes me wonder if it's a real bug we should care about - if it's just v2 and v3 fighting, I'm not worried about that | 21:23 |
fungi | mordred: any guess why 499843 is still resulting in that "extra keys not allowed" validation error from zuulv3 even after its parent merged? | 21:59 |
openstackgerrit | Jamie Lennox proposed openstack-infra/zuul feature/zuulv3: Print SIGTERM logging to debug https://review.openstack.org/500476 | 22:05 |
Shrews | lots of working for a holiday | 22:34 |
* rcarrillocruz waves | 22:36 | |
rcarrillocruz | so, looking at zuulv3 github docs | 22:36 |
rcarrillocruz | it says there are two options, either a github app or a webhook | 22:36 |
rcarrillocruz | are there docs or a blogpost, anything, about setting up the two in the zuul context | 22:36 |
rcarrillocruz | perms, what not | 22:36 |
* rcarrillocruz throws a squirrel to mordred | 22:37 | |
pabelanger | rcarrillocruz: see https://docs.openstack.org/infra/zuul/feature/zuulv3/admin/drivers/github.html | 22:45 |
pabelanger | we're using github app right now | 22:45 |
pabelanger | https://github.com/apps/openstack-zuul | 22:45 |
rcarrillocruz | yeah, but that depicts from zuul side, i wonder if there are docs or samples or a blogpost about 'hey, i set up a webhook on this repo so it could consume events on my zuul' kind of thing | 22:46 |
rcarrillocruz | alternatively, a github app that sends thing to a given zuul, not openstack zuul | 22:47 |
rcarrillocruz | i assume the github app is a hardcoded convenient thing you install in a repo | 22:47 |
rcarrillocruz | that sets it up to send events from your repo to openstack zuul | 22:47 |
pabelanger | I still think we are working on that part, but mordred did set up openstack-infra. So best person to ask | 22:47 |
rcarrillocruz | i wonder how to create 'ricky dummy zuul gh app' | 22:47 |
jeblair | rcarrillocruz: if you do it, please write it down and add to the docs | 22:53 |
rcarrillocruz | if i get pointers on how openstack zuul was done, i | 22:54 |
rcarrillocruz | will contribute that back for sure | 22:54 |
jeblair | mordred, dmsimard: this file has some weird encoding: http://git.openstack.org/cgit/openstack-infra/zuul-jobs/tree/roles/emit-ara-html/tasks/main.yaml | 23:05 |
jeblair | line 18, the space before bool is actually "\xc2\xa0" | 23:05 |
jeblair | dmsimard: i hereby revoke your emacs derogation privileges. :) | 23:07 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Ignore errors from ara generate https://review.openstack.org/500645 | 23:09 |
jeblair | that's why the devstack timeout isn't uploading logs. we'll probably want to snarf the ara db from that to find out *why* ara generate isn't working as well. | 23:10 |
jeblair | i turned off keep | 23:12 |
jeblair | dmsimard: here's the ara sqlite file that caused ara generate to fail: http://files.openstack.org/user/corvus/ara-failed-ansible.sqlite | 23:15 |
jamielennox | rcarrillocruz: a github app is an instance of a thing, it's basically a wrapper around a webhook with the permissions inbuilt | 23:36 |
jamielennox | you can do the same thing without the app, you just need to point the webhook at the correct zuul, but then you also need a user for the zuul to post as | 23:37 |
jamielennox | so webhook_key (or whatever) is always common, but you need a github oauth token if you don't have an app_id | 23:38 |
dmsimard | jeblair: that's ok I don't use emacs | 23:57 |
dmsimard | Also, yes, it's entirely possible there is a junk whitespace in there | 23:58 |
dmsimard | It's something my keyboard layout sometimes produces and it's a real PITA | 23:58 |
dmsimard | So much that I have a shell alias just to find them | 23:58 |
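
Note on the encoding problem jeblair describes at 23:05: the bytes "\xc2\xa0" are the UTF-8 encoding of U+00A0 (NO-BREAK SPACE), which looks like an ordinary space in most editors. dmsimard mentions a shell alias for finding them, but the alias itself is not shown in the log. The following is a minimal Python sketch of the same idea, not the alias he uses; the function name `find_nbsp` and the sample YAML line are hypothetical and are not taken from the zuul-jobs file.

```python
import tempfile
from pathlib import Path

NBSP = "\u00a0"  # NO-BREAK SPACE; encodes to b"\xc2\xa0" in UTF-8


def find_nbsp(path):
    """Return 1-based (line, column) pairs for every U+00A0 in a UTF-8 file."""
    hits = []
    text = Path(path).read_text(encoding="utf-8")
    # Split on "\n" only, so line numbers match what an editor shows.
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == NBSP:
                hits.append((lineno, col))
    return hits


if __name__ == "__main__":
    # Hypothetical sample: a no-break space sits between "|" and "bool".
    sample = "- name: example\n  when: flag |\u00a0bool\n"
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "main.yaml"
        target.write_text(sample, encoding="utf-8")
        for lineno, col in find_nbsp(target):
            print(f"{target.name}:{lineno}:{col}: no-break space")
```

Reporting line and column rather than just the file name makes it easier to jump straight to the offending character, since it is invisible on screen.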
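
Note on the stalled pip install discussed at 03:03: jeblair sees the process stopped inside `write(1, ...)` and suspects output buffering. The log does not confirm the mechanism, so the following is only an illustration of one way such a stall can arise, assuming stdout is a pipe whose reader has stopped draining it. Once the kernel pipe buffer is full, a blocking `write` simply waits. The sketch uses a non-blocking write end so that the full-buffer condition surfaces as an exception instead of hanging; `fill_pipe` is a hypothetical helper, and it relies on Unix pipe semantics.

```python
import os


def fill_pipe(chunk=b"x" * 4096):
    """Write to a pipe that nobody reads until the kernel buffer is full.

    Returns the number of bytes accepted before a further write would
    have blocked. With a blocking descriptor, that next write is where
    a process would appear stuck in write().
    """
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)  # raise instead of hanging when full
    written = 0
    try:
        while True:
            written += os.write(write_fd, chunk)
    except BlockingIOError:
        # The buffer is full and no reader is draining it.
        return written
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(f"pipe accepted {fill_pipe()} bytes before a write would block")
```

The exact capacity depends on the operating system and its configuration, so the sketch reports whatever the kernel accepted rather than assuming a fixed size.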