*** brinzhang0 has quit IRC | 00:02 | |
*** diablo_rojo has quit IRC | 00:05 | |
*** brinzhang has joined #openstack-release | 00:43 | |
*** armax has quit IRC | 00:58 | |
*** armax has joined #openstack-release | 01:06 | |
*** mgoddard has quit IRC | 01:55 | |
*** armax has quit IRC | 02:31 | |
*** armax has joined #openstack-release | 03:49 | |
*** armax has quit IRC | 04:06 | |
*** mgoddard has joined #openstack-release | 04:12 | |
*** mgoddard has quit IRC | 04:53 | |
*** ykarel has joined #openstack-release | 04:57 | |
*** evrardjp has quit IRC | 05:33 | |
*** evrardjp has joined #openstack-release | 05:33 | |
*** mgoddard has joined #openstack-release | 06:11 | |
*** vishalmanchanda has joined #openstack-release | 06:48 | |
*** mgoddard has quit IRC | 06:53 | |
*** slaweq has joined #openstack-release | 07:42 | |
*** hberaud has quit IRC | 07:59 | |
*** hberaud has joined #openstack-release | 08:00 | |
*** mgoddard has joined #openstack-release | 08:05 | |
*** sboyron has joined #openstack-release | 08:13 | |
openstackgerrit | Lucian Petrut proposed openstack/releases master: Release os-win 5.4.0 (Wallaby) https://review.opendev.org/c/openstack/releases/+/767674 | 08:23 |
*** rpittau|afk is now known as rpittau | 08:30 | |
*** e0ne has joined #openstack-release | 08:34 | |
*** sboyron has quit IRC | 08:38 | |
*** e0ne has quit IRC | 08:41 | |
openstackgerrit | Merged openstack/releases master: Release Patrole 0.11.0 https://review.opendev.org/c/openstack/releases/+/767164 | 08:49 |
*** mgoddard has quit IRC | 08:53 | |
*** elod is now known as elod_pto | 09:04 | |
*** mgoddard has joined #openstack-release | 09:05 | |
*** slaweq has quit IRC | 09:16 | |
*** slaweq has joined #openstack-release | 09:18 | |
*** slaweq has quit IRC | 09:20 | |
*** slaweq has joined #openstack-release | 09:25 | |
openstackgerrit | Dmitriy Rabotyagov proposed openstack/releases master: Branch OpenStack-Ansible Roles https://review.opendev.org/c/openstack/releases/+/767686 | 09:37 |
*** slaweq has quit IRC | 10:51 | |
*** slaweq has joined #openstack-release | 10:55 | |
openstackgerrit | Dmitriy Rabotyagov proposed openstack/releases master: Branch OpenStack-Ansible Roles https://review.opendev.org/c/openstack/releases/+/767686 | 11:07 |
*** e0ne has joined #openstack-release | 12:01 | |
*** brinzhang has quit IRC | 12:08 | |
*** sboyron has joined #openstack-release | 12:10 | |
*** ykarel_ has joined #openstack-release | 12:14 | |
*** ykarel has quit IRC | 12:17 | |
*** ykarel_ is now known as ykarel | 12:18 | |
*** sboyron has quit IRC | 12:33 | |
*** tosky has joined #openstack-release | 12:58 | |
*** e0ne has quit IRC | 14:09 | |
*** e0ne has joined #openstack-release | 14:09 | |
*** iurygregory has joined #openstack-release | 14:38 | |
*** e0ne_ has joined #openstack-release | 15:02 | |
*** e0ne has quit IRC | 15:05 | |
*** armax has joined #openstack-release | 15:12 | |
openstackgerrit | Merged openstack/releases master: Replace series name by template variable https://review.opendev.org/c/openstack/releases/+/767128 | 15:25 |
openstackgerrit | Merged openstack/releases master: Branch OpenStack-Ansible Roles https://review.opendev.org/c/openstack/releases/+/767686 | 15:35 |
*** rpittau is now known as rpittau|afk | 15:38 | |
*** gibi is now known as gibi_pto | 16:35 | |
hberaud | A new release error to start the holidays off well => http://lists.openstack.org/pipermail/release-job-failures/2020-December/001496.html | 17:00 |
hberaud | :) | 17:00 |
gmann | hberaud: which comment are you referring to? https://review.opendev.org/c/openstack/releases/+/767163 | 17:15 |
hberaud | gmann: our comment on PS1 (Dec 15) | 17:16 |
hberaud | s/our/your | 17:16 |
gmann | ah this one 'we need to cap constraints in tox also.' ? | 17:17 |
gmann | hberaud: that is done, I updated comment. | 17:18 |
hberaud | gmann: yes sorry I read wrong | 17:19 |
gmann | hberaud: thanks | 17:19 |
*** vishalmanchanda has quit IRC | 17:19 | |
hberaud | you're welcome | 17:19 |
openstackgerrit | Merged openstack/releases master: Release Tempest 26.0.0 https://review.opendev.org/c/openstack/releases/+/767163 | 17:29 |
*** dave-mccowan has quit IRC | 17:42 | |
hberaud | fungi: FYI we got an error [1] with https://review.opendev.org/c/openstack/releases/+/767686 during the tag-releases job. The goal of this patch was branching OSA roles for victoria. I checked all the branches one by one and everything seems correct. I did some verifications in the logs [1] and I didn't notice any issues. The job exited with rc 56, I don't know what it is (exit code 56). I didn't find | 17:44 |
hberaud | significant messages in the logs. Any ideas? [1] https://zuul.opendev.org/t/openstack/build/cc76b45601184d6484c5783b550a0951/log/job-output.txt | 17:44 |
fungi | looking | 17:44 |
hberaud | thanks | 17:44 |
hberaud | I surely missed something during my investigation | 17:45 |
*** dave-mccowan has joined #openstack-release | 17:47 | |
*** e0ne_ has quit IRC | 17:54 | |
fungi | hberaud: a bit of looking around suggests that git will exit 56 on some network failures | 17:54 |
hberaud | ah ok | 17:56 |
hberaud | however branching seems ok | 17:56 |
fungi | seems to show up most frequently when a network connection initiated by git is prematurely terminated | 17:56 |
fungi | was openstack/openstack-ansible-tests the last repo it needed to branch? | 17:57 |
* hberaud looking | 17:59 | |
hberaud | fungi: so 59 repos were requested for branching, all 59 were branched successfully, and apparently yes, openstack/openstack-ansible-tests was the last one expected | 18:05 |
hberaud | and the last one seems ok too => https://opendev.org/openstack/openstack-ansible-tests/src/branch/stable/victoria | 18:07 |
fungi | hberaud: aha, i think git exit codes were a distraction... looks like that exit code originated here: https://opendev.org/openstack/project-config/src/branch/master/roles/copy-release-tools-scripts/files/release-tools/process_release_requests.py#L236 | 18:13 |
fungi | there are a few places in that function where error_count is incremented | 18:13 |
fungi | and then the script returns the total tally as its exit code, which gets carried through by the calling scripts | 18:14 |
hberaud | ok, does this script contain some retry mechanism which could explain why the branches seem to have been created successfully? | 18:14 |
fungi | i think we're meant to read that as saying the job encountered 56 errors | 18:15 |
hberaud | ok I see | 18:15 |
fungi | it looks like it's the total of the return codes of tag_release() and make_branch() function calls | 18:16 |
fungi | which in turn is the number of subprocess.CalledProcessError exceptions raised in them | 18:17 |
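The tallying pattern fungi describes can be sketched roughly as follows. This is a minimal illustration, not the actual code of process_release_requests.py: the names `run_step` and `process_requests` are hypothetical, and only the idea (count each `subprocess.CalledProcessError`, then use the total as the exit code) comes from the discussion above.

```python
import subprocess
import sys


def run_step(cmd):
    """Run one command; return 1 if it fails, 0 otherwise (hypothetical helper)."""
    try:
        subprocess.check_call(cmd)
    except subprocess.CalledProcessError:
        # The failure is counted rather than aborting the whole run.
        return 1
    return 0


def process_requests(commands):
    """Tally failures across all commands instead of stopping at the first one."""
    error_count = 0
    for cmd in commands:
        error_count += run_step(cmd)
    return error_count


# Two illustrative commands: the first exits non-zero, the second succeeds.
demo_commands = [
    [sys.executable, "-c", "raise SystemExit(3)"],
    [sys.executable, "-c", "pass"],
]
demo_errors = process_requests(demo_commands)
```

A script built this way would typically end with `sys.exit(error_count)`, which is why an exit code of 56 would read as "56 failed steps" rather than as a git or curl error number.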
fungi | looks like it's supposed to spew on stderr, so maybe the job-output.json has them | 18:17 |
hberaud | ack | 18:18 |
fungi | grr, i can't transform that json file with flamel, "yaml.reader.ReaderError: unacceptable character #x001f: special characters are not allowed" | 18:18 |
fungi | oh, it's getting retrieved compressed | 18:19 |
hberaud | as the branches were properly created do you think we need to reenqueue "publish-tox-docs-releases" | 18:19 |
fungi | no dice, the json says stderr from that task was an empty string, so if there was stderr emitted by process_release_requests.py it got swallowed or the fd wasn't correctly inherited by one of the calling layers | 18:22 |
fungi | s/by/from/ | 18:23 |
fungi | hberaud: i'd be inclined to ignore it if everything the job needed to do looks done. but if you see that job continue to show nonzero exit codes we should probably add more debugging output in the script and/or look into why stderr isn't captured | 18:24 |
fungi | my guess is something from make_branch.sh is triggering a subprocess.CalledProcessError exception for most of those. could that happen if the branch already existed? | 18:25 |
fungi | ever so very many layers of scripts on more scripts | 18:26 |
hberaud | fungi: +1 to ignore that, we'll watch whether similar issues happen again, but for now I'm not aware of any. | 18:27 |
fungi | reminds me of when i used to manage sco unix servers, and most of their userspace tooling was an endless maze of ksh scripts calling one another | 18:27 |
hberaud | lol | 18:27 |
*** ykarel has quit IRC | 18:28 | |
fungi | i see lots of commands in https://opendev.org/openstack/project-config/src/branch/master/roles/copy-release-tools-scripts/files/release-tools/make_branch.sh which could trigger errors under the right conditions | 18:29 |
fungi | though it does seem to want to exit out cleanly without creating branches if they already exist | 18:30 |
fungi | did the .gitreview changes get pushed for all of those branches? | 18:31 |
fungi | maybe that's where the errors are stemming from | 18:31 |
hberaud | I didn't verify that point | 18:33 |
hberaud | lemme check | 18:33 |
fungi | pushing changes to gerrit seems likely to be the most fragile part of make_branch.sh | 18:34 |
hberaud | fungi: without checking all of them for now I can see => https://review.opendev.org/c/openstack/openstack-ansible-tests/+/767905 and https://review.opendev.org/c/openstack/openstack-ansible-tests/+/767906 | 18:35 |
fungi | if there are more than 3 then i think we can rule that out as the cause of the 56 failures out of 59 branches | 18:36 |
hberaud | all seems ok there => https://review.opendev.org/q/openstack/openstack-ansible-os_ironic https://review.opendev.org/q/openstack/openstack-ansible-os_panko https://review.opendev.org/q/openstack/openstack-ansible-tests | 18:38 |
hberaud | 9 patches | 18:38 |
hberaud | s/96/ | 18:38 |
hberaud | s/9/6 | 18:38 |
hberaud | https://review.opendev.org/q/topic:%22create-victoria%22+(status:open%20OR%20status:merged) | 18:39 |
fungi | https://docs.python.org/3/library/subprocess.html#subprocess.check_call "Code needing to capture stdout or stderr should use run() instead" | 18:40 |
hberaud | around 120 corresponding patches seem to have been created | 18:41 |
fungi | so that's why we don't get the output from make_branch.sh even though the comment here seems to indicate we should: https://opendev.org/openstack/project-config/src/branch/master/roles/copy-release-tools-scripts/files/release-tools/process_release_requests.py#L110-L116 | 18:41 |
hberaud | that corresponds to our 59 branches | 18:41 |
hberaud | I see | 18:43 |
fungi | or maybe i'm misreading the implications there, but i guess if this continues to be a problem, we should rework process_release_requests.py to replace uses of check_call() with something which actually captures the stderr so we can directly emit it instead of relying on fd inheritance to take care of that | 18:44 |
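The rework fungi suggests could look something like the sketch below. It assumes Python 3.7+ for `capture_output` and `text`; the helper name `run_and_report` is hypothetical and is not taken from process_release_requests.py. The point is only to show `subprocess.run()` capturing stderr so the caller can emit it explicitly instead of relying on file descriptor inheritance.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd, re-emit its stderr explicitly, and return (error_flag, stderr_text).

    Hypothetical helper: error_flag is 1 on a non-zero exit code, else 0.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.stderr:
        # Write the child's stderr ourselves so it does not depend on the
        # calling layers passing their stderr fd down to the child.
        sys.stderr.write(result.stderr)
    if result.returncode != 0:
        sys.stderr.write(
            "command %r failed with exit code %d\n" % (cmd, result.returncode)
        )
        return 1, result.stderr
    return 0, result.stderr


# Illustrative failing child process that writes to stderr and exits 2.
demo_flag, demo_stderr = run_and_report(
    [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(2)"]
)
```

Because the captured text is also returned, a caller could log it alongside the name of the repository being branched, which would make a non-zero tally much easier to trace.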
hberaud | so I think a fix is on the horizon | 18:44 |
hberaud | yes I agree with you | 18:44 |
hberaud | good catch | 18:45 |
fungi | it's entirely possible that when you run this locally your tty's stderr gets inherited by make_branch.sh but when run under ansible that doesn't filter down | 18:45 |
hberaud | I think it depends on how we use requiretty | 18:47 |
hberaud | with ansible | 18:47 |
fungi | oh could be | 18:47 |
hberaud | however I'm not an ansible expert | 18:48 |
fungi | same here, i'm afraid | 18:49 |
fungi | i know just enough about ansible to be a danger to myself and others ;) | 18:49 |
hberaud | It's an ironic situation to face this kind of issue with OSA :) | 18:50 |
fungi | excellent point, maybe we can get them to help | 18:52 |
hberaud | yyep | 18:52 |
hberaud | yep | 18:52 |
hberaud | let me ping them | 18:53 |
hberaud | noonedeadpunk o/, around by chance? | 18:55 |
hberaud | noonedeadpunk: fungi and myself need some help about Ansible's tty management | 18:56 |
hberaud | nicolasbock: it's related to your patch https://review.opendev.org/c/openstack/releases/+/767686 which exited with an error http://lists.openstack.org/pipermail/release-job-failures/2020-December/001496.html | 18:57 |
hberaud | nicolasbock: sorry wrong dest | 18:57 |
hberaud | noonedeadpunk: it's related to your patch https://review.opendev.org/c/openstack/releases/+/767686 which exited with an error http://lists.openstack.org/pipermail/release-job-failures/2020-December/001496.html | 18:57 |
nicolasbock | No worries hberaud :) | 18:58 |
hberaud | noonedeadpunk: indeed we faced an issue but we didn't get the stderr as expected in our script (cf. the previous discussion above) | 18:59 |
hberaud | noonedeadpunk: and we wonder if it could be related to ansible's tty management and, for example, to requiretty | 19:00 |
hberaud | noonedeadpunk: but neither of us is an ansible expert | 19:01 |
hberaud | fungi: if we don't get a response about this by tomorrow then I'll start a related thread on the ML to help us track this and gather some expert feedback | 19:07 |
hberaud | fungi: I suggest moving this to a more async mode for now, is that ok with you? | 19:09 |
fungi | absolutely! (i already have, also it's late for you i'm sure) | 19:14 |
hberaud | thanks | 19:27 |
noonedeadpunk | hberaud: hey | 19:34 |
noonedeadpunk | eventually ansible does not require tty unless it has pipelining enabled | 19:36 |
*** e0ne has joined #openstack-release | 19:36 | |
noonedeadpunk | since default behaviour is just to copy python modules to the destination and execute them locally from tmp dir with permissions of either connect user or become user | 19:37 |
*** dave-mccowan has quit IRC | 19:37 | |
noonedeadpunk | tty becomes a requirement when pipelining is enabled but again I think it's related to become process only (since sudoers should not contain requiretty in such cases) | 19:38 |
noonedeadpunk | isn't this an actual issue? | 19:38 |
noonedeadpunk | `/home/zuul/scripts/release-tools/add_release_note_page.sh: line 51: python: command not found`? | 19:38 |
noonedeadpunk | also I'm wondering if that might be the case if let's say stable/ussuri has been branched from the same sha I'm trying to branch victoria now? | 19:40 |
noonedeadpunk | I'm not sure if this exist since I haven't checked for that, but theoretically that might happen considering amount of roles (and that some of them do not change much) | 19:41 |
hberaud | no idea | 19:42 |
noonedeadpunk | btw, in the meanwhile, things seemed to branch actually | 19:42 |
hberaud | yes | 19:42 |
hberaud | everything seems ok, branches have been created and gerrit/victoria patches are on the rails for each repo | 19:43 |
hberaud | fungi noticed that possibly our error catching doesn't work as expected and the fd doesn't contain the stderr | 19:45 |
hberaud | however it could come from different sides | 19:46 |
hberaud | and one of them is how ansible manages the tty and the execution context | 19:47 |
hberaud | noonedeadpunk: thanks for feedback, I need to dash, during the week end I'll move the discussion into a ML thread | 19:49 |
noonedeadpunk | oh, I think I know what went wrong | 19:50 |
noonedeadpunk | branches in gerrit were not created | 19:50 |
fungi | right, this is ansible calling a shell script calling a python script calling a shell script... comments in the python script suggest the shell script it's calling should inherit the parent process's stderr, but it's not getting captured in the task json's stderr field | 19:50 |
noonedeadpunk | I don't have gerrit/stable/victoria | 19:50 |
noonedeadpunk | and my guess would be nobody has ever branched after gerrit upgrade? | 19:51 |
fungi | noonedeadpunk: which repo are you looking at? | 19:51 |
noonedeadpunk | openstack-ansible-testsd | 19:51 |
hberaud | hm I think that yes | 19:52 |
noonedeadpunk | *openstack-ansible-tests | 19:52 |
fungi | https://review.opendev.org/admin/repos/openstack/openstack-ansible-tests,branches | 19:52 |
fungi | shows it there | 19:52 |
noonedeadpunk | eventually any | 19:52 |
fungi | i have a remotes/gerrit/stable/victoria | 19:53 |
hberaud | yes | 19:54 |
noonedeadpunk | I'm wondering why I have http://paste.openstack.org/show/801170/ | 19:54 |
fungi | git clone https://opendev.org/openstack/openstack-ansible-tests && cd openstack-ansible-tests && git review -s && git remote update && git branch -a | 19:54 |
fungi | that's what i just did anyway | 19:54 |
noonedeadpunk | uh, ok, yes | 19:55 |
noonedeadpunk | I did just git pull and thought that would be enough | 19:55 |
fungi | i can also `git review -d 767906` in there just fine | 19:55 |
noonedeadpunk | git remote update worked | 19:55 |
hberaud | we don't need https://review.opendev.org/c/openstack/openstack-ansible-tests/+/767905 frist ? | 19:56 |
fungi | yeah, you may need to pull or remote update after branch creation before you can `git review -d ...` a change for that branch | 19:56 |
hberaud | first | 19:56 |
noonedeadpunk | we need to commit things but I think it shouldn't affect downloading of patches | 19:56 |
noonedeadpunk | well pull does not work for sure :) It just omits the branch when there are no new patches, I guess | 19:56 |
noonedeadpunk | ok, sorry for false alarm | 19:57 |
fungi | no worries, i'd prefer the opportunity to double-check things than ignore a possible bug | 19:57 |
hberaud | +1 | 19:57 |
fungi | we tested what we could reasonably test on a copy of production before upgrading the service fo realz, but we know there are plenty of scenarios we simply couldn't or didn't have time to try | 19:59 |
noonedeadpunk | that's totally fair | 20:00 |
noonedeadpunk | it's just me who should double-check stuff before spilling the beans :) | 20:01 |
*** e0ne has quit IRC | 20:10 | |
openstackgerrit | Dmitriy Rabotyagov proposed openstack/releases master: Release OpenStack-Ansible Ussuri https://review.opendev.org/c/openstack/releases/+/767945 | 20:12 |
openstackgerrit | Dmitriy Rabotyagov proposed openstack/releases master: Release OpenStack-Ansible Train https://review.opendev.org/c/openstack/releases/+/767946 | 20:13 |
*** e0ne has joined #openstack-release | 20:22 | |
*** e0ne has quit IRC | 20:40 | |
hberaud | anyway it could be worth taking the opportunity to retry a few isolated branch creations before the rush of wallaby's last weeks | 20:46 |
hberaud | else if something related to it goes wrong it could be really painful to manage it at `pow()` | 20:49 |
*** slaweq has quit IRC | 20:54 | |
*** e0ne has joined #openstack-release | 21:36 | |
*** e0ne has quit IRC | 21:41 |