tristanC | clarkb: in the config module, reset should only be called if a driver has been active | 00:01 |
clarkb | heh not sure how I completely skipped over that file. I must've clicked on the second one | 00:02 |
openstackgerrit | Tristan Cacqueray proposed zuul/nodepool master: drivers: only reset drivers that have been active https://review.opendev.org/c/zuul/nodepool/+/777024 | 00:03 |
clarkb | is there another related issue here where you could have config for multiple k8s installations in the k8s config file but only use a subset of them in the nodepool providers? | 00:04 |
clarkb | that is probably more difficult to address because i bet config loading for k8s loads everything | 00:04 |
tristanC | clarkb: and there was a typo. But again, there is surely a better way to do that | 00:04 |
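A minimal sketch of the idea behind the change (the names below are hypothetical, not nodepool's actual config-module internals): record which drivers the loaded config actually touched, and call reset() only on those.

```python
# Hypothetical sketch -- not nodepool's real API -- of resetting only
# the drivers that have been active since the config was loaded.
class ConfigLoader:
    def __init__(self, drivers):
        self.drivers = drivers      # name -> driver object
        self.active = set()         # drivers used by the loaded config

    def load_provider(self, provider_cfg):
        driver = self.drivers[provider_cfg['driver']]
        self.active.add(driver)     # remember that this driver was used
        return driver.load_config(provider_cfg)

    def reset(self):
        # Previously every registered driver was reset unconditionally;
        # here only the ones that actually loaded something are touched.
        for driver in self.active:
            driver.reset()
        self.active.clear()
```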
*** tosky has quit IRC | 00:38 | |
*** hamalq has quit IRC | 01:26 | |
*** rlandy has quit IRC | 01:28 | |
*** smyers has quit IRC | 02:44 | |
*** smyers has joined #zuul | 02:47 | |
*** ricolin has quit IRC | 03:15 | |
*** ricolin has joined #zuul | 03:30 | |
*** ykarel has joined #zuul | 04:36 | |
*** saneax has joined #zuul | 04:51 | |
*** saneax has quit IRC | 04:52 | |
*** saneax has joined #zuul | 05:10 | |
*** evrardjp has quit IRC | 05:33 | |
*** evrardjp has joined #zuul | 05:33 | |
*** jfoufas1 has joined #zuul | 05:48 | |
*** sanjayu_ has joined #zuul | 06:00 | |
*** saneax has quit IRC | 06:03 | |
*** vishalmanchanda has joined #zuul | 06:25 | |
*** hashar has joined #zuul | 07:12 | |
*** piotrowskim has joined #zuul | 07:35 | |
*** jcapitao has joined #zuul | 07:40 | |
felixedel | corvus: No, I'm currently working on the refactoring of the builds API, so I would be glad if you could do that. Regarding your comments on the change: I'm fine with that procedure and I will rebase my change once the tls changes are all merged. But I would like to keep it in its current position in the stack. Otherwise I would have to do a lot more rebasing and I don't think it's worth it. | 08:03 |
*** rpittau|afk is now known as rpittau | 08:28 | |
*** ykarel_ has joined #zuul | 08:31 | |
*** ykarel has quit IRC | 08:33 | |
*** tosky has joined #zuul | 08:50 | |
*** dry has joined #zuul | 08:56 | |
*** msuszko has quit IRC | 08:58 | |
*** jpena|off is now known as jpena | 08:58 | |
*** nils has joined #zuul | 09:20 | |
*** noonedeadpunk has quit IRC | 09:26 | |
*** jonass_ has quit IRC | 09:29 | |
*** noonedeadpunk has joined #zuul | 09:29 | |
*** jonass has joined #zuul | 09:30 | |
*** jfoufas1 has quit IRC | 09:41 | |
*** jfoufas1 has joined #zuul | 09:46 | |
*** jfoufas1 has quit IRC | 10:17 | |
openstackgerrit | Oleksandr Kozachenko proposed zuul/zuul-jobs master: Revert "Revert "Update upload-logs roles to support endpoint override"" https://review.opendev.org/c/zuul/zuul-jobs/+/776677 | 10:18 |
*** ykarel_ is now known as ykarel | 10:20 | |
openstackgerrit | Oleksandr Kozachenko proposed zuul/zuul-jobs master: Revert "Revert "Update upload-logs roles to support endpoint override"" https://review.opendev.org/c/zuul/zuul-jobs/+/776677 | 10:21 |
*** ajitha has joined #zuul | 10:35 | |
icey | is it possible for a project to override a zuul job defined in project config? | 10:50 |
*** rpittau is now known as rpittau|bbl | 11:00 | |
*** iurygregory_ has joined #zuul | 11:00 | |
*** iurygregory has quit IRC | 11:01 | |
*** iurygregory_ is now known as iurygregory | 11:06 | |
*** jcapitao is now known as jcapitao_lunch | 11:13 | |
*** ikhan has joined #zuul | 11:50 | |
*** hashar is now known as hasharLunch | 11:57 | |
*** rlandy has joined #zuul | 12:29 | |
*** jpena is now known as jpena|lunch | 12:30 | |
Open10K8S | Hi team. Is there any chance to check these PSs regarding zuul log upload? https://review.opendev.org/c/opendev/base-jobs/+/777087 https://review.opendev.org/c/zuul/zuul-jobs/+/776677 | 12:36 |
avass | neither include_tasks nor include_role runs in parallel on multiple hosts :( | 12:39 |
avass | however it looks like a bare 'include' does run in parallel | 12:41 |
avass | I'm gonna do a bit of investigation around that later, but that could mean that some roles in zuul-jobs have a performance cost for multi-node jobs. (looking at bindep specifically atm) | 12:44 |
avass | oh actually it could be that looping does not happen in parallel, and we're looping over an include_tasks to install packages in the bindep role | 12:49 |
*** jcapitao_lunch is now known as jcapitao | 12:50 | |
*** rpittau|bbl is now known as rpittau | 13:01 | |
openstackgerrit | Tristan Cacqueray proposed zuul/nodepool master: drivers: only reset drivers that have been active https://review.opendev.org/c/zuul/nodepool/+/777024 | 13:27 |
*** jpena|lunch is now known as jpena | 13:33 | |
openstackgerrit | Tristan Cacqueray proposed zuul/nodepool master: kubernetes: refactor client creation to utils_k8s https://review.opendev.org/c/zuul/nodepool/+/777022 | 13:36 |
openstackgerrit | Tristan Cacqueray proposed zuul/nodepool master: kubernetes: refactor client creation to utils_k8s https://review.opendev.org/c/zuul/nodepool/+/777022 | 13:52 |
openstackgerrit | Benedikt Löffler proposed zuul/zuul master: Add tenant to reconfiguration end log message https://review.opendev.org/c/zuul/zuul/+/777109 | 13:53 |
*** rlandy is now known as rlandy|training | 13:57 | |
mordred | avass: and sadly we're doing that to handle a case that is unlikely to happen most of the time. also - why don't loops happen in parallel?? | 14:25 |
*** harrymichal has joined #zuul | 14:25 | |
*** vishalmanchanda has quit IRC | 14:33 | |
openstackgerrit | Tristan Cacqueray proposed zuul/nodepool master: kubernetes: refactor client creation to utils_k8s https://review.opendev.org/c/zuul/nodepool/+/777022 | 14:36 |
avass | mordred: I can't think of a good reason why they don't. maybe we should update that role to handle that a bit better somehow | 14:40 |
avass | I guess weird things could technically happen if you modify the inventory in a loop and run those in parallel on multiple hosts and get race conditions that way | 14:43 |
avass | but I don't see why ansible wouldn't allow loop to run in parallel at all because of that | 14:44 |
Shrews | mordred: avass: i would probably ask bcoca to confirm since that bit of code is a bit complex and I've only looked at it in bits and pieces, but I believe that since `include_*` are dynamic, the execution of that include is in parallel, but it gets expanded on the workers as a new task list which is then executed sequentially. | 14:58 |
mordred | nod | 14:59 |
*** ykarel has quit IRC | 14:59 | |
openstackgerrit | Tristan Cacqueray proposed zuul/nodepool master: kubernetes: refactor client creation to utils_k8s https://review.opendev.org/c/zuul/nodepool/+/777022 | 15:01 |
*** jangutter has joined #zuul | 15:12 | |
jangutter | Hi! We're bootstrapping a zuul install from scratch, and I'm trying to debug something weird in the web-ui. If I curl <zuul-web>/api/tenant/workday/builds, I get 404 "tenant workday doesn't exist" | 15:15 |
jangutter | However, accessing jobs and projects works perfectly. | 15:15 |
fungi | jangutter: have you checked to see if zuul-web is logging any errors? maybe it's having trouble reaching the database? | 15:23 |
jangutter | fungi: the logs are _really_ quiet, I'm not seeing any obvious errors. Status seems to be operating fine though.... | 15:27 |
fungi | also possible that's the benign result of zuul-web misinterpreting an empty query result because there have been no builds for that tenant yet? | 15:27 |
fungi | or at least none which belong to any reported buildset | 15:28 |
jangutter | fungi: hah, and that situation occurs rather often on a clean install! | 15:29 |
jangutter | do we need a db reporter for this to work? | 15:30 |
fungi | if you're installing zuul >=4.0.0 the mysql reporter is implicit on all pipelines now i believe | 15:31 |
* fungi double-checks release notes | 15:31 | |
jangutter | Bwaha: 3.19.1.dev422 064042c7 | 15:31 |
corvus | jangutter, fungi: i'd start with checking the scheduler | 15:33 |
corvus | that's actually what zuul-web consults to determine if there's a tenant | 15:33 |
jangutter | Thanks! I can now explain my hair loss! I did not configure an sqlreporter. | 15:33 |
corvus | oh, you only see the tenant doesn't exist with builds? then yeah, lack of sql is probably the issue | 15:34 |
jangutter | corvus: I _think_ the scheduler is running fine? I can see jobs, even stream the console of one live.... | 15:34 |
* jangutter goes and adds sqlreporter to the base job. | 15:35 | |
corvus | jangutter: i'd go straight to 4.0 at this point, and don't configure a 'sql' connection or reporter, just configure a 'database' entry | 15:35 |
corvus | jangutter: 4.0 is nearly the same as what you're running, just with different syntax around sql reporting, and requires zk tls | 15:35 |
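A sketch of the 4.0-style configuration corvus describes: instead of an explicit 'sql' connection plus per-pipeline reporters, zuul.conf gets a 'database' section. The host and credentials below are hypothetical; the dburi is a standard SQLAlchemy URL.

```ini
# zuul.conf (Zuul >= 4.0) -- hypothetical host/credentials
[database]
dburi=mysql+pymysql://zuul:secretpassword@db.example.com/zuul
```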
corvus | or.. maybe even exactly the same as what you're running | 15:36 |
corvus | (not sure which commit 3.19.1.dev422 064042c7 actually maps to) | 15:36 |
jangutter | Googling that version leads me to a log from 2020-11-05 | 15:38 |
openstackgerrit | Tristan Cacqueray proposed zuul/nodepool master: kubernetes: refactor client creation to utils_k8s https://review.opendev.org/c/zuul/nodepool/+/777022 | 15:39 |
corvus | jangutter: but i think dev 422 is 422 commits after that, so probably more recent | 15:40 |
corvus | there were 522 commits between that version and now | 15:42 |
corvus | so it's probably not from yesterday :) | 15:43 |
fungi | though pbr also counts merge commits in that total so it might be half as many changes as that | 15:43 |
corvus | yeah | 15:43 |
jangutter | heh, we're using zuul from containers, but obviously our internal mirror is a bit... stale :-p | 15:45 |
corvus | jangutter: yeah, we do the same thing on opendev, sometimes running from a random master commit. you're in good company there, i think we just haven't quite solved the traceability issue. | 15:46 |
clarkb | the 064042c7 should map to a specific commit | 15:46 |
clarkb | however not one in our upstream repo? | 15:47 |
fungi | in theory yes, but it doesn't map to a commit in the history of zuul's master branch that i can find, so could be a fork | 15:47 |
clarkb | ya | 15:47 |
corvus | clarkb: yeah, but possibly a speculative one | 15:47 |
clarkb | oh I see | 15:47 |
corvus | doesn't have to be a fork | 15:47 |
corvus | the tagged releases will map to known commits | 15:47 |
corvus | but the running :latest from master can be speculative commits | 15:48 |
corvus | that's the mapping issue we need to figure out :) | 15:48 |
fungi | so it could actually just be an ephemeral fork we made in our gate pipeline ;) | 15:48 |
jangutter | yep, we took from :latest - good to know that it's so new it might be in a future that hasn't happened yet! | 15:49 |
corvus | we could solve that either by encoding additional information about the actual commits, or zuul-push could address that by making the ephemeral tree reality | 15:49 |
corvus | jangutter: yeah, time behaves differently for zuul :) | 15:49 |
clarkb | and even gerrit may not know about that due to speculative merges | 15:50 |
avass | fungi: "there is no fork" | 15:58 |
avass | too bad git doesn't have spoons :) | 15:59 |
fungi | heh | 15:59 |
*** sshnaidm is now known as sshnaidm|afk | 16:04 | |
avass | it might be a good idea to merge a change in zuul/zuul to bump the docs, so can I interest anyone in some easy reviews? :) https://review.opendev.org/c/zuul/zuul/+/775499 https://review.opendev.org/c/zuul/zuul/+/772491 | 16:17 |
corvus | avass: done! | 16:19 |
*** hasharLunch is now known as hashar | 16:22 | |
jangutter | Aha, I see now: https://review.opendev.org/c/zuul/zuul/+/630472 (I've been shamelessly copying the config and pipelines from _current_ zuul, but using the containers built on old-and-busted zuul) | 16:25 |
fungi | in opendev our scheduler is logging lots of "Exception: No job nodeset for some_job" tracebacks. probably benign, looks like it may happen if it tries to cancel a build which hasn't gotten a node assignment yet. is this a known behavior? investigating a (likely) separate problem and stumbled across these errors | 16:26 |
avass | fungi: I think I've seen that before but didn't investigate why it happened. maybe we have something similar I can dig up | 16:30 |
corvus | jangutter: ah yep, we removed our sql reporter defs immediately after landing that (the warnings are quite a motivator) | 16:30 |
corvus | fungi: have a stacktrace handy? | 16:31 |
corvus | it is likely benign but we raise that exception to catch bugs :) | 16:32 |
avass | yep, got 22 events the last 24h | 16:32 |
fungi | corvus: http://paste.openstack.org/show/802936/ | 16:33 |
fungi | we got 85054 of those in the past 10 hours | 16:35 |
corvus | fungi: likely for the same event/jobs over and over | 16:38 |
mordred | corvus: do we have a really old python or something on the nodepool nodes in gerrit's zuul? | 16:42 |
corvus | mordred: oh maybe, i think they are stretch? | 16:42 |
mordred | corvus: https://ci.gerritcodereview.com/t/gerrit/build/9001d1fda6494c43bc5dc9306f660d07/console#1/0/20/testnode - that's what I get trying to read gerrit's .gitmodules - but if I read it locally (with python2, which seems to be what we're using) - it doesn't work | 16:42 |
mordred | corvus: yeah, I think so | 16:42 |
corvus | mordred: it does or doesn't work locally? | 16:43 |
mordred | corvus: it works locally | 16:43 |
mordred | with both python2 and python3 | 16:43 |
mordred | I did some googling and python added support for leading spaces in ini files for configparser like back in 2010 - so I'm really confused :( | 16:44 |
corvus | heh shouldn't be that old | 16:44 |
mordred | (I'm now trying spinning up some stretch for more debugging) | 16:45 |
mordred | right? | 16:45 |
mordred | ok - yes - stretch python2 does not parse .gitmodules | 16:45 |
mordred | stretch python3 does work | 16:46 |
mordred | corvus: should we maybe consider using python3 for ansible there? I could also obviously just read the file into a string, remove the leading spaces, then wrap it in a StringIO to pass to configparser | 16:47 |
mordred | which should work everywhere | 16:48 |
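A sketch of the StringIO workaround mordred describes, written to run on both the stretch-era Python 2 and Python 3 (it assumes a .gitmodules file in the current directory):

```python
# Strip the leading indentation git writes in .gitmodules, then feed the
# result to configparser via a file-like object instead of a file path.
try:                                        # Python 3
    from configparser import ConfigParser
    from io import StringIO
except ImportError:                         # Python 2 (e.g. Debian stretch)
    from ConfigParser import ConfigParser
    from StringIO import StringIO

with open('.gitmodules') as f:
    content = ''.join(line.lstrip() for line in f)

config = ConfigParser()
reader = StringIO(content)
if hasattr(config, 'read_file'):            # modern Python 3
    config.read_file(reader)
else:                                       # old Python 2 ConfigParser
    config.readfp(reader)
print(config.sections())
```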
corvus | mordred: is py3 already installed? | 16:48 |
corvus | if not, then we'll take a hit installing it | 16:48 |
mordred | good question | 16:48 |
mordred | we're not building our own images there are we? | 16:48 |
mordred | I'm guessing no - since I don't think we're overriding the python setting, which means ansible is just finding python2 as python | 16:49 |
corvus | correct | 16:50 |
mordred | why don't I do StringIO for now - we can talk about python3, or changing the node type, as a follow | 16:50 |
corvus | yeah, we should change the node type but there are too many irons in the fire | 16:50 |
mordred | ++ | 16:50 |
corvus | fungi, avass: it looks like the bug centers around change C which git-depends on change B which is failing and change B is queue-dependent on change A which is passing. in that case, we can't shift change C out of the way any more, it just has to wait behind B, but since B is failing, we stop running jobs on C. we just cancel the jobs over and over; we don't see that they're already canceled. | 16:58 |
corvus | i can't dig into it any more than that right now, but i think that's enough to know that it probably really is benign other than the log/processing spam. we should be able to make a test case for that and fix it. | 16:59 |
fungi | yep, thanks. it's also looking increasingly unrelated to the problem i was initially trying to debug | 17:00 |
corvus | we should definitely fix it; there is a notable human and cpu cost to all that extra processing | 17:01 |
fungi | yes, sounds like repeatedly retrying to do something that's already been done is certainly suboptimal | 17:03 |
*** harrymichal has quit IRC | 17:12 | |
pabelanger | clarkb: looping back to ready unlocked nodes, it would be interesting to also debug https://zuul.opendev.org/t/openstack/nodes I see a few nodes that are greater than 10 hours / 18 hours. One is even 2 years | 17:12 |
pabelanger | I'm still trying to trace them on our side | 17:13 |
clarkb | pabelanger: ya we are looking at them currently in #openstack-infra | 17:13 |
clarkb | pabelanger: the vast majority are all in a single provider and originate during a similar time period | 17:13 |
pabelanger | ack | 17:14 |
clarkb | they belong to what appear to be completed node requests, nodepool is just not marking them as completed so zuul doesn't do anything with them | 17:14 |
clarkb | which is slightly different than the behavior you described | 17:14 |
tobiash | ready unlocked or ready locked? | 17:14 |
pabelanger | clarkb: yah | 17:15 |
clarkb | tobiash: ready locked | 17:15 |
pabelanger | tobiash: in our case, they are ready unlocked | 17:15 |
clarkb | tobiash: and the node request is sitting pending | 17:15 |
tobiash | yeah we see that as well, there is still a node leak to be fixed | 17:15 |
clarkb | tobiash: it looks like the handler poll is simply not noticing that the node is ready and can be unlocked, and the node request set to fulfilled | 17:16 |
fungi | which winds up blocking changes in a dependent queue from merging because the change currently at the front has a slew of queued builds corresponding to those node requests | 17:16 |
tobiash | but we've had no luck and no time yet, so we currently work around that by automatically deleting those locks after 24h | 17:16 |
pabelanger | tobiash: how do you delete them? | 17:16 |
clarkb | deleting the lock won't cause the node request to be fulfilled though? | 17:16 |
fungi | just deleting the znodes for them i guess? | 17:16 |
clarkb | maybe it will detect it lost the lock and then try again though | 17:16 |
pabelanger | I guess, crontab and zuul client? | 17:16 |
tobiash | we delete the lock via zkcli | 17:17 |
pabelanger | ah | 17:17 |
fungi | got it, yeah that's what i figured you meant | 17:17 |
corvus | pabelanger: i thought your issue was ready unlocked and you were going to check on logs to see if nodepool got around to deleting them? | 17:17 |
tobiash | basically nodepool list | grepsedsomething | xargs zkcli delete | 17:17 |
tobiash | pabelanger: do you use max ready age? | 17:18 |
pabelanger | corvus: yah, that is the issue. nodepool eventually deletes them or they get used, but only after a few hours | 17:18 |
pabelanger | tobiash: yes | 17:18 |
corvus | pabelanger: right, so if nodepool eventually gets around to them, then this isn't something that should be solved by deleting zk records | 17:18 |
pabelanger | but, I cannot see why they are ready unlocked for so long | 17:18 |
pabelanger | corvus: agree | 17:18 |
pabelanger | I can actually delete the unlocked node, which frees up quota, and it gets launched again | 17:19 |
pabelanger | just having trouble tracking down why there's such a large delay | 17:19 |
clarkb | in this case if we restart the launcher we should end up trying again because the locks would be removed right? | 17:20 |
clarkb | it is almost like these ~12 node request handler threads all just died | 17:20 |
clarkb | because there doesn't seem to be any of the logging you'd expect from the handler after the launcher finishes launching the node | 17:21 |
fungi | maybe triggered by a cloud api outage? | 17:21 |
fungi | would a thread dump help? | 17:21 |
clarkb | thread dump may help if we label the threads well enough to figure out which might be handling this request | 17:21 |
clarkb | seems like we should start there | 17:21 |
pabelanger | clarkb: is there enough quota for more nodes to be launched? are the node requests waiting for another node to come online to unlock? | 17:21 |
fungi | at least see what those threads *think* they're doing | 17:21 |
clarkb | pabelanger: it's a single node request so no | 17:21 |
pabelanger | k | 17:22 |
fungi | yeah, a lot of these are for very simple (linter, unit test) jobs | 17:22 |
openstackgerrit | Guillaume Chauvel proposed zuul/zuul master: Gitlab: raise MergeFailure exception to retry a failing merge https://review.opendev.org/c/zuul/zuul/+/777169 | 17:23 |
pabelanger | clarkb: yah, I would expect nodepool-launcher restart to clear the lock | 17:26 |
openstackgerrit | Merged zuul/zuul master: Add a note how to access zuul.items variable https://review.opendev.org/c/zuul/zuul/+/772491 | 17:32 |
openstackgerrit | Merged zuul/zuul master: Replace reset_repo_to_head(repo) with GitPython. https://review.opendev.org/c/zuul/zuul/+/775499 | 17:36 |
clarkb | tobiash: pabelanger: we took sigusr2 thread dumps then restarted the service. All of the ready locked nodes seem to be getting used now | 17:36 |
clarkb | (stopping the launcher would've unlocked the nodes then starting it again would've made nodepool realize it can fulfill requests with them I think) | 17:37 |
openstackgerrit | Guillaume Chauvel proposed zuul/zuul master: Gitlab: raise MergeFailure exception to retry a failing merge https://review.opendev.org/c/zuul/zuul/+/777169 | 17:37 |
clarkb | fwiw looking at the thread dumps doesn't seem to show much. There is no launcher thread for the node id (implying its launching was complete) and I think the handler polling happens in the main provider thread, which appeared to be running just fine | 17:38 |
clarkb | maybe we lost a handler reference somewhere? | 17:38 |
*** rpittau is now known as rpittau|afk | 17:41 | |
mnaser | has the circular dependencies work seen progress other than the spec? i don't see anything right off the bat | 17:41 |
tobiash | clarkb: have you seen any storeNode-related exceptions in the logs? | 17:41 |
tobiash | I see them occasionally so could be related | 17:42 |
corvus | mnaser: yes, there's an implementation in review. | 17:43 |
clarkb | tobiash: there aren't any in the log file for the period of time when this node request was first handled. There are a couple in more recent blocks of time though | 17:44 |
*** piotrowskim has quit IRC | 17:47 | |
*** Shrews has left #zuul | 17:56 | |
*** jpena is now known as jpena|off | 18:02 | |
*** rlandy|training is now known as rlandy | 18:02 | |
corvus | tristanC, tobiash: can you look at https://review.opendev.org/776566 real quick? | 18:39 |
avass | corvus: is there any specific reason zuul's dockerfiles don't set 'USER zuul'? | 18:42 |
tobiash | avass: the executor at least needs privileges due to bubblewrap | 18:44 |
tobiash | as well as nodepool-builder | 18:44 |
avass | tobiash: yeah but the rest should be able to do non-root by default | 18:44 |
tobiash | corvus: lgtm | 18:44 |
corvus | tobiash, avass: zuul-executor does not need root | 18:44 |
corvus | opendev runs its executors as the zuul user | 18:45 |
clarkb | you do need the kernel support for the user namespacing though | 18:45 |
clarkb | and I think rhel/centos disable that by default | 18:45 |
clarkb | (in those cases you need root aiui) | 18:46 |
corvus | tobiash, avass: opendev also runs its builders as non-root (the nodepool user) | 18:46 |
tobiash | corvus: right, it just needs sudo | 18:46 |
corvus | tobiash: yeah that does seem to be the case | 18:48 |
corvus | the only thing that needs to start as root is fingergw if (and only if) you want to run it on port 79. but if you're using containers or behind a lb, you can run it as non-root and map the port. | 18:48 |
avass | that still doesn't really answer my question, I see no reason why zuul-web would need to run as root by default for example | 18:48 |
*** jcapitao has quit IRC | 18:48 | |
corvus | avass: i think we don't set USER to accommodate openshift | 18:48 |
avass | oh | 18:49 |
corvus | however, openshift is changing how they're doing their user mapping, so it's worth checking back on that when we can assume openshift 4 everywhere | 18:49 |
tobiash | unfortunately we're not there yet | 18:49 |
tobiash | but we won't switch to 4 but to plain k8s later this year | 18:50 |
avass | I haven't really looked into openshift at all | 18:51 |
corvus | i'd love to switch. i think defaults should always be the correct value, and unfortunately we have a case here where if you use the defaults with k8s or docker, you're doing the wrong thing. | 18:52 |
corvus | but last we looked, specifying the user was the wrong thing in openshift. so <shrug> | 18:52 |
corvus | but somehow we managed to thread the needle so at least it is possible to do the right thing in both systems | 18:53 |
avass | yeah it's just a bit awkward to do runAsUser: 10001 | 18:54 |
openstackgerrit | Merged zuul/zuul-storage-proxy master: Add zuul user to container https://review.opendev.org/c/zuul/zuul-storage-proxy/+/776566 | 18:54 |
*** jangutter has quit IRC | 19:06 | |
*** hashar has quit IRC | 19:22 | |
*** gmann is now known as gmann_lunch | 19:47 | |
*** gmann_lunch is now known as gmann | 20:07 | |
*** ikhan has quit IRC | 20:14 | |
*** sanjayu_ has quit IRC | 20:16 | |
mordred | corvus: so - I'm a little stumped. https://ci.gerritcodereview.com/t/gerrit/build/e5d5502511bb4732b0a53f87edbc8aee/console#1/0/20/testnode | 20:20 |
mordred | shows a bunch of "no section" errors | 20:20 |
mordred | but when I run similar commands locally on the .gitmodules file from gerrit, it totally works | 20:21 |
mordred | https://gerrit-review.googlesource.com/c/zuul/jobs/+/297442 - I just pushed up a few more debug lines to see if we can figure out what's different on the nodes there from locally | 20:25 |
*** hamalq has joined #zuul | 20:30 | |
*** ikhan has joined #zuul | 20:33 | |
mordred | oh wow. | 20:37 |
mordred | https://ci.gerritcodereview.com/t/gerrit/build/b3e05e06045143858fd1eed056136166/console#1/0/20/testnode <-- | 20:37 |
mordred | we're sorting line numbers in the console output as strings not as numbers | 20:37 |
avass | hah | 20:37 |
avass | you mean you don't count like that? | 20:38 |
mordred | I mean, I do think that 2 should come between 19 and 20 | 20:39 |
mordred | but others might find that confusing | 20:39 |
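The bug in two lines of plain Python (not the React code itself, but the same ordering rule):

```python
# Lexicographic vs numeric ordering: "2" lands between "19" and "20"
# when line numbers are compared as strings.
lines = ["1", "19", "2", "20", "3"]
print(sorted(lines))             # ['1', '19', '2', '20', '3'] -- string sort
print(sorted(lines, key=int))    # ['1', '2', '3', '19', '20'] -- numeric sort
```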
*** ikhan has quit IRC | 20:42 | |
mordred | avass: any idea where I should look for the code that is rendering those? | 20:43 |
avass | mordred: I'm looking in web/src/containers/build/Console.jsx right now | 20:45 |
avass | not sure if that's it however | 20:46 |
corvus | it's not us, it's an external component | 20:46 |
corvus | react-json-view | 20:47 |
corvus | ReactJson from ^ | 20:47 |
mordred | neat. so react-json-view is displaying lists wrong? :) | 20:49 |
corvus | afaict | 20:50 |
corvus | maybe the 'sortKeys' option we set is affecting this | 20:50 |
avass | checking that right now | 20:50 |
corvus | https://github.com/mac-s-g/react-json-view/pull/263 | 20:51 |
mordred | oh - I betcha | 20:51 |
mordred | https://github.com/mac-s-g/react-json-view/blob/825e8dc96b59a551c6167bef3aa72eee0d2045c7/src/js/components/DataTypes/Object.js#L252 | 20:52 |
corvus | mordred: i linked to the fix ^ | 20:52 |
mordred | ah - you found the PR - it looks like master dtrt | 20:52 |
mordred | ++ | 20:52 |
avass | sortKeys={false} does it | 20:52 |
mordred | now - if only they tagged their repo it would be easy to see if that fix is in the latest release :) | 20:52 |
corvus | i think so. i'll push a zuul change. | 20:53 |
mordred | cool | 20:53 |
mordred | yeah - latest release 18 days ago | 20:53 |
avass | wow that took a long time to merge | 20:54 |
fungi | who needs arrays with more than 10 values? i mean, really | 20:54 |
mordred | good point | 20:54 |
fungi | if you have that many values, you should just use arrays of arrays | 20:55 |
mordred | or just alphabetical names of your favorite lemurs | 20:55 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Update react-json-view https://review.opendev.org/c/zuul/zuul/+/777223 | 20:56 |
mordred | corvus: woot | 20:56 |
avass | surprisingly few dependencies were updated | 20:57 |
avass | at least it feels like you usually get another 100 packages for free :) | 20:57 |
fungi | the fix was also mentioned in a subsequently merged pr "bump patch to 1.20.5" so looks like they don't use git tags for their release process | 20:58 |
mordred | corvus: figuring that out has not helped me figure out why the config parser in the job is working differently than the local configparser | 20:58 |
corvus | so demanding | 20:59 |
mordred | basically - I'm now printing out the .gitmodules that we're loading with configparser (it's down at the bottom of the output because it's line 40) - and you can see the sections are there that are later tossing "No section:" errors | 20:59 |
mordred | I've been trying to reproduce with a local stretch container - and it's perfectly happy loading that content :( | 21:00 |
*** harrymichal has joined #zuul | 21:04 | |
corvus | mordred: try outputing config.sections() to see what the section names are represented as internally | 21:05 |
corvus | maybe there's some subtle difference (bytes? quotes?) | 21:05 |
fungi | this is still the "why is indentation breaking ini file parsing" investigation? | 21:05 |
avass | yep | 21:07 |
mordred | well - no, we're working around the indentation - that turned out to be "old python on stretch" - but now we have a new ini file parsing issue :) | 21:09 |
mordred | corvus: ++ | 21:09 |
avass | mordred: my python2 is able to read the .gitmodules file after removing indentation but the sections are empty | 21:14 |
mordred | avass: oh? that's great - that's more reproducing than mine | 21:14 |
mordred | if I remove the indentation my python2 can read it and sections are not empty | 21:14 |
avass | this is on python 2.7.18 | 21:15 |
mordred | mine is 2.7.13 | 21:16 |
avass | oh... | 21:17 |
pabelanger | should database dependencies be installed by default now with zuul? Right now pymysql is missing if you set up a dburi | 21:17 |
corvus | pabelanger: i think we have them set as extras | 21:20 |
corvus | so you pick the right extras package based on which db you want to use | 21:20 |
corvus | we could just include both i guess | 21:20 |
pabelanger | Yah, remember that now | 21:20 |
pabelanger | we don't document that in upgrade notes | 21:20 |
pabelanger | we should likely add it or maybe move them into default requirements now | 21:21 |
corvus | i'd be fine with that | 21:23 |
mordred | ++ | 21:23 |
mordred | I don't think pymysql or the pg one are terribly onerous install requirements | 21:23 |
*** ajitha has quit IRC | 21:24 | |
fungi | pymysql doesn't have any compiled dependencies, at least. it's a pure python implementation. not sure about psycopg2 | 21:26 |
avass | yeah I get the same on 2.7.13 | 21:26 |
avass | just running the reading, formatting, and loading of .gitmodules in the interpreter, though | 21:27 |
fungi | looks like we use psycopg2-binary, which vendors in its precompiled c libs | 21:28 |
fungi | so not quite as clean as pymysql, but dependency-wise still not bad | 21:29 |
avass | mordred: ooooooh | 21:34 |
avass | mordred: I think the tempfile could be removed before configparser gets to read it | 21:34 |
avass | mordred: moving the config.read inside the 'with tempfile' works a lot better | 21:36 |
mordred | avass: aha! which is actually what I was doing locally | 21:39 |
mordred | avass: THANKYOU - I don't know how long I would have stared at things and still not seen that | 21:40 |
openstackgerrit | Paul Belanger proposed zuul/zuul master: Include database requirements by default https://review.opendev.org/c/zuul/zuul/+/777245 | 21:40 |
avass | who would have thought temporary files were temporary :) | 21:40 |
mordred | avass: ikr? | 21:40 |
fungi | aggressively so at times, apparently | 21:44 |
avass | tbh configparser should complain at least a little bit when the file you're telling it to read does not actually exist | 21:45 |
avass | mordred: \o/ | 21:46 |
corvus | heh, yes... the section is missing, but that's not really the big picture is it? :) | 21:46 |
fungi | shoe not tied (leg missing) | 21:46 |
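A sketch of the failure mode and the fix. The key detail is that configparser.read() silently skips files that do not exist (it returns the list of files it managed to read), which is why no error ever surfaced:

```python
# NamedTemporaryFile is deleted when its 'with' block exits, so a read()
# performed afterwards sees no file at all -- and configparser shrugs.
import configparser
import tempfile

CONTENT = '[submodule "plugins/example"]\npath = plugins/example\n'

# Broken: the temp file is gone by the time configparser looks for it.
with tempfile.NamedTemporaryFile(mode='w') as tmp:
    tmp.write(CONTENT)
    tmp.flush()
    path = tmp.name
config = configparser.ConfigParser()
print(config.read(path), config.sections())   # [] []  -- no exception raised

# Fixed: do the read while the temp file still exists.
with tempfile.NamedTemporaryFile(mode='w') as tmp:
    tmp.write(CONTENT)
    tmp.flush()
    config = configparser.ConfigParser()
    config.read(tmp.name)
    print(config.sections())                  # ['submodule "plugins/example"']
```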
mordred | corvus, avass : WOOT - I think the new python code is actually working now | 22:26 |
mordred | https://ci.gerritcodereview.com/t/gerrit/build/033c128e41e3416e9073a3aab396b1f9/console#1/0/20/testnode shows actions that I think we expect | 22:27 |
mordred | so - I think next step is putting the jobs back on that patch and letting it do real builds | 22:27 |
corvus | \o/ ++ | 22:30 |
avass | oh, are you doing speculative submodules? | 22:30 |
avass | (since you're doing a 'mv ..' instead of initializing the submodules) | 22:31 |
corvus | yeah, it actually kinda works with gating if you're using gerrit because gerrit will update the submodule reference when a commit lands. so the gating atomicity is still guaranteed. | 22:31 |
avass | oh interesting | 22:31 |
mordred | we've been doing it in a job for now to learn edge cases | 22:33 |
corvus | (if you did this without gerrit, then it's basically a recipe for a guaranteed broken tree) | 22:33 |
avass | yeah we're using submodules but setting the remote to a local copy checked out with required-projects | 22:34 |
avass | to get around needing secrets to initialize the submodules mainly | 22:35 |
mordred | corvus: think it's ok for me to DOS gerrit with that change? | 22:36 |
mordred | (I'm fairly confident it'll work this time) | 22:36 |
avass | I guess superproject subscriptions are what configure gerrit to bump submodules? | 22:37 |
mordred | yup | 22:37 |
avass | cool that could be useful | 22:38 |
mordred | so the code we have there will either use the required-projects version of the repo if there is one and it's set up to do subscriptions - or it will check out the ref in the repo if it's not configured for superproject subscriptions | 22:38 |
corvus | mordred: yep | 22:38 |
mordred | corvus: bombs away | 22:39 |
*** nils has quit IRC | 22:44 | |
mordred | corvus: you know - now that this logic is in python - we could probably get rid of the gerrit_project_mapping | 22:44 |
mordred | because you should be able to calculate it with relative paths - it's just not reasonable to do that in ansible directly | 22:45 |
*** kgz has quit IRC | 22:46 | |
mordred | also - should be able to get rid of gerrit_root and make it fully generic - because you could really just iterate over every repo on disk looking for .gitmodules files, then process relative paths from the url parameter | 22:47 |
mordred | not doing that today - just saying I think we could do that - and then it would be reasonable to move that role into zuul-jobs | 22:47 |
corvus | ++ | 22:48 |
avass | mordred: ++ | 22:48 |
*** kgz has joined #zuul | 22:50 | |
mordred | corvus: https://ci.gerritcodereview.com/t/gerrit/buildset/708abf94afe74c9490d0fae1566beca2 - all green | 23:04 |
*** ikhan has joined #zuul | 23:06 | |
*** harrymichal has quit IRC | 23:15 | |
fungi | and as you've discovered, it's not easy being green | 23:23 |
openstackgerrit | James E. Blair proposed zuul/zuul-storage-proxy master: Allow mounting somewhere other than / https://review.opendev.org/c/zuul/zuul-storage-proxy/+/777261 | 23:39 |
openstackgerrit | James E. Blair proposed zuul/zuul-storage-proxy master: Add an integration test to the image build https://review.opendev.org/c/zuul/zuul-storage-proxy/+/777262 | 23:39 |
corvus | mordred: ^ i don't know if that job is going to work on the first try, but i know the process works (i ran the playbook locally). that should be our first automated test of something zuul related with a live swift. | 23:40 |
mordred | corvus: \o/ | 23:58 |