*** rlandy has quit IRC | 00:00 | |
*** jamesmcarthur has quit IRC | 00:02 | |
*** adamw has quit IRC | 00:02 | |
*** adamw has joined #zuul | 00:03 | |
*** armstrongs has joined #zuul | 00:05 | |
*** erbarr has quit IRC | 00:08 | |
*** armstrongs has quit IRC | 00:14 | |
*** jamesmcarthur has joined #zuul | 00:28 | |
*** jamesmcarthur has quit IRC | 00:32 | |
tristanC | mnaser: +3, thanks, those look great! | 00:42 |
*** tosky has quit IRC | 00:46 | |
openstackgerrit | Merged zuul/zuul-jobs master: wait-for-pods: Wait for all pods to become Ready https://review.opendev.org/713107 | 00:50 |
*** rlandy has joined #zuul | 00:52 | |
openstackgerrit | Merged zuul/zuul-jobs master: Keep doc/source/roles.rst sorted https://review.opendev.org/713128 | 00:59 |
openstackgerrit | Merged zuul/zuul-jobs master: install-docker: add option to use buildset registry https://review.opendev.org/713115 | 01:00 |
mnaser | thanks tristanC | 01:00 |
*** jamesmcarthur has joined #zuul | 01:07 | |
mnaser | zuul just reported this on my change | 01:10 |
mnaser | Unable to freeze job graph: 'NoneType' object has no attribute 'decrypt' | 01:10 |
mnaser | i used a secret that isn't defined | 01:11 |
fungi | that error seems more straightforward than some, at least | 01:12 |
mnaser | so it should at least be something more straightforward aha | 01:12 |
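
The error mnaser quotes is the generic Python failure that appears when a lookup returns `None` and the result is used without a check. The following is a minimal sketch of that pattern and of a clearer alternative; it is not Zuul's actual code, and `Secret`, `SECRETS`, `freeze_secret_naive`, and `freeze_secret_checked` are hypothetical names used only for illustration.

```python
class Secret:
    """Hypothetical stand-in for a secret object with a decrypt() method."""

    def __init__(self, name, data):
        self.name = name
        self.data = data

    def decrypt(self):
        # A real implementation would decrypt; this sketch returns a copy.
        return dict(self.data)


# Hypothetical registry of the secrets that have been defined.
SECRETS = {"site_password": Secret("site_password", {"password": "hunter2"})}


def freeze_secret_naive(name):
    # dict.get() returns None for an undefined secret, so the next line
    # raises AttributeError: 'NoneType' object has no attribute 'decrypt'.
    secret = SECRETS.get(name)
    return secret.decrypt()


def freeze_secret_checked(name):
    # Check for the missing secret first and raise a readable error instead.
    secret = SECRETS.get(name)
    if secret is None:
        raise LookupError(f'Secret "{name}" is not defined')
    return secret.decrypt()
```

Calling `freeze_secret_naive("missing")` reproduces the opaque `AttributeError`, while `freeze_secret_checked("missing")` names the undefined secret in its message.
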
*** Goneri has quit IRC | 01:12 | |
*** jamesmcarthur has quit IRC | 01:15 | |
*** jamesmcarthur has joined #zuul | 01:16 | |
*** kmalloc has joined #zuul | 01:50 | |
*** jamesmcarthur has quit IRC | 01:53 | |
*** jamesmcarthur has joined #zuul | 02:05 | |
*** swest has quit IRC | 02:08 | |
*** rlandy has quit IRC | 02:09 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-operator master: Add TLS configuration to ZooKeeper service https://review.opendev.org/712759 | 02:10 |
openstackgerrit | Merged zuul/zuul-operator master: Use explicit provides/requires for container jobs https://review.opendev.org/712816 | 02:24 |
*** jamesmcarthur has quit IRC | 02:24 | |
*** swest has joined #zuul | 02:24 | |
*** jamesmcarthur has joined #zuul | 02:32 | |
*** bhavikdbavishi has joined #zuul | 03:01 | |
mnaser | so I have https://review.opendev.org/#/c/713339/ up to move tempest to load its jobs and there's a devstack change that is complaining about a missing job? That sounds like a zuul bug? https://review.opendev.org/#/c/708317/ is the devstack change | 03:33 |
tristanC | corvus: nice, so https://review.opendev.org/#/c/712759/ worked with zk-ca unencrypted pkcs8 files. so it seems like we just need to concat the key and cert? not sure if the cert-manager can provide this natively, but that shouldn't be a problem. | 03:50 |
*** jamesmcarthur has quit IRC | 03:59 | |
*** kmalloc has quit IRC | 04:00 | |
*** jamesmcarthur has joined #zuul | 04:03 | |
*** jamesmcarthur has quit IRC | 04:08 | |
*** jamesmcarthur has joined #zuul | 04:10 | |
*** jamesmcarthur has quit IRC | 04:35 | |
*** bolg has joined #zuul | 04:47 | |
*** jamesmcarthur has joined #zuul | 04:48 | |
*** bolg has quit IRC | 04:51 | |
*** bolg has joined #zuul | 04:52 | |
*** jamesmcarthur has quit IRC | 04:53 | |
*** saneax has joined #zuul | 04:53 | |
*** zxiiro has quit IRC | 05:18 | |
*** evrardjp has quit IRC | 05:35 | |
*** evrardjp has joined #zuul | 05:36 | |
*** dpawlik has joined #zuul | 06:57 | |
*** dpawlik has quit IRC | 07:41 | |
*** dpawlik has joined #zuul | 07:42 | |
*** dpawlik has quit IRC | 07:48 | |
*** raukadah is now known as chandankumar | 07:58 | |
*** avass has joined #zuul | 08:28 | |
*** jcapitao has joined #zuul | 08:30 | |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Add parent and abstract flags for diskimages https://review.opendev.org/713157 | 08:30 |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Correct dib_path typo https://review.opendev.org/713380 | 08:30 |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Convert to slots https://review.opendev.org/713381 | 08:30 |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: diskimage: make name primary key https://review.opendev.org/713382 | 08:30 |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Correct dib_path typo https://review.opendev.org/713380 | 08:34 |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Convert diskimages to slots https://review.opendev.org/713381 | 08:34 |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: diskimage: make name primary key https://review.opendev.org/713382 | 08:34 |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Add parent and abstract flags for diskimages https://review.opendev.org/713157 | 08:34 |
*** jpena|off is now known as jpena | 08:56 | |
*** hashar has joined #zuul | 08:58 | |
*** harrymichal has joined #zuul | 09:05 | |
*** harrymichal has quit IRC | 09:12 | |
*** tosky has joined #zuul | 09:17 | |
*** threestrands has quit IRC | 09:56 | |
*** harrymichal has joined #zuul | 10:28 | |
*** harrymichal has quit IRC | 11:34 | |
*** harrymichal has joined #zuul | 11:59 | |
*** jcapitao is now known as jcapitao_lunch | 12:04 | |
*** hashar_ has joined #zuul | 12:16 | |
*** hashar has quit IRC | 12:16 | |
*** hashar_ has quit IRC | 12:20 | |
*** hashar_ has joined #zuul | 12:20 | |
*** hashar_ is now known as hashar | 12:23 | |
*** rlandy has joined #zuul | 12:29 | |
*** jpena is now known as jpena|lunch | 12:31 | |
*** harrymichal has quit IRC | 12:32 | |
*** harrymichal has joined #zuul | 12:34 | |
openstackgerrit | Denys proposed zuul/nodepool master: Add floating-pool option https://review.opendev.org/713428 | 12:53 |
zbr | AJaeger: can you explain the removal of bindep from zuul-roles mentioned in https://review.opendev.org/#/c/708642/18/zuul-tests.d/python-jobs.yaml@14 ? | 12:54 |
zbr | do we now assume that all roles defined in zuul-roles are supposed to work w/o a bindep file? if true, I bet lots of them will be effectively broken. | 12:56 |
zbr | i do not have a particular love for bindep, but i know for sure that these roles do not correctly use dependency chain in order to install the stuff they need | 12:57 |
openstackgerrit | Jens Harbott (frickler) proposed zuul/nodepool master: Correct dib_path typo https://review.opendev.org/713380 | 12:57 |
zbr | there are lots of assumptions made which consider only the openstack nodepool images, while these roles are supposed to be used by any zuul deployment. | 12:58 |
zbr | and due to this, i faced lots of bugs with them on both rdo-zuul and ansible-zuul instances; sometimes it was something wrong in an image, but in many cases it was the role itself which did not account for a specific case. | 12:59 |
*** arxcruz|rover has quit IRC | 13:00 | |
*** sshnaidm is now known as sshnaidm|afk | 13:02 | |
*** arxcruz has joined #zuul | 13:02 | |
*** arxcruz is now known as arxcruz|rover | 13:03 | |
Shrews | fbo: can we abandon https://review.opendev.org/619525 ? that's fairly old and can be handled via a clouds.yaml change anyway | 13:12 |
fbo | Shrews: sure done | 13:14 |
AJaeger | zbr: we added bindep for Zuul install, and removed the Zuul install from zuul-jobs - thus mordred removed it in https://review.opendev.org/712547 | 13:19 |
AJaeger | zbr: we can add content back if needed - for now, we added it because of Zuul install and removed that, so remove bindep as well... | 13:20 |
*** jcapitao_lunch is now known as jcapitao | 13:24 | |
*** michael-beaver has joined #zuul | 13:25 | |
zbr | AJaeger: ok, to avoid surprises, i would prefer to trigger all testing jobs when bindep.txt is touched (removal counts as touched) | 13:25 |
zbr | it could help us discover what could get broken by its removal | 13:25 |
zbr | i am inclined to believe that we still want a bindep file, as even linting could need some extra deps. | 13:27 |
*** Goneri has joined #zuul | 13:27 | |
zbr | i often use bindep file to manually install deps on my dev machine, it also acts as a way to document deps, not only useful for zuul. | 13:27 |
*** jpena|lunch is now known as jpena | 13:33 | |
AJaeger | zbr, we needed it for Install of Zuul - and we don't install Zuul anymore. | 13:34 |
zbr | sure, let me try to make a test change, to validate if removal breaks something else. | 13:34 |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: DNM: test bindep.txt touching effects https://review.opendev.org/713445 | 13:42 |
*** bhavikdbavishi has quit IRC | 13:45 | |
corvus | bindep in zuul-jobs is only used for roles that need extra dependencies for testing (ie, roles with unit tests); it's not used for expressing dependencies needed for roles themselves; that should either be handled by the role itself (eg ensure-tox) or by documentation in the role which explains pre-requisites for images. | 13:45 |
mnaser | yeah as i understand it, it's like what corvus was saying. if anything actually, removing bindep should help us potentially catch more cases of roles not working properly because they are not installing a dependency | 13:48 |
AJaeger | mnaser: agreed. | 13:48 |
AJaeger | So, change under discussion is https://review.opendev.org/712547 - anybody wants to +2? | 13:49 |
mnaser | corvus: btw, i faced an interesting (what i think) is a zuul bug if you have not had a chance to catch up with scrollback from yesterday | 13:56 |
corvus | i agree with that change, but we can also wait for zbr's jobs to return just to confirm that there wasn't any scope-creep for the bindep file. but if there are any errors, we should still probably merge mordred's change and fix the jobs with errors to do something other than rely on bindep. | 13:56 |
mnaser | ^ yep, that was my thoughts as well, id like to see the jobs report first | 13:57 |
*** mhu has joined #zuul | 14:01 | |
corvus | mnaser: re zuul bug, do you mean the errors on 708317? i'm not sure i entirely follow and could use an update on the current state :) | 14:02 |
openstackgerrit | Zen Kurosaky proposed zuul/zuul master: This patch extends zuul deployment from scratch documentation. https://review.opendev.org/713453 | 14:12 |
mnaser | corvus: so i think the interesting thing that happened was that 708317 was getting "Job tempest-full-py3 not defined". at that time, i had added openstack/devstack to load jobs, then added openstack/tempest to projects _but not_ load jobs. | 14:13 |
mnaser | the interesting thing is i'm (kinda) sure that the errors with tempest-full-py3 not defined was caused because of the change i did (even though it was in another tenant) | 14:14 |
mnaser | once https://review.opendev.org/#/c/713339/ merged and ianw rechecked a couple hours later, the patch no longer reported "job not found" | 14:14 |
mnaser | so in some weird or strange way i managed to maybe undefine the job in speculative zuul config changes .. or something along those lines? i don't understand that system very well but i hope i did an (ok) job trying to explain what i observed, happy to clarify | 14:15 |
corvus | mnaser: it was only not defined in the vexxhost tenant which is the one that left the message on the change | 14:16 |
corvus | mnaser: if you notice right after the 'job not defined' message, there's a 'depends on a change that failed to merge' message. that second one is probably from the openstack tenant. | 14:16 |
corvus | mnaser: this is the problem mordred and i were alluding to yesterday | 14:17 |
corvus | mnaser: (it shows up most often in merge conflict messages, but a config error is similar) | 14:17 |
corvus | mnaser: i'm not sure if we have a good way to turn those off for "secondary" tenants | 14:18 |
mnaser | mordred: oh. i understand now. so because the tenant has a pipeline configured that has reporting configured, the openstack and vexxhost tenants were both capturing the change, and openstack wasn't complaining but the vexxhost tenant was (and seeing it comes from the same "zuul" account, we wouldn't notice it) | 14:20 |
mnaser | oops, i mean corvus ^ | 14:20 |
corvus | yep | 14:20 |
mnaser | so theoretically if there was another connection to opendev for this tenant and it had its own reporter, it would have been zuul (openstack) +1, vexxhost -1 | 14:21 |
mnaser | ok that explains it | 14:21 |
*** jamesmcarthur has joined #zuul | 14:22 | |
fungi | or if the report somehow indicated the tenant name (perhaps parenthetically prepended to the job name or something) | 14:29 |
mnaser | fungi: yeah, that's what i was thinking | 14:37 |
fungi | or mentioned along with the pipeline name | 14:39 |
fungi | though that might not have a workable analogue in the checks api data structure for gerrit | 14:40 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: WIP: Separate connection registries in tests https://review.opendev.org/712958 | 14:40 |
fungi | so yeah, maybe the account-per-tenant option we talked about does make more sense | 14:43 |
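
fungi's suggestion of naming the tenant in the report can be pictured with a small Python sketch. This is only an illustration of the idea, not Zuul's report format; the function name `tag_report` and the bracketed layout are assumptions made for the example.

```python
def tag_report(tenant, pipeline, message):
    """Prefix a report message with the tenant and pipeline it came from.

    The layout is illustrative only; it is not Zuul's actual report format.
    """
    return f"[tenant: {tenant}, pipeline: {pipeline}] {message}"
```

With a prefix like this, the "not defined" message on 708317 would have been visibly attributed to the vexxhost tenant even though both tenants report through the same account.
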
AJaeger | mnaser, corvus, want to +2A 712547 now that https://review.opendev.org/#/c/713445/ has run? Or is anything odd in it? | 14:43 |
*** NBorg has quit IRC | 14:43 | |
mnaser | AJaeger: it's a +2 from me but i'd rather have someone else have a look and +W just because there was a lot of concerns brought up | 14:46 |
*** gmann is now known as gmann_afk | 14:48 | |
*** zxiiro has joined #zuul | 14:48 | |
zbr | AJaeger: take a look at https://review.opendev.org/#/c/713445/ | 14:55 |
zbr | even some current jobs are broken due to bindep.txt file, i think it was a mistake not to include it in all jobs, similar mistake with test-reqs.txt | 14:56 |
corvus | zbr: what jobs are broken? | 14:56 |
AJaeger | zbr: there's one failure that is clearly unrelated, isn't it? | 14:57 |
*** sshnaidm|afk is now known as sshnaidm | 14:57 | |
AJaeger | zbr: I'm confused, please give more details | 14:57 |
zbr | not^ f30 is clearly related, not sure about opensuse one which was already nv. | 14:57 |
mordred | f30 is a broken fedora repo | 14:58 |
fungi | looks to me like it's confirmed we should be safe to remove | 14:58 |
mordred | yup | 14:58 |
zbr | let me trigger it again with file removed, i bet it will be more than one broken job. | 14:59 |
corvus | nonono | 14:59 |
mordred | also - I mean - if any of the tests are broken by bindep removal - the roles are broken | 14:59 |
mordred | roles do NOT depend on bindep | 14:59 |
mordred | bindep is not used when roles are used | 14:59 |
corvus | let's be a little more cautious with asking the ci system to run 100 jobs | 14:59 |
mordred | it is ENTIRELY possible that there are roles in zuul-jobs that do not work on all platforms | 15:00 |
corvus | we don't need to run that again. we have good data | 15:00 |
mordred | but under no circumstances will anything related to bindep fix that | 15:00 |
mordred | s/possible/probable/ | 15:00 |
corvus | zbr: wait, i just looked at your change again... you didn't actually remove anything from bindep.txt... | 15:01 |
corvus | zbr: so what was it you just used all those machines to test? | 15:01 |
zbr | corvus: how about running these on periodic-weekly? so we get an idea about which ones get broken. | 15:01 |
zbr | this is what i am trying to say, i just touched the file, not removing it. | 15:01 |
corvus | zbr: then you just wasted all of our time, and a bunch of ci resources too, and the time of everyone who is waiting on those | 15:02 |
corvus | because that test told us nothing useful after all | 15:02 |
corvus | other than, maybe "zuul can run lots of jobs" | 15:03 |
zbr | instead of using blame, we could maybe think about how we can improve the current testing logic. | 15:03 |
corvus | zbr: please do. when you come up with a full test plan, let us know. | 15:03 |
AJaeger | zbr, nothing in your tests showed that we cannot merge https://review.opendev.org/712547, I'll +2A now. | 15:04 |
fungi | by removing the bindep.txt file, we'll be solving whatever logic problem there is around not running the jobs for the zuul-jobs repo when bindep.txt is altered (because there will be no more bindep.txt file to worry about) | 15:04 |
zbr | most of the roles can be tested with containers, when I mentioned using molecule to run them I was told "we have zuul jobs for that, we can run as many as we need" | 15:04 |
corvus | zbr: yes, we can. but please run *useful* tests. | 15:04 |
zbr | to be clear: i am not against removal of the bindep. | 15:05 |
corvus | zbr: that would have been a useful test if you removed the contents of bindep.txt to see what would be affected by its removal. | 15:05 |
corvus | zbr: but you just ran jobs with no changes, and they worked, which is exactly what we would expect. | 15:05 |
zbr | that was my mistake, not to remove it. | 15:05 |
AJaeger | zbr: want to update 708642 and remove the bindep lines? | 15:06 |
corvus | i'd rather we don't | 15:06 |
mordred | https://review.opendev.org/#/c/712547/ already removes bindep though - so we already have that test. I think it would be useful to step back for a sec and try to restate what it is that we're trying to fix before we do next things | 15:06 |
AJaeger | corvus: I'm talking about https://review.opendev.org/#/c/708642/18/zuul-tests.d/python-jobs.yaml | 15:07 |
corvus | AJaeger: oh sorry | 15:07 |
AJaeger | np | 15:07 |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: Improve ensure-tox role https://review.opendev.org/708642 | 15:07 |
corvus | jamesmcarthur: i think i would take the silence on the ml about the user survey to indicate that it's ready to go... that seem right to you? | 15:09 |
jamesmcarthur | corvus: thanks, that was my feeling as well :) | 15:09 |
jamesmcarthur | I'll get with aprice and work on promotion and outreach | 15:10 |
jamesmcarthur | we should put a link to it on the Zuul homepage | 15:10 |
openstackgerrit | Daniel Pawlik proposed zuul/zuul-jobs master: Add phoronix-test-suite job https://review.opendev.org/679082 | 15:10 |
corvus | jamesmcarthur: i don't know about the wiki article... | 15:11 |
corvus | jamesmcarthur: it seems like there are sufficient secondary sources (several published news articles).... | 15:12 |
corvus | jamesmcarthur: certainly more than, say, a page like https://en.wikipedia.org/wiki/Gated_commit | 15:12 |
jamesmcarthur | corvus: I'm working with Robert Cathey on fixing up the Zuul article. He found some additional sources and has had experience getting wiki articles published. | 15:12 |
corvus | which doesn't mention zuul, and all of its sources are software documentation and blogs, and were published like 6 years after we started describing zuul as a gating system | 15:12 |
corvus | jamesmcarthur: awesome! | 15:13 |
jamesmcarthur | corvus: I think it's just a bit of a slog. The person who denied our first entry specializes in "TV and entertainment" | 15:13 |
jamesmcarthur | so... we're going to have to have a sweet spot of undeniable documentation + the right reviewer | 15:14 |
corvus | jamesmcarthur: ah :) | 15:14 |
corvus | jamesmcarthur: are conference videos useful? | 15:14 |
jamesmcarthur | it seems like articles carry the most weight | 15:14 |
corvus | shame -- mordred and i could come up with like 30 of those over the past 8 years :) | 15:15 |
fungi | from "reputable sources" (meaning periodicals someone specializing in tv and entertainment will have heard of before?) | 15:15 |
corvus | mordred: did you mention zuul in your wired cover story? | 15:16 |
jamesmcarthur | i think just sites with a high rank | 15:16 |
fungi | apparently sites like opensource.com don't count as reputable | 15:16 |
jamesmcarthur | corvus: re videos - but if you have some that aren't on openstack.org you can send my way, I'll add them to the list Robert put together. | 15:16 |
corvus | jamesmcarthur, mordred: yes, monty mentioned zuul in wired: https://www.wired.com/2013/04/new-hackers-taylor/ | 15:16 |
openstackgerrit | Mohammed Naser proposed zuul/zuul master: Display clean error message for missing secret https://review.opendev.org/713469 | 15:16 |
jamesmcarthur | oh nice! | 15:16 |
corvus | it doesn't use the word gating, but it does describe the process and conditions, so we might be able to get a cite out of that | 15:17 |
jamesmcarthur | corvus: One of the other comments was it seemed like we were just quoting ourselves, and so it felt like an advertisement. | 15:18 |
jamesmcarthur | I'm hoping Robert can help with the tone there to avoid that perception. | 15:18 |
corvus | jamesmcarthur: that wired article has responses from other people and companies | 15:19 |
corvus | "Whatever you call it, it's a sign of things to come. Michael Lehenbauer, a former Microsoft developer, says the software giant has used a similar setup on large projects inside the company." | 15:20 |
jamesmcarthur | corvus: awesome - just sent it on to Robert as well | 15:20 |
mordred | are any of the articles from bmw or volvo about how they're using zuul useful sources? | 15:26 |
openstackgerrit | Tobias Urdin proposed zuul/nodepool master: Filter active images for OpenStack provider https://review.opendev.org/713471 | 15:26 |
corvus | mordred: oh interesting, yeah that volvo article in particular may be a unique source | 15:30 |
mordred | yeah - since it was them talking about automotive things and then talking about how they used zuul to solve it - it wasn't really a zuul sales pitch | 15:31 |
AJaeger | so, zbr has a job to improve ensure_tox role, anybody to review that one later, please? https://review.opendev.org/#/c/708642/ | 15:31 |
mnaser | mordred: can i pick up your ensure-python change? noonedeadpunk is working on a dib element which would cache python builds (and so in combination with your work it'll likely be noop) | 15:33 |
mnaser | sorry more specifically to work with pyenv | 15:33 |
*** erbarr has joined #zuul | 15:37 | |
openstackgerrit | David Shrewsbury proposed zuul/nodepool master: Add options to CLI info command https://review.opendev.org/712539 | 15:37 |
openstackgerrit | Merged zuul/zuul-jobs master: Use a zuul_* and add an .ansible-lint file https://review.opendev.org/712547 | 15:40 |
mordred | mnaser: absolutely | 15:42 |
mordred | mnaser: the biggest issue I ran into was trying to figure out how to set paths - but maybe just adding a /usr/local/bin symlink would be enough | 15:43 |
zenkuro | fuhhh... looks like Im almost done with adding mysql to docs | 15:43 |
zenkuro | But I continue to struggle with log management in zuul, can anybody give a guide on how to grasp log gathering and uploading? Or supervise me in this crusade? I will post a doc on the basis of my finding | 15:45 |
mnaser | mordred: ok, i will look into it a tad more! | 15:55 |
corvus | zenkuro: there's a very small amount of discussion about logs here: https://zuul-ci.org/docs/zuul/tutorials/quick-start.html#configure-a-base-job | 15:57 |
corvus | zenkuro: i think at this point we would suggest people run generate-zuul-manifest and one of the upload-logs(.*) roles | 15:58 |
corvus | zenkuro: also fetch-output and merge-output-to-logs | 15:58 |
corvus | zenkuro: fetch-output makes it easier to get logs from the remote machine to the executor, merge-output-to-logs makes sure that artifacts and docs get uploaded, generate-zuul-manifest is needed for the web ui to work correctly, and then obviously upload-logs (or upload-logs-swift, or upload-logs-gcs) is needed to put them in storage and which and how to use it depends on the installation. | 16:00 |
*** AJaeger has quit IRC | 16:22 | |
*** AJaeger has joined #zuul | 16:22 | |
*** hashar is now known as hasharAway | 16:39 | |
*** chandankumar is now known as raukadah | 16:48 | |
clarkb | https://review.opendev.org/#/c/713380/3 is an easy nodepool bugfix | 16:55 |
zenkuro | corvus: thanks again! | 16:59 |
*** jamesmcarthur has quit IRC | 17:11 | |
*** jamesmcarthur has joined #zuul | 17:12 | |
*** jamesmcarthur has quit IRC | 17:13 | |
*** jamesmcarthur has joined #zuul | 17:13 | |
*** jcapitao has quit IRC | 17:19 | |
clarkb | ianw corvus tristanC https://review.opendev.org/#/c/713157/4 looks good to me other than some minor code and test things | 17:23 |
clarkb | might be good to rereview that tristanC and see if it does what you had thought about for inheritance (my interpretation is that it does) | 17:23 |
*** hasharAway is now known as hashar | 17:25 | |
*** jamesmcarthur has quit IRC | 17:28 | |
corvus | clarkb, ianw: my first impression is that i like that, and the test fixtures look like a pretty good example usage (so i can imagine how opendev would change pretty easily). we should make sure Shrews has a chance to look at this; i think he's been heads down in other stuff and i don't want this to pass him by :) | 17:29 |
*** jamesmcarthur has joined #zuul | 17:29 | |
clarkb | rgr | 17:29 |
*** jamesmcarthur has quit IRC | 17:31 | |
*** jamesmcarthur has joined #zuul | 17:31 | |
tristanC | +1 | 17:34 |
*** evrardjp has quit IRC | 17:35 | |
Shrews | Will look in a bit, but here is a question for the team: clarkb found the need to manually delete some zk image upload records the other day because of the recent FN move. I hate having to force anyone to do anything with zk-shell, so I'm working on nodepool CLI changes to help with this. We already have the 'erase' command which deletes ALL zk info for a provider. I want to change this to be more selective in what we delete. | 17:36 |
*** evrardjp has joined #zuul | 17:36 | |
Shrews | e.g., image builds, image uploads, or node records | 17:36 |
Shrews | turns out, deleting image build records also deletes image upload records because of the nested nature of the records | 17:36 |
Shrews | do we want to keep that behavior? | 17:37 |
mordred | clarkb: ++ from me on that nodepool change too | 17:37 |
Shrews | or force the user to do upload records separately? | 17:37 |
fungi | now that nodepool goes ahead and deletes image upload records when it asks the provider to delete the image, rather than when the provider eventually gets around to deleting the image, having the build record deletion also delete upload records seems fine to me | 17:39 |
clarkb | Shrews: in the case I ran into the build records were automatically purged once the uploads were gone | 17:39 |
Shrews | clarkb: yeah, i'm not seeing any reason one would want to keep the build records after deleting the upload records | 17:40 |
fungi | or is it just that it cleans up the on-disk copies once asked to delete, but is still currently holding onto the build/upload records until the providers confirm deletion? | 17:40 |
Shrews | fungi: if any uploads exist, the build will not be deleted (either from zk or on disk) | 17:41 |
Shrews | so once the upload records were removed, nodepool was free to cleanup after itself | 17:41 |
fungi | hrm, i thought that changed a few weeks ago | 17:42 |
fungi | because we were stuck with a bunch of extra local copies of images which providers refused to delete due to stuck nodes booted from those images in a boot-from-volume configuration | 17:42 |
* fungi checks the changelog | 17:43 | |
openstackgerrit | Merged zuul/nodepool master: Correct dib_path typo https://review.opendev.org/713380 | 17:43 |
Shrews | fungi: "now that nodepool goes ahead and deletes image upload records when it asks the provider to delete the image" ... not sure what that change is | 17:43 |
fungi | yeah, i was asking if it was that or just the local copies of the images | 17:44 |
fungi | looks like the latter | 17:44 |
fungi | https://review.opendev.org/702062 Delete dib images when all uploads set to deleting | 17:44 |
fungi | so it's still keeping the build and upload records, just removing the on disk copies | 17:44 |
Shrews | oh, yeah. forgot about that one | 17:44 |
fungi | is there much urgency now to clean up the upload and build records since we're freeing up disk immediately? | 17:46 |
fungi | i guess for the "provider is totally gone" case there is | 17:46 |
fungi | because nodepool will never find out the images are gone from the (now nonexistent) provider | 17:46 |
mordred | yeah | 17:47 |
* mordred says useful things like "yeah" to provide value | 17:47 | |
Shrews | good point. i think we can actually just leave the 'erase' command as-is since we could have used that | 17:47 |
Shrews | but we just forgot it existed | 17:47 |
fungi | so with that as the main use case at least, i think deleting the upload records is sufficient because the build records will get cleared as soon as those are gone anyway | 17:47 |
Shrews | so... yeah. thx mordred! | 17:47 |
mordred | Shrews: I'm glad I solved this problem | 17:47 |
Shrews | mordred: you are a true hero | 17:48 |
fungi | yeah | 17:48 |
fungi | which i think is what clarkb was saying, on reflection | 17:48 |
clarkb | fungi: ya that | 17:49 |
fungi | and nodepool is already smart enough to not get rid of the build records if there are lingering upload records in another provider you didn't explicitly clear | 17:50 |
corvus | yeah | 17:53 |
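
The record relationships described above (upload records nested under build records, and builds kept while any upload remains) can be sketched in Python. This is a simplified, hypothetical in-memory model, not nodepool's ZooKeeper layout or its CLI; the function names and the dict shape are assumptions made for illustration.

```python
# Hypothetical in-memory model: build id -> {provider name: upload record}.


def delete_provider_uploads(builds, provider):
    """Remove every upload record belonging to one provider."""
    for uploads in builds.values():
        uploads.pop(provider, None)


def delete_build(builds, build_id):
    """Remove a build record; its nested upload records go with it."""
    builds.pop(build_id, None)


def cleanup_builds(builds):
    """Drop builds that have no upload records left; return their ids."""
    removable = [bid for bid, uploads in builds.items() if not uploads]
    for bid in removable:
        del builds[bid]
    return removable
```

In this model, clearing one provider's uploads and then running `cleanup_builds` removes only the builds that had no uploads elsewhere; a build with a lingering upload in another provider is kept, which matches the behavior fungi describes.
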
*** avass has quit IRC | 17:55 | |
*** jpena is now known as jpena|off | 18:04 | |
Shrews | clarkb: corvus: i'm fine with the builder dib changes, btw. i find myself wishing we could abstract away a lot of the image building stuff, but i also wish i weren't confined to my house and realize i just have to deal with the current situation :/ | 18:07 |
*** saneax has quit IRC | 18:35 | |
mnaser | ok i'm losing it i think | 18:40 |
mnaser | with update-test-platforms, what's *actually* the thing that adds the "autogenerated" bit into it | 18:40 |
mnaser | i seriously don't see anywhere that adds the comment block that says "autogenerated" | 18:40 |
mnaser | if i manually add it and re-run it, the script wipes it | 18:41 |
mnaser | and i dont see where the script is actually generating it | 18:41 |
clarkb | mnaser: I think it's based on a tag? | 18:42 |
mnaser | clarkb: yeah that part works fine, i can get the jobs and project section to generate, what's 'breaking' is the comment saying it's autogenerated is disappearing | 18:42 |
mnaser | aka this https://www.irccloud.com/pastebin/dzvxhAGM/ | 18:43 |
clarkb | oh that I don't know | 18:43 |
clarkb | corvus: ^ set that up and may know | 18:43 |
corvus | apparently the autogenerated message is not autogenerated; it was manually added. [i did not do that] | 18:44 |
*** rfolco has joined #zuul | 18:44 | |
mnaser | weirdly enough it's removing it when i manually add it above the project section (but other files don't have it removed) | 18:44 |
corvus | mnaser: you'll need a recent version of ruamellib to avoid having it removed | 18:45 |
corvus | mnaser: i found the version in bionic is insufficient, so i had to make a venv and install ruamel from pip | 18:45 |
mnaser | i used tox -eupdate-test-platforms which seems to have ruamel.yaml>=0.16.7 | 18:45 |
corvus | (sorry, not ruamellib, just ruamel) | 18:45 |
mnaser | aa | 18:45 |
corvus | oh neat, that's new to me | 18:45 |
corvus | hrm | 18:45 |
corvus | then i would expect that tox to work | 18:45 |
mnaser | and it actually initially worked, but it put the comment above the jobs that were auto generated | 18:46 |
mnaser | so i moved it and reran it and it wipes it | 18:46 |
mnaser | i mean, it's not a big deal, i think. but it is confusing and curious as to why | 18:46 |
mnaser | update-test-platforms installed: ruamel.yaml==0.16.10,ruamel.yaml.clib==0.2.0 | 18:47 |
corvus | that is curious; i don't immediately know. it may be that the comment is associated with a line that is getting removed/replaced and so it's going with it | 18:47 |
*** rfolco has quit IRC | 18:48 | |
corvus | 0.16.7 is the version i last used; not sure if it would behave any differently | 18:48 |
corvus | (i doubt it, but might be worth a check) | 18:48 |
fungi | yeah, seems likely the comment was added manually and was getting preserved by ruamel-yaml since it preserves comments and ordering | 18:49 |
fungi | but it does have funny behaviors around associating comments with lines of yaml data | 18:50 |
mnaser | if anyone has a few minutes, i'd appreciate some help with the test i wrote.. i _thought_ it was going to work but it clearly didn't -- so maybe there's something else i'm missing naively? https://review.opendev.org/#/c/713469/ | 18:59 |
mnaser | fwiw i think it's the right fix (based on the traceback: http://paste.openstack.org/show/790797/), i dont think im testing it properly | 19:00 |
corvus | mnaser: i'll look after the infra meeting | 19:02 |
mnaser | thank you corvus ! | 19:03 |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: Improve ensure-tox role https://review.opendev.org/708642 | 19:06 |
*** SpamapS has quit IRC | 19:12 | |
*** bhavikdbavishi has joined #zuul | 19:16 | |
*** jamesmcarthur has quit IRC | 19:16 | |
*** jamesmcarthur has joined #zuul | 19:18 | |
*** jamesmcarthur has quit IRC | 19:22 | |
*** SpamapS has joined #zuul | 19:24 | |
*** gmann_afk is now known as gmann | 19:25 | |
*** openstackgerrit has quit IRC | 19:33 | |
*** michael-beaver has quit IRC | 19:33 | |
*** jamesmcarthur has joined #zuul | 19:34 | |
*** openstackgerrit has joined #zuul | 19:36 | |
openstackgerrit | Mohammed Naser proposed zuul/zuul-jobs master: DNM: Add support for installing python with pyenv https://review.opendev.org/704266 | 19:36 |
openstackgerrit | Mohammed Naser proposed zuul/zuul-jobs master: DNM: Add support for installing python with pyenv https://review.opendev.org/704266 | 19:37 |
openstackgerrit | Mohammed Naser proposed zuul/zuul-jobs master: DNM: Add support for installing python with pyenv https://review.opendev.org/704266 | 19:40 |
openstackgerrit | Merged zuul/zuul-jobs master: Improve ensure-tox role https://review.opendev.org/708642 | 19:46 |
*** jamesmcarthur has quit IRC | 19:46 | |
*** jamesmcarthur has joined #zuul | 19:46 | |
mnaser | mordred: https://review.opendev.org/#/c/704266/ is now testing on all platforms and seems to build successfully on all of them (except for gentoo, i'll have to figure out how to use emerge and try out what's missing) | 20:02 |
mordred | mnaser: cool! | 20:03 |
mnaser | i was wondering -- what do you think about an idea of that role setting zuul_python_path and then other python roles using that if it's defined instead of mucking around with the path | 20:03 |
mnaser | now that i type this out though i realize that it's possible someone will use 'command' or 'shell' with ensure-python and that would make it useless | 20:03 |
mnaser | hmm, apparently we can call python-build $version /usr/local and it would install it globally on the system | 20:05 |
mnaser | i don't know if that's an approach we want to take though ... but realistically also how different is that from doing an apt install pythonX-dev | 20:05 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add TLS support for ZooKeeper https://review.opendev.org/712531 | 20:07 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Use ZK TLS in quickstart https://review.opendev.org/712817 | 20:07 |
*** jamesmcarthur has quit IRC | 20:08 | |
corvus | mnaser: left comment on 713469 | 20:09 |
*** jamesmcarthur has joined #zuul | 20:10 | |
mnaser | thanks | 20:12 |
openstackgerrit | Mohammed Naser proposed zuul/zuul master: Display clean error message for missing secret https://review.opendev.org/713469 | 20:12 |
*** jamesmcarthur has quit IRC | 20:15 | |
*** jamesmcarthur has joined #zuul | 20:16 | |
openstackgerrit | Mohammed Naser proposed zuul/zuul-jobs master: DNM: Add support for installing python with pyenv https://review.opendev.org/704266 | 20:33 |
mordred | corvus: any reason we should not land this: https://review.opendev.org/#/c/713060/ ? | 20:35 |
openstackgerrit | Mohammed Naser proposed zuul/zuul-jobs master: DNM: Add support for installing python with pyenv https://review.opendev.org/704266 | 20:46 |
openstackgerrit | Mohammed Naser proposed zuul/zuul-jobs master: DNM: Add support for installing python with pyenv https://review.opendev.org/704266 | 20:56 |
openstackgerrit | Mohammed Naser proposed zuul/zuul-jobs master: Add support for installing python with pyenv https://review.opendev.org/704266 | 20:56 |
openstackgerrit | Mohammed Naser proposed zuul/zuul-jobs master: Add support for installing python with pyenv https://review.opendev.org/704266 | 20:57 |
corvus | mordred: i think we're good, i'll do that now | 20:58 |
mordred | cool | 20:59 |
openstackgerrit | Mohammed Naser proposed zuul/zuul master: Display clean error message for missing secret https://review.opendev.org/713469 | 20:59 |
*** rlandy is now known as rlandy|afk | 21:02 | |
openstackgerrit | Mohammed Naser proposed zuul/zuul-jobs master: Add support for installing python with pyenv https://review.opendev.org/704266 | 21:04 |
*** jamesmcarthur has quit IRC | 21:04 | |
mordred | good job mnaser - that looks great | 21:06 |
mnaser | mordred: thanks, those will unblock us and will allow us to start adding more jobs to cherrypy to be tested using the same base os :) | 21:07 |
mordred | mnaser: did noonedeadpunk figure out how to pre-cache the pyenv versions? like - if you just build a pile of them but don't install in /usr/local - is that final build step really quick? | 21:07 |
mordred | or is that still WIP? | 21:08 |
mordred | (mostly just curious) | 21:08 |
mnaser | mordred: i think last thing we were at was starting to write a dib element, but i think that's a bit newer for them | 21:08 |
*** jamesmcarthur has joined #zuul | 21:08 | |
mnaser | mordred: ill let noonedeadpunk answer tho but i dont know if he's still around now :) | 21:08 |
mnaser | i think my idea was that we can do this in parallel and the latter is just a speed up | 21:08 |
mordred | cool. I mean - that's an optimization ... yeah | 21:08 |
mordred | exactly | 21:08 |
mordred | my _hunch_ is that it'll use what it has built before | 21:09 |
mnaser | mordred: yep, thats what i think too.. | 21:10 |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Convert diskimages to slots https://review.opendev.org/713381 | 21:10 |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: diskimage: make name primary key https://review.opendev.org/713382 | 21:10 |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Add parent and abstract flags for diskimages https://review.opendev.org/713157 | 21:10 |
*** jamesmcarthur has quit IRC | 21:13 | |
mordred | mnaser: nope. at least not in local testing | 21:14 |
mnaser | mordred: urgh | 21:14 |
mordred | mnaser: I have 3.7.4 installed in ~/.pyenv like "normal" | 21:14 |
mordred | and I did python-build 3.7.4 /usr/local and it decided it needed to download something | 21:14 |
mordred | mnaser: oh! there's a "--keep" option for keeping the source tree | 21:15 |
mordred | let me try something | 21:15 |
mnaser | also i need to fix my assert | 21:15 |
mnaser | complains that python_version is not defined if it's not | 21:15 |
openstackgerrit | Mohammed Naser proposed zuul/zuul-jobs master: Add support for installing python with pyenv https://review.opendev.org/704266 | 21:16 |
* noonedeadpunk reads back | 21:16 | |
mnaser | also something i found out is that we run ensure-tox _before_ ensure-python ? | 21:16 |
mordred | mnaser: ensure-tox include_roles: ensure-python | 21:16 |
mordred | if you have a variable set | 21:17 |
mnaser | mordred: yes but it includes it _after_ installing tox | 21:17 |
mordred | oh. wait | 21:17 |
mordred | yeah | 21:17 |
mordred | wow | 21:17 |
mnaser | which means in the case of zuuls py37 tests | 21:17 |
mordred | that seems like an easy enough bug to fix | 21:17 |
mnaser | /usr/local/lib/python3.6/dist-packages/tox/config/__init__.py:593: UserWarning: conflicting basepython version (set 3.6, should be 3.7) for env 'py37';resolve conflict or set ignore_basepython_conflict | 21:18 |
mordred | I can't imagine that's in that order on purpose | 21:18 |
mnaser | so i suspect that tox-py37 in zuul/zuul-jobs right now is actually... tox-py36 | 21:19 |
noonedeadpunk | So I've just started work on element, but not yet finished. | 21:19 |
mordred | I tried python-build --keep and it did not make any difference - I installed into one dir, then ran it again installing to a different dir and it started downloading the tarball again | 21:20 |
mnaser | bleh | 21:20 |
mnaser | so much for this ambitious plan | 21:20 |
mordred | maybe this is a thing where we need to send in a PR upstream to get them to understand that we'd like to do this thing | 21:20 |
mnaser | mordred: for what it's worth tho, python is installing in 1m25s | 21:21 |
mordred | cool! | 21:22 |
mordred | so maybe not a huge deal anyway | 21:22 |
mnaser | i thought it would be a lot worse | 21:22 |
noonedeadpunk | Yeah, when I tried it locally it was even up to 3min | 21:23 |
mnaser | noonedeadpunk: what was the instance size? i wonder if maybe it's because there's 8 cores on the opendev systems | 21:23 |
noonedeadpunk | it was 4 cores | 21:23 |
noonedeadpunk | and like 4gb ram iirc | 21:24 |
mnaser | maybe 8/8 makes a difference | 21:24 |
mnaser | we can try that out | 21:24 |
*** dtroyer has quit IRC | 21:26 | |
*** dtroyer has joined #zuul | 21:27 | |
*** dtroyer has left #zuul | 21:27 | |
openstackgerrit | Mohammed Naser proposed zuul/zuul-jobs master: Add support for installing python with pyenv https://review.opendev.org/704266 | 21:28 |
mnaser | last attempt, hopefully. | 21:28 |
fungi | fwiw, i build a variety of versions of cpython from source just using its native build toolchain, and you could totally save the `make altinstall` step for job runtime, would only require a few minutes. we could even tar up the build tree and just fetch it from our mirror network like we do with wheels, so it's not bloating our node images | 21:30 |
fungi | er, only a few seconds for running `make altinstall` from the build dir | 21:30 |
mnaser | fungi: i was thinking of that approach too, but thought that it started to be a little too much "walking our own path" | 21:31 |
fungi | i have 9 versions of python right now, total install size is 2gb, so baking a lot of them in does definitely increase the image size a fair amount | 21:32 |
openstackgerrit | Mohammed Naser proposed zuul/zuul-jobs master: Add support for installing python with pyenv https://review.opendev.org/704266 | 21:32 |
mnaser | fungi: if we bake them in, it seems like there's no quick way to reuse them or switch between them, not natively with pyenv anyways | 21:33 |
fungi | what do you mean switch between them? | 21:33 |
fungi | if you want python 3.6 you call python3.6 and if you want python 3.8 you call python3.8 | 21:34 |
fungi | `make altinstall` lets you specify a completely separate libtree | 21:34 |
fungi | so they can all exist in parallel in harmony, no problem | 21:34 |
fungi | also i don't know that it's really walking our own path if we're using the exact build steps recommended by the upstream cpython maintainers | 21:35 |
fungi | but maybe i don't find them as esoteric and mystical as some do | 21:35 |
*** jamesmcarthur has joined #zuul | 21:35 | |
mnaser | fungi: oh i see what you mean | 21:36 |
mnaser | i guess what remains is what "python3" means | 21:36 |
fungi | (it really is just `configure;make;make altinstall` but with some optimization flags passed) | 21:36 |
mnaser | in a system like that | 21:36 |
fungi | right, i don't think we need to define what the default python3 is, we can leave that for distro python | 21:36 |
openstackgerrit | Merged zuul/nodepool master: Revert abitrary uid support https://review.opendev.org/713060 | 21:36 |
fungi | but that's just my opinion | 21:37 |
mnaser | fungi: i actually like that idea | 21:37 |
mordred | could also just make a symlink | 21:37 |
mordred | if you _did_ want to set a default | 21:37 |
mnaser | and that way folks who use tox can just make sure they have the right env name or basepython that points to specific things | 21:37 |
mordred | yeah | 21:37 |
fungi | right, you `make altinstall` to some specific tree on the filesystem and then symlink /usr/local/sbin/python3.6 to that python3.6 you built | 21:38 |
mnaser | i personally don't care too much in my case of having an image that's 2gb bigger but saving on ~2m of build time *and* a download | 21:38 |
fungi | er, /usr/local/bin not sbin | 21:38 |
mordred | yeah | 21:38 |
fungi | but you get the point | 21:38 |
mordred | also - for you mnaser, you really probably only care about 3.5, 3.6, 3.7 and 3.8 right? | 21:38 |
fungi | in opendev we'd probably provide half as many as i'm installing on my workstation, yeah | 21:39 |
mordred | so you don't quite need the fungi 9 versions | 21:39 |
mordred | in fact ... | 21:39 |
mnaser | technically don't even care about 3.7 too cause that comes with buster natively | 21:39 |
fungi | i have latest point releases of 2.7, 3.3-3.8, latest alpha of 3.9 and also a 3.10 which is really just current master | 21:39 |
mordred | mnaser: if you go that route - you could really do the make altinstall of all of them (don't save it for runtime) - and just install all 4 versions then clean up the source/build tree | 21:40 |
mordred | so you'd have a /usr/local/python3.6 and a /usr/local/python3.7 (for instance) - and then symlinks into /usr/local/bin and you're good | 21:40 |
fungi | yep, worst case save the symlinking for a common role in the jobs | 21:40 |
mordred | yeah. but if you skipped doing /usr/local/bin/python3 ... | 21:41 |
fungi | so they don't appear in the default path unless requested | 21:41 |
mordred | you could totally pre-symlink totally in dib | 21:41 |
*** jamesmcarthur has quit IRC | 21:41 | |
mordred | you could have ensure-python have the ability to symlink one of them to /usr/local/bin/python3 for setting a default | 21:41 |
mordred | in case that's important for jobs | 21:41 |
mnaser | hrm | 21:42 |
fungi | the risk i see there is if you build, say, python3.7 from source into the ubuntu-bionic image and symlink it at /usr/local/bin/python3.7 such that it winds up shadowing the /usr/bin/python3.7 provided by the distro | 21:42 |
*** tjgresha has joined #zuul | 21:42 | |
fungi | then folks wind up testing from-source 3.7 if they meant to test distro 3.7 | 21:43 |
mnaser | fungi: right, unless we do python-only images | 21:43 |
mordred | yeah. but that might also be a thing you decide you _want_ to do | 21:43 |
mordred | what mnaser said | 21:43 |
fungi | ahh, dedicated images, i see | 21:43 |
mnaser | but then the complication is its not trivial to use zuul/zuul-jobs unfortunately | 21:43 |
mordred | well ... maybe it's still fine | 21:43 |
mnaser | because how do we override tox-py36 to use a specific image.. | 21:43 |
mnaser | (or tox for example) | 21:44 |
mordred | oh - I see what you mean | 21:44 |
fungi | if ensure-python just checks that the requested python is available in the default execution path before taking action, it should be fine? | 21:44 |
mordred | no - for mnaser's use case for vexxhost - if he build a python base image that had all the pythons | 21:44 |
mordred | but still wanted tox-py36 to work | 21:44 |
mordred | how would he assign it to a node | 21:44 |
fungi | do we specify node labels/names in zuul/zuul-jobs? | 21:44 |
mnaser | nope | 21:44 |
fungi | i thought we wanted to avoid that so people could run them on whatever nodes they wanted | 21:45 |
mordred | that's right | 21:45 |
corvus | mordred: an image with *all* the pythons, or an image for each python? | 21:45 |
mordred | all | 21:45 |
*** jamesmcarthur has joined #zuul | 21:45 | |
mordred | fungi: but our way of doing that is that tox-py36 just runs on "the default" | 21:45 |
mnaser | yeah, it'd be nice if we could "template" nodesets but that's a pretty big change i think | 21:46 |
mnaser | nodeset: "{{ python_build_image | default(base_image) }}" | 21:46 |
mnaser | or default(omit) | 21:46 |
mordred | mnaser: well - let's step back to a previous thought - if you don't do any build-time symlinking | 21:46 |
corvus | there's two approaches here: 1) make tox-py36 work in a variety of approaches; 2) inherit from tox-py36. if vexxhost wants to use a specific image for tox-py36, then that's where a job like "vexxhost-tox-py36" or whatever would come into play | 21:47 |
mordred | but just put each python into a dir in /usr/local | 21:47 |
mordred | then add support to ensure-python for symlinking from a /usr/local structure instead of using pyenv | 21:48 |
mordred | you could just put the pythons into your normal base image - not worry about screwing people otherwise - and set the "please symlink me" flag in a site var or something | 21:48 |
mnaser | corvus: ideally, i'd like it so that users of zuul dont have to go read some "vexxhost ci" documentation and instead can refer to zuul's docs (that we can help improve if something in zuul-jobs is missing). that may be me tying too many things together | 21:48 |
mnaser | mordred: i like that approach, that seems like the one that probably has the least effect overall | 21:49 |
corvus | mnaser: i agree that's ideal | 21:49 |
fungi | for that i think we need some way of telling ensure-python how/where to get its python in a specific environment and use the default node | 21:49 |
*** jamesmcarthur has quit IRC | 21:49 | |
mordred | yeah | 21:49 |
*** jamesmcarthur has joined #zuul | 21:50 | |
mordred | or - just define an interface - "look for python{{ python_version }} in {{ python_install_base | default(/usr/local) }}" | 21:50 |
mnaser | but yeah, maybe ensure-python can have a "look for pythons" var that it checks if /usr/local/python{{ python_version }} exists and symlinks from there | 21:50 |
mordred | mnaser: jinx | 21:50 |
mnaser | aha :) | 21:50 |
corvus | i have read scrollback, but i'm still not certain i understand the premise of the current convo; is it basically that while the pyenv approach works, it's not efficient, and we'd like to optimize it by having many pythons pre-installed on the image, but if we do that, how do we get tox using the right preinstalled python? | 21:51 |
mnaser | corvus: i guess if ensure-python does symlinks, it can symlink everything inside python3.X/bin/ which would also symlink pip into /usr/local/bin/pip | 21:52 |
mnaser | so when we run pip install tox in ensure-tox, it'll already use that specific python | 21:52 |
mordred | corvus: yes - I think that's the gist | 21:53 |
mordred | FWIW - this is almost never the answer, but we could use https://www.gnu.org/software/stow/manual/stow.html | 21:53 |
fungi | i still think it ought to be viable to have a periodic job build your desired pythons for your default image, stash tarballs of them somewhere, then splat them onto the node when requested | 21:53 |
fungi | but that's a tradeoff in network usage vs image size | 21:54 |
corvus | mordred: nice... i mean, that program was designed *exactly* for this use case :) | 21:54 |
mordred | fungi: yah - I think the biggest problem would be adding support to that to zuul-jobs | 21:54 |
fungi | right, i guess it gets back to the same problem, it'd be an inherited job which adds a new role | 21:55 |
mordred | corvus: yeah - and then we could just have "get_python_from_stow" - which is already a defined thing so makes _total_ sense for zuul-jobs to support | 21:55 |
corvus | fungi: that approach is nice, but adds a dependency on an external service | 21:55 |
mordred | and in fact - isn't python specific - so could be a pattern people could in general use to support pre-installation of things into nodes | 21:55 |
mordred | (also, stow is already in debian - so apt-get install stow gets you the tool) | 21:56 |
mnaser | old but https://bugs.python.org/issue19968 | 21:56 |
fungi | corvus: sure, the alternative adds a lot of in-image data for files which might not be accessed often. it's like when in opendev we decided to stop pre-downloading distro packages into the images and instead rely on nearby mirror servers | 21:56 |
mordred | mnaser: wow | 21:57 |
mordred | mnaser: oh - you know - I think that's not so much of an issue | 21:58 |
mnaser | i dont know anything about stow but i just ran into that searching "python" and "stow" | 21:58 |
mordred | yeah - so - in this case what the person wants isn't as important to us | 21:58 |
fungi | right, we wouldn't want to move it to a different path than it was built for | 21:59 |
fungi | or at least that doesn't seem like a necessary part of the solution anyway | 21:59 |
mordred | it's _fine_ if sys.path is /usr/local/stow/python3.6 as long as /usr/local/bin/python3.6 exists and is a symlink to /usr/local/stow/python3.6/bin/python | 21:59 |
mnaser | also wrt fungi comments about light image vs downloading things -- i think whatever approach we should take should be something that can be documented by "This is how you can run it and these are the jobs you need to do it" (similar to how container jobs + intermediate registry + how to write the base jobs is documented) | 21:59 |
mordred | ++ | 21:59 |
mnaser | so i'm not opposed to that approach, i just don't wanna end up doing something alone bc i know opendev will probably have to end up looking at solving the same problem and so will other users | 22:00 |
mnaser | so aligning would be nice | 22:00 |
mordred | well - actually - stow solves the issue _one_ way and will make ensure-python-> make python{{version}} /usr/local/bin/python3 work fine | 22:00 |
mordred | but it _won't_ result in /usr/local/bin/python3.{5,6,7,8} - because that's not what it's good at - for supporting doing that I think we'd want to define our own symlinking | 22:01 |
mordred | but I don't think we actually need that | 22:01 |
fungi | yeah, i don't think one or the other is necessarily the "right way" just noting that it's a tradeoff and there will always be a lot more things which users need semi-frequently but not often enough to warrant embedding in images | 22:01 |
mordred | as much as we want to ensure that a particular job has access to the particular python it's looking for | 22:01 |
mordred | so I *think* it's fine for zuul-jobs normal assumptions | 22:01 |
mordred | for a provider like vexxhost with a multi-tenant zuul - building larger images with all the pythons, rubys, golangs, whatev in them - but that are known to be copy-on-write anyway might be a better tradeoff than build-time downloading or installing | 22:03 |
fungi | in openstack, users mostly want "any old python interpreter" or "the version of python provided by this system" but less frequently "some specific version of python" so need to work out where to draw the line in providing that in images | 22:03 |
mordred | and if we get this working with stow - we could extend the pattern to ruby, golang, whatev - and providers that don't want to build such images don't have to | 22:03 |
openstackgerrit | Mohammed Naser proposed zuul/zuul master: Display clean error message for missing secret https://review.opendev.org/713469 | 22:04 |
mnaser | mordred: ok so i guess the pattern might be something like an element that builds into stow (hopefully inside dib) and then maybe we can make a load-from-stow role that gets included inside python so we can reuse the bits | 22:05 |
mordred | yeah | 22:06 |
mordred | *hand wave* | 22:06 |
mnaser | ok well i think for now the pyenv approach might be good enough and this seems like something we can optimize on eventually too | 22:07 |
mordred | but I think it's at least worth thinking about for a few minutes - because it might give you a tool to solve this not only for python and in zuul-jobs - but broadly for the same pattern of problem | 22:07 |
mordred | mordred: ++ | 22:07 |
mordred | gah | 22:07 |
mordred | mnaser: ++ | 22:07 |
mnaser | mordred: i agree, i think it's a good thing to figure out a pattern | 22:08 |
corvus | new subject: i'm wondering if we can put the pull-from-buildset-registry role in the base jobs, so that it can run in any job. that way we can have container-using jobs which don't build images "require" other images and get them pulled. right now, only jobs that either run a buildset registry or build an image pull images. but zuul-quick-start, for example, does neither. | 22:12 |
corvus | there are two cases to cover: that ^ is one of them; in that case the registry is run by another job (one that this job depends on) and so we know that as soon as this job starts, the buildset registry is running. the other case is that this is either a buildset registry job or an image build job, in which case, there will not be a registry running at the start of the job; in that case we would just have to | 22:13 |
corvus | run the role again later in the job once the registry is running. | 22:13 |
mordred | hrm | 22:17 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Use ZK TLS in quickstart https://review.opendev.org/712817 | 22:17 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Move zuul-quick-start requires to zuul-build-image https://review.opendev.org/713545 | 22:17 |
mordred | what a great late-in-day question | 22:17 |
mnaser | corvus: i'm trying to follow but i don't see "pull-from-buildset-registry" in zuul/zuul-jobs/roles? | 22:17 |
corvus | er sorry | 22:17 |
corvus | pull-from-intermediate-registry | 22:17 |
mnaser | corvus: so if i understand it would be making a base job which replaces yoursite-buildset-registry (based on https://zuul-ci.org/docs/zuul-jobs/docker-image.html ? ) | 22:21 |
corvus | 713545 is the workaround because of that | 22:21 |
corvus | mnaser: no, i'm thinking just put that role in *the* base job for the site, so that any job automatically gets that. | 22:21 |
mordred | corvus: biggest issue I could see is that would cause it to run before installing docker, no? | 22:22 |
corvus | everything else stays the same | 22:22 |
mordred | would that matter? | 22:22 |
corvus | i think that's okay; it just runs on the executor | 22:23 |
mordred | corvus: oh right. nod | 22:23 |
mordred | yeah | 22:23 |
mordred | I *think* it would be ok | 22:23 |
corvus | the other idea i had was use multiple inheritance for this; so zuul-quick-start could inherit from "almost-base-job-that-runs-pull-from-intermediate-registry" | 22:24 |
*** mattw4 has joined #zuul | 22:24 | |
mnaser | it seems to not be an issue but i think the role might need some cleaning up as it makes assumptions from what i see | 22:25 |
corvus | (strictly speaking, zuul-quick-start doesn't need multiple inheritance for this, but other similar jobs might, so i wanted to think about it generally) | 22:25 |
mnaser | i.e. it's not very noop-y | 22:25 |
*** mattw4 has joined #zuul | 22:25 | |
*** harrymichal has quit IRC | 22:25 | |
mnaser | seems like something that doesn't hurt and makes life easier | 22:25 |
*** mattw4 has quit IRC | 22:25 | |
mordred | yeah | 22:25 |
mordred | and causes things to dwim | 22:26 |
corvus | mnaser: yes probably so | 22:28 |
corvus | right now it either assumes that a previous job started a buildset registry, or this job started a buildset registry. we would need to also support "there is no buildset registry" | 22:28 |
mnaser | yeah | 22:28 |
corvus | but i think if we moved it to the base job, it would make the process pretty intuitive | 22:28 |
mnaser | i'm all for making lives easier | 22:28 |
*** noonedeadpunk has quit IRC | 22:29 | |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Add parent and abstract flags for diskimages https://review.opendev.org/713157 | 22:29 |
mnaser | also pyenv stuff should be good to go: https://review.opendev.org/#/c/704266/13 | 22:29 |
clarkb | re stow thats sort of the packaging agnostic version of nix? | 22:29 |
mordred | yeah. and works via symlinks | 22:30 |
mordred | so it doesn't have what nix does with expressing depend lists | 22:30 |
mordred | it's more a tool to manage symlinks for a bunch of parallel software installed in parallel dirs | 22:30 |
*** noonedeadpunk has joined #zuul | 22:31 | |
openstackgerrit | James E. Blair proposed zuul/zuul master: Don't use JKS with ZK https://review.opendev.org/713340 | 22:40 |
*** rlandy|afk is now known as rlandy | 22:48 | |
clarkb | ianw the image inheritance stack lgtm now | 22:50 |
*** Goneri has quit IRC | 22:50 | |
ianw | cool, i'll fix up the config changes as that illustrates its use | 22:51 |
ianw | project-config i mean | 22:51 |
clarkb | I've also abandoned https://review.opendev.org/#/c/712997/ as the base image change to openssl.cnf should address that | 22:51 |
clarkb | and we can unabandon if that isn't the case | 22:51 |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Move config merge into DiskImage object https://review.opendev.org/713550 | 22:53 |
*** jamesmcarthur has quit IRC | 22:56 | |
mordred | clarkb: https://review.opendev.org/#/c/712495/ | 22:58 |
mordred | is green | 22:58 |
clarkb | mordred: approved | 22:59 |
clarkb | semi related to that is fungi's python3.8 change for nodepool looks like it hit a test failure | 22:59 |
clarkb | I'll reapprove it | 22:59 |
*** jamesmcarthur has joined #zuul | 22:59 | |
*** hashar has quit IRC | 23:04 | |
clarkb | corvus: mordred: https://review.opendev.org/#/c/712544/ is a change I wrote in response to mordred getting that confusing error from zuul | 23:14 |
clarkb | it ended up being a non issue other than the message itself, but I think it may be worth reviewing that to see if we can pull on the thread enough to be less confusing? | 23:14 |
corvus | clarkb: oh thanks, that slipped my mind | 23:15 |
corvus | clarkb: okay, so that's the case where there's an item ahead, but the item ahead's layout is none, and you reckon that happens when the item ahead has an error? | 23:16 |
clarkb | corvus: ya in this case the item ahead was the base config for the tenant I think and it was in error | 23:17 |
corvus | clarkb: the item ahead is a change of some kind | 23:18 |
clarkb | don't we set parent_layout to the base config if there isn't a change ahead? | 23:18 |
corvus | yes | 23:18 |
corvus | that should never be none | 23:18 |
clarkb | oh right | 23:18 |
corvus | if that's the case, we should not land this change and we should look harder into the error | 23:18 |
clarkb | at the time I didn't think there was a change ahead, but I'm not 100% positive of that | 23:19 |
mordred | I'm _fairly_ certain there wasn't ... | 23:19 |
clarkb | ya so this could be a deeper bug | 23:19 |
mordred | oh - actually | 23:20 |
mordred | it is stacked on https://review.opendev.org/#/c/712495/ | 23:20 |
mordred | but it doesn't look like that change ever had an error | 23:20 |
mordred | and certainly not around the time that https://review.opendev.org/#/c/712495 tripped the error | 23:21 |
corvus | 495 is the change that saw the error, what's the change it was stacked on? | 23:22 |
mordred | oh blah | 23:23 |
mordred | https://review.opendev.org/#/c/712489/ | 23:23 |
corvus | there was not a reconfiguration in progress at the time | 23:26 |
corvus | but that did happen right around when 489 was approved and enqueued into gate, and 495 had a new patchset uploaded | 23:27 |
corvus | what if: because 489 went into gate, it superceded the non-live version of 489 which was enqueued into check ahead of 495? | 23:29 |
mordred | corvus: oh - interesting | 23:30 |
clarkb | can they cross pipeline boundaries like that? | 23:30 |
mordred | well - it's that the gate job cancelled the check job since we don't do clean check in zuul | 23:30 |
*** Defolos has quit IRC | 23:30 | |
corvus | well, i think we have the gate pipeline supercede check, so it's supposed to remove live versions of changes in check | 23:31 |
mordred | yeah | 23:31 |
corvus | it's not supposed to do that for non-live versions | 23:31 |
corvus | but i'm just brainstorming possible edge cases | 23:31 |
mordred | yeah | 23:31 |
mordred | maybe I should say yeah again | 23:32 |
corvus | mordred: you uploaded the new patchset of change #2 the same second zuul cleared the verified vote on change #1 because it went into gate | 23:32 |
mordred | corvus: I'm very good | 23:32 |
corvus | mordred: yeah | 23:32 |
openstackgerrit | Merged zuul/nodepool master: Declare support for Python3.8 https://review.opendev.org/712494 | 23:38 |
*** tosky has quit IRC | 23:38 | |
clarkb | I bet that was intentional timing too | 23:38 |
clarkb | "hey let me try and confuse zuul" | 23:38 |
*** jamesmcarthur has quit IRC | 23:44 | |
*** jamesmcarthur has joined #zuul | 23:46 | |
corvus | i don't see the superceded log line around that time, so i don't think that theory holds | 23:51 |
*** jamesmcarthur has quit IRC | 23:51 | |
corvus | it might be worth it to try to recreate in a unit test; enqueue A in gate, hold in node requests, upload patchset 2 of B. | 23:53 |
corvus | clarkb: that's what i'd do next; i'll leave that to you if you want to keep pulling on the thread, or i can look at doing that tomorrow | 23:54 |
clarkb | I doubt I'll get to it today. I hear cries of "I'm hungry" downstairs | 23:54 |
clarkb | I should go sort out dinner | 23:54 |
corvus | i'm eod as well | 23:55 |