Friday, 2021-11-19

ianw	fungi: i'm probably just sensitive because i've been using a self-compiled git lately :)	00:08
fungi	hah	00:14
fungi	that would certainly make me sensitive	00:14
fungi	i agree about just matching on the first two or three version components though	00:14
Clark[m]	Ya I can update the change tomorrow to take the first 3 tuple and emit a warning if we don't determine a version	01:02
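
A minimal Python sketch of the approach discussed above: keep only the leading numeric components of the `git --version` output and emit a warning when no version can be determined. The function name `parse_git_version` and the choice to pad a missing third component with zero are assumptions for illustration, not the actual git-review change.

```python
import re
import warnings


def parse_git_version(output):
    """Return the leading (major, minor, patch) tuple from `git --version` output.

    Hypothetical helper: anything after the third numeric component
    (for example ".rc1.14.g88d915a634" on a self-compiled git) is ignored.
    """
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        # No usable version string: warn rather than fail outright.
        warnings.warn("Could not determine git version from %r" % output)
        return None
    major, minor, patch = match.groups()
    # Assumption: a missing third component is treated as 0.
    return (int(major), int(minor), int(patch or 0))
```

Comparing the resulting tuple against a threshold such as `(2, 34, 0)` then works with ordinary tuple ordering.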
*** rlandy|ruck is now known as rlandy|out		01:52
opendevreview	Merged opendev/system-config master: infra-prod: remove master override steps https://review.opendev.org/c/opendev/system-config/+/818191	02:43
*** pojadhav|afk is now known as pojadhav		04:20
*** ysandeep|out is now known as ysandeep		05:14
*** akahat|rover is now known as akahat|lunch		08:36
*** jpena|off is now known as jpena		08:37
*** akahat|lunch is now known as akahat|rover		09:59
*** ysandeep is now known as ysandeep|afk		11:04
*** rlandy|out is now known as rlandy|ruck		11:11
*** ysandeep|afk is now known as ysandeep		12:22
dtantsur	fungi: to follow up on yesterday's cross-project tests discussion: I've come up with https://review.opendev.org/c/openstack/ironic/+/818553/4/zuul.d/ironic-jobs.yaml and it seems to work as I wanted, even when run on the library.	14:08
fungi	dtantsur: awesome!	14:12
fungi	seems fairlt straightforward too	14:12
fungi	er, fairly	14:12
dtantsur	yeah, zuul for the win :)	14:15
fungi	dtantsur: fwiw, i see other projects omit ansible_user_dir for that: https://opendev.org/openstack/horizon/src/branch/master/.zuul.d/cross-jobs.yaml#L16	14:17
dtantsur	I see. Yeah, I copy-pasted this bit from somewhere in zuul-jobs	14:17
mgariepy	Clark[m], you can release the vm : openstack-ansible-deploy-infra_lxc-ubuntu-focal for https://review.opendev.org/817384 root@198.72.124.136	14:27
fungi	mgariepy: Clark[m]: i've deleted that autohold. thanks!	14:59
mgariepy	thanks a lot fungi	15:00
fungi	my pleasure	15:00
*** rlandy|ruck is now known as rlandy|ruck|biab		15:08
*** ysandeep is now known as ysandeep|out		15:13
clarkb	fungi: I think we should be good to try https://review.opendev.org/c/opendev/system-config/+/816770 next for gerritbot user updates	16:19
fungi	oh, yep, approved. looks virtually identical, i forgot that was separate	16:20
*** rlandy|ruck|biab is now known as rlandy|ruck		16:22
opendevreview	Clark Boylan proposed opendev/git-review master: Fix use of removed --preserve-merges option https://review.opendev.org/c/opendev/git-review/+/818219	16:25
clarkb	fungi: ianw ^ I think that addresses the latest comments on that change	16:26
opendevreview	Clark Boylan proposed opendev/lodgeit master: Update docker image to bullseye and python 3.8 https://review.opendev.org/c/opendev/lodgeit/+/818597	16:31
clarkb	fungi: ^ re bullseye we basically need a bunch of changes like that. For the most part the updates haven't been too bad. Only zuul executors ran into socat behavior changes and nodepool builders had problems with container stuff? Python apps that don't interact with the system much should be easy (like lodgeit)	16:39
fungi	yeah, on the whole i expect there would be no real functional difference	16:39
fungi	occasionally we'll hit things around changes in command-line options	16:40
opendevreview	Merged openstack/project-config master: Retire puppet-senlin - Step 3: Remove Project https://review.opendev.org/c/openstack/project-config/+/817327	16:40
outbrito	G'day folks! Do you happen to know why zuul is not merging and I'm seeing the "submit" button on this change? https://review.opendev.org/c/starlingx/kernel/+/817140/	16:40
outbrito	I see it disabled though	16:40
fungi	clarkb: why did we not need to explicitly declare the older python-builder image though?	16:41
clarkb	fungi: I think that was a bug	16:41
clarkb	outbrito: you should never get a working submit button in gerrit. I think it may be showing you the button because it is submittable but you don't have permissions to do so (only zuul should have permissions for that)	16:42
clarkb	which means we need to figure out why zuul isn't doing that or isn't able to	16:42
fungi	outbrito: it looks like that change is based on an outdated parent	16:43
fungi	its git parent is 816259,2 but someone revised 816259 without rebasing 817140 so now it can't merge	16:44
fungi	if you rebase 817140 onto the master branch at this point it should work	16:44
fungi	816259,3 is what ended up merging	16:44
clarkb	I guess gerrit 3.3 stopped showing you an orange warning for that	16:44
fungi	i think that's what it's trying to signal by putting the (Merged) next to the parent in red	16:45
fungi	normally it would be grey/black	16:45
clarkb	ah yup https://review.opendev.org/c/opendev/system-config/+/816770 shows the dark grey color for its merged parent	16:46
fungi	not a good ui choice for accessibility	16:46
fungi	even something common like red/green color-blindness would make that virtually impossible to notice	16:46
mgariepy	clarkb, fungi would it be possible to have an auto-hold on vms that timeout for one role we have ?	16:46
clarkb	mgariepy: we can only filter by project change or job. Not role if that is what you are asking	16:47
*** marios is now known as marios|out		16:47
mgariepy	so you cloud filter on timeout on a specific patch ?	16:48
fungi	yeah	16:48
mgariepy	could**	16:48
fungi	if "role" here means a particular git repository, that's doable	16:48
clarkb	but it has to be the project that triggered the job not the timeout if that makes sense	16:49
johnsom	Hmm, zuul status is giving me "Something went wrong", is there a restart going on?	16:49
opendevreview	Merged opendev/system-config master: Run matrix-gerritbot with gerritbot user https://review.opendev.org/c/opendev/system-config/+/816770	16:49
clarkb	I too get the something went wrong	16:49
clarkb	and now it is back	16:49
mgariepy	ok i'll think about it and see what we should do.	16:50
clarkb	johnsom: its probably a bug in zuul-web dealing with updating configs/layouts	16:50
clarkb	johnsom: the service itself seems to be fine though.	16:50
johnsom	Or a mis-configured health monitor on the LB pool?	16:50
clarkb	johnsom: it isn't an LB pool	16:50
johnsom	Well, there is your problem. GRIN	16:50
clarkb	everything is active active active active and needs to deal with locks and such properly	16:51
clarkb	and currently your webbrowser talks to a single web frontend	16:51
clarkb	Basically I think it is a bug but only in rendering the info to the end user. The actual zuul processing in the background seems to be happy. And if you wait 30 seconds it resolves itself	16:52
johnsom	Yeah, it looks like the job I was looking for started even though I couldn't see it	16:52
fungi	the zuul-web logs have a bunch of deserialization exceptions	16:53
fungi	json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)	16:54
clarkb	ya so it probably read data before it was properly written. I'm guess we need to add a lock to something to avoid that	16:54
fungi	also kazoo.exceptions.NoNodeError	16:54
clarkb	*I guess	16:54
fungi	or just avoid invalidating the cache until it gets a good read	16:55
fungi	i also see some KeyError: 'change_queues'	16:56
fungi	all of these are potentially the same underlying cause though	16:56
fungi	the KeyError: 'change_queues' usually follows a kazoo.exceptions.NoNodeError though sometimes i see kazoo.exceptions.NoNodeError without the subsequent KeyError	16:57
clarkb	I think change_queues is a db record that is kept separate for performance reasons. It wouldn't surprise me if we aren't handling its specialness properly in zuul web	16:58
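
The "avoid invalidating the cache until it gets a good read" idea above can be sketched in Python as follows. This is not zuul-web code: the class name `LastGoodCache` and the `reader` callable are hypothetical, and the sketch only covers the JSON decoding failure quoted earlier, not the `NoNodeError` or `KeyError` cases.

```python
import json


class LastGoodCache:
    """Hypothetical cache that keeps the last successfully decoded value."""

    def __init__(self, reader):
        # reader: any callable returning the raw JSON text (an assumption).
        self._reader = reader
        self._value = None

    def get(self):
        try:
            fresh = json.loads(self._reader())
        except json.JSONDecodeError:
            # A partial or empty read (e.g. data not fully written yet):
            # keep serving the previous good value instead of failing.
            return self._value
        self._value = fresh
        return fresh
```

An empty read raises `json.JSONDecodeError`, so the caller keeps seeing the previous value until a complete document is read again. Whether serving stale data is acceptable here, compared with adding a lock, is a design question this sketch does not settle.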
outbrito	fungi, will try, tks	17:08
opendevreview	Clark Boylan proposed opendev/system-config master: Switch lodgeit to run under a dedicated user https://review.opendev.org/c/opendev/system-config/+/818606	17:12
clarkb	fungi: ^ that just reported in matrix testing channel. Lets confirm the matrix gerritbot restarted	17:12
clarkb	it hasn't updated its docker compose yet or restarted	17:14
clarkb	I guess it did gerritbot first and I need to be patient :)	17:14
clarkb	oh I see, the deploy job that was running was for a project-config update and since we don't update to master in deploy we weren't running with latest there	17:14
clarkb	we restarted gerritbot for a new project add? But the currently running job is for the matrix-gerritbot update so we should see that update shortly	17:15
fungi	oh, yeah that explains it	17:15
clarkb	ok matrix update failed because we actually rewrite the config with an ansible task and that couldn't overwrite the existing file because it is 644 by root	17:21
clarkb	I'm going to manually chown the file then the next hourly run should sort us out	17:21
clarkb	thats done. I need to find breakfast but will check in on this again later (and the hourly runs should update it I think)	17:23
fungi	oh, and the ansible task runs after?	17:26
clarkb	ya this is a pre step to make a config for the bot	17:29
clarkb	since the bot takes a dhall config but we want to maintain yaml configs for humans	17:29
*** jpena is now known as jpena|off		17:41
opendevreview	Ghanshyam proposed opendev/irc-meetings master: Remove Technical committee office hours https://review.opendev.org/c/opendev/irc-meetings/+/818613	17:50
clarkb	Warning: Could not get or create the default cache directory: <- matrix-gerritbot is unhappy	17:51
clarkb	unfortunately that string doesn't seem to show up in the matrix-gerritbot source so I'm not sure what the default cache directory is	17:55
clarkb	tristanC: ^ do you know what the cache directory is?	17:57
clarkb	my hunch is that HOME=/root here https://github.com/softwarefactory-project/gerritbot-matrix/blob/master/flake.nix#L62 is the problem	18:03
clarkb	and its trying to write to $HOME/.config or some such	18:03
clarkb	I'll push up a partial revert for now	18:03
fungi	could we override $HOME when starting the container?	18:04
clarkb	fungi: yes, we can, but I don't know to what value. I think we should consider only consuming docker images that are built with standard tools. The nix stuff is hard to process	18:05
clarkb	(and build our own matrix-gerritbot image if that is necessary)	18:05
fungi	ahh, yeah	18:05
clarkb	fungi: we could try /tmp maybe	18:05
fungi	using docker to build an image ourselves is probably the most straightforward	18:05
clarkb	fungi: do you want me to try overriding to /tmp by hand before we try the partial revert?	18:06
fungi	i suppose it can't hurt	18:06
clarkb	same error	18:07
fungi	huh, in further exercising the new pep-517 pbr version, i see newer interpreters are complaining that pbr doesn't explicitly close manifests it reads	18:07
clarkb	I've manually done the partial revert (just commented out the user directive in docker-compose.yaml) and it seems happier. I'll push the revert up which should confirm it is happier too	18:09
fungi	thanks	18:09
opendevreview	Clark Boylan proposed opendev/system-config master: Partial revert of matrix-gerritbot user change https://review.opendev.org/c/opendev/system-config/+/818618	18:11
clarkb	that showed up in the testing channel in matrix I think the revert is sufficient	18:12
clarkb	er I mean the partial revert	18:12
clarkb	fungi: I'll happily review/write PBR updates to use with open(foo) as bar: context managers or similar to fix that	18:13
clarkb	just point me at the location	18:13
fungi	yeah, once i have a handle on where it's complaining about, i'll push some up	18:13
clarkb	tristanC: to summarize matrix-gerritbot can't get or create the default cache location when we override the user. It doesn't tell us what the location is. If you can help us understand that better we'd appreciate it	18:13
fungi	i think it's just find_sources and get_version from pbr.packaging where i'm running into it, but i'll try to make sure whatever i put together is comprehensive since i'm sure i'm not exercising every last code path in pbr here	18:15
clarkb	ya I guess we can grep for open() and then update all occurrences easily enough	18:15
fungi	confirmed adding {toxinidir} to deps on all testenvs where i was previously relying on usedevelop is still working too	18:22
fungi	i always worry that just . won't work as expected if tox is called from somewhere which isn't the root of the repo	18:23
clarkb	oh good point	18:24
clarkb	I'll update my example bindep change to do that	18:24
fungi	i added a comment on the bindep change just so we don't forget	18:24
opendevreview	Clark Boylan proposed opendev/bindep master: Try out PBR pep 517 support https://review.opendev.org/c/opendev/bindep/+/816741	18:26
clarkb	that should do it	18:26
fungi	well, good news, the vast majority of open() calls in pbr are already using context managers, and most of the remainder are in tests. i only see one other obvious case besides the two i was hitting in packaging.py	18:26
fungi	oh, in fact i misread, the other one i thought i saw was a subprocess.Popen() it just matched my naive grep for open(	18:29
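
One way to avoid the false positive described above is to anchor the search on a word boundary, so that `Popen(` no longer matches. A minimal Python sketch, with the helper name `lines_calling_open` being an assumption for illustration:

```python
import re

# \b requires a word boundary before "open", so "Popen(" is skipped
# while "open(" and attribute calls such as "io.open(" still match.
OPEN_CALL = re.compile(r"\bopen\(")


def lines_calling_open(source):
    """Return 1-based line numbers in source text that call open()."""
    return [
        number
        for number, line in enumerate(source.splitlines(), 1)
        if OPEN_CALL.search(line)
    ]
```

This is still a textual match, so it also reports `open(` inside comments and strings.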
fungi	one of the two i hit is easy enough to fix, the other will be tricky since it's an open() inside a try/except	18:32
clarkb	fungi: you should be able to use with there or a finally?	18:32
clarkb	(then guard against already closed fd)	18:32
fungi	yeah, i can also explicitly close it in another try i guess	18:32
fungi	maybe i can just combine these two try/except blocks? https://opendev.org/openstack/pbr/src/branch/master/pbr/packaging.py#L825-L832	18:35
fungi	then i can do a with inside the try	18:35
fungi	and catch (IOError, OSError, email.errors.MessageError)	18:35
clarkb	fungi: I would try: with open(filename, 'r') as pkg_metadata: and then catch whatever needs catching from that	18:37
fungi	something like https://review.opendev.org/818622	18:37
clarkb	its always a continue	18:37
fungi	yeah, i suppose i could nest the try blocks as an alternative	18:38
clarkb	no you did what I was thinking about	18:38
fungi	yeah, this seems more concise	18:38
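
The combined try/except with a `with` block discussed above could look roughly like the Python sketch below. It is not the pbr code or the contents of change 818622: the function `read_versions`, its argument, and the `Version` header lookup are hypothetical, chosen only to show a single `try` wrapping the context manager and skipping any file that fails to open or parse.

```python
import email
import email.errors


def read_versions(filenames):
    """Hypothetical helper: collect Version headers from metadata files."""
    versions = []
    for filename in filenames:
        try:
            # The context manager closes the file even if parsing raises.
            with open(filename, 'r') as pkg_metadata_file:
                pkg_metadata = email.message_from_file(pkg_metadata_file)
        except (IOError, OSError, email.errors.MessageError):
            # Unreadable or unparsable file: always skip to the next one.
            continue
        version = pkg_metadata.get('Version')
        if version is not None:
            versions.append(version)
    return versions
```

Because every failure path ends in `continue`, one `except` clause listing all three exception types is enough, and no explicit `close()` is needed.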
fungi	should i fix up the tests to not leave open descriptors too, or will anyone likely care?	18:39
clarkb	I've figured out the rough area of matrix-gerritbot that is having problems. It is after we connect to matrix and validate our session	18:39
clarkb	I think it is the gerrit connection that is breaking because we also log after gerrit connects that matrix is ready	18:40
clarkb	however haskell does delayed execution so this might be flawed analysis	18:42
clarkb	aha I was reading it wrong. I think it is in the joinroom area of the code because we validate session then join room and we don't get room join logs	18:49
fungi	clarkb: aargh, moving target! https://setuptools.pypa.io/en/latest/history.html#v59-1-0	19:00
fungi	"Back out deprecation of setup_requires and replace instead by a deprecation of setuptools.installer and fetch_build_egg. Now setup_requires is still supported when installed as part of a PEP 517 build, but is deprecated when an unsatisfied requirement is encountered."	19:00
clarkb	I guess that means you still have to specify it in pyproject.toml so that things line up	19:01
fungi	yeah	19:01
clarkb	good to know, I think we support that just fine	19:01
clarkb	I'm having a really hard time finding anything that would need a cache directory in matrix gerrit bot so far.	19:01
clarkb	It uses in memory "databases" to store things like room info. It forks `ssh` directly	19:02
clarkb	It might be the matrix library?	19:02
fungi	that certainly seems possible	19:04
clarkb	I appreciate that python tends to do a better job of identifying the origins of log messages	19:05
opendevreview	Merged opendev/system-config master: Partial revert of matrix-gerritbot user change https://review.opendev.org/c/opendev/system-config/+/818618	19:11
clarkb	tristanC: maybe add -prof to the cabal compile options then we can run the executable with -xc for problems like this? I'm not sure what the -prof impact at runtime is but I imagine its small if you have to explicitly set -xc on the executable to get that info back?	19:19
clarkb	I feel like I'm reaching the end of my ability to debug this as I don't intend on pulling in nix to build this image.	19:19
clarkb	it just occurred to me that I could lsof the running process to see what cache it might be opening	19:25
clarkb	I'll do that	19:25
fungi	oh, yep now that it's running	19:26
fungi	assuming it holds an open descriptor to its cache anyway	19:26
fungi	(it may not)	19:26
fungi	but worth a shot	19:27
clarkb	ya I'm not seeing anything that could be the cache	19:27
fungi	so it probably only opens things there on demand	19:27
clarkb	ya its got fd 0 on /dev/null 1 and 2 on pipes some event loop kernel fds and then tcp sockets	19:29
clarkb	nothing looks like an on disk cache	19:29
clarkb	I could probably strace it and hope to filter out the noise somehow to find reads/writes to a cache	19:30
clarkb	but that seems iffy	19:30
clarkb	This has led me to suspecting it might be the prometheus health endpoint that is doing caching as that runs a webserver	19:35
tristanC	clarkb: catching up, let me see if i can reproduce locally	19:46
clarkb	tristanC: I put the exact output in the comments of https://review.opendev.org/c/opendev/system-config/+/818618/1/playbooks/roles/matrix-gerritbot/templates/docker-compose.yaml.j2	19:47
clarkb	my best guess at this point is that it is related to the web server for prometheus metrics. Otherwise I'm not really finding anything that might be trying to cache stuff. But I'm also not finding anything indicating the web server there is caching (based on lsof and my hitting it manually)	19:48
tristanC	clarkb: i see thanks. So the error couldn't be printed because of a missing utf-8 locale, and it would have showed `$HOME/.cache/dhall`	19:48
clarkb	tristanC: ok we overrode $HOME to be /tmp and that didn't help	19:49
clarkb	I would expect that /tmp would be writable by all users on the image but I'm probably making a bad assumption because nix	19:49
clarkb	tristanC: but also that string isn't utf8 it is ascii? shouldn't putchar be fine with ascii?	19:50
tristanC	clarkb: the error message is using utf-8 character	19:50
clarkb	I guess that comes after what I got since it failed	19:52
clarkb	and is in addition to `$HOME/.cache/dhall`	19:52
tristanC	clarkb: ftr it is https://github.com/dhall-lang/dhall-haskell/blob/30f96178fce9d0bcafc74812df73e46fb66febd3/dhall/src/Dhall/Import.hs#L938	19:54
clarkb	it is interesting that google hasn't indexed that string	19:55
clarkb	(I tried googling it several different ways before giving up, probably too much source code out there to index entirely)	19:55
clarkb	tristanC: any idea why setting $HOME to /tmp in the docker-compose.yaml didn't correct this?	19:57
clarkb	(we assumed something might be trying to write to $HOME/.cache which is why we tried that)	19:57
tristanC	clarkb: there is no /tmp in the image	19:58
clarkb	of course not	20:00
clarkb	tristanC: would it be crazy to suggest that using slightly bulkier images that are possible to debug and build locally using normal tools is a good idea?	20:00
clarkb	I appreciate the nix image is super minimal but that makes it very difficult to debug and it uses very specialized tools to do something that doesn't necessarily benefit from that	20:01
clarkb	the image also sets a bash prompt but bash isn't even installed	20:01
clarkb	we should be able to `cabal build` on something like debian right?	20:02
opendevreview	Ghanshyam proposed openstack/project-config master: Retire training-labs: remove project infra https://review.opendev.org/c/openstack/project-config/+/817507	20:03
clarkb	I think OpenDev should probably consider doing that at least.	20:03
clarkb	Then we won't have to worry about /tmp or bash or utf8	20:04
opendevreview	Ghanshyam proposed openstack/project-config master: Retire training-labs: remove project infra https://review.opendev.org/c/openstack/project-config/+/817507	20:07
clarkb	I guess the solution here would be to mount something to /root/.cache with appropriate permissions?	20:07
clarkb	assuming no changes to the image	20:07
clarkb	Hrm but /root is likely to not be o+x	20:07
clarkb	that probably won't work either	20:08
opendevreview	Clark Boylan proposed opendev/system-config master: Give matrix-gerritbot a writeable cache https://review.opendev.org/c/opendev/system-config/+/818627	20:16
clarkb	that seems really hacky but might work?	20:16
opendevreview	Ghanshyam proposed openstack/project-config master: Remove 'publish-training-labs-scripts' definition https://review.opendev.org/c/openstack/project-config/+/818628	20:17
tristanC	clarkb: i can reproduce the error and i'll provide a fix in the image. The host files share the same uid as the container one, right?	20:18
clarkb	tristanC: they are. I'm not sure what we need you to hardcode the uid in the image, we just need to be able to have enough of a normal filesystem that we can bind mount appropriately	20:19
opendevreview	Ghanshyam proposed openstack/project-config master: Remove 'publish-training-labs-scripts' definition https://review.opendev.org/c/openstack/project-config/+/818628	20:19
clarkb	The underlying problem here seems to be we've overly minimized (lack of utf8 locale and lack of expected filesystem locations)	20:19
clarkb	and that is made worse by having a bunch of non standard tools. I am really surprised that dhall needs to write to disk	20:20
clarkb	er * Not sure that we need you to hardcode the uid in the image. s/what/that/	20:20
tristanC	clarkb: i meant with podman, when using the `--user $(id) --volume $HOME/.ssh:/root/.ssh` then the .ssh directory in the container is still owned by root	20:22
clarkb	tristanC: yes because it is under /root	20:22
clarkb	that was my point above we can't mount to /root/.cache. But we could mount to /tmp	20:23
clarkb	or just use /tmp and not cache since it is all ephemeral anyway	20:23
tristanC	clarkb: i meant without `--userns keep-id`, which i assume is what docker set by default	20:23
tristanC	clarkb: you can mount in /root, the folder actually doesn't exist	20:23
clarkb	tristanC: is dhall creating it in that case?	20:23
clarkb	well no it can't be because then it wouldn't error	20:24
clarkb	we error because that directory is not readable	20:24
tristanC	i think it happens when the runtime is creating the parent directory	20:24
clarkb	but the runtime would be the non root user and create it as itself if that were the case	20:24
clarkb	but we have strong evidence that directory is not readable	20:24
tristanC	iirc, when bind mounting to /a/b, if /a doesn't exist it gets created as root by default	20:25
clarkb	ah	20:25
clarkb	the docker runtime not the haskell runtime	20:25
clarkb	tristanC: thinking out loud here: is there any way to tell dhall to not cache to disk?	20:28
clarkb	since this is a container caching to disk is as ephemeral as the process so caching to memory seems fine	20:28
fungi	so with latest pbr i still see setuptools complain about calls to setup.py install, originating from pbr.util.setup_cfg_to_setup_kwargs here: https://opendev.org/openstack/pbr/src/branch/master/pbr/util.py#L407	21:04
fungi	this is the full traceback: https://paste.opendev.org/show/811204	21:08
fungi	clarkb: does that make any sense to you?	21:08
Clark[m]	We might need to avoid instantiating the class to get around the warning? That seems odd though. Does bindep do that?	21:15
Clark[m]	Might be config specific if not? I'm popping out for a bike ride now but can look closer after	21:15
fungi	i'll try to reproduce with bindep in a bit	21:17
fungi	but yeah it seems to happen when pip calls on setuptools to parse setup.cfg	21:18
tristanC	clarkb: in that case, dhall is just issuing a warning that it can't access the cache folder, and the unicode character makes the print fail. But I think the main issue is that the HOME directory is not writable when setting an arbitrary user.	21:29
tristanC	moreover, when using openssh, the .ssh location is resolved through /etc/passwd with a default to `/.ssh`	21:32
tristanC	so i think i know how to slightly adjust the image so that it can work with arbitrary uid	21:33
opendevreview	Tristan Cacqueray proposed opendev/system-config master: Update the gerritbot-matrix image to support arbitrary uid https://review.opendev.org/c/opendev/system-config/+/818645	21:46
tristanC	clarkb: i'm sorry this caused so much trouble and i hope 818645 should enable what you are trying to do.	21:48
tristanC	clarkb: and of course you can use `cabal build` to build the gerritbot-matrix binary, but i think the dockerfile will need a similar trick to support arbitrary uid	21:58
clarkb	tristanC: thanks. Its mostly that I question the utility of some of these decisions as they seem at odds with one another. The minimal image build doesn't seem to get along with dhall (and I guess openssh?)	22:56
clarkb	and if not doing a minimal image build makes sense for the software then I question why use nix to build the image	22:56
clarkb	and I don't think anything prevents us from running cabal in a Dockerfile?	22:58
clarkb	fungi: ok so the issue is that initialize_options is where setuptools raises the deprecation warning because I guess that implies you're calling it on the command line? That surprises me a little, but I think we work backward from that to figure out how to bypass it with pbr	22:59
clarkb	fungi: I think this is only an issue if using https://docs.python.org/3/distutils/extending.html#integrating-new-commands	23:02
clarkb	fungi: I don't think bindep has this problem because it doesn't extend setup.py this way	23:02
clarkb	fungi: do you have a link to the repo you're hitting this with?	23:02
clarkb	but basically cmdclass is deprecated aiui because you have to run setup.py to hit it rather than say build	23:02
clarkb	I think that means this is expected	23:03
tristanC	clarkb: i would say the benefit of nix container is twofold: it declares all the dependencies in a reproducible setting (think base image + bindep + requirements.txt), and sharable layers (each dependency is a single layer)	23:33
clarkb	tristanC: I think the second thing only really matters if you're doing a lot of nix containers right? For example in opendev's case this is our only nix container image so we get all the layers and no deduping for additional images	23:34
clarkb	But you get the deduping using a consistent base image like opendev does anyway	23:34
tristanC	clarkb: it does matter even for a single image, where an update will only pull missing layers	23:35
clarkb	the strict control over all the deps is a neat feature of nix. I'm just not sure if it gets us much here for a simple service like this. Cabal is capable of pinning things too right? then you're only dealing with the distro ghc and openssh	23:35
fungi	clarkb: it's not extending setup.py, and this was just the pip install tox was doing	23:35
clarkb	fungi: I think your setup.cfg sets a cmdclass value	23:36
tristanC	clarkb: and you can build gerritbot-matrix differently if you prefer, but you would need a similar trick for the home user dir so that it can work with arbitrary uid	23:36
clarkb	fungi: and cmdclass extends setup.py and pbr is trying to make that happen	23:36
fungi	clarkb: it's here: https://mudpy.org/gitweb?p=mudpy.git;a=blob;f=setup.cfg;h=1cbd5501ce8ceecf677085c4272c76468dacc015;hb=HEAD	23:36
clarkb	tristanC: yup I'm trying to work that through in my head. I'm beginning to think it might be a reasonable thing for us to do for consistency with our images	23:36
tristanC	clarkb: having all the deps frozen is helpful to ensure the image can build in the far future	23:36
clarkb	tristanC: it also ensures that you're not getting security updates	23:37
tristanC	clarkb: right, so instead of updating a comment in a dockerfile to get a new build, you would update the repository commit instead	23:38
clarkb	tristanC: you'd also need to unpin things	23:38
tristanC	clarkb: here is an example dockerfile we use for another cabal base application: https://github.com/change-metrics/monocle/blob/master/Dockerfile-api	23:38
clarkb	but I guess you can do that in the same commit	23:38
clarkb	tristanC: how did the image update for your change above? I don't see the updated flake.nix in the github repo. Maybe that is just a sync problem though	23:40
tristanC	clarkb: nix flake update is the command to update dependencies, and you can do a tree diff to see what exactly changes	23:41
clarkb	re layer splits for updates. I'm not sure there is a ton of value in that. Yes, you'll avoid some network traffic but again that really only matters if you are doing significant numbers of updates that represent large amounts of data	23:43
clarkb	It is "neat" but I don't think it is necessary when you update an image once a week or less	23:43
clarkb	and only have a handful of images that share those layers	23:43
clarkb	our base debian images with python in them are like 200MB total	23:44
clarkb	If we pull that once a week on a number of servers it isn't a big deal	23:44
clarkb	Basically I'm trying to optimize for simplicity and ease of use. Not for deploying massive amounts of software frequently to large datacenters. There are different needs.	23:45
clarkb	Nix would probably do well if you had hundreds of releases a day hitting tens of thousands of nodes	23:45
clarkb	and you'd accept the complexity and divergence from expected norms as those optimizations become important for you	23:45
clarkb	fungi: thats interesting because the pbr code is executing that path when you've set [global] commands if I'm reading it correctly	23:49
clarkb	and translating that to cmdclass	23:49
clarkb	fungi: what I did in the past was preinstall pbr and then told build to not use isolated build environments. Then I could instrument the pbr installation to sort out what was going on. Might need to do that here	23:52
clarkb	to see what sorts of values are being handled there to work backwards and figure it out	23:52
clarkb	I wonder if you're hitting it in a dependency?	23:52
fungi	i doubt it's a dependency (the dependencies are listed there in the setup.cfg, passlib and pyyaml)	23:54
fungi	but yeah, first i'll try turning on warnings in bindep and see what i can reproduce with it	23:55
