Monday, 2020-03-30

00:14 <fungi> group creation seems to be working in the wake of 715726 so exists now, but,access still hasn't gotten populated
00:15 <fungi> this is our next exception: AttributeError: 'Gerrit' object has no attribute 'username'
00:41 *** dangtrinhnt has joined #opendev
<openstackgerrit> Merged openstack/diskimage-builder master: Add Fedora 31 support and test jobs
01:55 *** dangtrinhnt has quit IRC
01:56 *** dangtrinhnt has joined #opendev
01:57 *** dangtrinhnt has quit IRC
02:04 *** dangtrinhnt has joined #opendev
02:07 *** dangtrinhnt has quit IRC
02:07 *** dangtrinhnt_ has joined #opendev
<openstackgerrit> Ian Wienand proposed zuul/zuul-jobs master: test-upload-logs-swift: revert download script
<openstackgerrit> Ian Wienand proposed zuul/zuul-jobs master: bulk-download : role with script to download all log files
03:37 *** dangtrinhnt_ has quit IRC
03:50 *** dangtrinhnt has joined #opendev
04:08 *** dangtrinhnt has quit IRC
<openstackgerrit> Ian Wienand proposed zuul/zuul-jobs master: local-log-download : role with script to download all log files
04:24 *** dangtrinhnt has joined #opendev
<openstackgerrit> Ian Wienand proposed zuul/zuul-jobs master: local-log-download : role with script to download all log files
04:50 *** ykarel|away is now known as ykarel
<openstackgerrit> Ian Wienand proposed zuul/zuul-jobs master: local-log-download : role with script to download all log files
04:58 *** dangtrinhnt has quit IRC
<openstackgerrit> Ian Wienand proposed zuul/zuul-jobs master: local-log-download : role with script to download all log files
05:21 *** ykarel is now known as ykarel|afk
<openstackgerrit> Ian Wienand proposed zuul/zuul-jobs master: local-log-download : role with script to download all log files
05:40 *** ykarel|afk is now known as ykarel
<openstackgerrit> Ian Wienand proposed zuul/zuul-jobs master: local-log-download : role with script to download all log files
05:53 *** DSpider has joined #opendev
06:03 <ianw> corvus: ^ i think this is more what you were thinking?
<openstackgerrit> OpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml
06:23 *** dpawlik has joined #opendev
<openstackgerrit> Merged openstack/project-config master: Normalize projects.yaml
07:25 *** tosky has joined #opendev
07:25 *** ysandeep|rover is now known as ysandeep|rover|l
07:34 *** rpittau|afk is now known as rpittau
07:53 *** ralonsoh has joined #opendev
08:34 *** ysandeep|rover|l is now known as ysandeep|rover
09:32 *** ykarel is now known as ykarel|lunch
<openstackgerrit> Sorin Sbarnea proposed zuul/zuul-jobs master: Allow configure-mirrors to enable extra repos
<openstackgerrit> Sorin Sbarnea proposed zuul/zuul-jobs master: Allow configure-mirrors to enable extra repos
<openstackgerrit> Sorin Sbarnea proposed zuul/zuul-jobs master: Improve job and node information banner
<openstackgerrit> Sorin Sbarnea proposed zuul/zuul-jobs master: Avoid confusing rsync errors when source folders are missing
<openstackgerrit> Sorin Sbarnea proposed openstack/diskimage-builder master: Validate virtualenv and pip
<openstackgerrit> Sorin Sbarnea proposed openstack/diskimage-builder master: Validate virtualenv and pip
10:15 *** ykarel|lunch is now known as ykarel
10:56 *** rpittau is now known as rpittau|bbl
11:46 <zbr> who can help me do a few abandons, like ?
11:51 *** ysandeep|rover is now known as ysandeep|rover|b
12:02 *** lpetrut has joined #opendev
12:07 <AJaeger> zbr: might be best if you give repository names. For elastic-recheck, I cannot help; for those where I'm core, I'm happy to...
12:13 *** ysandeep|rover|b is now known as ysandeep|rover
<openstackgerrit> Merged openstack/project-config master: Add Shrews to alumni
<openstackgerrit> Merged openstack/project-config master: Replace python-charm-jobs to py3 job
12:46 *** rpittau|bbl is now known as rpittau
<openstackgerrit> Grzegorz Grasza proposed openstack/project-config master: Add ability to push signed tags to tripleo-ipa
<openstackgerrit> Sorin Sbarnea proposed zuul/zuul-jobs master: tox: allow tox to be upgraded
<openstackgerrit> Monty Taylor proposed opendev/jeepyb master: Username is on the connection object
13:04 <mordred> fungi: ^^
13:04 <mordred> fungi: I think that should fix the most recent issue
<openstackgerrit> Sorin Sbarnea proposed zuul/zuul-jobs master: install-docker: allow removal of conflicting packages
13:07 <mordred> fungi, clarkb: I'm kind of confused as to why that just broke
13:08 <fungi> i'm also confused as to how the acl eventually got applied
13:08 <fungi> maybe that's always been broken?
13:08 <mordred> I don't see any *recent* changes that would have impacted that - but maybe we just hadn't updated gerritlib in a long time or something
13:08 <fungi> oh, yeah, could be newer gerritlib
13:09 <mordred> fungi: I look forward to a future where we have manage-projects (or something) running as a zuul job and not as a cron pulse so that these logs can be more evident and visible to people like AJaeger and mnaser
13:11 <fungi> oh, indeed, the group got created with the project creator as a member
13:11 <fungi> i'll clean that up too
13:11 <mnaser> yep, that would be awesome mordred
13:13 <mordred> fungi: so maybe this has been broken for a while but we never noticed? and maybe project creator is a member of a bunch of groups and we didn't notice? :)
13:14 <fungi> mordred: oh, i think i see why... manage-projects ran and created the new group that new acl needed, but raised AttributeError trying to clean up the initial group membership, so we skipped the remainder of that project setup; then on the next go-round it saw the group already existed, so didn't try to create it and just pushed the new acl
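The two-pass behavior fungi describes can be sketched as a small simulation (all names here are hypothetical; the real logic lives in jeepyb's manage-projects):

```python
# Hypothetical sketch of the eventual-consistency behavior described above:
# run 1 creates the group but crashes before pushing the ACL; run 2 sees the
# group already exists, skips creation, and finishes the remaining steps.

class FakeGerrit:
    def __init__(self):
        self.groups = set()
        self.acls = set()

def ensure_project(gerrit, group, acl, broken=False):
    """One manage-projects pass; returns True if all steps completed."""
    if group not in gerrit.groups:
        gerrit.groups.add(group)
        if broken:
            # simulates the AttributeError raised while cleaning up the
            # initial group membership, which aborted the rest of the setup
            return False
    gerrit.acls.add(acl)
    return True

g = FakeGerrit()
first = ensure_project(g, "foo-core", "foo-acl", broken=True)   # partial run
second = ensure_project(g, "foo-core", "foo-acl")               # completes
print(first, second, g.acls)  # False True {'foo-acl'}
```

The project ends up correctly configured, but only because the second cron pulse retried the idempotent steps.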
13:14 <mordred> fungi: yay for eventual consistency
13:17 <fungi> mordred: so... this is the only match on AttributeError in any manage_projects.log for the past month's retention
13:17 <fungi> leading me to suspect the new gerritlib theory is correct
13:18 <mordred> we made a new gerritlib release in jan - but I think it had been a _while_
13:18 <mordred> yeah - 2018 was the previous one
13:19 <mordred> it's entirely possible we haven't added any new groups between jan 28 and now
13:20 <mordred> and yes - I have confirmed - the username change happened between 0.8.1 and 0.8.2
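The kind of attribute move being discussed can be illustrated with a toy example (class and attribute names are illustrative, not gerritlib's actual API): code written against the old layout fails with exactly this AttributeError, while a getattr fallback tolerates both layouts.

```python
# Illustrative only: models an attribute moving from a client object to its
# connection object between two library versions.

class Connection:
    def __init__(self, username):
        self.username = username

class OldGerrit:                      # pre-move layout
    def __init__(self, username):
        self.username = username
        self.conn = Connection(username)

class NewGerrit:                      # post-move layout: username lives
    def __init__(self, username):     # only on the connection object
        self.conn = Connection(username)

def get_username(gerrit):
    # Tolerate both layouts: prefer the client attribute, fall back to
    # the connection object.
    return getattr(gerrit, "username", None) or gerrit.conn.username

print(get_username(OldGerrit("jeepyb")))  # jeepyb
print(get_username(NewGerrit("jeepyb")))  # jeepyb
```

The actual jeepyb fix simply switched to reading the attribute from the connection object.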
13:23 <fungi> possible puppet didn't upgrade gerritlib on review.o.o when we tagged a new release?
13:23 <fungi> and we were still continuing to run much older?
13:24 <fungi> system context pip says we've still got 0.8.1 installed
13:24 <fungi> so maybe we've only run 0.8.2 from docker
13:25 <mordred> this is a very good possibility
13:26 <fungi> which may explain a bunch of these behavior changes
13:35 <fungi> once 715937 merges we can try another project creation change and see if it makes it all the way through without error
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Add job to run manage-projects in zuul
<openstackgerrit> Sorin Sbarnea proposed zuul/zuul-jobs master: install-docker: allow removal of conflicting packages
<openstackgerrit> Monty Taylor proposed openstack/project-config master: Run manage-projects on gerrit related changes
13:47 <mordred> fungi: ^^ there we go - I think those two patches should do the manage-projects run, yes?
<openstackgerrit> Albin Vass proposed zuul/zuul-jobs master: Adds roles to install and run hashicorp packer
14:12 *** ysandeep|rover is now known as ysandeep|away
14:13 *** lpetrut has quit IRC
14:15 *** lpetrut has joined #opendev
14:25 *** ykarel is now known as ykarel|away
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Run manage-projects/base/bridge on system-config changes
<openstackgerrit> Dmitriy Rabotyagov (noonedeadpunk) proposed opendev/lodgeit master: Add lodgeit-db script
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Log manage-projects to stdout
<openstackgerrit> Sorin Sbarnea proposed zuul/zuul-jobs master: Allow configure-mirrors to enable extra repos
<openstackgerrit> Thierry Carrez proposed opendev/system-config master: [gitea] Point to newly-split Getting Started content
14:46 <mordred> ttx: I feel like we just landed a patch that did that ...
14:46 <mordred> ttx: oh - maybe if I actually read your commit message
14:49 <mordred> infra-root: I made this: which I *think* we're actually ready for - but that's obviously a big step, so is worth extra eyeballs
14:52 <AJaeger> mordred: I left a question on the project-config one; there's a slight misuse of promote due to the way we characterized promote. Are we fine with that?
<openstackgerrit> Jeremy Stanley proposed opendev/system-config master: Add a service discussion mailing list for OpenDev
<openstackgerrit> Merged opendev/jeepyb master: Username is on the connection object
14:54 <mordred> AJaeger: yeah - I think in this case it's still decently relevant - it's a thing running after changes merge - but I agree, there's no artifacts to promote in the strict sense. The important bit for promote (other than the files matcher that you mentioned) is that it's supercedent, so we don't wind up running manage-projects 4 times if we land 4 changes, but rather only once
14:55 <mordred> since step 1 of the manage-projects playbook is "update project-config"
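The supercedent behavior mordred refers to can be modeled with a tiny simulation (a sketch of the queueing idea only, not Zuul's implementation): per queue key, at most one item runs and at most one waits, and a newer arrival replaces the waiting one.

```python
# Sketch of supercedent-style queueing: while one item is in flight, newer
# enqueued items for the same target collapse so only the latest one waits.

def supercede(events):
    """Collapse a burst of merge events; return the items that actually ran."""
    running = None
    waiting = None
    ran = []
    for ev in events + [None]:        # None flushes the queue at the end
        if ev is not None:
            if running is None:
                running = ev
            else:
                waiting = ev          # replaces any older waiting item
            continue
        # flush: finish the running item, then the last waiting one
        while running is not None:
            ran.append(running)
            running, waiting = waiting, None
    return ran

# Four changes merge in quick succession; the job runs at most twice (the
# in-flight run plus one superseding run), not four times.
print(supercede(["chg1", "chg2", "chg3", "chg4"]))  # ['chg1', 'chg4']
```

Since the playbook's first step updates project-config to the latest state anyway, the superseded runs would have been redundant.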
14:55 <ttx> mordred: yeah for some reason the gerritbot did not post the infra-manual change here
14:57 <mordred> AJaeger: I say that ... but actually it doesn't - I should add that
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Add job to run manage-projects in zuul
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Run manage-projects/base/bridge on system-config changes
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Log manage-projects to stdout
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Update project-config in manage-projects
<openstackgerrit> Sorin Sbarnea proposed zuul/zuul-jobs master: install-docker: allow removal of conflicting packages
15:07 *** DSpider has quit IRC
15:07 *** DSpider has joined #opendev
15:07 *** yoctozepto has quit IRC
15:08 *** yoctozepto has joined #opendev
15:08 *** osmanlicilegi has quit IRC
15:10 *** osmanlicilegi has joined #opendev
15:20 <clarkb> mordred: if you have a moment is the dependency of ttx's change you already reviewed
15:22 <mordred> clarkb: done
15:30 <AJaeger> ttx, we have not updated gerritbot config for a bit; this is part of the larger system-config changes going on.
15:35 <clarkb> mordred: fungi is there a quick jump-in point for helping with the jeepyb stuff?
15:35 <clarkb> I'm no longer up to date
15:35 <fungi> clarkb: i think we're ready to approve some more project creation changes at this point
15:35 <mordred> yeah. the things have all landed
15:36 <clarkb> great, maybe that means i should make tea :)
15:36 <fungi> clarkb: short summary, we've been running with "old" gerritlib on the gerrit server up to now, and the container is getting "newer" gerritlib, hence all the behavior changes we've observed and patched manage-projects for
15:36 <mordred> clarkb: that said - I added a new stack you could look at:
15:36 <AJaeger> mordred, fungi , are two repo creation changes
15:36 <clarkb> fungi: up to now meaning before we started using the container?
15:37 <fungi> clarkb: right
15:37 <fungi> docker is pulling latest gerritlib
15:37 <AJaeger> config-core, please review and as two reverts
15:37 <fungi> clarkb: but puppet has not been upgrading gerritlib
15:37 <mordred> clarkb: puppet did not seem to update gerritlib
15:37 <mordred> so at least this most recent thing actually broke back in january - but we never noticed :)
15:40 <fungi> or longer ago than january, but that's when the release containing it was tagged
15:45 *** rajinir has quit IRC
15:45 *** ttx has quit IRC
15:45 *** rpittau has quit IRC
15:45 *** rpittau has joined #opendev
15:45 *** rajinir has joined #opendev
15:45 *** ttx has joined #opendev
15:48 <clarkb> mordred: that won't work as is. You need a non-root user on the server with sudo access that zuul can ssh in as
15:50 <mordred> clarkb: that's already set up in the base playbook
15:51 <mordred> clarkb: it's the zuulcd user
15:51 <mordred> and it seems to have the zuul ssh deployment keys in its authorized_keys already
15:51 <clarkb> mordred: that's only for bridge
15:51 <clarkb> and I don't know that it landed?
15:51 <clarkb> mordred: the idea there was to have zuul log in to bridge then run a nested ansible from there
15:51 *** hashar has joined #opendev
15:51 <mordred> clarkb: it only needs to be for bridge
15:51 <mordred> right. that's what that job does
15:51 <clarkb> if I'm reading your job correctly it's talking directly to review*
15:51 <mordred> clarkb: playbooks/zuul/run-production-playbook.yaml
15:52 <clarkb> oh hrm, ok I'm not sure how far along that got; it sort of died when I couldn't get the zuul user landed iirc
15:52 <mordred> clarkb: infra-prod-playbook logs in to bridge as zuulcd and runs the named playbook
15:52 <mordred> the zuul user is landed and in place - it SEEMS like all of the pieces are there
15:53 <clarkb> ok so maybe that eventually caught up and I didn't notice (it sat stale for a long time)
15:53 <clarkb> I'm not sure how much testing this has received in general
15:53 <clarkb> but I guess we can try it with manage-projects
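The bridge-jump pattern being described (Zuul sshes to a bastion as a dedicated user, which then runs a nested ansible-playbook against the production hosts) can be sketched roughly like this; the host name, user, and playbook path are illustrative, not the exact production values:

```python
# Illustrative sketch of the "run a playbook via the bastion" pattern:
# build the ssh + nested ansible-playbook command line without executing it.
import shlex

def bridge_command(playbook, user="zuulcd", bridge="bridge.example.org",
                   verbose=False):
    """Return the argv the Zuul-side job would effectively run."""
    inner = ["ansible-playbook"]
    if verbose:
        inner.append("-v")
    inner.append(playbook)
    # The nested command is passed to ssh as a single quoted string.
    return ["ssh", f"{user}@{bridge}", shlex.join(inner)]

cmd = bridge_command("playbooks/manage-projects.yaml", verbose=True)
print(cmd)
# ['ssh', 'zuulcd@bridge.example.org',
#  'ansible-playbook -v playbooks/manage-projects.yaml']
```

In the real setup this is done by an Ansible playbook (run-production-playbook.yaml) rather than raw ssh, but the shape is the same: only bridge needs to accept Zuul's key, and bridge's own inventory reaches the rest of the fleet.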
15:55 <clarkb> next up we actually need to land that at the same time we land any manage-project changes right? otherwise we'll try to replicate to locations that don't exist
15:55 <clarkb> and I think that needs strict ordering of jobs
15:56 <clarkb> hrm I think we are using artifact dependencies rather than job dependencies?
15:57 <clarkb> mordred: but the first thing I was concerned about is that we need gitea to run before manage-projects
15:57 <clarkb> and I don't think we've done that in the change there
15:57 <mordred> we don't need to do that actually
15:57 <clarkb> oh wait it's all in the one playbook
15:57 <clarkb> (but the diff context isn't default big enough to show that)
15:57 <clarkb> ok so that first concern is good
15:57 <mordred> we _do_ need to run the update system-config ... so we should probably squash that one with the one before
15:58 <clarkb> as this scope grows maybe it's a good idea to land a simple change that runs ansible via bridge
15:58 <mordred> I do think that for completeness we should eventually put in gitea and gerrit service playbooks and put in soft-depends between manage-projects and them - in case we land a change that touches both things
15:59 <clarkb> ensure that all works before we throw gitea and gerrit and bridge at it
15:59 <clarkb> oh and base which is all the servers
15:59 <mordred> well - that was why I had the first manage-projects change like it is
15:59 <mordred> yup. that's the second change :)
15:59 * mordred has a call ... will be back in about 30
16:00 <clarkb> really I'm thinking it would be good to take small bites and get manage-projects running successfully and solve that problem. Then tackle the problem of driving from zuul so that we don't prolong the first thing as we debug the second
16:00 <clarkb> and the original work in that space tried to start small (run dns updates iirc)
16:01 <mordred> totally. it's just that I think manage-projects is now fixed :)
16:01 <clarkb> but we've sort of leaped ahead to "run base on all servers and set up gitea and gerrit and bridge"
16:01 <mordred> so this is the next step
16:01 <mordred> and step one in the stack is just running manage-projects
16:01 <mordred> so I agree with you
16:02 <clarkb> ya except it's also hitting gitea too
16:07 <clarkb> mordred: that's the playbook we never ran on the nameservers
16:09 <clarkb> but the change itself died because it needed to be triggered from the zone repos and that made it more complicated
16:09 <clarkb> maybe having a simple start like that would be worthwhile though since we never got that far in the past?
16:12 *** rpittau is now known as rpittau|afk
16:13 <AJaeger> mordred: do we have a todo list? It should include gerritbot as well...
16:30 <clarkb> hrm that said we have learned some things about this via the goaccess playbook
16:30 <clarkb> top of list is we don't run a zuul console logger daemon thing which causes confusion (but in this case may actually be desirable)
16:30 <clarkb> corvus: the webserver log stats reporter tool
16:31 <clarkb> corvus: it doesn't bounce through bridge though
16:40 <mordred> clarkb: yeah - I mean, this should work pretty much like goaccess - it's a no-node job on bridge that adds the remote host with add_host
16:41 <mordred> so the console logging should work from the zuul POV about the same, yeah?
16:41 <mordred> we might not get live streaming - but we still should get the final logs
16:42 <clarkb> mordred: I'd have to double check the goaccess logfile but I think we don't get anything logged
16:42 <mordred> clarkb: also - what I meant above by step one being just running manage-projects is that it's a pretty self-contained payload - that the manage-projects playbook itself talks to 10 hosts isn't really super important from a mechanism perspective, right?
16:43 <clarkb> mordred: it was mostly my concern we hadn't gotten a debug output playbook to work at all so this is a big jump, but then I remembered goaccess
16:43 <mordred> yeah. that was my thinking - it's working for goaccess which is structurally the same even if it's touching a different jump host
16:44 <mordred> if we get _nothing_ logged then we might want to update the run playbook to redirect the ansible-playbook stdout to a file then grab that file as a logfile to upload
16:59 <clarkb> mordred: ya I don't think we ever get logs
17:00 <clarkb> so maybe that is the only thing we should tweak
17:20 <clarkb> to start at least it's kinda nice to have the logs hidden from zuul, then we can add them in if we decide they aren't leaking things
17:21 <clarkb> fungi: that's a gerritlib fix we mirrored in jeepyb
17:21 <clarkb> fungi: but if we can land that change then we can make a release and remove the related cleanup in jeepyb
17:23 <fungi> approved it. i was debating also solving in gerritlib, though am on the fence as to whether that's just a desirable behavior change vs a regression
17:27 <clarkb> thanks, in that case I should probably make a release as soon as that lands and not wait for additional fixes?
17:29 <fungi> yeah, a reasonable choice
17:30 <fungi> though i can whip up a change for the regression we worked around with 715726 if you think it's worth patching in gerritlib
17:31 <clarkb> fungi: do you have the traceback handy?
17:31 <clarkb> I assume it failed in _ssh()?
17:34 <fungi> yeah, in _ssh()
<openstackgerrit> Merged opendev/gerritlib master: Return lists from listing functions
17:37 <fungi> i don't really have a preference when it comes to testing for NoneType vs catching an exception in manage-projects
17:37 <clarkb> fungi: some of this is from memory so may not be entirely correct. jeepyb in the old setup was using a db lookup of groups, which would have different failure modes than the ssh api lookup. Rereading gerritlib, I'm not entirely sure this is a regression there (as it would've returned an error if gerrit didn't exit 0 on the ssh command) and instead we needed to properly catch the different case in jeepyb
17:38 <clarkb> that said I could see an argument that a better behavior for listGroup() would be to return [] if there were no matches rather than raising
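The two calling conventions under discussion can be contrasted in a short sketch (a toy stand-in, not gerritlib's real listGroup):

```python
# Toy illustration of the API choice discussed above: raise on no match
# (callers must catch) vs. return an empty list (callers just check truthiness).

GROUPS = {"core-reviewers", "release-managers"}

def list_group_raising(name):
    """Raising flavor: a missing group is an error."""
    if name not in GROUPS:
        raise RuntimeError(f"no such group: {name}")
    return [name]

def list_group_empty(name):
    """Empty-list flavor: a missing group is just an empty result."""
    return [name] if name in GROUPS else []

# Caller code for the raising flavor needs an exception handler:
try:
    members = list_group_raising("missing")
except RuntimeError:
    members = []

# Caller code for the empty-list flavor is a one-liner:
members2 = list_group_empty("missing")

print(members, members2)  # [] []
```

As the conversation concludes, changing an already-released raising API to the empty-list convention would silently break callers that rely on catching the exception, which is why the behavior was left stable.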
17:38 *** diablo_rojo has joined #opendev
17:39 <fungi> right, that's why i didn't ultimately also push up a change for gerritlib on that one
17:40 *** lpetrut has quit IRC
17:40 <clarkb> for now maybe it's best to leave the gerritlib behavior stable in case anyone else is using it and checking for exceptions already
17:47 <clarkb> mordred: maybe we should log to bridge in /var/log/ansible? then if we vet the output add it to the job as a logfile?
17:47 <clarkb> mordred: we sort of did similar with goaccess where we had it write the html report to disk but then didn't collect it to start; we reviewed that the output didn't disclose anything extra, then added it as a log file
<openstackgerrit> Merged opendev/system-config master: [gitea] Point to newly-split Getting Started content
17:50 <clarkb> ttx: ^ thank you for that
17:53 <clarkb> fungi: I'm looking for my keychain now in order to sign a tag
17:53 <clarkb> will get that pushed up as soon as I've found it
17:54 <fungi> or i can do it if you prefer
17:55 *** diablo_rojo has quit IRC
17:55 <clarkb> nah I've found it :)
17:55 <clarkb> it was where I left it on the night stand
17:55 *** diablo_rojo has joined #opendev
17:55 <clarkb> 0.8.4 is what I'll be tagging
17:56 <fungi> yeah, seems right to me
17:57 <clarkb> and pushed
17:57 <fungi> 0.8.3 was the last tag, and only bug fixes since
18:06 <clarkb> fungi: mordred looking at project-config it appears we have a few project creation changes
18:07 <clarkb> I heard we think things are good to go now, should we be landing those?
18:07 <clarkb> I'll set topic:new-project on them
18:09 <fungi> yeah, maybe let's approve one and make sure it goes through without error, then do the rest in bulk
18:09 <clarkb> I think topic:new-project has the list
18:09 <clarkb> mnaser's would be a good candidate but it is in merge conflict
18:10 <clarkb> mnaser: ^ want to update that one really quickly with a rebase? or should we?
<openstackgerrit> Jeremy Stanley proposed openstack/project-config master: Replace incident channel with opendev-meeting
<openstackgerrit> Jeremy Stanley proposed opendev/system-config master: Replace incident channel with opendev-meeting
18:10 <mnaser> clarkb: i can rebase it if you need me to
18:11 <clarkb> mnaser: ya I think it needs one to merge according to gerrit
<openstackgerrit> Mohammed Naser proposed openstack/project-config master: vexxhost: add repos for exporters
<openstackgerrit> Merged zuul/zuul-jobs master: test-upload-logs-swift: revert download script
<openstackgerrit> Merged opendev/base-jobs master: Revert "virtualenv-config: add to base pre playbook"
18:15 <mnaser> clarkb: ^ :)
18:15 <clarkb> mnaser: thank you +2'd
18:15 <clarkb> fungi: mordred maybe you can rereview and that will be our canary?
18:17 *** ralonsoh has quit IRC
18:26 <fungi> approved, though it won't test the acl and group creation bits which are where we ran into our most recent errors
18:28 <clarkb> oh good point maybe find another canary then
18:28 <clarkb> what about that one
18:29 <fungi> yep, +2 from me
18:29 <fungi> if there's one with an "upstream" import url, that might be a good test too
18:39 <mordred> clarkb: I think that's a good idea (log to log file then manually vet the output)
18:40 <mordred> clarkb: those both look good
<openstackgerrit> Merged openstack/project-config master: Add nginx-ingress-controller armada app to StarlingX
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Add job to run manage-projects in zuul
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Log manage-projects to stdout
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Run manage-projects/base/bridge on system-config changes
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Parameterize manage-projects logging output
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Redirect production playbook output
19:02 <mordred> clarkb: ^^ I think that captures a set of safe steps
19:04 <mordred> clarkb: I think we can (and should) land the first already. the second should also be safe to land and should be a no-op
19:05 <mordred> third is definitely no-op since it's to a playbook that isn't in use :)
19:05 <mordred> the fourth would trigger manage-projects from zuul - but still logging to the log file on gerrit - then the fifth would log the ansible output to a file on bridge (so we can see that it's useful output)
19:06 <mordred> corvus, fungi : ^^
19:15 <clarkb> mordred: k reviewing in a moment, finishing early lunch
<openstackgerrit> Merged openstack/project-config master: vexxhost: add repos for exporters
19:29 <AJaeger> infra-root, according to grafana, the logstash queue is linearly increasing - is that normal? now over 15k in the queue
19:31 <clarkb> AJaeger: I checked it earlier today as a followup to friday and it was at 5k. I expect we've added a bunch of logs to process in some jobs (possibly via large console logs or somewhere else)
19:31 <clarkb> AJaeger: I think the thing to check is if it goes back down after zuul load subsides. If it doesn't then we aren't keeping up at all and we need to identify where the bloat is coming from
19:33 <AJaeger> clarkb: I see, thanks
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Log manage-projects to stdout
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Run manage-projects/base/bridge on system-config changes
19:44 <clarkb> mordred: in don't we want to redirect to a file? but that change sets it to false
19:44 <clarkb> oh I see the next change overrides in the child job
19:44 <mordred> yeah - I was trying to get it set up into small discrete chunks :)
19:46 <clarkb> mordred: I've approved the first in the stack (update of git repo)
19:47 <clarkb> the others to the point of logging locally lgtm. but I did leave a comment on the logging one
19:47 <clarkb> which you've seen so yay
19:48 <fungi> our canary changes 714686 and 714965 seem to have been processed with no exceptions raised
19:48 <clarkb> mordred: has a note too
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Redirect production playbook output
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Add job to run manage-projects in zuul
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Log manage-projects to stdout
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Run manage-projects/base/bridge on system-config changes
19:48 <mordred> clarkb: I added the date time
19:48 <clarkb> looking
19:51 *** mugsie has quit IRC
19:54 *** mugsie has joined #opendev
19:58 <mordred> corvus: I updated a bit based on review from clarkb if you have a sec to re-review
19:59 *** dpawlik has quit IRC
20:02 <clarkb> AJaeger: fungi: mordred should we start landing more of those topic:new-project changes?
20:02 <clarkb> (I'm happy to help review them but have been taking my cues from yall on readiness)
20:08 *** dpawlik has joined #opendev
20:08 <fungi> yeah, i can take a look in a moment
20:10 <fungi> did not wind up with the project creator user left in it
20:10 <fungi> and the,access acl got created correctly, looks like
20:11 <mordred> clarkb: yes - but...
20:11 <mordred> clarkb, fungi : let's keep a few in our pocket so we can use them to test the zuul triggered ones
20:12 <fungi> and the initial repo state got created successfully in with a .gitreview file containing the correct bits
20:12 <fungi> so i think the only thing we haven't really tested yet is repository importing from an "upstream" url
20:13 *** dpawlik has quit IRC
<corvus> mordred: note from clarkb on
<corvus> mordred: and ansible bug on
20:25 <clarkb> corvus: is that true when the first character of the string isn't { ?
20:25 <clarkb> re the ansible bug
20:29 <corvus> well, the linter seems to be complaining
20:29 <clarkb> probably best to be safe there then
20:30 <AJaeger> fungi: didn't we test import yesterday?
20:30 <corvus> i'm happy with whatever the linter is (it's a yaml parse error, not some finicky thing). but that's what i'd do
20:31 <AJaeger> fungi: Id9648164023590a440c56906ecd982523b176179 has upstream
20:32 <fungi> yeah, we just haven't tested it in the context of an error-free run, but probably good enough
20:33 <AJaeger> speaking about new repos, do we have rules on what to take into opendev - and would we "adopt" (if it passes)?
20:35 * AJaeger will read backscroll tomorrow and waves good night
20:36 <corvus> that seems like a good meeting topic
20:36 <mordred> corvus, clarkb : I think on clarkb's comment - we very well might need to log to a file and attach ... but I think we might also just get what we need from -v since there is output - and think it's worth trying? or should we just go ahead and make a tmp file and log to it and then copy it
20:36 <clarkb> AJaeger: zbr I don't think we should fork pre-commit
20:36 <corvus> mordred: iiuc, the issue is that that output is going to the inner ansible running on bridge?
20:37 <corvus> so if that works, aren't we just going to see "ok: 1" ?
20:37 <corvus> or "changed: 1" or whatever
20:37 <fungi> clarkb: i'm assuming the reason is that pre-commit wants to clone everything over the network from github, and is unwilling to entertain fixes for that (because they don't consider it a problem)?
20:38 <fungi> but yes, carrying a fork of it to patch around that also seems questionable
20:39 <clarkb> fungi: ya if openstack or whoever wants to do that it's up to them, but I don't think we need to provide that as part of the opendev service
20:39 *** njohnston is now known as njohnston_
20:40 <fungi> maybe it's something the openstack qa team wants to consider
20:42 <clarkb> mordred: corvus that is what it looks like for goaccess
20:42 <clarkb> mordred: corvus I think we'll get similar here, it will just be no logger found until the end of the playbook
20:43 <mordred> corvus, clarkb: yes. I agree :)
20:44 <corvus> (we could, incidentally, open the streaming port in the firewall on bridge and static and get streaming console logs, but that's a separate issue)
20:44 <clarkb> the upside to having the log as an artifact is it ensures we log it on the host too
20:44 <fungi> okay i've approved a few more topic:new-project changes
20:45 <clarkb> (which I think would be nice)
20:45 <clarkb> we can still dump to console log and to disk though, one doesn't imply the other isn't happening
20:45 <corvus> clarkb: the goaccess/static situation is still different though, that's a single ansible run, not nested, right?
20:45 <clarkb> corvus: correct, but it's at the top layer which we have to pass through when nested
20:46 <corvus> clarkb: because zuul *does* eventually see the output from the run:
20:46 <clarkb> corvus: it sees the ansible backend stuff but not the console log
20:46 <clarkb> in our case because it's nested ansible that means we'll get effectively nothing
20:47 *** njohnston_ has quit IRC
20:48 <corvus> clarkb: to be precise (sorry, this helps me make sure we're talking about the same thing): in the goaccess case, zuul is running the playbook directly, so the zuul output json is exactly as normal for any zuul job. but the streaming console log and text file is missing the output from shell commands because the log streamer daemon is firewalled.
<openstackgerrit> Merged opendev/system-config master: Update project-config in manage-projects
20:49 <corvus> clarkb: in the prod playbook case, it's similar, except that the 'shell' command in this case is nested ansible. so in our json file, we will get a little bit of ansible boilerplate ending with "changed: 1", but similarly no ansible output in the text or streaming logs.
20:49 *** hashar has quit IRC
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Collect production playbook output
20:51 *** njohnston has joined #opendev
20:51 <mordred> corvus, clarkb : ^^ I think that should do the thing we want - yes? log to a file, then collect the file
20:52 <mordred> one sec - typo
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Collect production playbook output
20:53 <mordred> also - should we add a -v to the ansible-playbook invocation so that we get stdout for success lines?
20:53 <clarkb> yes that should do what we want
20:53 <mordred> clarkb: want me to add the -v?
20:53 <clarkb> (and we'll have it on the server in the normal location too for people that will keep looking there as we transition)
20:54 <clarkb> mordred: that distinction is a bit fuzzy to me. With -v we get the stdout of successful tasks? but without it, will it just say changed: true or whatever?
20:54 <clarkb> is there any concern we'll leak things we shouldn't that way?
20:54 <clarkb> otherwise it's probably fine
20:55 <mordred> I don't think so - no
20:55 <mordred> I mean - the output will be the stdout of manage-projects which I think is actually pretty solid
20:55 <clarkb> in the case here it would probably be gitea admin credentials since everything else uses ssh keys
20:55 <clarkb> ya manage-projects should be fine since it's using an ssh key
20:56 <clarkb> but the gitea side maybe? and also this is the default in that base job which might catch more things over time
<openstackgerrit> Merged openstack/project-config master: Add xstatic-** projects for vitrage-dashboard
20:56 <mordred> well - we'll start with collect_logs false - so we can verify that
20:56 <mordred> when we add new things
<openstackgerrit> Monty Taylor proposed opendev/system-config master: Collect production playbook output
20:57 <mordred> clarkb, corvus : now with -v
20:57 <clarkb> fwiw ansible verbosity has always confused me
20:57 <clarkb> like to get a traceback to understand why it failed you need -vvvvv, but to leak secret data it seems to just happen :)
20:57 <clarkb> (hence the explicit no_log: true things we do)
openstackgerritMonty Taylor proposed opendev/system-config master: Add job to run manage-projects in zuul
openstackgerritMonty Taylor proposed opendev/system-config master: Log manage-projects to stdout
openstackgerritMonty Taylor proposed opendev/system-config master: Run manage-projects/base/bridge on system-config changes
mordredclarkb: yeah20:58
*** mlavalle has joined #opendev21:05
corvusone of the advantages of using ansible modules for things is that you can relax no_log.  for instance, a module that takes a password parameter will automatically mask that in logs.  but that's extra effort; most of our secret use is just shell tasks21:06
corvusbut if we find some specific thing we want to pull out of no_log in this effort, we might look into that as an option21:06
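A minimal sketch of the distinction corvus is drawing (task names, variables, and the CLI are illustrative, not from opendev's playbooks): a raw shell task must be explicitly silenced with no_log, while a module that declares its parameter as sensitive masks it automatically even under -v:

```yaml
# Hypothetical tasks, for illustration only
- name: Set admin password via a shell command
  shell: "some-cli set-password {{ admin_password }}"
  no_log: true   # without this, the command line (and the secret) land in output

- name: Set admin password via a module
  mysql_user:
    name: admin
    password: "{{ admin_password }}"   # masked automatically by the module
```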
clarkbfungi: should we try to use opendev-meeting tomorrow or is better in a week?21:13
* clarkb is putting together agenda email and want to get location correct21:13
openstackgerritMerged openstack/project-config master: Add Rook to StarlingX
openstackgerritMerged openstack/project-config master: Add Cert-Manager Armada app to StarlingX
fungiclarkb: i would say we should agree on it in tomorrow's meeting and announce in the meeting that we're moving to the new channel next week?21:14
fungijust for maximum possible continuity21:14
mordredclarkb, corvus : I'm landing the first two of those (which should not have any noticable impact)21:14
clarkbfungi: wfm21:14
fungiclarkb: maybe also decide on the new ml at the same-ish time21:14
fungi(and announce similarly in meeting and then on the old ml)21:15
fungiannouncement to the old ml can similarly mention the change in meeting venue21:16
clarkbfungi: did you settle on a name for the new list? (I'll put notes in the agenda too, but for tomorrow will be business as usual)21:17
fungiyep, that's what's in the change anyway21:18
fungiit's available for debate21:18
clarkbI think it's fine :)21:18
fungibut that's what seemed consensual in last week's meeting so it's what i went with21:18
clarkbwe will have an "OpenDev" heavy agenda which is probably a good thing (we are headed in the right direction)21:20
fungiand the following meeting in #opendev-meeting might be an entirely opendev agenda (by definition!) ;)21:21
openstackgerritMerged zuul/zuul-jobs master: Improve job and node information banner
clarkbfungi: the irc channel changes and the new list change lgtm. topic:opendev-comms if others want to review too.21:40
* fungi hopes others will review, at least21:41
fungiwhen the ml one merges we'll want to remind everyone to subscribe and then update references in lots of places21:41
fungii'm happy to volunteer to serve as a list moderator since i already check moderation queues on openstack-infra ml daily21:42
fungialso when the irc change lands we'll want to propose an irc-meetings change21:42
openstackgerritIan Wienand proposed openstack/diskimage-builder master: run_functests: handle build without tar
openstackgerritIan Wienand proposed openstack/diskimage-builder master: centos 8 image build: fix mirror
*** DSpider has quit IRC21:59
openstackgerritMerged opendev/system-config master: Parameterize manage-projects logging output
openstackgerritMerged opendev/system-config master: Remove /tarballs proxy from mirrors
openstackgerritMerged opendev/system-config master: Collect production playbook output
ianwmordred: are you around to discuss your thoughts on container upgrade procedures?22:00
mordredianw: sure!22:02
mordredI may or may not have useful things to say :)22:02
mordredianw: (also - check it out - we've got a stack moving towards zuul run ansible!)22:03
ianwso it will just restart with the latest image, pulled each ansible pulse?22:03
fungii thought the idea of upgrades in the container ecosystem was that you burn it all down, salt the earth, and move on22:04
clarkbianw: its the next task that will do it22:04
mordredyeah - the docker-compose up -d command22:04
clarkbianw: the pull will just update the images, not restart any containers; but running docker-compose up next will update any running containers to the latest images22:04
ianwright, so the up is expected to re-up if the pull pulled a new image?22:04
clarkbianw: yes22:04
clarkbit will noop if no image updated22:04
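The pull-then-up sequence being described could be sketched like this in the periodic ansible run (paths and task names are assumptions, not the actual opendev tasks):

```yaml
- name: Pull latest images (running containers are untouched)
  shell: docker-compose pull
  args:
    chdir: /etc/service-compose   # hypothetical compose directory

- name: Recreate any container whose image changed (no-op otherwise)
  shell: docker-compose up -d
  args:
    chdir: /etc/service-compose
```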
corvusfungi: that is more or less what the setup mordred and ianw are talking about does :)22:04
mordredcorvus: mmm. salt22:05
fungicorvus: i figured. you blow away the container and deploy a new one?22:05
corvusfungi: yeah; since we're bind mounting all our data in, that sticks around22:05
corvusso, i mean, don't salt *that* earth22:05
mordredno - that would be bad salted earth22:06
corvusbut this earth over here is okay22:06
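The bind-mount arrangement corvus refers to, sketched as a compose fragment (image name and paths are hypothetical): because the state lives on the host side of the mount, replacing the container leaves the data intact.

```yaml
services:
  gerrit:
    image: example/gerrit:latest    # hypothetical image name
    volumes:
      # state lives on the host, so containers can be freely torn down
      # and recreated without losing it
      - /home/gerrit2/review_site:/var/gerrit
```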
ianwyeah, i thought so ... since so much state is outside on the builder i'm thinking it might need ... something else22:06
fungidata migrations22:06
mordredianw: like what?22:07
clarkbianw: mordred: we could continue to run it as we do today and only restart through manual intervention22:07
clarkbhave ansible do the pulls to keep us up to date maybe?22:07
ianwmordred: yeah, "like what" is what i'm trying to figure out :)22:07
fungiif we're talking about gerrit, it does the usual rdbms tactic of recording a schema version and running migrations if it's restarted with code which expects a newer schema22:07
mordredfungi: actually - not quite22:07
mordredwith gerrit we have to run gerrit init to run the migrations22:08
fungiit used to...22:08
mordredso there is a special upgrade step22:08
ianwis gerrit ok with being sort of arbitrarily killed and started like that, in terms of in-flight db stuff, etc?22:08
mordredyeah - the init script runs init every time22:08
mordredbut it's not actually necessary22:08
fungioh, right, init is supposed to be a no-op if the schema version is already at the expected level22:08
mordredwell - with _gerrit_ we're not planning on having ansible restart it22:08
mordredbecause gerrit22:08
mordredfungi: yeah - also - for non-container, gerrit init expands the war plugins and downloads the db connector jar22:08
mordredbut we do that at container build time now22:09
mordredbut - gerrit is currently special22:09
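A hedged sketch of the gated flow mordred describes for gerrit (the flag name, paths, and commands are assumptions): ansible only touches the containers when an operator sets the flag, and gerrit init is the explicit migration step in between.

```yaml
- name: Upgrade gerrit (only when explicitly requested)
  when: gerrit_upgrade | default(false)   # normally unset
  block:
    - shell: docker-compose pull
      args: {chdir: /etc/gerrit-compose}
    - shell: docker-compose down
      args: {chdir: /etc/gerrit-compose}
    - name: Run schema migrations (no-op if the schema is already current)
      shell: docker-compose run --rm gerrit init -d /var/gerrit
      args: {chdir: /etc/gerrit-compose}
    - shell: docker-compose up -d
      args: {chdir: /etc/gerrit-compose}
```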
ianwyeah ... i guess that's also "because dib" ... i'm thinking ahead a bit for the launchers too22:09
mordredwe don't have ansible run docker-compose up for it unless we set a flag - which is normally unset22:09
corvusi thought dib could be stopped via signal?22:10
ianwfor both i think we want to have the ability to stop what they're doing?22:10
openstackgerritMerged openstack/diskimage-builder master: Mellanox element: removed ibutils,libibcm,libmlx4-dev
ianwcorvus: yeah, at least signalling so it can try and run its cleanup would be a start22:10
ianwi think ideally we'd tell the builder to drop out of accepting any new requests, finish what it's doing22:10
mordredit would be great if we could just auto-upgrade the nodepool components and not require manual restarts ... so maybe we need to add a signal to the playbook22:11
corvusianw: i think that a docker container stop begins with a sigterm?22:11
corvusterm, grace period, kill22:12
corvusso maybe if we can tell docker-compose to use a really long grace period, it will Just Work?22:12
ianwhrm, i guess that would hold up the whole ansible pulse though?22:13
mordredit shouldn't - docker-compose up -d should return pretty immediately22:13
corvusthe down could hold it up, but is that a problem?22:13
clarkbyou can set --timeout on docker-compose up22:13
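A compose-file sketch of the long-grace-period idea corvus and clarkb are describing (service name and duration are illustrative):

```yaml
services:
  nodepool-builder:
    image: example/nodepool-builder    # hypothetical image name
    # on stop/recreate, docker-compose sends SIGTERM, waits this long,
    # then SIGKILLs; a long window gives dib time to clean up its mounts
    stop_grace_period: 5m
```

clarkb's `--timeout` flag sets the same shutdown window per invocation instead of in the file.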
corvusmordred: oh22:13
ianwmordred: hrm, then i wonder what happens if the next pulse happens while the last one is still shutting down :)22:14
corvusmordred: you're saying the docker-compose up will return immediately, and docker will async stop and start in the background22:14
mordredI *think* so22:14
mordredbut then ianw has a good question22:14
corvussounds reasonable22:14
corvusit doesn't take that long to stop dib, does it?22:14
mordredwill a second up correctly no-op22:14
mordredcorvus: does builder propagate the TERM?22:14
corvusi think that's still how we shut down a builder?22:15
ianwcorvus: no, but i think ideally, if there was a way, you'd stop accepting new requests, finish what you were doing, and then exit22:15
corvus(like, we systemctl stop nodepool-builder, which would then term)22:15
corvusianw: i feel like with builders it should be okay to just stop asap22:16
ianwyeah, we usually combine that with a reboot and clearing out a bunch of crap, in practice22:16
clarkbyou could end up starving builds if nodepool images were built often22:16
clarkb(I don't think that is a really big issue though)22:16
corvusproblems i'd like to have: 47) nodepool merging changes fast enough to worry about build starvation :)22:17
ianwi will say that anywhere dib exits and doesn't clear up *is* a bug ... but practically, it depends on where you exit as to how well mounts etc are cleaned up22:17
*** tosky has quit IRC22:20
ianwi guess we just see how it goes, and worry about it if it's leaking itself to death22:21
clarkbsemi related, ianw did you see the request for a dib release over the weekend?22:23
mordredianw: yeah - I think maybe if it leaks itself to death - we should just figure out how to fix it ;)22:23
openstackgerritKendall Nelson proposed opendev/irc-meetings master: Update FC SIG Meeting
corvusmordred: i have an opendev zuul-registry <-> sdk <-> swift question; where should i ask you about that? :)22:24
mordredcorvus: so many options!22:25
mordredcorvus: here or -sdks probably22:25
mordredlet's start here - we can go there if it turns out to be an SDK bug22:26
corvusmordred: k, lemme get a paste going22:26
* mordred is excited22:26
corvusthis is hard; we need an event id for these lines.  it'll take me a minute.22:29
mordredoh GOOD22:29
mordreda hard one22:29
ianwclarkb: one was done yesterday with the python stow stuff22:29
clarkbianw: perfect! :)22:29
mordrednoonedeadpunk: ^^22:30
ianwclarkb: and fedora 31, which i was going to add to the builder soon, which was making me think about how to restart it :)22:30
ianwalthough, i'm still not 100% sure we have the old semantics that dib releases update in the container22:30
ianw$ disk-image-create --version22:32
ianw$ nodepool --version22:32
ianwNodepool version: 3.12.1.dev3022:32
clarkbianw: this is probably similar to jeepyb and gerrit images. You want both to trigger an update to the artifact22:34
clarkbianw: currently I doubt that nodepool images are rebuilt when dib updates22:34
mordredclarkb: it would be very hard to do so since nodepool is in the zuul tenant22:34
clarkbmaybe this is the reason to start periodic image updates?22:35
mordredmaybe. or make something that runs in promote in dib and sends a null-change to nodepool that we can insta-merge22:35
mordred(or in release in dib rather)22:35
mordred(there is a very worthwhile general concept here that would be good to wrap our heads around)22:36
corvusif nodepool requires a specific version of dib, bump the requirement in nodepool?22:36
clarkbcorvus: oh that's a good way of expressing it, though I'm not sure we really need a hard dep as much as "using latest is nice"22:36
mordredcorvus: I don't think nodepool does need it - but I think we'd like the latest dib in prod22:36
corvus(like, if we don't care about dib releases, then we don't care about making new nodepool images when dib updates.  if we do care about dib releases, then we should have a version spec.)22:37
mordredyeah. what clarkb said22:37
* mordred is fine bumping the min over in nodepool - just saying we don't _strictly_ *require* the most recent in nodepool ... but maybe in this case that semantic difference isn't necessary22:37
corvusi'm not particularly concerned about running the latest in prod unless there's a bugfix or new feature :|22:37
ianwyeah, i think it's if the user wants the latest features of dib, in terms of the interface between nodepool<->dib there's no hard dependency22:37
mordredcorvus: in this case it's a new feature22:37
mordredbut yeah -it's a zuul-jobs user that wants a new dib element22:38
ianwnot really, i want the new version for fedora 31 support22:38
mordredoh - even more important22:38
ianwwhich is an infra image22:38
mordredianw: so maybe that is, in fact, a good reason to require the new dib in nodepool22:38
mordredsince anybody who wants to use nodepool to build f31 images on ubuntu is going to need it22:38
ianwi don't mind turning it into >= type requirement situation in nodepool, if that's what we agree22:39
mordredfor this case it seems valid22:39
ianwmordred: to propose a null change, we'd have to have something similar to the proposal bot?  encode a secret for a user, pull a nodepool tree, and push a change?  or is there some other way?22:41
ianwto automagically propose changes via the release pipeline, i mean22:41
mordredyeah - it would be something like that22:41
corvusi'd really like it if we could treat dib like a normal dependency22:43
clarkbya that is why I wondered if periodic builds would be worthwhile22:43
clarkbthen each day (or whatever period) we'd get an image with the newest versions of all deps22:44
corvuswe build a nodepool image every few days22:44
corvusat most22:44
corvusit seems that right now we REALLY CARE about what dib version it is, so let's bump it22:44
openstackgerritAmy Marrich (spotz) proposed opendev/irc-meetings master: Update FC SIG Meeting
corvusthen go back to not caring for the next 6 months :)22:44
ianwyeah, just given what dib practically does, it's unlikely there will be many releases that infra production doesn't care about -- to say another way we're not doing a lot of outside development22:46
ianwso traditionally the model that puppet pulls in the latest as it releases has worked well22:46
ianwwe don't *have* to automate that, just that's the way it's been22:46
corvusthen maybe we should look at splitting the elements out from dib22:48
corvusso they can be updated faster and externally from the software dependency22:48
clarkbwouldn't that have a similar issue?22:48
corvusbind mount them over the installed image or something22:49
clarkbI guess if we didn't install them into the image we could keep an external set up to date that was mounted in22:49
mordredI mean - honestly, the number of times DIB changes these days is almost never22:49
mordredthe dib element library is where most of the action takes place22:50
clarkbwhat if we ignore dib for a minute. And think about this from the perspective of needing the base image to be updated for a security update22:50
corvusthen we'd just update it22:50
clarkb(I think there is a general class of problem here and I wonder if we are fixated on dib too much)22:50
clarkbcorvus: the base image itself would be called the same thing it would just need a rebuild22:50
clarkbcorvus: how do we express that? a noop change?22:51
corvusi'd really like to just find out if the nodepool image only updating 3 times a week is really untenable for us22:51
corvusi don't think it is22:51
ianwproposed : Ian Wienand proposed zuul/nodepool master: Update dib dep to 2.35.0
corvusclarkb: sure, if it's important, go for it22:51
corvusit's just that in the normal course of events, this happens every few days automatically22:52
clarkbcorvus: ok if we are on board with that I think it works for that general case of problem22:52
mordredI agree - a noop change in such a situation would work fine22:52
ianwproposed : Ian Wienand proposed zuul/nodepool master: Update dib dep to 2.35.0
ianwalthough, another option would be to manually do that in the Dockerfile22:52
mordredmanually do what?22:53
ianwsince nodepool doesn't *need* dib 2.35.0 ... infra production, or maybe we could say users of the container image do?22:53
mordredI don't have any problem just bumping the min - I don't know that there's any compelling use case for avoiding the bump22:53
mordredanybody installing via pip is getting 2.35 right now anyway :)22:54
ianwyes true, only if they'd gone to some effort to pin backwards for some reason22:54
corvusif it's worth the 3 of us talking about it for 30 minutes, it's worth a version bump22:54
mordredyeah. and if that person exists, maybe this will flush them out and they'll tell us about why22:54
ianwcorvus: heh, well yeah, but brown-bag fixes patched up in dib are frequent enough that i'm glad we can figure out procedure now rather than when builds are dead everywhere :)22:56
corvusmaybe dib changes more than i think it does?22:56
ianwi mean these days it's usually a new distro release, but occasionally something async happens like a new point release of centos etc that breaks and requires fixes22:58
fungiwe often go a month or two of quiet and then there's a few weeks where some platform is broken and needs a series of fixes in dib and one or more new releases depending on how thoroughly it gets fixed on the first try22:59
mordredyeah - so - I just looked23:00
fungibut it's basically all elements23:00
mordredover the past 6 months23:00
mordredwe've had 3 incidents of relevant change23:00
mordredone sept 27, one feb 13 then 2 mar 18 and one mar 2723:01
mordredso by and large, while I agree there is a theoretical conceptual issue that could be solved23:01
mordredin practice, this is not an issue for us23:01
mordredsept 27 was adding centos-8 support - feb 13 was fixing a venv/glean thing - then we had a cluster of build-only-packages, python-stow and fedora-3123:03
mordredeach of those, to me, is worthy of a nodepool min bump actually23:03
mordredsince they're all things one would want to make sure are in a sane nodepool-builder23:03
clarkbya I think bumps in those cases make sense23:03
ianwbasically it would be the exception that a dib release did *not* result in a production bump, rather than the rule, i would say even after looking over the recent releases23:11
ianwi'm totally fine with manually bumping nodepool, since it seems we're agreed the nodepool requirements.txt isn't so much "the minimum required to actually do something" but more tied to what people want to build at that point in time23:11
corvusi'm not sure i agree with that23:11
corvusyou're telling me we can't build f31 without the newest version.  i think that warrants a bump.23:12
corvusi don't think i would agree with "we should just encode whatever the current version is in requirements"23:12
corvus(i don't think we should bump the version to get the stow element)23:13
ianwwell, i'm saying that practically, and you can read back through the changes in dib release history and maybe you'll agree, there are not many dib releases that would not have some effect on a production infra image somehow23:13
corvusyeah, that may be true; i'm not sure what dib's release criteria are23:14
corvusi don't know if stow got its own release or not23:14
ianwwell yes, i mean chicken-egg -- the release criteria is usually "we need this in some sort of production" and that production is *usually* opendev infra23:15
ianwanyway, if people want to make the procedure concrete with a vote on it, then that would be great, and we can have 2.35.0 in production and i can try deploying f3123:16
fungii do expect that most dib releases in modern history have been because opendev asked for a release so we could consume some fix23:16
corvusmordred: the issue i'm seeing looks like an object in swift is 4 bytes shorter than i expect23:20
corvusmordred: the key line is #16: INFO registry.api: Finish Upload chunk zuul/zuul-executor 5d35161962bd40bebeb022fcc41686ae 2856794023:20
corvusmordred: last number is the length of the first chunk of the upload; that's really the only chunk, but the weird docker process involves uploading a second zero-byte chunk too (so you'll see a /2 in there)23:21
corvusmordred: then we do a COPY on both of those chunks23:21
corvusmordred: then we upload a multipart-manifest pointing to both of them to combine them23:22
corvusthen it's done23:22
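The finalize step corvus walks through amounts to uploading a swift static large object manifest listing the two copied chunks; roughly (segment paths are illustrative, the sizes come from the log lines above):

```yaml
# Swift SLO manifest: the object is served as the concatenation of these
# segments, so a wrong size recorded here would explain a short read
- path: /registry/zuul/zuul-executor/uploads/5d35161962bd40bebeb022fcc41686ae/1
  size_bytes: 28567940   # the real data chunk
- path: /registry/zuul/zuul-executor/uploads/5d35161962bd40bebeb022fcc41686ae/2
  size_bytes: 0          # docker's trailing zero-byte chunk
```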
corvusbut when i fetch that object, i get 28567936 bytes23:23
mordredcorvus: yeah23:24
mordredcorvus: I'm going to need to digest that - and it's evening walk time ... I've got it loaded up in my browser though23:24
corvusmordred: yeah, i'm EODing too; hopefully that's a good enough stopping point that we can pick it up tomorrow23:25
corvusmordred: er, sorry, i think it's a 540672 byte difference: 28567940 vs 2802726823:30
