Tuesday, 2021-04-13

*** hamalq has quit IRC01:19
*** sanjayu__ has quit IRC01:21
*** ajitha has joined #zuul01:57
*** rlandy has joined #zuul02:17
*** rlandy is now known as rlandy|rover02:18
*** evrardjp has quit IRC02:33
*** evrardjp has joined #zuul02:33
*** rlandy|rover has quit IRC02:41
*** bhavikdbavishi has joined #zuul03:13
*** bhavikdbavishi1 has joined #zuul03:16
*** bhavikdbavishi has quit IRC03:17
*** bhavikdbavishi1 is now known as bhavikdbavishi03:17
*** vishalmanchanda has joined #zuul03:18
*** jfoufas1 has joined #zuul04:49
*** fbo has joined #zuul05:07
*** y2kenny has quit IRC05:26
*** bhavikdbavishi has quit IRC06:17
*** bhavikdbavishi has joined #zuul06:23
*** saneax has joined #zuul06:30
*** dpawlik has joined #zuul06:33
*** dpawlik has quit IRC06:37
*** dpawlik has joined #zuul06:40
*** dpawlik has quit IRC06:42
*** dpawlik has joined #zuul06:55
*** dpawlik has quit IRC06:55
*** bhavikdbavishi has quit IRC06:58
*** jcapitao has joined #zuul07:12
*** dpawlik has joined #zuul07:13
*** dpawlik has quit IRC07:13
*** dpawlik has joined #zuul07:14
*** jpena|off is now known as jpena07:34
*** rpittau|afk is now known as rpittau07:43
*** tosky has joined #zuul07:49
*** nils has joined #zuul08:11
*** dpawlik has quit IRC09:28
*** nhicher has quit IRC09:28
*** fbo has quit IRC09:29
*** nhicher has joined #zuul09:48
*** fbo has joined #zuul09:48
*** jcapitao has quit IRC09:49
*** dpawlik has joined #zuul09:52
*** rpittau is now known as rpittau|bbl09:54
*** bhavikdbavishi has joined #zuul10:26
*** bhavikdbavishi has quit IRC10:44
*** dpawlik has quit IRC10:45
*** dpawlik9 has joined #zuul10:53
swestcorvus: 785972 lgtm. no need to wait longer than necessary from my pov11:14
*** bhavikdbavishi has joined #zuul11:29
*** jpena is now known as jpena|lunch11:31
*** rlandy has joined #zuul11:37
*** rlandy is now known as rlandy|rover11:37
*** bhavikdbavishi1 has joined #zuul11:57
*** rlandy|rover is now known as rlandy|rover|mtg11:58
*** bhavikdbavishi has quit IRC11:58
*** bhavikdbavishi1 is now known as bhavikdbavishi11:58
*** jcapitao_ has joined #zuul12:02
*** jcapitao_ is now known as jcapitao12:04
*** rpittau|bbl is now known as rpittau12:31
*** jpena|lunch is now known as jpena12:37
*** tosky has quit IRC13:19
*** tosky has joined #zuul13:25
*** tosky has quit IRC13:29
*** tosky has joined #zuul13:29
*** bhavikdbavishi has quit IRC13:40
*** rlandy|rover|mtg is now known as rlandy|rover13:59
openstackgerritMerged zuul/zuul master: Make buildset mandatory on build  https://review.opendev.org/c/zuul/zuul/+/77090014:02
*** sanjayu_ has joined #zuul14:30
*** saneax has quit IRC14:33
*** sanjayu__ has joined #zuul14:39
*** sanjayu_ has quit IRC14:42
*** hamalq has joined #zuul16:20
*** bhavikdbavishi has joined #zuul16:22
*** rpittau is now known as rpittau|afk16:24
corvusswest: ack16:28
*** jcapitao has quit IRC16:29
corvuszuul-maint: https://review.opendev.org/758940 is going to be a minor breaking change for operators (we need them to generate a new secret and put it in zuul.conf).  if i send an announcement about that today, how long should we wait to merge that change in master?  until tomorrow, or until monday?16:30
corvusobviously that only affects people running master CD; at the same time we can also let folks know that it will be required in the next release, which i imagine will be at least a few weeks from now.16:31
corvus(and people can add the new section asap, so this is about letting people know to add this now to prepare for when it's merged/release)16:32
*** nils has quit IRC16:32
clarkbcorvus: the new secret is the encryption key for the data at rest in zk?16:33
corvusclarkb: yep16:33
clarkbis that something we can auto generate to ease deployment? or would that reduce security posture?16:33
corvusit's used (or will be used) by the schedulers and executors, and needs to be distributed to them by some means other than zk.  right now the implementation puts it in the zuul.conf file; we could conceivably auto-generate it and write it to another file, but i don't know how we'd distribute it.16:36
fungii don't think autogenerating it would reduce security posture, however it would mean additional code and something to write it out16:36
fungiand yes, once it's used by additional machines, you have a classic key distribution problem16:36
corvus(i mean, if you look at the state right now with exactly one scheduler needing it, it's easy.  but with >1 shedulers ande >1 exeuctors, it's a distribution problem)16:36
clarkbah ya that makes sense since all members of the cluster will need to decrypt eventually16:36
corvusmaybe not web and merger, but certainly a lot16:37
corvus(and who knows, maybe they will)16:37
fungikey distribution is considered notoriously hard in cryptography circles. there are already many solutions out there, we should allow operators to pick one which best suits them and not try to invent our own (which we would invariably get wrong, and that *would* reduce the overall security posture)16:38
corvusi'm expecting executors to need it because if we don't want to store decrypted secrets in zk, we'll need to push the decryption to later (ie, right now we put decrypted secrets in gearman, but we won't want to do that in zk)16:38
clarkbmakes sense16:38
corvusfungi: yeah, for now it's "whatever you're doing with your config management"; i could definitely see us doing something like adding support for a vault or something later.  much later.  :)16:39
fungiso in opendev for example, i expect we would put the key material in our private variable store on our bastion, and then use our ansible jinja2 configuration template as usual16:39
*** jfoufas1 has quit IRC16:40
corvusfungi: exactly like https://review.opendev.org/78598016:41
fungii guess the point is, zuul already needs to be configured with other secrets, so this one should be handled the same as the rest for consistency (i.e. we punt and expect the sysadmin to address the need)16:41
corvusyep.  gerrit/github http creds are handled this way16:41
clarkbcorvus: I feel like tomorrow and maybe even monday may not be sufficient notice? however, I'm not sure how many people are installing from source (tobiash, gerrit zuul, and opendev zuul?)16:42
corvussource or :latest images16:43
corvusclarkb: want to do 2 weeks then?16:43
clarkbwe've still got big release stuff happening on the opendev zuul this week. I wasn't planning to do the zk upgrade stuff until friday at this point :/16:43
*** jpena is now known as jpena|off16:43
corvusclarkb: oh, i'm not worried about opendev :)16:43
fungii think if the error message logged when that key is missing is clear, easy to spot and has sufficient breadcrumbs/guidance in the upgrade release note, it should be fine?16:43
corvusopendev will be taken care of within the hour16:43
clarkbcorvus: we'd still need to restart once the change lands and ensure everything is working as expected?16:44
tobiashwe're also updating in a controlled manner so we're unaffected as well16:44
clarkbthe change being the zuul side change, we can stage the config change early and wait16:44
corvusclarkb: at some point, yeah, but i'm in no rush to restart zuul right now16:44
clarkbcorvus: got it16:45
corvusi think the issue is this: how much notice do we give people running CD that we're making a breaking deployment change?16:45
corvuswe usually give lots of notice because we usually only do them on major releases16:45
corvuswe should have included this in v4, but we forgot16:45
clarkbmaybe say mid next week and give people a week?16:46
corvusand i don't think this is worth revving to v5 and then renaming the distributed scheduler as v6; but if we wanted to be really strict about our versions, we could do that.16:46
clarkbthat should heopfully be enough time to notice and either pin deployment or stage the key appropriately16:46
corvus(mostly because this is a really easy deployment change to make)16:46
corvusclarkb: that sounds reasonable; i can send the email today with a target merge date of april 2016:47
corvuszuul-maint: ^ sound good?16:47
fungiseems like plenty of time to me. i'm reading through the change now just to get an idea of what error message the scheduler is likely to throw if started without that set16:48
clarkbcorvus: sounds good to me16:48
corvusfungi: i think it's: raise RuntimeError("No key store password configured!")16:49
fungiaha, thanks!16:49
corvusso i think that would show up as an exception in the scheduler log16:50
corvuswith traceback16:50
fungii was getting bogged down in the various kinds of keys handed in zuul/lib/keystorage.py16:50
fungiand yeah, the RuntimeError exception was in zuul/scheduler.py which i hadn't reached yet16:51
fungithat looks clearly noticeable16:51
fungiit'll also be safe to drop back to 4.2.x until you correct your configuration, i expect16:52
corvusyep16:53
corvuszuul-maint: draft email https://etherpad.opendev.org/p/ta01qhhP9DNg0jpmMFjK16:57
fungicorvus: does it need any guidance on the format/length of teh value?16:59
*** decimuscorvinus has quit IRC16:59
fungier, i guess you say "strong password" so good enough16:59
fungiyeah lgtm16:59
*** decimuscorvinus has joined #zuul17:00
fungiideally secret encryption key would be at least as strong as the strongest secret encrypted with it17:00
fungiis there an effective size limit for that value?17:01
corvusfungi: yeah, i was thinking not to be too prescriptive there, but i could give an example if it would help...17:01
corvusfungi: i don't believe so; we treat it as a string and pass it to the 'password' field when serializing the private keys17:02
fungii'm going back over the change a bit more closely just to see what kdf is being used, though presumably it's consistent with the one we use on the per-project keys17:03
corvusserialization.BestAvailableEncryption(password) looks like the important bit there17:03
fungioh, right, we're using rsa to encrypt the project secrets17:04
funginot symmetric encryptuion17:05
fungiencryption17:05
corvusyep, so the issue here is storage of the rsa private key (and also the ssh private key for the similar per-project ssh key feature)17:05
fungiyeah, symmetrically encrypting asymmetric encryption private keys, essentially17:06
corvusyep17:07
fungii'll check out whether cryptography.hazmat.primitives.serialization.BestAvailableEncryption() does any key stretching17:07
fungior if we need to layer some in there17:07
corvusfungi: the ssh key side of this would be this from paramiko: http://docs.paramiko.org/en/stable/api/keys.html#paramiko.pkey.PKey.write_private_key17:11
corvus(with an RSAKey)17:11
clarkbis there a risk using a best available method that if best available changes later you'll eb out of sync with the stored data?17:12
clarkb(thinking about the symettry of decryption and encryption, you encrypt with best available and if it doesn't appropriately identify the storage system then decryption may get confused?)17:13
fungithe serialization *should* encode an algorithmic identifier17:13
fungiso the decryption method can use that to determine how to decrypt it17:14
clarkbok (the docs do confirm it can change over time, but as long as they are writing metadata to make an older or newer option useable then should be fine)17:14
fungithe examples do demonstrate passing a raw passphrase directly to serialization.BestAvailableEncryption() so this is likely safe without doing any stretching before hand to convert it into a usable symmetric key with even distribution17:17
*** decimuscorvinus has quit IRC17:19
*** decimuscorvinus has joined #zuul17:23
*** decimuscorvinus has quit IRC17:23
*** decimuscorvinus has joined #zuul17:25
fungiand it looks like the underlying implementation is specific to the kind of material being encrypted too17:27
*** sshnaidm has joined #zuul17:28
*** sshnaidm is now known as sshnaidm|pto17:28
clarkbwhich makes sense since the produced key values will want different attriutes based on whether or not you use rsa or ed25519 or whatever under the hood?17:28
clarkbI was looking at it and it appears they do handle a "password is too long" return code from openssl, but I can't find any indication of what the limit might actully be on the openssl side17:29
clarkbI suspect that it is quite large and we don't need to wrry about it17:29
clarkbah there we go there is a check for > 1023 bytes17:30
fungiso basically don't use 8 kilobit keys17:31
fungibut 4096 bits would be reasonable17:31
corvusswest, tobiash, fungi: i did happen to spot a minor upgrade-related concern i missed yesterday and switched my vote to a -1; i'll see if i can make sure that code path is tested later today.17:32
fungiespecially if you're using it to encode 4096-bit private keys17:32
clarkbI haven't looked much at the zuul side, but still slightly concerend about best available changing. We use bestAvailable to produce a private key, will that private key continue to decrypt data in zk if the best available encrpytion changes?17:32
corvusclarkb: i think we use best-available to choose the serialization encryption for the private key17:33
clarkbcorvus: ah the serialization not the actual private key derivation, got it17:33
corvusclarkb: so i think the thought is the serialization handler should have the metadata fungi was talking about to know which mech to use to deserialize it with the provided password17:33
fungiworst case if that goes through a shift you may need to clear it out of zk and take some downtime to restart all schedulers i guess17:34
corvusi don't know that for certain, but that's my read17:34
fungibut yeah, how it's signalled is going to probably be up to cryptography's load methods autodetecting how to deserialize17:35
fungifor example we're using TraditionalOpenSSL format for the rsa private keys, so it will be up setting/reading to whatever algorithmic identifiers that format supports17:37
fungithis isn't an opaque symmetrically encrypted blob we're storing in zk, we're relying on whatever serializations are typical for the various kinds of key material we're relying on, and whatever encryption options they have natively17:38
clarkbya I guess I'm just wondering what happens if you end up with older ssl on half the hosts and they cannot do best available for the others. Or if best available moves ahead for some other reason (cyptography update?)17:40
fungiso that's saying to use whatever the "best" parameters are for TraditionalOpenSSL17:40
fungiand yes, we'll probably need to set some expectations around upgrade ordering for things like ssl libs17:41
corvuswell, if best available doesn't immediately change to a previously unavailable algo, then it should be fine17:41
clarkbcorvus: that is a good point17:41
corvusiow, if best available switches from A to B but B was previously available, then it should be invisible17:42
fungiright, you'll probably need to avoid upgrading some of your systems and then blowing away and reencoding the keys you're storing in zk17:42
corvusbut if a scheduler does not have B, and another scheduler both has B and bestavailable switches to B, then it's a prob17:42
corvushopefully not common in practice, but i guess if you made a huge (10 year?) jump it might be expected17:43
fungiand may need to avoid adding new projects if part of your system isn't upgraded yet17:43
fungiupgrade some but not all of your servers from ubuntu 16.04 to 20.04 and then add new projects, you might end up encoding with the openssl from 20.04 in ways which the openssl on 16.04 can't handle... or the other way around openssl may be configured on 20.04 to no longer allow some encoding which was "best" as of 16.04 (though much more unlikely, typical deprecation is much slower than that, but maybe some17:47
fungiadvance in factoring will suddenly make sha2 unsafe in a few years?)17:47
fungiand yeah, i'd worry less about what the openssl on those platforms supports, more about what the distros might decide to disallow through runtime or buildtime configuration17:48
fungias we've seen with fedora 33 and rsa for openssh17:48
clarkbI guess the good news with ^ is that undoing that as a user isn't terribly difficult17:49
clarkbyou reduce your default security stance but that may be worthwhile to get all hosts up to date, then restrict again in a short term period17:49
fungianyway, i'm satisfied at this point it doesn't represent a glaring risk of future breakage. there may be changes we become aware of which are problematic for some deployments, but we should be able to tackle them if and when they arise17:51
fungiand i wouldn't want to prematurely optimize for problems we don't have, what's there now is already necessarily complex so we shouldn't make it any more complex than it has to be ;)17:52
fungiafter all, complexity is the greatest source of security problems, not lack of foresight17:52
corvus++17:58
corvusi'll go ahead and send that email17:58
fungiout of curiosity, what would the process look like for changing that key? shut down all dependent services, remove the project keys from zk, then start services again with the new config?17:59
openstackgerritMerged zuul/zuul master: Move project secrets key loading to key storage  https://review.opendev.org/c/zuul/zuul/+/75893918:00
clarkbfungi: probably need to clear entries in zk too?18:00
clarkbto avoid zuul trying to read them without the old key present18:00
fungii said "remove the project keys from zk" but you mean something in addition to those?18:01
clarkbfungi: yes all of the other secret data is or will be encrypted with the resulting encryption key if I read things right18:01
clarkbif left behind zuul would try to read and decrypt it with the new key and fail?18:02
fungioh, i missed that the scheduler is decrypting secrets with the project keys and then reencrypting them wit this shared key18:02
corvuser that should not be what's happening18:02
fungiyeah, so you shouldn't need to remove job secrets from zk right?18:03
clarkbspecifically "All private data in Zookeeper will be encrypted at rest."18:03
clarkbfrom teh commit message18:03
fungiyeah, but the job secrets are already encrypted with per-project keys18:03
clarkboh I see18:04
fungiso that part was already solved18:04
corvuscorrect.  and now the per-project keys are encrypted with this new master password, and since they are, they can now be stored in zk.18:04
clarkbthat only becomes a problem if the project keys change then which doesn't necessarily happy if you change the top level key, though it may18:04
fungiand those keys themselves aren't changing, just how those keys are stored is18:04
clarkb(if you wanted to rotate everyting out below in the "tree")18:04
*** josefwells has joined #zuul18:40
*** sanjayu__ has quit IRC18:53
*** josefwells has quit IRC18:59
*** bhavikdbavishi has quit IRC19:32
*** ajitha has quit IRC19:46
*** vishalmanchanda has quit IRC19:55
*** avass has quit IRC19:55
*** avass has joined #zuul20:02
*** y2kenny has joined #zuul20:17
y2kennyWhere can I find out more about the Zuul contribution workflow?  (I see the Workflow label on Gerrit but I am not sure what is missing for the value to change.)20:20
fungiy2kenny: one of the zuul maintainers has to set the workflow label to +1 (it's controlled via group acls in gerrit)20:21
clarkby2kenny: there are three review categories. The first is verified which zuul supplies a +1 for if check jobs pass after uploading your patchset. Next is code-review and workflow. These tend to be applied by core reviewers. Our gerrit enforces a single code-review +2 and a workflow +1 to merge code. When these happen zuul reruns tests again and if they pass it +2 verifies and then merges it20:21
clarkbthe code review and workflow labels are separated so that you can control the timing of a merge if it doesn't make sense to land something yet but you still want to indicate you have done proper review20:22
clarkbit also helps us to do our by convention two reviewers +2 before approval process. The second person to review can add the approval20:22
y2kennyfungi, clarkb: ok.  Is there a major release coming that slowing submission?  I am just wondering if there's anything else I need to do for a patch to go in and the timing of the patch.  (It's a fix that I wanted to get into my deployment.  https://review.opendev.org/c/zuul/zuul/+/783986)20:24
clarkby2kenny: a release was just made so I think any holdups around that should be gone now20:27
clarkbI've persoanlly been distracted with other things lately, but I can take a look at that change20:27
y2kennyclarkb: thanks!20:27
clarkbtoday is actually a limbo day with openstack release tomorrow and me not wanting to break anything before then :)20:27
y2kennymake sense.20:28
clarkbthat change does conflict with a number of other changes. corvus any concern with that one going in?20:29
clarkbI +2'd it but didn't approve to give corvus a chance to weigh in on the conflcits resolution, but I suspect we can land it20:31
y2kennygreat.  Thanks!20:31
corvusy2kenny, clarkb: it's worth noting it conflicts with another implementation of the fix :)20:35
y2kennycorvus: oops... I did not realize that20:36
corvuswell, to be fair, it has a -1 and may have fatal flaws.  i don't mind at all :)20:36
corvusbut we should look closely at it and see if the issues there are resolved by the new one :)20:37
clarkb++20:37
corvusthe other two conflicts should be easily resolved though; i can't speak for their authors, but in general, i think a fix like this can go in first and rebasing on it shouldn't be a big deal (to my knowledge the other two changes don't *change* this code, just move it around, so the rebase should be easy or nearly automatic)20:38
corvusbut yeah, let's take a quick look at the other and make sure we have the bases covered20:39
corvushttps://review.opendev.org/738668 is the mirror change20:39
clarkbI think y2kenny's case doesn't handle when a run is aborted20:42
clarkbbut I'm not sure we want it to since the intent is to rerun?20:42
clarkband aborted jobs should be those that are forcefully killed (do we want logs in that case?)20:42
y2kennymy implementation also doesn't handle RESULT_DISK_FULL20:46
y2kennyso perhaps I should handle both UNREACHABLE and DISK_FULL but not ABORT20:47
corvusdisk_full is handled at the level above, it's not necessary to do it there.21:00
y2kennyok21:01
corvusand i don't think there's anything else we want to do with aborted; we should just return asap when we abort, and that still happens21:02
corvusi think tristanC_'s question about skipping the pause stanza if we're unreachable is a good question, which i never answered on my version.  i tend to think it's mostly harmless to leave there, as setting the pause flag is probably the last thing it does in the playbook, so unreachable isn't likely to happen after it's set.  but it could, so skipping that may be an improvement.21:04
corvusoh, but maybe the success check handles that21:05
corvusyeah, we can't run pause because we won't have set success21:05
corvusthat actually wasn't present in the old version, so that's improved in the mean time21:05
corvusy2kenny: i believe your patch is not subject to either of the issues tristanC_ raised on mine, so we should abandon mine and continue to merge yours21:07
y2kennycorvus: sounds good.21:08
corvuslet's give tristanC_ a little bit in case he wants to review https://review.opendev.org/783986 since he looked at the earlier one; but i'm happy it has sufficient review and addresses his earlier concerns, so if he's busy and doesn't weigh in soon, we can go ahead and approve it.21:08
*** rlandy|rover is now known as rlandy|rov|biab21:09
tristanC_corvus: thanks for the reminder, i actually already started the review, but got distracted and forget to +2 the change21:19
*** tristanC_ is now known as tristanC21:19
*** rlandy|rov|biab is now known as rlandy|rover21:58
openstackgerritMerged zuul/zuul master: Allow post playbook to run when run playbook has unreachable result  https://review.opendev.org/c/zuul/zuul/+/78398622:23
*** y2kenny has quit IRC23:11
*** tosky has quit IRC23:34
openstackgerritJames E. Blair proposed zuul/zuul-operator master: Allow terminationGracePeriodSeconds to be configurable  https://review.opendev.org/c/zuul/zuul-operator/+/78598923:48
corvustristanC: i'm interested in your thoughts on https://review.opendev.org/78598523:50
corvustristanC: you and i both have independently used secrets for everything -- in the operator, and in my raw k8s i set up for gerrit's zuul.  but avass rightly points out that we can use configmaps for some things, and that's what the original spec said.  personally, i think i gravitated towards secrets for everything for simplicity's sake.  i'm interested in your thoughts / experiences there.23:51
corvusi'm honestly pretty ambivalent about it; i could go either way.23:52
corvus(no rush on that, just wanted to draw attention to it as something in particular that could use more discussion)23:53
*** ajitha has joined #zuul23:56
corvusremote:   https://gerrit-review.googlesource.com/c/zuul/ops/+/303243 Add a keystore password for Zuul [NEW]23:59
corvusthere's the keystore change for gerrit's zuul23:59

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!