*** hemna7 is now known as hemna | 00:53 | |
*** tkajinam is now known as Guest5435 | 07:14 | |
*** Guest5435 is now known as tkajinam | 07:36 | |
bauzas | morning Nova | 07:48 |
---|---|---|
gibi | o/ | 08:13 |
opendevreview | Merged openstack/nova master: Adds link in releasenotes for hw machine type bug https://review.opendev.org/c/openstack/nova/+/849532 | 08:48 |
stephenfin | morning, I need another +2 on dtantsur's patch here https://review.opendev.org/c/openstack/nova/+/849881 | 08:51 |
stephenfin | to hopefully finally unblock jsonschema 4.x | 08:51 |
gibi | sean-k-mooney[m]: done | 09:00 |
gibi | sorry | 09:00 |
gibi | stephenfin: done | 09:00 |
stephenfin | gibi: thanks | 09:14 |
opendevreview | Merged openstack/nova master: Add a proper schema version to network_data.json https://review.opendev.org/c/openstack/nova/+/849881 | 09:22 |
bauzas | Uggla: lemme know when you have time to respin https://review.opendev.org/c/openstack/nova/+/845897 | 09:49 |
Uggla | bauzas, yep I will look at it beginning of the afternoon. ok for you ? | 09:50 |
bauzas | Uggla: cool, I already uploaded my series using the 2.92 microversion so I'll only fix some simple merge conflicts https://review.opendev.org/c/openstack/nova/+/849133 | 09:51 |
bauzas | sean-k-mooney: I found a way to have a regression test for https://bugs.launchpad.net/nova/+bug/1951656 | 09:51 |
bauzas | I'll upload it | 09:51 |
sean-k-mooney | bauzas: that's great | 09:51 |
sean-k-mooney | i did not have time to try and create one myself but i hoped it would be possible without too much work | 09:52 |
sean-k-mooney | bauzas: with that regression did you also observe the breakage of the reuse of freed mdevs | 09:52 |
sean-k-mooney | my guess is the list of available free mdevs was empty | 09:52 |
bauzas | yes | 09:53 |
sean-k-mooney | so we always tried to allocate | 09:53 |
bauzas | so the inventory is getting less | 09:53 |
sean-k-mooney | well that's good that we have a reproducer now | 09:53 |
sean-k-mooney | it should aid in fixing and backporting | 09:53 |
gibi | sean-k-mooney, bauzas: can we quickly discuss the force down requirement in https://review.opendev.org/c/openstack/nova/+/848886 ? | 09:56 |
gibi | sean-k-mooney: could you elaborate on the data corruption risk? does it depend on the task state? | 09:57 |
sean-k-mooney | well my original concern was we were in the middle of doing an operation | 09:59 |
sean-k-mooney | it could be a snapshot for example | 09:59 |
sean-k-mooney | and depending on the backend we do strange things with say nfs | 09:59 |
sean-k-mooney | so im not sure how safe it is to always ignore it | 09:59 |
sean-k-mooney | like for nfs cinder volume snapshots we create a delta disk and update some paths in the xml and on the cinder side | 10:00 |
sean-k-mooney | if we evacuate in the middle of that i dont know what the state will be | 10:00 |
sean-k-mooney | same for ceph i guess | 10:01 |
sean-k-mooney | in this case its just powering off | 10:01 |
sean-k-mooney | which should be fine because we allow evac with active vms | 10:01 |
gibi | ohh so the case is when the task was actually started by the compute service, then the compute service died, leaving a half uploaded snapshot or leaving a not fully updated guest xml behind | 10:02 |
sean-k-mooney | ya or if we were shelving etc | 10:02 |
sean-k-mooney | basically if we are in the middle of an operation im not sure what will happen | 10:02 |
sean-k-mooney | maybe its fine to evacuate | 10:02 |
sean-k-mooney | but we dont allow that today | 10:03 |
gibi | so if that is the case then simply asking the admin to fence + force down before evac is not enough, the admin manually needs to clean up / repair things | 10:03 |
sean-k-mooney | ya i guess that is true | 10:03 |
gibi | right now force down only requires fencing | 10:03 |
sean-k-mooney | yes | 10:03 |
gibi | but going forward it will require a manual check by the admin | 10:03 |
gibi | and we need to be able to describe what to check | 10:03 |
sean-k-mooney | so i guess it really does not help in that respect | 10:04 |
sean-k-mooney | (force down) | 10:04 |
sean-k-mooney | i guess we are starting from this is blocked and you have to do reset-state | 10:04 |
sean-k-mooney | to evacuate | 10:04 |
gibi | yeah, it is convenient to push the responsibility to the admin by asking to force down, but we have to be able to tell the admin what to do before force down | 10:04 |
sean-k-mooney | if the node is down say psu exploded | 10:05 |
gibi | reset-state and force down is pretty similar in this regard, we ask the admin to do something and take over the burden of keeping the system consistent | 10:05 |
sean-k-mooney | i dont think there is much if anything the operator can do to clean up the state | 10:06 |
gibi | yeah | 10:06 |
sean-k-mooney | at least not without looking behind cinder's back at the storage | 10:06 |
sean-k-mooney | so maybe im overthinking it | 10:06 |
sean-k-mooney | but that is why i was suggesting force down | 10:06 |
gibi | we can suggest force down but then we have to change the definition of force down, as simply fencing the host is not enough any more | 10:08 |
sean-k-mooney | ya i think im coming around to your way of thinking and not overloading force-down | 10:09 |
sean-k-mooney | we perhaps should instead just document that if the instance task state is not None | 10:09 |
sean-k-mooney | the operator may need to do additional cleanup of the instance | 10:09 |
sean-k-mooney | e.g. remove a partial snapshot | 10:10 |
gibi | yepp | 10:10 |
gibi | we can keep the reset-state requirement as today if that helps against accidental evac | 10:10 |
sean-k-mooney | well i guess that is the choice we have to make. if we think the timeout is sufficient as it has been in the past | 10:11 |
sean-k-mooney | then we can proceed with the change and just drop the force down requirement and add some extra docs | 10:11 |
sean-k-mooney | otherwise yes we can leave it as it is today with reset state | 10:12 |
bauzas | OK, I need to go lunching, but I think I see the problem with the mdev names | 11:13 |
bauzas | it creates an exception when we run the periodic RT method for updating | 11:13 |
sean-k-mooney | ack that is what i was expecting would happen | 11:14 |
sean-k-mooney | either an exception or the list would be empty | 11:14 |
sean-k-mooney | in either case resulting in placement getting out of sync | 11:14 |
sean-k-mooney | and preventing reuse | 11:14 |
opendevreview | Merged openstack/nova master: libvirt: Ignore LibvirtConfigObject kwargs https://review.opendev.org/c/openstack/nova/+/830644 | 11:27 |
opendevreview | Merged openstack/nova master: libvirt: Remove unnecessary TODO https://review.opendev.org/c/openstack/nova/+/830645 | 11:47 |
sean-k-mooney | stephenfin: i added https://review.opendev.org/c/openstack/nova-specs/+/849488 to open discussion to ask for the spec freeze exception | 11:54 |
sean-k-mooney | stephenfin: so lets defer the +w to sylvain so they can either -2 it if we reject the exception or +w if we accept assuming they agree with the spec content | 11:55 |
stephenfin | sounds good | 12:06 |
sean-k-mooney | stephenfin: remind me, you originally wanted to truncate the displayname to set the hostname right | 12:08 |
sean-k-mooney | rather than normalise | 12:08 |
sean-k-mooney | im strongly considering if we should have done neither and added support for fqdns when talking to neutron by doing the truncation there | 12:10 |
stephenfin | oh, I've no idea /o\ I'd have to go check the spec/patches | 12:10 |
sean-k-mooney | ya not really important now i guess | 12:11 |
sean-k-mooney | context is https://review.opendev.org/c/openstack/nova-specs/+/849765 and whether or not we should do this in nova | 12:11 |
opendevreview | sean mooney proposed openstack/nova-specs master: Revert "Configurable instance domains" https://review.opendev.org/c/openstack/nova-specs/+/850048 | 12:14 |
sean-k-mooney | dansmith: ^ ill wait for the discussion in the nova team meeting to determine if i should repropose the spec under a spec freeze exception or if we defer to AA | 12:17 |
sean-k-mooney | in which case we dont need to rush to reopen the review and we can wait for artom to return from pto | 12:17 |
opendevreview | sean mooney proposed openstack/nova-specs master: Revert "Configurable instance domains" https://review.opendev.org/c/openstack/nova-specs/+/850048 | 12:32 |
sean-k-mooney | ^ less typos and better commit | 12:32 |
opendevreview | Stephen Finucane proposed openstack/nova master: Use unittest.mock instead of third party mock https://review.opendev.org/c/openstack/nova/+/714676 | 12:46 |
opendevreview | Stephen Finucane proposed openstack/nova master: Remove the PowerVM driver https://review.opendev.org/c/openstack/nova/+/850346 | 12:46 |
opendevreview | ribaudr proposed openstack/nova master: Allow unshelve to a specific host (REST API part) https://review.opendev.org/c/openstack/nova/+/845897 | 13:28 |
*** dasm|off is now known as dasm|ruck | 13:31 | |
opendevreview | Merged openstack/nova-specs master: Revert "Configurable instance domains" https://review.opendev.org/c/openstack/nova-specs/+/850048 | 13:46 |
*** haleyb_ is now known as haleyb | 14:02 | |
opendevreview | sean mooney proposed openstack/nova-specs master: Revert "Revert "Configurable instance domains"" https://review.opendev.org/c/openstack/nova-specs/+/850352 | 14:11 |
sean-k-mooney | ok i have made the instance.domain -> instance.dns_domain change and tried to call out the open issues in ^ | 14:12 |
bauzas | I'm under deep water but we'll have our weekly meeting | 15:23 |
bauzas | 37 mins from now here | 15:23 |
bauzas | I'll prepare the agenda | 15:23 |
Uggla | Went out for a bike ride to get my car from the garage. It is really hot today. | 15:27 |
admin1 | hi .. i have cpu_allocation_ratio is at 4.0, but it refuses to go above the physical threads .. nova scheduler reporting: There was a conflict when trying to complete your request. Unable to allocate inventory: Unable to create allocation for 'VCPU' on resource provider 'UUID '. The requested amount would exceed the capacity. | 15:33 |
admin1 | if i want to deploy something below the actual vcpus, it works .. but it does not go above the physical threads | 15:33 |
sean-k-mooney | admin1: you cannot have a single allocation that exceeds the number of actual cpus | 15:34 |
admin1 | sorry .. what does that mean :) | 15:34 |
sean-k-mooney | we do not allow vms to oversubscribe against themselves | 15:34 |
sean-k-mooney | so 1 vm can never have more vcpus than the host has | 15:34 |
sean-k-mooney | the allocation ratio is not related to that | 15:35 |
admin1 | i have 40 cpus ( physical threads) .. and ratio is 4.0 .. but i can only have 2 instance of 16 vcpu there | 15:35 |
sean-k-mooney | hum you should be able to boot more | 15:35 |
sean-k-mooney | so the total on the inventory is 40 | 15:36 |
admin1 | yes | 15:36 |
sean-k-mooney | and used is 32 | 15:36 |
admin1 | right | 15:36 |
sean-k-mooney | what version of mariadb are you using | 15:36 |
sean-k-mooney | you might be hitting a mariadb bug that was reported on the mailing list last week | 15:36 |
admin1 | Server version: 10.6.5-MariaDB-1:10.6.5+maria~focal-log mariadb.org binary distribution | 15:36 |
sean-k-mooney | we are possibly seeing the same bug downstream | 15:36 |
sean-k-mooney | admin1: yep that version is apparently broken and its fixed in 10.6.8 i believe | 15:37 |
*** akekane_ is now known as abhishekk | 15:37 | |
sean-k-mooney | admin1: see this thread https://lists.openstack.org/pipermail/openstack-discuss/2022-July/029536.html | 15:37 |
admin1 | sean-k-mooney thanks for the direction | 15:38 |
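For readers following the exchange above, here is a minimal sketch of the capacity arithmetic being discussed. It assumes Placement's usual formula, capacity = (total - reserved) * allocation_ratio, and a per-allocation ceiling (`max_unit`) equal to the host's physical thread count. The `Inventory` class and `can_allocate` helper are hypothetical illustrations, not Nova or Placement code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Inventory:
    """Hypothetical stand-in for a VCPU inventory record."""
    total: int
    reserved: int = 0
    allocation_ratio: float = 1.0
    max_unit: Optional[int] = None  # assumed to default to `total`

    @property
    def capacity(self) -> float:
        # Overcommit applies to the sum of all allocations on the host.
        return (self.total - self.reserved) * self.allocation_ratio


def can_allocate(inv: Inventory, used: int, requested: int) -> bool:
    """Return True if `requested` units fit on top of `used`."""
    max_unit = inv.max_unit if inv.max_unit is not None else inv.total
    # A single VM may not oversubscribe against itself.
    if requested > max_unit:
        return False
    # The total across all VMs may use the overcommitted capacity.
    return used + requested <= inv.capacity


# Numbers from the chat: 40 threads, ratio 4.0, 32 vCPUs already used.
host = Inventory(total=40, allocation_ratio=4.0)
third_vm_fits = can_allocate(host, used=32, requested=16)
oversized_vm_fits = can_allocate(host, used=0, requested=41)
```

Under these assumptions a third 16-vCPU instance should fit, which is why the refusal points at something other than the ratio, such as the MariaDB issue mentioned in the thread.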
bauzas | #startmeeting nova | 16:00 |
opendevmeet | Meeting started Tue Jul 19 16:00:21 2022 UTC and is due to finish in 60 minutes. The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot. | 16:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 16:00 |
opendevmeet | The meeting name has been set to 'nova' | 16:00 |
bauzas | hello everyone | 16:00 |
bauzas | #link https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting | 16:00 |
* bauzas is currently trampled by internal issues with vgpus but I'll chair this meeting | 16:00 | |
bauzas | who's around ? | 16:01 |
gibi | o/ | 16:01 |
elodilles | o/ | 16:01 |
bahnwaerter | o/ | 16:01 |
dansmith | o/ | 16:01 |
Uggla | o/ | 16:02 |
bauzas | ok, we can start | 16:02 |
bauzas | #topic Bugs (stuck/critical) | 16:02 |
bauzas | #info No Critical bug | 16:03 |
bauzas | #link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New 10 new untriaged bugs (-1 since the last meeting) | 16:03 |
bauzas | Uggla: thanks for helping on this | 16:03 |
bauzas | any bug to point out ? | 16:03 |
bauzas | mmmm, crickets | 16:04 |
bauzas | #link https://storyboard.openstack.org/#!/project/openstack/placement 27 open stories (+0 since the last meeting) in Storyboard for Placement | 16:04 |
bauzas | #info Add yourself in the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster | 16:05 |
bauzas | #link https://storyboard.openstack.org/#!/project/openstack/placement 27 open stories (+0 since the last meeting) in Storyboard for Placement | 16:05 |
bauzas | sean-k-mooney: do you have some time for looking at bugs this week ? | 16:05 |
bauzas | mmm, I'll discuss with sean-k-mooney later to know if he can | 16:07 |
bauzas | if not, I'll be the owner for this week | 16:08 |
bauzas | #topic Gate status | 16:08 |
bauzas | #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs | 16:08 |
sean-k-mooney | i can make it | 16:08 |
sean-k-mooney | i probably dont but i will | 16:08 |
bauzas | sean-k-mooney: thanks, very much | 16:08 |
bauzas | sean-k-mooney: again, this is not a priority | 16:08 |
bauzas | do it as you can | 16:08 |
bauzas | the number of open bugs is pretty low at this time, which is cool, kudos to the team | 16:09 |
sean-k-mooney | its fine ill keep an eye on them | 16:09 |
bauzas | s/open/new | 16:09 |
bauzas | #link https://zuul.openstack.org/builds?project=openstack%2Fplacement&pipeline=periodic-weekly Placement periodic job status | 16:09 |
bauzas | a new check : | 16:09 |
bauzas | #link https://zuul.openstack.org/builds?job_name=tempest-integrated-compute-centos-9-stream&project=openstack%2Fnova&pipeline=periodic-weekly&skip=0 Centos 9 Stream periodic job status | 16:09 |
bauzas | sean-k-mooney: didn't have time to see whether you proposed a tempest patch for removing c9s from check and gate ? | 16:10 |
dansmith | isn't there some way to see previous periodic runs too? | 16:10 |
sean-k-mooney | bauzas: yes i did | 16:10 |
dansmith | so we can see that it's passed N times in a row? | 16:10 |
sean-k-mooney | dansmith: yes | 16:10 |
dansmith | or has this only run once? | 16:10 |
sean-k-mooney | dansmith: i can see if i can get that | 16:10 |
bauzas | dansmith: https://zuul.openstack.org/builds?job_name=tempest-integrated-compute-centos-9-stream&project=openstack%2Fnova&skip=0 | 16:11 |
dansmith | sean-k-mooney: not important now, just wanted to make sure we don't need to build a thing | 16:11 |
sean-k-mooney | i think you can filter the build by pipeline and project | 16:11 |
sean-k-mooney | ya basically ^ | 16:11 |
bauzas | dansmith: as you see, for the moment, we're still testing c9s in check and gate due to a tempest template | 16:11 |
dansmith | yeah but that only shows one periodic run, is that because it has only run once yet? | 16:11 |
bauzas | dansmith: correct | 16:11 |
dansmith | gotcha | 16:11 |
bauzas | we could also check experiment if we want | 16:12 |
sean-k-mooney | https://review.opendev.org/c/openstack/tempest/+/850242 | 16:12 |
bauzas | ta | 16:12 |
sean-k-mooney | that is the tempest template update patch ^ | 16:12 |
dansmith | ack | 16:12 |
bauzas | #link https://review.opendev.org/c/openstack/tempest/+/850242 removing Centos 9 stream jobs from default Tempest template | 16:12 |
bauzas | #link https://zuul.opendev.org/t/openstack/builds?job_name=nova-emulation&pipeline=periodic-weekly&skip=0 Emulation periodic job runs | 16:13 |
bauzas | #info Please look at the gate failures and file a bug report with the gate-failure tag. | 16:13 |
bauzas | #info STOP DOING BLIND RECHECKS aka. 'recheck' https://docs.openstack.org/project-team-guide/testing.html#how-to-handle-test-failures | 16:13 |
bauzas | that's it | 16:13 |
bauzas | moving on | 16:13 |
bauzas | #topic Release Planning | 16:13 |
bauzas | #link https://releases.openstack.org/zed/schedule.html | 16:13 |
bauzas | #info Zed-2 was last week | 16:13 |
bauzas | #info Specs are no longer accepted | 16:14 |
bauzas | #link https://blueprints.launchpad.net/nova/zed current blueprints open for the zed timeframe | 16:14 |
bauzas | 13 bps | 16:14 |
bauzas | less than in Yoga | 16:14 |
bauzas | we could have one more | 16:14 |
bauzas | but this will be discussed in the open discussion | 16:15 |
bauzas | oh, I forgot to prepare an etherpad, stupid me | 16:15 |
bauzas | give me a second, I'd like to ask our API microversion proposers to ask for a specific microversion | 16:16 |
bauzas | instead of all of them rushing into 2.92 | 16:16 |
bauzas | 2.91 | 16:16 |
bauzas | there it is | 16:17 |
bauzas | #link https://etherpad.opendev.org/p/nova-zed-microversions-plan Etherpad for a microversion use query | 16:18 |
bauzas | fwiw, I left 2.91 for https://review.opendev.org/c/openstack/nova/+/845897 | 16:18 |
bauzas | and https://review.opendev.org/c/openstack/nova/+/849133 is now ready for 2.92 | 16:18 |
bauzas | the latter will require a very small merge conflict resolution due to the api docs | 16:19 |
bauzas | but as you see, nothing was preventing me from using the 2.92 microversion even if 2.91 wasn't merged | 16:19 |
bauzas | that's why I propose our different change owners to ask for a microversion | 16:19 |
gibi | you will have a bit more than a single conflict on the api doc but yes hopefully the conflict will be small | 16:20 |
bauzas | I'll send an email tomorrow explaining about this | 16:20 |
bauzas | gibi: I tested it | 16:20 |
bauzas | gibi: 4 files were conflicting | 16:20 |
gibi | I expect a conflict on https://review.opendev.org/c/openstack/nova/+/849133/5/doc/api_samples/versions/v21-version-get-resp.json | 16:20 |
bauzas | yup, this one and the root v2 | 16:21 |
bauzas | plus the rest api microversions list doc | 16:21 |
bauzas | and I can't remember the last one | 16:21 |
gibi | and on the api_version_request.py | 16:21 |
bauzas | but easy conflicts | 16:21 |
bauzas | could be harder to resolve if two changes are touching the same API resource | 16:21 |
bauzas | but let's not overcomplicate this | 16:22 |
Uggla | fyi I have just reserved 2.93 for virtiofs | 16:22 |
bauzas | Uggla: then prepare your API change to use the 2.93 microversion, this should work like mine | 16:22 |
gibi | bauzas: sure | 16:22 |
bauzas | tbc, we reserve the right to flip the microversions | 16:23 |
Uggla | need to review because I bet on 2.92. | 16:23 |
bauzas | depending on the state of the review | 16:23 |
bauzas | we had runways before | 16:24 |
bauzas | take it as an informal runway for asking reviews | 16:24 |
bauzas | either way, I'll send an email explaining the rules | 16:24 |
bauzas | I don't want an overcomplicated process | 16:25 |
bauzas | this is just a way to prevent people doing frequent rebases | 16:25 |
bauzas | but the lower the microversion you ask for, the sooner you need to be reviewed, hence be present | 16:26 |
bauzas | I just don't want us to wait for new revisions that could stall other changes | 16:26 |
bauzas | so I'll just say that we're free to drop some change from the microversion number | 16:27 |
bauzas | hope you folks don't disagree with this stupid plan | 16:27 |
gibi | with the amount of folks pushing for a microversion right now I don't see trouble | 16:28 |
Uggla | sounds good | 16:28 |
bauzas | well, I see 5 different patches | 16:28 |
bauzas | at least | 16:28 |
bauzas | anyway, moving on | 16:29 |
gibi | personally I would not bet on current+3 or higher to be ordered | 16:29 |
bauzas | #action bauzas to clarify the game rules of the etherpad in a later email tomorrow | 16:29 |
gibi | but having a c+1 and c+2 ordered make sens | 16:29 |
gibi | e | 16:29 |
bauzas | gibi: that's a reasonable point | 16:29 |
bauzas | I could remove 2.95 and newer | 16:29 |
bauzas | moving on | 16:29 |
bauzas | #topic Review priorities | 16:29 |
gibi | and also if you are not ready for review then please don't allocate a microversion :) | 16:29 |
bauzas | gibi: that's the game rule | 16:30 |
gibi | coolio | 16:30 |
bauzas | and if you are on vacations for 4 weeks, don't ask for the next microversion | 16:30 |
bauzas | or ask the next one, provided your patch can be reviewed before you leave | 16:30 |
bauzas | :) | 16:30 |
gibi | :) | 16:30 |
bahnwaerter | :) | 16:31 |
* bauzas of course won't take 4 weeks off | 16:31 | |
bauzas | only 3.5 | 16:31 |
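The reservation scheme discussed above can be sketched as a small table keyed by microversion. The reservations shown are the ones named in the chat (2.91, 2.92, 2.93); the helper names and the assumption that 2.90 is the latest merged microversion are illustrative only. Microversions are compared as integer pairs, because a plain string comparison would sort "2.100" before "2.92".

```python
from typing import Dict, Tuple


def parse_microversion(text: str) -> Tuple[int, int]:
    """Split 'major.minor' into integers so ordering is numeric."""
    major, minor = text.split(".")
    return int(major), int(minor)


def next_free(reserved: Dict[str, str], latest_merged: str) -> str:
    """Return the lowest unreserved microversion above `latest_merged`."""
    taken = {parse_microversion(v) for v in reserved}
    major, minor = parse_microversion(latest_merged)
    candidate = (major, minor + 1)
    while candidate in taken:
        candidate = (candidate[0], candidate[1] + 1)
    return f"{candidate[0]}.{candidate[1]}"


# Reservations mentioned in the meeting; descriptions are paraphrased.
reservations = {
    "2.91": "unshelve to a specific host (change 845897)",
    "2.92": "bauzas' series (change 849133)",
    "2.93": "virtiofs",
}

# Assumption: 2.90 is the latest merged microversion at this point.
suggested = next_free(reservations, latest_merged="2.90")
```

As agreed in the meeting, such a table only makes sense for the next one or two slots; anything further out is likely to be reshuffled.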
bauzas | #topic Review priorities | 16:31 |
bauzas | #link https://review.opendev.org/q/status:open+(project:openstack/nova+OR+project:openstack/placement+OR+project:openstack/os-traits+OR+project:openstack/os-resource-classes+OR+project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/osc-placement)+label:Review-Priority%252B1 | 16:32 |
bauzas | huzzah | 16:32 |
bauzas | #link https://review.opendev.org/c/openstack/project-config/+/837595 is merged | 16:32 |
bauzas | #link https://docs.openstack.org/nova/latest/contributor/process.html#what-the-review-priority-label-in-gerrit-are-use-for explains now the new gerrit flag and how to use it | 16:32 |
gibi | would be interesting to see how many +1 will appear on this list from now | 16:33 |
dansmith | I friggin hate the RP label btw | 16:33 |
bauzas | at least two from sean-k-mooney :) | 16:33 |
gibi | bauzas: btw you need to update your query to show both +1 and +2 | 16:34 |
dansmith | I keep RP+2ing patches and people ask a week later why I didn't CR+2 them :/ | 16:34 |
bauzas | dansmith: point them to the doc | 16:34 |
sean-k-mooney | they were probably from before the update | 16:34 |
bauzas | gibi: yeah I need to modify it | 16:34 |
dansmith | bauzas: no, I mean *I* do the wrong thing because I go looking for the +2 button to click and choose the wrong one | 16:34 |
sean-k-mooney | yes they were both from before the change merged | 16:35 |
dansmith | I wish RP could be a different scale like -A +B +C | 16:35 |
sean-k-mooney | but please do look at them | 16:35 |
sean-k-mooney | dansmith: it can be | 16:35 |
gibi | dansmith: valid point, can we change it from +2 to +B? | 16:35 |
bauzas | dansmith: ah I get your point | 16:35 |
dansmith | if we can choose anything, | 16:35 |
bauzas | it's confusing indeed | 16:35 |
sean-k-mooney | i am pretty sure it does not have to be a number | 16:35 |
dansmith | could we make it -NotYet, +Prio, +HighPrio ? | 16:35 |
bauzas | I don't know, we need to look at gerrit acls | 16:35 |
dansmith | or -Low, +Med, +High | 16:36 |
sean-k-mooney | i can check but this is just a custom label | 16:36 |
dansmith | don't do it just for me, but just relating my frustration with it on the glance side | 16:36 |
bauzas | dansmith: man, we got this https://review.opendev.org/c/openstack/project-config/+/837595 open for a while, you know your very good comment would have been more than appreciated then ? :D | 16:36 |
bauzas | anyway, it took us 6 months to get it | 16:36 |
dansmith | bauzas: sorry, but it has taken actual experience to realize it's annoying | 16:36 |
bauzas | I'm pretty sure we can take one month more to find if we can change the acls and to get it merged :p | 16:37 |
sean-k-mooney | bauzas: well thats just because we didnt agree on what it should be | 16:37 |
sean-k-mooney | if we can set a custom value | 16:37 |
bauzas | sean-k-mooney: not exactly | 16:37 |
sean-k-mooney | and we want to we can get it updated quickly | 16:37 |
dansmith | I will help push for reviews on a change if you want to do it | 16:37 |
bauzas | sean-k-mooney: it took us nearly a cycle to agree and then nearly a midcycle to get it merged | 16:37 |
dansmith | again, not trying to mess anything up, just conveying my experience | 16:37 |
bauzas | dansmith: your comment is legit | 16:37 |
sean-k-mooney | bauzas: thats just because i had asked them to wait until i went back to them | 16:37 |
bauzas | and I don't want contributors to mess this up | 16:37 |
gibi | I would go for +P (contributor review promise) +CP (core review promise) | 16:38 |
bauzas | someone thinking he would +1 a patch and instead saying "yay, I'm committed on reviewing it soon" | 16:38 |
dansmith | is that what you meant by +1 +2? :) | 16:38 |
dansmith | if so, the labels would be much more useful | 16:38 |
bauzas | yup | 16:38 |
sean-k-mooney | https://gerrit-review.googlesource.com/Documentation/config-labels.html#label_value | 16:38 |
bauzas | we'll figure it out | 16:38 |
sean-k-mooney | so it might need to be an int | 16:38 |
sean-k-mooney | the name can be anything we want | 16:39 |
sean-k-mooney | we can probably move on and confirm outside the meeting | 16:40 |
dansmith | ++ | 16:40 |
gibi | ack | 16:40 |
bauzas | ++ | 16:40 |
bauzas | we have someone having added https://review.opendev.org/c/openstack/nova-specs/+/816542 to the agenda | 16:41 |
bauzas | Spec for modifiable user_data was accepted / merged, but implementations are still pending final review / merge | 16:41 |
bauzas | so I guess he's raising our attention to : | 16:41 |
bauzas | #link https://review.opendev.org/c/openstack/nova/+/816157 server implementation | 16:42 |
bauzas | #link https://review.opendev.org/c/openstack/python-novaclient/+/816158 novaclient | 16:42 |
bauzas | #link https://review.opendev.org/c/openstack/python-openstackclient/+/847792 openstackclient | 16:42 |
sean-k-mooney | yep i skimmed that | 16:42 |
bauzas | this is one of the API changes we have | 16:42 |
bauzas | 2.94 ? | 16:42 |
sean-k-mooney | im not sure if its ready to merge but its an api change yes | 16:43 |
bauzas | I'll add it to the etherpad | 16:43 |
bauzas | moving on | 16:44 |
bauzas | #topic Stable Branches | 16:44 |
bauzas | elodilles: are you around ? | 16:44 |
elodilles | yes | 16:44 |
elodilles | #info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci | 16:44 |
elodilles | #info stable/train is blocked, fix exists but hasn't merged yet due to intermittent failures + now nova-grenade-multinode & nova-live-migration started to fail @ devstack 'create' phase | 16:44 |
elodilles | so train is now 'more' broken | 16:45 |
elodilles | i could not reproduce the devstack issue locally yet | 16:46 |
bauzas | looks like the French SNCF rail | 16:46 |
elodilles | :) | 16:46 |
elodilles | anyway, i'll try to look into the issue, but any hint is appreciated | 16:47 |
elodilles | (i've added some details to the nova-stable-branch-ci, but haven't created a bug yet) | 16:48 |
bauzas | elodilles: honestly, I'm under the water as I speak | 16:48 |
elodilles | bauzas: ok, no problem, it's just a heads up for everyone who is interested in train branch :) | 16:49 |
bauzas | *some* may be interested | 16:49 |
elodilles | and that's it about stable branches from me i think | 16:50 |
* gibi more and more feels we don't have the bandwidth to maintain stable/train | 16:50 | |
bauzas | unfortunately, let's move on, then | 16:52 |
bauzas | #topic Open discussion | 16:52 |
bauzas | (sean) https://review.opendev.org/c/openstack/nova-specs/+/849488 spec freeze exception for spice compression | 16:52 |
bauzas | so yeah I wrote we could discuss this now | 16:52 |
bauzas | to see whether we punt it for Zed or we accept it | 16:52 |
bauzas | anyone having opinions about it ? | 16:53 |
bauzas | tbh, I'm meh to it | 16:53 |
bauzas | trying to honestly balance the risks vs. the benefits | 16:53 |
sean-k-mooney | risk should be small since this is not user facing | 16:53 |
yoctozepto | o/ | 16:53 |
sean-k-mooney | there is no api impact to this | 16:54 |
bauzas | this is configurable, right? | 16:54 |
yoctozepto | right | 16:54 |
sean-k-mooney | via host level config options only | 16:54 |
yoctozepto | and defaults to previous default | 16:54 |
bauzas | yeah, so basically a regression wouldn't be a big deal | 16:54 |
sean-k-mooney | we might want to default to unset | 16:54 |
bauzas | changing the options and that's it | 16:54 |
sean-k-mooney | but we could defer that to the implementation | 16:54 |
bauzas | yeah | 16:54 |
sean-k-mooney | to keep it entirely off by default | 16:54 |
bauzas | if that's purely additive and host-config based only, doesn't sound a big deal | 16:55 |
yoctozepto | i.e., "do not touch this part of libvirt's xml" by default? | 16:55 |
bauzas | correct | 16:55 |
yoctozepto | makes sense | 16:55 |
bauzas | no upgrade impact | 16:55 |
sean-k-mooney | right we currently dont generate the elements | 16:55 |
sean-k-mooney | so we could continue to do that by default | 16:55 |
yoctozepto | agreed | 16:55 |
gibi | I'm OK to grant the exception for this. | 16:55 |
bahnwaerter | sean-k-mooney: Yeah, I could change that. It makes more sense to only set the libvirt entries if they are specified in a nova.conf | 16:56 |
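The "unset by default" behaviour agreed above could look roughly like the sketch below: child elements of libvirt's `<graphics type='spice'>` are emitted only for options the operator has actually set, so an untouched configuration produces the same XML as before. The element names (`image`, `jpeg`, `zlib`, `playback`, `streaming`) follow libvirt's domain XML format; the option keys and the `build_spice_graphics` function are hypothetical and do not reflect the names used in the spec or in Nova.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Hypothetical option key -> (libvirt element, attribute) mapping.
_SPICE_ELEMENTS = {
    "image_compression": ("image", "compression"),
    "jpeg_compression": ("jpeg", "compression"),
    "zlib_compression": ("zlib", "compression"),
    "playback_compression": ("playback", "compression"),
    "streaming_mode": ("streaming", "mode"),
}


def build_spice_graphics(options: Dict[str, Optional[str]]) -> ET.Element:
    """Build a <graphics type='spice'> element from host-level options.

    Options that are missing or None add nothing, which leaves the
    choice to libvirt/QEMU exactly as it is without this feature.
    """
    graphics = ET.Element("graphics", {"type": "spice"})
    for key, (tag, attr) in _SPICE_ELEMENTS.items():
        value = options.get(key)
        if value is None:
            continue  # unset: do not touch this part of the XML
        ET.SubElement(graphics, tag, {attr: value})
    return graphics


# Nothing configured: no child elements are generated.
default_xml = ET.tostring(build_spice_graphics({}), encoding="unicode")

# Two options set: only those two elements appear.
tuned = build_spice_graphics(
    {"image_compression": "auto_glz", "streaming_mode": "filter"}
)
tuned_xml = ET.tostring(tuned, encoding="unicode")
```

Keeping the mapping in one table means a new option is a one-line addition, and the "no upgrade impact" property is easy to check: an empty options dict must yield a bare graphics element.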
bauzas | yoctozepto: do you have open changes against it ? | 16:56 |
bauzas | oh, that's bahnwaerter's question then | 16:56 |
sean-k-mooney | there is a nova change and nova-specs change open | 16:56 |
yoctozepto | ++ | 16:56 |
sean-k-mooney | so if we grant the exception we can update the spec before we merge it | 16:56 |
bauzas | ok, so there is already a poc | 16:56 |
sean-k-mooney | yes | 16:57 |
bauzas | all the planets are aligned then | 16:57 |
bahnwaerter | bauzas: Yeah, I was invited to this discussion today ;) | 16:57 |
bauzas | let me take my baton then... | 16:57 |
bauzas | #agreed https://review.opendev.org/c/openstack/nova-specs/+/849488/ granted as a spec deadline exception, sounds reasonable provided there is no upgrade impact and the change being purely self-contained and additive | 16:58 |
bauzas | cores, I'd appreciate if you could review it ASAP | 16:58 |
bauzas | (the spec, tbc) | 16:58 |
bauzas | that's it I guess for today | 16:59 |
sean-k-mooney | ill drop +2 given the pending change to the config behavior | 16:59 |
sean-k-mooney | not quite | 16:59 |
bauzas | sean-k-mooney: about the spec itself | 16:59 |
bauzas | (sean) there seem to be considerable outstanding questions with regards to Configurable instance domains | 16:59 |
sean-k-mooney | oh ya so that's it for that topic | 16:59 |
sean-k-mooney | yep so just want to make sure we discussed ^ | 17:00 |
bauzas | https://review.opendev.org/c/openstack/nova-specs/+/850048 revert was merged | 17:00 |
bauzas | do we want to grant an exception for it ? | 17:00 |
dansmith | are we on to the domains thing? | 17:00 |
bauzas | yup | 17:00 |
gibi | I do apologize for pushing the spec through within such a short timeframe last week | 17:00 |
bauzas | dns domain thing | 17:00 |
dansmith | yeah I still (heartily) question the approach in general | 17:00 |
sean-k-mooney | https://review.opendev.org/c/openstack/nova-specs/+/850352 is the revert of the revert with some issues addressed | 17:00 |
sean-k-mooney | yes | 17:00 |
bauzas | fwiw, we're overtime | 17:01 |
bauzas | so we'll need to end the meeting | 17:01 |
dansmith | I know it will/would be more work to do this via integration with neutron, but it seems like it would be a lot better to do so | 17:01 |
bauzas | but I'd appreciate if people could continue the convo | 17:01 |
sean-k-mooney | bauzas: well we can extend | 17:01 |
sean-k-mooney | its in the nova channel now | 17:01 |
sean-k-mooney | but either way | 17:01 |
bauzas | yeah | 17:01 |
bauzas | it's just that we try to stick to one hour | 17:02 |
bauzas | anyway | 17:02 |
sean-k-mooney | dansmith: so do you think we should take a step back and spend more time looking at this | 17:02 |
dansmith | personally I do, yeah | 17:02 |
bauzas | me too | 17:02 |
bauzas | I feel we require a proper brainstorming about it | 17:02 |
sean-k-mooney | then thats fine, we can defer to AA | 17:02 |
dansmith | I'd like to understand more of what we can and can't do with help from neutron | 17:02 |
sean-k-mooney | and not rush this | 17:02 |
dansmith | codifying this in our API is just a hack, IMHO | 17:02 |
dansmith | sounds good to me | 17:02 |
bauzas | yup, sounds like we need a bit of design time | 17:03 |
sean-k-mooney | im partly worried that we made decisions in this space in the past that tie our hands, but we may want to reevaluate those | 17:03 |
sean-k-mooney | so i think we should spend some time between now and ptg evaluating this again | 17:03 |
sean-k-mooney | including the previous decision | 17:03 |
bauzas | sean-k-mooney: tbh, one week ago, we were still reviewing some metadata API change IIRC | 17:03 |
sean-k-mooney | bauzas: yes which we knew would not work for quite a while | 17:04 |
bauzas | which, by reading the superseding spec, I understand why this approach is no longer possible | 17:04 |
sean-k-mooney | we can do the metadata change too | 17:04 |
sean-k-mooney | but it wont help the use case | 17:04 |
sean-k-mooney | its just providing more info to the domain | 17:04 |
sean-k-mooney | ... vm | 17:05 |
sean-k-mooney | which it will ignore | 17:05 |
bauzas | but yeah, sounds to me that domains are some information given by the network, not the user | 17:05 |
* sean-k-mooney hates i typoed domain instead of vm | 17:05 | |
sean-k-mooney | bauzas: i would disagree with that | 17:05 |
sean-k-mooney | but it really depend on the config | 17:05 |
sean-k-mooney | in general the domain is carried by the port/floating ip and a default domain can be added to the network | 17:05 |
sean-k-mooney | anyway we are over time | 17:06 |
dansmith | I think bauzas meant the network infra, | 17:06 |
dansmith | which is what I meant when I said it | 17:06 |
bauzas | well, unless I'm wrong, DNS is L5 | 17:06 |
sean-k-mooney | right | 17:06 |
dansmith | not necessarily "the network object in neutron, distinct from the port object" | 17:06 |
sean-k-mooney | what is great is that this is not integrated with routed networks properly either | 17:06 |
bauzas | yeah, not the "neutron network" | 17:06 |
dansmith | and I totally think it does come from network infra, at least in terms of plumbing | 17:06 |
bauzas | I meant "the network infrastructure" | 17:07 |
sean-k-mooney | dansmith: its meant to be self service | 17:07 |
dansmith | if you override it in your guest OS, that's fine, but we don't need to be involved in that level, IMHO | 17:07 |
sean-k-mooney | at least with designate the model is bring your own domain | 17:07 |
bauzas | this is some data Nova doesn't have to deal with | 17:07 |
dansmith | sean-k-mooney: totally, but we should integrate with the services providing the network infra, even if you took your own domain to them | 17:07 |
sean-k-mooney | you point your domain mx records to designate and then manage it as an end user independent of the cloud admin | 17:07 |
dansmith | yep, understand | 17:07 |
dansmith | I'm not saying openstack shouldn't handle this, I'm saying I think nova is probably not the right place to set this per-instance | 17:08 |
bauzas | in theory, you could even have your DNS servers totally uncorrelated from OpenStack services | 17:08 |
sean-k-mooney | yes. | 17:08 |
bauzas | but Nova shouldn't be managing it | 17:08 |
sean-k-mooney | anyway im not going to ask for a spec freeze exception for this | 17:08 |
bauzas | this could be some "metadata" information | 17:08 |
bauzas | yeah and we need to end the meeting | 17:09 |
sean-k-mooney | im not sure i agree with "nova should not be managing this" but i understand why you have that view point | 17:09 |
sean-k-mooney | so yes lets end the meeting | 17:09 |
bauzas | let's end the meeting for now | 17:09 |
bauzas | and we'll continue | 17:09 |
bauzas | thanks all | 17:10 |
bauzas | #endmeeting | 17:10 |
opendevmeet | Meeting ended Tue Jul 19 17:10:03 2022 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 17:10 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/nova/2022/nova.2022-07-19-16.00.html | 17:10 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/nova/2022/nova.2022-07-19-16.00.txt | 17:10 |
opendevmeet | Log: https://meetings.opendev.org/meetings/nova/2022/nova.2022-07-19-16.00.log.html | 17:10 |
bauzas | sean-k-mooney: I meant this can be an instance metadata information | 17:10 |
bauzas | but I don't want it to be "primer" as officially defined in our instance API | 17:10 |
bauzas | technically, you can pass your domain information thru userdata too | 17:11 |
sean-k-mooney | perhaps we have alternatives so we dont need to encode it in the instance api | 17:11 |
sean-k-mooney | you can | 17:11 |
bauzas | if you want to have your domains managed by OpenStack services, then this is Designate | 17:12 |
sean-k-mooney | just a note that while these neutron apis have existed for a long time they are relatively new to backends like ovn | 17:12 |
bauzas | if you don't want them, then you can use the entryknobs we currently have | 17:12 |
sean-k-mooney | bauzas: well right now this is a one way path | 17:12 |
sean-k-mooney | from nova to neutron to designate | 17:12 |
dansmith | sean-k-mooney: right, but if the operator has chosen a network backend that doesn't support this, then I think it's reasonable to say that it's not supported to manage this for the user on those deployments | 17:12 |
sean-k-mooney | no info ever flows the other way | 17:13 |
dansmith | if they do, then you can, and if a popular backend doesn't have support for this, but should, then ... work should be done :) | 17:13 |
dansmith | especially for something as important as OVN | 17:13 |
sean-k-mooney | it only got this in yoga | 17:13 |
sean-k-mooney | with some support added in backports i think | 17:13 |
sean-k-mooney | at least the per-port dns info only got added in yoga | 17:14 |
bauzas | I have to quit by now | 17:14 |
johnsom | Just a note, the guest VM FQDN used for the hostname in the kernel is not related to the FQDN(s) configured for the host in DNS/BIND/Designate. Very different things | 17:14 |
sean-k-mooney | internal dns support is in progress for zed | 17:14 |
sean-k-mooney | johnsom: well by definition the hostname used by the kernel should not be an FQDN | 17:14 |
johnsom | Not true | 17:14 |
sean-k-mooney | johnsom: thats what started this mess | 17:14 |
johnsom | But, we have had that discussion. | 17:15 |
sean-k-mooney | from a nova perspective we never supported that | 17:15 |
bauzas | yup | 17:15 |
bauzas | hostnames are host names | 17:15 |
dansmith | johnsom: we're talking about providing some way to say that the former should be set from the latter according to some rule, like "take the first one" or "this one is my primary" | 17:15 |
johnsom | Yeah, I'm not talking for Nova. Just the kernel UTC | 17:15 |
bauzas | hostnames aren't FQDNs in Nova | 17:15 |
dansmith | instead of nova taking a hostname that we set in the guest | 17:15 |
sean-k-mooney | johnsom: from a kernel UTC perspective the validity of an FQDN as a hostname depends on what distro you ask | 17:16 |
* bauzas drops by now | 17:16 | |
johnsom | sean-k-mooney It's defined and managed in the kernel, so as long as it's running a Linux kernel it will be the same. | 17:17 |
sean-k-mooney | johnsom: what is allowable and what is recommended are two different things | 17:17 |
* johnsom notes, sorry, typo, it's the UTS namespace. I have timezones on the brain today | 17:17 | |
sean-k-mooney | https://www.freedesktop.org/software/systemd/man/hostname.html | 17:17 |
sean-k-mooney | """ The hostname should be composed of up to 64 7-bit ASCII lower-case alphanumeric characters or hyphens forming a valid DNS domain name. It is recommended that this name contains only a single label, i.e. without any dots. """ | 17:18 |
johnsom | Once again, what systemd does/thinks is not what the kernel does. | 17:18 |
sean-k-mooney | correct | 17:19 |
sean-k-mooney | but that is the same recommendation that nova has for how to construct hostnames | 17:19 |
johnsom | As I have mentioned before, even RHEL satellite expects a FQDN. | 17:19 |
sean-k-mooney | yep and we recommend against using fqdns in our downstream product | 17:19 |
sean-k-mooney | that recommendation was ignored and thats fine | 17:19 |
sean-k-mooney | but strictly speaking nova did not originally intend to support having two compute nodes with the same hostname but different fqdns | 17:20 |
johnsom | Really, I think this is getting over thought a bit. It seems like we should just pass the FQDN in the metadata and let cloud-init figure out what settings need to go where. That way people don't have to change the hostname of the guest to sign up the instance with satellite, etc. | 17:21 |
sean-k-mooney | dansmith: johnsom for what its worth i think alternative 3 in my latest revision might be the best path forward | 17:21 |
sean-k-mooney | johnsom: that is kind of my option 3 | 17:22 |
sean-k-mooney | johnsom: https://review.opendev.org/c/openstack/nova-specs/+/850352/1/specs/zed/approved/configurable-instance-domains.rst#96 | 17:22 |
johnsom | I am just now reading through that. | 17:22 |
dansmith | I don't understand that opinion *at all* :) | 17:22 |
sean-k-mooney | so we would truncate and set hostname to everything up to the first . and fqdn to the full thing | 17:22 |
sean-k-mooney | dansmith: did you look at https://github.com/canonical/cloud-init/blob/91fd72c3f5b416b7815314eebea0b82ccd7e3f73/cloudinit/config/cc_set_hostname.py#L25-L32= | 17:23 |
johnsom | https://cloudinit.readthedocs.io/en/latest/topics/modules.html#set-hostname | 17:24 |
johnsom | That too is a good reference | 17:24 |
sean-k-mooney | i think? its the same but html version | 17:24 |
dansmith | cloud-init does that from the user's own instance metadata right? | 17:24 |
sean-k-mooney | ya it is | 17:24 |
sean-k-mooney | hum | 17:25 |
sean-k-mooney | you thinking we can just set the fqdn on the instance metadata | 17:25 |
sean-k-mooney | and it will show up | 17:25 |
sean-k-mooney | and prefer_fqdn_over_hostname | 17:25 |
sean-k-mooney | that would be worth a try | 17:25 |
dansmith | I dunno, I'm asking.. if so that seems like a much better deal, | 17:25 |
dansmith | basically the contract is between the user and cloud-init, with nova uninvolved | 17:25 |
sean-k-mooney | well its reading those values from the instance metadata | 17:26 |
sean-k-mooney | im just not sure if the urls line up | 17:26 |
sean-k-mooney | but i can try that now | 17:26 |
sean-k-mooney | im not sure if the servers generic metadata is under a subkey | 17:27 |
dansmith | right but is "instance metadata" the actual user's metadata blob, or the "metadata blob that nova generates, of which a sub-dict is the user's" ? | 17:27 |
sean-k-mooney | its not part of user data | 17:28 |
sean-k-mooney | https://github.com/openstack/nova/blob/master/nova/api/metadata/base.py#L161= | 17:28 |
sean-k-mooney | im just going to boot a vm and see what it looks like | 17:29 |
dansmith | not user data, user metadata | 17:29 |
sean-k-mooney | https://github.com/openstack/nova/blob/master/nova/api/metadata/base.py#L318= | 17:30 |
sean-k-mooney | it looks like its in a subkey called meta | 17:30 |
dansmith | right, metadata['meta'] = { .. the stuff I passed to the API as "metadata" .. } | 17:32 |
dansmith | correct? | 17:32 |
sean-k-mooney | ya i think so | 17:32 |
dansmith | and that's not where cloud-init is looking, correct? | 17:33 |
sean-k-mooney | not sure about that last bit i think its looking for FQDN as a sibling of meta | 17:33 |
sean-k-mooney | but not sure | 17:33 |
dansmith | oh okay good | 17:33 |
sean-k-mooney | hostname is at the same level | 17:33 |
sean-k-mooney | https://github.com/openstack/nova/blob/master/nova/api/metadata/base.py#L349= | 17:34 |
sean-k-mooney | vm is booting now we will know shortly | 17:35 |
sean-k-mooney | ubuntu@meta-test:~$ sudo cat /var/lib/cloud/data/set-hostname | 17:40 |
sean-k-mooney | { | 17:40 |
sean-k-mooney | "fqdn": "meta-test", | 17:40 |
sean-k-mooney | "hostname": "meta-test" | 17:40 |
sean-k-mooney | } | 17:40 |
sean-k-mooney | whats the special address again, ill see whats in the api with curl | 17:40 |
sean-k-mooney | 169.254.169.254 perhaps | 17:41 |
sean-k-mooney | odd im not seeing it where i expect to | 17:42 |
sean-k-mooney | https://termbin.com/n7bk | 17:44 |
sean-k-mooney | so its set on the instance | 17:44 |
sean-k-mooney | but i dont see fqdn anywhere | 17:44 |
sean-k-mooney | isnt server metadata meant to be discoverable via the metadata api endpoint | 17:45 |
sean-k-mooney | im expecting to see it somewhere under curl 169.254.169.254/latest/meta-data | 17:45 |
*** dasm|ruck is now known as dasm|off | 17:48 | |
sean-k-mooney | so that would be a no | 17:52 |
sean-k-mooney | unless im missing something im not seeing where we expose the userdata or metadata at that url | 17:53 |
dansmith | sean-k-mooney: https://github.com/openstack/nova/blob/c53ec4e48884235566962bc934cbf292ad5b67b8/nova/api/metadata/base.py#L317-L318 | 17:54 |
dansmith | launch_metadata comes from utils.instance_meta() which should be the user's k=v metadata they provided | 17:55 |
sean-k-mooney | yes and we register that function as a path handler here https://github.com/openstack/nova/blob/c53ec4e48884235566962bc934cbf292ad5b67b8/nova/api/metadata/base.py#L221= | 17:56 |
sean-k-mooney | so with "meta_data.json" as the route | 17:57 |
sean-k-mooney | ill check that | 17:57 |
sean-k-mooney | so the instance metadata is not part of the ec2 info https://github.com/openstack/nova/blob/c53ec4e48884235566962bc934cbf292ad5b67b8/nova/api/metadata/base.py#L235-L303= | 18:00 |
sean-k-mooney | im not really sure what to say to be honest | 18:04 |
sean-k-mooney | ok its there | 18:08 |
sean-k-mooney | not that i can find via curl but | 18:09 |
sean-k-mooney | cloud-init query -a sees it | 18:09 |
sean-k-mooney | its in the meta section | 18:09 |
sean-k-mooney | https://paste.opendev.org/show/bfDEteTetI0rInj8NfBx/ | 18:09 |
sean-k-mooney | line 57-60 | 18:11 |
sean-k-mooney | oh its at http://169.254.169.254/openstack/2012-08-10/meta_data.json | 18:15 |
sean-k-mooney | curl http://169.254.169.254 does not list the openstack directory | 18:16 |
sean-k-mooney | so that makes sense without openstack im looking at the ec2 part | 18:16 |
sean-k-mooney | dansmith: https://termbin.com/cxty that is the metadata we generated for that instance | 18:21 |
dansmith | that's what I would expect, yeah, | 18:21 |
dansmith | so does cloud-init look at that meta['fqdn'] properly? | 18:21 |
sean-k-mooney | no | 18:22 |
sean-k-mooney | it could but it does not appear to | 18:22 |
sean-k-mooney | so the openstack data source could be updated | 18:22 |
dansmith | could or should? | 18:22 |
sean-k-mooney | to make keys under meta have precedence | 18:22 |
sean-k-mooney | could, im not sure it should or not | 18:22 |
sean-k-mooney | or well we could implement that too: allow keys to be "un-namespaced" i.e. not embedded in the meta key | 18:23 |
sean-k-mooney | assuming the hostname field there has precedence over the ec2 version | 18:24 |
dansmith | we really *really* should avoid nova making any contract about the keys in user metadata | 18:24 |
sean-k-mooney | which im not sure it does | 18:24 |
dansmith | if it's something cloud-init looks at, then fine, but we should not be touching that stuff | 18:24 |
sean-k-mooney | ack | 18:24 |
dansmith | well, still, it seems to me that hostname should be the short part and we should get domain from some primary network affiliation | 18:25 |
sean-k-mooney | well i believe they put fqdns in it sometimes in the examples | 18:25 |
dansmith | so remind me again why the user can't just use an fqdn in the hostname field and we just look the other way? | 18:27 |
sean-k-mooney | it broke designate | 18:27 |
sean-k-mooney | or neutron | 18:28 |
sean-k-mooney | because we passed that directly to them | 18:28 |
sean-k-mooney | if the hostname had a numeric tld | 18:28 |
sean-k-mooney | so ubuntu-20.04 | 18:28 |
sean-k-mooney | would break neutron | 18:28 |
sean-k-mooney | because we set the dns_name to that | 18:28 |
sean-k-mooney | dansmith: https://cloudinit.readthedocs.io/en/latest/topics/instancedata.html#example-output was the example i was referring to | 18:32 |
dansmith | they do show hostname having an FQDN, even though it's a . internal one | 18:33 |
dansmith | but it also shows the public real hostnames being more associated with interfaces, which seems more right-er to me | 18:34 |
sean-k-mooney | yep "public_hostname": "ec2-3-89-187-177.compute-1.amazonaws.com", | 18:34 |
sean-k-mooney | if you look at v1 the hostname was truncated | 18:34 |
sean-k-mooney | then the top level "local_hostname": "ip-172-31-81-43", is still truncated | 18:35 |
sean-k-mooney | but hostname is now "hostname": "ip-172-31-81-43.ec2.internal", | 18:36 |
sean-k-mooney | and they have per network interface info | 18:36 |
sean-k-mooney | dansmith: the other spec which we abandoned | 18:36 |
sean-k-mooney | was adding the per networking interface info | 18:36 |
sean-k-mooney | since that would allow us to pass all the info from neutron | 18:36 |
sean-k-mooney | and just get out of the way | 18:37 |
sean-k-mooney | but apparently cloud init wont use that | 18:37 |
sean-k-mooney | this is all based on aws by the way | 18:37 |
sean-k-mooney | the example | 18:37 |
sean-k-mooney | and they have obviously changed their mind over time | 18:38 |
johnsom | dansmith I am 100% on board with allowing fqdn in the hostname field and just pass it to cloud-init to deal with. That would solve the problem a customer was having. If there is an issue with neutron, that should be easily fixable. | 18:38 |
dansmith | johnsom: agree | 18:38 |
sean-k-mooney | johnsom: that customer issue is why we are talking about this | 18:38 |
johnsom | Yeah, I guessed as much | 18:39 |
sean-k-mooney | we had the option of doing that in wallaby and agreed to add the display name sanitization | 18:39 |
sean-k-mooney | we also had a mailing list thread on this topic | 18:39 |
sean-k-mooney | so we could just allow fqdns again and strip or pass the domain when talking to neutron | 18:40 |
sean-k-mooney | our concern with that is we have shipped the normalisation for 3 releases now | 18:40 |
dansmith | no | 18:40 |
sean-k-mooney | so we dont know who we will break | 18:40 |
dansmith | we should not parse the hostname and split out domains | 18:40 |
sean-k-mooney | dansmith: so neutron should? | 18:41 |
dansmith | we should take that string, pass it to cloud-init and/or neutron, and fix whatever the problem is on the neutron side that didn't like it sometimes | 18:41 |
dansmith | sean-k-mooney: neutron is the networking service | 18:41 |
johnsom | dansmith +1 | 18:41 |
sean-k-mooney | right but we are currently setting the dns_name field in their api | 18:41 |
sean-k-mooney | to an fqdn when its defined to take a hostname | 18:41 |
sean-k-mooney | https://github.com/openstack/nova/blob/50fdbc752a9ca9c31488140ef2997ed59d861a41/releasenotes/notes/instance-hostname-used-to-populate-ports-dns-name-08341ec73dc076c0.yaml | 18:42 |
dansmith | I really think the right thing is for us to take a hostname and get our domain affiliation from neutron, but if we really really need to be able to take an FQDN via nova, we should be as hands-off about it as possible | 18:42 |
sean-k-mooney | well right now the hostname field can only be a hostname if passed directly | 18:43 |
johnsom | It is a common case that the FQDN in the guest does not match the FQDN on the port in neutron. One is an internal view, the other external. | 18:43 |
sean-k-mooney | if its not passed we generate it from the display name | 18:43 |
sean-k-mooney | johnsom: sure but the customer in question needs it to be resolvable | 18:44 |
sean-k-mooney | so whatever gets set needs to actually resolve in neutrons dns | 18:44 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Rename [pci]passthrough_whitelist to device_spec https://review.opendev.org/c/openstack/nova/+/843834 | 18:44 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Rename exception.PciConfigInvalidWhitelist to PciConfigInvalidSpec https://review.opendev.org/c/openstack/nova/+/843861 | 18:44 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Rename whitelist in tests https://review.opendev.org/c/openstack/nova/+/843862 | 18:44 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Basics for PCI Placement reporting https://review.opendev.org/c/openstack/nova/+/846187 | 18:44 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Extend device_spec with resource_class and traits https://review.opendev.org/c/openstack/nova/+/846218 | 18:44 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Reject PCI dependent device config https://review.opendev.org/c/openstack/nova/+/846435 | 18:44 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Reject mixed VF rc and trait config https://review.opendev.org/c/openstack/nova/+/846436 | 18:44 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Ignore PCI devs with physical_network tag https://review.opendev.org/c/openstack/nova/+/846219 | 18:44 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Reject devname based device_spec config https://review.opendev.org/c/openstack/nova/+/846466 | 18:44 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Support [pci]device_spec reconfiguration https://review.opendev.org/c/openstack/nova/+/846470 | 18:44 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Stop if tracking is disable after it was enabled before https://review.opendev.org/c/openstack/nova/+/847009 | 18:44 |
dansmith | johnsom: I think that's a broken way to think about it in a managed environment, which is why I think you should choose one port to be your primary interface and we get our domain affiliation from that | 18:44 |
sean-k-mooney | dansmith: perhaps but in general ports and networks dont have domains | 18:45 |
sean-k-mooney | they do in the customers case | 18:45 |
sean-k-mooney | they define them on the network | 18:45 |
dansmith | but I understand there's practicalities about where we are at the moment, so I'd rather just pass the hostname to the other services and let them handle what to do if it looks like an fqdn | 18:45 |
sean-k-mooney | and they are expecting the domain to propagate down to all vms on that network | 18:45 |
dansmith | sean-k-mooney: again, I'm not talking about absolutes about how neutron works today, I'm saying what I think about the way it *should* work | 18:46 |
sean-k-mooney | dansmith: to make that work both neutron and designate would need to be able to handle that | 18:46 |
dansmith | sean-k-mooney: I understand | 18:47 |
johnsom | dansmith Think about the case where the domain is cloud generated based on the project ID, etc. The in-guest name doesn't necessarily need to be resolvable from outside the guest. It can be, but it's not necessary. | 18:47 |
sean-k-mooney | johnsom: in their particular case it needs to be but in general it may not | 18:48 |
sean-k-mooney | https://github.com/openstack/nova/blob/93a65f06df67ce39d65827692150c78013c7f6d5/nova/network/neutron.py#L1737= | 18:48 |
sean-k-mooney | this is where we set the dns_name on the port by the way | 18:48 |
sean-k-mooney | if that just truncated the hostname neutron and designate would work | 18:48 |
sean-k-mooney | we are sanitising the hostname to work around that bad request from neutron today | 18:49 |
johnsom | Yep, just split on the first label | 18:49 |
sean-k-mooney | so we could revert the normalisation we do right now when setting instance.hostname and truncate there, and all use cases that worked before wallaby would work again, however the downside of that is if your vm is called ubuntu-20.04, in neutron it will have ubuntu-20 set as the domain name | 18:52 |
dansmith | johnsom: I know, in the most generic cloud case it would also be something other than your own domain. It just seems like we've got all this overly-complex plumbing of networks now, so we should be able to actually manage network things via the network :) | 18:52 |
sean-k-mooney | and to make that worse it breaks multi create in that case | 18:52 |
dansmith | sean-k-mooney: I really don't think we should be normalizing or truncating or splitting the hostname in nova | 18:52 |
dansmith | perhaps checking that it's a valid hostname (*maybe*) | 18:52 |
dansmith | but let the other services deal with it | 18:52 |
sean-k-mooney | dansmith: well we have been since before essex | 18:53 |
sean-k-mooney | there was existing code that removed unicode | 18:53 |
dansmith | sean-k-mooney: and look at where we're at :) | 18:53 |
sean-k-mooney | and some other special characters | 18:53 |
sean-k-mooney | so just saying if we revert this it will still exist | 18:53 |
sean-k-mooney | dansmith: im not disagreeing that its undesirable | 18:54 |
* johnsom grumbles that unicode is valid in the kernel hostname UTS | 18:54 | |
dansmith | again, checking for sanity is not such a big deal, but I'd think we'd want to reject the instance boot, not just sanitize-and-go | 18:54 |
sean-k-mooney | dansmith: that was also something we discussed | 18:54 |
sean-k-mooney | psi was unhappy with that proposal | 18:54 |
sean-k-mooney | but i think that was our first response | 18:54 |
sean-k-mooney | "this is invalid sorry neutron told us so" | 18:55 |
sean-k-mooney | i dont recall all the details but with queens and without designate vms with numeric TLDs booted | 18:56 |
sean-k-mooney | and with train and designate it did not | 18:56 |
sean-k-mooney | nova has been setting the dns_name from instance.hostname since mitaka | 18:57 |
dansmith | we have friends that work on designate right? :) | 18:57 |
sean-k-mooney | so that was either a change in neutron or caused by adding designate | 18:58 |
johnsom | grin | 18:58 |
johnsom | You do.... | 18:58 |
sean-k-mooney | plural? | 18:58 |
dansmith | sean-k-mooney: johnsom is worth at least two | 18:58 |
sean-k-mooney | :) | 18:58 |
johnsom | lol | 18:58 |
dansmith | johnsom: that was a comment of your worth, not your waistline, btw ;P | 18:58 |
sean-k-mooney | i just thought designate was one of the more understaffed projects | 18:58 |
sean-k-mooney | i think this validation change was in neutron to be honest | 18:59 |
sean-k-mooney | between train and queens | 18:59 |
johnsom | There are two full time RH folks, and a couple more cores active. | 18:59 |
sean-k-mooney | oh ok glad that has improved | 18:59 |
johnsom | But, as dansmith said, the Designate buck stops with me at the moment, so if we need to fix something on the designate side, assign the bug to me. | 19:00 |
sean-k-mooney | https://github.com/openstack/neutron-lib/blob/f01b2e9025d33aeff3bf22ea2568bda036878819/neutron_lib/api/validators/dns.py#L59-L92= | 19:02 |
sean-k-mooney | so that apparently is what does the validation in neutron | 19:02 |
sean-k-mooney | well it starts here https://github.com/openstack/neutron-lib/blob/f01b2e9025d33aeff3bf22ea2568bda036878819/neutron_lib/api/validators/dns.py#L112= | 19:03 |
dansmith | I just had to: https://imgur.com/a/spREDAg | 19:03 |
johnsom | lol | 19:04 |
johnsom | At least it's a $100 | 19:04 |
dansmith | inflation, yo | 19:04 |
sean-k-mooney | https://github.com/openstack/neutron-lib/blob/f01b2e9025d33aeff3bf22ea2568bda036878819/neutron_lib/api/validators/dns.py#L50-L52= | 19:05 |
sean-k-mooney | that is what was rejecting the numeric tlds | 19:05 |
sean-k-mooney | and that has been in place since pike | 19:06 |
sean-k-mooney | so the psi issue was caused by turning on the dns extension | 19:06 |
sean-k-mooney | it was there in mitaka too, which is when we started setting that field https://github.com/openstack/neutron/blob/4d8685da8050df79d9193f91cab572cfc6d67a47/neutron/extensions/dns.py#L130-L133= | 19:09 |
sean-k-mooney | dansmith: so we could go back to not normalising or do what we do downstream | 19:10 |
dansmith | sounds like we need to collab with the neutron buck-stop | 19:10 |
sean-k-mooney | downstream if the tld is numeric we normalise to remove . | 19:11 |
sean-k-mooney | but otherwise we allow the fqdn | 19:11 |
sean-k-mooney | in hostname | 19:11 |
sean-k-mooney | so downstream its targeted to just making that one edgecase pass since the change was never backported upstream | 19:12 |
sean-k-mooney | upstream from wallaby on we replace all '.' with _ or - i cant recall | 19:12 |
johnsom | The TLD rules are pretty simple, I think it is perfectly acceptable to error to the user. There are two length limitations and the basic regex neutron has. I assume the neutron raise comes too late for nova to communicate that to the user? | 19:16 |
sean-k-mooney | yes it happens on the compute node when we are binding the ports | 19:16 |
dansmith | not too late to communicate, just too late to reject the request | 19:17 |
dansmith | we have lots of reasons why the instance goes into error state based on lies you told us earlier | 19:17 |
sean-k-mooney | without designate this would also work | 19:17 |
sean-k-mooney | because nova wont try to set the field | 19:18 |
sean-k-mooney | since the extension is not enabled | 19:18 |
dansmith | so if dns is enabled and you gave us something invalid, then failing to wire up would be a fine reason to error the instance | 19:18 |
sean-k-mooney | ok because that is what we used to do | 19:18 |
johnsom | Yep | 19:18 |
sean-k-mooney | before it was reported as a bug and "fixed" | 19:18 |
sean-k-mooney | in wallaby | 19:18 |
dansmith | "hacked" | 19:19 |
dansmith | "swept into the future debt dustbin to screw someone else later" | 19:19 |
dansmith | but yeah :) | 19:19 |
sean-k-mooney | well we filed an rfe to add hostname as a separate top level parameter and did a lot of other work | 19:19 |
sean-k-mooney | but ya im not happy with the situation we are in currently | 19:20 |
sean-k-mooney | going back to that old behavior will break some users but fix others | 19:20 |
sean-k-mooney | depending on if designate is available or not | 19:20 |
sean-k-mooney | which yes is technically detectable via the neutron api | 19:21 |
sean-k-mooney | by checking the extensions as that is one of the few that is only reported when enabled i believe | 19:21 |
sean-k-mooney | neutron has a habit of reporting all extensions even if they are not enabled making it impossible to determine that | 19:21 |
johnsom | So, what I am hearing is a proposal: hostname field, remove the FQDN restriction, hand it off to cloud-init single label or FQDN, no hacking on the string (i.e. no . -> -). Pass the string through to neutron. If the domain doesn't match or is rejected, ERROR the instance with "invalid hostname" in the error field. | 19:22 |
johnsom | Just trying to summarize for clarity | 19:22 |
dansmith | that's what I'm saying yeah | 19:23 |
sean-k-mooney | that would regress a fixed bug | 19:23 |
dansmith | there might be some opinions about whether or not we need to hide that behind a microversion or not I guess | 19:23 |
sean-k-mooney | and break people on upgrade | 19:23 |
johnsom | That works for me and would solve the customer issue | 19:23 |
sean-k-mooney | including breaking psi | 19:23 |
sean-k-mooney | but if we have a way to help them fix all invalid hostnames we could | 19:23 |
dansmith | sean-k-mooney: it doesn't break them if the neutron thing is fixed right? | 19:24 |
johnsom | Why would it break on upgrade? you are going from more restrictive to less | 19:24 |
sean-k-mooney | it wont break existing vms | 19:24 |
sean-k-mooney | that we have already normalised | 19:24 |
johnsom | Right | 19:24 |
sean-k-mooney | but it will break anyone that started depending on that | 19:24 |
sean-k-mooney | so customers with existing heat templates | 19:24 |
sean-k-mooney | would find it breaks | 19:25 |
dansmith | depending on what specifically? the mangled hostname? | 19:25 |
sean-k-mooney | yes | 19:25 |
johnsom | They would start getting ERROR instances if the hostname is bogus instead of having the hostname switched around on them. Which seems like the right answer to me. APIs that magically change the data input to something else are ... unpleasant | 19:26 |
sean-k-mooney | johnsom: im pretty sure you reviewed this in the past by the way, if not apologies, but the mangling was discussed at least on the mailing list | 19:27 |
sean-k-mooney | and it was chosen to go with that approach since we already did it for unicode and we were following the rfc for mangling rules | 19:28 |
sean-k-mooney | and we explicitly asked operators if they were depending on the fqdns in the hostname at the time | 19:28 |
dansmith | johnsom: agree, and unless we let people opt into the old behavior with a microversion, we're already changing that behavior underneath them | 19:28 |
sean-k-mooney | that is an option | 19:29 |
dansmith | no, | 19:29 |
dansmith | I mean with the previous change | 19:29 |
sean-k-mooney | disable the mangling in a new microversion | 19:29 |
dansmith | going from more mangling to less mangling is less disruptive, I'm sure | 19:29 |
sean-k-mooney | it will go from 200 to 400 | 19:29 |
dansmith | no, because we won't know until too late, right? | 19:30 |
dansmith | but as I said above, there's a discussion to be had on the microversion requirement | 19:30 |
sean-k-mooney | actually right, it will not change the response, the instance will go from active to error | 19:30 |
dansmith | ...right | 19:30 |
sean-k-mooney | https://github.com/openstack/nova/commit/9046f0fff4be424eda25401a3f9b8752964de775 that was the change we did 2 years ago to address https://bugs.launchpad.net/nova/+bug/1581977 | 19:34 |
sean-k-mooney | https://lists.openstack.org/pipermail/openstack-discuss/2020-November/019113.html was the mailing list thread | 19:35 |
dansmith | sean-k-mooney: that's because we set hostname from display name if hostname isn't set specifically right? | 19:37 |
sean-k-mooney | yes before xena that was the only way to set hostname | 19:38 |
sean-k-mooney | it was an internal attribute on the instance | 19:38 |
dansmith | ah | 19:38 |
sean-k-mooney | we added an api to set that in response to the issues raised with ^ | 19:38 |
dansmith | okay so, this is even less of a problem I think.. if they specify the hostname, we can just take it as-is and if not, then we keep the existing display->filter->hostname behavior right? | 19:38 |
sean-k-mooney | https://specs.openstack.org/openstack/nova-specs/specs/xena/implemented/configurable-instance-hostnames.html | 19:39 |
sean-k-mooney | well we do not allow FQDNs via that api | 19:39 |
sean-k-mooney | e.g. if you pass --hostname it must not be an fqdn | 19:39 |
dansmith | ack, so that requires a microversion then | 19:39 |
sean-k-mooney | to allow it to be an fqdn yes | 19:40 |
sean-k-mooney | if we want to allow that | 19:40 |
dansmith | all sounds fine to me, and isn't going to break anyone | 19:40 |
sean-k-mooney | well that is what the --domain spec was trying to do | 19:40 |
johnsom | dansmith Yeah, that was what I was thinking. The display name is always going to be a mess. But the hostname field is pretty clear it can't be crazy and should be good for pass through | 19:40 |
sean-k-mooney | add a domain field to not continue to overload hostname | 19:40 |
sean-k-mooney | since hostname is used for dns_name and that cant be an fqdn today | 19:40 |
dansmith | johnsom: right and since we have the displayname->hostname(ifunset) part, we can keep the mangling if you don't otherwise set it to something... but if you do, it better be right | 19:41 |
sean-k-mooney | well ok it kind of can | 19:41 |
dansmith | "it" being hostname in my statement above | 19:41 |
sean-k-mooney | dansmith: so the proposal is to just relax instance.hostname if you pass it explicitly | 19:41 |
sean-k-mooney | that might need a db migration | 19:41 |
sean-k-mooney | i would have to check the column size | 19:42 |
johnsom | Nope, I looked, the field is fine in the DB | 19:42 |
dansmith | sean-k-mooney: sounds like that's all that needs to happen? | 19:42 |
johnsom | It's 255 already | 19:42 |
sean-k-mooney | oh ok | 19:42 |
sean-k-mooney | we are limiting it to 63 | 19:42 |
sean-k-mooney | in the api | 19:42 |
sean-k-mooney | so we would just need to drop the extra validation on hostname when its passed explicitly | 19:42 |
sean-k-mooney | and live with the fact it can be an fqdn | 19:43 |
dansmith | for the new microversion yeah | 19:43 |
sean-k-mooney | and document that it will be passed as is to neutron | 19:43 |
dansmith | I don't love it, but I like it a lot better than us taking multiple things and constructing FQDNs | 19:43 |
sean-k-mooney | ya we discussed having --fqdn when --hostname was added | 19:43 |
sean-k-mooney | but we did not want to support fqdns | 19:43 |
sean-k-mooney | but i guess that is a less invasive change | 19:44 |
dansmith | okay now hold up, | 19:44 |
dansmith | what about the multi-create case? | 19:44 |
sean-k-mooney | we will suffix the fqdn | 19:44 |
dansmith | that would require the parsing of the hostname in nova to insert the index | 19:44 |
sean-k-mooney | well or that | 19:44 |
sean-k-mooney | or make the suffix a prefix | 19:45 |
dansmith | suffixing the fqdn won't yield anything valid, so no point in doing that | 19:45 |
dansmith | yeah 1-$hostname would work I think | 19:45 |
sean-k-mooney | we could change that in the microversion | 19:45 |
sean-k-mooney | i assume we cant just decide to not support multi create | 19:46 |
dansmith | well, I was going to say, | 19:46 |
sean-k-mooney | it would make life easier in many ways :) | 19:46 |
dansmith | I wonder how useful hostname really is in the multi case | 19:46 |
sean-k-mooney | so in artom's spec he handled this by saying we would continue to suffix the hostname part as we do today | 19:46 |
sean-k-mooney | but the domain value would be appended and the same for all instances | 19:47 |
sean-k-mooney | which when you have it as two parts makes sense | 19:47 |
sean-k-mooney | but if we want to avoid parsing then ya prefix or declare not supported | 19:47 |
dansmith | or just set them all to what they gave us | 19:48 |
sean-k-mooney | or that | 19:48 |
sean-k-mooney | ok its getting late here | 19:48 |
sean-k-mooney | if we want to explore this this cycle i can update the spec | 19:48 |
dansmith | ...and I'm sick of obsessing over hostnames :) | 19:48 |
sean-k-mooney | and formally ask for a spec freeze on the mailing list | 19:49 |
dansmith | I can't imagine we're going to get to agreement on all this in time | 19:49 |
sean-k-mooney | thats fine i suspect the same | 19:49 |
sean-k-mooney | im also kind of burnt out on this but i can prepare a draft or at least link to this in the spec | 19:50 |
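
The multi-create point above (suffixing vs. prefixing the index when the hostname is an FQDN) can be illustrated with a minimal Python sketch. This is not Nova's code: `suffix_index`, `prefix_index` and `split_fqdn` are hypothetical helpers, and `web.example.com` is a made-up name. It only shows that appending `-N` to an FQDN alters the last label of the domain, while prepending `N-` touches only the host label and needs no parsing of the string.

```python
def suffix_index(hostname: str, index: int) -> str:
    """Append the multi-create index to the full string (hypothetical helper)."""
    return f"{hostname}-{index}"


def prefix_index(hostname: str, index: int) -> str:
    """Prepend the multi-create index to the full string (hypothetical helper)."""
    return f"{index}-{hostname}"


def split_fqdn(name: str) -> tuple[str, str]:
    """Split a name into (host label, domain) at the first dot."""
    host, _, domain = name.partition(".")
    return host, domain


if __name__ == "__main__":
    fqdn = "web.example.com"  # made-up example name
    # Suffixing changes the domain part, not the host label.
    print(split_fqdn(suffix_index(fqdn, 1)))
    # Prefixing changes only the host label and leaves the domain intact.
    print(split_fqdn(prefix_index(fqdn, 1)))
```

With a single-label hostname both forms stay inside one label, which is why the existing suffix behaviour only becomes a problem once FQDNs are accepted.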