*** awalende has joined #openstack-nova | 00:06 | |
*** gfhellma_ has joined #openstack-nova | 00:07 | |
*** gfhellma_ has quit IRC | 00:09 | |
*** awalende has quit IRC | 00:10 | |
*** slaweq has joined #openstack-nova | 00:11 | |
*** yonglihe has quit IRC | 00:15 | |
*** slaweq has quit IRC | 00:16 | |
*** gyee has quit IRC | 00:32 | |
*** brinzhang has joined #openstack-nova | 00:35 | |
openstackgerrit | Brin Zhang proposed openstack/python-novaclient master: Microversion 2.74: Support Specifying AZ to unshelve https://review.opendev.org/665136 | 00:43 |
---|---|---|
*** yonglihe has joined #openstack-nova | 00:54 | |
*** brinzh has joined #openstack-nova | 00:55 | |
*** brinzhang has quit IRC | 00:58 | |
*** takashin has joined #openstack-nova | 00:59 | |
*** mriedem_away has quit IRC | 01:01 | |
*** yedongcan has joined #openstack-nova | 01:06 | |
*** lbragstad has quit IRC | 01:08 | |
*** tetsuro has joined #openstack-nova | 01:10 | |
*** brinzhang has joined #openstack-nova | 01:11 | |
*** brinzh has quit IRC | 01:14 | |
*** igordc has quit IRC | 01:16 | |
openstackgerrit | Merged openstack/nova master: Log quota legacy method warning only if counting from placement https://review.opendev.org/665765 | 01:45 |
*** tetsuro has quit IRC | 01:48 | |
*** tetsuro has joined #openstack-nova | 01:52 | |
openstackgerrit | Fan Zhang proposed openstack/nova master: Log disk transfer stats in live migration monitor. https://review.opendev.org/619395 | 01:52 |
*** guozijn has joined #openstack-nova | 01:53 | |
*** hongbin has joined #openstack-nova | 01:58 | |
openstackgerrit | Fan Zhang proposed openstack/nova master: Retry after hitting libvirt error VIR_ERR_OPERATION_INVALID in live migration. https://review.opendev.org/612272 | 02:04 |
*** whoami-rajat has joined #openstack-nova | 02:08 | |
*** tinwood has quit IRC | 02:08 | |
*** tinwood has joined #openstack-nova | 02:10 | |
*** slaweq has joined #openstack-nova | 02:11 | |
*** slaweq has quit IRC | 02:16 | |
openstackgerrit | Takashi NATSUME proposed openstack/python-novaclient master: Fix duplicate object description error https://review.opendev.org/666203 | 02:35 |
*** tetsuro has quit IRC | 02:47 | |
*** _alastor_ has quit IRC | 02:50 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Add a live migration regression test https://review.opendev.org/641200 | 02:52 |
*** ricolin has joined #openstack-nova | 02:55 | |
*** markvoelker has joined #openstack-nova | 03:00 | |
*** markvoelker has quit IRC | 03:06 | |
*** slaweq has joined #openstack-nova | 03:11 | |
*** slaweq has quit IRC | 03:15 | |
*** guozijn has quit IRC | 03:18 | |
openstackgerrit | Merged openstack/nova master: Clean up NumInstancesFilter related docs https://review.opendev.org/665768 | 03:21 |
*** tetsuro has joined #openstack-nova | 03:29 | |
*** psachin has joined #openstack-nova | 03:32 | |
*** tetsuro has quit IRC | 03:34 | |
*** _alastor_ has joined #openstack-nova | 03:49 | |
*** _alastor_ has quit IRC | 03:54 | |
*** psachin has quit IRC | 04:01 | |
*** markvoelker has joined #openstack-nova | 04:01 | |
*** psachin has joined #openstack-nova | 04:03 | |
*** markvoelker has quit IRC | 04:06 | |
*** guozijn has joined #openstack-nova | 04:09 | |
*** guozijn has quit IRC | 04:22 | |
*** bhagyashris has joined #openstack-nova | 04:24 | |
*** hongbin has quit IRC | 04:25 | |
*** shilpasd_ has quit IRC | 04:27 | |
*** udesale has joined #openstack-nova | 04:31 | |
openstackgerrit | Takashi NATSUME proposed openstack/python-novaclient master: Add irrelevant files in dsvm job https://review.opendev.org/666217 | 04:47 |
*** cfriesen has quit IRC | 04:58 | |
*** tetsuro has joined #openstack-nova | 05:07 | |
*** slaweq has joined #openstack-nova | 05:11 | |
openstackgerrit | Madhuri Kumari proposed openstack/nova master: Replace deprecated with_lockmode with with_for_update https://review.opendev.org/666221 | 05:16 |
*** slaweq has quit IRC | 05:16 | |
*** dklyle_ has joined #openstack-nova | 05:24 | |
*** david-lyle has quit IRC | 05:27 | |
*** purplerbot has quit IRC | 05:27 | |
*** purplerbot has joined #openstack-nova | 05:28 | |
*** alex_xu has quit IRC | 05:30 | |
*** irclogbot_0 has quit IRC | 05:30 | |
*** awestin1 has quit IRC | 05:30 | |
*** sridharg has joined #openstack-nova | 05:31 | |
*** mgoddard has quit IRC | 05:31 | |
*** irclogbot_3 has joined #openstack-nova | 05:31 | |
*** awestin1 has joined #openstack-nova | 05:32 | |
*** mgoddard has joined #openstack-nova | 05:33 | |
*** ratailor has joined #openstack-nova | 05:49 | |
*** guozijn has joined #openstack-nova | 05:50 | |
*** Luzi has joined #openstack-nova | 05:50 | |
openstackgerrit | melanie witt proposed openstack/nova-specs master: Propose showing server status UNKNOWN when host status UNKNOWN https://review.opendev.org/666181 | 05:55 |
*** guozijn has quit IRC | 06:03 | |
*** tkajinam has quit IRC | 06:03 | |
*** yikun has joined #openstack-nova | 06:08 | |
*** tetsuro has quit IRC | 06:10 | |
*** slaweq has joined #openstack-nova | 06:11 | |
*** spsurya has joined #openstack-nova | 06:15 | |
*** ricolin has quit IRC | 06:18 | |
*** ricolin has joined #openstack-nova | 06:19 | |
*** dpawlik has joined #openstack-nova | 06:22 | |
*** rajinir has quit IRC | 06:22 | |
*** threestrands has joined #openstack-nova | 06:27 | |
amotoki | gmann: hi | 06:30 |
gmann | amotoki: hi | 06:30 |
amotoki | gmann: there is a discussion on exceptions from novaclient in a horizon review https://review.opendev.org/#/c/661526/ | 06:30 |
amotoki | gmann: we are struggling on how to catch a specific exception from nova API and/or novaclient. | 06:30 |
amotoki | gmann: some suggestion would be appreciated. | 06:31 |
*** hamdyk has joined #openstack-nova | 06:31 | |
amotoki | gmann: in case of neutron, neutron returns an exception type in a response body and neutronclient decodes it, so consumers of neutronclient python binding can catch a specific error. | 06:32 |
gmann | checking | 06:34 |
amotoki | gmann: thanks. no need to rush :) | 06:34 |
*** belmoreira has joined #openstack-nova | 06:40 | |
gmann | amotoki: BadRequest is not a generic error. I mean the nova API converts the various related exceptions to an HTTP exception and then raises it | 06:49 |
gmann | like - https://opendev.org/openstack/nova/src/branch/master/nova/api/openstack/compute/volumes.py#L354 | 06:49 |
gmann | HTTPBadRequest can occur due to multiple reason and their details are in error message | 06:50 |
*** guozijn has joined #openstack-nova | 06:50 | |
*** rcernin has quit IRC | 06:51 | |
gmann | amotoki: is it not enough for Horizon to show the error message, which includes the error details, for example: "invalid volume id is request"? or do you want to prepare some additional helpful error msg? | 06:52 |
gmann | novaclient also decode them into ClientException - https://github.com/openstack/python-novaclient/blob/003ac57d9af74aa4658a7bf6cc6b6b3bafa58c11/novaclient/exceptions.py#L249 | 06:54 |
amotoki | gmann: thanks for checking. The horizon patch I mentioned above is to try to show a message from nova API as-is instead of showing a generic message "Unable to attach volume". | 06:56 |
openstackgerrit | Takashi NATSUME proposed openstack/python-novaclient master: Add irrelevant files in dsvm job https://review.opendev.org/666217 | 06:57 |
amotoki | gmann: it is generally a thing avoided in horizon (as messages cannot be translated and sometimes they are not friendly to GUI users) but we don't have a good idea for this case. | 06:57 |
gmann | amotoki: i see your point. | 06:58 |
amotoki | gmann: my initial question is whether we can assume BadRequest only for that case. | 06:58 |
amotoki | gmann: for example, I wonder BadRequest can be raised for input validation or something others. | 06:58 |
gmann | amotoki: it is hard to say, there can be various other error can be raised by Nova which is API specific and many times we do add/improve the exception | 06:59 |
gmann | amotoki: how about putting else only in case of 500 error code. ex.http_status == 500 | 07:00 |
gmann | and for all the rest, horizon can use the ex.msg as it is, which is much more reliable because the nova API prepares that error msg explicitly. | 07:01 |
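The suggestion above (generic text only for a 500, the API message as-is otherwise) could look roughly like the sketch below. `ClientError` is a hypothetical stand-in for the client exception, and the attribute names `http_status` and `message` are assumptions taken from the wording in the chat, not a confirmed novaclient API.

```python
class ClientError(Exception):
    """Hypothetical stand-in for a client exception carrying an HTTP status."""

    def __init__(self, http_status, message):
        super().__init__(message)
        self.http_status = http_status
        self.message = message


# Generic, translatable text shown when the API message is not usable.
GENERIC_MESSAGE = "Unable to attach volume."


def user_facing_message(exc):
    """Pick the text to show in the UI for a failed API call."""
    if exc.http_status == 500:
        # Server-side failure: fall back to the generic message.
        return GENERIC_MESSAGE
    # Any other status: the API prepared this message explicitly, use it as-is.
    return exc.message
```

The trade-off raised in the discussion still applies: a message passed through this way is not translated and may contain UUIDs.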
amotoki | gmann: in a best case, we can catch a specific exception and show an appropriate message by horizon..... | 07:01 |
amotoki | gmann: for example, horizon tries to hide UUID but most API messages include UUID :-( | 07:01 |
gmann | 2nd option is: include all possible exception per APIs which you can get info from API ref | 07:02 |
*** rpittau|afk is now known as rpittau | 07:02 | |
gmann | amotoki: yeah so you can do that. but in that case you need to decode the error message. because error message include the details of "why it is BadRequest" | 07:02 |
gmann | amotoki: does neutron client hide such info ? or horizon does ? | 07:04 |
amotoki | gmann: yeah, that's the dilemma.... if English is the only language it would be much much simpler | 07:04 |
gmann | yeah. it depends on locale | 07:04 |
*** igordc has joined #openstack-nova | 07:05 | |
amotoki | gmann: this is neutronclient code https://opendev.org/openstack/python-neutronclient/src/branch/master/neutronclient/v2_0/client.py#L65-L73 | 07:05 |
*** tkajinam has joined #openstack-nova | 07:05 | |
*** luksky has joined #openstack-nova | 07:06 | |
amotoki | gmann: 'type' contains an exception name in neutron server and if a specific exception is defined in https://opendev.org/openstack/python-neutronclient/src/branch/master/neutronclient/common/exceptions.py#L136 neutronclient raises a corresponding exception to callers. | 07:06 |
gmann | amotoki: so it does include the complete 'error_message' send from neutron API correct ? | 07:06 |
*** ivve has joined #openstack-nova | 07:07 | |
gmann | amotoki: novaclient does the same - https://opendev.org/openstack/python-novaclient/src/branch/master/novaclient/exceptions.py#L249-L276 | 07:07 |
amotoki | gmann: yes, the general format of an exception message is like: {"type": "fooException", "message": "...."} | 07:08 |
amotoki | gmann: we add new entries to neutronclient/common/exceptions.py per request from consumers. | 07:08 |
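A minimal sketch of the pattern described above, assuming an error body shaped like {"type": "fooException", "message": "...."}. The class names and the `KNOWN_ERRORS` registry are made up for illustration; this is not the actual neutronclient code.

```python
import json


class ClientError(Exception):
    """Generic fallback when the error type is not known to the client."""


class MultiAttachNotSupported(ClientError):
    """Hypothetical specific error; the name is invented for this sketch."""


# Server-side exception names the client knows how to raise.
# New entries would be added here on request from consumers.
KNOWN_ERRORS = {
    "MultiAttachNotSupported": MultiAttachNotSupported,
}


def exception_from_body(body):
    """Build an exception instance from a JSON error body."""
    error = json.loads(body)
    # Unknown or missing types fall back to the generic class.
    exc_class = KNOWN_ERRORS.get(error.get("type"), ClientError)
    return exc_class(error.get("message", ""))
```

A caller can then catch `MultiAttachNotSupported` specifically and show its own translatable text, while everything else is still caught as `ClientError`.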
amotoki | but in this case it looks better to concatenate (translatable) "Unable to attach volume" with a message from nova API. | 07:09 |
gmann | amotoki: yeah, nova has only high level exception only not so rich like neutronclient | 07:09 |
amotoki | what I think is to catch exceptions related to multi-attach and send a specific error message. | 07:10 |
*** igordc has quit IRC | 07:10 | |
amotoki | so I wonder if there is a way to distinguish multi-attach related exceptions from other BadRequest. | 07:10 |
gmann | amotoki: hummm. that is good idea. I am just wondering how many such Client exception we need to add in nova case | 07:10 |
amotoki | but it looks not easy now. | 07:10 |
gmann | attach volume has only 5-6 exception which is not so hard | 07:11 |
*** luksky has quit IRC | 07:11 | |
gmann | but see the server boot exceptions - https://opendev.org/openstack/nova/src/branch/master/nova/api/openstack/compute/servers.py#L695-L763 | 07:12 |
amotoki | yeah I know it | 07:12 |
gmann | is it too bad to show UUID in horizon? i mean horizon will be showing it to owner or admin only | 07:13 |
gmann | or you find nova exception error message include more details which can cause security issue | 07:13 |
amotoki | in the current impl of horizon, if UUID is shown to users, they sometimes need to fallback into CLI. | 07:15 |
*** ralonsoh has joined #openstack-nova | 07:16 | |
amotoki | I don't think it can be a security issue. it is just a usability topic. | 07:16 |
gmann | yeah | 07:16 |
amotoki | UUID is shown only in the detail page in most cases. | 07:16 |
*** tesseract has joined #openstack-nova | 07:20 | |
*** hamdykhader has joined #openstack-nova | 07:21 | |
*** hamdyk has quit IRC | 07:21 | |
gmann | amotoki: i replied to either include all possible exception or check case of 500. | 07:22 |
amotoki | gmann: thanks. | 07:22 |
amotoki | gmann: http://specs.openstack.org/openstack/api-sig/guidelines/errors.html might be a candidate (though I don't think it is implemented) | 07:22 |
*** luksky has joined #openstack-nova | 07:23 | |
amotoki | gmann: neutron API does similar thing in a different way https://opendev.org/openstack/neutron/src/branch/master/neutron/api/api_common.py#L512-L524 | 07:23 |
amotoki | gmann: anyway thanks for the discussion. really appreciated. | 07:23 |
*** ccamacho has quit IRC | 07:27 | |
*** ccamacho has joined #openstack-nova | 07:27 | |
gmann | amotoki: yeah standard error format is the missing part. NovaException does not provide such a good way like neutron does. we have only 'code' and 'message'. | 07:28 |
gmann | amotoki: let me see sometime ( when i will be free) if we can have such wrapper method in NovaException to fetch the data in more standard way. | 07:31 |
gmann | thanks for reporting it. | 07:31 |
*** helenafm has joined #openstack-nova | 07:33 | |
*** tetsuro has joined #openstack-nova | 07:33 | |
*** tssurya has joined #openstack-nova | 07:38 | |
*** belmoreira has quit IRC | 07:38 | |
*** tetsuro has quit IRC | 07:38 | |
*** ttsiouts has joined #openstack-nova | 07:42 | |
*** belmoreira has joined #openstack-nova | 07:44 | |
*** dtantsur|afk is now known as dtantsur | 07:56 | |
*** trident has quit IRC | 07:57 | |
*** threestrands has quit IRC | 07:59 | |
*** bhagyashris has quit IRC | 07:59 | |
*** takashin has left #openstack-nova | 08:00 | |
*** trident has joined #openstack-nova | 08:01 | |
openstackgerrit | Josephine Seifert proposed openstack/nova-specs master: Spec for the Nova part of Image Encryption https://review.opendev.org/608696 | 08:04 |
*** ttsiouts has quit IRC | 08:06 | |
*** tkajinam has quit IRC | 08:06 | |
*** ttsiouts has joined #openstack-nova | 08:07 | |
*** brinzh has joined #openstack-nova | 08:10 | |
*** ttsiouts has quit IRC | 08:11 | |
*** brinzhang has quit IRC | 08:12 | |
openstackgerrit | Josephine Seifert proposed openstack/nova-specs master: Spec for the Nova part of Image Encryption https://review.opendev.org/608696 | 08:15 |
*** ociuhandu has joined #openstack-nova | 08:15 | |
*** ttsiouts has joined #openstack-nova | 08:16 | |
openstackgerrit | jiasirui proposed openstack/nova-specs master: fix the spelling mistakes https://review.opendev.org/666244 | 08:24 |
*** pcaruana has quit IRC | 08:27 | |
*** xek has joined #openstack-nova | 08:27 | |
shilpasd | <efried> i am facing issue in n-cpu start, due to nova/db/sqlalchemy/migrate_repo/versions/397_migrations_cross_cell_move.py | 08:28 |
shilpasd | efried: this got committed at https://review.opendev.org/#/c/614012/ | 08:28 |
shilpasd | error is:Error starting thread.: RemoteError: Remote error: DBError (pymysql.err.InternalError) (1054, u"Unknown column 'migrations.cross_cell_move' in 'field list'") | 08:28 |
shilpasd | i have run db manage command, but no success | 08:28 |
shilpasd | any solution? | 08:28 |
shilpasd | efried: mriedem: here is the detailed error log, http://paste.openstack.org/show/753170/ | 08:30 |
*** dikonoor has joined #openstack-nova | 08:34 | |
*** imacdonn has quit IRC | 08:41 | |
*** imacdonn has joined #openstack-nova | 08:41 | |
openstackgerrit | Martin Midolesov proposed openstack/nova master: Implementing graceful shutdown. https://review.opendev.org/666245 | 08:44 |
*** pcaruana has joined #openstack-nova | 08:45 | |
*** arxcruz is now known as arxcruz|brb | 08:47 | |
*** markvoelker has joined #openstack-nova | 08:53 | |
*** bhagyashris__ has joined #openstack-nova | 08:55 | |
*** tetsuro has joined #openstack-nova | 08:55 | |
*** markvoelker has quit IRC | 08:58 | |
*** tetsuro has quit IRC | 09:11 | |
*** kaisers1 has quit IRC | 09:13 | |
*** cdent has joined #openstack-nova | 09:15 | |
*** kaisers has joined #openstack-nova | 09:16 | |
*** damien_r has joined #openstack-nova | 09:17 | |
*** brinzh has quit IRC | 09:18 | |
*** brinzhang has joined #openstack-nova | 09:19 | |
*** awalende has joined #openstack-nova | 09:19 | |
*** brinzhang has quit IRC | 09:19 | |
*** brinzhang has joined #openstack-nova | 09:19 | |
*** brinzhang has quit IRC | 09:20 | |
*** brinzhang has joined #openstack-nova | 09:20 | |
*** damien_r has quit IRC | 09:22 | |
*** luksky has quit IRC | 09:24 | |
*** davidsha has joined #openstack-nova | 09:24 | |
*** belmoreira has quit IRC | 09:25 | |
*** guozijn has quit IRC | 09:32 | |
*** igordc has joined #openstack-nova | 09:37 | |
openstackgerrit | Merged openstack/nova master: Fix wrong assert methods https://review.opendev.org/665897 | 09:38 |
*** derekh has joined #openstack-nova | 09:39 | |
*** janki has joined #openstack-nova | 09:43 | |
*** dikonoor has quit IRC | 09:47 | |
*** lpetrut has joined #openstack-nova | 09:49 | |
*** mdbooth has joined #openstack-nova | 09:49 | |
openstackgerrit | zhaixiaojun proposed openstack/python-novaclient master: Modify the url of upper_constraints_file https://review.opendev.org/665934 | 09:50 |
*** markvoelker has joined #openstack-nova | 09:53 | |
*** brinzhang has quit IRC | 09:55 | |
*** punith has joined #openstack-nova | 09:57 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: xvp: Start using consoleauth tokens https://review.opendev.org/652967 | 09:57 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: xvp: Remove use of '_LI' marker https://review.opendev.org/665425 | 09:57 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: nova-status: Remove consoleauth workaround check https://review.opendev.org/652968 | 09:57 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove nova-consoleauth https://review.opendev.org/652969 | 09:57 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: objects: Remove ConsoleAuthToken.to_dict https://review.opendev.org/652970 | 09:57 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: docs: Rework nova console diagram https://review.opendev.org/660147 | 09:57 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: docs: Integrate 'sphinx.ext.imgconverter' https://review.opendev.org/665693 | 09:57 |
openstackgerrit | Surya Seetharaman proposed openstack/nova master: Grab fresh info from the driver during nova start/stop actions https://review.opendev.org/665975 | 09:58 |
*** markvoelker has quit IRC | 09:58 | |
*** bhagyashris__ has quit IRC | 09:59 | |
*** martinkennelly has joined #openstack-nova | 10:04 | |
*** luksky has joined #openstack-nova | 10:07 | |
*** lpetrut has quit IRC | 10:13 | |
*** rcernin has joined #openstack-nova | 10:18 | |
*** yedongcan has left #openstack-nova | 10:20 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: tests: Use consistent URL regex substitution https://review.opendev.org/665949 | 10:21 |
*** dave-mccowan has joined #openstack-nova | 10:31 | |
*** _alastor_ has joined #openstack-nova | 10:35 | |
*** mdbooth has quit IRC | 10:36 | |
*** mdbooth has joined #openstack-nova | 10:36 | |
*** _alastor_ has quit IRC | 10:40 | |
*** priteau has joined #openstack-nova | 10:46 | |
*** ttsiouts has quit IRC | 10:47 | |
*** ttsiouts has joined #openstack-nova | 10:48 | |
*** lpetrut has joined #openstack-nova | 10:51 | |
*** ttsiouts has quit IRC | 10:52 | |
*** markvoelker has joined #openstack-nova | 10:54 | |
*** awalende has quit IRC | 10:58 | |
*** markvoelker has quit IRC | 10:59 | |
ohwhyosa | hey there! I uploaded an iso ubuntu-server-18.04 (not live) and when connecting via console to the instance, it does detect it (it lets me get to the keymap selection part) but then it complains no cd-rom image is mounted, any idea why might that be? | 11:18 |
sean-k-mooney | not really. while you can use an iso, it's a little complicated in that you generally also have to attach a data volume to install to | 11:21 |
sean-k-mooney | when you select the iso as a disk image it gets attached as the root disk. im not sure if it is attached as a cdrom or not | 11:22 |
ohwhyosa | thanks sean-k-mooney! the recommendation would be using other image formats if available then, right? | 11:23 |
sean-k-mooney | well normally you would locally install it in a vm and then upload that to the cloud or use a prebuild image | 11:23 |
sean-k-mooney | you can enable a boot menu too. you can convert the iso into a data volume in cinder and then boot an instance with a blank volume and install too, but in general you are better off preparing the image first rather than trying to do it in openstack | 11:25 |
ohwhyosa | nova.console.websocketproxy keeps on saying broken pipe, by the way, tried the openstack-ansible multinode and the aio (which should set up all the networking correctly by itself) | 11:25 |
ohwhyosa | It is only the console service that broken pipes, just saying in case there might be an issue behind it | 11:26 |
ohwhyosa | I used both latest and stein | 11:26 |
ohwhyosa | stable/stein | 11:26 |
sean-k-mooney | im not sure. that sounds more like a spice server or websockify issue rather than nova | 11:26 |
sean-k-mooney | ohwhyosa: if you have not seen https://github.com/openstack/diskimage-builder you should check it out | 11:27 |
sean-k-mooney | there is also virt-builder, but tools like those are how people usually prepare images for openstack | 11:28 |
*** punith has quit IRC | 11:29 | |
ohwhyosa | sean-k-mooney, thanks a ton! (and it happened both with noVNC and spice, but it is indeed happening with websockify) | 11:29 |
*** ttsiouts has joined #openstack-nova | 11:32 | |
*** udesale has quit IRC | 11:53 | |
*** arxcruz|brb is now known as arxcruz | 11:53 | |
*** udesale has joined #openstack-nova | 11:53 | |
*** markvoelker has joined #openstack-nova | 11:55 | |
*** markvoelker has quit IRC | 12:00 | |
*** awalende has joined #openstack-nova | 12:07 | |
*** awalende has quit IRC | 12:08 | |
*** awalende has joined #openstack-nova | 12:08 | |
*** dikonoor has joined #openstack-nova | 12:08 | |
*** mgariepy has joined #openstack-nova | 12:15 | |
*** rcernin has quit IRC | 12:15 | |
*** udesale has quit IRC | 12:20 | |
*** udesale has joined #openstack-nova | 12:21 | |
*** dikonoor has quit IRC | 12:21 | |
*** udesale has quit IRC | 12:25 | |
*** udesale has joined #openstack-nova | 12:25 | |
*** priteau has quit IRC | 12:27 | |
*** liuyulong has joined #openstack-nova | 12:28 | |
*** awalende has quit IRC | 12:29 | |
*** awalende has joined #openstack-nova | 12:30 | |
*** awalende has quit IRC | 12:34 | |
*** eharney has quit IRC | 12:40 | |
*** alex_xu has joined #openstack-nova | 12:45 | |
*** markvoelker has joined #openstack-nova | 12:56 | |
*** ttsiouts has quit IRC | 13:00 | |
*** markvoelker has quit IRC | 13:00 | |
*** ttsiouts has joined #openstack-nova | 13:01 | |
efried | shilpasd: You've hit this: https://review.opendev.org/#/c/614012/ | 13:01 |
efried | shilpasd: sorry, you already said that. | 13:01 |
efried | are your conductor & computes running at the same level? | 13:02 |
*** ratailor has quit IRC | 13:02 | |
*** awalende has joined #openstack-nova | 13:03 | |
*** ttsiouts has quit IRC | 13:05 | |
shilpasd | efried: yes | 13:10 |
efried | shilpasd: In any case this sounds like a bug (it should behave better than that even if versions are mismatched) but something mriedem will have to look at once he gets here. | 13:10 |
*** mriedem has joined #openstack-nova | 13:11 | |
efried | there he is :) | 13:12 |
shilpasd | efried: ok | 13:12 |
shilpasd | yes | 13:12 |
shilpasd | mriedem: i am getting http://paste.openstack.org/show/753170/ while service restart on master, can you help here to solve | 13:14 |
alex_xu | mriedem: we replied your comment on vpmem. also explored the bdm, let us know your thought | 13:14 |
alex_xu | mriedem: sean-k-mooney also in case you don't know, we already submit the vpmem code to gerrit https://review.opendev.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/virtual-persistent-memory, hope that is helpful on review spec | 13:15 |
mriedem | shilpasd: looks like you upgraded code without syncing the database | 13:15 |
shilpasd | mriedem: nova-manage db sync has been run, still the issue | 13:16 |
shilpasd | any other command to be run? | 13:16 |
*** ttsiouts has joined #openstack-nova | 13:18 | |
sean-k-mooney | alex_xu: sorry i have kept deferring re-reviewing https://review.opendev.org/#/c/601596/ ill grab a cup of coffee and go review before my next meeting | 13:18 |
alex_xu | sean-k-mooney: no problem, thanks | 13:19 |
sean-k-mooney | and cool im not sure i will have time to review the code today but ill skim over it | 13:19 |
alex_xu | \o/ | 13:19 |
*** lbragstad has joined #openstack-nova | 13:20 | |
*** cdent has quit IRC | 13:21 | |
*** bbowen has joined #openstack-nova | 13:24 | |
mriedem | shilpasd: if you check the table schema in that database does it have the cross_cell_move column? | 13:24 |
*** bbowen_ has quit IRC | 13:24 | |
shilpasd | mriedem: there is no cross_cell_move column | 13:28 |
mriedem | then you didn't run the db sync properly | 13:28 |
mriedem | are you sure you're syncing your nova (cell1) database? | 13:28 |
shilpasd | mriedem: i am having multi node setup, and i have migrations record in it, is it causing an issue? | 13:29 |
shilpasd | should i clean records and run db sync? | 13:29 |
mriedem | existing migration records should not be a problem | 13:29 |
shilpasd | ok | 13:30 |
mriedem | shilpasd: you know that https://review.opendev.org/#/c/614012/ just merged yesterday, so you must be doing CD somewhere? | 13:30 |
shilpasd | mriedem: yes i know that, after pulling master today started getting this issue | 13:31 |
tssurya | shilpasd: did you run it on both cell0 and cell1 ? | 13:36 |
shilpasd | tssurya: after syncing DB, checked the table schema in that database for both cell0 and cell1, but it does not have the cross_cell_move column | 13:39 |
*** rajinir has joined #openstack-nova | 13:40 | |
sean-k-mooney | alex_xu: erric would we be able to take a look at the CAT spec https://review.opendev.org/#/c/662264/3 johnthetubaguy i dont know if cache allocation and memory bandwidth limits are something you're interested in too for hpc but input is welcome if you are interested. | 13:40 |
sean-k-mooney | efried: ^ | 13:40 |
efried | sean-k-mooney: ack | 13:41 |
sean-k-mooney | ill be working with lyarwood on the implementation he will likely take lead but im hoping to get a yes or no on the general direction of the spec in the next week or two. | 13:43 |
tssurya | shilpasd: well the nova-manage db sync command by default takes the db_connection from the nova.conf config file. | 13:43 |
tssurya | hopefully you are running it on level1 and not level0 | 13:44 |
tssurya | for cell1 syncing | 13:44 |
stephenfin | gibi: Could you take a look at this long standing bugfix? https://review.opendev.org/#/c/609460/ | 13:44 |
mriedem | shilpasd: confirm that the config file you're using when running nova-manage db sync has the [database]/connection pointed at the correct cell db | 13:44 |
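One way to double-check which database a given config file points at is to read `[database]/connection` out of it directly. The sketch below uses Python's `configparser` as an approximation of how nova.conf is parsed (nova itself uses oslo.config); the sample host, credentials, and database name are hypothetical.

```python
import configparser


def database_connection(conf_text):
    """Return the [database]/connection value from nova.conf-style text."""
    # interpolation=None keeps '%' in passwords from being interpreted;
    # strict=False tolerates repeated sections or options in the file.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.get("database", "connection", fallback=None)


# Hypothetical config contents, for illustration only.
sample = """
[DEFAULT]
debug = false

[database]
connection = mysql+pymysql://nova:secret@db.example/nova_cell1
"""

print(database_connection(sample))
```

If the printed URL names a different database than the one the compute services use, the sync ran against the wrong cell, which would explain a missing `cross_cell_move` column.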
*** BjoernT has joined #openstack-nova | 13:44 | |
stephenfin | (I don't know anyone else that knows that piece of the code in detail) | 13:44 |
gibi | stephenfin: I've put it in my queue | 13:45 |
stephenfin | Thanks :) | 13:45 |
tssurya | shilpasd: yea check the database connection in the config you are running it against, I am guessing its not pointing at the right db | 13:45 |
shilpasd | tssurya: i don't think so, since nova operations were working before pulling master | 13:47 |
tssurya | shilpasd: what I mean is that nova-manage db sync is not being run against the cell config file | 13:48 |
tssurya | perhaps | 13:48 |
tssurya | which is why its not adding the column to the cell level dbs | 13:48 |
shilpasd | i have checked all that and in conf cell1 is configured as DB con | 13:49 |
*** eharney has joined #openstack-nova | 13:50 | |
shilpasd | and checked cell1 for cross_cell_move column but its not there | 13:50 |
shilpasd | finally i am doing stack with reclone YES, lets see further, will keep you posted | 13:50 |
*** mlavalle has joined #openstack-nova | 13:53 | |
mloza | hello, I have 13 computes and 12 of them are working fine. One of the problematic compute node's nova-compute service goes up and down in `openstack compute service list`. From the logs, I got tons of "Unexpected error during heartbeart thread processing, retrying...: ConnectionForced: Too many heartbeats missed" and RMQ broken pipes. Redeploying the compute didn't fix the issue neither rebooting | 13:53 |
mloza | completely the controller nodes. | 13:53 |
*** mchlumsky has joined #openstack-nova | 13:54 | |
ohwhyosa | mloza, have you checked placement logs? | 13:55 |
tssurya | mloza: looks like a message queue issue due to which the compute is not ablt to communicate to the controller, its not a nova-compute service issue I guess | 13:55 |
tssurya | able* | 13:55 |
*** mdbooth has quit IRC | 13:56 | |
ohwhyosa | I had a similar problem and it turns out the host-name was duplicated (because of a previous installation) | 13:56 |
*** mdbooth has joined #openstack-nova | 13:56 | |
*** janki has quit IRC | 13:57 | |
*** markvoelker has joined #openstack-nova | 13:57 | |
ohwhyosa | try OS_TOKEN=$(openstack token issue -c id -f value) | 13:57 |
ohwhyosa | and then curl -s -H "X-Auth-Token: ${OS_TOKEN}" http://10.0.0.11:8780/resource_providers?name=${hostname} | 13:58 |
ohwhyosa | assuming that 10.0.0.11:8780 is your placement endpoint | 13:58 |
ohwhyosa | which you could check with openstack endpoint list I believe | 13:59 |
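The same placement query can be put together in Python. This sketch only builds the request and does not send it; the endpoint comes from the curl example above, while the token value and the hostname `compute-13` are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request


def resource_provider_request(endpoint, token, hostname):
    """Build (but do not send) a GET for resource providers named `hostname`."""
    query = urlencode({"name": hostname})
    url = "%s/resource_providers?%s" % (endpoint.rstrip("/"), query)
    # The token would come from `openstack token issue -c id -f value`.
    return Request(url, headers={"X-Auth-Token": token})


# Placeholder token and hostname, for illustration only.
req = resource_provider_request("http://10.0.0.11:8780", "TOKEN", "compute-13")
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would perform the call; more than one provider returned for the same name would point at the duplicated-hostname problem described above.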
mloza | tssurya: When I redeployed the problematic compute, I change the hostname and IP address but still the issue persist. nova-compute keeps flapping | 13:59 |
*** rouk has joined #openstack-nova | 13:59 | |
*** liuyulong has quit IRC | 14:01 | |
*** markvoelker has quit IRC | 14:01 | |
*** BjoernT_ has joined #openstack-nova | 14:01 | |
*** BjoernT has quit IRC | 14:03 | |
*** tssurya has quit IRC | 14:06 | |
*** brinzhang has joined #openstack-nova | 14:09 | |
shilpasd | tssurya: mriedem: restacking solved the issue | 14:09 |
mloza | ohwhyosa: Just full of RMQ broken pipes and heartbeat missed. All logs are thrown in elasticsearch | 14:09 |
*** cdent has joined #openstack-nova | 14:11 | |
*** dpawlik has quit IRC | 14:11 | |
ohwhyosa | Did you check the api, mloza ? | 14:13 |
*** lpetrut has quit IRC | 14:14 | |
mloza | ohwhyosa: No such error in nova-api | 14:17 |
ohwhyosa | nope, the placement api, did you check the curl command result? | 14:19 |
mloza | ohwhyosa: let me check | 14:20 |
jroll | random question: has anyone discussed doing passthrough of a host TPM device to the guest? I see this BP from 2014 https://blueprints.launchpad.net/nova/+spec/add-libvirt-tpm . I guess what I'm wondering is if that's something that would be accepted, and if so, if anyone has ideas on the best route to implement (as it isn't just standard pci passthrough). | 14:20 |
jroll | context: we have users that have a requirement for a hardware tpm (still don't understand why vtpm isn't okay, but it is what it is) | 14:21 |
* jroll is happy to just mail the list, but thought I'd ask here first | 14:21 | |
*** ivve has quit IRC | 14:23 | |
*** JamesBenson has joined #openstack-nova | 14:26 | |
gibi | stephenfin: I'm +2 on https://review.opendev.org/#/c/609460 | 14:27 |
stephenfin | gibi++ | 14:28 |
stephenfin | Now to find another unwilling victim^H^H^H helpful soul :) | 14:28 |
mloza | ohwhyosa: https://pastebin.com/raw/bA3LScMr | 14:28 |
mriedem | jroll: there is an approved spec for tpm | 14:29 |
mriedem | https://specs.openstack.org/openstack/nova-specs/specs/train/approved/add-emulated-virtual-tpm.html | 14:29 |
mriedem | the code is lagging and was deferred from stein | 14:29 |
mriedem | windriver owns it | 14:29 |
mriedem | jroll: oh this is different, | 14:30 |
jroll | mriedem: passthrough, not virt... yeah | 14:30 |
mriedem | you want passthrough | 14:30 |
mriedem | i'd ask cfriesen about hw passthrough tpm if/when he's around | 14:31 |
mloza | ohwhyosa: I did redeploy twice. The first time, we deleted the old compute in `openstack resource provider ..` and the second time, we changed the hostname and IP address, but the RMQ issues are still there. | 14:31 |
*** JamesBen_ has joined #openstack-nova | 14:33 | |
mloza | ohwhyosa: The RMQ issue is only on a specific node and other computes are fine. | 14:33 |
ohwhyosa | hmmm so the old provider doesn't appear on the query, right? | 14:34 |
ohwhyosa | did you search by the current hostname or the old one? | 14:34 |
*** Luzi has quit IRC | 14:36 | |
*** JamesBenson has quit IRC | 14:36 | |
*** hamdykhader has quit IRC | 14:37 | |
kashyap | jroll: Heya, just noticed your random question, yeah, was curious why vTPM (which is implemented in libvirt/QEMU) is not okay. Curious to hear when you know | 14:38 |
*** awalende has quit IRC | 14:38 | |
jroll | kashyap: probably FUD. idk. I'll let you know if I figure it out | 14:38 |
*** awalende has joined #openstack-nova | 14:38 | |
mriedem | jroll: let me guess, they are baremetal people that don't trust anything virtual, so even using VMs is a problem for them and they want hw passthrough wherever and whenever possible b/c they don't trust virtualization. | 14:38 |
jroll | mriedem: it's almost like you know where I work | 14:39 |
mriedem | heh | 14:39 |
kashyap | mriedem: Heh, that guess is perfectly reasonable | 14:40 |
stephenfin | dansmith: Could you take a look at https://review.opendev.org/#/c/609460 too? Someone has finally stumbled upon it | 14:41 |
*** dklyle_ has quit IRC | 14:42 | |
*** awalende has quit IRC | 14:42 | |
mloza | ohwhyosa: yes, the old provider isn't in the list | 14:42 |
mloza | current one | 14:42 |
*** dklyle has joined #openstack-nova | 14:42 | |
mloza | I checked the old and it wasn't there anymore | 14:42 |
dansmith | stephenfin: I'll look when there's a CI run from this year | 14:43 |
stephenfin | that'll happen automatically when it goes through the gate | 14:43 |
dansmith | yeah, but I'm not going to spend time reviewing it if it's like completely broken, you know? | 14:44 |
*** gfhellma has joined #openstack-nova | 14:45 | |
sean-k-mooney | dansmith: ha that could take a while to happen | 14:45 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Ignore hw_vif_type for direct, direct-physical vNIC types https://review.opendev.org/609460 | 14:45 |
stephenfin | Fair, but it's not. Just saying :) | 14:46 |
sean-k-mooney | stephenfin: wait you tested something locally :P | 14:46 |
mloza | ohwhyosa: btw, this is a kolla-ansible deployment of stable/rocky branch | 14:47 |
*** brinzhang0 has joined #openstack-nova | 14:47 | |
*** brinzhang has quit IRC | 14:47 | |
*** luksky has quit IRC | 14:50 | |
*** awalende has joined #openstack-nova | 14:51 | |
*** belmoreira has joined #openstack-nova | 14:53 | |
*** brinzhang0 has quit IRC | 14:53 | |
*** awalende has quit IRC | 14:54 | |
*** liuyulong has joined #openstack-nova | 14:57 | |
*** markvoelker has joined #openstack-nova | 14:58 | |
*** lpetrut has joined #openstack-nova | 14:58 | |
*** lpetrut has quit IRC | 14:59 | |
*** lpetrut has joined #openstack-nova | 14:59 | |
belmoreira | mriedem are you available? Hope you're good. | 15:01 |
belmoreira | what's the best way to fix wrong allocations? (requires_allocation_refresh = False) | 15:01 |
*** awalende has joined #openstack-nova | 15:02 | |
mriedem | belmoreira: coincidentally bauzas and i are talking about busted allocations in -placement | 15:02 |
*** markvoelker has quit IRC | 15:02 | |
*** cfriesen has joined #openstack-nova | 15:02 | |
mriedem | but should probably be talking about that here since it's a nova problem | 15:02 |
bauzas | actually yeah | 15:03 |
mriedem | i'm not sure what requires_allocation_refresh is | 15:03 |
mriedem | resource_provider_association_refresh ? | 15:03 |
mriedem | resource_provider_association_refresh doesn't touch allocations anyway | 15:04 |
belmoreira | this is only enabled for ironic (the default is false). I can confirm but I'm assuming it always updates the allocations | 15:05 |
mriedem | belmoreira: well one part is fixing the many bugs that cause incorrect allocations, e.g. https://review.opendev.org/#/c/654067/ | 15:05 |
mriedem | belmoreira: oh, requires_allocation_refresh in the ironic driver, that was removed awhile ago | 15:05 |
belmoreira | mriedem I agree. But after hitting the bugs operators need something to fix the busted allocations | 15:05 |
mriedem | yeah i know | 15:06 |
*** sridharg has quit IRC | 15:06 | |
mriedem | i think https://bugs.launchpad.net/nova/+bug/1793569 is where we've been collecting the various notes on what some operators are doing for scripts | 15:06 |
openstack | Launchpad bug 1793569 in OpenStack Compute (nova) "Add placement audit commands" [Wishlist,Confirmed] | 15:06 |
mriedem | belmoreira: one option is if you know that some instance has the wrong allocations, you can delete the allocations in placement and then run the nova-manage placement heal_allocations command and it should fix the allocations | 15:07 |
mriedem | but that won't do anything about allocations held by migration records from some failed migration | 15:07 |
ohwhyosa | mloza, I recommend you checkout also with the people at #openstack-kolla | 15:08 |
ohwhyosa | I can't help you more, I'm kinda new myself (without the kinda) but since I had a similar problem recently and was helped by sean-k-mooney , I thought I'd pay the favor forward in case it would help | 15:08 |
*** _alastor_ has joined #openstack-nova | 15:13 | |
bauzas | mriedem: you need to stop your APIs before cleaning up allocations, right? | 15:14 |
mriedem | umm | 15:15 |
belmoreira | mriedem in that case the operator needs to identify somehow the wrong allocations. I was thinking in having a "nova-manage" for that and fix or a periodic task | 15:15 |
mriedem | i guess it depends on what you mean by cleaning up | 15:15 |
mriedem | belmoreira: see comment 5 that i just posted https://bugs.launchpad.net/nova/+bug/1793569 | 15:15 |
openstack | Launchpad bug 1793569 in OpenStack Compute (nova) "Add placement audit commands" [Wishlist,Confirmed] | 15:15 |
bauzas | mriedem: what you just said 'delete allocations and run nova-manage' | 15:15 |
mriedem | bauzas: heal_allocations won't do anything to an instance whose task_state is not None, | 15:16 |
belmoreira | mriedem we had that periodic task in the past, if I remember correctly | 15:16 |
bauzas | if you delete allocations, you could race if you leave users to be able to create new instances ;) | 15:16 |
mriedem | bauzas: creating new servers and their allocations isn't the problem you're trying to solve | 15:16 |
bauzas | I'm misunderstanding what you propose then | 15:16 |
mriedem | belmoreira: once everything is upgraded to pike, the resource tracker would stop updating allocations for non-ironic nodes | 15:16 |
mriedem | bauzas: if you're trying to heal allocations for a single instance, you could lock it, delete its allocations in placement, and then run heal_allocations and then unlock it | 15:17 |
bauzas | hah ok, that I understand | 15:18 |
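Editor's sketch of the lock / delete / heal / unlock sequence mriedem describes above. This is only an illustration, not a tested procedure: the helper name `single_instance_heal_plan` is hypothetical, the `openstack resource provider allocation delete` step assumes the osc-placement plugin is installed, and `heal_allocations` is shown without options, so it would process every instance that is missing allocations rather than just this one. The function only builds the command lists and runs nothing.

```python
def single_instance_heal_plan(server_uuid):
    """Return the CLI steps, in order, for re-healing one server's allocations.

    Nothing is executed here; the commands are only assembled so an operator
    can review them before running anything by hand.
    """
    return [
        # Lock the server so users cannot start new operations on it.
        ["openstack", "server", "lock", server_uuid],
        # Remove the suspect allocations (assumes the osc-placement plugin).
        ["openstack", "resource", "provider", "allocation", "delete",
         server_uuid],
        # Recreate allocations for instances that currently have none.
        ["nova-manage", "placement", "heal_allocations"],
        # Hand the server back to its owner.
        ["openstack", "server", "unlock", server_uuid],
    ]


if __name__ == "__main__":
    for step in single_instance_heal_plan("<server-uuid>"):
        print(" ".join(step))
```

As noted in the discussion, heal_allocations skips instances whose task_state is not None and does not touch allocations held by migration records.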
mriedem | but as i've said in that bug above, and in irc many a time, neither heal_allocations nor the scripts in that bug from larsks and mnaser deal with stale allocations held by migration consumers | 15:18 |
bauzas | I thought you were talking about scrubbing the whole allocations table and just doing the nova-manage placement heal stuff to recreate all the records | 15:18 |
mriedem | bauzas: i probably wouldn't do that on a production cloud with a lot of instances | 15:19 |
bauzas | of course | 15:19 |
mriedem | unless you use the --dry-run option which was also recently added and not backported | 15:19 |
mriedem | heal_allocations also doesn't deal with nested allocations (yet) | 15:19 |
belmoreira | mriedem right. Why shouldn't it continue to happen for libvirt if we continue to require it for ironic? (at least optionally) We will always have bugs and this "sync" will always be required | 15:19 |
mriedem | so our tooling for operators is definitely lagging the complicated features we're shoving in | 15:20 |
mriedem | belmoreira: first, we no longer have that flag in the ironic driver as i said - it was removed some time ago, either rocky or stein | 15:20 |
bauzas | belmoreira: for context, I have some customers that complain about the nova_api (because Queens) DB growing because of orphaned allocations | 15:20 |
mriedem | belmoreira: the reason the RT doesn't manage allocations since pike is because starting in pike the scheduler creates the allocations and the RT would trample the allocations during a migration afterward, which is its own problem | 15:21 |
mriedem | trust me, we had a bunch of bugs going into Pike RC1 and post-GA related to that | 15:21 |
mriedem | that was addressed in queens with https://specs.openstack.org/openstack/nova-specs/specs/queens/implemented/migration-allocations.html | 15:22 |
mriedem | but because of ^ and the lack of the RT auto healing things, if a migration fails and we don't clean up properly, you've got stale allocations in placement held by migration records | 15:22 |
mriedem | and heal_allocations doesn't yet deal with those | 15:22 |
belmoreira | mriedem fair enough | 15:22 |
mriedem | which leads to the scheduler thinking you have far less capacity than you probably do | 15:22 |
belmoreira | yeah, heal_allocations only creates new allocations | 15:23 |
*** mrch_ has quit IRC | 15:23 | |
bauzas | I guess a "fix my cloud" button would be awesome | 15:23 |
belmoreira | we need a nova-manage for placement consistency | 15:23 |
mriedem | or just more people (developers/cores) working on nova that care about fixing these latent issues for operators | 15:24 |
bauzas | but then https://media.giphy.com/media/xThuW45pxrB820tD0c/giphy.gif | 15:24 |
mriedem | b/c frankly i'm getting burned out on caring about finding and fixing this stuff | 15:24 |
*** Sundar has joined #openstack-nova | 15:24 | |
bauzas | mriedem: oh yeah I understand you, don't get me wrong | 15:25 |
belmoreira | mriedem don't get me wrong. I really appreciate all your work on this | 15:25 |
bauzas | I just think about the possible DB locks a "fix my cloud" button would make | 15:25 |
mriedem | bauzas: invariably we'd fuck up a "fix my cloud" thing anyway | 15:25 |
bauzas | ideally, this needs to work online | 15:26 |
bauzas | but this is the ideal | 15:26 |
mriedem | as i said in https://bugs.launchpad.net/nova/+bug/1793569/comments/5 i think one low hanging fruit is probably adding something to heal_allocations that scans for allocations held by migrations which are not in progress and reports on those | 15:27 |
openstack | Launchpad bug 1793569 in OpenStack Compute (nova) "Add placement audit commands" [Wishlist,Confirmed] | 15:27 |
bauzas | belmoreira: you'd be okay with some tool that'd fix allocations during a maintenance window ? | 15:27 |
bauzas | I mean, I guess the answer would depend on the required time window :-) | 15:27 |
bauzas | mriedem: couldn't we just report allocations that are either for not-in-progress migrations or just not related to any instance or migration UUID ? | 15:28 |
mriedem | i'm torn about how much functionality to shove into heal_allocations b/c it's going to get more complicated https://review.opendev.org/#/q/topic:bug/1819923+(status:open+OR+status:merged) | 15:28 |
mriedem | bauzas: sure we *could* but that's likely a separate command i'd think | 15:29 |
mriedem | nova-manage placement audit_allocations or something | 15:29 |
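Editor's sketch of the reporting idea bauzas and mriedem discuss above: flag allocations held by migrations that are no longer in progress, and allocations whose consumer matches no instance or migration at all. No such command existed at the time of this log; the function name, its inputs, and the report keys are all hypothetical, and a real tool would have to pull this data from placement and the nova databases.

```python
def audit_consumers(consumer_uuids, instance_uuids, migrations):
    """Split allocation consumers into suspicious groups.

    consumer_uuids: UUIDs that currently hold allocations in placement.
    instance_uuids: UUIDs of instances that still exist.
    migrations: dict mapping migration UUID -> True if still in progress.
    """
    instances = set(instance_uuids)
    report = {"stale_migration": [], "unknown": []}
    for uuid in consumer_uuids:
        if uuid in instances:
            # Normal case: the allocation is held by an existing instance.
            continue
        if uuid in migrations:
            # Held by a migration record; only suspicious once it has ended.
            if not migrations[uuid]:
                report["stale_migration"].append(uuid)
            continue
        # Neither an instance nor a known migration holds this allocation.
        report["unknown"].append(uuid)
    return report
```

The sketch only reports. Deciding whether to delete anything is left to the operator, which matches the "reports on those" wording in the discussion.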
*** lpetrut has quit IRC | 15:30 | |
cdent | which is part of how we ended up wanting consumer types | 15:30 |
cdent | for most clouds (so far) it won't matter because all allocations will come from nova | 15:30 |
mriedem | i thought about that, and we could fudge around consumer types for now by checking the resource classes involved to know it's coming from nova | 15:30 |
* cdent nods | 15:30 | |
bauzas | cdent: heh touché | 15:31 |
bauzas | you know what ? | 15:31 |
mriedem | we could also use some troubleshooting docs about "my allocations seem all f'ed up, how can i tell for sure?" but i'm not sure how easy that is to write, but it's something that comes up in here almost every week i think | 15:31 |
bauzas | I've been given that escalation so I have free time to work on it | 15:32 |
*** gyee has joined #openstack-nova | 15:32 | |
cdent | having those docs and tools coming from the people who are experiencing and/or fixing the bugs would be ideal as they have the most idea of what's going on. in a perfect universe we'd have lots of people running around to do it, but we're long past that | 15:33 |
bauzas | mriedem: specless BP, there is ? | 15:33 |
mriedem | bauzas: i don't think providing tooling for fixing our messes is a feature | 15:33 |
sean-k-mooney | its not really a bug either | 15:34 |
sean-k-mooney | the thing that is causing the mess is a bug | 15:34 |
sean-k-mooney | i guess a tool could be a partial fix for that but | 15:34 |
sean-k-mooney | are you ok with tying the tool to the bug rather than a separate thing | 15:34 |
mriedem | https://bugs.launchpad.net/nova/+bug/1793569 is a bug | 15:34 |
openstack | Launchpad bug 1793569 in OpenStack Compute (nova) "Add placement audit commands" [Wishlist,Confirmed] | 15:34 |
bauzas | cool, I'll use this bug | 15:35 |
* bauzas rolls his sleeves | 15:35 | |
sean-k-mooney | am ok that feels more like an RFE than a bug but ok | 15:35 |
sean-k-mooney | no need to invent more paperwork when we don't have to | 15:36 |
mriedem | bauzas: i wouldn't mind you posting something to the ML with what you plan on adding before starting a bunch of work on it | 15:36 |
bauzas | ack, tagging ops too | 15:36 |
belmoreira | bauzas yes please | 15:36 |
*** damien_r has joined #openstack-nova | 15:36 | |
*** awalende has quit IRC | 15:38 | |
mriedem | bauzas: you can reply to this thread i started last september http://lists.openstack.org/pipermail/openstack-discuss/2019-March/004223.html | 15:39 |
bauzas | mriedem: cool | 15:39 |
*** awalende has joined #openstack-nova | 15:39 | |
*** gfhellma has quit IRC | 15:40 | |
mriedem | i'll also say i expect functional tests for anything added b/c unit tests don't cut it with this kind of stuff that involves placement | 15:40 |
*** awalende_ has joined #openstack-nova | 15:41 | |
mloza | ohwhyosa: how did you fix your issue? | 15:42 |
*** awalende has quit IRC | 15:43 | |
bauzas | belmoreira: do you know if [ops] tag in the email subject is enough for getting ops' eye on what I write ? | 15:43 |
mriedem | yes, it should be | 15:44 |
*** awalende_ has quit IRC | 15:44 | |
belmoreira | it's the tag that we are using | 15:44 |
mriedem | it's what i used on http://lists.openstack.org/pipermail/openstack-discuss/2019-June/thread.html#7097 | 15:44 |
mriedem | which is related to all of this also - nova mis-managing placement resources | 15:45 |
artom | sean-k-mooney, remind me again, what's the situation where we can have a single instance with bind-time and plug-time events? | 15:46 |
mriedem | bauzas: mayhap while you have been escalated to spend time upstream, you'd like to review https://review.opendev.org/#/q/topic:bug/1825537+(status:open+OR+status:merged) as well | 15:46 |
artom | OVS hybrid-plugged ports for the former, what's the latter? | 15:47 |
artom | 'cuz hybrid-plug is a neutron-wide setting, no? | 15:47 |
mriedem | artom: sriov direct-physical i thought | 15:47 |
sean-k-mooney | ovs + sriov or just two different sriov ports | 15:47 |
mriedem | artom: different types of ports attached to the server | 15:47 |
artom | mriedem, that's what I have in my commit message currently (not pushed yet), but it means I got https://review.opendev.org/#/c/664431/ completely wrong | 15:48 |
sean-k-mooney | e.g. one port that is vnic_type=direct-physical + another port that is vnic_type=direct | 15:48 |
artom | I think I'll push, sean-k-mooney can shoot it down ;) | 15:48 |
mriedem | belmoreira: while you're around, here is another tool for ops for you https://review.opendev.org/#/c/655908/ | 15:48 |
sean-k-mooney | artom: if i think it's incorrect i'll leave a comment or update it for you when i review | 15:49 |
artom | sean-k-mooney, appreciated :) | 15:49 |
sean-k-mooney | but ya it basically only happens if you have two different network backends attached to the same vm, which only happens if you are using sriov really | 15:50 |
artom | mriedem, I'm off until Tuesday, so I won't be the one harassing you for reviews. That honour goes to sean-k-mooney while I'm gone (because it's still on fire internally) | 15:50 |
mriedem | artom: as in you're off today? | 15:50 |
artom | mriedem, as of tonight, flying to Russia to see the folks | 15:50 |
artom | Well, this afternoon, really | 15:50 |
mriedem | ok so is someone going to be updating https://review.opendev.org/#/c/644881/ sometime soon because i'm getting really tired of having to re-load the context from that change into my head every other week | 15:51 |
mriedem | when dansmith asks me to | 15:51 |
belmoreira | mriedem thanks. | 15:51 |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: Revert resize: wait for events according to hybrid plug https://review.opendev.org/644881 | 15:51 |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: [DNM] testing bug/1813789 revert resize events https://review.opendev.org/664442 | 15:51 |
artom | mriedem, ^^ :) | 15:51 |
artom | And as I said, sean-k-mooney's carrying the torch | 15:51 |
* mriedem uncocks | 15:51 | |
artom | Unless it somehow merges today | 15:51 |
*** helenafm has quit IRC | 15:52 | |
sean-k-mooney | mriedem: i'll prioritize any review feedback that you or dansmith have. artom i'll also review this again shortly | 15:54 |
jroll | kashyap: so the concern on vTPM is around where keys "in the TPM" are stored. can you help me find docs on that? in other words, we don't want these keys sitting on a disk. | 15:57 |
*** belmoreira has quit IRC | 15:57 | |
*** markvoelker has joined #openstack-nova | 15:58 | |
bauzas | mriedem: ack, will review those | 15:58 |
bauzas | if working on customer escalations means more upstream time for me, then I sign off for moar :) | 15:59 |
*** ttsiouts has quit IRC | 16:02 | |
openstackgerrit | Stephen Finucane proposed openstack/nova-specs master: Additional upgrade clarifications for cpu-resources https://review.opendev.org/666032 | 16:02 |
mriedem | sean-k-mooney: i don't know what's up with https://review.opendev.org/#/c/647733/ but it's not queued in zuul and zuul hasn't reported on it | 16:02 |
*** ttsiouts has joined #openstack-nova | 16:03 | |
*** markvoelker has quit IRC | 16:03 | |
sean-k-mooney | mriedem: patch set 7 didnt get queued either | 16:04 |
sean-k-mooney | mriedem: i realised like an hour ago that its still missing a test | 16:04 |
sean-k-mooney | so ill respin it later today | 16:04 |
sean-k-mooney | i still need to add the test for backleveling the object | 16:05 |
sean-k-mooney | :) you got to the - workflow before me | 16:06 |
sean-k-mooney | i think there was a zuul restart yesterday around when i pushed it and i think that messed with it | 16:06 |
mriedem | you rechecked it about 3 hours ago though | 16:06 |
mriedem | *4 | 16:06 |
*** ttsiouts has quit IRC | 16:07 | |
sean-k-mooney | ya i dont know why that didnt go through. | 16:07 |
sean-k-mooney | anyway ill update it later today. sorry for the noise | 16:07 |
*** rpittau is now known as rpittau|afk | 16:10 | |
*** lpetrut has joined #openstack-nova | 16:10 | |
*** _erlon_ has joined #openstack-nova | 16:12 | |
*** udesale has quit IRC | 16:12 | |
*** mdbooth_ has joined #openstack-nova | 16:14 | |
*** awalende has joined #openstack-nova | 16:14 | |
*** mdbooth has quit IRC | 16:17 | |
*** spsurya has quit IRC | 16:18 | |
*** mdbooth_ has quit IRC | 16:19 | |
*** awalende has quit IRC | 16:19 | |
*** mrch_ has joined #openstack-nova | 16:20 | |
*** liuyulong has quit IRC | 16:22 | |
sean-k-mooney | alex_xu: just finished https://review.opendev.org/#/c/601596/14 im not sure we should default to copying pmem namespace for resize or cold migration | 16:27 |
openstackgerrit | jacky06 proposed openstack/os-traits master: Sync Sphinx requirement https://review.opendev.org/666386 | 16:32 |
openstackgerrit | jacky06 proposed openstack/os-vif master: Sync Sphinx requirement https://review.opendev.org/666387 | 16:34 |
*** gfhellma has joined #openstack-nova | 16:44 | |
*** ricolin has quit IRC | 16:48 | |
*** lpetrut has quit IRC | 16:49 | |
*** awalende has joined #openstack-nova | 16:50 | |
*** gfhellma_ has joined #openstack-nova | 16:50 | |
*** gfhellma has quit IRC | 16:53 | |
*** dtantsur is now known as dtantsur|afk | 16:56 | |
*** davidsha has quit IRC | 16:57 | |
*** martinkennelly has quit IRC | 16:57 | |
*** markvoelker has joined #openstack-nova | 16:59 | |
*** derekh has quit IRC | 17:00 | |
*** trident has quit IRC | 17:02 | |
*** markvoelker has quit IRC | 17:04 | |
*** trident has joined #openstack-nova | 17:04 | |
*** tesseract has quit IRC | 17:21 | |
*** cdent has quit IRC | 17:22 | |
*** awalende has quit IRC | 17:22 | |
*** eharney has quit IRC | 17:25 | |
*** mgoddard has quit IRC | 17:25 | |
rouk | so, update_available_resource in resource_tracker.py and _refresh_associations in report.py both take so long to complete that the main thread hangs long enough to miss rmq heartbeats for 2-3 minutes at a time on a specific node, causing nova service to be unusable and status to flap. does this sound familiar to anyone? | 17:25 |
rouk | been debugging a specific compute node flapping/being broken for a while, narrowed it down this far, making these functions just return immediately has fixed the symptoms, but those functions are kinda needed for functioning. | 17:27 |
sean-k-mooney | rouk: no not really | 17:27 |
sean-k-mooney | what driver are you using | 17:27 |
sean-k-mooney | and how big is the host its happening on | 17:27 |
*** mgoddard has joined #openstack-nova | 17:27 | |
rouk | sean-k-mooney: libvirt, 256gb ram, 96 threads of cpu, epyc | 17:28 |
sean-k-mooney | and do you have a lot of running instances? | 17:28 |
rouk | other nodes at same spec and exact hw match are... fine, as far as i can tell, suddenly this unit became useless, fresh deploy of it still broken. | 17:29 |
sean-k-mooney | or pci devices | 17:29 |
rouk | nope, totally evicted currently, its a fresh build, had to fail everyone out of it and manually recover, as nova wasnt working enough to complete any migrations. | 17:29 |
rouk | no crazy pcie devices, a 40gig card. | 17:30 |
sean-k-mooney | that seems strange. i have run nova on similarly sized systems in the past (88-core intel system with 192GB of ram) and seen no issues | 17:30 |
sean-k-mooney | if you do virsh capabilities on the host system does it complete quickly or is it slow | 17:31 |
rouk | i have something like 60 other nodes at same spec without issue, this one just cant get through the libvirt calls fast enough even on a fresh deploy, fresh ip, fresh provider config, fresh hostname | 17:31 |
sean-k-mooney | im wondering if this could be a libvirt issue | 17:31 |
rouk | it does seem to hang around in nova's libvirt.py for a while | 17:32 |
sean-k-mooney | what version of python are you running? | 17:32 |
rouk | the very first line in rocky's update_available_resource takes sometimes upwards of 60 seconds, the 2 functions i mention trade in terms of time spent, one cycle it will timeout on one, then the other | 17:33 |
rouk | 2.7.15 | 17:33 |
sean-k-mooney | nova currently does not support python 3.7 and i have seen the compute agent hang with 3.7 | 17:33 |
sean-k-mooney | 2.7.15 on ubuntu bionic right? | 17:33 |
sean-k-mooney | mriedem: if you are around there was a gate issue with 2.7.15 recently right? | 17:34 |
rouk | yep | 17:34 |
rouk | what command should i check for cap list? | 17:35 |
sean-k-mooney | well i was wondering if libvirt was taking a long time to respond to that command | 17:35 |
sean-k-mooney | e.g. was it having trouble introspecting the platform | 17:36 |
*** tesseract has joined #openstack-nova | 17:36 | |
rouk | capabilities returns in under 100ms it seems | 17:36 |
sean-k-mooney | if you had a dodgey disk or a semi broken pci device it could cause that to be slow or toher command | 17:36 |
sean-k-mooney | *other | 17:36 |
sean-k-mooney | basically i was trying to figure out whether it's slow because the compute agent is running slow or because it's waiting around for io from sysfs/libvirt | 17:37 |
rouk | network running at a stable 20gig with 0% loss to the rabbit nodes, disks are happy, 2GB/s to the root device, no other drives in the system... | 17:37 |
sean-k-mooney | 2GB/s to the root disk? | 17:38 |
rouk | yep | 17:38 |
sean-k-mooney | as in you currently can write at that speed or that is the activity you are seeing | 17:38 |
*** eharney has joined #openstack-nova | 17:38 | |
rouk | capability, stable, load is nothing right now | 17:38 |
sean-k-mooney | if you are seeing 2GB/s of writes and the system is idle something is wrong but if you ran dd and it worked at 2GB/s then your disk is fine | 17:39 |
rouk | 3GB/s to the docker mountpoint right now via DD from /dev/zero, no load otherwise | 17:40 |
rouk | 0.03 load across 96 threads | 17:40 |
sean-k-mooney | ya ok so this is likely not related to the system hardware then or to libvirt/sysfs | 17:41 |
sean-k-mooney | latency | 17:41 |
rouk | resources = self.driver.get_available_resource(nodename) is the line that takes the longest in the first freeze spot | 17:41 |
rouk | that one takes upwards of 60 seconds every 2nd cycle | 17:42 |
rouk | _refresh_associations is hanging within the if refresh_sharing: | 17:43 |
rouk | also every 2nd cycle, on the cycle which the other does not freeze, they trade, like some kind of shared lock | 17:43 |
rouk | again, nothing special about this compute node that i can see, same config, same deploy, completely fresh, fresh rmq queues, its never even had a vm on it. | 17:44 |
rouk | same containers as the others, central repo so versions always match. | 17:45 |
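Editor's sketch of one way to narrow down which call is blocking, in the spirit of rouk's line-by-line timing above. The `timed` helper is hypothetical and not part of nova; it wraps a block of code and reports the elapsed wall-clock time when it exceeds a threshold. How it would be wired into the resource tracker is shown only as a comment.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, threshold=10.0, sink=print):
    """Report how long the wrapped block took if it ran past `threshold` seconds."""
    start = time.monotonic()
    try:
        yield
    finally:
        elapsed = time.monotonic() - start
        if elapsed >= threshold:
            sink("%s took %.1fs" % (label, elapsed))


# Hypothetical use around the slow call named in the discussion:
#
#     with timed("get_available_resource"):
#         resources = self.driver.get_available_resource(nodename)
```

Wrapping progressively smaller blocks this way shows where the time goes without stubbing out functions that the node needs in order to work.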
*** psachin has quit IRC | 17:48 | |
rouk | sean-k-mooney: any idea for next steps? or should i just trace all the code down till i find a specific thing blocking it? | 17:48 |
sean-k-mooney | am if you have time that would help. these functions are used to update the inventory of available resources in placement so we can schedule properly | 17:49 |
*** factor__ has joined #openstack-nova | 17:49 | |
*** icarusfactor has quit IRC | 17:49 | |
sean-k-mooney | so without them you cant use that node but there is obviously something wrong | 17:50 |
sean-k-mooney | do you use pci passthrough on that node | 17:50 |
sean-k-mooney | if so you could try commenting out the whitelist to rule out the pci module | 17:50 |
sean-k-mooney | similarly if you are using vgpus you could disable that for debugging in the config | 17:51 |
sean-k-mooney | if the issue still exists with pci passthrough and vgpus disabled in the conf (assuming you were using either) that narrows down the code that could be at fault | 17:52 |
rouk | nope, same config as all other nodes, just rbd storage, oslo memcached, lock_path = /var/lib/nova/tmp, | 17:52 |
sean-k-mooney | ok so we can ignore those codepaths then | 17:52 |
rouk | no major customization other than standard kolla config for a compute node. | 17:52 |
sean-k-mooney | what release did you say you were using | 17:52 |
rouk | can dump you configs if it would help | 17:52 |
*** maciejjozefczyk has quit IRC | 17:52 | |
rouk | rocky, nova==18.1.0, it was working on this node up until monday, no changes in version, got some stein deployments that are happy too. | 17:53 |
rouk | could move up to 18.2.1 i guess, if a fresh one would help maybe? | 17:54 |
*** melwitt is now known as jgwentworth | 17:55 | |
*** awalende has joined #openstack-nova | 17:56 | |
openstackgerrit | Adam Spiers proposed openstack/nova master: Add extra spec parameter and image property for memory encryption https://review.opendev.org/664420 | 17:58 |
sean-k-mooney | am can you file a bug for this and include as much info as you can | 17:59 |
rouk | well if nobody has heard of it, and nobody has had similar, i would need to find the exact cause before a bugreport would be any use... | 17:59 |
sean-k-mooney | it sounds like there is a deadlock or something that is preventing things from correctly yielding | 17:59 |
openstackgerrit | Adam Spiers proposed openstack/nova master: Use fake flavor instead of empty dict in test https://review.opendev.org/662555 | 18:00 |
openstackgerrit | Adam Spiers proposed openstack/nova master: Pass extra_specs to flavor in vif tests https://review.opendev.org/662556 | 18:00 |
openstackgerrit | Adam Spiers proposed openstack/nova master: Extract SEV-specific bits on host detection https://review.opendev.org/636334 | 18:00 |
openstackgerrit | Adam Spiers proposed openstack/nova master: Add <launchSecurity> element to guest config for AMD SEV https://review.opendev.org/636318 | 18:00 |
openstackgerrit | Adam Spiers proposed openstack/nova master: Allow guest devices to include <driver iommu='on' /> https://review.opendev.org/644564 | 18:00 |
openstackgerrit | Adam Spiers proposed openstack/nova master: Detect that SEV is required and enable iommu for devices https://review.opendev.org/644565 | 18:00 |
openstackgerrit | Adam Spiers proposed openstack/nova master: Use <launchSecurity> element when SEV is required https://review.opendev.org/662557 | 18:00 |
openstackgerrit | Adam Spiers proposed openstack/nova master: Enable memory locking if SEV is requested https://review.opendev.org/662558 | 18:00 |
*** markvoelker has joined #openstack-nova | 18:00 | |
sean-k-mooney | rouk: well you dont need to fully root cause it to open the bug, you could update it as you go with your findings | 18:00 |
aspiers | sean-k-mooney, mriedem: this is the best I could come up with for the fake dict issue: https://review.opendev.org/#/c/664420/10/nova/virt/hardware.py@1142 | 18:00 |
*** awalende has quit IRC | 18:00 | |
*** ociuhandu has quit IRC | 18:02 | |
*** gfhellma__ has joined #openstack-nova | 18:03 | |
*** markvoelker has quit IRC | 18:05 | |
*** gfhellma_ has quit IRC | 18:06 | |
*** pcaruana has quit IRC | 18:11 | |
mriedem | sean-k-mooney: the 2.7.15 python thing recently was that hacking check unit test failure | 18:13 |
mriedem | https://bugs.launchpad.net/nova/+bug/1832392 | 18:14 |
openstack | Launchpad bug 1804062 in nova (Ubuntu Eoan) "duplicate for #1832392 test_hacking fails for python 3.6.7 and newer" [High,Triaged] | 18:14 |
mriedem | aspiers: comment inline | 18:16 |
*** damien_r has quit IRC | 18:22 | |
sean-k-mooney | mriedem: kolla-ansible also had some other issue but i think rouk's issue is related to ceph network connectivity | 18:25 |
aspiers | mriedem: thanks! | 18:26 |
sean-k-mooney | not 100% sure but when they tried to connect to ceph from the node the ceph status command was failing sporadically so that might be causing the agent to hang when it tries to get the capacity of the rbd pool | 18:26 |
*** luksky has joined #openstack-nova | 18:30 | |
rouk | yeah, checking for ip conflicts that somehow got into the network managed only by kolla... | 18:31 |
rouk | see what else we can find, its flapping like an ip conflict. | 18:31 |
*** BjoernT has joined #openstack-nova | 18:32 | |
sean-k-mooney | you mentioned you changed the host name between redeploying right | 18:32 |
sean-k-mooney | did you use the kolla ansible bootstrap playbook | 18:32 |
sean-k-mooney | that templates out a /etc/hosts file with static assignments for all nodes | 18:32 |
sean-k-mooney | but if you are also using dhcp that could maybe cause issues | 18:33 |
sean-k-mooney | its been a while since i worked on kolla so they may have changed that | 18:33 |
*** BjoernT_ has quit IRC | 18:34 | |
*** damien_r has joined #openstack-nova | 18:36 | |
mriedem | sean-k-mooney: mlavalle: comments on https://review.opendev.org/#/c/644881/ and question about the assertion that ovs hybrid plug vif types are "neutron wide" configuration - that seems surprising to me | 18:41 |
mriedem | i thought that configuration was per neutron agent | 18:41 |
*** damien_r has quit IRC | 18:43 | |
artom | mriedem, thanks for the review - we're in the final throes of packing, so sean-k-mooney will have to follow up. I've left a reply for 1 thing though. | 18:48 |
*** tesseract has quit IRC | 18:49 | |
*** ivve has joined #openstack-nova | 18:50 | |
*** BjoernT_ has joined #openstack-nova | 18:51 | |
*** BjoernT has quit IRC | 18:52 | |
*** markvoelker has joined #openstack-nova | 19:01 | |
*** ivve has quit IRC | 19:16 | |
*** markvoelker has quit IRC | 19:20 | |
*** phughk has joined #openstack-nova | 19:25 | |
mriedem | https://review.opendev.org/#/c/571265/ adds a functional test for a scheduler filter that was otherwise not tested (i don't think when i wrote it anyway), open for over a year now, has a +2 if someone can look | 19:40 |
*** maciejjozefczyk has joined #openstack-nova | 19:44 | |
*** eharney has quit IRC | 19:47 | |
mriedem | is melwitt out this week? | 19:50 |
*** maciejjozefczyk has quit IRC | 19:50 | |
mriedem | dansmith: looking back on https://bugs.launchpad.net/nova/+bug/1773945 and https://bugs.launchpad.net/nova/+bug/1784074 - if we get here in conductor where the instance is deleted after the build request is created and we've done scheduling but before the instance is created in a cell, we delete the build request here and continue to the next instance we're building: https://github.com/openstack/nova/blob/74aebe0d4e5a978a4001 | 19:51 |
openstack | Launchpad bug 1773945 in OpenStack Compute (nova) "nova client servers.list crashes with bad marker" [Medium,In progress] - Assigned to Matt Riedemann (mriedem) | 19:51 |
mriedem | 0aee9e70e98246c4/nova/conductor/manager.py#L1350 | 19:51 |
openstack | Launchpad bug 1784074 in OpenStack Compute (nova) "Instances end up with no cell assigned in instance_mappings" [Medium,In progress] - Assigned to Matt Riedemann (mriedem) | 19:51 |
mriedem | we don't bury that instance in cell0 presumably because it's already been deleted via build request | 19:51 |
mriedem | however, since we don't bury it, and the build request is gone, and we didn't create the instance, we could have an orphaned instance mapping, yeah? | 19:52 |
jgwentworth | no, just didn't want to lose my other nick from not using it | 19:52 |
*** jgwentworth is now known as melwitt | 19:52 | |
*** ralonsoh has quit IRC | 19:53 | |
* melwitt feebly tries to open that link | 19:54 | |
melwitt | concatenating not working | 19:54 |
mriedem | ok, was going to ping you on ^ as well | 19:54 |
melwitt | what you're saying makes sense though | 19:55 |
melwitt | interestingly recently I was trying to figure out how an instance could possibly be in both cell0 and cell1 in a single cell env, since that's happened to rdo cloud several times. but I found no way | 19:55 |
melwitt | so if you saw how that could be, lmk | 19:56 |
*** awalende has joined #openstack-nova | 19:58 | |
dansmith | mriedem: maybe? | 19:59 |
melwitt | so, we create instance mapping with cell_id=NULL in compute/api. so if we get to conductor and no build request found, we add a placeholder to the instance list | 19:59 |
melwitt | we don't create an instance record | 20:00 |
mriedem | right we'd skip those here https://github.com/openstack/nova/blob/74aebe0d4e5a978a40011e890aee9e70e98246c4/nova/conductor/manager.py#L1391 | 20:01 |
dansmith | yeah was just reading that loop | 20:01 |
dansmith | we assume that if instance is none that it's because we buried it already I think | 20:01 |
mriedem | while i'm looking at this code, we should probably make this fatal now :) https://github.com/openstack/nova/blob/74aebe0d4e5a978a40011e890aee9e70e98246c4/nova/conductor/manager.py#L1238 | 20:01 |
melwitt | we delete all the stuff if quota recheck fails. but if someone has requested a delete of the thing, wouldn't the compute/api delete the instance mapping? I can't remember where we do that for delete | 20:01 |
mriedem | dansmith: or it's already been deleted via the build request before we got that far | 20:02 |
mriedem | melwitt: no we don't delete the instance mapping when deleting the instance (or build request) in the api | 20:02 |
mriedem | that's why surya had to add it to the archive_deleted_rows cmd | 20:02 |
*** artom has quit IRC | 20:02 | |
dansmith | mriedem: yeah, we get instance=None if the build request failed, do we get it for other reasons? | 20:02 |
melwitt | oh, ever? if we never do, then is it an issue? | 20:03 |
dansmith | ohh, | 20:03 |
dansmith | I see, | 20:03 |
openstackgerrit | Adam Spiers proposed openstack/nova master: Add extra spec parameter and image property for memory encryption https://review.opendev.org/664420 | 20:03 |
dansmith | we bury it right away if we get hostmappingnotfound | 20:03 |
dansmith | but if we get instance mapping not found, we don't bury, but instances.append(None) | 20:03 |
dansmith | the loop at the bottom assumes the former behavior | 20:03 |
dansmith | the latter looks like it was added for quota counting reasons | 20:04 |
melwitt | yeah, it was | 20:04 |
mriedem | dansmith: you mean build request not found | 20:04 |
melwitt | the append None thing | 20:04 |
mriedem | https://github.com/openstack/nova/blob/74aebe0d4e5a978a40011e890aee9e70e98246c4/nova/conductor/manager.py#L1350 | 20:04 |
openstackgerrit | Adam Spiers proposed openstack/nova master: Use fake flavor instead of empty dict in test https://review.opendev.org/662555 | 20:04 |
openstackgerrit | Adam Spiers proposed openstack/nova master: Pass extra_specs to flavor in vif tests https://review.opendev.org/662556 | 20:04 |
openstackgerrit | Adam Spiers proposed openstack/nova master: Extract SEV-specific bits on host detection https://review.opendev.org/636334 | 20:04 |
openstackgerrit | Adam Spiers proposed openstack/nova master: Add <launchSecurity> element to guest config for AMD SEV https://review.opendev.org/636318 | 20:04 |
openstackgerrit | Adam Spiers proposed openstack/nova master: Allow guest devices to include <driver iommu='on' /> https://review.opendev.org/644564 | 20:04 |
openstackgerrit | Adam Spiers proposed openstack/nova master: Detect that SEV is required and enable iommu for devices https://review.opendev.org/644565 | 20:04 |
openstackgerrit | Adam Spiers proposed openstack/nova master: Use <launchSecurity> element when SEV is required https://review.opendev.org/662557 | 20:04 |
openstackgerrit | Adam Spiers proposed openstack/nova master: Enable memory locking if SEV is requested https://review.opendev.org/662558 | 20:04 |
dansmith | mriedem: yeah sorry | 20:04 |
dansmith | mriedem: yeah that was melwitt's main counting quotas patch | 20:04 |
dansmith | which added the instances.append(None) for the host not found bury case as well | 20:05 |
dansmith | https://github.com/openstack/nova/commit/5c90b25e49d47deb7dc6695333d9d5e46efe8665#diff-378a96ec6159d0a2f8ec7ab71bc3843bR946 | 20:06 |
dansmith | although it looks like we were skipping those both anyway | 20:06 |
mriedem | yeah, and at some point in talking about https://bugs.launchpad.net/nova/+bug/1784074 we had talked about squashing those two for loops into one again, but i've lost track of the discussion about that (though i think i know how to find it) | 20:07 |
openstack | Launchpad bug 1784074 in OpenStack Compute (nova) "Instances end up with no cell assigned in instance_mappings" [Medium,In progress] - Assigned to Matt Riedemann (mriedem) | 20:07 |
openstackgerrit | Eric Fried proposed openstack/nova master: Clarify --before help text in nova manage https://review.opendev.org/661289 | 20:07 |
mriedem | i'm not really signing up to do that refactor now though, | 20:07 |
*** gfhellma__ has quit IRC | 20:07 | |
dansmith | yeah | 20:07 |
mriedem | just thinking it makes sense for us to delete the instance mapping in this block https://github.com/openstack/nova/blob/74aebe0d4e5a978a40011e890aee9e70e98246c4/nova/conductor/manager.py#L1350 | 20:07 |
dansmith | I think we should just bury in that case, no? | 20:07 |
mriedem | b/c if the BR is gone, and the instance isn't created in a cell yet, the instance mapping shouldn't exist either - it's orphaned | 20:07 |
melwitt | I've been wanting ("wanting") to do that refactor but, you know how it goes | 20:07 |
mriedem | burying it would bring it back | 20:07 |
mriedem | from the br zombieland | 20:08 |
dansmith | mriedem: right, but isn't that the expectation of the api user? | 20:08 |
melwitt | but how does that hurt anything? if we never delete instance mappings? | 20:08 |
dansmith | or is this only in the multi-create case? | 20:08 |
melwitt | *if we never delete instance mappings anyway? | 20:08 |
dansmith | for single create, once you get back a uuid you expect to be able to list that thing (with deleted=yes) until archive | 20:09 |
dansmith | but I guess maybe if you've deleted it it could be up for immediate archive and thus deleting the mapping is equivalent | 20:09 |
dansmith | so yeah, either I guess | 20:09 |
mriedem | melwitt: it could be a problem in a tight window if you got that mapping via build request as a marker when listing/paging | 20:09 |
melwitt | oh, I see | 20:09 |
mriedem | dansmith: if you delete the build request before it's an instance, i don't think we honor reading it as deleted=yes anyway, | 20:10 |
mriedem | because BRs are hard deleted | 20:10 |
mriedem | melwitt: at least that was part of the reasoning behind https://review.opendev.org/#/c/575556/ | 20:10 |
dansmith | mriedem: ack yeah I guess I remember that bit of cheating | 20:11 |
mriedem | it would have been like newton-era talk about that cheating being ok, likely in a small hotel conf room in hillsboro | 20:11 |
mriedem | if i remember correctly | 20:11 |
mriedem | with jpenick squashed on the floor in a corner | 20:12 |
melwitt | lol | 20:12 |
dansmith | yar | 20:12 |
mriedem | the good old days... | 20:12 |
melwitt | it was wharm in there | 20:12 |
mriedem | anyway, i'm about to abandon https://review.opendev.org/#/c/575556/ but was just looking at my last comment there | 20:12 |
mriedem | melwitt: unless you were directly underneath where the air shot out and down on you, which was my case | 20:13 |
mriedem | then it was cold and contact-drying | 20:13 |
melwitt | lol! | 20:13 |
melwitt | well I'll be | 20:13 |
*** markvoelker has joined #openstack-nova | 20:17 | |
*** belmoreira has joined #openstack-nova | 20:20 | |
*** gfhellma has joined #openstack-nova | 20:22 | |
*** gfhellma_ has joined #openstack-nova | 20:24 | |
*** gfhellma has quit IRC | 20:28 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Delete InstanceMapping in conductor if BuildRequest is already deleted https://review.opendev.org/666438 | 20:32 |
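The orphaned instance mapping case discussed above can be sketched roughly as follows. This is a toy model, not nova's conductor code: `FakeApiDb`, `schedule_and_build`, and the dict/set storage are all hypothetical stand-ins, and the only behavior taken from the discussion is that a missing build request leaves a `None` placeholder in the instance list and that the mapping (created with no cell assigned) would otherwise be left behind.

```python
from dataclasses import dataclass, field


class BuildRequestNotFound(Exception):
    """Hypothetical error: the build request was deleted before the build."""


@dataclass
class FakeApiDb:
    # Hypothetical API-database stand-in.
    # instance_mappings: uuid -> cell name (None means no cell assigned yet).
    instance_mappings: dict = field(default_factory=dict)
    build_requests: set = field(default_factory=set)

    def get_build_request(self, uuid):
        if uuid not in self.build_requests:
            raise BuildRequestNotFound(uuid)
        return uuid


def schedule_and_build(db, uuids, cell="cell1"):
    """Return one entry per requested instance; None marks a skipped one."""
    instances = []
    for uuid in uuids:
        try:
            db.get_build_request(uuid)
        except BuildRequestNotFound:
            # The user deleted the server while it was still a build request.
            # No instance record will be created, so the mapping would be
            # orphaned; drop it and keep a placeholder in the list.
            db.instance_mappings.pop(uuid, None)
            instances.append(None)
            continue
        # Normal path: the instance lands in a cell and the build request
        # is no longer needed.
        db.instance_mappings[uuid] = cell
        db.build_requests.discard(uuid)
        instances.append(uuid)
    return instances
```

With two requested instances where the second build request has already been deleted, the result list keeps a `None` placeholder for it and no mapping for it remains.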
*** awalende has quit IRC | 20:32 | |
openstackgerrit | melanie witt proposed openstack/nova master: rbd: use MAX_AVAIL stat for reporting bytes available https://review.opendev.org/556692 | 20:35 |
*** eharney has joined #openstack-nova | 20:36 | |
*** markvoelker has quit IRC | 20:36 | |
*** gfhellma_ has quit IRC | 20:39 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add functional test for AggregateMultiTenancyIsolation + migrate https://review.opendev.org/571265 | 20:40 |
*** awalende has joined #openstack-nova | 20:41 | |
*** whoami-rajat has quit IRC | 20:47 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: libvirt: don't log error if guest gone during interface detach https://review.opendev.org/610727 | 21:00 |
mriedem | relatively simple older patch here that has a +2 on it - it's mostly docstring https://review.opendev.org/#/c/633212/ | 21:04 |
sean-k-mooney | mriedem: just back after dinner ill take a look at https://review.opendev.org/#/c/644881/ now | 21:28 |
mriedem | i'll be gone in 30 minutes | 21:31 |
sean-k-mooney | mriedem: the use of hybrid plug is based on the local config on the compute host, which is read by the agent, and hybrid plug among other things is stored in the neutron db | 21:31 |
*** _erlon_ has quit IRC | 21:31 | |
sean-k-mooney | so when the ml2 driver binds the port to the host it is passed an agent context, like our host_state object | 21:32 |
mriedem | mnaser: figured out why my recreate test in https://review.opendev.org/#/c/647603/ wasn't failing | 21:32 |
sean-k-mooney | which it checks to determine if hybrid plug should be used | 21:32 |
mriedem | mnaser: only took a couple of months | 21:32 |
mriedem | sean-k-mooney: but that's per neutron agent right? | 21:32 |
sean-k-mooney | yes | 21:32 |
mriedem | the commit message makes it sound like if you're ovs hybrid plug, it's deployment-wide | 21:33 |
*** markvoelker has joined #openstack-nova | 21:33 | |
sean-k-mooney | it does. ill comment on the comment and correct it | 21:33 |
sean-k-mooney | for tools like tripleo it's typically configured globally | 21:33 |
sean-k-mooney | but it's actually set per host | 21:33 |
sean-k-mooney | at bind time the ml2 driver is passed the agent context for the agent on the host that nova selected and it determines what to do based on the configuration of that host | 21:34 |
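A minimal sketch of the per-host decision described here, assuming each agent reports its own configuration and the binding result carries the hybrid plug flag in its VIF details. The function name `bind_port`, the `agents_by_host` structure, and the dictionary keys are illustrative assumptions, not neutron's actual ML2 driver interface.

```python
def bind_port(agents_by_host, host):
    """Pick VIF details from the agent on the host the scheduler selected."""
    # Hypothetical lookup: one agent record per compute host, each carrying
    # the configuration that agent read from its local config file.
    agent = agents_by_host[host]
    hybrid = agent["configurations"].get("ovs_hybrid_plug", False)
    # The result depends only on the selected host's agent, so two hosts in
    # the same deployment can legitimately produce different answers.
    return {"vif_type": "ovs", "vif_details": {"ovs_hybrid_plug": hybrid}}
```

Two hosts with different agent settings would get different bindings from the same call, which is why the setting reads as per agent rather than deployment-wide.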
mnaser | mriedem: what object should it be? | 21:34 |
mriedem | mnaser: nvm, i guess it doesn't recreate it | 21:35 |
mriedem | for one thing the test wasn't going through get_stashed_volume_connector when deleting the server, | 21:35 |
mriedem | and _local_cleanup_bdm_volumes swallows the error anyway | 21:35 |
mriedem | but if the bdm.connection_info were 'null' the test would fail before that anyway | 21:36 |
mriedem | so i guess i didn't recreate it | 21:36 |
efried | I think that newton midcycle was my first in-person encounter with you people. | 21:39 |
mriedem | "you people" | 21:39 |
* mriedem calls the PC police | 21:39 | |
sean-k-mooney | efried: does it feel longer or shorter than that :) | 21:40 |
efried | With my dying breath I shall continue to deny that "you people" is an acceptable PC violation, under any circumstances. | 21:41 |
Nick_A | config drive - is there a way to only use it at instance creation and not let instance mount it again after? | 21:41 |
efried | sean-k-mooney: I thought I started somewhere around liberty/mitaka, but that may have been way in the background, no community involvement at that point. | 21:42 |
efried | sean-k-mooney: I do remember understanding ZERO of what went on in that room though. | 21:42 |
efried | (Some things never change) | 21:42 |
*** gfhellma has joined #openstack-nova | 21:42 | |
sean-k-mooney | i think that is common. both initially not following the super detailed conversation and not getting involved in the community right away | 21:43 |
sean-k-mooney | i started playing with openstack at the end of havana, playing with quantum, but it was after icehouse had shipped in early juno that i submitted my first patch i think | 21:46 |
sean-k-mooney | huh my first patch was a neutron spec apparently | 21:48 |
sean-k-mooney | https://review.opendev.org/#/c/95121/ | 21:48 |
*** BjoernT_ has quit IRC | 21:49 | |
*** markvoelker has quit IRC | 21:53 | |
mriedem | seems pypi mirrors have exploded, | 21:53 |
mriedem | so for anyone waiting all day for a CI result, they likely have to recheck | 21:53 |
*** JamesBen_ has quit IRC | 21:54 | |
*** awalende has quit IRC | 21:55 | |
*** awalende has joined #openstack-nova | 21:56 | |
*** awalende has quit IRC | 21:59 | |
*** mriedem has quit IRC | 21:59 | |
*** xek has quit IRC | 22:02 | |
*** mlavalle has quit IRC | 22:08 | |
sean-k-mooney | that probably explains why my recheck earlier never started | 22:10 |
*** belmoreira has quit IRC | 22:13 | |
*** belmoreira has joined #openstack-nova | 22:16 | |
*** zbr|ruck has quit IRC | 22:17 | |
*** luksky has quit IRC | 22:39 | |
*** Sundar has quit IRC | 22:44 | |
*** belmoreira has quit IRC | 22:45 | |
*** markvoelker has joined #openstack-nova | 22:50 | |
*** gfhellma_ has joined #openstack-nova | 22:52 | |
*** gfhellma has quit IRC | 22:55 | |
*** markvoelker has quit IRC | 23:05 | |
*** tkajinam has joined #openstack-nova | 23:06 | |
*** JamesBenson has joined #openstack-nova | 23:12 | |
*** rcernin has joined #openstack-nova | 23:16 | |
*** JamesBenson has quit IRC | 23:17 | |
*** gfhellma_ has quit IRC | 23:25 | |
*** gfhellma_ has joined #openstack-nova | 23:25 | |
*** gfhellma__ has joined #openstack-nova | 23:30 | |
*** gfhellma_ has quit IRC | 23:31 | |
*** gfhellma_ has joined #openstack-nova | 23:43 | |
*** slaweq has quit IRC | 23:47 | |
*** gfhellma__ has quit IRC | 23:47 | |
alex_xu | sean-k-mooney: thanks, I'm ok with default to not copy the data | 23:56 |
sean-k-mooney | i think in the context of cross cell resize it's nice that the default behavior would be the same as intra cell resize | 23:57 |
sean-k-mooney | and it's also consistent with snapshotting and shelve | 23:57 |
*** gfhellma_ has quit IRC | 23:58 |