Monday, 2025-01-20

*** mhen_ is now known as mhen02:16
opendevreviewBiser Milanov proposed openstack/tempest master: test_instances_with_cinder_volumes: Make sure attachments in guest are correct  https://review.opendev.org/c/openstack/tempest/+/93963312:53
sp-bmilanovHi openstack-qa, can someone have a look at this bug and the proposed change? It has been causing some (a non-trivial amount) of our CI runs to fail https://bugs.launchpad.net/tempest/+bug/2095336 https://review.opendev.org/c/openstack/tempest/+/939633 13:03
opendevreviewBiser Milanov proposed openstack/tempest master: test_instances_with_cinder_volumes: Make sure attachments in guest are correct  https://review.opendev.org/c/openstack/tempest/+/93963313:46
fricklerthe failure I'm seeing most often is cleanup failing during volume tests, but maybe making sure the attachments worked properly in the first case would help, too ^^ cc gmann sean-k-mooney 13:57
sean-k-mooneyone or two of the volume test still wait for active rather then sshable which may also be a factor14:41
sean-k-mooneyfrickler: for exampel the test ehy are modifying https://review.opendev.org/c/openstack/tempest/+/939633/2/tempest/scenario/test_instances_with_cinder_volumes.py#15114:43
sean-k-mooneyany tests that attaches a server at run time should wait for sshabel before trying to do the attach14:44
fricklersean-k-mooney: yes, seems that this wasn't don't done consistently everywhere, which surprises me a bit15:34
sean-k-mooneywe got most of the placese a few cycles ago but we defintly missed a few15:35
fricklerok, so next time I stumble across some volume error, I'll try to remember to check the test that is failing for this15:38
gmannyeah we have some places where sshable server might be needed. We have done a lot of them especially volume attach/detach one in past and then did case by case later18:01
*** haleyb_ is now known as haleyb18:03
gmannsean-k-mooney: frickler but in this case, test does wait for the sshable server https://review.opendev.org/c/openstack/tempest/+/939633/2/tempest/scenario/test_instances_with_cinder_volumes.py#16918:04
gmannthere is something else causing issue18:04
sean-k-mooney its waiting after the attachment is done18:08
sean-k-mooneywe need to wait for it to be sshable before call " self.nova_volume_attach"18:08
sean-k-mooneymoving the volume attachment to line 174 as is done in the patch 18:09
sean-k-mooneymay be sufficent but ti woudl be better to explictly have a wait for sshable to make it clear that is what happening18:09
sean-k-mooneydoes constucting the linux clinet on line 169 actully ensure it sshable before it returns?18:10
gmannsean-k-mooney: in this change volume attachment is moved after ssh access right18:10
sean-k-mooneyits moved after the clinet is constucted18:11
sean-k-mooneyis that enouch to try and ssh in want wait?18:11
sean-k-mooneyi woudl not expect get_remote_client to do that but if it does then with the chage applied that enough18:12
gmannget_remote_client is what ssh confirmation 18:12
gmannwhich is this guy https://github.com/openstack/tempest/blob/master/tempest/lib/common/utils/linux/remote_client.py#L11818:12
sean-k-mooneyvalidate_authentication18:13
gmannthis is what we so for wait=SSHABLE case also18:13
gmannwe do18:13
sean-k-mooneyack18:13
sean-k-mooneyok so its validated as a sideffect fo concustiing the client18:13
sean-k-mooneythen y athe current patch is suffeicnt18:13
sean-k-mooneyits less obvious to me then an explict wait but it will do what we want18:14
gmannyeah, its a lengthy test18:16
sean-k-mooneythe part that im still not conviced by it using dev_name18:18
sean-k-mooneywe shoudl not rely on the dev_name hint as that all it is18:18
gmannsean-k-mooney: sure, maybe we can change that in separate change, but good to leave comment there 18:21
sean-k-mooneyim leaving one now but that depenency is being added by this patch18:21
sean-k-mooneyit did not exist before so to me it intorducitng a bug18:21
gmannI see, make sense, will remove my +2, we should fix that 18:23
sean-k-mooneyi left a comment, you know tempet way better then i so ill defer to your judgement on that but i would be inclind to do the ordering change in the intial patch and then have a followup for the extra assertion18:24
sean-k-mooneyim not sure what the best way to write that assetion is given we do not have the udev rule for disk/by-id in cirros18:24
sean-k-mooneybut i woudl assuem blkid or other /sys locations provide the same info?18:25
sean-k-mooneyalthoguh cirros does use the busybox blkid so it may not provide that info18:25
sean-k-mooneyits been a while since i looked at athat in any detail18:25
gmannI am ok to do these extra verification as we have remote ssh client in tests (especially scenario tests) but agree with you that it should handle all cases. Also, I am ok to separate these two things or fix both together, let's wait for author.18:32
sean-k-mooneyack18:33
sean-k-mooneypart of my concern is the name of the volume depend both on how the device is attached, the order and which bus is used. i.e. vd* if its virtio, sd* if its sata or scsi, im not sure if the hint reflect that or not off the top of my head.18:35
sp-bmilanovsean-k-mooney: gmann: we have a truncated serial in /sys/block/vdX/serial, and in theory, the Tempest code can verify those. That was one of my initial implementation options, but it is more complex than attaching devices one by one and making sure we have the expected dev inside the instance18:54
sp-bmilanovI still do not know the root cause of the issue, but it seems to be a race that gets taken care of when doing the attachments serially18:55
sp-bmilanovalso, for some reason, blkid, and lsblk do not show serial IDs for virtio devices18:57
sean-k-mooneywe cant rely on the serial either in all cases18:57
sean-k-mooneyfor type=lun devices it may or may not match the cinder volume uuid18:57
sean-k-mooneyi guess we decied to assuem sda in https://review.opendev.org/c/openstack/tempest/+/918457/2/tempest/api/compute/volumes/test_attach_volume.py18:58
sp-bmilanovright, so, ultimately if we don't want to use /sys/block/vdX/serial, there's no way to confirm the mapping? also, I am going to split the commits just now so its one for the reorder and one for the extra check19:01
sean-k-mooneyi think, if we are to assert the mappign we woudl ideally try using the serial and fall back to the name but im not sure if that is really rquired in the tempest tests19:03
sean-k-mooneyif i was to do this genericlly, i would be included to ssh into the vm and list all the blk devices, then attach a voluem and assert an addtional entry exists (extracting it name) and then on detach assert that the orginal set was restored19:04
sean-k-mooneybut again that really depend on what your are trying to test19:05
opendevreviewLiron Kuchlani proposed openstack/tempest master: Add test to verify instance bootability after volume snapshot  https://review.opendev.org/c/openstack/tempest/+/93932919:06
opendevreviewBiser Milanov proposed openstack/tempest master: test_instances_with_cinder_volumes: Make sure attachments in guest are correct  https://review.opendev.org/c/openstack/tempest/+/93963319:08
opendevreviewBiser Milanov proposed openstack/tempest master: test_instances_with_cinder_volumes: Delay attaching volumes  https://review.opendev.org/c/openstack/tempest/+/93966319:08
opendevreviewBiser Milanov proposed openstack/tempest master: test_instances_with_cinder_volumes: Delay attaching volumes  https://review.opendev.org/c/openstack/tempest/+/93966319:09
opendevreviewBiser Milanov proposed openstack/tempest master: test_instances_with_cinder_volumes: Make sure attachments in guest are correct  https://review.opendev.org/c/openstack/tempest/+/93963319:09
sp-bmilanov> but again that really depend on what your are trying to test19:18
sp-bmilanovI want to make sure that the actual device inside the guest matches what libvirt and OpenStack think the device should be, your approach with the pre- and post- attach checks seem best, I can refactor the change if nobody else minds19:18
sp-bmilanovs/libvirt and OpenStack think/libvirt and OpenStack (and hence the Tempest test) think/19:19
opendevreviewMerged openstack/devstack master: Removing start_ovn_services call  https://review.opendev.org/c/openstack/devstack/+/93760620:05
sp-bmilanovsean-k-mooney: thanks for the suggestions in the comments to the change, I will upload something tomorrow20:16
sp-bmilanovbtw, for 939663, at this point we don't care whether attach has completed or not, that is the point of the next commit that will verify the attachment inside the guest20:21
sp-bmilanovhmm, or maybe waiting to verify the attach at the API level can give us more information if the test fails in some other way20:22
sp-bmilanovyeah, that's fair20:22
opendevreviewMerged openstack/devstack master: Refactor readiness and custom config for ovn-nortd  https://review.opendev.org/c/openstack/devstack/+/93703923:23

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!