kata-dev-irc-bot | <asrivastava> Can someone explain to me how container images are mounted in kata? Is that over the 9P protocol or is there a block device that is mounted into the VM? | 00:07 |
---|---|---|
*** justJanne has quit IRC | 00:07 | |
*** annabelleB has quit IRC | 00:08 | |
*** justJanne has joined #kata-dev | 00:08 | |
*** justJanne has quit IRC | 00:10 | |
*** justJanne has joined #kata-dev | 00:10 | |
*** annabelleB has joined #kata-dev | 00:20 | |
kata-dev-irc-bot | <eric.ernst> @asrivastava it depends on the graph driver being used. | 00:38 |
kata-dev-irc-bot | <eric.ernst> If it is block-based, then we mount it via virtio-scsi (by default) | 00:38 |
kata-dev-irc-bot | <eric.ernst> if it isn't block-based, then 9p is used. | 00:38 |
kata-dev-irc-bot | <eric.ernst> the tip is to use a block based graph driver for best performance (ie, devicemapper) | 00:38 |
kata-dev-irc-bot | <asrivastava> @eric.ernst Thanks! Is there any documentation about graph drivers that you can point me to? | 00:48 |
kata-dev-irc-bot | <eric.ernst> Yeah, docker has good documentation on this, 1 sec. | 00:48 |
kata-dev-irc-bot | <eric.ernst> https://docs.docker.com/storage/storagedriver/device-mapper-driver/ | 00:49 |
kata-dev-irc-bot | <eric.ernst> TL;DR is blow away /var/lib/docker/ and edit/create /etc/docker/daemon.json | 00:50 |
kata-dev-irc-bot | <eric.ernst> with contents: { "storage-driver": "devicemapper" } | 00:50 |
kata-dev-irc-bot | <asrivastava> So the kata runtime mounts the local image into the pod vm as part of creating the sandbox I suppose? | 00:54 |
*** fuentess has quit IRC | 01:10 | |
*** annabelleB has quit IRC | 01:57 | |
*** zerocoolback has joined #kata-dev | 02:10 | |
*** mcastelinoo has joined #kata-dev | 02:11 | |
*** mcastelino has quit IRC | 02:15 | |
*** liujiong has joined #kata-dev | 02:18 | |
*** mcastelinoo has quit IRC | 02:22 | |
*** annabelleB has joined #kata-dev | 02:32 | |
kata-dev-irc-bot | <eric.ernst> Yes. | 02:35 |
*** eernst has quit IRC | 02:40 | |
*** zerocoolback has quit IRC | 02:42 | |
*** zerocoolback has joined #kata-dev | 02:42 | |
*** zerocoolback has quit IRC | 02:47 | |
*** adrianreza has left #kata-dev | 02:51 | |
*** annabelleB has quit IRC | 02:52 | |
*** eernst has joined #kata-dev | 04:05 | |
*** eernst has quit IRC | 04:20 | |
*** sjas_ has joined #kata-dev | 04:34 | |
*** sjas has quit IRC | 04:37 | |
*** zerocoolback has joined #kata-dev | 05:34 | |
*** zerocoolback has quit IRC | 05:45 | |
*** zerocoolback has joined #kata-dev | 05:47 | |
*** zerocoolback has quit IRC | 06:19 | |
*** sameo has joined #kata-dev | 07:33 | |
*** gwhaley has joined #kata-dev | 07:48 | |
*** cdent has joined #kata-dev | 07:59 | |
*** cdent has quit IRC | 08:32 | |
*** cdent has joined #kata-dev | 09:00 | |
*** liujiong has quit IRC | 09:17 | |
*** cdent has quit IRC | 10:27 | |
*** sjas_ is now known as sjas | 10:36 | |
*** gwhaley has quit IRC | 10:59 | |
*** cdent has joined #kata-dev | 11:04 | |
*** gwhaley has joined #kata-dev | 12:35 | |
*** devimc has joined #kata-dev | 13:13 | |
*** sameo has quit IRC | 13:18 | |
*** eernst has joined #kata-dev | 13:34 | |
kata-dev-irc-bot | <eric.ernst> @raravena80 - let's move the I/O discussion over to this channel. | 13:39 |
kata-dev-irc-bot | <eric.ernst> the easiest way to see what's going on may be to paste some more details in an issue (github.com/kata-containers/runtime/issues), and specifically pasting in output of kata-collect-data.sh | 13:41 |
kata-dev-irc-bot | <eric.ernst> @graham.whaley , @archana.m.shinde - can you take a look at how these numbers compare to what we've seen with virtio-scsi? | 13:41 |
*** gwhaley has quit IRC | 13:41 | |
*** eernst has quit IRC | 13:43 | |
*** annabelleB has joined #kata-dev | 13:46 | |
*** cdent has left #kata-dev | 13:46 | |
*** gwhaley has joined #kata-dev | 13:53 | |
*** annabelleB has quit IRC | 13:57 | |
*** annabelleB has joined #kata-dev | 13:58 | |
kata-dev-irc-bot | <eric.ernst> @xu @zhangwei555, do you have more background on volume hotplug use case in our api additions? | 14:02 |
kata-dev-irc-bot | <eric.ernst> I’m wondering if we absolutely need this for 1.0 versus a follow up a few weeks after release. Basically, I’m trying to prioritize features for release. IIRC we didn’t have a use case just yet? | 14:03 |
kata-dev-irc-bot | <eric.ernst> @bergwolf you recall? | 14:03 |
kata-dev-irc-bot | <bergwolf> it's required for frakti use case | 14:04 |
kata-dev-irc-bot | <bergwolf> same as the network hotplug | 14:05 |
kata-dev-irc-bot | <raravena80> @eric.ernst sounds good. Created https://github.com/kata-containers/runtime/issues/195 | 14:06 |
kata-dev-irc-bot | <bergwolf> @eric.ernst AFAICT, all the APIs listed in https://github.com/kata-containers/documentation/blob/master/design/kata-api-design.md are required for v1.0 release except for the plugin APIs, which can be internal for v1.0 and we can export them after v1.0 for private implementations. | 14:08 |
kata-dev-irc-bot | <eric.ernst> Thanks for clarifying. I’m just trying to prioritize, as I didn’t see an “owner” for that feature yet. Is this something that an engineer on hyper side is/can look at enabling? I know you have a lot on your plate today. | 14:09 |
kata-dev-irc-bot | <bergwolf> Can someone from Intel side help on it? We are kind of short handed right now. | 14:11 |
kata-dev-irc-bot | <eric.ernst> Or I wonder if we have any hands/heads for this at Huawei. | 14:11 |
kata-dev-irc-bot | <eric.ernst> A separate line item: what about vm templating? I think this is risky for 1.0 | 14:12 |
kata-dev-irc-bot | <anne> IMHO, vm templating sounds ambitious given our timeline | 14:13 |
kata-dev-irc-bot | <bergwolf> vm templating depends on the hypervisor interface and vm factory | 14:13 |
*** eernst has joined #kata-dev | 14:14 | |
kata-dev-irc-bot | <bergwolf> Once those are settled, vm templating is easy to implement. Another factor is whether we want to carry the vm template qemu patches for some time before they get merged in qemu upstream | 14:14 |
kata-dev-irc-bot | <eric.ernst> Yeah. | 14:15 |
kata-dev-irc-bot | <eric.ernst> I think what we should do is decide on a release cadence. | 14:16 |
kata-dev-irc-bot | <eric.ernst> While VM-templating, for example, is risky at 1.0, perhaps we can decide that we want to either follow k8s like schedule, or do a release on a 6 week schedule. | 14:16 |
kata-dev-irc-bot | <eric.ernst> I think 6 week makes sense, though following another major project would work too. | 14:17 |
kata-dev-irc-bot | <bergwolf> I think we should not release v1.0 until these APIs are in place, which was agreed upon by the arch committee when merging virtcontainers in kata runtime repo. | 14:18 |
kata-dev-irc-bot | <eric.ernst> Yep. Understood. | 14:18 |
kata-dev-irc-bot | <anne> is @zhangwei555 on? it sounds like we need more hands for the frakti work and maybe huawei can help | 14:19 |
kata-dev-irc-bot | <eric.ernst> Yeah, that's my main concern. I hadn't seen any traction on volume side, and thought that wasn't necessarily tied to frakti today. Since it is, I understand we want it included in 1.0. Just want to find more hands here. | 14:19 |
kata-dev-irc-bot | <bergwolf> June 1 is the target release date we are working toward right now. But we should not just release whatever is there when the time comes. | 14:19 |
kata-dev-irc-bot | <eric.ernst> Perhaps 1.0 is a special case, but I think going forward we should try to do time based releases rather than feature based, imo. | 14:21 |
kata-dev-irc-bot | <eric.ernst> Consistent release schedule I think is important, and is what I see from most major projects (again, I think perhaps 1.0 is special here, and understand, @bergwolf) | 14:21 |
kata-dev-irc-bot | <anne> +1 to time-based | 14:22 |
kata-dev-irc-bot | <anne> @bergwolf are there others who are familiar with the frakti use case that can help with that work? | 14:22 |
kata-dev-irc-bot | <bergwolf> For release candidates, yes. For official releases, we should have proper target feature sets and let them block the official releases. | 14:22 |
kata-dev-irc-bot | <bergwolf> @anne one does not need frakti knowledge to implement the volume hotplug APIs | 14:23 |
kata-dev-irc-bot | <bergwolf> It's just about exporting the internal virtcontainers volume hotplug APIs | 14:25 |
kata-dev-irc-bot | <eric.ernst> Yeah, testing would be interesting here though. | 14:25 |
kata-dev-irc-bot | <julio.montes> @bergwolf Hi | 14:25 |
kata-dev-irc-bot | <eric.ernst> ie, making sure it works e2e with what Frakti expects. | 14:25 |
kata-dev-irc-bot | <bergwolf> the current API will be a wrapper around the upcoming new APIs. So testing can be done with kata cli. | 14:26 |
kata-dev-irc-bot | <julio.montes> @bergwolf agent API changed https://github.com/kata-containers/agent/commit/e37feac2d41f29cf68f519471df900d34af2de6f#diff-b99777a97a9c03f0afb8d05c662740edR48 | 14:27 |
kata-dev-irc-bot | <julio.montes> I'm trying to update the agent in the runtime | 14:27 |
kata-dev-irc-bot | <julio.montes> but I have problems in this line | 14:27 |
kata-dev-irc-bot | <julio.montes> https://github.com/kata-containers/runtime/blob/master/virtcontainers/kata_agent.go#L808 | 14:27 |
kata-dev-irc-bot | <bergwolf> @julio.montes that is included in https://github.com/kata-containers/runtime/pull/173 | 14:27 |
kata-dev-irc-bot | <bergwolf> If you guys are OK with the PR, please just merge it. | 14:28 |
kata-dev-irc-bot | <bergwolf> CI is green now | 14:28 |
kata-dev-irc-bot | <julio.montes> yes | 14:28 |
kata-dev-irc-bot | <julio.montes> I need it | 14:28 |
kata-dev-irc-bot | <eric.ernst> @bergwolf yeah, this makes sense. I'd be curious to see if we can get help from Huawei on it. | 14:28 |
kata-dev-irc-bot | <julio.montes> @bergwolf thanks | 14:28 |
kata-dev-irc-bot | <eric.ernst> Don't know if you have insight there. | 14:28 |
kata-dev-irc-bot | <eric.ernst> Other question: do you know if/when @laijs can speak more about vm-templating work? I know there are plenty of people who are interested in this. | 14:29 |
kata-dev-irc-bot | <eric.ernst> I was going to invite a couple other companies to that discussion | 14:29 |
*** mylinux has joined #kata-dev | 14:32 | |
kata-dev-irc-bot | <bergwolf> @eric.ernst can Intel help the new API implementation? I don't know if Huawei can help on it or not. | 14:32 |
kata-dev-irc-bot | <bergwolf> For vm templating presentation, we need @laijs to confirm his time slots | 14:33 |
kata-dev-irc-bot | <anne> @bergwolf this sounds like something we could put on the ML to call for contributors | 14:34 |
*** gabyc_ has joined #kata-dev | 14:34 | |
kata-dev-irc-bot | <bergwolf> @anne I believe they require some in-depth understanding of kata runtime. Not an easy task for new contributors. | 14:35 |
kata-dev-irc-bot | <anne> doesn't hurt to try and see if anyone steps up (can even outline what they need to know) | 14:37 |
kata-dev-irc-bot | <eric.ernst> @bergwolf yeah, wouldn't mark it "good first issue" :slightly_smiling_face: | 14:37 |
kata-dev-irc-bot | <eric.ernst> @bergwolf - to be transparent, I'm not sure we have anyone who can start this feature today. | 14:37 |
kata-dev-irc-bot | <eric.ernst> Once our backlog clears up a little bit we can probably jump on it. | 14:38 |
kata-dev-irc-bot | <eric.ernst> I just want to see if we can get others involved in the meantime (as a bonus, i think the project looks better as well if we get more companies in the git blame :D) | 14:38 |
kata-dev-irc-bot | <bergwolf> ok, how about tag it `help-wanted` on github, and we can come back to it two or three weeks later | 14:38 |
kata-dev-irc-bot | <eric.ernst> Okay, I'll tag it, and I may reach out to a few folks in the meantime too. I just don't want to wait too long. | 14:39 |
kata-dev-irc-bot | <anne> I'd send a note now to let people know it needs help. 3 weeks from now is 30 days from the release goal | 14:39 |
kata-dev-irc-bot | <eric.ernst> Speaking of being busy; you're working on the sandbox API changes, right? | 14:39 |
kata-dev-irc-bot | <bergwolf> I think it should take one week or so to implement | 14:40 |
kata-dev-irc-bot | <bergwolf> Yup | 14:40 |
kata-dev-irc-bot | <bergwolf> right now on `Release()` today | 14:40 |
kata-dev-irc-bot | <bergwolf> I should be able to post something once the issues with `Release()` are settled. I've got some problems with yamux on serial. | 14:42 |
kata-dev-irc-bot | <raravena80> question, does vm templating help with vm boot up time? and/or what is the other use for it? | 14:42 |
kata-dev-irc-bot | <bergwolf> @raravena80 yes. And it saves a lot of guest memory by sharing pages. | 14:43 |
kata-dev-irc-bot | <raravena80> gotcha | 14:43 |
kata-dev-irc-bot | <eric.ernst> cool. @bergwolf -- just to be clear, when I said volume, I meant storage hotplug, ie : https://github.com/kata-containers/runtime/issues/50 | 14:43 |
kata-dev-irc-bot | <eric.ernst> I'll mark it as help-wanted. | 14:44 |
kata-dev-irc-bot | <bergwolf> @eric.ernst yes, that's what I mean as well. | 14:44 |
kata-dev-irc-bot | <eric.ernst> Ack? | 14:44 |
kata-dev-irc-bot | <bergwolf> y | 14:44 |
kata-dev-irc-bot | <eric.ernst> ok. from your perspective, are there other features missing for release that aren't identified here: https://github.com/kata-containers/runtime/issues?q=is%3Aissue+is%3Aopen+label%3Arelease-gating | 14:44 |
kata-dev-irc-bot | <bergwolf> vm factory and metadata storage | 14:46 |
kata-dev-irc-bot | <bergwolf> https://github.com/kata-containers/documentation/blob/master/design/kata-api-design.md#metadata-storage-plugin https://github.com/kata-containers/documentation/blob/master/design/kata-api-design.md#vm-factory-plugin | 14:46 |
kata-dev-irc-bot | <eric.ernst> Pardon my ignorance, but can you clarify, is vm-factory mandatory for functional frakti? Or is this optimization? | 14:47 |
kata-dev-irc-bot | <bergwolf> We do not need to export the plugin APIs, but we need to have them internally. | 14:47 |
kata-dev-irc-bot | <bergwolf> That's existing functionality and a very good feature of runv. We do not want to lose it in v1.0 release. | 14:48 |
kata-dev-irc-bot | <eric.ernst> understood. I'm just trying to look at calendars and features for 1.0 versus 1.1 :slightly_smiling_face: | 14:49 |
kata-dev-irc-bot | <bergwolf> To be frank, there is less point of using kata if it does not support vm factory, from frakti point of view. | 14:49 |
kata-dev-irc-bot | <eric.ernst> Yep, I get it. | 14:50 |
kata-dev-irc-bot | <eric.ernst> is there someone starting to/already looking into this, do you know/ | 14:50 |
kata-dev-irc-bot | <eric.ernst> ? | 14:50 |
kata-dev-irc-bot | <eric.ernst> I'll open issues for it in the meantime. | 14:50 |
kata-dev-irc-bot | <eric.ernst> So at least we can start to have an accurate backlog. | 14:50 |
kata-dev-irc-bot | <bergwolf> I think @laijs is looking at it. but need his confirmation. | 14:51 |
kata-dev-irc-bot | <eric.ernst> Ok, thanks @bergwolf - I just want to make sure I understand, I'm not necessarily questioning, and I definitely understand that y'all have needs, feature wise. | 14:51 |
kata-dev-irc-bot | <eric.ernst> What're your thoughts about vm-templating, versus factory? | 14:52 |
kata-dev-irc-bot | <eric.ernst> There's a dependency, obviously, but we don't need templating for 1.0? | 14:52 |
kata-dev-irc-bot | <eric.ernst> Do you guys do vm templating in production now? | 14:52 |
kata-dev-irc-bot | <bergwolf> vm factory is the framework, and vm templating and vm caching are implementations of vm factory. | 14:53 |
kata-dev-irc-bot | <bergwolf> When I say `vm factory`, I mean all of them. | 14:53 |
kata-dev-irc-bot | <eric.ernst> I see. I am worried these are going to be controversial. | 14:53 |
kata-dev-irc-bot | <raravena80> might be a good idea to defer, imo | 14:53 |
kata-dev-irc-bot | <bergwolf> There is an existing implementation in runv | 14:54 |
kata-dev-irc-bot | <eric.ernst> Sure; I just mean from a security standpoint there are a few parties which had questions and concerns on this. | 14:54 |
kata-dev-irc-bot | <bergwolf> we do not need to write every line of code from scratch | 14:54 |
*** mylinux has quit IRC | 14:54 | |
kata-dev-irc-bot | <eric.ernst> I think we should have a design proposal/review at next arch meeting. | 14:55 |
kata-dev-irc-bot | <bergwolf> Yup, we can discuss possible security issues. And they are optional features from kata point of view. | 14:55 |
kata-dev-irc-bot | <eric.ernst> Or setup a dedicated meeting for it. Let's plan ahead on it, just because I know at least one person from Microsoft and a couple on our side who are very interested in this discussion. | 14:56 |
kata-dev-irc-bot | <eric.ernst> can you work with @laijs on prepping for Monday's call? Or suggest a time otherwise? | 14:56 |
kata-dev-irc-bot | <eric.ernst> (I hate that Monday is 6 days away :)) | 14:57 |
kata-dev-irc-bot | <bergwolf> We need to involve @laijs for the discussion (and preparation) | 14:57 |
kata-dev-irc-bot | <raravena80> @eric.ernst @bergwolf another question, is there a backdoor to see the qemu-lite console output when a container starts, or is that just locked for security reasons? | 14:58 |
kata-dev-irc-bot | <eric.ernst> You can update the guest image to include this. | 14:58 |
kata-dev-irc-bot | <eric.ernst> it isn't enabled by default. | 14:58 |
kata-dev-irc-bot | <eric.ernst> I had a gist which is likely out of date here: https://egernst.github.io/posts/debug-cc-image | 14:59 |
kata-dev-irc-bot | <eric.ernst> s/clear/kata :slightly_smiling_face: | 14:59 |
kata-dev-irc-bot | <raravena80> ah cool :slightly_smiling_face: will give it a try | 14:59 |
kata-dev-irc-bot | <eric.ernst> I think the only change from those directions may be that you'll need to check/verify that your rootfs includes /bin/bash | 15:00 |
kata-dev-irc-bot | <eric.ernst> Ah, I'm thinking this may not be exactly what you asked. | 15:00 |
kata-dev-irc-bot | <raravena80> yeah and probably enable the serial port | 15:00 |
kata-dev-irc-bot | <eric.ernst> This is if you want to get access to the guest in the VM | 15:00 |
kata-dev-irc-bot | <eric.ernst> Is that what you were hoping for? | 15:00 |
kata-dev-irc-bot | <bergwolf> we have `exec in vm` feature in runv. Might be helpful to support it in kata as well. Then console access is through normal virtcontainers/agent APIs. | 15:01 |
kata-dev-irc-bot | <raravena80> I think this is it, so basically, look at the boot-up log, and possibly log in to the vm for debugging, etc. | 15:02 |
kata-dev-irc-bot | <raravena80> but I also like the fact that this is disabled for security reasons. As in nobody should be able to access the vm so that they can try to hack into your container. | 15:04 |
kata-dev-irc-bot | <eric.ernst> @bergwolf merged the internal shim/proxy pr | 15:08 |
kata-dev-irc-bot | <eric.ernst> @raravena80 FYI, I want to learn more about the test you have, and see if we can reproduce. | 15:11 |
kata-dev-irc-bot | <eric.ernst> ie, I want to run it locally on baremetal, if that's straightforward, in order to better understand what/where the bottleneck is. | 15:12 |
kata-dev-irc-bot | <eric.ernst> it shouldn't have that much of a gap, but if it does, we should open up issues and start to address. | 15:13 |
kata-dev-irc-bot | <raravena80> Sure, let me know. I think most of what you need is in the blog. The scylla image is on docker hub and the cassandra test is in the cassandra package. | 15:13 |
kata-dev-irc-bot | <eric.ernst> k, thanks @raravena80 | 15:16 |
kata-dev-irc-bot | <eric.ernst> from looking at your params, it looks like you aren't providing a volume necessarily. | 15:24 |
kata-dev-irc-bot | <eric.ernst> So, having looked for 5 seconds, it appears that the db really sits in the container itself? Is that right? | 15:25 |
kata-dev-irc-bot | <eric.ernst> So, if that is indeed the case, the volume / container.img mounting really shouldn't play in here, it looks like this would be dealing with networking only? | 15:25 |
kata-dev-irc-bot | <eric.ernst> I'll see what I can do locally... | 15:26 |
kata-dev-irc-bot | <eric.ernst> speaking of local v. remote I better start my commute to the office. | 15:27 |
*** mylinux has joined #kata-dev | 15:37 | |
*** eernst has quit IRC | 15:40 | |
*** annabelleB has quit IRC | 15:42 | |
*** eernst has joined #kata-dev | 15:42 | |
*** annabelleB has joined #kata-dev | 15:54 | |
*** mcastelino has joined #kata-dev | 15:56 | |
*** mcastelino has quit IRC | 15:56 | |
*** mcastelino has joined #kata-dev | 15:59 | |
*** mylinux has quit IRC | 16:04 | |
kata-dev-irc-bot | <raravena80> yeah, the db is using the volume in the container. To enable devicemapper I actually mounted a separate SSD on the server and configured Docker to use it. In the end it's all on the same SSD for both bare containers and Kata containers. | 16:13 |
gwhaley | @raravena80, @eric.ernst - the question then is how did you pass that volume on the docker runtime, and what does it look like inside the container if you do a 'mount' ? I think eric.ernst will be looking at that in a bit | 16:19 |
gwhaley | @raravena80 - if you do check that, could you post the info on the github issue - that'll ensure I see it go past my inbox :-) | 16:20 |
*** fuentess has joined #kata-dev | 16:22 | |
kata-dev-irc-bot | <raravena80> @graham.whaley updated the issue :slightly_smiling_face: added the `docker inspect` output and `mount` inside the container. | 16:27 |
*** mylinux has joined #kata-dev | 16:33 | |
*** eernst has quit IRC | 16:35 | |
*** eernst has joined #kata-dev | 16:37 | |
*** mylinux has quit IRC | 16:37 | |
*** gwhaley has quit IRC | 16:39 | |
*** sai_ has joined #kata-dev | 16:43 | |
*** eernst has quit IRC | 16:44 | |
*** mylinux has joined #kata-dev | 16:57 | |
*** eernst has joined #kata-dev | 17:30 | |
*** gwhaley has joined #kata-dev | 17:33 | |
*** oikiki has joined #kata-dev | 18:28 | |
*** sai_ has quit IRC | 18:41 | |
*** justJanne has quit IRC | 18:53 | |
*** justJanne has joined #kata-dev | 18:54 | |
*** annabelleB has quit IRC | 19:21 | |
*** gwhaley has quit IRC | 19:31 | |
*** annabelleB has joined #kata-dev | 19:35 | |
*** mcastelino has quit IRC | 20:23 | |
*** annabelleB has quit IRC | 20:39 | |
*** mylinux has quit IRC | 20:45 | |
*** mylinux has joined #kata-dev | 20:45 | |
*** annabelleB has joined #kata-dev | 20:47 | |
*** annabelleB has quit IRC | 20:57 | |
*** annabelleB has joined #kata-dev | 20:58 | |
*** devimc has quit IRC | 21:18 | |
*** mylinux has quit IRC | 21:50 | |
*** mylinux_ has joined #kata-dev | 21:53 | |
kata-dev-irc-bot | <eric.ernst> @manohar.r.castelino @raravena80 @archana.m.shinde: regarding 9p versus block based. The problem in this scenario is that a volume is being used, specifically for a path. | 21:54 |
*** gabyc_ has left #kata-dev | 21:55 | |
kata-dev-irc-bot | <eric.ernst> We do have a patch in flight, right @archana.m.shinde, for checking if a volume is block based, and if it is to pass it in via virtio. | 21:55 |
kata-dev-irc-bot | <eric.ernst> (scsi in this case) | 21:55 |
kata-dev-irc-bot | <archana.m.shinde> @eric.ernst That is for a different scenario | 21:55 |
kata-dev-irc-bot | <eric.ernst> But, even still, this wouldn't solve the specific issue - you'd still want to look into mounting the DB as block-based. | 21:55 |
kata-dev-irc-bot | <archana.m.shinde> I have opened this issue : https://github.com/kata-containers/runtime/issues/198 | 21:56 |
kata-dev-irc-bot | <eric.ernst> That'll help, but we'd still need to change how scylla is currently set up, right? | 21:56 |
kata-dev-irc-bot | <archana.m.shinde> to address the docker -v <volume-name-or-path>:/path scenario | 21:56 |
kata-dev-irc-bot | <archana.m.shinde> with that one would have to set up a loopback device, mount it and pass that as the volume (-v vol:/var/lib/scylla) | 21:57 |
kata-dev-irc-bot | <archana.m.shinde> the user would need to set up the block device | 21:58 |
kata-dev-irc-bot | <archana.m.shinde> We can have a documented BKM for that | 21:58 |
kata-dev-irc-bot | <eric.ernst> OK - and we'll want to test/get input on potential issues here, right | 21:59 |
kata-dev-irc-bot | <eric.ernst> ? | 21:59 |
kata-dev-irc-bot | <eric.ernst> as in, how many times can we mount the same path? | 21:59 |
kata-dev-irc-bot | <archana.m.shinde> yes, I think we would face sync issues if we try to access the volume from the host as well | 22:00 |
kata-dev-irc-bot | <archana.m.shinde> but I would like to get more input on this, if it would have further side effects | 22:00 |
kata-dev-irc-bot | <raravena80> I can try mounting another block device and pass it in as `-v vol:/var/lib/scylla` to get some numbers. | 22:00 |
kata-dev-irc-bot | <raravena80> does the filesystem matter? can do ext4 or xfs | 22:00 |
kata-dev-irc-bot | <archana.m.shinde> @raravena80: this is something that we need to add support for | 22:01 |
kata-dev-irc-bot | Action: eric.ernst thinks about multiple containers sharing the same volume. | 22:02 |
kata-dev-irc-bot | <archana.m.shinde> before we go ahead and implement it, I would want some more input on this | 22:02 |
kata-dev-irc-bot | <eric.ernst> I guess in context of a pod, it'll just mount once to the VM. | 22:02 |
kata-dev-irc-bot | <archana.m.shinde> egernst: I think we can keep track if the block device has been passed before | 22:03 |
kata-dev-irc-bot | <archana.m.shinde> and avoid passing it to the VM through virtio(blk/scsi) more than once | 22:03 |
kata-dev-irc-bot | <eric.ernst> But, I think you can probably expect that multiple pods could share the same volume, right? | 22:04 |
kata-dev-irc-bot | <raravena80> would be good to have an option to share and not share I suppose | 22:04 |
kata-dev-irc-bot | <archana.m.shinde> that is true @eric.ernst | 22:04 |
kata-dev-irc-bot | <eric.ernst> So, we can start testing this, I suppose -- manually setup as a block device, and start to see what falls over | 22:05 |
kata-dev-irc-bot | <eric.ernst> any I/O measurement that makes use of 9p will be pretty disappointing. | 22:06 |
kata-dev-irc-bot | <archana.m.shinde> @eric.ernst yeah, any way we could not use 9p would be good | 22:07 |
kata-dev-irc-bot | <eric.ernst> @jonolson - any chance you have input / experience here? | 22:07 |
kata-dev-irc-bot | <eric.ernst> The short summary is: when passing in a volume, by default today this'll be passed in via 9p (read: I/O numbers will degrade significantly). In the event you are passing in a block device as a volume parameter, we could handle it by hotplugging into the VM using virtio-scsi. The problem I see is trying to retrofit existing workloads by mounting volume path to block-device and then passing this in to the pod (let's say the smart-performance-hungry-user would do this mounting). Specifically, we're starting to test but wondering what kind of file synchronization issues we'll start hitting. | 22:11 |
kata-dev-irc-bot | <eric.ernst> @archana.m.shinde -^ more or less, right? | 22:11 |
kata-dev-irc-bot | <archana.m.shinde> @eric.ernst: yup, pretty much captures it | 22:12 |
kata-dev-irc-bot | <jonolson> closest we come to this for anything in prod that I know of today is our persistent-disk offering — we allow many read-only attachments of a pd, but only one read-write attachment, and a read-write attachment blocks read-only attachments — for pretty much the reasons you’re expecting — Linux block cache is not forgiving of block data changing out from under it | 22:14 |
kata-dev-irc-bot | <eric.ernst> Yeah, in our case, I don't think we can expect just a single read-write. | 22:14 |
eocardon_ | Hey folks, I've been working on Kata Containers OBS packages and I've launched some Kata Containers already. Packages can be found here (https://build.opensuse.org/project/show/home:katacontainers:release). The packages are under testing and I'll be writing the documentation this week. | 22:15 |
kata-dev-irc-bot | <jonolson> yeah, Linux is not by default well equipped to handle this for block devices — it’s better equipped (although the API is painful) for filesystems, but as you observed 9p leaves much to be desired in terms of performance — i suspect there’s quite a bit of low-hanging fruit for improving 9p performance — if nothing else adding multi-queue support | 22:17 |
kata-dev-irc-bot | <manohar.r.castelino> we had some vhost-9p patches... did we ever benchmark them | 22:21 |
kata-dev-irc-bot | <manohar.r.castelino> https://github.com/clearcontainers/vhost-9pfs | 22:22 |
kata-dev-irc-bot | <archana.m.shinde> @manohar.r.castelino I don't think we have | 22:23 |
kata-dev-irc-bot | <archana.m.shinde> @julio.montes ^ | 22:23 |
kata-dev-irc-bot | <eric.ernst> In the various block-based solutions, I think they all follow the same notion of multiple readers single writer; so I guess that's standard. I'm just not sure how we'd handle that at the runtime level (again, freely admitting ignorance here, and we need to make sure some of us start looking at these "real" use cases dealing with volumes) | 22:26 |
*** annabelleB has quit IRC | 22:28 | |
kata-dev-irc-bot | <archana.m.shinde> @eric.ernst It's possible to pass a volume as readonly on the docker command line | 22:29 |
*** mylinux_ has quit IRC | 22:30 | |
kata-dev-irc-bot | <archana.m.shinde> but again we need to see how to handle a multiple pod scenario | 22:31 |
*** oikiki has quit IRC | 22:40 | |
kata-dev-irc-bot | <raravena80> in the meantime. I'll try to see if I can create scylla docker image with `RUN mkdir /var/lib/scylla` instead of `VOLUME /var/lib/scylla` | 22:46 |
kata-dev-irc-bot | <jonolson> even single writer can be challenging depending on cache behavior | 22:52 |
kata-dev-irc-bot | <eric.ernst> On our side, I feel like we've said "oh well we'll just use NFS over VSOCK." I'm still waiting to see perf numbers here. | 22:54 |
kata-dev-irc-bot | <eric.ernst> I think this is an area we'll need to dedicate a bit more time. That and getting 1.0 defined, measured and released. | 22:55 |
kata-dev-irc-bot | <jonolson> so, practically speaking 9p _should_ offer more potential speed opportunities than NFS | 22:58 |
kata-dev-irc-bot | <jonolson> or rather a paravirt fs should | 22:58 |
kata-dev-irc-bot | <jonolson> in that it has the potential for things like properly functional mmap(), even if it requires running what amounts to a cache coherency protocol to get it | 22:59 |
*** fuentess has quit IRC | 23:01 | |
*** eernst has quit IRC | 23:01 | |
kata-dev-irc-bot | <jonolson> oh, huh… that gives me a really twisted idea… fortunately I’m at NSDI this week and can’t act on this impulse… | 23:01 |
*** eernst has joined #kata-dev | 23:02 | |
*** eernst_ has joined #kata-dev | 23:03 | |
*** eernst has quit IRC | 23:04 | |
*** mylinux has joined #kata-dev | 23:06 | |
*** eernst_ has quit IRC | 23:08 | |
*** mylinux has quit IRC | 23:11 | |
*** annabelleB has joined #kata-dev | 23:59 |
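
Editor's note: the 00:38 to 00:50 exchange says the graph driver decides how the container image reaches the VM (block-based drivers go over virtio-scsi by default, everything else over 9p) and that devicemapper is selected through /etc/docker/daemon.json. A minimal sketch of that rule follows. It is an illustration only, not Kata's code: `BLOCK_BASED_DRIVERS`, `choose_rootfs_transport`, and `docker_daemon_config` are hypothetical names, and the set lists only devicemapper because that is the only block-based driver named in the chat.

```python
import json

# Assumption: devicemapper is the only block-based driver named in the log.
BLOCK_BASED_DRIVERS = {"devicemapper"}


def choose_rootfs_transport(graph_driver):
    """Return the transport the chat describes for a given graph driver."""
    if graph_driver in BLOCK_BASED_DRIVERS:
        # Block-based drivers are passed to the VM via virtio-scsi by default.
        return "virtio-scsi"
    # Anything that is not block-based is shared with the VM over 9p.
    return "9p"


def docker_daemon_config(existing=None):
    """Return daemon.json contents that select the devicemapper driver.

    Any keys already present in `existing` are kept (hypothetical helper).
    """
    config = dict(existing or {})
    config["storage-driver"] = "devicemapper"
    return json.dumps(config, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(choose_rootfs_transport("devicemapper"))
    print(choose_rootfs_transport("overlay2"))
    print(docker_daemon_config())
```

The helper only builds the file contents. Writing them to /etc/docker/daemon.json and clearing /var/lib/docker/ are left to the operator, as in the chat.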
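
Editor's note: at 22:14 jonolson describes an attachment policy for a shared block volume: many read-only attachments, only one read-write attachment, and a read-write attachment blocks read-only ones. A minimal sketch of that policy is below. It is not code from Kata or any production system; `VolumeAttachments` and its methods are hypothetical, and refusing a new writer while readers exist is an assumption added for symmetry that the chat does not state.

```python
class VolumeAttachments:
    """Tracks attachments of one block volume: many readers or one writer."""

    def __init__(self):
        self.readers = set()  # pods holding read-only attachments
        self.writer = None    # pod holding the single read-write attachment

    def attach(self, pod_id, read_only):
        """Record an attachment; return True if allowed, False if refused."""
        if self.writer is not None:
            # A read-write attachment blocks every other attachment.
            return False
        if read_only:
            self.readers.add(pod_id)
            return True
        if self.readers:
            # Assumption (not stated in the chat): readers block a new writer.
            return False
        self.writer = pod_id
        return True

    def detach(self, pod_id):
        """Drop whatever attachment the pod holds, if any."""
        self.readers.discard(pod_id)
        if self.writer == pod_id:
            self.writer = None


if __name__ == "__main__":
    vol = VolumeAttachments()
    print(vol.attach("pod-a", read_only=True))
    print(vol.attach("pod-b", read_only=False))
```

A runtime would also need the per-VM bookkeeping archana.m.shinde mentions at 22:03 (not passing the same device to one VM twice); this sketch covers only the cross-pod policy.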