Episode 9: You Say Key, I Say Quay!

Today, Brian and James discuss Docker registries. I mean, you have to put all those Docker images somewhere, right? Why not DockerHub? Or Quay.io? Turns out there's a whole host of other options, some paid, some private, and we're going to hit them all.

Brian Seguin
Hello, you’re listening to Rent Buy Build, where we talk about the pieces and parts of the cloud native infrastructure and whether you should rent them, buy them or build them yourself. I’m Brian Seguin.
James Hunt
And I’m James Hunt.
Brian Seguin
Today we’re talking about something that I found to be very interesting, surprisingly, which is image registry stuff.
James Hunt
Ah, yes, registries. It is very interesting. It’s a deep topic, surprisingly deep, given that the premise is: get the image somewhere so someone else can come get the image. In a nutshell.
Brian Seguin
It’s interesting, because there are a lot of polarized opinions out there about how to do registry stuff and which registry system is better. And, you know, there are a lot of best practices surrounding registries. But before we get into that, what is an image registry?
James Hunt
So the thing that makes Kubernetes work, that makes Docker work, and really makes all containerization strategies and platforms in the modern world work, is a thing called the OCI spec, from the Open Container Initiative. When Docker hit the scene many, many years ago, they kind of ad hoc built a means of compiling packages, putting all the filesystem stuff together, and layering things so that you could distribute a Docker image to another host and have the exact same environment when you spun it up. That little innovation was very, very straightforward. We had been doing packaging on Unix for years. But the innovation of bringing the filesystem with you, along with environment and a command to start the whole thing up, is what really jumpstarted the containerization revolution. And what made it more than just an interesting technical footnote in the history of cloud computing is that the Open Container Initiative was formed by a bunch of different companies, and they hashed out: here’s what an image is, here’s the format and the filesystem structure, here’s how we tar all these things together, and here’s how we name and refer to images. Once we had that, we had to figure out how to get an image from point A to point B. The way they decided to do that was through a centralized unit of storage that we call the Docker image registry, and it’s handled by the OCI distribution spec. So there are two specs at play. The OCI image spec is how the images get put together. The distribution spec covers what we’re going to be talking about today: the software that actually moves images around, names them, and makes them available.
Brian Seguin
Interesting. I have a couple of questions about what this is. Would Homebrew, for example, be an example of an image registry?
James Hunt
No, because Homebrew is, if you’re talking about the macOS...
Brian Seguin
Yes.
James Hunt
...package manager, that’s just a package manager. An example that would be familiar to pretty much anybody who’s ever done anything with Docker is Docker Hub. Docker Hub is a public image registry that is free to use, with some API rate limiting that they recently added to control their own costs. For the most part, when you do a docker pull of any image, if you’re following along with a blog, watching a video, or running a training course, and you do something like docker pull alpine, that Alpine image has, not the kernel, but the user space for Alpine: musl libc and all the utilities, the thing you’re actually going to docker run. Those bits had to come from somewhere, and in the case of the Alpine image, they come from Docker Hub. When most people first start with Docker, you don’t even see the registry, right? It’s behind the scenes. When you build an image, you can build it locally, never give it to anybody, and never interact with a registry. When you docker push something, that’s telling the Docker daemon: I need to take this image and put it on another server. And if you docker push without a host in the name of the tag, as most people will do, it goes to Docker Hub. I have a Docker Hub account, iamjameshunt, right? So if I docker tag something as iamjameshunt/rent-buy-build and then docker push that, the push is what causes my Docker daemon to talk to the Docker Hub servers and start trading bits back and forth. Here’s this layer. Here’s that layer. Here’s the third layer. Here’s how they all go together. Here’s the manifest. Here’s the tag name. Here’s the SHA. Here’s all the stuff packaged together.
After a while, people realized that Docker images could contain sensitive stuff, especially if you’re dealing with non-compiled or interpreted languages: Ruby, Python, Perl, Lisp, any of the things where the code that the human reads and writes is also what the computer interprets. That code goes into a Docker image, and it is trivial to pull back out of the Docker image.
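The default-registry behaviour James describes (a name with no host goes to Docker Hub) can be sketched in Python. This is a simplified illustration, not Docker's own parser: the function name parse_image_ref is made up for this example, it ignores digest references such as name@sha256:..., and the sample image names in the usage note are hypothetical.

```python
from typing import NamedTuple

DEFAULT_REGISTRY = "docker.io"   # Docker Hub, used when the name carries no host
DEFAULT_NAMESPACE = "library"    # where official images such as alpine live
DEFAULT_TAG = "latest"


class ImageRef(NamedTuple):
    registry: str
    repository: str
    tag: str


def parse_image_ref(name: str) -> ImageRef:
    """Split a name such as 'alpine' or 'host:5000/team/app:1.2' into its parts."""
    registry = DEFAULT_REGISTRY
    remainder = name
    first, slash, rest = name.partition("/")
    # The first path component counts as a registry host only if it looks like one.
    if slash and ("." in first or ":" in first or first == "localhost"):
        registry, remainder = first, rest
    repository, colon, tag = remainder.rpartition(":")
    if not colon or "/" in tag:
        # No tag was given, so fall back to the default tag.
        repository, tag = remainder, DEFAULT_TAG
    if registry == DEFAULT_REGISTRY and "/" not in repository:
        # Bare names on Docker Hub live under the 'library' namespace.
        repository = f"{DEFAULT_NAMESPACE}/{repository}"
    return ImageRef(registry, repository, tag)
```

With this sketch, parse_image_ref("alpine") resolves to Docker Hub's library/alpine at tag latest, while a name like quay.io/someorg/someapp:1.0 keeps its own host.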
Brian Seguin
Are we talking about, like, your application code? Or are we talking about your customer information?
James Hunt
Usually application code. Very rarely do people bake data into Docker images. It’s pretty much an anti-pattern to stuff data into a container, because the container image is a static thing. Every time you restart the container, the filesystem gets reverted to the filesystem spec in the image. So when you bounce an app server, you lose any changes you may have made when you docker exec’d in, or kubectl exec’d in, to maybe edit code. And this is a terrible idea. Don’t do this at home, ladies and gentlemen. If you’ve modified anything inside of a container while it’s running and you then bounce that container, the filesystem reverts. So we don’t generally see data in images. But the code is sometimes just as important, especially when you’re dealing with proprietary software. If you’re running a software as a service, you do not want the images for how your SaaS operates on Docker Hub, because that would make it trivial for someone to pull the image down and run an exact clone of your service. Or a competitor could pull your image down, learn how you did that whiz-bang feature that pulled 20% of their users over to your platform, and then easily come up to speed and build a copycat feature into their platform. So the idea of Docker Hub is cool and all, and it’s great for the shared public things we all agree on, like: what does Postgres look like? How should Redis be packaged? What is MariaDB? Or, I just need Ubuntu in a container to use as a base image for something else. Those things work beautifully on public registries. But my code, my application, my special sauce, I really need to put somewhere else. Where the OCI registry spec really becomes helpful is that it allows people to implement their own copy of Docker Hub. You can have your own registry that is access controlled, because OCI distribution is all done over HTTP and HTTPS.
So you can do all the things you’re used to with HTTP authentication: basic auth, digest auth, all those types of pluggable systems. You can proxy it. You can do all the things you can do with a web service. So you can build a private registry where you control who gets into it.
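Because the distribution spec is plain HTTP, a manifest fetch is just an authenticated GET. The Python sketch below only builds the request and does not send it. The /v2/<name>/manifests/<reference> path and the OCI manifest media type come from the OCI distribution and image specs; the helper name, host name, and credentials are hypothetical, and basic auth is assumed here only because it is the scheme mentioned above (many hosted registries use token-based auth instead).

```python
import base64
import urllib.request


def manifest_request(registry: str, repository: str, reference: str,
                     username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a GET for an image manifest using HTTP basic auth."""
    # Manifest endpoint defined by the OCI distribution spec.
    url = f"https://{registry}/v2/{repository}/manifests/{reference}"
    # Basic auth is the base64 of 'user:password' in the Authorization header.
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    headers = {
        "Authorization": f"Basic {token}",
        # Ask for an OCI image manifest rather than some other media type.
        "Accept": "application/vnd.oci.image.manifest.v1+json",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


# Hypothetical private registry and credentials, for illustration only.
req = manifest_request("registry.example.internal", "team/app", "1.0", "ci", "secret")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would perform the actual pull of the manifest; a proxy or any other HTTP middleware can sit in front of the registry in the usual way.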
Brian Seguin
And I think that segues a little bit into some of the controversies surrounding registries. A lot of the best practices recommend that you create and push to your own private registries because of security concerns, right? There’s also a bunch of best practices surrounding continuous integration of CVE patches, and also security scanning. A lot of consumers of Docker Hub maybe don’t realize that they’re actually just pushing their containers out publicly, and pulling public images that may or may not be scanned or patched, depending on what service they’re using as a registry.
James Hunt
Right. Especially if you’re not even aware that there is a registry, and Docker makes it incredibly easy to not be aware of the fact that your images live somewhere. If you’re not aware of that, it becomes trivial to either accidentally publish something that you didn’t mean to, or to consume something without really understanding that the Postgres image is really just managed by a group of volunteers on the internet. Getting into the root library, as they call it, of Docker Hub images is really just: are you willing to step up and do the maintenance? Can you handle the support? And if you can, they don’t do a ton of background vetting. And once you’re a maintainer, there’s no “they,” right? There’s no “they” who’s guarding. There’s not an app store with a review team and other stuff. You just push images. A lot of the pushback you’re seeing in the communities and ecosystems in cloud and cloud native against using Docker Hub is, I think, a symptom of a much wider supply chain existential crisis that’s going on right now in cloud.
Brian Seguin
Dun dun da dun. Is this the open source crisis?
James Hunt
Yes. This has been going on for a while. Everywhere there is open source, somebody is profiting from the open source without compensating the maintainers. There’s a lot of that going on right now. The GPL loopholes allow SaaS companies to incorporate free and open source software into their product stacks without any remuneration for the authors and without any contribution back of patches, because the software-as-a-service vendors, Amazon, Google, Salesforce, GitHub, are not distributing the code. They’re running the code on their own servers. The terms of the GPL, which was born in the 1980s, were all about distribution. With always-on broadband, always-on LTE, always-on wireless, and with 5G coming, we don’t have to distribute the code anymore, because we’re not running the software as consumers. We’re just using it through a web browser. So there are a lot of open source maintainers who have kind of woken up and said: look, I’m spending gobs and gobs of my time, just tons and tons of hours, maintaining these open source projects. It’s affecting my health, it’s affecting my relationships, it’s affecting my sense of self-worth, and I’m not getting anything valuable out of this. This used to be fun, right? And now I just have a lot of anger and a lot of burnout. And that’s kind of normal. It’s unfortunate, and it’s a problem we’ll have to solve as an industry.
Brian Seguin
But open source maintainers are not like Instagram influencers who are getting sponsorships and things of that nature.
James Hunt
Right. And, you know, there have been attempts to fix that. Patreon is one such system, GitHub Sponsors is another, but it’s one of those too-little-too-late things, and it doesn’t always work. So a lot of maintainers over the past few years have been stepping down from very popular and demanding maintainership positions. And what happens in the power vacuum these people leave as they exit these responsibilities? Sometimes somebody steps up to take over with the goodwill of the community in mind and the technical chops to pull it off. But sometimes somebody looks at the number of users of software that is now not maintained, they look at the number of devices it runs on, and they step up to maintain it with nefarious purposes. So you get npm modules that get sold to malicious actors, who then embed coin-mining software, and...
Brian Seguin
This is why you implement security scanning solutions: to ensure that none of those things are in your images as you’re pulling them from public or private registries.
James Hunt
Right. And as a supply chain attack, the easiest way to mitigate a supply chain compromise is to own the supply chain and be able to audit it yourself, right? That’s why you see this backlash against public image registries like Docker Hub, and people advocating for the use of your own private registries. You’re still going to use the open source Postgres image. You’re just going to validate that it has in it what you think it has in it before you roll it into prod.
Brian Seguin
So there’s a lot of depth we can go into on the security scanning aspect.
James Hunt
Would you say there’s a whole episode on that?
Brian Seguin
Yes. And we’re actually going to be talking about security scanning next episode. So let’s just get right into the rental solutions. We already mentioned Docker Hub, but there’s also Queequeg...
James Hunt
Queequeg? Queequeg?
Brian Seguin
Queequeg. It’s a Moby Dick reference.
James Hunt
Docker push.
Brian Seguin
No. Quay.io, I believe. I think it’s a thing that’s run by Red Hat.
James Hunt
It is now. Quay used to be, I believe, part of CoreOS, but I could be wrong on that. I’ll double check that and put it in the show notes. If, like me, you have a reader’s vocabulary, Quay is actually spelled Q-U-A-Y dot io. And for years I called it “kway,” until I said that and looked like an absolute fool in front of someone who knew how to pronounce it. Quay.io is, yeah, another Docker Hub competitor. They offer free registries for public stuff and paid for private.
Brian Seguin
And they have some scanning that just comes out of the box. I think Docker Hub has some limited scanning out of the box too. Those are the two main SaaS solutions if you’re going public...
James Hunt
...independent of the cloud providers.
Brian Seguin
Right, so then you have the cloud providers. And I love some of the stances that the cloud providers take on this, because...
James Hunt
It makes sense, right? It makes sense that the cloud providers would also be your package providers.
Brian Seguin
So there’s ECR, GCR, and ACR, right? Amazon, Google, and Azure.
James Hunt
Which one is ECR? Which one is ACR?
Brian Seguin
ECR is Amazon. ACR is Azure.
James Hunt
Amazon should just rename themselves to Elastic Amazon.
Brian Seguin
So those are the main ones. And they all have a very similar charging model when it comes to these registry systems, though some are more convoluted than others. Basically, you pay for the consumption of the cloud the registry system sits on. ECR, the Amazon one, has like 500 different metrics and like 50 more that they’re planning on adding next year. No, I’m joking, because Amazon’s billing is the most complicated.
James Hunt
They have a reputation to keep up, right? I mean, yeah. Why else would you pay for the AWS Cost Explorer if the cost were just for transfer and storage?
Brian Seguin
Google is fairly simple. You pay for storage, and you pay for network egress.
James Hunt
Right. The bigger my images are, and the more often I pull them down or push them up, the more I’m going to pay.
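The storage-plus-egress model is simple enough to write down as arithmetic. In this Python sketch every number is a placeholder chosen to make the math easy to follow, not any provider's actual pricing, and the function name is invented for the example.

```python
def estimate_monthly_cost(image_size_gb: float, images_stored: int,
                          pulls_per_month: int,
                          storage_rate_per_gb: float,
                          egress_rate_per_gb: float) -> float:
    """Storage plus egress: bigger images and more pulls both raise the bill."""
    stored_gb = image_size_gb * images_stored      # what sits in the registry
    egress_gb = image_size_gb * pulls_per_month    # what leaves the registry
    return stored_gb * storage_rate_per_gb + egress_gb * egress_rate_per_gb


# Placeholder rates (NOT real prices): 0.25 per GB stored, 0.125 per GB pulled.
cost = estimate_monthly_cost(image_size_gb=2.0, images_stored=50,
                             pulls_per_month=1000,
                             storage_rate_per_gb=0.25,
                             egress_rate_per_gb=0.125)
print(cost)  # 100 GB stored * 0.25 + 2000 GB pulled * 0.125 = 275.0
```

Halving the image size halves both terms, which is one reason slim base images matter on a consumption-priced registry.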
Brian Seguin
Right, exactly. And then ACR is kind of a hybrid middle ground of the two. It has a few more complicated metrics in it, but it’s basically the same consumption-type model. And all of these have...
James Hunt
You get a discount on ACR if you’ve recently decommissioned Windows XP in your enterprise, right? You get your credits. Microsoft, if you’re listening, you can have that idea for free.
Brian Seguin
So from a rental side of things, this is probably the interesting use case where I don’t think we’re going to be recommending rental too much. You can do rental if you’re starting out, but I think the goal for most organizations is going to be a buy scenario.
James Hunt
Well, if we’re not going to recommend rental, what do the buy and the build look like, Brian?
Brian Seguin
Right. So the buy scenario, you know, that’s Queequeg.io, or wait, sorry, “kway”.io... Quay.io. Quay.io and Docker both have private registry systems that you can purchase or license somehow and deploy on-prem or in your cloud environment.
James Hunt
Right. And it’s important to differentiate. They have SaaS offerings where you can buy private image repos: Docker Hub charges you per user, Quay charges you per repo. And then they also have on-prem, where everything is going to be private, because it’s on-prem. So they’re both rent and buy.
Brian Seguin
And the interesting thing, just to tie in one of the other cloud providers, is Linode. They actually recommend the buy scenario here as well, where you get your Docker Trusted Registry and deploy it to LKE, their Linode Kubernetes Engine, and then you just consume their object storage for storage.
James Hunt
Are they actually providing DTR, or are they just having you use Docker Distribution? DTR is the private, on-prem Docker Hub.
Brian Seguin
Oh, gotcha, gotcha. So DTR is the private on-prem one. And then what? Docker Distribution is the one that goes the route of using a cloud provider?
James Hunt
Docker Distribution is an open source project by the Docker project, by Moby and Docker Incorporated. It was the original Docker registry, right? So they package it up, unsurprisingly, as a Docker image and put it on Docker Hub for free, and you can pull that down and run it anywhere you can run another Docker image. And Linode’s stance was: you already have LKE, and it’s amazing, and you already have OBJ, and it’s amazing. Full disclosure: Linode is a sponsor, but we do like them despite that. Once you have LKE and once you have OBJ, you can back your registry storage with OBJ, which is an S3 workalike, and you can spin up the registry containers themselves on LKE. You have this nice little wrapped-up package of a Kubernetes cluster and the images for the cluster, all kind of self-contained.
Brian Seguin
It’s a nice best practice that they can help guide you through with their docs. So the biggest thing from a buy scenario is to have that extra layer of security where you’re restricting network access into your image build process. A best practice here would be: you’re scanning all of the things that you’re pulling from Docker Hub or what have you, and then you’re pushing that to your own private Docker repo, or your own Quay repo, or whatever you’ve stood up. And that’s also behind some type of network firewall that restricts access to it. When you’re pushing a container to your own infrastructure, whether that’s a rented IaaS or your private infrastructure, it’s then pulling from the registry that you have already vetted and cleared and built yourself, that’s behind the firewall, and you know that it’s secure.
James Hunt
Right. You’re managing that supply chain problem by introducing a choke point, or your own warehouse, as it were, of images: images that you know are good, that have passed your auditing, so that you know everything past this point is good. We don’t have to scan everywhere. We just have to scan at the egress, at the gateways, the registries themselves. And then you have a whole inflow process for how you get an image into the registry, which you can put as much red tape on as you need or want, whatever your regulatory requirements are. Say you need to scan for CVEs, and you have a high watermark: if it has more than one moderate-to-high CVE, it can’t go into the registry. Then on the other side of the registry, if all of your Kubernetes clusters, your Docker Compose fleets, or any of your containerized infrastructure can only pull from that registry, you kind of have that proof by induction that the images are at least vetted, if not more secure than their public counterparts. And that’s really the value of buy: it’s yours, right? You get the flexibility to outfit it with whatever additional plugins and processes you need. Normally our process here on Rent Buy Build is that you should rent things when you’re just starting out, which is good. If you’re using Docker Hub, pay the five bucks a month or seven bucks a month, whatever it is, for the private registries. And then as you scale up, buy or build. I think you’re going to get to the build or the buy faster here, because it’s so cheap. Managing one of these things is super simple. The Docker image is pretty self-contained. I’ve spun up dozens of image registries for various projects, and I don’t even really think about it. Like cert-manager or an Ingress, it’s just a thing you add to the cluster.
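The high-watermark rule James describes (more than one moderate-to-high CVE keeps an image out) could be expressed as a small admission check. This Python sketch assumes scan findings have already been produced by some scanner; the Finding type, the severity names, the function name, and the placeholder CVE identifiers are all made up for illustration.

```python
from dataclasses import dataclass

# Assumed severity scale, lowest to highest; real scanners may label these differently.
SEVERITY_RANK = {"low": 0, "moderate": 1, "high": 2, "critical": 3}


@dataclass(frozen=True)
class Finding:
    cve_id: str
    severity: str


def may_enter_registry(findings, threshold="moderate", max_allowed=1):
    """Admit an image unless it has more than max_allowed findings at or above threshold."""
    serious = [f for f in findings
               if SEVERITY_RANK[f.severity] >= SEVERITY_RANK[threshold]]
    return len(serious) <= max_allowed


# Placeholder identifiers, not real CVEs.
clean_enough = [Finding("CVE-0000-0001", "low"), Finding("CVE-0000-0002", "moderate")]
too_risky = clean_enough + [Finding("CVE-0000-0003", "high")]
print(may_enter_registry(clean_enough), may_enter_registry(too_risky))  # True False
```

The check only gives the "proof by induction" property if it is the sole path into the private registry, so in practice it would sit in the pipeline that mirrors public images inward.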
Brian Seguin
And once you know how to do it, and once you’ve established your own process for doing it, it’s easy to maintain over time, and it doesn’t take a lot of infrastructure footprint.
James Hunt
Right. Your storage is elastic, because for the most part you’re going to back it with S3 or an S3 workalike like ECS or OBJ. And that’s really the only cost. You can scale it up, right? You can run multiple instances of the registry and add a Redis cache, and you can make these things really complicated. But for a single use, as you scope things to clusters, you can get a lot of mileage out of a single container.
Brian Seguin
And then that kind of leads us to the build scenario here, which...
James Hunt
...is taking some of these open source Docker image distributions and customizing them to be what you need them to be, kind of. I think the examples I would point to of build were solved inside of large companies with scaling concerns. And I mean, like, massive scaling concerns.
Brian Seguin
So, I mean, are we talking about Uber’s...
James Hunt
Kraken. Brian, release the images!
Brian Seguin
Don’t worry, I’ve got a Queequeg here to help me with sourcing and sea shanties.
James Hunt
Although I think we missed the boat on that. Come for the tech talk, stay for the dad jokes. No, Kraken is interesting to me. Kraken, Dragonfly, systems like that, where they said: look, the OCI distribution spec is neat and all, and it’s great that it’s interoperable and everyone can push to OCI-dist-compatible registries, but we need to be able to pull, and we need to be able to pull much faster.
Brian Seguin
Yeah, the whole latency concern, you know. That actually takes a lot of time when you’re pushing images.
James Hunt
It does, or it can, right? If your images are big. And sometimes images are going to be big, because the software you’re building is big, with lots of dependencies. You pay for that speed of getting to market by having lots of extra stuff that you’re relying on: libraries and frameworks and other things. A large image has a couple of downsides. One is that it takes a while to build, but that’s a one-time thing per iteration of the image. The real downside is that Kubernetes has to pull the image onto the kubelet, onto the node, before it can run the container. Docker Compose has this problem as well. The more time it takes to download the image, the longer the response time in a scaling window that your containerization platform is going to experience. Say you’re under a massive traffic spike, right? You got posted to the top of Hacker News, your brand new SaaS is hot, there’s a ton of people coming in, and you’re trying to quickly scale up the application server back end. Well, if you haven’t already primed every node by having the images pulled down, each node that doesn’t already have the app image is going to have to pull it. And naively, the way containerization does that, the way Docker does it at least, is it tries to download all of the layers and then extract them in order. Which means if you’ve got a really big layer in the middle, and most images I’ve ever seen have a very large, chunky layer with all the code in it, with a bunch of little command layers after and environment layers before, they’ll get stuck downloading this really big layer, and the box isn’t doing anything else network-wise while it’s chunking through this. So what Kraken and Dragonfly did is they said: OCI dist is neat.
But let’s stuff BitTorrent, or something very similar to it, on top, so we can parallelize pulling down those blob layers from not just the distribution point, but also from other hosts nearby in the neighborhood that might also have those blobs in their own disk caches.
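The idea of fetching layers in parallel from nearby peers, with the registry as the fallback, can be illustrated with a toy planner in Python. This is not how Kraken or Dragonfly are actually implemented; the function names, node names, and digests are hypothetical, and fetch is a stand-in for whatever really transfers a blob.

```python
from concurrent.futures import ThreadPoolExecutor


def plan_sources(layer_digests, peer_caches, registry="registry"):
    """Pick a source for each layer: the least-loaded peer that has it, else the registry."""
    plan, load = {}, {}
    for digest in layer_digests:
        holders = [peer for peer, cache in peer_caches.items() if digest in cache]
        source = min(holders, key=lambda p: load.get(p, 0)) if holders else registry
        load[source] = load.get(source, 0) + 1
        plan[digest] = source
    return plan


def pull_layers(plan, fetch, max_workers=4):
    """Fetch every layer concurrently instead of one at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {digest: pool.submit(fetch, source, digest)
                   for digest, source in plan.items()}
        return {digest: future.result() for digest, future in futures.items()}


# Hypothetical nodes and digests; fake_fetch stands in for a real blob download.
peers = {"node-a": {"sha256:aaa", "sha256:bbb"}, "node-b": {"sha256:bbb"}}
plan = plan_sources(["sha256:aaa", "sha256:bbb", "sha256:ccc"], peers)
fake_fetch = lambda source, digest: f"{digest} from {source}"
print(pull_layers(plan, fake_fetch))
```

In this toy run the first layer comes from node-a, the second from the less busy node-b, and only the layer nobody has cached falls back to the central registry.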
Brian Seguin
So it’s implementing sort of a peer-to-peer image registry thing on the retrieval side?
James Hunt
Yeah, but only on the retrieval.
Brian Seguin
Okay, so it just goes to the peers to retrieve the code that it needs to compile.
James Hunt
It’s actually retrieving the pre-compiled image layers. This is all just on the distribution and download side. They did some interesting benchmarks on their project that I encourage you to go look at on GitHub. In their README, they said that the 50th percentile was 10 seconds to download, which is insane, because even the Alpine image takes a while, and it’s only five meg. Images around the two-gig mark, which is what their benchmark was targeting, generally take hundreds of seconds to download, in my experience. That’s an order of magnitude, and it can be the difference between getting your scaling done in your Kubernetes cluster to handle your new load versus not.
Brian Seguin
I guess it makes sense to do something like this, especially for Uber, because when they have events and different load capacities, they need to scale up and scale down extremely quickly to meet the demand in a specific area. And latency could be the difference between picking up a ride or leaving 20 people abandoned somewhere, right?
James Hunt
And Uber has large clusters, you know, 10,000 nodes in a Kubernetes cluster. Once you get above a certain number of nodes, the chances that any given image is already resident on disk on one of those kubelets are small, unless you’re also running 10,000-instance app deployments, which they’re probably not, because at that scale you’re almost definitely doing microservices. So you’ve got hundreds, if not thousands, of different images running around with different tags and versions, and the chances of you not having to pull from the central registry are vanishingly small. With a P2P thing, they can say: hey, I’m scaling from 20 app instances to 30, and I now have 21 places to go get the image bits from, the 20 nodes that are already running the container plus the registry. So I don’t have to scale the registry. I just have to make the disk on the other nodes available. But again, Uber is a special case. There aren’t that many Ubers out there, there aren’t that many Facebooks, and you are not Google, I’m guessing, listener.
Brian Seguin
I am. Oh yeah, I’m not gonna Brian’s nice.
James Hunt
Most people don’t have those scale concerns. And when you do, it becomes a core competency to be able to deliver your software quickly. So that’s when we enter into build.
Brian Seguin
Image registry is an interesting one, because we’re not really recommending rent here. We’re recommending buy if you can, and build if you have extremely unique use cases like Uber.
James Hunt
Right. And to clarify a little bit on the buy: adding additional things into the mix I’m classifying as a buy. So when you’re talking security scanning, or compression, or block deduplication, those features can be added on to an image registry that you own.
Brian Seguin
Or like automations to roll out a CVE patching system.
James Hunt
Automatic Docker image rebuilding, you know, on a new base image. Once you have your own registry, you can start hooking your own automation up to it to do things like: find all of the images that were built off of this SHA, track down their Dockerfiles, and rebuild them, because we just patched the base image. This ultimately is the problem that Cloud Native Buildpacks are trying to solve, but they’re doing it by splicing two image sets together. Most people seem to be going the route of automation and rebuilding as necessary, because they’re already equipped to build images at scale. That’s what your CI/CD pipeline in a container world is doing: building an image, testing the image, and then, if it succeeds in all those tests, pushing that image up to a registry.
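The "find everything built off this SHA and rebuild it" automation amounts to walking a dependency graph. The Python sketch below assumes you keep an inventory that records, for each image in your registry, its own digest and the digest of the base it was built from; that inventory format, the function name, and the sample image names and digests are all hypothetical.

```python
from collections import deque


def rebuild_order(inventory, patched_base_digest):
    """List images to rebuild after a base image is patched, parents before children."""
    # Index the inventory by base digest so each lookup of dependents is cheap.
    children = {}
    for name, meta in inventory.items():
        children.setdefault(meta["base"], []).append(name)

    order, queue = [], deque([patched_base_digest])
    while queue:
        digest = queue.popleft()
        for name in sorted(children.get(digest, [])):
            order.append(name)
            # Anything built on top of this image needs rebuilding too.
            queue.append(inventory[name]["digest"])
    return order


# Hypothetical snapshot of a private registry's contents.
inventory = {
    "app-base": {"digest": "sha256:b1", "base": "sha256:a0"},
    "api":      {"digest": "sha256:c2", "base": "sha256:b1"},
    "web":      {"digest": "sha256:d3", "base": "sha256:b1"},
    "cron":     {"digest": "sha256:e4", "base": "sha256:zz"},
}
print(rebuild_order(inventory, "sha256:a0"))  # ['app-base', 'api', 'web']
```

The result is only a plan taken from a snapshot: each rebuild produces a new digest, so a CI/CD pipeline would feed the list into its normal build, test, and push steps and then refresh the inventory.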
Brian Seguin
And a lot of that automation can be crucial to managing your federated application deployments. Interesting. So there are more use cases here for buy and build, as James said. The image registry is an interesting one. Next episode, we will be talking about security scanning. Thank you for listening, and...
James Hunt
So say we all, right? That’s the Cylons, right? Or are we going Star Trek? We’re gonna get tricorders out and search for signs of vulnerabilities.
Brian Seguin
Search for signs of vulnerabilities.
James Hunt
Exactly. Whatever weird nerd thing we end up doing, join us next episode as we talk about image scanning, the security implications of building and running your own registry, and what you can do to better secure the rest of your containerized infrastructure.