精简docker的导出镜像
六六六一啊 2023-01-06 09:37:22 2023-01-06 3 0
Docker 镜像是由多个文件系统(只读层)叠加而成,每个层仅包含了前一层的差异部分。当我们启动一个容器的时候,Docker 会加载镜像层并在其上添加一个可写层。容器上所做的任何更改,譬如新建文件、更改文件、删除文件,都将记录与可写层上。当我们使docker save形式尝试导出镜像时会导出该镜像的所有文件层,当然这个行为是必要的,因为你不知道这个镜像的被导入环境是否已包含基础镜像的文件层。但是如果我们有一批镜像且都依赖某一个或两个基础镜像构建,且不具备批量 save的调价(必须一个一个分开了打包),这种情况下如果一个一个save的话对硬盘资源是极大的浪费,那么有没有办法去掉哪些重复的镜像层呢,答案是OK的。(也许有人会吐槽为啥不用docekrfile呢,是的一般情况下是ok的,但是用dockerfile build出来的镜像id时不一样的)
看下docker save 导出了写啥
以如下镜像为例子
FROM centos:7COPY main /home/mainRUN chmod +x /home/mainCMD /home/mai
构建
docker build -t ip-server:1.0.0 .
导出
[root@localhost demo]# docker save -o ip-server.tar ip-server:1.0.0[root@localhost demo]# du -sh ip-server.tar 212M ip-server.tar[root@localhost demo]# docker save -o centos7.tar centos:7[root@localhost demo]# du -sh centos7.tar 202M centos7.tar
可以看到基于centos7构建的ip-server镜像只是大了一丁,也就是我们COPY进来的一个可执行文件大小,那么在已知被导入环境存在centos7镜像的文件层是该如何减小ip-server.tar的体积呢?
拆解导出的镜像
[root@localhost demo]# mkdir ip-server && tar xf ip-server.tar -C ip-server[root@localhost demo]# tree ip-serverip-server├── 3efdc87cec68e28bccf6c0d96894c903e12157ed0797651a2eaa565108de5bd8│ ├── json│ ├── layer.tar -> ../fad57ccc4dd192e49d3979f477525a4b4c8fb8532ae31c2c74b5403474a26e4d/layer.tar│ └── VERSION├── d8f46057879e2e8614caa5511934d403d7ebea0af9b196bf29b68161fda76766│ ├── json│ ├── layer.tar│ └── VERSION├── f898e9c3d94a1617bac63c962155327671957f2bcbd35e6411153b7730d6558e.json├── fad57ccc4dd192e49d3979f477525a4b4c8fb8532ae31c2c74b5403474a26e4d│ ├── json│ ├── layer.tar│ └── VERSION├── manifest.json└── repositories3 directories, 12 files
# manifest.json[ { "Config": "f898e9c3d94a1617bac63c962155327671957f2bcbd35e6411153b7730d6558e.json", "RepoTags": [ "ip-server:1.0.0" ], "Layers": [ "d8f46057879e2e8614caa5511934d403d7ebea0af9b196bf29b68161fda76766/layer.tar", "fad57ccc4dd192e49d3979f477525a4b4c8fb8532ae31c2c74b5403474a26e4d/layer.tar", "3efdc87cec68e28bccf6c0d96894c903e12157ed0797651a2eaa565108de5bd8/layer.tar" ] }]
# f898e9c3d94a1617bac63c962155327671957f2bcbd35e6411153b7730d6558e.json{ "architecture": "amd64", "config": { "Hostname": "", "Domainname": "", "User": "", "AttachStdin": false, "AttachStdout": false, "AttachStderr": false, "Tty": false, "OpenStdin": false, "StdinOnce": false, "Env": [ "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" ], "Cmd": [ "/bin/sh", "-c", "/home/main" ], "Image": "sha256:04da435eb95c963effbf9aba060e6f2ddd962b4861ce1bf7bdc5f3cbffceb4b5", "Volumes": null, "WorkingDir": "", "Entrypoint": null, "OnBuild": null, "Labels": { "org.label-schema.build-date": "20201113", "org.label-schema.license": "GPLv2", "org.label-schema.name": "CentOS Base Image", "org.label-schema.schema-version": "1.0", "org.label-schema.vendor": "CentOS", "org.opencontainers.image.created": "2020-11-13 00:00:00+00:00", "org.opencontainers.image.licenses": "GPL-2.0-only", "org.opencontainers.image.title": "CentOS Base Image", "org.opencontainers.image.vendor": "CentOS" } }, "container": "5d3e8d3176babbe767294d6834bf1dc6e64ce9838bc08993751724a92231d6ac", "container_config": { "Hostname": "5d3e8d3176ba", "Domainname": "", "User": "", "AttachStdin": false, "AttachStdout": false, "AttachStderr": false, "Tty": false, "OpenStdin": false, "StdinOnce": false, "Env": [ "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" ], "Cmd": [ "/bin/sh", "-c", "#(nop) ", "CMD [\"/bin/sh\" \"-c\" \"/home/main\"]" ], "Image": "sha256:04da435eb95c963effbf9aba060e6f2ddd962b4861ce1bf7bdc5f3cbffceb4b5", "Volumes": null, "WorkingDir": "", "Entrypoint": null, "OnBuild": null, "Labels": { "org.label-schema.build-date": "20201113", "org.label-schema.license": "GPLv2", "org.label-schema.name": "CentOS Base Image", "org.label-schema.schema-version": "1.0", "org.label-schema.vendor": "CentOS", "org.opencontainers.image.created": "2020-11-13 00:00:00+00:00", "org.opencontainers.image.licenses": "GPL-2.0-only", "org.opencontainers.image.title": "CentOS Base Image", "org.opencontainers.image.vendor": "CentOS" } }, "created": "2022-10-26T15:53:47.898864929Z", "docker_version": "20.10.17", "history": [ { "created": "2021-09-15T18:20:23.417639551Z", "created_by": "/bin/sh -c #(nop) ADD file:b3ebbe8bd304723d43b7b44a6d990cd657b63d93d6a2a9293983a30bfc1dfa53 in / " }, { "created": "2021-09-15T18:20:23.819893035Z", "created_by": "/bin/sh -c #(nop) LABEL org.label-schema.schema-version=1.0 org.label-schema.name=CentOS Base Image org.label-schema.vendor=CentOS org.label-schema.license=GPLv2 org.label-schema.build-date=20201113 org.opencontainers.image.title=CentOS Base Image org.opencontainers.image.vendor=CentOS org.opencontainers.image.licenses=GPL-2.0-only org.opencontainers.image.created=2020-11-13 00:00:00+00:00", "empty_layer": true }, { "created": "2021-09-15T18:20:23.99863383Z", "created_by": "/bin/sh -c #(nop) CMD [\"/bin/bash\"]", "empty_layer": true }, { "created": "2022-10-26T15:53:46.693528022Z", "created_by": "/bin/sh -c #(nop) COPY file:d0c12b416e2bad24636a0f240cc09c4a6b6a0def701b5aaeeca4963507e468c4 in /home/main " }, { "created": "2022-10-26T15:53:47.752968676Z", "created_by": "/bin/sh -c chmod +x /home/main" }, { "created": "2022-10-26T15:53:47.898864929Z", "created_by": "/bin/sh -c #(nop) CMD [\"/bin/sh\" \"-c\" \"/home/main\"]", "empty_layer": true } ], "os": "linux", "rootfs": { "type": "layers", "diff_ids": [ "sha256:174f5685490326fc0a1c0f5570b8663732189b327007e47ff13d2ca59673db02", "sha256:185d24f4ae72bebf9b31c3d26486d52163e38e6c09167507a6a4f28d491aeb28", "sha256:185d24f4ae72bebf9b31c3d26486d52163e38e6c09167507a6a4f28d491aeb28" ] }}
参考docker-ce的相关导出模块的源代码(源代码解读就不做了,不算复杂)
docekr-ce/components/engin/savedocekr-ce/components/engin/load
可以找出打包出的tar包中放layer文件夹与diff_ids标记的文件层的对应关系。
1. 解压压缩包2. 读取 centos7 和 ip-server 的inspect3. 找出 centos7 和 ip-server 的diff_ids同的层数4. 按照相同的层数依次找到manifest.json中记录的layers文件目录,并把layer.tar的压缩包置空5. 重新打包
测试改处理的导出包在centos:7镜像已存在的环境中可以被正常导入操作。
代码实现
参考 github.com/zn-chen/dockerdiff
懒得琢磨也可以直接使用,在安装好go环境下git clone 下来 make && make install 后即可食用。