Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added improved logging #101

Merged
merged 16 commits into from
Jan 17, 2024
Merged

Conversation

gesarki
Copy link
Contributor

@gesarki gesarki commented Dec 28, 2023

What

Adds the following information in the logs for each image pull and mount

  • pull time for a particular image
  • mount time for a particular image
  • size of the image as a log
  • randomized request ID

Why

So we have better insight into the performance of the warm metal node plugin and to facilitate debugging.

Addition information

Here's what it looks like in action:
image

klog.Infof("mount request: %s", req.String())
u, _ := uuid.NewRandom()
valuesLogger := klog.LoggerWithValues(klog.NewKlogr(), "pod-name", req.VolumeContext["pod-name"], "namespace", req.VolumeContext["namespace"], "uid", req.VolumeContext["uid"], "request-id", u.String())
valuesLogger.Info(fmt.Sprintf("mount request: %s", req.String()))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

valuesLogger.Info(fmt.Sprintf(...)) is messy but klogr doesn't support infof().

Copy link
Contributor

@vadasambar vadasambar Jan 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to check klog.SetLogFilter() (you basically pass in a function which is called every time something is logged and add the sticky values there) or klog.SetLogger()

Another option is to use zap's SugarLogger instead of klog's logger but I think that might need more discussion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I followed the advice here for converting format strings to structured logs. Open to discussion though

@gesarki gesarki changed the title [DRAFT] Added improved logging Added improved logging Jan 4, 2024
@gesarki gesarki marked this pull request as ready for review January 4, 2024 14:44
@gesarki gesarki requested a review from a team as a code owner January 4, 2024 14:44
gesarki and others added 4 commits January 11, 2024 00:08
Co-authored-by: Mugdha Adhav <mugdha.adhav@gmail.com>
Co-authored-by: Mugdha Adhav <mugdha.adhav@gmail.com>
Co-authored-by: Mugdha Adhav <mugdha.adhav@gmail.com>
@gesarki
Copy link
Contributor Author

gesarki commented Jan 12, 2024

I'm still running into problems getting the correct image size:
I ran the ephemeral-volume.yaml job test and here is the log after pulling:

pullexecutor.go:92] "Finished pulling image" pod-name="" namespace="" uid="" request-id="1e6b20f0-52e6-431c-905b-f5836c1d5a2a" image="docker.io/warmmetal/csi-image-test:simple-fs" pull-duration="1.252591851s" image-size=46

Checking the image via the containerd cli shows it's 3.1 KiB:

$ sudo ctr images ls | grep -i "simple"
docker.io/warmmetal/csi-image-test:simple-fs                                                                                   application/vnd.docker.distribution.manifest.v2+json      sha256:6242a1b1a6de62107af0d2aa3cc80808e15fe1720e80e813ed25e99dc9631a27 3.1 KiB   linux/amd64                                                                                                                        io.cri-containerd.image=managed

And the dockerhub page for the simple-fs tag says 360 B:
image

I tried with another image docker/transfused and it's the same story: csi image says image-size=36

1 pullexecutor.go:92] "Finished pulling image" pod-name="" namespace="" uid="" request-id="7b2bd2ad-5148-4b34-83d0-5782ba6f4d71" image="docker.io/docker/transfused:latest" pull-duration="4.615070364s" image-size=36

ctr says 169.0 KiB

$ sudo ctr images ls | grep -i "transfused"
docker.io/docker/transfused:latest                                                                                             application/vnd.docker.distribution.manifest.v2+json      sha256:f5c10e2f6eec0207edc9f837efd64accb69c631ad99ac1310c4c388e9adcb60d 169.0 KiB linux/amd64                                                                                                                        io.cri-containerd.image=managed
docker.io/docker/transfused@sha256:f5c10e2f6eec0207edc9f837efd64accb69c631ad99ac1310c4c388e9adcb60d                            application/vnd.docker.distribution.manifest.v2+json      sha256:f5c10e2f6eec0207edc9f837efd64accb69c631ad99ac1310c4c388e9adcb60d 169.0 KiB linux/amd64                                                                                                                        io.cri-containerd.image=managed

and dockerhub says 166.96 KB compressed size
image

// Returns the size of the image that was pulled in MB(I think?) **TODO: check**
func (p puller) ImageSize(ctx context.Context) int {
imageSpec := &cri.ImageSpec{Image: p.image.String()}
return imageSpec.Size()
Copy link
Contributor

@kitt1987 kitt1987 Jan 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not correct to get the size of an image. After checking the implementation of the ImageSpec.Size, I found that the Puller.ImageSize will always return the length of the image reference, that is, p.image here.

The image size can be fetched through 'ImageServiceClient.ImageStatus'. The committed code can be used to get the extracted image size instead.

@gesarki
Copy link
Contributor Author

gesarki commented Jan 15, 2024

Updated the image size code and things good now:
Testing with redis:latest, here's the log message showing 49.03 MiB:

I0115 05:38:15.784715    9938 pullexecutor.go:131] "Finished pulling image" pod-name="test-pod" namespace="test-namespace" uid="test-uid" request-id="3ba02a14-95ad-4b28-bc9a-2b257c931015" image="docker.io/library/redis:latest" pull-duration="570.659305ms" uncompressed-image-size="49.03 MiB"

And here's what ctr shows (we checked the source code and found that despite what the comment says it actually retrieves the uncompressed size):

root@75b012314615:/code/cmd/plugin# ctr -n k8s.io image ls | grep -i redis
docker.io/library/redis:latest                                                                  application/vnd.oci.image.index.v1+json sha256:b5ddcd52d425a8e354696c022f392fe45fca928f68d6289e6bb4a709c3a74668 49.0 MiB linux/386,linux/amd64,linux/arm/v5,linux/arm/v7,linux/arm64/v8,linux/mips64le,linux/ppc64le,linux/s390x,unknown/unknown io.cri-containerd.image=managed

For nginx, here's the log message saying 67.25 MiB

I0115 05:49:41.818791   10345 pullexecutor.go:131] "Finished pulling image" pod-name="test-pod" namespace="test-namespace" uid="test-uid" request-id="fb2ed8f5-8b2b-4526-9cdd-5d7d81b8e867" image="docker.io/library/nginx:latest" pull-duration="23.723444588s" uncompressed-image-size="67.25 MiB"

Which matches what ctr shows

root@75b012314615:/code/cmd/plugin# ctr -n k8s.io image ls | grep -i nginx
docker.io/library/nginx:latest                                                                  application/vnd.oci.image.index.v1+json sha256:4c0fdaa8b6341bfdeca5f18f7837462c80cff90527ee35ef185571e1c327beac 67.3 MiB linux/386,linux/amd64,linux/arm/v5,linux/arm/v7,linux/arm64/v8,linux/mips64le,linux/ppc64le,linux/s390x,unknown/unknown io.cri-containerd.image=managed 

As well as the dockerhub page.

@mbtamuli
Copy link
Contributor

mbtamuli commented Jan 15, 2024

@gesarki You have the log message key reversed - uncompressed-image-size. It should be compressed-image-size.

Name Compressed(MiB) Uncompressed(MiB)
redis:latest ~40 ~150
nginx:latest ~60 ~190
# Source: https://gist.github.com/MichaelSimons/fb588539dcefd9b5fdf45ba04c302db6
$ dockersize() { docker manifest inspect -v $1 | jq -c 'if type == "array" then .[] else . end' | jq -r '[ ( .Descriptor.platform | [ .os, .architecture, .variant, ."os.version" ] | del(..|nulls) | join("/") ), ( [ .OCIManifest.layers[].size ] | add ) ] | join(" ")' | numfmt --to iec-i --format '%.2f' --field 2 | column -t | grep -v unknown}

# compressed
$ dockersize nginx:latest
linux/amd64      67.24Mi
linux/arm/v5     60.16Mi
linux/arm/v7     56.71Mi
linux/arm64/v8   64.09Mi
linux/386        65.49Mi
linux/mips64le   63.11Mi
linux/ppc64le    71.45Mi
linux/s390x      61.82Mi
$ dockersize redis:latest
linux/amd64      49.02Mi
linux/arm/v5     45.28Mi
linux/arm/v7     42.65Mi
linux/arm64/v8   48.05Mi
linux/386        49.09Mi
linux/mips64le   48.32Mi
linux/ppc64le    53.76Mi
linux/s390x      46.45Mi

# uncompressed
$ docker inspect -f "{{ .Size }}" nginx:latest | numfmt --to=iec-i
184Mi
$ docker inspect -f "{{ .Size }}" redis:latest | numfmt --to=iec-i
151Mi

@mbtamuli
Copy link
Contributor

@gesarki On the note of image size, see containerd/containerd#9261.

@@ -59,7 +60,9 @@ type NodeServer struct {
}

func (n NodeServer) NodePublishVolume(ctx context.Context, req *csi.NodePublishVolumeRequest) (resp *csi.NodePublishVolumeResponse, err error) {
klog.Infof("mount request: %s", req.String())
u, _ := uuid.NewRandom()
valuesLogger := klog.LoggerWithValues(klog.NewKlogr(), "pod-name", req.VolumeContext["pod-name"], "namespace", req.VolumeContext["namespace"], "uid", req.VolumeContext["uid"], "request-id", u.String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this information, the additional key-values, added to all logs from the moment a new request comes in, not make the logs noisy?

I suggest not adding all these key-values via LoggerWithValues, but rather logging just the request itself. An example in aws-ebs-csi-driver

Even if all the key-values are desired to be added to all logs, might I suggest increasing the V level - V(4), such that only on increasing the verbosity level do you see such verbose logs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extra key-values will only be added to logs called valuesLogger, i.e. valuesLogger.Info() or valuesLogger.Error(), etc. We're just using it for logs that we want to include those key-values. The reason is we want to be able to track individual requests as they come in, start pulling, finish pulling, start mounting, finish mounting. Including the random UIDs and context info help to be able to track requests especially as pulls can be asynchronous now.
Attached is a sample log from one async pull. You can see it's just a select few messages that are logged sample-log.txt

Co-authored-by: Mriyam Tamuli <mbtamuli@gmail.com>
@mbtamuli
Copy link
Contributor

@gesarki You'll need to rebase once.

@@ -160,3 +160,6 @@ See [test/sanity](/~https://github.com/warm-metal/csi-driver-image/tree/master/tes
### E2E test

See [test/e2e](/~https://github.com/warm-metal/csi-driver-image/tree/master/test/e2e).

## Note on logging image size
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Collaborator

@mugdha-adhav mugdha-adhav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes are looking good, and thanks for updating the documentation.

Before merging, I am also updating Makefile and Chart.yaml in the same PR. Created #119 for improving this process.

@mugdha-adhav mugdha-adhav merged commit c1787b4 into warm-metal:master Jan 17, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants