Add log file access information on first access by wswsmao · Pull Request #2205 · containerd/stargz-snapshotter (original) (raw)
Conversation
This PR introduces a file access log in debug mode.
By using a sync.Map to track accessed node IDs, it ensures that metadata for each file—including its path, size, mode, and layer digest—is logged only once when it is first opened or its symlink is read.
This feature is useful for analyzing container image usage patterns and identifying which files are actually being accessed during runtime.
| rootID uint32 |
|---|
| opaqueXattrs []string |
| passThrough passThroughConfig |
| accessedPaths sync.Map |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The size of this data continuously increase during runtime as long as the filesystem is mounted, right? If so, can this feature be optional? And what is the reason we avoid duplication here?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deduplication is essential to prevent log flooding and keep the debug output clear.
I've replaced sync.Map with a bitset to minimize memory usage (1 bit/file).
Comment on lines +127 to +131
| idx := int(n.id / 64) |
|---|
| bit := uint(n.id % 64) |
| fs.accessedMu.Lock() |
| if idx >= len(fs.accessedBits) { |
| newBits := make([]uint64, idx+1) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no guarantee that the id is assigned incrementally from zero and any number is allowed as long as it fits uint32. If the metadata pkg uses the maximum value (4294967295), this immediately allocates hundreds of MBs of the slice, which is still problematic for the memory efficiency.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have moved the deduplication flag into the node struct using atomic CAS, which eliminates large global allocations and allows the memory to be automatically reclaimed alongside the inode's lifecycle.
| return nil, 0, syscall.EIO |
|---|
| } |
| n.logAccessOnce(ctx) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you enable this function only when a config filed like log_file_access is set to true? Debug level logs are usually used for debugging containerd-stargz-grpc's internal behaviour (e.g. for fixing bugs) but the purpose of this log is different (i.e. for analyzing container image usage patterns) and can potentially increase the amount logs significantly.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we need to add a new parameter to newNode to pass this configuration.
However, the parameter list for newNode is already quite long. I'm considering refactoring it via #2223 first, and then implementing this change based on the refactored code.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you enable this function only when a config filed like
log_file_accessis set to true? Debug level logs are usually used for debugging containerd-stargz-grpc's internal behaviour (e.g. for fixing bugs) but the purpose of this log is different (i.e. for analyzing container image usage patterns) and can potentially increase the amount logs significantly.
done
| var _ = (fusefs.NodeReadlinker)((*node)(nil)) |
|---|
| func (n *node) Readlink(ctx context.Context) ([]byte, syscall.Errno) { |
| n.logAccessOnce(ctx) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Signed-off-by: abushwang abushwang@tencent.com
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})