Problem:
The client reported a recurring issue when running scripts inside a Docker container: the scripts consistently terminated with exit code 137. This code means the process was killed by SIGKILL (128 + 9), which most often points to an Out of Memory (OOM) kill but can have other causes. Restarting and reinstalling Docker did not resolve the problem.
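The arithmetic behind exit code 137 can be sketched in shell. This is a minimal illustration, not part of the original case; the function name `describe_exit_status` is hypothetical, and it assumes the usual shell convention that a status above 128 encodes 128 plus a signal number.

```shell
#!/bin/sh
# Hypothetical helper: interpret a container or process exit status.
# Assumes the common convention that status > 128 means "killed by
# signal (status - 128)". Signal 9 is SIGKILL.
describe_exit_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        signal=$((status - 128))
        echo "terminated by signal $signal"
    else
        echo "exited with status $status"
    fi
}

describe_exit_status 137
```

SIGKILL can come from the kernel OOM killer, but also from any other process with sufficient privileges. If the stopped container still exists, `docker inspect --format '{{.State.OOMKilled}}' <container>` reports whether Docker recorded an OOM kill, which helps separate the two cases.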
Process:
- Gathering System Information:
- The Docker version being used;
- System information, including the output of the `uname -a` and `free` commands, to understand the system’s configuration.
- Verifying Docker Functionality:
- To ensure Docker was functioning correctly, the expert instructed the client to run a simple test container: `docker run --rm -it busybox ls`.
- Understanding Docker Use Case:
- The expert sought to understand the specific tasks or applications the client was attempting to run within Docker;
- The expert inquired about any memory constraints in the Docker command or Docker Compose file.
- Checking cgroup Limits:
- The expert asked the client to check cgroup limits inside the container by running commands for both cgroup v1 and v2;
- The expert requested testing the container with the `--oom-kill-disable` option to assess memory consumption;
- The expert suggested running the container with explicit memory limits, such as `--memory 2G`, and investigating kernel logs for OOM records.
- Examining Logs and Security Tools:
- The logs provided by the client were reviewed, but they contained no evidence of OOM killer activity;
- The expert inquired about the presence of security tools, such as Carbon Black or CrowdStrike, which might interfere with processes;
- Upon further investigation, it was discovered that CrowdStrike was causing the termination of script processes.
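The cgroup and kernel-log checks described above could look roughly like the following. This is a sketch under stated assumptions, not the exact commands used in the case: the function names are hypothetical, the paths assume cgroups are mounted at the standard `/sys/fs/cgroup`, and the `grep` patterns only approximate the wording of kernel OOM messages.

```shell
#!/bin/sh
# Hypothetical helper: print the memory limit visible inside a container.
# Tries the cgroup v2 file first, then the cgroup v1 file.
# Optional argument: cgroup mount point (assumed default: /sys/fs/cgroup).
read_memory_limit() {
    root=${1:-/sys/fs/cgroup}
    if [ -r "$root/memory.max" ]; then
        echo "cgroup v2: $(cat "$root/memory.max")"
    elif [ -r "$root/memory/memory.limit_in_bytes" ]; then
        echo "cgroup v1: $(cat "$root/memory/memory.limit_in_bytes")"
    else
        echo "no memory limit file found under $root"
        return 1
    fi
}

# Hypothetical helper: reduce kernel log text on stdin to lines that look
# like OOM killer records. The patterns are an approximation; an empty
# result means no matching lines were found.
filter_oom_records() {
    grep -i -E 'out of memory|oom-kill|oom_reaper' || true
}

# Example usage (not executed here):
#   read_memory_limit                 # run inside the container
#   dmesg | filter_oom_records        # run on the host, may require root
```

An empty result from `filter_oom_records` is consistent with what was observed in this case: the process was killed, yet the kernel logged no OOM event, which pointed the investigation toward other sources of SIGKILL.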
Solution:
CrowdStrike was identified as the root cause of the script process termination. Stopping the CrowdStrike service resolved the problem.
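When a security agent is suspected, a quick first step is to check whether one is running at all. The sketch below filters a process listing by name; the function name `find_security_agents` is hypothetical, and the example patterns `falcon` and `cbagentd` are assumptions about typical agent process names that should be verified on the affected system.

```shell
#!/bin/sh
# Hypothetical helper: print lines from a process listing (stdin) whose
# text matches any of the name patterns given as arguments.
find_security_agents() {
    # Without patterns there is nothing to look for.
    [ "$#" -gt 0 ] || return 0
    # Join the arguments into one extended regex: "a|b|c".
    pattern=$(printf '%s|' "$@")
    pattern=${pattern%"|"}
    grep -i -E "$pattern" || true
}

# Example usage (not executed here; pattern names are assumptions):
#   ps -eo pid,comm | find_security_agents falcon cbagentd
```

A match only shows that an agent is present, not that it killed the process; in this case the link was confirmed by stopping the service and observing that the scripts ran to completion.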
Conclusion:
This case demonstrated the importance of a thorough investigation into system configurations, Docker functionality, and potential third-party software interference when troubleshooting Docker container issues. In this instance, the swift identification of the security tool responsible for the problem led to a successful resolution.