
Containers are popular because they are a cost-effective way to build, package, and promote an application or service, and all its dependencies, throughout its entire lifecycle and across different on-prem, cloud, or hybrid environments. However, major security risks emerge in downstream repositories and in the logging of ephemeral objects that naturally disappear.

Alan Orlikoski of Square shared his insights on how to mitigate some of these risks and conduct proper vulnerability management and incident response with regard to container environments.

Vulnerability Management:

Within data-center, on-prem, cloud, and hybrid environments, engineers using containers need not only to patch for updates, but also to understand where objects originate.

Most organizations rely on base images stored in a central repository such as Docker Hub, in Google or Amazon internal repos, or in an engineer’s own repos used for the CI/CD process. Once the security team has those base images, they need to be maintained and updated as part of a vulnerability management strategy, just like normal base servers, because they are what will be deployed during a refresh.

If a security team cannot trust the upstream source, security breaks down: the image an infrastructure engineer used is stored in a repo, and if that repo is taken over by a malicious actor, the organization ends up deploying whatever the attacker put there.

For example, if an attacker takes over a public image repo location such as a Docker Hub account, two malicious scenarios could play out: 

  1. The attacker uploads a new version, either by incrementing the version number and forcing its use, or by taking over a known tag with a brand-new image containing malicious code, which forces an upgrade to a known malicious version.
  2. The attacker replaces a known good image with a bad one under the same version number. Because the company has already vetted that version, it is not going to re-vet it.
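One way to guard against both scenarios is to record the content digest of an image at vetting time and refuse anything that does not match it, since a registry identifies image content by a SHA-256 digest of its manifest. The sketch below illustrates the idea only: the `VETTED` allowlist, the image name `example/app:1.4.2`, and the manifest bytes are hypothetical, and a real pipeline would fetch the manifest from the registry rather than use inline bytes.

```python
import hashlib
from typing import Dict


def sha256_digest(content: bytes) -> str:
    """Return a digest string in the 'sha256:<hex>' form used for image content."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def is_known_good(reference: str, manifest: bytes, vetted: Dict[str, str]) -> bool:
    """True only if this reference was vetted AND its content still matches."""
    expected = vetted.get(reference)
    return expected is not None and expected == sha256_digest(manifest)


# Hypothetical data: the manifest as it looked when the security team vetted it.
vetted_manifest = b'{"layers": ["app-v1"]}'
VETTED = {"example/app:1.4.2": sha256_digest(vetted_manifest)}

# Scenario 2: same tag, different content pushed by an attacker.
tampered_manifest = b'{"layers": ["app-v1", "backdoor"]}'

print(is_known_good("example/app:1.4.2", vetted_manifest, VETTED))
print(is_known_good("example/app:1.4.2", tampered_manifest, VETTED))
# Scenario 1: a new, never-vetted version number.
print(is_known_good("example/app:1.4.3", tampered_manifest, VETTED))
```

Only the first check passes. The replaced image fails because its digest changed even though the tag did not, and the new version fails because nobody has vetted it yet.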

Kubernetes and Docker Swarm are common orchestration tools for deploying applications in on-prem, cloud, data center, and hybrid environments. The challenge can be open source templates (e.g., Terraform, Helm) combined with infrastructure engineers who may have a mentality of “it just works.”

A developer who deploys a container version with default credentials and security keys may have little idea of the risk being introduced. Much more coordination needs to occur between security and infrastructure engineers to make sure those repos are secure and updates are applied as part of a vulnerability management strategy, just as they are for base servers.

It is also critical for engineers to build and maintain their own internal repository to establish “known good”. Taking a fully vetted image from Docker Hub and pushing it to an internal repo, rather than deploying from the public one (similar to deploying from a gold image server years ago), is an extra step, but a necessary one to ensure “known good”.

Likewise, it is critical to take advantage of Google’s and Amazon’s vulnerability management tools to ensure security within the containers themselves.
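A simple check that deployments really do pull from the internal repo can be sketched as follows. The registry host `registry.internal.example` and the image names are hypothetical. The host detection follows the Docker naming convention that the first path component is a registry host only if it contains a dot or a colon, or is `localhost`; anything else resolves to Docker Hub.

```python
from typing import List

# Hypothetical internal registry host; substitute your organization's own.
INTERNAL_REGISTRY = "registry.internal.example"


def registry_host(image: str) -> str:
    """Return the registry an image reference resolves to."""
    first, sep, _ = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    # No explicit host, so the reference points at Docker Hub.
    return "docker.io"


def external_images(images: List[str]) -> List[str]:
    """List every image that would NOT be pulled from the internal registry."""
    return [image for image in images if registry_host(image) != INTERNAL_REGISTRY]


deployment = [
    "registry.internal.example/base/nginx:1.25",
    "nginx:1.25",
    "someuser/tool:latest",
]
print(external_images(deployment))
```

Run against a deployment's image list, this flags the two references that bypass the internal repo, which a CI job could treat as a failure.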

Incident Response: 

Incident response has traditionally been defined by analyzing traffic from network sensors or performing forensics on physical machines. With an ephemeral object such as a container that lives for an hour or a day, the odds that the container still exists by the time an incident is investigated are low.

Security teams must plan for the container not being around, so it is critical to log all file actions and user actions of ephemeral objects. For instance, if something happened to one of these objects seven days ago, typically there would be no record to assist incident response and no physical disks to conduct forensics on.

If an organization has to log every process, file, user, and network action in the containers, the environment is going to produce a tremendous amount of logging. The security and infrastructure teams need to decide together how long logs are stored, when that storage will need to be accessed, and what the associated costs are.
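A back-of-the-envelope estimate can help frame that conversation. The sketch below multiplies out the retained volume for a given retention period; every input figure in the example call is a made-up assumption, not a measurement, and should be replaced with numbers from your own environment.

```python
SECONDS_PER_DAY = 86_400


def retained_log_gib(containers: int, events_per_second: float,
                     bytes_per_event: int, retention_days: int) -> float:
    """Estimate how many GiB of logs are on disk at steady state."""
    bytes_per_day = containers * events_per_second * bytes_per_event * SECONDS_PER_DAY
    return bytes_per_day * retention_days / 2**30


# Hypothetical fleet: 500 containers, 20 events/s each, 300 bytes per event, 90 days kept.
estimate = retained_log_gib(containers=500, events_per_second=20,
                            bytes_per_event=300, retention_days=90)
print(f"{estimate:,.0f} GiB retained")
```

Multiplying the result by a storage price per GiB gives a rough monthly cost, and rerunning it with different retention periods shows how quickly the duration decision dominates the bill.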

A flat log file cannot be searched in a timely fashion, and incident response based on it could take weeks. If the organization is instead ingesting logs into Athena, Splunk, or BigQuery, incident responders can answer questions in minutes, not weeks.
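Whichever platform is used, ingestion is easier when each container action is recorded as a structured event rather than free text. The sketch below writes events as one JSON object per line and answers a typical responder question ("what did this container do since a given time?"). The field names, container IDs, and the `events_for` helper are hypothetical, and the in-memory filter stands in for a query in the log platform.

```python
import json
from datetime import datetime, timedelta, timezone


def make_event(ts, container_id, user, action, target):
    """Serialize one container action as a single JSON line."""
    return json.dumps({
        "ts": ts.isoformat(),
        "container_id": container_id,
        "user": user,
        "action": action,
        "target": target,
    })


def events_for(lines, container_id, since):
    """Return parsed events for one container at or after 'since'."""
    matches = []
    for line in lines:
        event = json.loads(line)
        when = datetime.fromisoformat(event["ts"])
        if event["container_id"] == container_id and when >= since:
            matches.append(event)
    return matches


# Fixed timestamp so the example is repeatable.
now = datetime(2024, 1, 8, 12, 0, tzinfo=timezone.utc)
log_lines = [
    make_event(now - timedelta(days=7), "c-123", "root", "file_write", "/etc/passwd"),
    make_event(now - timedelta(days=7), "c-456", "app", "exec", "/bin/sh"),
    make_event(now - timedelta(days=30), "c-123", "app", "file_read", "/app/config"),
]

recent = events_for(log_lines, "c-123", now - timedelta(days=8))
print(recent)
```

Here the query returns the single write by `c-123` from seven days ago, even though the container itself would be long gone.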

With regard to threat intelligence, it’s important to pay attention to known compromised libraries, compromised publicly available Docker images, and attacks against cloud providers. Like all good threat intelligence teams, a team defending the new attack vector of containerized environments should seek specific knowledge of how attackers might approach its environment, so that it can act on that knowledge.