how containers have panned out – adrian trenaman – qcon

Posted on June 14, 2016 by Jeanne Boyarsky

For more QCon posts, see my live blog table of contents. Adrian is from Gilt.

History

No off-the-shelf software exists to run a flash-sale business, so Gilt had to build something custom.
Started with Ruby on Rails in 2007. Didn’t scale well enough.
Moved to Java in 2011
Moved to microservices in 2015
In a 30-day period, moved the bulk of Gilt to Amazon

Problems

Isolation problem – nobody should be able to take down someone else’s work
A noon outage in 2013 – what happened
Impedance mismatch problems. “Developers often think of machines as something that’s all theirs, magically provided by the hardware fairy.”

Machines for Gilt Japan

Run 20-40 containers per machine.
Load balancer between two racks of three boxes each.
Separate machines for the database and email.
From developer’s point of view, a machine is a machine.

What did Gilt Japan learn

Scalable by time of day
Solves impedance mismatch – developers see “a machine”
Limits damage one person can do
Infra/Devops engineer embedded into engineering team
Outstanding potential problems
- Static infrastructure
- Resource hogging

Docker topology

Dark canary – only for internal use
Canary – First prod install. Let it run for a while (ex through a noon cycle for Gilt)
Release – Once happy with canary, roll it out to other nodes
Gilt has a lot of read only traffic which limits damage you can do and reduces need for staging environment.
Gilt has one container per host/EC2 instance
Want to have as few moving parts/risk points in deployment process
“We could solve this now, or just wait six months and Amazon will provide a solution”
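
The dark canary, canary, release progression above can be sketched as a small staged rollout. This is only an illustration under stated assumptions, not Gilt's tooling: the stage names come from the talk, while `staged_rollout`, the `deploy` and `healthy` callbacks, and the node names are hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Tuple

# Stage names follow the talk; everything else here is a hypothetical stand-in.
STAGES = ("dark-canary", "canary", "release")


def staged_rollout(
    version: str,
    nodes_by_stage: Dict[str, List[str]],
    deploy: Callable[[str, str], None],
    healthy: Callable[[str], bool],
) -> Tuple[bool, List[str]]:
    """Deploy stage by stage, stopping after the first unhealthy stage.

    Returns (completed, nodes_deployed_so_far).
    """
    deployed: List[str] = []
    for stage in STAGES:
        stage_nodes = nodes_by_stage.get(stage, [])
        for node in stage_nodes:
            deploy(node, version)
            deployed.append(node)
        # In practice a canary would soak for a while (for Gilt, through a
        # noon cycle) before this check; here it is a single call per node.
        if not all(healthy(node) for node in stage_nodes):
            return False, deployed
    return True, deployed
```

The point of the structure is that a bad build stops at the smallest possible audience: later stages are never touched if an earlier one looks unhealthy.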

Projects

ION Roller
- Immutable deployment – Destroy original cluster when done with this process for Docker upgrades.
- Slow to set up/tear down environments.
- Can be expensive for continuous deployment
- Open source, but in house.
Nova
- Uses yaml to deploy
- No Docker registry. Base images are on Docker. Releases aren’t needed on there so go straight to Amazon
- Less boilerplate
- Immutable deployment on mutable infrastructure. Docker container is immutable.
Fighting bit rot, chaos-monkey style
- Don’t want things to run forever in Prod.
- What if there is a security vulnerability
- Every day, randomly kill an instance running the oldest AMI. This forces a relaunch on the latest AMI with fixes, and makes failures show up early.
- Doesn’t solve vulnerability in Docker container. Would need new release with new base image for that. Hasn’t happened to Gilt yet.
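
A minimal sketch of the daily selection step, reading "oldest" as "earliest launch time". The `Instance` record, the `keep_at_least` safety floor, and the function name are assumptions made for illustration; no real cloud API is called, and the actual termination step is left out.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Instance:
    # Hypothetical record; a real version would be filled from the cloud API.
    instance_id: str
    launched_at: datetime


def pick_instance_to_retire(
    instances: List[Instance], keep_at_least: int = 1
) -> Optional[Instance]:
    """Return the longest-running instance, or None if the fleet is too small."""
    if len(instances) <= keep_at_least:
        return None  # never shrink the fleet below the safety floor
    return min(instances, key=lambda i: i.launched_at)
```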
Sundial
- For running batch jobs
- Automatically reschedules a job if it fails
- Define a process – group of tasks with dependencies between them
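
The idea of a process as a group of tasks with dependencies, retried on failure, can be sketched with the standard library's `graphlib` (Python 3.9+). This is not Sundial's API: `run_process`, `run_task`, and `max_attempts` are hypothetical names, and the retry here is an immediate re-run rather than real rescheduling.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


def run_process(
    dependencies: Dict[str, Set[str]],
    run_task: Callable[[str], bool],
    max_attempts: int = 3,
) -> List[str]:
    """Run tasks in dependency order, retrying each failed task.

    `dependencies` maps a task name to the tasks that must finish first.
    Returns the tasks that completed, in execution order. Conservatively
    stops the whole process at the first task that still fails after
    `max_attempts` tries, so no dependent task ever runs on bad input.
    """
    completed: List[str] = []
    for task in TopologicalSorter(dependencies).static_order():
        for _attempt in range(max_attempts):
            if run_task(task):
                completed.append(task)
                break
        else:
            # The task never succeeded within the allowed attempts.
            return completed
    return completed
```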

EC2

Less configuration
Automatic rollout
Integrations
IAM roles are at instance level, not container level

Using Docker as a local build platform

Different projects use different versions of build tools
Docker can be used as a versioned build container.
A year from now, you will still have everything needed to run the code
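
One way to picture a versioned build container is a helper that assembles a `docker run` command around a pinned image tag. This is a sketch, not the setup shown in the talk: the function name, the `/workspace` mount point, and the image name in the usage note are made up for illustration. The `--rm`, `-v`, and `-w` flags are standard `docker run` options.

```python
import shlex
from typing import List


def docker_build_command(image: str, project_dir: str, build_cmd: str) -> List[str]:
    """Build the argv for running a build inside a pinned tool image."""
    return [
        "docker", "run", "--rm",            # remove the container when the build ends
        "-v", f"{project_dir}:/workspace",  # mount the source tree into the container
        "-w", "/workspace",                 # run the build from the mounted directory
        image,                              # pinned tag = pinned build tool versions
        *shlex.split(build_cmd),
    ]
```

For example, `docker_build_command("example/build-tools:1.2.3", "/home/me/project", "make test")` yields a list that can be passed to `subprocess.run(cmd, check=True)`. Because the tag is pinned, the same tool versions are used no matter what is installed on the developer's machine.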

Lessons

Containers let you separate what you deploy from how/where you deploy it
Still the wild west on how containers are deployed
Seek immutability in the container, not in the stack
The competitive advantage for Gilt is to be able to deploy quickly/frequently/safely to production and therefore can innovate faster. Gilt lets engineers deploy whenever they want without asking permission.

unprivileged containers – jessie frazelle – qcon

Posted on June 14, 2016 by Jeanne Boyarsky

For more QCon posts, see my live blog table of contents.

Today

Docker typically runs as a privileged user.
Containers are meant to limit the damage from a compromise. (The world an attacker can see inside the container is a limited one.)
Want unprivileged containers so you don’t need sudo/privileged access to launch a container in the first place.

Chrome sandbox on Linux

uses seccomp, namespaces, AppArmor.
doesn’t need to be run as root.
each tab is in its own namespace – process only knows about itself
if Chrome can do this, why not Docker

General notes

cgroups (control groups) limit which resources a process can use and how much.
Each time you docker build something it spawns a new container. Just blocking things wholesale would cause issues here.
I had trouble following what was current/future in the examples.

Future

Won’t need to run as root
Can customize sandboxes from defaults, better UX for dealing with security policies.
“postgres should maintain a postgres profile”

Impression: A lot of this was recorded demos (typed commands shown as a graphic/video that plays out). For the namespaces, it was helpful seeing the examples. For the Docker part, some of it went over my head. I only know a little Docker, and my Linux system administration isn’t strong enough to understand the implications of everything she brought up either. I still got something out of it though, and learned things that would be interesting to read more about.

the bad things happen when you’re not looking – ryan huber – qcon

Posted on June 14, 2016 by Jeanne Boyarsky

See the live blog table of contents. Gist is posted at https://goo.gl/ZAxCnH (github login required)

Ryan was the first security employee at Slack. He is doing an experiment where red slides mean don’t take pictures or tweet about the slide. I really like that idea. It makes speaker intent clear.

How you find out about a problem

Don’t want to find out from Brian Krebs that you’ve been breached
Don’t want hackers to tell you something strange is going on. They are done at that point and are showing off
Even worse – don’t notice

General Notes

Time to detect is important metric
Credential theft is the biggest (or one of the biggest) risks
Goal – watch as many things as possible, but don’t be a dashboard. Want as little as possible on the dashboard. If it is mostly empty, things will get noticed when they are there.
Bad model – NetCool – trains people to acknowledge every alert; the habit means they miss the ones that matter
The defender’s advantage – if the attackers don’t know what you are looking for (your trip wires), they don’t know what to avoid
“Zero days are not invisibility cloaks” – other boxes can pick up on it
The hypothetical malicious insider – a former security team member has a lot of knowledge, and an insider with credentials has access
Don’t overwhelm users. Confirm bulk actions in bulk not one at a time.
Canaries – need to validate monitoring, recording, etc.
Do table top red team exercises if not doing real ones.

Slack Security

Set up a reliable logging platform
- RELP (Reliable Event Logging Protocol)
- streamstash/logstash -> Elasticsearch (Splunk is superior but costs more)
- Two weeks of data is about 2 terabytes of logged data. Almost never sits on disk
auditd – part of Linux. Run auditctl commands and kernel looks for matching events.
audisp – works with auditd to transform data
osquery – Facebook project for system monitoring using SQL
ElastAlert – Yelp project to pick up on Elasticsearch events. Runs queries on a timer against Elasticsearch.
AlertCenter – have SecurityBot looking at alerts. SecurityBot posts to Slack asking the user to type “acknowledge” on their phone to confirm the action. That way, you know they have the phone and not just the Slack account. If there is no reply in X hours, it goes to PagerDuty. Automated triage avoids a flood of data. Instead of the security team looking at all alerts, the whole company is helping. This means the security team responds to fewer than 5 alerts a day.
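
The AlertCenter flow described above (ask the user, close on acknowledgement, escalate after X hours) can be sketched as a small decision function. This is an assumption-laden illustration, not Slack's implementation: the `Alert` fields, the `triage` function, and the returned state names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Alert:
    # Hypothetical shape of an alert sent to a user by the bot.
    user: str
    summary: str
    raised_at: datetime
    acknowledged_at: Optional[datetime] = None


def triage(alert: Alert, now: datetime, timeout: timedelta) -> str:
    """Decide the next step for an alert.

    Returns "closed" if the user confirmed the action, "waiting" while the
    user still has time to reply, and "escalate" once the timeout passes.
    """
    if alert.acknowledged_at is not None:
        return "closed"
    if now - alert.raised_at >= timeout:
        return "escalate"  # hand off to the on-call pager
    return "waiting"
```

Only the "escalate" branch reaches the security team, which is how the bulk of alerts gets resolved by the people who caused them.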

Rules

Listeners – specific events
Time awake – nobody is awake for 24 hours. Trigger an alert when this happens
GeoIP – Doesn’t work perfectly. T-Mobile has a feature that lets you travel abroad without paying roaming. It works by routing some traffic through Texas, so your location keeps jumping between Texas and abroad
IPs – fewer unique IPs than you’d think. Worth looking at when a user comes from a new IP.
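
Two of the rules above, time awake and new IPs, can be sketched in a few lines. These are illustrative guesses at how such checks might look, not the rules Slack runs: the function names and the `min_sleep` gap are assumptions. A caller would alert when the returned stretch reaches 24 hours.

```python
from datetime import datetime, timedelta
from typing import List, Set


def longest_awake_stretch(events: List[datetime], min_sleep: timedelta) -> timedelta:
    """Longest run of activity that contains no gap of at least `min_sleep`."""
    longest = timedelta(0)
    start = prev = None
    for ts in sorted(events):
        if prev is None or ts - prev >= min_sleep:
            start = ts  # a long enough gap counts as sleep; start a new run
        longest = max(longest, ts - start)
        prev = ts
    return longest


def is_new_ip(ip: str, seen: Set[str]) -> bool:
    """True the first time an IP is seen for a user; records it in `seen`."""
    if ip in seen:
        return False
    seen.add(ip)
    return True
```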

Down Home Country Coding With Scott Selikoff and Jeanne Boyarsky

Java/J2EE Software Development and Technology Discussion Blog
