Intro: SPIFFE - Andrew Jessup & Daniel Feldman, Scytale

All right, let's kick off. Hi everybody. My name's Andrew, and with me in the corner here is my colleague Dan; let's say hi to Dan, everyone. Dan and I have about half an hour with you today. When we were putting this talk together we had a lot to tell you, and we spent most of our time actually cutting material out. So we're going to talk a little bit about the SPIFFE project, a little bit about its history and why it exists, and in particular we're going to focus on how you as a developer should think about SPIFFE and how you can interact with it. We left a lot on the cutting room floor, so if we did our jobs right there will possibly be more questions than answers at the end of this.

The good news is that if you have questions about SPIFFE, you're in the right place. First of all, we have lots of SPIFFE talks at KubeCon this year, I think seven or more, including several today and a couple more tomorrow. On top of that, many folks who work on the project, both at Scytale, our company, and elsewhere, are around the conference. Some of us are wearing these shirts, so come and ask us; we'd love to talk to you more about the project.

With that I'm going to kick off, and I'm also going to talk very quickly because we've only got half an hour. If you're watching this on YouTube later and you have the speed dial, you can always turn this down to half speed; it may help.

I'm going to start with just a little bit of history behind the project before we talk about what it is. The main point I want you to take away is that SPIFFE is not a new idea. When we traced the archaeology of the project, we were able to trace it as far back as Plan 9, a project out of Bell Labs, around the beginning of the 2000s, and in particular a component of Plan 9 called Factotum. That in turn found its way into Google's internal infrastructure around 2005 in the
form of a system called LOAS, which is now ubiquitous across the company and has been around for more than 10 years. In 2016 a gentleman by the name of Joe Beda, who some of you may have heard of, and who previously started Kubernetes by taking the ideas in Borg, another internal Google system, and making them available and accessible in open source, was looking around for other systems at Google for which you could do the same thing. He came across the LOAS system, saw some of the ideas in there, and realized that they would be hugely valuable for many organizations other than Google. So he proposed a project called SPIFFE and proposed a white paper at GlueCon, and since then a few folks, including myself and Dan, got together to try to make SPIFFE a reality. We launched v1 at KubeCon, this same event, last year in Austin in 2017, and earlier this year the SPIFFE project and its companion project SPIRE were both accepted into the CNCF, so we're really excited to be here.

Since then we've had a huge amount of interest and a huge amount of growth. The project itself has grown substantially. We've kept building on our foundations, improving the project, responding to bugs, and improving the reach of the project. But most importantly, we've also brought more and more contributors, organizations, and users into the project, and every time that's happened we've learned more about how this thing can be applied, the problems it can be applied to, and how it can work, as well as the challenges that come with different organizations and different levels of scale. So we're really excited to see that growth and that adoption, and to see that the problems aren't limited to a small number of companies but are actually broad-based. We're particularly grateful for the enthusiastic community that's come together to help us solve that problem; it's very much become a community effort. So
you might be wondering, given this is a SPIFFE intro talk, what SPIFFE actually does. In fact, quick show of hands: who here has actually heard of SPIFFE before? Excellent. Who could describe what it does or what it is? Excellent. Okay, everyone put their hand up before, and three people put their hand up afterwards. Let's see if we can fix that.

What SPIFFE does, fundamentally, is deliver trusted identities to software systems. What is a trusted identity, and why do I care? When a workload is receiving a communication from another workload in a distributed system, a trusted identity allows it to make a couple of important assertions. Firstly, a trusted identity allows one system to know, when it's talking to another system, that it's actually speaking to the system it thinks it's speaking to. The second is that the message coming from that system hasn't been tampered with by any third party. In other words, you can think of trusted identities as being the foundation of a robust software security practice. As we move more and more into this world of microservices and multi-cloud, in other words very distributed, heterogeneous systems, this idea of trust becomes really, really important.

It's worth spending a moment as well to talk about what SPIFFE is not, or rather what SPIFFE complements. When we talk about trust and identity, people ask about things like authentication and authorization, and what role SPIFFE plays in that. Is SPIFFE authentication and authorization? The answer is sort of, but not really. The way we think about SPIFFE, and for that matter identity in general for workloads, is that it's a foundation for authentication and a foundation for authorization. If I have a protocol like TLS, for example, it includes an authorization and authentication handshake as a part of that protocol.
What SPIFFE will do is provide these TLS libraries with the identities they need in order to perform that handshake, but it does not implement, say, TLS. Likewise, if I'm sending HTTP messages and I've built an authentication protocol around a JWT token, then SPIFFE can provide the JWT token, the material for the JWT token, and the ability to verify that JWT token, but it does not provide the HTTP itself. It does not provide the authentication itself; it supports it. This is very much by design: it allows for a lot of flexibility in terms of implementation and usage, and it was very much a core principle of the project. We have some other talks a little bit later that dive into these ideas, concepts, and integrations in more detail, and Dan's going to talk a little bit about it as well.

But I want to take a step back now and talk about the vision for the project, why it exists, and the core problems that have driven it. As I alluded to before, the core problem really is that modern software and modern solutions are becoming increasingly heterogeneous and complex. We're here at KubeCon to talk about projects like Kubernetes; we're also here to talk about technologies like cloud infrastructure; many of you, I'm sure, are using platform as a service. When I'm a single application, or maybe an application team, I may be using one of these things. But when I up-level as an organization and look at all of the different systems I need to deliver a solution, I'm usually using many of these different things at once. And as I want to build solutions that serve the problems of, say, a customer or a user, more often than not the communication patterns will need to span all of these different things. They'll need to span on-prem and cloud; they'll need to span between a cloud and a PaaS system, or a cloud and an orchestrator. So we have a lot of complexity now that we didn't really have before, because we didn't have
cloud, and we didn't have all of these different middleware systems. We also didn't have this microservice thing. So in this heterogeneous world, how do we solve for this? In particular, how do we provide identity to our workloads so they can communicate with each other?

Well, the default way, the way we've done it for a long time and which worked pretty well in static deployments, was to use the network, to use the IP address: my IP address identifies my workload. The problem is that IP itself doesn't provide any security, so I need things like firewalls and VPNs and VPCs and other network perimeters to guarantee what can actually be delivered to a particular IP address and what can't. Again, in static environments that can work reasonably well. But when I get to dynamically scheduled or elastically scaled environments, and for that matter when I have to think about the overlay of, say, Kubernetes' networking topology with my Amazon security group and with the VPC I'm using to connect to on-prem, it becomes very difficult to reason about and rationalize these, and very difficult to dynamically update all of my firewalls and access rules and routers and other pieces to accommodate this. So we see a lot of brittleness and a lot of failure modes around using the network as my sole mechanism to establish identity and trust between software systems.

The other way, which I suspect most folks in this room are really familiar with, is using some kind of shared secret. A shared secret is, say, a database password or an API access secret: some kind of token that the destination workload generates and that then gets passed to the source workload. It's pretty simple and it works really well. If you've used, I don't know, MySQL with a password, then you've used this system before, or if you've tried to talk to an Amazon API, you've used this before. It's easy to distribute, which is nice, but a lot of challenges come along
with this. How do I generate these secrets in the first place? How do I generate a unique secret for each source workload that needs to connect to me? How do I know which source workload is assigned which secret? Have I rotated these secrets? How do I distribute them, and in particular how do I distribute them without getting them embedded in my CI/CD pipeline or my configuration management system? Messing this up is the cause of an enormous number of breaches today, and if you ever want a small show of horror, go and search GitHub, or just search Google, for Amazon keys, and you'll see thousands, hundreds of thousands of these things listed publicly. Shared secrets work, but they're very hard to secure and do well.

And then the third way, a little more recent, that we've seen people reason about identity is to actually just ask my platform. If I'm running on Amazon, for example, then Amazon can give me an identity, say an Amazon IAM identity. If I'm running in Kubernetes, I can ask for my service account, and other systems have equivalents of that. There's some level of workload identity or application identity that they need to support, and not only that, but they can also allow different workloads to verify the identity of another workload running in that same system. It's actually a really nice model. It's very easy for an application to hook into: it simply asks, say, Kubernetes, "hey, give me my service account," and it now knows its identity and can use it. It's also nice in that I don't have to worry about key distribution or rotation or management in the way that I did with shared secrets. But it has one glaring weakness, and that weakness is that every single system needs to be running on the same platform. If everything is running in Amazon, and Amazon knows about that specific application and workload, then I can use Amazon to
do this. If everything is running in the same Kubernetes cluster, then I can use Kubernetes service accounts, for example, to do this. But as soon as I get any degree of heterogeneity, this model again falls down, because it becomes very hard to unify these pieces together.

So what's the ideal vision we would have here, if we wanted to support trust and authentication between systems in this heterogeneous environment? It would probably look something a little bit like this. I would have a mechanism that, as a workload, I could call into to get my identity; that's the green box on these diagrams. My workloads, by the way, are the green circles. So I would have some way for my workloads to retrieve their identity from the platform directly when they need it. I would then also have a way for workloads running in two different systems to be able to verify each other: they would have some common standard document that they can use to prove their identity to each other and verify it. And then in the background there would be some way for these different platforms to establish trust with each other, so that the documents that were exchanged could be verified.

So this is the cornerstone of the SPIFFE project: to define a set of standards, documents, and APIs to allow this world to exist. My colleague Dan's going to break this down in a bit more detail, but I'm going to give you the bumper-sticker version of each of these three. The first is these documents, which we call the SVID, short for the SPIFFE Verifiable Identity Document. We support a few different forms of these documents, but you can kind of think of them as being like a driver's license for your software system, one that's been issued by an authority which comes from a particular platform. The next piece we have is a thing called the Workload API, which is a standard, vendor-neutral API that a workload can call
into to retrieve its document. It means it can do it in a way that's automatic, in a way that you can embed into a library, and that will work consistently in the same way whether you're running in cloud, on-prem, or in a container, and on staging versus production versus development. And then finally we have this thing called federation, which is a fairly new part of SPIFFE, and which we're going to talk about in a little bit more detail later in the week. What federation allows you to do is to trust different authorities. If you think of your SVID as a driver's license issued by a particular state, then what federation allows is for different states to trust each other, such that I can take a driver's license issued in Washington and use it in, say, Arizona.

So by creating these APIs, and creating this standard and contract, we've created an interface between issuers of SPIFFE identity documents, issuers of these driver's licenses if you will, and consumers who can use them to actually do meaningful work like authentication and authorization. And although we're a fairly young project, we've already seen the community galvanize around this. On the issuer side, that is, software that you can run that actually issues these driver's licenses, we have three projects already that are adopting this. The first is SPIRE. This is the project that we've worked on a lot at Scytale, and which is now also part of the CNCF, that implements these standards in a wide range of environments: different cloud providers, different platforms, different orchestrators. I would love to talk about that more if we had more time, but we do have more talks on this, and we also have plenty of material online that talks about it as well. We have other projects too, though: both HashiCorp Consul and Citadel now support parts of the specification, and we're working with
those communities to hopefully get them to fuller implementations in the coming months. On the consumer side, we have projects now that are starting to adopt this as well, whether it's secret stores that want to use SPIFFE IDs to solve the credential bootstrap problem, or proxies that want to be able to use this to automatically provide TLS and authentication. Dan's going to talk about that in more detail as well. And with that I'm going to hand over to Dan, who is an actual developer and can give you some actual detail on how all of this works.

Can everyone hear me? Is my mic working? Okay, cool. So what I'm going to be talking about is SVIDs, the Workload API, and Federation, which are three of the key parts of the SPIFFE specification, and then the next section is going to be how you actually use SPIFFE in a project.

But first I'm going to take a step back and talk about what makes a good identity document and what makes a bad identity document. On the left I've got a not very good identity document, a name tag, and on the right I've got a driver's license, which is the identity document we all know and trust, more or less. So what makes an identity document good? Well, it has to be unique for whatever you're trying to authenticate. We're trying to authenticate services, so you need one identity per service. You might think that's obvious, but if you're using IP addresses as your identity, an IP address can clearly have multiple services, either at the same time, if you're running a server with a whole bunch of stuff running on it, or at different points in time: obviously you spin down pods, you spin up pods, and the IP addresses get recycled. Identity documents are also more or less static. I mean, they can be renewed, but what I mean by that is the identity that's encoded in the document stays the same over time, because otherwise it's really hard to develop authorization policies if your identities are changing
all the time. And verifiable. Again, on the left I've got this name tag that doesn't prove anything about me; I can write whatever I want on there. On the right I've got a driver's license that has all kinds of anti-counterfeiting measures and holograms and a barcode on the back; you can query a database and verify that it's a real driver's license. So any kind of identity document needs to be verifiable. We're all familiar with X.509 certificates used in PKI, and you can verify that an X.509 certificate is a real X.509 certificate: you can check the signature on that document and make sure that it's correct.

But actually the most important part of SPIFFE is not those three things, it's the last thing: that there's a trusted authority that's attesting to the identity. That trusted authority is your SPIFFE implementation, which is somehow going out to your platform, your infrastructure, and verifying that the workload is what it says it is. With a driver's license, that's the DMV: it checks my birth certificate and gives me a driver's license, and everyone more or less trusts the DMV. In a SPIFFE implementation, that's the SPIFFE implementation: I ask for an SVID, it goes out and talks to my infrastructure, similar to checking a birth certificate, and then it gives me an SVID back. That SVID is what we call a SPIFFE Verifiable Identity Document.

So what goes into an SVID? Well, there's the SPIFFE ID. It looks like a URL: it starts with spiffe://, and then the green part is what we call a trust domain. That's really an identifier for your organization, say acme.com. It doesn't even have to be resolvable in DNS; none of this depends on DNS, because we don't trust DNS to be secure. Then after the green part we have the blue part, which is your workload identifier. Again, this doesn't have to be a resolvable URL; it just has to be unique and consistent within your organization. And then how do we encode that thing? Well, we put it in, like I mentioned, an X.509
certificate. In most cases it will just be encoded in an X.509 certificate, because that's supported by the vast majority of software and everyone kind of understands how it works. It's an X.509 certificate with certain very specific fields in it that identify this workload, and we've actually restricted what other fields can be used, just to avoid potential bugs. We also have a format for encoding your SPIFFE ID in a JWT token. Raise your hand if you know about JWT tokens. Oh wow, most people. Okay, cool. These are used all over the place in web development. There are certain cases where X.509 won't work for you but JWT will. They're going to be pretty rare, at least right now; in the long run we think that will become more important.

Okay. In order to use SPIFFE, then, I need my SPIFFE Verifiable Identity Document and this other thing called a trust bundle, which we'll get into in a minute. How do I get that SPIFFE Verifiable Identity Document? And again, it has the SPIFFE ID encoded in it. Well, it has to come from the SPIFFE implementation. This is not a self-signed certificate; I have to ask the implementation to give me that identity document. And I need the trust bundle, which has the keys that are used to verify that other identity documents are correct. So it has the public keys, similar to a certificate authority file in your web browser. So I use this Workload API: I am a workload, I call out using gRPC to the Workload API, and I get back that bundle and the identity document for myself. It's all up to whatever SPIFFE implementation you choose to use to figure out how to get you that information. SPIRE is the reference implementation of SPIFFE; it stands for the SPIFFE Runtime Environment. That's the one that we are working on, that's the one that's a CNCF project, and it has a number of different plugins for talking to different types of infrastructure and getting that information for you. And then there will be other
implementations in the future, and they will get that information in different ways.

Now the last thing that Andrew mentioned is Federation, and all I really want to say about Federation is that we're working on it actively. We're developing this protocol for two different SPIFFE implementations to talk to each other. So if you are using SPIRE and you're also using someone else's SPIFFE implementation, we want those to interoperate completely: we want one SPIFFE implementation to be able to verify the identity documents from another implementation, but we don't want them to be too tightly coupled. It's like Andrew said: if two states trust each other, they trust each other's driver's licenses, or if two countries trust each other, they trust each other's passports. If you configure two SPIFFE implementations to trust each other, we want their identity documents to be interchangeable, and then you can develop authorization policies that allow a workload from the first trust domain to talk to the second trust domain, and vice versa.

Okay, so how do I actually use SPIFFE in my application? Well, I've got five different ways here, and I'm going to go through them one by one, but it's just a very, very brief overview of each one.

The number one way, the most obvious way, is that you can just call that Workload API. If you're comfortable with gRPC APIs, it's a really simple gRPC API. You obviously have to have protobuf and gRPC available in whatever programming language you're using in your environment, but you just call the API, you get back that SVID and the trust bundle, and then you have to use them in some way. It's totally up to you how you use them, but when you're initiating connections we'd expect you to use that SVID, perhaps as the client cert in starting an mTLS connection, and then we'd expect you to use the trust bundle to verify certificates. In that situation it's more or less up to you.

Or you can use a library. We have, or are developing,
libraries for most of the programming languages; C, Go, and Java are on there, and you can find these very easily on GitHub. The Java one is particularly interesting to me. I didn't work on it, but it's really cool, because Java is so modular that you can actually use the SPIFFE library in a lot of situations without changing your code at all: you just change the JAR files that are imported and change some Java XML configuration for J2EE. This is an example where you're talking to a database, and you just specify that the socket factory is a SPIFFE socket factory instead of a regular socket factory, and suddenly it starts using the Java library. We as DevOps people could then just implement that without having to talk to a development team that might be slower, or in cases where we don't have the ability to change that code.

And then, using a proxy. This is probably going to be the most common approach, at least for the near future; most of us are using proxies already in front of our services in some way. We support Ghostunnel, NGINX, and Envoy, and they're all supported in slightly different ways. For Envoy, it's supported using the Envoy APIs; for NGINX, we actually had to change the NGINX code to be able to support SPIFFE. If you do this, you need to do a bunch of configuration in your proxy, but the cool part is that because the proxy is doing all the communication, it handles any authentication and authorization you need, and your code doesn't have to change at all.

SPIFFE Helper: this is just a simple way to be able to use SPIFFE relatively easily. SPIFFE Helper is a small wrapper program, and it's really simple. All it does is talk to the Workload API, get the SVID, get the trust bundle, get everything you need, and write it to disk every time that information changes. Then your program just has to read the files on disk. This is really nice if you're not familiar with gRPC APIs and you don't use a proxy, but you want to be able to
use SPIFFE.

And then the last option: if you're using a service mesh that supports SPIFFE fully, which none of them do yet, but we're working on it, then that service mesh will do all of this. But we're not there yet.

So what's next, if you're interested in this? Well, we've got a number of other talks at this conference. If you just go to the schedule and Command-F for SPIFFE, you'll see that there are quite a few, but there's one at 1:45 today, one at 3:40 today, and one at 1:45 tomorrow. And if you can't make it to any of those, or if you're watching on YouTube, there's a Slack. It's a very active Slack with discussion about SPIFFE, and that's really the best way to get in touch with us, ask questions, and learn more about SPIFFE. Of course we're on GitHub, of course we have a website, and we also have a blog; we're going to have a post on Federation, which I was just working on a few minutes ago, that's going to be posted today. And with that, I think we have just a few minutes left.

Oh, we've almost ended on time. So, questions: for each question, if you don't mind raising your hand, I'll repeat the question and then [inaudible]. [Applause]

[inaudible audience question] Right, so it's a really good question: if I'm calling the Workload API, how do I identify myself to the Workload API in the first place? This is actually the core problem SPIFFE is trying to solve for. So I'll explain it first through the semantics of the API. When I make a call, I'm making an unauthenticated [inaudible]; I just call it. [inaudible] So how do I know who is calling this API? On UNIX at least, the way this API is exposed is as a UNIX domain socket, and it's exposed by a process running on the same machine as the calling process. So when the workload makes the call, the agent is able to interrogate the UNIX kernel and say, okay, [inaudible] the metadata about this caller: give me its
PID, UID, GID, [inaudible]; you can interrogate all of this from the kernel. Then, as Dan pointed out, different implementations use different mechanisms to actually figure out which identity to issue. In the case of SPIRE, it will call into the local kubelet, and it will pull in local metadata about the machine.

[inaudible audience question] Oh yeah, [inaudible]. So SPIFFE will give you a key via the Workload API, and it will give you a certificate as well. The other thing it will do is give you a trust bundle for any certificate issued by that same SPIFFE implementation.

Sure. So the first question is how do we integrate with Kubernetes, and the second question is whether we integrate with specific managed Kubernetes offerings like GKE. And could the next person who has a question [inaudible]. So yeah, SPIRE runs [inaudible] with your Kubernetes. The easiest way to do it is that you can install it and run it on Kubernetes itself. SPIRE has two components, a server and an agent. The server runs as a service; the agent runs as a DaemonSet, and it exposes the Workload API on each node. Then what happens is that when the workloads call into that API to retrieve their identity, the agent will interrogate the local kubelet to pull out the pod spec of the workload and use that as part of this mechanism to [inaudible]. That's a very superficial explanation, and there are a lot of moving parts, but that's the basis.

The next question, from Cullen, is what is missing from Istio. So I'll say that the SPIFFE specs we've been building out are still in development, right? Federation, for example, is still not [inaudible], and things like the Workload API are fairly stable at this point, but we've never called them [inaudible]. What's missing from
Istio today is the Workload API, [inaudible] with the Workload API, and Federation. [inaudible] the Istio team have been working with us to help define that, but it's not there today. Today in Istio, I believe it internally issues SPIFFE identities, but they can't be federated [inaudible].

All right, the question there was about the control plane communications between the agent and the server. I know there's a talk [inaudible] with a diagram that breaks this down in a bit more detail, but to be very short, it's mainly about authentication. So the server, and this is in SPIRE, has a registry of identities, and it has a set of policies around how those identities should be issued: what Amazon security group it should be in, [inaudible], what pod spec it should match, [inaudible] whether the image is signed, and if so who signed it. Between the agent and the server, the first thing the agent, which runs on the machine itself, needs to do is to authenticate itself to the server. In SPIRE there's a plugin architecture, so you can use different authentication mechanisms depending on whether you're running on Amazon, or running on bare metal with a TPM, or you have a Kerberos-identified machine; different environments need different strategies. So the communication is partly about that authentication, allowing the server to verify the integrity of the machine that it's talking to, and then there's a separate set of communications around telling the agent what workloads it's specifically entitled to issue identities for. I don't know if we have time now, but I do have some slides on this that I can talk folks through.

Sure. The question is, do we support certificate rotation or revocation? [inaudible] and I'll
actually just go ahead and answer that one. We do support certificate rotation; we do not support revocation [inaudible]. The rotation point, by the way, is really important. This is one of the properties you get from talking to an API: it becomes feasible to make these certificates and keys really short-lived, ideally on the order of minutes, because the mechanism for generating them is automated and the mechanism for retrieving them by the workload can be automated too. At a certain point, [inaudible] being able to have a short-lived certificate that expires in a couple of minutes gives you about the same effect as being able to revoke it.

Yes. So the question is, do you have to manually [inaudible] every server [inaudible]? The answer is [inaudible] no. What you do, in SPIRE at least, is define a policy for [inaudible] identity, and that policy could include selectors like security groups. So I can say, for instance, I have a process that's a daemon and it runs on every node in an Amazon auto scaling group. You can just write a policy that describes that, and then when the agent running on each of those machines authenticates, the SPIRE server will do the checking: not only who that instance is and whether it's verified, but also whether that instance is part of [inaudible]. So you can write your policies in generalized ways like that, and when you have, say, an auto-scaled environment, that becomes really important.

Sure, I think we probably have time for one more question. Is there any plan for TPM- or HSM-backed security? And the answer is yes. [inaudible]
If someone's interested in contributing that, it should be pretty straightforward. There's a reasonably trivial plugin interface, at least on the SPIRE side, and then you need to write the [inaudible] exchange on the TPM [inaudible] side. If folks are interested in doing that, there's a bunch of us at Scytale and in the project already, and we would love to see that. [Applause]
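
As a rough illustration of the SPIFFE ID layout Dan describes in the talk (the spiffe:// scheme, then a trust domain, then a workload identifier), here is a minimal Python sketch. The names `SpiffeId` and `parse_spiffe_id` and the example ID are assumptions made for illustration; this is not code from the SPIFFE project, and it checks only the basic shape of an ID, not every rule in the specification.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class SpiffeId(NamedTuple):
    """Hypothetical container for the two parts of a SPIFFE ID."""
    trust_domain: str  # e.g. "acme.com"; it does not need to resolve in DNS
    path: str          # workload identifier, unique within the trust domain


def parse_spiffe_id(uri: str) -> SpiffeId:
    """Split a SPIFFE ID into trust domain and workload path (shape check only)."""
    parts = urlsplit(uri)
    if parts.scheme != "spiffe":
        raise ValueError(f"not a SPIFFE ID (scheme {parts.scheme!r}): {uri}")
    # The trust domain is a bare name: no user info and no port.
    if not parts.netloc or "@" in parts.netloc or ":" in parts.netloc:
        raise ValueError(f"missing or malformed trust domain: {uri}")
    if parts.query or parts.fragment:
        raise ValueError(f"query and fragment are not allowed: {uri}")
    return SpiffeId(trust_domain=parts.netloc.lower(), path=parts.path)
```

An authorization policy could then compare `parse_spiffe_id(peer_uri).trust_domain` and `.path` against an allow list, which is the kind of policy the stable-identity property in the talk is meant to make practical.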

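The Q&A answer about the Workload API being exposed on a UNIX domain socket, with the agent asking the kernel for the caller's PID, UID, and GID, can be sketched as follows. This is a minimal, Linux-only illustration using the `SO_PEERCRED` socket option; the names `PeerCredentials` and `peer_credentials` are hypothetical, and this is an assumption about one way such a lookup can be done, not SPIRE's actual implementation.

```python
import socket
import struct
from typing import NamedTuple


class PeerCredentials(NamedTuple):
    """Hypothetical container for what the kernel reports about the caller."""
    pid: int
    uid: int
    gid: int


def peer_credentials(conn: socket.socket) -> PeerCredentials:
    """Ask the Linux kernel who is on the other end of a UNIX domain socket."""
    # On Linux, struct ucred is { pid_t pid; uid_t uid; gid_t gid; }.
    layout = "iII"
    raw = conn.getsockopt(
        socket.SOL_SOCKET, socket.SO_PEERCRED, struct.calcsize(layout)
    )
    return PeerCredentials(*struct.unpack(layout, raw))
```

Because the kernel fills in these values, the caller does not have to present any credential of its own; an agent could use the returned PID as the starting point for further checks, such as the kubelet lookup mentioned in the answer.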