In the first post in this series, we explored why an IT professional needs to identify the author of a piece of code (like a PowerShell script). The challenge is producing evidence of the author’s identity in a form that the consumer of that script can actually use.
That interaction is somewhat like the one between a speeder and a police officer. The officer needs to verify the identity of the driver in much the same way that an IT pro needs to identify a scripter. The officer can ask for a driver’s license, but what can an IT pro ask for?
The solution starts with digital certificates: clever mathematical puzzles created by official Certificate Authorities. But as of the end of our last chat about these technologies, we still hadn’t determined how to reliably know if someone has such a digital certificate – we can’t just trust a comment in the code that says the author has a certificate, just like an officer can’t trust a driver who complains that he left his license at home.
One half of the puzzle is the Public Key, and the other half is the Private Key. The two form a matched set of values with an interesting mathematical relationship: when a piece of data is encrypted with the Public Key, it can only be decrypted with the Private Key, and data encrypted with the Private Key can only be decrypted with the Public Key. Once the data is encrypted, the encryption cannot be undone without the other key. Not even the key used to encrypt the data can be used to decrypt it.
That’s a strange bit of key behavior. Imagine locking the door of your house with a key, but then finding that you can’t unlock the door with that same key; you need a different key to do the unlocking. Here’s an analogy: I lived for a while in an apartment complex with a shared central mail facility. I had a key to my mailbox that let me retrieve my mail and send out letters of my own. The mail carrier, visiting in the middle of the day, also had a key, but it wasn’t a copy of my key. It was a totally different key that unlocked the back side of the mailbox installation, giving the mail carrier access to all the mailboxes in the complex.
Consider the relationship between my key and the mail carrier’s key. I could use my key to send a message that only the mail carrier could retrieve. The mail carrier could use his or her key to send a message in a way that only I could retrieve. But my key could not open the mail carrier’s side of the mailbox, and neither could the mail carrier open my side. My key didn’t open any of my neighbors’ mailboxes, and they couldn’t open mine.
No amount of time spent staring at my key would tell me what the mail carrier’s key looked like, and the mail carrier couldn’t determine the shape of mine from looking at his own key. But between the two keys, a secure communication channel is established. That’s the goal of Public Key Infrastructure, the set of solutions enabled by deployment of Public and Private keys.
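The one-way nature of a key pair is easy to see for yourself. What follows is a minimal sketch, not anything PowerShell does on your behalf: it assumes the .NET RSACryptoServiceProvider class is available in your session, generates a throwaway key pair, and uses variable names ($keyPair, $publicOnly) that are purely illustrative.

```powershell
# Throwaway 2048-bit RSA key pair, for illustration only.
$keyPair = New-Object System.Security.Cryptography.RSACryptoServiceProvider -ArgumentList 2048

# A second object that holds ONLY the public half of the pair.
$publicOnly = New-Object System.Security.Cryptography.RSACryptoServiceProvider
$publicOnly.ImportParameters($keyPair.ExportParameters($false))   # $false = leave the private key out

$message = [System.Text.Encoding]::UTF8.GetBytes('Meet me at the mailbox')

# Anyone holding the public key can lock the message ($true selects OAEP padding)...
$locked = $publicOnly.Encrypt($message, $true)

# ...but the public key cannot unlock what it just locked.
$publicKeyCanDecrypt = $true
try   { [void]$publicOnly.Decrypt($locked, $true) }
catch { $publicKeyCanDecrypt = $false }

# Only the matching private key can undo the encryption.
$unlocked = [System.Text.Encoding]::UTF8.GetString($keyPair.Decrypt($locked, $true))

"Public key could decrypt: $publicKeyCanDecrypt"
"Private key recovered:    $unlocked"
```

The attempt to decrypt with the public-only object should fail, while the object holding the private key recovers the original text.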
PKI undergirds PowerShell’s strategy for digitally identifying the author of a piece of software. Let’s prove it by working backwards. We know we can’t rely on an author’s say-so, in a comment for example, as a source of identification. That leaves open the question of who is saying so, right? So what can we look for that proves a scripter’s identity? Let’s take one very important scripter as our example: a little company from the Pacific Northwest. Maybe you’ve heard of them? They wrote some code that PowerShell uses and included it in the $PSHome directory, the home folder for PowerShell. If we take a look at the file in that directory named DotNetTypes.format.ps1xml and skip to the last couple of hundred lines of the file, we find line after line of seemingly random letters, digits, and punctuation.
Good heavens! What’s all that gibberish?
What you’re looking at is called Base64. It’s a clever way to embed binary data in a text file. Ordinarily, a text file can only contain printable characters: letters, numbers, punctuation, that sort of thing. That makes it challenging to store binary data, which can use any of the 256 possible bit patterns in a byte rather than being limited to the smaller set of printable characters. Base64 provides a sneaky technique to describe all those 8-bit bytes by regrouping their bits into 6-bit chunks. Each of those 6-bit chunks can hold one of 2^6 possible bit patterns, and you might not be surprised to know that 2^6 is 64.
Those 64 possible patterns are then described using the 26 uppercase letters, plus the 26 lowercase letters, plus the 10 digits (bringing us up to 62 characters), plus two punctuation characters, the plus (+) and the slash (/). That gives us 64 possible patterns, each represented as perfectly valid text.
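To make the 8-bit-to-6-bit regrouping concrete, here is a small sketch that encodes six arbitrary sample bytes with the .NET Convert class and then repeats the first three bytes by hand. The sample bytes and variable names are made up for illustration, and the shift operators (-shl, -shr) assume PowerShell 3.0 or later.

```powershell
# Six sample bytes, several of which are not printable characters.
$bytes = [byte[]](0x00, 0xFF, 0x10, 0x80, 0x7F, 0xFE)

# Built-in .NET encoder and decoder.
$encoded = [Convert]::ToBase64String($bytes)       # AP8QgH/+
$decoded = [Convert]::FromBase64String($encoded)   # the same six bytes again

# The same idea by hand for the first three bytes:
# 3 bytes = 24 bits = four 6-bit indexes into the 64-character alphabet.
$alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'
$bits     = ([int]$bytes[0] -shl 16) -bor ([int]$bytes[1] -shl 8) -bor [int]$bytes[2]
$indexes  = 18, 12, 6, 0 | ForEach-Object { ($bits -shr $_) -band 63 }   # 0, 15, 60, 16
$byHand   = -join ($indexes | ForEach-Object { $alphabet[$_] })         # AP8Q

$encoded
$byHand
```

When the input length is not a multiple of three bytes, the encoder pads the output with equals signs (=), which is why signature blocks often end with one or two of them.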
So, what’s in this block of text? For that, we can rely on a Base64 Decoder, like the one you find at http://www.base64decode.org/
What’s the result? Have a look.
ICK! That looks worse than the Base64 gibberish! But there’s some interesting data in there, if you look. There are some areas of readable text in there, and it’s referencing that one Northwestern software development shop. But what’s the rest of the dreck surrounding it? That’s the binary data that the use of Base64 is meant to mask. Fortunately, Microsoft gives us an easier way to understand what it is.
Microsoft describes the big block of Base64 with the overly fancy term Authenticode. What that means exactly will become clear in a moment, but for starters, it’s worth pointing out that we can extract the meaning of that Authenticode data by calling the Get-AuthenticodeSignature cmdlet and passing it the name of the file whose signature we would like to extract. To see what’s in the signature data, simply pass the signature object extracted from the file to Format-List *.
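If you would rather not paste the block into a web page, the same decoding can be done locally. The sketch below is one way to do it, under a couple of assumptions: that the block sits between marker lines containing "SIG # Begin signature block" and "SIG # End signature block", and that anything on those lines that is not a Base64 character is just comment decoration. Get-SignatureBlockBytes is a hypothetical helper name, not a built-in cmdlet.

```powershell
# Hypothetical helper: collect the Base64 text between the signature markers and decode it.
function Get-SignatureBlockBytes {
    param([string[]]$Lines)

    $inside = $false
    $base64 = New-Object System.Text.StringBuilder
    foreach ($line in $Lines) {
        if ($line -match 'SIG # Begin signature block') { $inside = $true; continue }
        if ($line -match 'SIG # End signature block')   { break }
        if ($inside) {
            # Strip comment markers and whitespace, keeping only Base64 characters.
            [void]$base64.Append(($line -replace '[^A-Za-z0-9+/=]', ''))
        }
    }
    # The leading comma keeps the byte array from being unrolled on output.
    return , [Convert]::FromBase64String($base64.ToString())
}

$path = Join-Path $PSHOME 'DotNetTypes.format.ps1xml'
if (Test-Path $path) {
    $bytes = Get-SignatureBlockBytes -Lines (Get-Content $path)

    # Show only the runs of readable text hiding in the binary data.
    $text = [System.Text.Encoding]::ASCII.GetString($bytes)
    [regex]::Matches($text, '[ -~]{6,}') | ForEach-Object { $_.Value }
}
```

The file may not exist, or may not carry an embedded signature, on every PowerShell installation, which is why the sketch checks for it first.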
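In practice, that looks something like the following sketch. It assumes a Windows system where DotNetTypes.format.ps1xml exists in $PSHome and carries an embedded signature; the variable names are illustrative.

```powershell
$path      = Join-Path $PSHOME 'DotNetTypes.format.ps1xml'
$signature = Get-AuthenticodeSignature -FilePath $path

# Everything the signature object exposes.
$signature | Format-List *

# The pieces that matter most for identification.
$signature.Status                      # reports whether the signature checks out on this machine
$signature.SignerCertificate.Subject   # who the certificate was issued to
$signature.SignerCertificate.Issuer    # which certificate authority issued it
```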
That text looks familiar – we saw pieces of it in the decoded version of the Base64 data. What’s in the signature is a description of a digital certificate! It identifies its subject, a company called… Microsoft Corporation (that’s the name I was forgetting…). It seems that Microsoft’s own Code Signing certificate authority produced the certificate.
Step back into your role as the scripting police officer. You’ve just been handed a digital driver’s license. Is it a real license, or is it a yellow sticky note? How can you tell? Think about the police officer and the speeder: the license isn’t valid because the driver says so, it’s valid because the DMV says so, and, critically, the officer trusts the DMV.
This license identifies the Department of Scripter Identification that issued this particular license – the Microsoft Code Signing CA. The bigger question is, do we trust that source? There’s an easy way to find out – open up the CertMgr.msc MMC snap-in.
Sure enough – that CA is in the list of CAs that we trust – these are the “Trusted Root Certification Authorities” for this computer. This list gets updated routinely by Windows Update as Microsoft identifies new organizations that are trustworthy “license”-generators, and as other companies go out of business or are decertified for various reasons.
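The same check can be scripted instead of clicked through. This is a rough sketch, assuming a Windows machine and a file with an embedded signature. The CA named as the issuer of a signing certificate is often an intermediate that chains up to a root, so the sketch walks the whole chain and then looks for the last certificate in the machine's trusted root store. Revocation checking is switched off here only to keep the example quick and offline.

```powershell
$path   = Join-Path $PSHOME 'DotNetTypes.format.ps1xml'
$signer = (Get-AuthenticodeSignature -FilePath $path).SignerCertificate

# Build the chain from the signer's certificate up to its root.
$chain = New-Object System.Security.Cryptography.X509Certificates.X509Chain
$chain.ChainPolicy.RevocationMode = 'NoCheck'   # assumption: skip online revocation lookups
[void]$chain.Build($signer)

# Signer first, root last.
$chain.ChainElements | ForEach-Object { $_.Certificate.Subject }

# Is the certificate at the top of the chain in this computer's trusted root store?
$root = $chain.ChainElements[$chain.ChainElements.Count - 1].Certificate
Get-ChildItem Cert:\LocalMachine\Root |
    Where-Object { $_.Thumbprint -eq $root.Thumbprint } |
    Select-Object Subject, NotAfter
```

If the final command returns a certificate, the chain ends at an authority this computer already trusts.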
So that’s it – ironclad proof of the identity of a scripter. The script embeds the identity of the scripter in a certificate, and the certificate contains the identity of the certificate authority. We trust that certificate authority, so we can therefore trust that the owner of that certificate is who they claim to be.
But we have one last problem to overcome. A police officer knows if your license belongs to you because your picture is on it. But in the scripting scenario, all we know is that the script is followed by a signature. How do we know the signature belongs to the creator of the script? Couldn’t a virus coder just swipe the Base64 from the bottom of a script, add it to the bottom of the script containing his PowerShell-borne virus, and trick the world into thinking the code was legitimate?
Fortunately, the answer is no. But that’s a story for another time…