There are lots of things that we regard as secret, and there are lots of times when you have a secret you want to share with someone else, the question is how do you share the secret with the person you want to without sharing it with anyone else by accident.
There are two basic approaches, you either share the secret in such a way that only the person you want to share it with can understand what you mean, this is called Cryptography.
The second alternative would be to share it in such a way that no one else notices that you’re sharing a secret, this is called Stenography.
Think of it like the difference between telling someone something in a language that only the two of you speak (Encryption), and whispering the secret when no-one else is looking (Stenography).
About Stenography
People have been using stenography for as long as they have been using codes, there are some great examples of hiding information:
The ancient greeks using tablets made from wood coated with wax with one message carved into the wax, and a secret message carved into the wood underneath exposed only be melting the wax.
Mask letters which read one thing normally, but with a cutout masks overlaid read something different, such as revolutionary war letters sent in 1777 from Sir Henry Clinton to John Burgoyne.
Jeremiah Denton sending a hidden message by blinking in morse code during televised press conference in which he says he is being well treated while being held as a POW by North Vietnamese captors in 1966. When decoding his blink patterns the message reads simple the word TORTURE but at the time was the first clear report of POW mistreatment communicated to the US from a captive.
There are lots of ways to hide information limited only by the imagination and inventiveness of the person doing the hiding, while participating in the Volga CTF run by Samara State Aerospace University which among an extensive list of very well put together and very challenging questions had a few Stenography ones I’d like to share with you.
Poem
So in this challenge, we were given a pdf document, and told that it contained a hidden message.
There are a lot of ways to hide things in text, cutout masks as we’ve discussed, using the first letter from each word (or other offsets), or changes based on capitalization, punctuation or spacing.
My first point of call was to lookup the source texts and compare them with the content in the PDF looking for any alterations, this didn’t lead to any changes being found it looks to be a pretty straight forward copy of the text so time to look at some other options.
Looking at the PDF file it doesn’t quite look even somehow, the lines don’t seem evenly spaced, so I theorized that something in the document structure was throwing off the look of the content.
The trouble with the PDF file format is that it’s absolutely not a text format, so much of the structure is about page layouts and positioning that it is very complicate to get information out of them, and the normal tools for reading and writing them as they usually provide a lot of interpretation and abstraction of the content to make an end users job easier, which can obscure hidden things.
A little research lead me to a tool calls iText RUPS which allows you to basically browse the content of a PDF file in its native form and after a little digging in the document structure I found the text streams for each page.
The first thing I noticed was that each line seems to have a line height of either 14.xxx or 17.xxx so I extracted that information and turned it into binary based on the assumption that 14.xxx = 0 and 17.xxx = 1
Given this leads to such a short stream of hidden data I tried splitting it into 8 char groups which gave me the following data:
01111011 01110011 01101001 01111010 11001010 10111110 11001000 11011110 10010101 11001101 01111101 10110101 00001011 10100011 10100011 00101011 10100101 11110110 00010110 01100111 00001100 10101110 01001011 11101100 00011011 00011011 00011111 0
Having spent a lot of time with data like this one of the things you come to recognize is that almost all text based data chars start with a zero, so it looks promising for the first few groups but then seems to break down. In fact if you decode those first few chars you get:
01111011 - { 01110011 - s 01101001 - i 01111010 - z
so looking at the combinations again, all the values for the next few rows end with a zero, so if I was to insert an extra bit of data they would all suddenly be in my roughly printable range too, and after they stop ending with a zero it seems that the second to last one is zero for a while too.
This seems to indicate I’ve missed a few bits out somehow, counting through it seems I’ve in fact missed 7 bits, and that would seem to indicate something on the page boundary that I missed.
Anyway, knowing that I’ve missed out bits I simply added some spaces and re-broke things out to make them all printable, and included a value from the pdf that seemed to follow:
01111011 { 01110011 s 01101001 i 0111101 -37.904 z { 01100101 e 01011111 _ 01100100 d 01101111 o 0100101 -40.738 J K 01110011 s 01011111 _ 01101101 m 0100001 -37.904 B C 01110100 t 01110100 t 01100101 e 0111010 -34.452 t u 01011111 _ 01100001 a 01100110 f 0111000 -46.408 p q 01100101 e 01110010 r 01011111 _ 0110000 -34.452 ` a 01101100 l 01101100 l 0111110 -190.854 | }
I’ve also added the decoded values and as you can see I’ve included two options for each of the chars where I’m missing a bit assuming adding either a zero or a one on the end. Clearly it’s not quite right, but it’s close enough to guess the message is {size_does_matter_after_all}
After a little checking we can confirm we’ve simply guessed the wrong place for the spaces, checking the raw data we do indeed get a much better decoding.
TLSB
Another good place to hide information is in images, this challenge gave us an image and told us there was data hidden inside it.
In things like images (or even sounds) because of the way there are a huge number of variations it can be quite easy so hide some smaller amounts of information without making changes that would be detectable by a person.
One of the common ways is called Least Significant Bit encoding, which is hiding information by doing things like checking if the numeric value of each pixel is odd or even, which can be altered by adding or subtracting one from each pixel to make the values in the cover image be equal to the bits in the hidden message.
This table shows the binary values of the first 5 pixels in their binary forms, each pixel being made up by three values red, green and blue. The highlighted row is the LSB which is where we expect to find the message.
I spent some time playing with the LSB values, looking for ways to shuffle them around, reading only red bits, or excluding all the blue bits, or every nth pixel, but didn’t come up with anything which looked promising.
Fortunately there were a few clues published (clearly they were taking pity on us for not solving it!) the first of which was “2 bpp. That’s incredible.”
After trying a few variations of how you could miss bits out in our stream so we’re only reading 2 in three bits and not getting anywhere I put my thinking cap on, and looked for a scheme to convert three bits of data into two bits of message.
Adding up the number of bits set to 1 might be a good start, this is called the Hamming Weight, and you can express it as a table like this.
R G B Hamming weight 0 0 0 0 = 00 0 0 1 1 = 01 0 1 0 1 = 01 1 0 0 1 = 01 0 1 1 2 = 10 1 0 1 2 = 10 1 1 0 2 = 10 1 1 1 3 = 11
Looking at the output of this again I can’t see anything that I can understand, however there does seem to be quite a large extent of values that are either 00 or 11 which is unlikely given the table above indicates that random values should favour 01 and 10.
The second clue mentioned Matrix embedding, and the third mentioned Jessica Fridrich, which lead me to a few papers including these:
Matrix Embedding for Large Payloads
Steganalysis of JPEG Images: Breaking the F5 Algorithm
I spent some time digging through these, and I’ll confess to being out of my comfort zone trying to follow the maths notation, which made it hard going understanding exactly how these would help when dealing with our image, but I picked up a few things and bookmarked these for later.
I went back to google looking for sample implementations of some of what they’re talking about, found some interesting things. The real breakthrough came from: F5 a Steganographic algorithm which is quite a nice PowerPoint deck discussing different ways of embedding data into images but approaches things from a much more programmer-friendly perspective.
It focuses on jpeg based images which has a number of differences from the png based data we’re working with here, but the principles of lsb embedding are still relevant. On slide 15 which states there’s a really nice example of 3×2 based matrix embedding (rather than the generic examples in the papers above):
x1 == a1 ^ a3, x2 == a2 ^ a3 - no modification required x1 != a1 ^ a3, x2 == a2 ^ a3 - flip a1 x1 == a1 ^ a3, x2 != a2 ^ a3 - flip a2 x1 != a1 ^ a3, x2 != a2 ^ a3 - flip a3
This is basically saying you can encode two bits of data into three source bits by only modifying one bit from each pixel at worst (it’s zero bits 25% of the time) in the cover image (which is pretty darn cool!)
The slide refers to the more general cases for hiding x bits of data in y bits of cover, which was what I was trying and failing to grasp from the Math papers I suspect, but that’s not the issue, because this is the 3×2 case we’re already looking for!
So this means that our output bits following this scheme would be:
x1 = (r & 1) ^ (b & 1) x2 = (g & 1) ^ (b & 1)
Well it’s not quite the data I was looking for, I can’t quite make anything out, but that’s a whole lot of double zeros, clearly we’re on to something here, maybe there’s some pattern to the one’s.
To test that I’ve changed it to print zeros as spaces and ones as X’s and do a new line every time we loop round the image in the hope that it makes it easier to see the pattern…
Rather than a pattern I could decode they’re all arranged in down the left side! After a little head scratching and looking at what’s going on, I realized I’ve got my x and y mixed up, because I’m looping x and then y in my code I’m actually doing columns vertically rather than rows, a quick change of my code and I’ve now got a fairly contiguous block of data followed by a big group of zeros.
Let’s take a look see!
Although file doesn’t recognize it, but that repeating pattern looks reminiscent of the way blank sections of some images look when encoded, and there are other sections of the file with more detail which could be content.
Looking at the header it seems to match quite well with the png headers from the outer resource file and looking at it in hex seems to show that the footer of the file looks like a similar layout as well.
Looking at the bits in the header it’s actually only the bits coloured red (the ones from doing xor between the red and blue pixels) that isn’t matching, the others line up perfectly. I spent some time trying to work out if there was a pattern in the miss-matches, but didn’t find anything there. The next step was to try a few variations on the way of calculating those values which lead me to:
x1 = (r & 1) ^ (g & 1) x2 = (g & 1) ^ (b & 1)
Run that through my inspection code and I see a whole png header, right from the start of the file! A quick export and we’ve got a valid output!
Source for the final run:
from PIL import Image im = Image.open("stego.png") #Can be many different formats. #im = Image.open("original.jpeg") pix = im.load() limx, limy = im.size red_bits = [] green_bits = [] blue_bits = [] counter = 0 last_1 = (0,0,0) bits = [0,0,0,0,0,0,0,0] recovered_bits = [] chars = [] for y in xrange(limy): for x in xrange(limx): r, g, b = pix[x,y] red_bits.append(r&1) green_bits.append(g&1) blue_bits.append(b&1) x1 = (r & 1) ^ (g & 1) x2 = (g & 1) ^ (b & 1) recovered_bits.append(x1) recovered_bits.append(x2) bits[counter % 8] = str(x1) bits[(counter % 8) + 1] = str(x2) counter += 2 if counter % 8 == 0: chars.append(chr(int(''.join(bits),2))) if(x1 != 0 or x2 != 0): last_1 = (x,y,len(chars)) recovered = ''.join(chars[:last_1[2]+1]) png_header = "\x89\x50\x4e\x47\x0d\x0a\x1a\x0a" png_bits = [] for c in png_header: for i in xrange(8): png_bits.append((ord(c) & pow(2, 7 - i)) >> 7 - i) from termcolor import colored print "Image Dimentions: (%i x %i)" % (limx, limy) print "Last Data: (x,y) = (%i, %i) - output length = %i" % last_1 print "Length of png header bits %i" % len(png_bits) print "Hunting..." matched = 0 matched_best = 0 for i in xrange(len(red_bits)): if red_bits[i] ^ blue_bits[i] == png_bits[matched] and green_bits[i] ^ blue_bits[i] == png_bits[matched+1]: matched += 2 if matched > matched_best: matched_best = matched print "matched %i bits starting at %i" % (matched, i - ((matched / 2) - 1)) else: matched = 0 print "PNG header:" for x in png_bits: print x, print print "Red:" for x in red_bits[:len(png_bits)]: print colored(str(x), 'red'), print print "Green:" for x in green_bits[:len(png_bits)]: print colored(str(x), 'green'), print print "Blue:" for x in blue_bits[:len(png_bits)]: print colored(str(x), 'blue'), print print "====== Based on XOR ======" print "PNG:" Toggle = True for x in png_bits: print colored(str(x), 'red') if Toggle else colored(str(x), 'blue'), Toggle = not Toggle print print "Recovered:" Toggle = True for x in recovered_bits[:len(png_bits)]: print colored(str(x), 'red') if Toggle else colored(str(x), 'blue'), Toggle = not Toggle print print "======= Bytes =======" print "PNG:" print [bin(ord(x))[2:].zfill(8) for x in png_header] print "Recovered:" print [bin(ord(x))[2:].zfill(8) for x in recovered[:len(png_header)]] print "Bits by val" print "Writing output to out.png..." with open('out.png', 'wb+') as f: f.write(recovered) print "Done!"