So I recently worked on a feature at work that allows the client to upload images to a cloud storage provider of our choosing. Sounds easy, right? But of course, there’s more. When users send an image file, we store it in the storage bucket and return the file URL. Still sounds easy? There’s more.
You see, we also wanted to get some information about the image that was uploaded, information like where it was taken, what time it was taken, and any other metadata relevant to the next step the user would take after uploading the image.
This was my first time encountering something called Exchangeable Image File Format (EXIF). I read a little bit about it and started looking for a library to help me sort it out. This is usually not how I work. I love to tackle complex software problems when the opportunity presents itself, but in this case I was working at a fast-paced startup, and in that kind of environment the focus is on shipping the next feature, then the next one after that, and then the next.
Lately, I’ve started spending my weekends building exciting personal projects. Most of these I try to finish in one weekend: start on Saturday, wrap up by Sunday, just so there’s no room for spillovers. I usually build them in Golang, and this weekend was no different.
I decided to explore the nitty-gritty of EXIF. And it led me down a rabbit hole of bytes and byte manipulation, endian gods, header checks, file permissions, and even a few philosophical reflections about the meaning of 0xFFD8. It reminded me of some of the wacky things I learned back in Computer Science classes, things I never thought I’d use.
You can find the full source code for the EXIF Parsing Engine here
What Even Is EXIF?
You might not realize this, but every time an image is taken, especially on modern devices, metadata gets embedded in it: information like the date the photo was taken, the camera model, exposure settings, and yes, whether the flash was used or not (spoiler: the answer is always no because we all hate flash).
That’s EXIF.
And I wanted to write a tool in Go that could read this data and store it in a JSON file.
Now, there are tools that already do this. I didn’t bother checking them out. The point wasn’t to solve the problem fast, it was to deeply understand the subject matter and write the solution in a language I’ve been obsessed with for years: Go.
Step 1: Just Read the File
This was the easy part. Open the file, read it into a []byte. Go’s standard library doesn’t give you EXIF parsing sugar, so you’re left to chew the raw bytes like someone decoding ancient runes.
Everything after this was one big learning curve.
data, err := os.ReadFile("path/to/image.jpg")
if err != nil {
	// The file may be missing, unreadable, or a directory; err says which.
	log.Fatalf("Error reading file: %v", err)
}
Step 2: JPEGs Come With an Entry Fee
While researching and building out this tool, I learned that JPEG images are usually where you’ll find EXIF data. I didn’t bother checking other formats. JPEG files are structured as a sequence of segments, each marked by a two-byte marker.
JPEGs always start with the bytes 0xFF and 0xD8. Together they form the SOI (Start of Image) marker. If your file doesn’t start with these, stop right there: it’s not a JPEG.
func isJPEG(data []byte) bool {
	return len(data) >= 2 && data[0] == 0xFF && data[1] == 0xD8
}
Step 3: Finding the EXIF Segment (aka APP1 Marker)
Remember those “segments” I mentioned? JPEG files are filled with them. One of them, APP1, is where the EXIF data lives.
APP1 is the segment we care about most because that’s where you’ll find data like:
- Camera make and model
- Date and time the image was taken
- Other juicy metadata bits
To find the EXIF block, scan through the byte data for the APP1 marker, the byte pair 0xFF 0xE1:
for i := 0; i+1 < len(data); i++ {
	if data[i] == 0xFF && data[i+1] == 0xE1 {
		// Welcome to the EXIF block, my friend.
		break
	}
}
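One caveat with a raw scan: the bytes 0xFF 0xE1 can also show up by coincidence inside another segment’s payload. Here is a minimal sketch of a stricter approach, assuming the usual JPEG layout where every segment after SOI is a marker followed by a two-byte big-endian length that counts itself. The name findEXIF is my own, made up for illustration, and the sketch ignores rarer cases such as fill bytes between segments.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// exifHeader is the identifier that opens an EXIF APP1 payload.
var exifHeader = []byte("Exif\x00\x00")

// findEXIF (hypothetical helper) walks the JPEG segment by segment and
// returns the APP1 payload that starts with "Exif\x00\x00", identifier included.
func findEXIF(data []byte) ([]byte, error) {
	if len(data) < 2 || data[0] != 0xFF || data[1] != 0xD8 {
		return nil, errors.New("not a JPEG: missing SOI marker")
	}
	pos := 2
	for pos+4 <= len(data) {
		if data[pos] != 0xFF {
			return nil, fmt.Errorf("expected a marker at offset %d", pos)
		}
		marker := data[pos+1]
		if marker == 0xDA || marker == 0xD9 {
			break // SOS or EOI: image data starts or the file ends
		}
		// The segment length is always big-endian and includes its own two bytes.
		segLen := int(binary.BigEndian.Uint16(data[pos+2 : pos+4]))
		end := pos + 2 + segLen
		if segLen < 2 || end > len(data) {
			return nil, errors.New("truncated or corrupt segment")
		}
		payload := data[pos+4 : end]
		if marker == 0xE1 && bytes.HasPrefix(payload, exifHeader) {
			return payload, nil
		}
		pos = end
	}
	return nil, errors.New("no EXIF segment found")
}

func main() {
	// Hand-built sample: SOI, then one APP1 segment of length 10
	// (2 length bytes + 6 identifier bytes + 2 bytes of TIFF data).
	sample := []byte{0xFF, 0xD8, 0xFF, 0xE1, 0x00, 0x0A, 'E', 'x', 'i', 'f', 0, 0, 'I', 'I'}
	payload, err := findEXIF(sample)
	fmt.Println(len(payload), err)
}
```

Because the function jumps from one segment to the next using the length field, it never looks inside a payload for markers, which is what keeps the false positives away.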
Step 4: The Curse of Byte Order
Six bytes into the EXIF segment’s payload, right after the "Exif\x00\x00" identifier, you’ll find the TIFF header. Its first two bytes tell you the byte order for interpreting the rest of the data: "II" (0x49 0x49) for LittleEndian or "MM" (0x4D 0x4D) for BigEndian.
// data is assumed to start at the "Exif\x00\x00" identifier,
// so the TIFF header begins at index 6.
var endian binary.ByteOrder
if data[6] == 0x49 && data[7] == 0x49 {
	endian = binary.LittleEndian
} else if data[6] == 0x4D && data[7] == 0x4D {
	endian = binary.BigEndian
} else {
	log.Fatal("What in the corrupted image is this?")
}
At this point, you might not even know what endianness means, but just roll with it.
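For anyone who would rather not just roll with it: endianness is the order in which the bytes of a multi-byte number are stored. Big-endian puts the most significant byte first, little-endian puts it last. A small sketch (decodeBoth is a hypothetical helper, not part of the parser) reads the same two bytes both ways:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeBoth reads the first two bytes of b as a 16-bit integer twice:
// once as big-endian and once as little-endian.
func decodeBoth(b []byte) (be, le uint16) {
	return binary.BigEndian.Uint16(b), binary.LittleEndian.Uint16(b)
}

func main() {
	be, le := decodeBoth([]byte{0x01, 0x0F})
	fmt.Printf("big-endian: 0x%04X, little-endian: 0x%04X\n", be, le)
	// prints "big-endian: 0x010F, little-endian: 0x0F01"
}
```

Same bytes, two different numbers. That is why the byte order has to be settled before reading anything else in the TIFF block.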
Step 5: Are We at the IFD Yet?
The TIFF header gives you an offset to something called the IFD (Image File Directory). This is where the real party begins. EXIF tags live here.
Each tag has a unique identifier. For example:
- 0x010F = “Make”
- 0x0110 = “Model”
You calculate the start of the IFD like this:
firstIFDIndex := tiffStart + int(ifdOffset)
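Where does ifdOffset come from? Here is a sketch, assuming the standard 8-byte TIFF header layout: two bytes of byte order mark, two bytes holding the magic number 42, then a four-byte offset to the first IFD. The name parseTIFFHeader is made up for this example, and data is assumed to be the APP1 payload with the TIFF header starting at tiffStart.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// parseTIFFHeader (hypothetical helper) reads the 8-byte TIFF header at
// tiffStart and returns the byte order plus the index of the first IFD in data.
func parseTIFFHeader(data []byte, tiffStart int) (binary.ByteOrder, int, error) {
	if tiffStart < 0 || tiffStart+8 > len(data) {
		return nil, 0, errors.New("TIFF header out of bounds")
	}
	header := data[tiffStart : tiffStart+8]

	var order binary.ByteOrder
	switch string(header[0:2]) {
	case "II":
		order = binary.LittleEndian
	case "MM":
		order = binary.BigEndian
	default:
		return nil, 0, errors.New("unknown byte order mark")
	}
	// Bytes 2-3 hold the magic number 42, stored in the byte order just detected.
	if order.Uint16(header[2:4]) != 42 {
		return nil, 0, errors.New("bad TIFF magic number")
	}
	// Bytes 4-7 hold the IFD offset, counted from the start of the TIFF header.
	ifdOffset := order.Uint32(header[4:8])
	firstIFDIndex := tiffStart + int(ifdOffset)
	return order, firstIFDIndex, nil
}

func main() {
	// "Exif\0\0" identifier followed by a little-endian TIFF header.
	data := []byte("Exif\x00\x00II\x2A\x00\x08\x00\x00\x00")
	order, firstIFDIndex, err := parseTIFFHeader(data, 6)
	fmt.Println(order, firstIFDIndex, err)
}
```

The caller still has to check that firstIFDIndex lands inside data before reading from it.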
You are now in EXIF land.
Step 6: Parsing EXIF Tags Like a Hacker (but Tired)
Each EXIF tag is 12 bytes long and includes:
- A tag ID (2 bytes)
- A data format (2 bytes)
- A component count (4 bytes)
- Either the value itself or an offset to where the value lives (4 bytes)
You loop through these entries, decode each tag’s value, and convert it into something readable.
if valueSize <= 4 {
	tagValue = entry[8 : 8+valueSize]
} else {
	tagValue = data[tiffStart+int(valueOffset):tiffStart+int(valueOffset)+valueSize]
}
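To put that branch in context, here is a sketch of the whole loop. It assumes the usual IFD layout: a two-byte entry count followed by that many 12-byte entries. The names rawTag, readIFD, byteOrderFor, and bytesPerComponent are hypothetical. Unlike the snippet above, tiff here is sliced to start at the TIFF header, so offsets index into it directly.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// bytesPerComponent maps each data format to the size of one component.
var bytesPerComponent = map[uint16]int{
	1: 1, 2: 1, 3: 2, 4: 4, 5: 8, 6: 1,
	7: 1, 8: 2, 9: 4, 10: 8, 11: 4, 12: 8,
}

// rawTag (hypothetical) holds one IFD entry with its undecoded value bytes.
type rawTag struct {
	ID     uint16
	Format uint16
	Count  uint32
	Value  []byte
}

// byteOrderFor maps the "II"/"MM" mark to a byte order; it does not validate.
func byteOrderFor(mark string) binary.ByteOrder {
	if mark == "II" {
		return binary.LittleEndian
	}
	return binary.BigEndian
}

// readIFD parses the IFD located ifdOffset bytes into tiff.
func readIFD(tiff []byte, ifdOffset int, order binary.ByteOrder) ([]rawTag, error) {
	if ifdOffset < 0 || ifdOffset+2 > len(tiff) {
		return nil, errors.New("IFD offset out of bounds")
	}
	n := int(order.Uint16(tiff[ifdOffset : ifdOffset+2]))
	tags := make([]rawTag, 0, n)
	for i := 0; i < n; i++ {
		start := ifdOffset + 2 + i*12
		if start+12 > len(tiff) {
			return nil, errors.New("IFD entry out of bounds")
		}
		entry := tiff[start : start+12]
		t := rawTag{
			ID:     order.Uint16(entry[0:2]),
			Format: order.Uint16(entry[2:4]),
			Count:  order.Uint32(entry[4:8]),
		}
		size, ok := bytesPerComponent[t.Format]
		if !ok {
			continue // unknown format: skip this entry
		}
		valueSize := size * int(t.Count)
		if valueSize <= 4 {
			// Small values are stored inside the entry itself.
			t.Value = entry[8 : 8+valueSize]
		} else {
			// Larger values live elsewhere; the last 4 bytes are an offset.
			off := int(order.Uint32(entry[8:12]))
			if off < 0 || valueSize < 0 || off > len(tiff)-valueSize {
				return nil, fmt.Errorf("tag 0x%04X value out of bounds", t.ID)
			}
			t.Value = tiff[off : off+valueSize]
		}
		tags = append(tags, t)
	}
	return tags, nil
}

func main() {
	tiff := []byte{
		'I', 'I', 0x2A, 0x00, 0x08, 0x00, 0x00, 0x00, // header, first IFD at offset 8
		0x01, 0x00, // one entry
		0x12, 0x01, 0x03, 0x00, 0x01, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, // tag 0x0112, SHORT, count 1, inline
		0x00, 0x00, 0x00, 0x00, // no further IFDs
	}
	tags, err := readIFD(tiff, 8, byteOrderFor("II"))
	fmt.Printf("%+v %v\n", tags, err)
}
```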
You learn that:
- ASCII = format 2
- SHORT = format 3
- RATIONAL = format 5
And you decode accordingly. Trust me, it’s not glamorous, but it’s worth it.
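Here is roughly what “decode accordingly” can look like for those three formats. This is a sketch rather than the full parser: decodeValue and orderFromMark are hypothetical names, only the first component of a SHORT or RATIONAL is read, and every result is turned into a string to keep things short.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"strings"
)

// orderFromMark maps the "II"/"MM" mark to a byte order; it does not validate.
func orderFromMark(mark string) binary.ByteOrder {
	if mark == "II" {
		return binary.LittleEndian
	}
	return binary.BigEndian
}

// decodeValue turns the raw bytes of a tag into a printable string.
func decodeValue(format uint16, raw []byte, order binary.ByteOrder) (string, error) {
	switch format {
	case 2: // ASCII: text terminated by a NUL byte
		return strings.TrimRight(string(raw), "\x00"), nil
	case 3: // SHORT: unsigned 16-bit integer
		if len(raw) < 2 {
			return "", fmt.Errorf("SHORT needs 2 bytes, got %d", len(raw))
		}
		return fmt.Sprint(order.Uint16(raw)), nil
	case 5: // RATIONAL: two unsigned 32-bit integers, numerator then denominator
		if len(raw) < 8 {
			return "", fmt.Errorf("RATIONAL needs 8 bytes, got %d", len(raw))
		}
		num := order.Uint32(raw[0:4])
		den := order.Uint32(raw[4:8])
		return fmt.Sprintf("%d/%d", num, den), nil
	default:
		return "", fmt.Errorf("unsupported format %d", format)
	}
}

func main() {
	le := orderFromMark("II")
	// Made-up sample bytes: an exposure time of 1/200 stored as a RATIONAL.
	exposure, err := decodeValue(5, []byte{1, 0, 0, 0, 200, 0, 0, 0}, le)
	fmt.Println(exposure, err)
}
```

Keeping RATIONAL as a "numerator/denominator" string avoids dividing by a zero denominator, which corrupt files can contain.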
Step 7: JSON Please, but Make it Pretty
After gathering all your EXIF tags into a Go struct, you want to save the output to a file and format it nicely, of course.
jsonBytes, err := json.MarshalIndent(exifPayload, "", " ")
os.WriteFile("output.json", jsonBytes, 0644)
Done. Now you have a clean, readable JSON output that represents all the hidden metadata inside your image.
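The two-liner above skips error handling, so here is a slightly more careful sketch. The exifData struct and its three fields are assumptions for illustration (the real payload has more tags), and prettyJSON and saveJSON are made-up helper names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"os"
)

// exifData is a hypothetical, trimmed-down payload with only three tags.
type exifData struct {
	Make     string `json:"make"`
	Model    string `json:"model"`
	DateTime string `json:"date_time"`
}

// prettyJSON encodes payload with two-space indentation.
func prettyJSON(payload any) ([]byte, error) {
	return json.MarshalIndent(payload, "", "  ")
}

// saveJSON writes the indented JSON to path, readable by everyone and
// writable only by the owner (0644).
func saveJSON(path string, payload any) error {
	jsonBytes, err := prettyJSON(payload)
	if err != nil {
		return fmt.Errorf("encoding JSON: %w", err)
	}
	if err := os.WriteFile(path, jsonBytes, 0644); err != nil {
		return fmt.Errorf("writing %s: %w", path, err)
	}
	return nil
}

func main() {
	// Made-up sample values, not taken from a real image.
	exifPayload := exifData{Make: "Canon", Model: "EOS 80D", DateTime: "2024:01:01 10:00:00"}
	if err := saveJSON("output.json", exifPayload); err != nil {
		log.Fatal(err)
	}
}
```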
Hard Lessons & Random Gotchas
- The EXIF block must start with "Exif\x00\x00"
- binary.BigEndian.Uint32(data) will panic if the slice is shorter than 4 bytes
- Forgetting to check slice bounds will cost you your weekend
- Offsets are relative to the TIFF header, not the start of the file
- Formatting JSON nicely? Use json.MarshalIndent
- File permissions? Use 0644 unless you have a really good reason not to
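The bounds and offset gotchas are easy to wrap in two small helpers. A minimal sketch, with tiffSlice and safeUint32 as hypothetical names: the first resolves an offset relative to the TIFF header and checks the range, the second refuses to read from a slice that is too short.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// tiffSlice returns size bytes located offset bytes after the TIFF header,
// or an error if that range falls outside data.
func tiffSlice(data []byte, tiffStart int, offset uint32, size int) ([]byte, error) {
	start := tiffStart + int(offset)
	end := start + size
	if tiffStart < 0 || size < 0 || start < tiffStart || end < start || end > len(data) {
		return nil, fmt.Errorf("range [%d:%d] is outside data of length %d", start, end, len(data))
	}
	return data[start:end], nil
}

// safeUint32 reads a big-endian uint32 and returns an error instead of panicking.
func safeUint32(b []byte) (uint32, error) {
	if len(b) < 4 {
		return 0, errors.New("need at least 4 bytes")
	}
	return binary.BigEndian.Uint32(b), nil
}

func main() {
	// "Exif\0\0" identifier followed by a big-endian TIFF header.
	data := []byte("Exif\x00\x00MM\x00\x2A\x00\x00\x00\x08")
	raw, err := tiffSlice(data, 6, 4, 4) // bytes 4-7 of the TIFF header
	if err != nil {
		fmt.Println(err)
		return
	}
	ifdOffset, err := safeUint32(raw)
	fmt.Println(ifdOffset, err)
}
```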
Final Thoughts
I set out to write a tiny Go script. I ended up decoding binary internals of images, learning the historical trauma of byte order, and writing the cleanest JSON file of my life.
Did it frustrate me? Yes. Did I almost blame the image file for my bad code? Absolutely. But now? I can parse EXIF data like a boss.
And just like life, the bytes eventually made sense.
Until I pick another overly ambitious weekend project…
Stay curious. Stay building.