SEC401 - Defense in Depth | Printable command sheet
Lab 2.2 - Data Loss Prevention

Data Security & DLP | SEC401 | Apr 2026

Scanned removable media for sensitive content using grep keyword matching, extracted document metadata with exiftool revealing author identity and SECRET classification, and geolocated a photo's origin from embedded GPS coordinates.

Tools: grep, exiftool, EXIF/GPS analysis, CLI

Commands

1. Scan removable media for sensitive keywords

Navigated to the mounted CDROM and used grep with a Perl-compatible regex to scan every file in the top-level directory for the words 'secret', 'confidential', or 'sensitive'. The -P flag enables PCRE, -a treats binary files as text, -i makes the search case-insensitive, and -l prints only the names of matching files rather than the matching lines. One file matched: 'Merger Offer Letter to Beta Industries.doc', a document that a keyword-based DLP system scanning for classification markers would be expected to flag.

cd /media/sec401/CDROM/
grep -Pail '(secret|confidential|sensitive)' *
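  • -P: Perl-compatible regex (PCRE); plain alternation with | would also work with -E
  • -a: treat binary files as text (useful for legacy .doc files; .docx files are ZIP archives, so their compressed body text is generally not visible to grep)
  • -i: case-insensitive matching
  • -l: print only filenames, not matching content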
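The scan above covers only the files matched by the * glob. A minimal sketch of a recursive variant follows; the function name scan_keywords is hypothetical, and -E is swapped in for -P because this pattern needs nothing beyond extended-regex alternation.

```shell
# Hypothetical helper: list every file under a directory (default: the
# current one) that contains any of the three keywords.
scan_keywords() {
    dir="${1:-.}"
    # -r: recurse into subdirectories
    # -E: extended regex (enough for alternation with |)
    # -a: treat binary files as text
    # -i: case-insensitive matching
    # -l: print only the names of matching files
    grep -rEail '(secret|confidential|sensitive)' "$dir"
}

# Usage (the mount point is the one used in this lab):
#   scan_keywords /media/sec401/CDROM/
```

Like the original command, this only sees text that is stored uncompressed in the file.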

2. Extract document metadata with exiftool

Ran exiftool against Bankruptcy.docx to extract all embedded metadata. Key findings: the document was created by Madison Jeffries, last modified by Jerry Jackson, and tagged with the keyword SECRET. Additional metadata shows it was created in Microsoft Office Word (App Version 16.0000) and has 358 words across 2 pages. The reported total edit time of 2,982,555 days (roughly 8,000 years) is clearly implausible, a reminder that metadata fields can be corrupt or manipulated and should be corroborated before being relied on. The Keywords field is particularly significant for DLP: classification markings such as SECRET, CONFIDENTIAL, or TOP SECRET are often stored there in government and corporate environments.

exiftool Bankruptcy.docx
  • exiftool: reads and writes metadata in files (EXIF, IPTC, XMP, Office XML)
  • Output: all metadata fields, including Creator, Keywords, and Last Modified By
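For a DLP-style check, only a few of those fields matter. The sketch below assumes exiftool's default "Tag Name : value" output layout; the helper name flag_marking and the list of markings it looks for are illustrative choices, not part of the lab.

```shell
# Hypothetical helper: read exiftool's default "Tag Name : value" output on
# stdin and print the Keywords value when it contains a classification
# marking. Exits 0 if a marking was found, 1 otherwise.
flag_marking() {
    awk -F' *: ' '
        $1 == "Keywords" && toupper($2) ~ /SECRET|CONFIDENTIAL/ { print $2; found = 1 }
        END { exit !found }
    '
}

# Usage (assumes exiftool is installed and the file is in the current directory):
#   exiftool -Creator -Keywords -LastModifiedBy Bankruptcy.docx | flag_marking
```

Passing tag names such as -Keywords limits exiftool's output to those fields, which keeps the parsing step small.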

3. Geolocate photo from GPS coordinates

Opened an image file and examined its EXIF properties, which revealed embedded GPS coordinates (GPSLatitude and GPSLongitude). Smartphones and GPS-enabled cameras commonly embed location data in the image file when location tagging is enabled. Extracting these coordinates allows the location where the photo was taken to be plotted on a map. This is a major privacy and security concern: employees may share photos taken inside sensitive facilities, whistleblowers may inadvertently reveal their location, and insiders may document assets before exfiltration. DLP policies should strip EXIF data from outbound images or flag files containing GPS metadata.
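EXIF viewers usually show coordinates as degrees, minutes, and seconds, while most mapping tools expect signed decimal degrees. A minimal sketch of the conversion follows; the function name dms_to_decimal is hypothetical, and the coordinates in the usage lines are sample values, not the ones from the lab image.

```shell
# Hypothetical helper: convert degrees/minutes/seconds plus a hemisphere
# reference (N, S, E, W) into signed decimal degrees.
dms_to_decimal() {
    # $1 degrees, $2 minutes, $3 seconds, $4 hemisphere reference
    awk -v d="$1" -v m="$2" -v s="$3" -v ref="$4" 'BEGIN {
        dec = d + m / 60 + s / 3600
        # South latitudes and west longitudes are negative by convention
        if (ref == "S" || ref == "W") dec = -dec
        printf "%.6f\n", dec
    }'
}

# Usage with sample values (not taken from the lab image):
#   dms_to_decimal 38 53 23 N    # prints 38.889722
#   dms_to_decimal 77 0 32 W     # prints -77.008889
```

The two results can be pasted as a "latitude, longitude" pair into a mapping tool.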

Key Findings

  • grep -Pail flagged 'Merger Offer Letter to Beta Industries.doc' containing sensitive keywords
  • exiftool revealed Bankruptcy.docx metadata: Creator (Madison Jeffries), Keywords (SECRET), Last Modified By (Jerry Jackson)
  • Image EXIF data contained embedded GPS coordinates revealing the photo's geographic origin
  • All three data leakage vectors (content, metadata, geolocation) were found on a single removable device

Security Controls

  • DLP policies scanning removable media for classification keywords
  • Endpoint controls restricting USB/CDROM write access
  • EXIF/GPS metadata stripping on outbound files
  • Document classification enforcement (mandatory marking)
  • Insider threat monitoring and behavioral analytics
  • Data-at-rest scanning for misplaced classified documents
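As an illustration of the metadata-stripping control, the sketch below flags a file whose exiftool output contains GPS tags. It assumes exiftool's default output, where GPS tag descriptions begin with "GPS "; the function name has_gps_tags and the file name photo.jpg are placeholders.

```shell
# Hypothetical gate: succeed (exit 0) when exiftool's default output, read
# from stdin, contains any tag whose description starts with "GPS ".
has_gps_tags() {
    grep -q '^GPS '
}

# Usage (assumes exiftool is installed; photo.jpg is a placeholder name):
#   if exiftool photo.jpg | has_gps_tags; then
#       echo "photo.jpg: GPS metadata present, strip or block before release"
#   fi
#
# One way to strip the GPS tags with exiftool itself (by default it keeps
# the unmodified file as photo.jpg_original):
#   exiftool -gps:all= photo.jpg
```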