Ethical Hacking Reconnaissance and Information Gathering

Reconnaissance is the first phase of every penetration test. Before an ethical hacker launches a single scan or exploit, they spend time gathering as much information as possible about the target. The more you know about a target before you touch it, the more focused and effective your attack becomes.

Think of a detective building a case file before making an arrest. The detective gathers photos, phone records, financial documents, and witness statements. By the time they make a move, they know exactly who they are dealing with and where the weak points are. Reconnaissance in hacking works the same way.

Passive vs Active Reconnaissance

Reconnaissance falls into two broad categories, and the distinction matters both technically and legally.

Passive Reconnaissance

Passive reconnaissance gathers information without directly probing the target's systems. The attacker uses publicly available sources such as websites, search engines, public databases, and social media. The target generally cannot detect passive reconnaissance, because most of these lookups go to third parties and never reach the target's servers.

Examples: reading a company's website (this does reach the target's web server, but it looks like any ordinary visit), searching LinkedIn for employee names and job titles, looking up domain registration records.

Active Reconnaissance

Active reconnaissance involves direct interaction with the target. The attacker sends network packets to the target's systems — for example, pinging servers or scanning for open ports. The target's firewall and intrusion detection systems can potentially log and flag this activity.

Examples: port scanning with Nmap, banner grabbing, DNS zone transfers.
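To make the idea of direct interaction concrete, here is a minimal banner-grabbing sketch in Python. The function name grab_banner and its defaults are illustrative choices, not part of any standard tool. It assumes the service announces itself on connect, as SSH, SMTP, and FTP servers usually do.

```python
import socket


def grab_banner(host: str, port: int, timeout: float = 3.0) -> str:
    """Connect to host:port and return whatever the service sends first.

    Services that wait for the client to speak first (HTTP, for example)
    send nothing, so this returns an empty string once the timeout expires.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            data = sock.recv(1024)
        except socket.timeout:
            return ""
    # Banners are usually ASCII; replace undecodable bytes instead of failing.
    return data.decode("utf-8", errors="replace").strip()
```

Calling grab_banner("127.0.0.1", 22) against a local SSH server would return its version string. Unlike the passive techniques, this opens a TCP connection to the host, so the connection can show up in the target's logs.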

Open-Source Intelligence (OSINT)

OSINT stands for Open-Source Intelligence. It refers to information gathered from publicly available sources — no hacking required. OSINT is the backbone of passive reconnaissance.

WHOIS Lookups

When a company registers a domain name, it submits contact information to a public registry. A WHOIS lookup retrieves this record and reveals:

  • Domain owner name and organization
  • Registrar (the company that sold the domain)
  • Registration and expiration dates
  • Name servers
  • Contact email addresses (sometimes masked by privacy protection)

Run a WHOIS lookup from the command line:

whois example.com
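The same lookup can be scripted. The sketch below, under stated assumptions, sends a raw WHOIS query (one line over TCP port 43, as RFC 3912 describes) and groups the "Key: Value" lines of the reply. The function names, the default server, and the SAMPLE text are illustrative. Real WHOIS output varies widely between registries, so treat the parser as a starting point.

```python
import socket
from collections import defaultdict


def query_whois(domain: str, server: str = "whois.iana.org", timeout: float = 5.0) -> str:
    """Send one query line to a WHOIS server and read until it closes the connection."""
    with socket.create_connection((server, 43), timeout=timeout) as sock:
        sock.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_whois(text: str) -> dict[str, list[str]]:
    """Collect 'Key: Value' lines; a key may repeat (name servers, for example)."""
    fields = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("%", "#", ">>>")):
            continue  # skip blank lines, comments, and footer notices
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip().lower()].append(value.strip())
    return dict(fields)


# Made-up reply used only to show the parser's behaviour.
SAMPLE = """\
Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Creation Date: 1995-08-14T04:00:00Z
Name Server: A.IANA-SERVERS.NET
Name Server: B.IANA-SERVERS.NET
"""
```

parse_whois(SAMPLE)["name server"] yields both name servers as a list, which is convenient input for the DNS enumeration step.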

DNS Enumeration

A company's DNS records reveal a map of its online infrastructure. Each record type tells you something different:

Record Type   What It Reveals                    Example
A             Domain → IPv4 address              example.com → 93.184.216.34
MX            Mail server address                mail.example.com (reveals email provider)
NS            Name server                        ns1.example.com (DNS infrastructure)
TXT           SPF, DMARC, domain verification    Reveals email security configuration
CNAME         Alias for another domain           www → example.com (infrastructure details)

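Tools such as dnsenum and dig retrieve DNS records. If a name server is misconfigured to allow zone transfers (AXFR) from any host, a single request dumps the entire DNS zone for a domain, revealing every subdomain at once. Because that request goes to the target's own name server, a zone transfer attempt counts as active reconnaissance.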
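As a rough illustration of collecting records by type, the following Python sketch calls dig with the +noall +answer options and groups the answer lines. It assumes dig is installed and on the PATH. The function names and the SAMPLE text are hypothetical and only show the line format being parsed.

```python
import subprocess
from collections import defaultdict


def run_dig(domain: str, record_type: str) -> str:
    """Run dig and return only the answer section of its output."""
    result = subprocess.run(
        ["dig", "+noall", "+answer", domain, record_type],
        capture_output=True, text=True, check=True, timeout=15,
    )
    return result.stdout


def parse_answers(dig_output: str) -> dict[str, list[str]]:
    """Group answer lines of the form 'name ttl class type rdata' by record type."""
    records = defaultdict(list)
    for line in dig_output.splitlines():
        if not line.strip() or line.startswith(";"):
            continue  # skip blank lines and dig comments
        parts = line.split(None, 4)
        if len(parts) == 5:
            _name, _ttl, _cls, rtype, rdata = parts
            records[rtype].append(rdata.strip())
    return dict(records)


# Illustrative answer lines, not real query results.
SAMPLE = """\
example.com.    3600  IN  A    93.184.216.34
example.com.    3600  IN  MX   10 mail.example.com.
example.com.    3600  IN  NS   ns1.example.com.
"""
```

In practice you might loop over record types, for instance parse_answers(run_dig("example.com", "MX")), and merge the results into one map of the infrastructure.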

Google Dorking

Google's search operators let you craft highly targeted searches that surface sensitive information. This technique is called Google Dorking or Google Hacking. Common operators include:

  • site:example.com — Lists all pages Google has indexed from a specific domain
  • filetype:pdf site:example.com — Finds all PDF files on a domain
  • intitle:"index of" — Finds open directory listings where files are publicly browsable
  • inurl:admin — Finds pages with "admin" in the URL
  • filetype:sql — Searches for exposed SQL database files

Google dorking finds accidentally exposed files, login panels, configuration files, and employee documents. All of them are indexed publicly by Google and can be discovered without touching the target's server; opening a result, however, sends a request to the target.
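The operators above combine naturally with site: to keep results on one domain. A small Python sketch that generates the queries and URL-encodes them might look like this. The helper names and the dictionary labels are invented for illustration.

```python
from urllib.parse import quote_plus


def build_dorks(domain: str) -> dict[str, str]:
    """Return the operators from the list above, each restricted to one domain."""
    return {
        "indexed pages": f"site:{domain}",
        "pdf files": f"filetype:pdf site:{domain}",
        "open directories": f'intitle:"index of" site:{domain}',
        "admin pages": f"inurl:admin site:{domain}",
        "sql files": f"filetype:sql site:{domain}",
    }


def search_url(query: str) -> str:
    """URL-encode a query so it can be pasted into a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)
```

Printing search_url(q) for each value of build_dorks("example.com") gives a checklist of links to open by hand. Running them manually avoids the rate limits and CAPTCHAs that automated querying tends to trigger.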

Shodan: The Search Engine for Devices

Shodan is a search engine that continuously scans the internet and indexes the banners (responses) from servers, cameras, routers, industrial control systems, and any other internet-connected device. Searching Shodan reveals:

  • All servers a company has exposed to the internet
  • Operating system and software versions running on those servers
  • Default login pages for cameras, routers, and IoT devices
  • Industrial control systems that should never be internet-facing

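A search like org:"Target Company Name" on Shodan returns the devices Shodan has attributed to that organization, without your sending a single packet to the organization's network. Because the filter matches the organization that owns the IP address, assets hosted with a cloud provider may not appear under the company's own name.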
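Once results are exported, a short script can turn them into a summary of exposed services. The Python sketch below is offline: it does not call Shodan. The field names (ip_str, port, product, version) are assumed to match the banner records Shodan returns, so check them against the official API documentation. SAMPLE_MATCHES is made-up data on documentation IP ranges.

```python
from collections import Counter


def org_query(org_name: str) -> str:
    """Build the org: filter, quoting the name so spaces are kept together."""
    return f'org:"{org_name}"'


def summarize(matches: list[dict]) -> Counter:
    """Count exposed services by (port, product, version)."""
    counts = Counter()
    for match in matches:
        key = (
            match.get("port"),
            match.get("product", "unknown"),
            match.get("version", "unknown"),
        )
        counts[key] += 1
    return counts


# Hypothetical records shaped like search results; not real hosts.
SAMPLE_MATCHES = [
    {"ip_str": "203.0.113.10", "port": 443, "product": "nginx", "version": "1.18.0"},
    {"ip_str": "203.0.113.11", "port": 443, "product": "nginx", "version": "1.18.0"},
    {"ip_str": "203.0.113.12", "port": 22, "product": "OpenSSH"},
]
```

summarize(SAMPLE_MATCHES).most_common() lists the most widespread service first, which helps decide which software versions to research for known vulnerabilities.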

theHarvester

theHarvester is an open-source tool, included with Kali Linux, that collects email addresses, domain names, hosts, employee names, open ports, and banners from multiple public sources in a single automated search. It queries search engines, PGP key servers, and other OSINT sources.

theHarvester -d example.com -b google

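This command collects information about example.com using Google as the data source. The set of supported data sources varies between theHarvester versions; run theHarvester -h to see which ones your installation accepts.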
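Harvested addresses become more useful once you know the naming convention behind them. The Python sketch below guesses a company's email format by comparing collected addresses with known employee names. The pattern labels, the function name, and the sample data are all hypothetical, and real organizations often mix several formats.

```python
from collections import Counter
from typing import Optional

# Candidate local-part formats, keyed by a readable label.
PATTERNS = {
    "first.last": lambda f, l: f"{f}.{l}",
    "flast": lambda f, l: f"{f[0]}{l}",
    "firstl": lambda f, l: f"{f}{l[0]}",
    "first": lambda f, l: f,
}


def infer_format(emails: list[str], names: list[tuple[str, str]]) -> Optional[str]:
    """Return the pattern that explains the most harvested addresses, if any."""
    local_parts = {e.split("@", 1)[0].lower() for e in emails if "@" in e}
    hits = Counter()
    for first, last in names:
        f, l = first.lower(), last.lower()
        for label, build in PATTERNS.items():
            if build(f, l) in local_parts:
                hits[label] += 1
    return hits.most_common(1)[0][0] if hits else None


# Made-up inputs for illustration.
EMAILS = ["jane.doe@example.com", "john.smith@example.com", "info@example.com"]
NAMES = [("Jane", "Doe"), ("John", "Smith")]
```

Here infer_format(EMAILS, NAMES) picks "first.last", because both named employees match that pattern.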

Social Media OSINT

LinkedIn, Twitter, Facebook, GitHub, and public job postings reveal an enormous amount of information useful to an ethical hacker:

  • LinkedIn — Reveals employee names, job titles, department structures, technology stacks ("Managed AWS infrastructure"), and hiring trends that indicate what technologies the company uses.
  • GitHub — Developers frequently commit code that contains API keys, passwords, internal IP addresses, and configuration files. Searching GitHub for a company's name or domain often surfaces leaked credentials.
  • Job Postings — A job listing for "Senior Engineer — Cisco ASA Firewall Management" tells you exactly what firewall vendor the company uses.
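Checking public code for leaked secrets can be partly automated. The Python sketch below scans text line by line with a few regular expressions. The pattern set is a small illustrative sample rather than a complete rule list, and the AWS key shown is the placeholder value from AWS documentation. Expect false positives, so confirm every hit by hand.

```python
import re

# A few illustrative patterns; dedicated secret scanners ship far larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    "password_assignment": re.compile(r"(?i)\bpassword\s*[:=]\s*['\"][^'\"]+['\"]"),
    "private_ipv4": re.compile(
        r"\b(?:10\.\d{1,3}|192\.168|172\.(?:1[6-9]|2\d|3[01]))\.\d{1,3}\.\d{1,3}\b"
    ),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line number, pattern label) for every line that matches a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings


# Made-up file contents for illustration.
SAMPLE = '''\
aws_key = "AKIAIOSFODNN7EXAMPLE"
db_host = "10.0.5.12"
password = "hunter2"
greeting = "hello"
'''
```

scan_text(SAMPLE) flags the first three lines and ignores the fourth. The same function could be run over each file of a cloned repository.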

Maltego: Visual Intelligence Mapping

Maltego is a graphical OSINT tool that visualizes relationships between entities — domains, IP addresses, organizations, email addresses, phone numbers, and social profiles. Instead of a list of data points, Maltego produces a link graph showing how everything connects.

An ethical hacker using Maltego might build a map showing: the company's primary domain → linked subdomains → IP addresses hosting those subdomains → hosting provider → email addresses associated with registration records → employee social media profiles. Each node in the graph is a lead to investigate, and some will turn out to be attack vectors.
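The link graph idea can be sketched without Maltego itself. The Python example below stores entities in an adjacency dictionary and walks them breadth-first from the primary domain. It illustrates the data structure only and says nothing about how Maltego works internally. Every entity in GRAPH is invented.

```python
from collections import deque


def reachable(graph: dict[str, list[str]], start: str) -> list[str]:
    """Breadth-first walk: list every entity linked, directly or not, to start."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order


# Hypothetical entities: domain -> subdomains and contacts -> IPs -> hosting provider.
GRAPH = {
    "example.com": ["www.example.com", "mail.example.com", "admin@example.com"],
    "www.example.com": ["203.0.113.10"],
    "mail.example.com": ["203.0.113.25"],
    "203.0.113.10": ["Hypothetical Hosting Ltd"],
    "203.0.113.25": ["Hypothetical Hosting Ltd"],
}
```

reachable(GRAPH, "example.com") visits the domain first, then its direct links, then the IP addresses, and finally the shared hosting provider. That order mirrors how an analyst expands a graph outward from one starting entity.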

Footprinting a Target: A Practical Workflow

A systematic reconnaissance workflow for a typical engagement follows this sequence:

  1. Identify the target scope — List all domains, IP ranges, and subsidiaries in scope per the signed authorization.
  2. WHOIS lookup — Retrieve domain registration details for every domain in scope.
  3. DNS enumeration — Map all subdomains, mail servers, and name servers.
  4. Google dorking — Search for exposed files, admin panels, and indexed sensitive information.
  5. Shodan search — Find all internet-facing devices and their software versions.
  6. theHarvester scan — Collect email addresses and employee names.
  7. LinkedIn and GitHub review — Identify technology stack and check for leaked credentials.
  8. Compile intelligence report — Organize findings into a structured document before moving to active scanning.
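Step 1 matters at every later step, because reconnaissance keeps surfacing hosts that may or may not be covered by the authorization. One way to guard against drifting out of scope is a small check like the Python sketch below. The function name and the scope values are illustrative. It assumes the scope is expressed as domain names and CIDR ranges.

```python
import ipaddress


def in_scope(target: str, domains: list[str], networks: list[str]) -> bool:
    """Return True if target is an in-scope IP address or an in-scope (sub)domain."""
    try:
        addr = ipaddress.ip_address(target)
    except ValueError:
        # Not an IP address: treat it as a hostname and match on domain suffix.
        host = target.lower().rstrip(".")
        return any(host == d or host.endswith("." + d) for d in domains)
    return any(addr in ipaddress.ip_network(n) for n in networks)


# Example scope as it might appear in an authorization document (made up).
SCOPE_DOMAINS = ["example.com"]
SCOPE_NETWORKS = ["203.0.113.0/24"]
```

With this scope, mail.example.com and 203.0.113.7 pass the check, while notexample.com and 198.51.100.1 do not. Matching on "." plus the domain, rather than a bare suffix, is what keeps look-alike names such as notexample.com out.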

Reconnaissance Diagram: The Intelligence Funnel

Think of reconnaissance as a funnel:

  • Wide top (broad search) — Start with everything publicly visible: WHOIS, DNS, Shodan, Google.
  • Middle (targeted search) — Focus on specific employees, technologies, and services identified in the first pass.
  • Narrow bottom (high-value intelligence) — Arrive at specific targets: vulnerable software versions, valid email formats, employee credentials, unpatched servers.

The more thorough the reconnaissance, the smaller and more precise the list of targets for the next phase — scanning and enumeration.

Key Points

  • Passive reconnaissance gathers information without touching the target's systems; active reconnaissance involves direct interaction.
  • WHOIS reveals domain ownership and contact information; DNS records map the target's online infrastructure.
  • Google dorking uses search operators to find accidentally exposed sensitive files and admin panels.
  • Shodan indexes internet-facing devices and reveals software versions and misconfigurations without sending packets to the target.
  • Social media and GitHub frequently leak technology choices, employee data, and even credentials.
  • A systematic reconnaissance workflow produces organized intelligence before any active testing begins.
