Information Gathering & Scanning for Sensitive Information [Reloaded]

  • Horizontal domain correlation [ finding more horizontal domains for the company / finding acquisitions ]
  • Subdomain enumeration / vertical domain correlation [ finding vulnerabilities/security issues and gathering the target's assets ]
  • ASN lookup [ discovering more of the company's assets using its ASN number ]
  • Target visualization / web screenshots [ also known as visual recon, to see what the target looks like and which features are visually available to test ]
  • Crawling & collecting pagelinks [ crawling the subdomains to gather the domain's links and URLs ]
  • JavaScript file crawling [ finding sensitive information such as API keys, auth keys, plaintext information, etc. ]
  • Parameter discovery [ scanning for injection-type vulnerabilities and other security issues ]
  • Subdomain CNAME extraction [ checking whether any domain points to a third-party service; we can later use this information for subdomain takeover ]
  • Domain/subdomain version and technology detection [ mapping the next vulnerability-scanning steps ]
  • Sensitive information discovery [ using search engines to find sensitive information about the target ]

Whois Lookup

To check for other websites registered by the same registrant (a reverse check on the registrant name, email address, and telephone number), and to investigate the target site in depth.

whois target.tld
whois $domain | grep "Registrant Email" | egrep -ho "[[:graph:]]+@[[:graph:]]+"

Horizontal domain correlation/acquisitions

Most of the time we focus on subdomains but skip the other half, horizontal domain correlation. Horizontal domain correlation is the process of finding other domain names that have a different second-level domain but are related to the same entity. [0xpatrik]

whois $domain | grep "Registrant Email" | egrep -ho "[[:graph:]]+@[[:graph:]]+"
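The registrant email extracted above is the pivot for correlating candidate domains. A minimal helper for that step, assuming the whois record uses the `Registrant Email` field (the function name is mine):

```shell
# Pull the registrant email out of whois output on stdin, so it can be
# compared against the whois records of candidate acquisition domains.
extract_registrant_email() {
  grep "Registrant Email" | grep -Eho "[[:graph:]]+@[[:graph:]]+"
}

# usage (network call, shown for context):
#   whois "$domain" | extract_registrant_email
```

Any candidate domain whose whois record contains the same email is likely owned by the same entity.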

Subdomain enumeration

After collecting acquisitions for our target company, the first step is to enumerate subdomains for the collected domains. The enumeration process can be divided in two: passive enumeration and active enumeration. In passive enumeration we collect subdomains from third-party sources using tools; these tools pull subdomains from sources like hackertarget, virustotal, threatcrowd, etc. In active enumeration we combine a wordlist with all our target domains to generate a permuted list, then resolve it to check which entries are live, valid subdomains of the target.

Passive enumeration

There are many tools available on the internet to gather subdomains passively.

agent="Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.85 Safari/537.36"
curl -s -A "$agent" "*.$domain&start=10" | grep -Po '((http|https):\/\/)?(([\w.-]*)\.([\w]*)\.([A-z]))\w+' | grep $domain | sort -u
curl -s -A "$agent" "*.$domain&start=20" | grep -Po '((http|https):\/\/)?(([\w.-]*)\.([\w]*)\.([A-z]))\w+' | grep $domain | sort -u
curl -s -A "$agent" "*.$domain&start=30" | grep -Po '((http|https):\/\/)?(([\w.-]*)\.([\w]*)\.([A-z]))\w+' | grep $domain | sort -u
curl -s -A "$agent" "*.$domain&start=40" | grep -Po '((http|https):\/\/)?(([\w.-]*)\.([\w]*)\.([A-z]))\w+' | grep $domain | sort -u
shodan init your_api_key #set your api key on client 
shodan domain domain.tld
curl -s "https://crt.sh/?q=%25.$domain&output=json" | jq -r '.[].name_value' | sed 's/\*\.//g' | sort -u
curl -s "" | grep -Po "(([\w.-]*)\.([\w]*)\.([A-z]))\w+" | sort -u
curl "" -s | grep -Po "(([\w.-]*)\.([\w]*)\.([A-z]))\w+"
curl -s "https://api.certspotter.com/v1/issuances?domain=$domain&include_subdomains=true&expand=dns_names" | jq .[].dns_names | tr -d '[]"\n ' | tr ',' '\n'
sed -ne 's/^\( *\)Subject:/\1/p;/X509v3 Subject Alternative Name/{N;s/^.*\n//;:a;s/^\( *\)\(.*\), /\1\2\n\1/;ta;p;q; }' < <(openssl x509 -noout -text -in <(openssl s_client -ign_eof 2>/dev/null <<<$'HEAD / HTTP/1.0\r\n\r' -connect $domain:443)) | grep -Po '((http|https):\/\/)?(([\w.-]*)\.([\w]*)\.([A-z]))\w+'
# download the cloudflare_enum.py script, then:
# 1. log in to CloudFlare and "Add site" to your account
# 2. provide the target domain as the site you want to add
# 3. wait for CloudFlare to dig through the DNS data and display the results
python cloudflare_enum.py your@email.com target.tld
go get -u github.com/tomnomnom/assetfinder # download assetfinder
assetfinder --subs-only domain.tld # enumerate the subdomains
subfinder -d domain.tld -silent # download subfinder first
# download findomain from its releases page
findomain -t target.tld -q
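Each passive source produces its own output file, so it helps to merge them into one deduplicated scope list before moving on. A small sketch (the filenames are assumptions about how you saved each tool's output):

```shell
# Merge per-tool output files into one lowercase, deduplicated subdomain list,
# stripping wildcard prefixes such as "*." that crt.sh entries often carry.
merge_passive() {
  cat "$@" 2>/dev/null | tr 'A-Z' 'a-z' | sed 's/^\*\.//' | sort -u
}

# usage:
#   merge_passive crtsh.txt subfinder.txt findomain.txt > passive-subs.txt
```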

Active enumeration

In the active enumeration process we first permute our collected domains with a wordlist. You can use any wordlist from SecLists or jhaddix's all.txt; to catch the most subdomains I suggest using a big wordlist, since more words permute into more candidate subdomains. There are tools, such as amass, that can do permutation and resolution at the same time, but amass has some issues, so we are going to skip it. I will show you the tools I use to generate and resolve a DNS wordlist.
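The permutation step itself is simple enough to sketch in plain shell: prefix every wordlist entry onto every collected domain (the file names are assumptions):

```shell
# Build resolution candidates: one "<word>.<domain>" line per combination.
permute() { # $1 = wordlist, $2 = file of collected domains
  while read -r word; do
    while read -r dom; do
      echo "${word}.${dom}"
    done < "$2"
  done < "$1"
}

# usage:
#   permute wordlist.txt domains.txt > perm.txt
```

Dedicated tools like goaltdns do the same thing much faster on large wordlists.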

cat passive-subs.txt perm.txt | sort -u | tee -a all-sub.txt
massdns -r resolvers.txt -t AAAA -w result.txt all-sub.txt
goaltdns -h target.tld -w all.txt | massdns -r resolvers.txt -t A -w results.txt -
nmap --script dns-brute --script-args dns-brute.threads=6 target.tld
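massdns writes answers as `name. TYPE answer` lines, so a short post-processing step turns its output file back into a clean list of names that actually resolved (function name is mine):

```shell
# Keep only the resolved names: first column, trailing dot stripped, deduplicated.
resolved_names() {
  awk '{print $1}' | sed 's/\.$//' | sort -u
}

# usage:
#   resolved_names < results.txt > alive-subs.txt
```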

ASN Lookup

There are many ways to find the ASN number of a company; the ASN will help us retrieve the target's internet assets. We can find a company's ASN using dig and whois, but most of the time these will give you a hosting provider's ASN number.
whois -h whois.cymru.com " -v $(dig +short target.tld | head -n 1)"

Target Visualize/Web-Screenshot

After enumerating subdomains/domains we need to visualize the targets to see what the user interface looks like, and above all whether a subdomain is leaking any important information or an exposed database. Sometimes domain/subdomain enumeration yields 2k-10k subdomains; visiting all of them manually is practically impossible, since it would take 30–40 hours. There are many tools that can screenshot every host in a subdomain list.

[download-eyewitness] ./EyeWitness -f subdomains.txt --web
[download-webscreenshot] pip3 install webscreenshot
webscreenshot -i subdomains.txt

Crawling & Collecting Pagelinks

A URL or pagelink contains a lot of information. Sometimes pagelinks contain parameters, endpoints leading to sensitive information disclosure, and so on. There are lots of tools available to crawl sites and collect pagelinks.
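Once the pagelinks are collected into a file (say `urls.txt`, produced by a crawler such as gau, waybackurls, or gospider — the filename and filters are my assumptions), two quick greps separate the interesting ones:

```shell
# URLs carrying parameters: candidates for injection testing.
urls_with_params() { grep '=' | sort -u; }

# URLs with extensions that often expose logic or leftover files.
interesting_exts() { grep -E '\.(php|aspx?|jsp|json|xml|bak|sql)(\?|$)' | sort -u; }

# usage:
#   urls_with_params < urls.txt > urls-with-params.txt
#   interesting_exts < urls.txt > interesting-urls.txt
```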

Javascript Files Crawling/Sensitive data extracting from js

This is not really separate from website crawling; we can gather JavaScript files with the same crawling tools.

gospider -s "https://target.tld/" --js --quiet
echo target.tld | gau | grep '\.js$' | httpx -status-code -mc 200 -content-type | grep 'application/javascript'
gau target.tld | grep "\.js" | sort -u
waybackurls target.tld | grep "\.js" | sort -u
cat subdomains | getJS --complete
[see-the-list] cat file.js | grep API_REGEX
cat file.js | grep -aoP "(?<=(\"|\'|\`))\/[a-zA-Z0-9_?&=\/\-\#\.]*(?=(\"|\'|\`))" | sort -u
cat file.js | ./extract.rb
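Alongside the endpoint regex above, a rough heuristic grep can flag secret-looking assignments in downloaded JS files. This pattern is illustrative, not an exhaustive key regex:

```shell
# Flag lines that look like hardcoded credentials: a key-ish name assigned
# a quoted string of 8+ token characters (double-quoted values only).
js_secrets() {
  grep -EiHo '(api[_-]?key|secret|token)"?[[:space:]]*[:=][[:space:]]*"[A-Za-z0-9_-]{8,}"' "$@"
}

# usage:
#   js_secrets *.js
```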

Parameter discovery

Web applications use parameters (or queries) to accept user input. We can test parameters for vulnerabilities such as XSS, SQLi, LFI, RCE, etc.

[download-arjun] pip3 install arjun
arjun -i subdomains.txt -m GET -oT param.txt # for multiple targets
arjun -u https://target.tld -m GET -oT param.txt # for a single target
# [-m] request method, [-oT] text-format output; see more options with arjun -h
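Arjun brute-forces hidden parameters; a cheap complement is to harvest parameter names already visible in the URLs collected during crawling (here `urls.txt` is assumed to come from that step):

```shell
# Extract parameter names from query strings and deduplicate them.
param_names() {
  grep -oE '[?&][A-Za-z0-9_]+=' | tr -d '?&=' | sort -u
}

# usage:
#   param_names < urls.txt > params.txt
```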

Subdomain Cname extraction

Extracting the CNAME of each subdomain is useful to see whether any of the subdomains point to another hosting/cloud service, so that we can later test for subdomain takeover.

dig CNAME sub.target.tld +short
cat subdomains.txt | xargs -P10 -n1 dig CNAME +short
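The extracted CNAMEs can then be filtered for targets that point at common third-party platforms where takeovers have historically occurred (the service list below is illustrative, not complete):

```shell
# Keep only CNAME targets on well-known hosted platforms.
third_party_cnames() {
  grep -Ei '(github\.io|herokuapp\.com|amazonaws\.com|azurewebsites\.net|cloudfront\.net)'
}

# usage:
#   cat subdomains.txt | xargs -P10 -n1 dig +short CNAME | third_party_cnames
```

A hit here is only a lead; each service still has to be checked manually for a dangling, claimable resource.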

Domain/Subdomain Version and technology detection

It is important to scan the domains/subdomains for versions and technologies so that we can build a model for vulnerability detection and decide how to approach the target site.

[install-wappalyzer] npm i -g wappalyzer 
wappalyzer https://target.tld # single domain
cat subdomain.txt | xargs -P1 -n1 wappalyzer | tee -a result
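Wappalyzer emits JSON; for a quick flat list of detected technology names, a grep/cut one-liner works without extra dependencies (this assumes the CLI's output contains `"name":"..."` entries, as recent versions do):

```shell
# Crude JSON scrape: pull every "name" value from wappalyzer output.
tech_names() {
  grep -oE '"name":"[^"]*"' | cut -d'"' -f4 | sort -u
}

# usage:
#   tech_names < result
```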

Sensitive information discovery

Some targets unintentionally expose sensitive information such as database dumps, site backups, database backups, debug-mode leaks, etc. Sometimes search engines like Google, Shodan, ZoomEye, and LeakIX index some of the site's sensitive information or leaks.

cat subdomains | xargs -P1 -n1 -I{} ffuf -w backup.txt -mc 200,403 -u "https://{}/FUZZ"
site:target.tld ext:doc | ext:docx | ext:odt | ext:rtf | ext:sxw | ext:psw | ext:ppt | ext:pptx | ext:pps | ext:csv
site:target.tld intitle:index.of
site:target.tld ext:xml | ext:conf | ext:cnf | ext:reg | ext:inf | ext:rdp | ext:cfg | ext:txt | ext:ora | ext:ini | ext:env

Shodan dorks:

html:"db_uname:" port:"80" http.status:200 # finds assets exposing db_uname: with a 200 status response
http.html:/dana-na/ # finds Pulse VPN hosts, possible CVE-2019-11510
html:"horde_login" # finds Horde Webmail, possible CVE-2018-19518
# the two queries above can also be repeated with the product filter, e.g. product:"Pulse Secure"
http.html:"* The wp-config.php creation script uses this file" # finds exposed wp-config.php files with possibly sensitive credentials



Joy Ghosh

Security Researcher | CTF Player | Web Application Pen-tester | Programmer