How to Extract Emails from Websites

Why Websites Contain Email Addresses

Websites are one of the richest sources of publicly shared email addresses. Organizations and individuals publish their contact information in many places across a website:

Contact pages – dedicated contact sections almost always list one or more email addresses for inquiries, support, or sales.
Footer sections – many websites include a general contact email in the site-wide footer, visible on every page.
About and team pages – company team pages often list individual email addresses for each team member or department.
Job listings – career pages and job postings frequently include an HR or hiring manager email for applications.
Legal and impressum pages – in many countries (especially in Europe), businesses are legally required to publish contact details including email on their impressum or legal notice page.
Blog posts and articles – author bios, guest contributor credits, and press releases often contain email addresses.

Manually browsing every page of a website to collect email addresses is tedious, and it is easy to miss addresses along the way. The methods below automate this process.

Method 1: Copy-Paste from a Website

The simplest approach requires no tools or programming knowledge. It works for any web page you can view in a browser.

Navigate to the web page containing email addresses.
Select all visible text with Ctrl+A (Windows/Linux) or Cmd+A (macOS).
Copy the selected text with Ctrl+C / Cmd+C.
Go to extract-emails.com and paste the text into the input field.
The tool instantly identifies and lists every email address found in the pasted text.

Limitation: This method only captures text visible on the current page. It does not follow links to other pages, and email addresses hidden in the HTML source (e.g., mailto: links behind buttons) may not be included.

Method 2: Use Our Browser-Based Tool (Recommended)

Our tool at extract-emails.com can process large blocks of text pasted from one or multiple web pages. For the best results:

Visit the pages you want to extract emails from.
Use Ctrl+U (View Source) to access the full HTML source code, which may contain email addresses not visible on the rendered page.
Copy the source and paste it into our tool.
The tool strips all HTML tags and applies a regex pattern to find every email address, including those in mailto: links, meta tags, and JavaScript variables.
Results are deduplicated and displayed instantly. Copy them or download as a file.
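
The processing described in the steps above can be approximated in a few lines of Python. This is only a sketch of the general idea, not the tool's actual code. The function name emails_from_source and the sample markup are hypothetical, and the sketch scans the raw source after decoding HTML entities (instead of stripping tags first) so that addresses inside attributes such as mailto: links are kept.

```python
import html
import re

EMAIL_PATTERN = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

def emails_from_source(source: str) -> list[str]:
    """Return unique addresses found in raw HTML source, in first-seen order."""
    decoded = html.unescape(source)  # turns entities such as &#64; back into "@"
    seen = {}
    for match in EMAIL_PATTERN.findall(decoded):
        # Key on the lowercased form so "A@x.com" and "a@x.com" count once
        seen.setdefault(match.lower(), match)
    return list(seen.values())

# Hypothetical page source, standing in for what Ctrl+U would show
sample = """
<a href="mailto:sales@example.com?subject=Hi">Contact sales</a>
<meta name="author" content="Editor@Example.com">
<script>var support = "support&#64;example.com";</script>
<p>Write to sales@example.com</p>
"""
print(emails_from_source(sample))
```

Keeping the first-seen spelling while comparing in lowercase is a design choice for this sketch; a stricter pipeline might normalize everything to lowercase instead.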

Privacy: All processing happens locally in your browser. No data is sent to any server.

Method 3: Python with Beautiful Soup

For developers who need to extract emails from multiple pages or automate the process, Python with the Beautiful Soup library is the standard approach.

Basic Single-Page Extraction

Install the dependencies:

pip install beautifulsoup4 requests

Then extract emails from a URL:

import re
import requests
from bs4 import BeautifulSoup

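def extract_emails_from_url(url):
    """Return a sorted list of unique email addresses found on one page."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # stop on 4xx/5xx instead of parsing an error page
    soup = BeautifulSoup(response.text, "html.parser")

    # Get the page text; the space separator keeps an address from
    # running into the text of a neighbouring element
    text = soup.get_text(" ")

    # Also check mailto: links (drop the scheme and any ?subject=... query)
    for link in soup.find_all("a", href=True):
        href = link["href"]
        if href.lower().startswith("mailto:"):
            text += " " + href[len("mailto:"):].split("?")[0]

    pattern = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
    return sorted(set(re.findall(pattern, text)))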

# Example usage
emails = extract_emails_from_url("https://example.com/contact")
for email in emails:
    print(email)
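
A mailto: link can carry more than one recipient and a query string (for example mailto:a@example.com,b@example.com?subject=Hi), and its characters may be percent-encoded. The following is a minimal sketch of parsing such links with the standard library. The helper name emails_from_mailto is an assumption made for illustration and is not part of the script above.

```python
from urllib.parse import unquote, urlsplit

def emails_from_mailto(href: str) -> list[str]:
    """Split a mailto: href into its individual recipient addresses."""
    parts = urlsplit(href.strip())
    if parts.scheme != "mailto":  # urlsplit lowercases the scheme
        return []
    # parts.path holds the recipients; ?subject=... ends up in parts.query.
    # Split on commas first, then decode, so an encoded comma is not a separator.
    recipients = [unquote(r).strip() for r in parts.path.split(",")]
    return [r for r in recipients if "@" in r]

print(emails_from_mailto("mailto:a@example.com,b@example.com?subject=Hi"))
```

The returned strings could then be passed through the same regex as the page text if stricter validation is wanted.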

Crawling Multiple Pages

To extract emails from an entire website, you can follow internal links and visit each page:

import re
import time
import requests
from bs4 import BeautifulSoup
from urllib.parse import urldefrag, urljoin, urlparse

def crawl_and_extract(start_url, max_pages=50, delay=1.0):
    """Breadth-first crawl of one domain, collecting unique email addresses."""
    domain = urlparse(start_url).netloc
    visited = set()
    to_visit = [start_url]
    all_emails = set()
    pattern = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'

    while to_visit and len(visited) < max_pages:
        url = to_visit.pop(0)
        if url in visited:
            continue
        visited.add(url)

        try:
            response = requests.get(url, timeout=10)
            soup = BeautifulSoup(response.text, "html.parser")

            # Extract emails from text
            page_emails = set(re.findall(pattern, soup.get_text()))

            for link in soup.find_all("a", href=True):
                href = link["href"]
                if href.lower().startswith("mailto:"):
                    # Run the same pattern over the href so empty or
                    # malformed mailto: values are never added
                    page_emails.update(re.findall(pattern, href.split("?")[0]))
                else:
                    # Drop #fragments so the same page is not queued twice
                    full_url = urldefrag(urljoin(url, href)).url
                    if urlparse(full_url).netloc == domain and full_url not in visited:
                        to_visit.append(full_url)

            all_emails.update(page_emails)
            print(f"Visited: {url} ({len(page_emails)} emails)")
        except Exception as e:
            print(f"Error: {url} - {e}")

        time.sleep(delay)  # pause between requests to avoid overloading the server

    return sorted(all_emails)

emails = crawl_and_extract("https://example.com")
print(f"\nTotal unique emails: {len(emails)}")
for email in emails:
    print(email)

Legal and Ethical Considerations

Before extracting email addresses from websites, consider these important points:

Check robots.txt: Many websites specify crawling rules in their robots.txt file. Respect these directives, especially for automated scraping.
Rate limiting: If crawling multiple pages, add delays between requests (e.g., 1–2 seconds) to avoid overloading the server.
GDPR compliance: Under the GDPR, the fact that an email address is publicly visible does not automatically give you the right to use it for marketing. You still need a lawful basis for processing. See our GDPR guide for details.
Terms of service: Many websites prohibit automated scraping in their terms of service. Review these before crawling.
Distinguish extraction from harvesting: Extracting emails from a website you own or have permission to access is different from mass-harvesting emails across the internet. The latter is prohibited in many jurisdictions.
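
The robots.txt and rate-limiting points above can be sketched with Python's standard library. The user-agent string, the helper names load_rules and polite_urls, and the sample robots.txt content are all assumptions made for illustration. The sketch parses rules from a string so it runs offline.

```python
import time
from urllib.robotparser import RobotFileParser

USER_AGENT = "email-extractor-example"  # hypothetical name; use your own

def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

def polite_urls(urls, rules, default_delay=1.5, sleep=time.sleep):
    """Yield only URLs robots.txt allows, pausing before each one after the first."""
    # Prefer the site's own Crawl-delay when it declares one
    delay = rules.crawl_delay(USER_AGENT) or default_delay
    first = True
    for url in urls:
        if not rules.can_fetch(USER_AGENT, url):
            continue  # disallowed by robots.txt, so skip it
        if not first:
            sleep(delay)
        first = False
        yield url

# Hypothetical robots.txt content and candidate URLs
sample_robots = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""
rules = load_rules(sample_robots)
candidates = [
    "https://example.com/contact",
    "https://example.com/private/staff",
    "https://example.com/about",
]
# The no-op sleep keeps this demonstration instant; omit it in a real crawl
allowed = list(polite_urls(candidates, rules, sleep=lambda seconds: None))
print(allowed)
```

In a real crawl, RobotFileParser can download the file itself via set_url() followed by read(), and the generator would wrap the list of URLs that the crawler is about to request.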

Tips for Best Results

Check the HTML source. Many email addresses are hidden behind buttons or forms. Viewing the page source (Ctrl+U) often reveals addresses not visible on the rendered page.
Look for obfuscated emails. Some websites protect email addresses using JavaScript encoding, HTML entities, or formats like “name [at] domain [dot] com”. These require additional processing beyond standard regex.
Focus on relevant pages. Contact, about, team, and impressum pages are the most likely to contain email addresses. Start there before crawling the entire site.
Validate your results. Not every string matching an email pattern is a real address. Filter out obvious placeholders like example@example.com or noreply@domain.com.
Remove duplicates. Large websites often repeat the same email address on multiple pages. The tool and the scripts in this guide deduplicate their results automatically; a manual search does not.
Use our tool for quick results. If you just need emails from a handful of pages, copy-pasting text or HTML source into our tool is faster than writing a script.
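
Two of the tips above, handling obfuscated addresses and filtering placeholders, lend themselves to a short Python sketch. It covers only HTML entities and bracketed "at" / "dot" spellings; JavaScript-encoded addresses need a rendered page. The placeholder lists and the function names are hypothetical and should be adapted to your own data.

```python
import html
import re

EMAIL_PATTERN = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")
# Matches "[at]", "(at)" or "{at}" (same for "dot"), with optional spaces around
AT_PATTERN = re.compile(r"\s*[\[({]\s*at\s*[\])}]\s*", re.IGNORECASE)
DOT_PATTERN = re.compile(r"\s*[\[({]\s*dot\s*[\])}]\s*", re.IGNORECASE)

# Hypothetical filter lists; adjust them to your own data
PLACEHOLDER_DOMAINS = {"example.com", "example.org", "domain.com"}
PLACEHOLDER_LOCAL_PARTS = {"noreply", "no-reply", "donotreply"}

def deobfuscate(text: str) -> str:
    """Undo HTML entities and bracketed at/dot spellings."""
    text = html.unescape(text)
    text = AT_PATTERN.sub("@", text)
    return DOT_PATTERN.sub(".", text)

def is_placeholder(email: str) -> bool:
    local, _, domain = email.lower().rpartition("@")
    return domain in PLACEHOLDER_DOMAINS or local in PLACEHOLDER_LOCAL_PARTS

def clean_emails(text: str) -> list[str]:
    """Extract, lowercase, deduplicate, and drop obvious placeholders."""
    found = EMAIL_PATTERN.findall(deobfuscate(text))
    unique = sorted({email.lower() for email in found})
    return [email for email in unique if not is_placeholder(email)]

sample = (
    "Reach jane [at] acme-corp [dot] de or "
    "press&#64;acme-corp.de. "
    "Do not use example@example.com or noreply@acme-corp.de."
)
print(clean_emails(sample))
```

Only bracketed forms are rewritten, so ordinary uses of the words "at" and "dot" in running text are left alone.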

Method 4: Using Browser Developer Tools

Your browser’s built-in developer tools offer a fast, no-install way to extract email addresses from any web page — including pages that load content dynamically via JavaScript.

Inspect the Network Tab

Open the page in Chrome or Firefox and press F12 to open DevTools.
Go to the Network tab and reload the page (Ctrl+R).
Filter requests by XHR / Fetch to see API calls. Contact data — including email addresses — is often returned in these JSON responses rather than in the initial HTML.
Click any request that looks relevant and inspect its Response tab. Use Ctrl+F to search for the @ character.
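
When a response is large, it can be easier to copy the JSON out of the Response tab and search it with a script. Below is a minimal sketch, assuming the body is valid JSON. The function name emails_in_json and the sample payload are made up for illustration.

```python
import json
import re

EMAIL_PATTERN = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

def emails_in_json(value) -> set[str]:
    """Recursively collect email addresses from every string in parsed JSON."""
    found = set()
    if isinstance(value, str):
        found.update(EMAIL_PATTERN.findall(value))
    elif isinstance(value, dict):
        for key, item in value.items():
            found |= emails_in_json(key) | emails_in_json(item)
    elif isinstance(value, list):
        for item in value:
            found |= emails_in_json(item)
    return found  # numbers, booleans and null contribute nothing

# Hypothetical response body copied from the Response tab
response_body = """
{"team": [
  {"name": "Ada", "contact": {"email": "ada@example.org"}},
  {"name": "Lin", "bio": "Reach me at lin@example.org", "age": 41}
], "support": "help@example.org"}
"""
print(sorted(emails_in_json(json.loads(response_body))))
```

Parsing with json.loads before searching means escaped characters in the payload are decoded first, which a plain text search over the raw body would miss.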

Search the Full Page Source

Press Ctrl+U (View Page Source) to open the raw HTML in a new tab, then use Ctrl+F and search for @ or mailto. This finds addresses buried in script blocks, hidden inputs, and HTML comments that don’t appear on the rendered page.

For a faster workflow, copy the entire page source and paste it into our email extractor — it scans the full HTML instantly and returns a clean, deduplicated list.

Handling JavaScript-Rendered Pages

Single-page apps (React, Vue, Angular) and infinite-scroll feeds build their HTML in the browser instead of receiving it fully formed from the server. A plain HTTP request to the URL returns an almost-empty HTML shell; the email addresses appear only after JavaScript executes.

Two common solutions:

Headless browsers (Puppeteer / Playwright): These tools control a real browser programmatically (Chromium in Puppeteer's default setup), wait for the page to fully render, and then let you read the complete DOM. A Puppeteer snippet to grab rendered HTML looks like this:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com/team', { waitUntil: 'networkidle2' });
  const html = await page.content();
  const emails = [...html.matchAll(/[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}/g)]
    .map(m => m[0]);
  console.log([...new Set(emails)].join('\n'));
  await browser.close();
})();

Selenium (Python/Java): An older but widely supported alternative that works with Chrome, Firefox, and Edge via WebDriver.
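
For readers who prefer to stay in Python, the same idea can be sketched with Selenium. This assumes the selenium package (version 4 or later) and a local Chrome installation. The function names are hypothetical, and waiting for document.readyState is only a simple heuristic: pages that fetch contact data after load may need an explicit wait for a specific element.

```python
import re

EMAIL_PATTERN = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

def extract_emails(html_text: str) -> list[str]:
    """Return the sorted, unique addresses found in a block of HTML."""
    return sorted(set(EMAIL_PATTERN.findall(html_text)))

def fetch_rendered_html(url: str, wait_seconds: float = 10.0) -> str:
    """Load a page in headless Chrome and return the DOM after scripts have run."""
    # Imported here so extract_emails() stays usable without Selenium installed
    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until the document reports that it has finished loading
        WebDriverWait(driver, wait_seconds).until(
            lambda d: d.execute_script("return document.readyState") == "complete"
        )
        return driver.page_source
    finally:
        driver.quit()  # always release the browser process

# Usage (requires `pip install selenium` and a local Chrome):
# print(extract_emails(fetch_rendered_html("https://example.com/team")))
```

Keeping extraction separate from fetching means the regex step can be reused on HTML obtained any other way, such as a saved page source.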

For occasional lookups, the DevTools approach is quicker. For recurring scraping of JS-heavy sites, a headless browser script is the reliable choice.

Frequently Asked Questions

Can I extract emails from a website I don’t own? Technically yes, if the addresses are publicly visible. Legally, it depends on the site’s terms of service, your jurisdiction, and your intended use. Always check the ToS and ensure you have a lawful basis before using extracted addresses for marketing.
What is the fastest method for a single page? View the page source (Ctrl+U), copy it all, and paste it into our tool. Results appear in under a second.
Why does my Python script find no emails when I can see them on the page? The page is most likely JavaScript-rendered. The requests library only fetches the HTML the server sends; it does not execute JavaScript. Switch to Puppeteer, Playwright, or Selenium to get the fully rendered DOM.
How do I avoid getting blocked while crawling? Rotate user-agent strings, add 1–3 second delays between requests, respect robots.txt disallow rules, and avoid crawling the same domain from multiple threads simultaneously.

Extract Emails from Any Website Now

Paste text or HTML source from any web page – our free tool finds every email address instantly, right in your browser.


About the Author

Daniel Dorfer worked for nearly four years in technical support at GMX, one of Germany’s largest email providers, and for almost two years at united domains, a leading domain hoster and registrar. He is a founding member of the KIBC (KI Business Club). This website was built entirely with the help of Claude Code (Opus 4.6) by Anthropic.