📌 Processing `sitemap.xml` with Bash

This Bash script extracts only the links ending in **.html** or **.php** from `sitemap.xml` and generates a styled `sitemap.html` page that lists them as a numbered set of links.

📝 Bash Script


#!/bin/bash
# Disable history expansion to prevent issues with special characters
set +H

# Input and output files
SITEMAP_FILE="sitemap.xml"
OUTPUT_FILE="sitemap.html"

# Check if sitemap.xml exists
if [[ ! -f "$SITEMAP_FILE" ]]; then
  echo "❌ Error: sitemap.xml file not found!"
  exit 1
fi

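# Extract only links ending in .html or .php (needs GNU grep with PCRE support for -P).
# The dot is escaped so it matches a literal "." and the lookahead requires the
# extension to sit directly before the closing </loc> tag.
URLS=$(grep -oP '<loc>\K(https?://[^<]+\.(html|php))(?=\s*</loc>)' "$SITEMAP_FILE")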

# Check if any valid links were found
if [[ -z "$URLS" ]]; then
  echo "❌ Error: No .html or .php links found in the sitemap."
  exit 1
fi

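# Function to format page names: strip the scheme, host, and extension,
# then turn hyphens into spaces and capitalize each word
format_page_name() {
  local url="$1" url_display formatted_name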
  url=${url#http://}
  url=${url#https://}
  url=${url#*/}
  url_display=${url%.html}
  url_display=${url_display%.php}
  formatted_name=$(echo "$url_display" | sed 's/-/ /g' | awk '{for(i=1;i<=NF;i++) $i=toupper(substr($i,1,1)) tolower(substr($i,2));}1')
  echo "$formatted_name"
}

# Generate output HTML
{
  echo "<!DOCTYPE html>"
  echo "<html lang='en'>"
  echo "<head>"
  echo "  <meta charset='UTF-8'>"
  echo "  <meta name='viewport' content='width=device-width, initial-scale=1.0'>"
  echo "  <meta name='robots' content='index, follow'>"
  echo "  <meta name='keywords' content='sitemap, website navigation, SEO, index page'>"
  echo "  <meta name='description' content='This is the sitemap page, providing users quick access to all sections of the website.'>"
  echo "  <title>Website Sitemap</title>"
  echo "  <meta property='og:title' content='Website Sitemap'>"
  echo "  <meta property='og:description' content='A structured list of all pages on this website to help with navigation.'>"
  echo "  <meta name='author' content='Mir Ali Shahidi'>"
  echo "  <style>"
  echo "    body { font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif; background-color: #f8f9fa; color: #343a40; margin: 0; padding: 20px; line-height: 1.6; }"
  echo "    ul { list-style: none; margin: 20px 0; padding: 0 20px; list-style-position: inside; counter-reset: list-item; background-color: #fff; border-left: 2px solid #007bff; border-radius: 5px; }"
  echo "    ul li { margin-bottom: 10px; padding: 10px; border-bottom: 1px solid #dee2e6; counter-increment: list-item; }"
  echo "    ul li::before { content: counter(list-item) '. '; font-weight: bold; color: #007bff; margin-right: 10px; }"
  echo "    ul li:last-child { border-bottom: none; }"
  echo "    a { color: #007bff; text-decoration: none; font-weight: 500; transition: color 0.3s ease, text-decoration 0.3s ease; }"
  echo "    a:hover { color: #0056b3; text-decoration: underline; }"
  echo "  </style>"
  echo "</head>"
  echo "<body>"
  echo "  <header>"
  echo "    <h1>Website Sitemap</h1>"
  echo "  </header>"
  echo "  <ul>"

  while IFS= read -r url; do
    page_name=$(format_page_name "$url")
    echo "    <li><a href='$url'>$page_name</a></li>"
  done <<< "$URLS"

  echo "  </ul>"
  echo "</body>"
  echo "</html>"
} > "$OUTPUT_FILE"

echo "✅ HTML file successfully created: $OUTPUT_FILE"
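
The naming logic is easier to follow when tried in isolation. The sketch below is a standalone copy of `format_page_name` with its working variable kept local. The `example.com` URLs are made up for illustration and are not part of the sitemap.

```bash
#!/bin/bash
# Standalone copy of the page-name formatter, for experimenting outside the main script.
format_page_name() {
  local url="$1" name
  url=${url#http://}     # drop the scheme, if present
  url=${url#https://}
  url=${url#*/}          # drop the host name, keeping the path
  name=${url%.html}      # drop the extension
  name=${name%.php}
  # Hyphens become spaces, then each word gets an initial capital.
  echo "$name" | sed 's/-/ /g' \
    | awk '{for(i=1;i<=NF;i++) $i=toupper(substr($i,1,1)) tolower(substr($i,2));}1'
}

# Hypothetical URLs, used only to show the transformation.
format_page_name "https://www.example.com/about-us.html"
format_page_name "https://www.example.com/login.php"
```

This prints `About Us` and `Login`. Because only the host is stripped, a URL with subdirectories keeps its path in the displayed name.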


🔹 Sample `sitemap.xml` (Input)


<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url><loc>https://www.miralishahidi.ir/about.html</loc></url>
    <url><loc>https://www.miralishahidi.ir/contact.html</loc></url>
    <url><loc>https://www.miralishahidi.ir/login.php</loc></url>
    <url><loc>https://www.miralishahidi.ir/dashboard.php</loc></url>
    <url><loc>https://www.miralishahidi.ir/image.png</loc></url> 
</urlset>
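
To see which entries pass the filter, the extraction step can be run on this sample by itself. This is a minimal sketch: `extract_page_urls` is a hypothetical helper name, it reads the XML from standard input, and it assumes GNU grep built with PCRE support (`-P`). The pattern is a strict variant with the dot escaped and the extension required to sit right before `</loc>`.

```bash
#!/bin/bash
# Hypothetical helper: print every <loc> URL that ends in .html or .php.
extract_page_urls() {
  grep -oP '<loc>\K(https?://[^<]+\.(html|php))(?=\s*</loc>)'
}

# The sample sitemap from above, held in a variable for the demo.
SAMPLE='<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url><loc>https://www.miralishahidi.ir/about.html</loc></url>
    <url><loc>https://www.miralishahidi.ir/contact.html</loc></url>
    <url><loc>https://www.miralishahidi.ir/login.php</loc></url>
    <url><loc>https://www.miralishahidi.ir/dashboard.php</loc></url>
    <url><loc>https://www.miralishahidi.ir/image.png</loc></url>
</urlset>'

extract_page_urls <<< "$SAMPLE"
```

The four `.html` and `.php` URLs are printed one per line, and the `image.png` entry is skipped.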
    

🔹 Output `sitemap.html` (link list only)


<ul>
    <li><a href='https://www.miralishahidi.ir/about.html'>About</a></li>
    <li><a href='https://www.miralishahidi.ir/contact.html'>Contact</a></li>
    <li><a href='https://www.miralishahidi.ir/login.php'>Login</a></li>
    <li><a href='https://www.miralishahidi.ir/dashboard.php'>Dashboard</a></li>
</ul>
    

🚀 Running the Script on Termux or Linux


bash script.sh
    

📌 Important Notes