Understanding Uniform Resource Locators (URLs): Structure and Components

Learn about the structure and function of Uniform Resource Locators (URLs). This guide explains the different parts of a URL, including scheme, domain, path, and query parameters, and how they work together to identify resources on the internet.




What is a URL?

A URL (Uniform Resource Locator), commonly known as a web address, identifies a resource (such as a webpage, image, or file) on the internet. Web browsers use URLs to request resources from web servers. A URL usually names its server with a domain name made of words (e.g., `w3schools.com`), which is easier to remember than a numerical Internet Protocol (IP) address (e.g., `192.168.1.1`). Behind the scenes, the Domain Name System (DNS) translates the domain name into the IP address that is used to route traffic.

URL Structure and Syntax

A URL typically follows this format:

scheme://prefix.domain:port/path/filename

Let's break down each part:

  • scheme: The protocol used to access the resource (e.g., http, https, ftp). http is the standard scheme for web pages; https is its secure version, which encrypts the connection.
  • prefix: A subdomain (often `www`).
  • domain: The domain name (e.g., `tutorialsarena.com`).
  • port: The port number on the server. It is usually omitted, in which case the default is used (80 for http, 443 for https).
  • path: The path to the resource on the server (omitted if the resource is in the root directory).
  • filename: The name of the document or resource.
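As a rough illustration of how these parts fit together, the sketch below assembles a URL from its components using Python's standard `urllib.parse.urlunsplit`. The helper name `build_url` and its parameters are hypothetical and simply mirror the labels used in the list above.

```python
from urllib.parse import urlunsplit

def build_url(scheme, prefix, domain, path, filename, port=None):
    """Assemble scheme://prefix.domain:port/path/filename (hypothetical helper)."""
    # The prefix (subdomain) is optional.
    host = f"{prefix}.{domain}" if prefix else domain
    # The port is written only when one is given explicitly.
    netloc = f"{host}:{port}" if port is not None else host
    # Join path and filename with exactly one slash between the pieces.
    pieces = [p.strip("/") for p in (path, filename) if p.strip("/")]
    full_path = "/" + "/".join(pieces)
    # urlunsplit takes (scheme, netloc, path, query, fragment).
    return urlunsplit((scheme, netloc, full_path, "", ""))

url = build_url("https", "www", "tutorialsarena.com", "/page/", "index.html")
print(url)
```

Leaving `port` unset keeps the port out of the URL, which matches the common case where the scheme's default port is used.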

For example, in https://www.tutorialsarena.com/page/index.html:

  • scheme: https
  • prefix: www
  • domain: tutorialsarena.com
  • path: /page/
  • filename: index.html
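The same breakdown can be reproduced programmatically. The following is a minimal sketch using Python's standard `urllib.parse.urlsplit`; the function name `describe_url` is hypothetical, and treating the first host label as the prefix is a simplifying assumption that will not hold for every domain.

```python
import posixpath
from urllib.parse import urlsplit

def describe_url(url):
    """Split a URL into the parts named in this guide (hypothetical helper)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Simplifying assumption: with three or more labels, the first is the prefix.
    labels = host.split(".")
    if len(labels) > 2:
        prefix, domain = labels[0], ".".join(labels[1:])
    else:
        prefix, domain = "", host
    # Separate the directory part of the path from the final filename.
    directory, filename = posixpath.split(parts.path)
    return {
        "scheme": parts.scheme,
        "prefix": prefix,
        "domain": domain,
        "port": parts.port,  # None when the URL does not state a port
        "path": directory.rstrip("/") + "/",
        "filename": filename,
    }

info = describe_url("https://www.tutorialsarena.com/page/index.html")
for name, value in info.items():
    print(f"{name}: {value}")
```

Note that `port` comes back as `None` here because the example URL omits it; the browser would fall back to the default for the scheme.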

Common URL Schemes

| Scheme | Description | Use |
|--------|-------------|-----|
| http | HyperText Transfer Protocol | Standard web pages (not encrypted). |
| https | HyperText Transfer Protocol Secure | Secure web pages (encrypted). |
| ftp | File Transfer Protocol | Transferring files. |
| file | File URL | Accessing files on your local computer. |
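Because the scheme determines the default port, a program can work out which port a URL refers to even when the port is omitted. The sketch below is one possible way to do this; the `DEFAULT_PORTS` table and the `effective_port` function are hypothetical names, and the FTP entry (21, the conventional FTP control port) is an addition not stated in this guide.

```python
from urllib.parse import urlsplit

# 80 and 443 are the defaults given in this guide; 21 is the conventional
# FTP control port (an addition for illustration).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}

def effective_port(url):
    """Return the explicit port if present, else the scheme's default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    # Schemes without a network port (e.g. file) yield None.
    return DEFAULT_PORTS.get(parts.scheme)

print(effective_port("https://www.tutorialsarena.com/page/index.html"))
print(effective_port("https://www.tutorialsarena.com:8443/page/index.html"))
```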

URL Encoding

URLs can only be sent over the internet using the ASCII character set. If a URL contains characters outside ASCII (such as accented characters), or ASCII characters that are not allowed in a URL (such as spaces), those characters must be encoded. URL encoding, also called percent-encoding, replaces each byte of such a character with a "%" followed by two hexadecimal digits. A space is encoded as "%20", or as "+" in form data and query strings.
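To make the encoding rule concrete, here is a short sketch using `quote`, `quote_plus`, and `unquote` from Python's standard `urllib.parse` module. The sample string `café menu.html` is made up for illustration.

```python
from urllib.parse import quote, quote_plus, unquote

text = "café menu.html"  # made-up sample with a space and an accented letter

encoded = quote(text)            # space -> %20, é -> its UTF-8 bytes
form_encoded = quote_plus(text)  # space -> +, as used in form data

print(encoded)
print(form_encoded)
print(unquote(encoded))          # decoding restores the original text
```

`quote` uses UTF-8 by default, so the accented letter becomes two percent-encoded bytes.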

The table below shows how some characters are percent-encoded. The result depends on the character set in use: a browser encodes according to the page's character set, and the default in HTML5 is UTF-8.

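| Character | Windows-1252 Encoding | UTF-8 Encoding |
|-----------|-----------------------|----------------|
| € | %80 | %E2%82%AC |
| £ | %A3 | %C2%A3 |
| © | %A9 | %C2%A9 |
| ® | %AE | %C2%AE |
| À | %C0 | %C3%80 |
| Á | %C1 | %C3%81 |
| Â | %C2 | %C3%82 |
| Ã | %C3 | %C3%83 |
| Ä | %C4 | %C3%84 |
| Å | %C5 | %C3%85 |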
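The two encoding columns above can be regenerated with a few lines of Python. This is a minimal sketch: `percent_encode` is a hypothetical wrapper around the standard `urllib.parse.quote`, and `cp1252` is Python's codec name for Windows-1252.

```python
from urllib.parse import quote

def percent_encode(char, charset):
    """Percent-encode a character using the given character set."""
    # safe="" means no character is exempt from encoding.
    return quote(char, safe="", encoding=charset)

# Print each character with its Windows-1252 and UTF-8 encodings.
for ch in "€£©®ÀÁÂÃÄÅ":
    print(ch, percent_encode(ch, "cp1252"), percent_encode(ch, "utf-8"))
```

Each of these characters is a single byte in Windows-1252 but two or three bytes in UTF-8, which is why the UTF-8 column is longer.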