Understanding Uniform Resource Locators (URLs): Structure and Components

Learn about the structure and function of Uniform Resource Locators (URLs). This guide explains the different parts of a URL, including scheme, domain, path, and query parameters, and how they work together to identify resources on the internet.


What is a URL?

A URL (Uniform Resource Locator), commonly known as a web address, identifies a resource (such as a webpage, image, or file) on the internet and tells the browser how to reach it. Web browsers use URLs to request resources from web servers. A URL usually contains a human-readable domain name (e.g., `w3schools.com`), which is easier to remember than a numerical Internet Protocol (IP) address (e.g., `192.168.1.1`). Behind the scenes, the Domain Name System (DNS) translates the domain name into the IP address that is actually used to route traffic.

URL Structure and Syntax

A URL typically follows this format:

scheme://prefix.domain:port/path/filename

Let's break down each part:

scheme: The protocol used to access the resource (e.g., http, https, ftp). http is the standard for web pages; https is the secure version, which encrypts the traffic.
prefix: A subdomain (often `www`).
domain: The domain name (e.g., `tutorialsarena.com`).
port: The port number (default is 80 for HTTP, 443 for HTTPS); usually omitted.
path: The path to the resource on the server (omitted if the resource is in the root directory).
filename: The name of the document or resource.

For example, in `https://www.tutorialsarena.com/page/index.html`:

scheme: https
prefix: www
domain: tutorialsarena.com
port: omitted (defaults to 443 for https)
path: /page/
filename: index.html
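
The breakdown above can be reproduced in code. The following is a minimal sketch using Python's standard `urllib.parse` module; the helper name `describe_url` and the `DEFAULT_PORTS` table are assumptions made for illustration, and the prefix/domain split is deliberately naive (it treats the first label as the prefix whenever the host has three or more labels, which is wrong for hosts such as `example.co.uk`).

```python
from urllib.parse import urlsplit

# Default ports for the two web schemes, as listed in this guide.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_url(url: str) -> dict:
    """Split a URL into the parts named in this guide (illustrative helper)."""
    parts = urlsplit(url)
    host = parts.hostname or ""

    # Naive assumption: with three or more labels, the first one is the prefix.
    labels = host.split(".")
    if len(labels) > 2:
        prefix, domain = labels[0], ".".join(labels[1:])
    else:
        prefix, domain = "", host

    # Everything up to the last "/" is the path; the rest is the filename.
    directory, _, filename = parts.path.rpartition("/")

    return {
        "scheme": parts.scheme,
        "prefix": prefix,
        "domain": domain,
        # Fall back to the scheme's default port when none is written in the URL.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": directory + "/" if parts.path else "",
        "filename": filename,
    }


info = describe_url("https://www.tutorialsarena.com/page/index.html")
for name, value in info.items():
    print(f"{name}: {value}")
```

Because the port is omitted from the example URL, the sketch reports the default for the scheme (443 for https) rather than a value taken from the URL itself.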

Common URL Schemes

Scheme	Description	Use
`http`	HyperText Transfer Protocol	Standard web pages (not encrypted).
`https`	HyperText Transfer Protocol Secure	Secure web pages (encrypted).
`ftp`	File Transfer Protocol	Transferring files.
`file`	File URL	Accessing files on your local computer.

URL Encoding

URLs can only be sent over the internet using the ASCII character set, and even some ASCII characters (such as spaces) are not allowed to appear literally. Any such character must be encoded. URL encoding (also called percent-encoding) replaces each byte of the character with a "%" followed by two hexadecimal digits. A space is encoded as "%20"; in query strings and form data it is often written as "+" instead.
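
As a small illustration of the two space conventions, here is a sketch using Python's standard `urllib.parse` functions. The sample strings are made up for the example; `quote` uses UTF-8 by default.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

text = "summer sale"  # hypothetical sample value

# quote() percent-encodes the space, as used in URL paths.
path_form = quote(text)

# quote_plus() writes the space as "+", as used in query strings and form data.
query_form = quote_plus(text)

print(path_form)   # summer%20sale
print(query_form)  # summer+sale

# Decoding reverses the process; "+" is only turned back into a space by unquote_plus().
print(unquote(path_form))        # summer sale
print(unquote_plus(query_form))  # summer sale

# A non-ASCII character becomes one "%XX" group per encoded byte (UTF-8 here).
print(quote("café"))  # caf%C3%A9
```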

The table below shows how some non-ASCII characters are URL-encoded under two character sets. The encoding your browser uses depends on the page's character set; the default in HTML5 is UTF-8.

Character	Windows-1252 Encoding	UTF-8 Encoding
€	%80	%E2%82%AC
£	%A3	%C2%A3
©	%A9	%C2%A9
®	%AE	%C2%AE
À	%C0	%C3%80
Á	%C1	%C3%81
Â	%C2	%C3%82
Ã	%C3	%C3%83
Ä	%C4	%C3%84
Å	%C5	%C3%85
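
The rows above can be reproduced by encoding the same character under each character set. This is a sketch, assuming Python's `urllib.parse.quote` with its `encoding` argument and Python's `cp1252` codec as the implementation of Windows-1252; the helper name `percent_encode` is hypothetical.

```python
from urllib.parse import quote


def percent_encode(char: str, charset: str) -> str:
    """Percent-encode a character using the given character set (illustrative helper)."""
    return quote(char, encoding=charset)


# Print a few rows in the same tab-separated layout as the table.
print("Character\tWindows-1252 Encoding\tUTF-8 Encoding")
for ch in "€£©À":
    print(f"{ch}\t{percent_encode(ch, 'cp1252')}\t{percent_encode(ch, 'utf-8')}")
```

Windows-1252 stores each of these characters in a single byte, so it yields one "%XX" group, while UTF-8 needs two or three bytes and therefore two or three groups.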
