11. The Web at a Glance: Anatomy of a URL

A URL is a Uniform Resource Locator. A closely related term is URI, or Uniform Resource Identifier.

In everyday use, both URL and URI refer to the same thing: the text address of a website. Most people call it a URL. Strictly speaking, a URL is one kind of URI, so URI is the broader and more technically accurate term, and people in the tech industry tend to use it. It doesn’t matter that much for everyday use, but it is good to know both terms so you know what people are talking about either way.

Contents

1 What’s in a Name?
2 The Components of a URL

What’s in a Name?

Let’s think about those names for a moment.

Uniform

A URL/URI is called Uniform for two reasons:

The URL format is a uniform format — all URLs are structured the same way. There is a standard.
For any given resource (web page, file, etc.), there should ideally be one, and only one, URL.

In practice, there is not always exactly one URL for a given resource, which is why the U doesn’t stand for “Unique.” There are often multiple URLs that will all reach the same resource. The technical reasons for this aren’t important here, but it’s good to know that it does happen. Good web administrators will designate one as the canonical URL, or the “official” one. This provides a uniform way of accessing the given resource.
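As one illustration of how several different spellings can reach the same resource, here is a minimal Python sketch that reduces a few variants to a single form. The normalize function and the DEFAULT_PORTS table are hypothetical names made up for this example, and the rules shown (lowercase the host, drop the default port, treat an empty path as /) cover only a small part of what real canonicalization involves. It also ignores details such as user names in the address.

```python
from urllib.parse import urlsplit, urlunsplit

# Default port for each scheme; writing it explicitly changes nothing.
DEFAULT_PORTS = {"http": 80, "https": 443}

def normalize(url: str) -> str:
    """Reduce a few equivalent spellings of a URL to one form (illustrative only)."""
    parts = urlsplit(url)
    scheme = parts.scheme              # urlsplit already lowercases the scheme
    host = parts.hostname or ""        # hostname is returned in lowercase
    port = parts.port
    if port is not None and port != DEFAULT_PORTS.get(scheme):
        host = f"{host}:{port}"        # keep only non-default ports
    path = parts.path or "/"           # an empty path means the root
    return urlunsplit((scheme, host, path, parts.query, parts.fragment))

variants = [
    "https://example.com",
    "https://example.com/",
    "HTTPS://Example.COM:443/",
]
unique = {normalize(u) for u in variants}
print(unique)
```

All three variants collapse to the same string, which is roughly what designating a canonical URL is meant to achieve.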

Resource

A web page is a resource. Why not call the address a UPL — a Uniform Page Locator? Because not all URLs point to pages:

Images are not pages, they are files. Every image on the internet has a URL; if it didn’t have one, you wouldn’t be able to access it.
Other types of media also have URLs — video, audio, games.
Web applications have URLs.
Web services have URLs.

All of the different types of things you can access from the internet via a URL are called resources.

Locator

The original conception of URLs is that they represented the “address” of a particular resource, where it could be found. We (that is, humans) continue to think of it that way, but it isn’t really technically accurate.

Identifier

It is more accurate to talk about URIs as identifying a resource, rather than locating a resource. The URI tells you (and the computer system) what you are looking for, and then the computer system finds it for you.

The Components of a URL

A URL is made up of several different pieces, each indicating a particular piece of information about the resource. Not every URL will have all of these different components, but all will have some of them.

Consider the following example URL:

https://blog.example.com/web-tutorials/addresses.html?userid=123456#section1
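If you want to see these pieces programmatically, most languages ship a URL parser. The following is a small sketch using Python's standard urllib.parse module on the example URL above. Note that Python uses its own names for the parts (scheme, hostname, query, fragment), which differ slightly from the headings used in this article.

```python
from urllib.parse import urlsplit

url = "https://blog.example.com/web-tutorials/addresses.html?userid=123456#section1"
parts = urlsplit(url)

print(parts.scheme)    # https
print(parts.hostname)  # blog.example.com
print(parts.path)      # /web-tutorials/addresses.html
print(parts.query)     # userid=123456
print(parts.fragment)  # section1
```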

Protocol: https

Most websites will either use HTTP (Hypertext Transfer Protocol), or HTTPS, its secure version.

HTTP is the protocol used for transferring web pages, and most other resources, over the internet. If you are logged in or providing sensitive information (like your credit card number), the protocol should be HTTPS.

Other protocols you might see include:

FTP — File transfer protocol. Used for uploading and downloading files.
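SPDY — An experimental protocol from Google designed to make HTTP faster. It has since been deprecated in favor of HTTP/2, which grew out of it. You will usually not see this in a browser address bar.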
SMTP — Simple Mail Transfer Protocol, the protocol used for moving email messages around the internet. You will usually not see this in a browser address bar.
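As a rough sketch of how a program might check the protocol before sending sensitive information, the snippet below reads the scheme from a URL. The function name uses_https is hypothetical, chosen just for this example. Checking the scheme only tells you what the URL asks for; it says nothing about whether the site's certificate is valid.

```python
from urllib.parse import urlsplit

def uses_https(url: str) -> bool:
    """Return True when the URL's protocol (scheme) is https."""
    # urlsplit lowercases the scheme, so "HTTPS://..." also matches.
    return urlsplit(url).scheme == "https"

print(uses_https("https://blog.example.com/"))   # True
print(uses_https("http://blog.example.com/"))    # False
print(uses_https("ftp://example.com/file.txt"))  # False
```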

Top Level Domain: .com

The .com domain is usually used for commercial sites. Other TLDs you might see include:

.org — Usually used for non-profit organizations.
.gov — Only used by the United States government.
.edu — Only used by colleges and universities.
.uk — Sites originating in the United Kingdom.
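For illustration, here is a deliberately naive way to read the top level domain from a URL: take whatever follows the last dot in the host name. The function name naive_tld is an assumption of this sketch. It returns the last label only, so it cannot tell that an address ending in .co.uk is usually treated as a unit; handling that properly needs a maintained list of suffixes, which is outside the scope of this example.

```python
from urllib.parse import urlsplit

def naive_tld(url: str) -> str:
    """Return the text after the last dot of the host name, with a leading dot."""
    host = urlsplit(url).hostname or ""
    if "." not in host:
        return ""                       # e.g. "localhost" has no TLD
    return "." + host.rsplit(".", 1)[-1]

print(naive_tld("https://blog.example.com/web-tutorials/"))  # .com
print(naive_tld("https://www.example.org/"))                 # .org
```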

Host Name: blog.example.com

In this case, the host name can be broken down as follows:

example.com — This is the domain name.
blog — This is a subdomain, usually used for a specific sub-site within a larger domain. The most common subdomain is www, which stands for World Wide Web. It is used by some domains to indicate publicly accessible content.
blog.example.com — Taken all together, this is called the host name, or (less frequently) the server name.
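The breakdown above can be sketched in a few lines of Python. The split_host function is a hypothetical helper written for this example, and it assumes the simple case shown here, where the domain name is the last two dot-separated labels. That assumption fails for addresses such as example.co.uk.

```python
from typing import Tuple

def split_host(host: str) -> Tuple[str, str]:
    """Split a host name into (subdomain, domain), assuming a one-label TLD."""
    labels = host.split(".")
    domain = ".".join(labels[-2:])      # e.g. "example.com"
    subdomain = ".".join(labels[:-2])   # e.g. "blog", or "" if there is none
    return subdomain, domain

print(split_host("blog.example.com"))  # ('blog', 'example.com')
print(split_host("example.com"))       # ('', 'example.com')
```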

Path: /web-tutorials/addresses.html

/web-tutorials/addresses.html — This is called the path. It tells the hosting server what resource to find. It has a few sub-parts:

/web-tutorials/ is a directory, like a file folder on your computer. In fact, sometimes it is a file folder on the server, though not always. On many websites, especially blogs, the directory represents categories. Shopping sites might have a long hierarchy of categories: /books/non-fiction/computers/tutorials/internet/
addresses.html — This is the resource name. It might be a page name or a file name.
.html — This is the file extension. Just like on your home computer, this indicates the type of file it is.

.html is the typical file extension for websites. Other common extensions for websites include .php and .asp. Many non-page files on the web have other extensions: .jpg, .gif, .pdf, .js. Many pages don’t use a file extension at all.
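Because URL paths look like file paths, Python's pathlib can take one apart. This is only a sketch: it treats the path as if it were a file path, which matches the example here but, as noted above, is not how every site maps paths to content.

```python
from pathlib import PurePosixPath

path = PurePosixPath("/web-tutorials/addresses.html")

print(path.parent)  # /web-tutorials   (the directory)
print(path.name)    # addresses.html   (the resource name)
print(path.suffix)  # .html            (the file extension)
print(path.stem)    # addresses        (the name without the extension)
```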

Parameters: ?userid=123456

?userid=123456 — This is called a parameter. It is used to encode some specific data into the URL, for use by the server. In this case, it appears to indicate that a user with the id 123456 is requesting the page.

Each parameter is made up of a few different parts:

The question mark ( ? ) indicates the beginning of a list of parameters.
Each parameter has a name and a value.
The name of this parameter is userid, and the value is 123456.
The name and value are separated by an equals sign ( = ).

Sometimes there will be multiple parameters. If that is the case, the parameters are separated by the ampersand sign ( & ), like this: ?userid=123456&color=green.
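To make the name and value structure concrete, here is a short sketch using Python's standard urllib.parse module on the two-parameter example above. One detail that may be surprising: parse_qs returns each value as a list, because the same name is allowed to appear more than once in a URL.

```python
from urllib.parse import parse_qs, urlencode

# Text after the question mark, split into names and values.
params = parse_qs("userid=123456&color=green")
print(params)   # {'userid': ['123456'], 'color': ['green']}

# Going the other way: build the parameter text from names and values.
rebuilt = urlencode({"userid": "123456", "color": "green"})
print(rebuilt)  # userid=123456&color=green
```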

Named Anchor: #section1

#section1 — This is a named anchor, which links to a specific location on the page; in this case, section 1. Named anchors can be used to link to a specific time-stamp on a video, a specific set of coordinates on a map, or a specific section on a text document.
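In Python's standard library the named anchor is called the fragment. The sketch below separates it from the rest of the example URL with urldefrag. Browsers generally handle this part themselves rather than sending it to the server, which is why it is easy to split off.

```python
from urllib.parse import urldefrag

full = "https://blog.example.com/web-tutorials/addresses.html?userid=123456#section1"
url_without_anchor, anchor = urldefrag(full)

print(url_without_anchor)  # https://blog.example.com/web-tutorials/addresses.html?userid=123456
print(anchor)              # section1
```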
