URL Encoding Explained - Everything You Need to Know
Learn everything about URL encoding (percent encoding) including reserved characters, UTF-8 in URLs, common issues, and practical examples. A complete reference for web developers.
Introduction: What Is URL Encoding?
URL encoding, also known as percent encoding, is a mechanism for converting characters into a format that can be safely transmitted in URLs. Since URLs can only contain a limited set of characters from the ASCII character set, any character outside this set -- or any character that has a special meaning in URL syntax -- must be encoded.
If you have ever seen %20 in a URL and wondered what it means, or if you have struggled with broken links containing special characters, this guide will give you a thorough understanding of how URL encoding works and why it matters.
Need to quickly encode or decode a URL? Try our URL Encoder/Decoder tool -- it handles all the encoding rules automatically.
Why URL Encoding Exists
URLs were designed in the early days of the internet when systems had limited character set support. The original URL specification (RFC 1738, published in 1994) restricted URLs to a small subset of ASCII characters. This was a practical decision that ensured URLs would work across different systems, networks, and protocols.
The problem is that the real world requires much more than basic ASCII:
- Spaces in file names: my document.pdf needs to be encoded as my%20document.pdf
- International characters: Japanese, Korean, Arabic, and other scripts need encoding
- Special characters in data: Query parameters may contain characters like &, =, # that have special meaning in URLs
- Binary data: Sometimes binary data needs to be included in URLs
URL encoding solves all of these problems by providing a universal escape mechanism.
How Percent Encoding Works
The percent encoding algorithm is straightforward:
- Take the character to be encoded
- Convert it to its byte representation (using UTF-8 for non-ASCII characters)
- For each byte, write a % followed by two hexadecimal digits
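The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a complete RFC 3986 implementation: the helper name percent_encode and the UNRESERVED constant are assumptions made for this example, and the function encodes every character outside the unreserved set, which suits a single component rather than a whole URL.

```python
from urllib.parse import quote

# Characters that never need encoding (letters, digits, - . _ ~).
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

def percent_encode(text: str) -> str:
    """Hypothetical helper: percent-encode everything outside UNRESERVED."""
    out = []
    for ch in text:                          # step 1: take the character
        if ch in UNRESERVED:
            out.append(ch)
            continue
        for byte in ch.encode("utf-8"):      # step 2: its UTF-8 bytes
            out.append(f"%{byte:02X}")       # step 3: % plus two hex digits
    return "".join(out)

print(percent_encode("my document.pdf"))  # my%20document.pdf
print(percent_encode("a&b=c"))            # a%26b%3Dc

# The standard library gives the same result for these inputs.
print(quote("my document.pdf", safe=""))  # my%20document.pdf
```

In real code, urllib.parse.quote is the better choice; the hand-written loop is only there to make the three steps visible.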
Examples
| Character | UTF-8 Bytes | Percent Encoded |
|---|---|---|
| Space | 0x20 | %20 |
| ! | 0x21 | %21 |
| # | 0x23 | %23 |
| $ | 0x24 | %24 |
| & | 0x26 | %26 |
| + | 0x2B | %2B |
| / | 0x2F | %2F |
| = | 0x3D | %3D |
| ? | 0x3F | %3F |
| @ | 0x40 | %40 |
Multi-Byte Characters
Non-ASCII characters require multiple bytes in UTF-8 and therefore produce multiple percent-encoded sequences:
| Character | UTF-8 Bytes | Percent Encoded |
|---|---|---|
| é (e with acute accent) | 0xC3 0xA9 | %C3%A9 |
| Euro sign | 0xE2 0x82 0xAC | %E2%82%AC |
| Japanese (ka) | 0xE3 0x81 0x8B | %E3%81%8B |
| Korean (han) | 0xED 0x95 0x9C | %ED%95%9C |
| Emoji (smile) | 0xF0 0x9F 0x98 0x80 | %F0%9F%98%80 |
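The rows in this table can be reproduced with a short Python sketch. The sample dictionary and the utf8_hex helper are names invented for this example; the characters are written as Unicode escapes so the snippet does not depend on the file's encoding.

```python
from urllib.parse import quote

# Sample characters matching the table rows, as Unicode escapes.
samples = {
    "e with acute accent": "\u00e9",
    "euro sign": "\u20ac",
    "hiragana ka": "\u304b",
    "hangul han": "\ud55c",
    "grinning face emoji": "\U0001F600",
}

def utf8_hex(ch: str) -> str:
    """Hypothetical helper: UTF-8 bytes as space-separated hex, e.g. '0xC3 0xA9'."""
    return " ".join(f"0x{b:02X}" for b in ch.encode("utf-8"))

for name, ch in samples.items():
    # One percent-encoded triplet is produced per UTF-8 byte.
    print(f"{name}: {utf8_hex(ch)} -> {quote(ch, safe='')}")
```

The number of %XX groups always equals the number of UTF-8 bytes, which is why an emoji expands to twelve characters in a URL.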
Try encoding these characters yourself with our URL Encoder tool.
URL Structure and Reserved Characters
To understand URL encoding properly, you need to understand URL structure:
https://user:pass@www.example.com:8080/path/to/page?key=value&foo=bar#section
\___/   \_______/ \_____________/ \__/\___________/ \_______________/ \_____/
scheme  userinfo       host       port    path            query       fragment
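As a rough illustration, Python's urllib.parse.urlsplit breaks the same example URL into the components labeled in the diagram. The variable names are arbitrary; only the URL comes from the diagram.

```python
from urllib.parse import urlsplit

url = "https://user:pass@www.example.com:8080/path/to/page?key=value&foo=bar#section"
parts = urlsplit(url)

print(parts.scheme)    # https
print(parts.username)  # user
print(parts.password)  # pass
print(parts.hostname)  # www.example.com
print(parts.port)      # 8080
print(parts.path)      # /path/to/page
print(parts.query)     # key=value&foo=bar
print(parts.fragment)  # section
```

Splitting first and encoding each component separately is usually safer than running one encoder over the whole string, because each component has its own set of delimiters.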
Reserved Characters
Reserved characters have special meanings in URLs. If you want to use them as data (rather than as delimiters), they must be percent-encoded.
| Character | Purpose in URL | Encoded Form |
|---|---|---|
| : | Separates scheme, port, userinfo | %3A |
| / | Separates path segments | %2F |
| ? | Starts query string | %3F |
| # | Starts fragment | %23 |
| [ ] | IPv6 address | %5B %5D |
| @ | Separates userinfo from host | %40 |
| ! | Sub-delimiter | %21 |
| $ | Sub-delimiter | %24 |
| & | Separates query parameters | %26 |
| ' | Sub-delimiter | %27 |
| ( ) | Sub-delimiters | %28 %29 |
| * | Sub-delimiter | %2A |
| + | Space in query strings (legacy) | %2B |
| , | Sub-delimiter | %2C |
| ; | Sub-delimiter | %3B |
| = | Separates key from value in query | %3D |
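To see why a reserved character used as data has to be encoded, the following sketch parses a query string twice, once with a raw & inside a value and once with it percent-encoded. The value "Tom & Jerry" and the variable names are illustrative assumptions.

```python
from urllib.parse import parse_qs, quote

value = "Tom & Jerry"

# Unencoded: the & inside the value is read as a parameter separator,
# so the value is cut short and " Jerry" is discarded as a key with no value.
broken = "q=" + value + "&page=1"
print(parse_qs(broken))  # {'q': ['Tom '], 'page': ['1']}

# Encoded: %26 is plain data, so the value survives the round trip.
fixed = "q=" + quote(value, safe="") + "&page=1"
print(fixed)             # q=Tom%20%26%20Jerry&page=1
print(parse_qs(fixed))   # {'q': ['Tom & Jerry'], 'page': ['1']}
```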
Unreserved Characters
These characters never need to be encoded in any part of a URL:
- Letters: A-Z and a-z
- Digits: 0-9
- Hyphen: -
- Period: .
- Underscore: _
- Tilde: ~
The Space Character: %20 vs. +
The encoding of spaces is one of the most confusing aspects of URL encoding:
- %20: The correct percent-encoding for a space character, used in path segments
- +: An alternative encoding for space, but only valid in query strings (from the application/x-www-form-urlencoded format)
Path: https://example.com/my%20documents/file%20name.pdf (correct)
Path: https://example.com/my+documents/file+name.pdf (WRONG - + is literal)
Query: https://example.com/search?q=hello+world (correct)
Query: https://example.com/search?q=hello%20world (also correct)
Best practice: Use %20 in path segments and either + or %20 in query strings. When in doubt, %20 always works.
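The difference shows up directly in Python's standard library, which offers one pair of functions for each convention. This is a small sketch; the sample strings are made up for the example.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# Path style: spaces become %20, and / is left alone by default.
print(quote("my documents/file name.pdf"))  # my%20documents/file%20name.pdf

# Form/query style: spaces become +.
print(quote_plus("hello world"))            # hello+world

# A literal plus sign must be encoded in form style, or it would decode as a space.
print(quote_plus("1+1=2"))                  # 1%2B1%3D2

# Decoding: only the *_plus variant treats + as a space.
print(unquote("my+documents"))              # my+documents
print(unquote_plus("hello+world"))          # hello world
```

Decoding a path with the form-style decoder (or the reverse) is a common source of stray plus signs and lost spaces, so the decoder should match the part of the URL being read.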
URL Encoding in Different Programming Languages
JavaScript
JavaScript provides several functions for URL encoding, each with a different scope:
// encodeURIComponent - Encode a URI component (query parameter value)
encodeURIComponent('hello world & goodbye')
// Result: "hello%20world%20%26%20goodbye"
// encodeURI - Encode a complete URI (preserves URL structure characters)
encodeURI('https://example.com/path with spaces?q=hello world')
// Result: "https://example.com/path%20with%20spaces?q=hello%20world"
// decodeURIComponent - Decode a URI component
decodeURIComponent('hello%20world%20%26%20goodbye')
// Result: "hello world & goodbye"
// decodeURI - Decode a complete URI
decodeURI('https://example.com/path%20with%20spaces')
// Result: "https://example.com/path with spaces"
Key difference between encodeURI and encodeURIComponent:
| Function | Does NOT encode | Use case |
|---|---|---|
| encodeURI | Letters, digits, and ; , / ? : @ & = + $ - _ . ! ~ * ' ( ) # | Encoding a complete URL |
| encodeURIComponent | Letters, digits, and - _ . ! ~ * ' ( ) | Encoding a query parameter value |
// WRONG: Using encodeURI for a parameter value
const query = 'price=10&currency=USD';
encodeURI(query) // "price=10&currency=USD" -- & is NOT encoded!
// CORRECT: Using encodeURIComponent for a parameter value
encodeURIComponent(query) // "price%3D10%26currency%3DUSD"
// Building a URL with query parameters
const baseUrl = 'https://api.example.com/search';
const params = new URLSearchParams({
  q: 'hello world',
  category: 'books & media',
  page: '1',
});
const fullUrl = `${baseUrl}?${params.toString()}`;
// Result: "https://api.example.com/search?q=hello+world&category=books+%26+media&page=1"
The URLSearchParams API
Modern JavaScript provides URLSearchParams for handling query strings:
// Creating query strings
const params = new URLSearchParams();
params.set('name', 'John Doe');
params.set('city', 'New York');
params.set('interests', 'coding & design');
console.log(params.toString());
// "name=John+Doe&city=New+York&interests=coding+%26+design"
// Parsing query strings
const url = new URL('https://example.com/search?q=hello+world&page=2');
console.log(url.searchParams.get('q')); // "hello world"
console.log(url.searchParams.get('page')); // "2"
// Iterating over parameters
for (const [key, value] of url.searchParams) {
  console.log(`${key}: ${value}`);
}
Python
from urllib.parse import quote, unquote, urlencode, parse_qs
# Encode a string for use in a URL path
quote('hello world/path') # 'hello%20world/path' (/ not encoded)
quote('hello world/path', safe='') # 'hello%20world%2Fpath' (encode everything)
# Decode a percent-encoded string
unquote('hello%20world') # 'hello world'
# Build query strings
params = {'q': 'hello world', 'category': 'books & media'}
urlencode(params) # 'q=hello+world&category=books+%26+media'
# Parse query strings
parse_qs('q=hello+world&category=books+%26+media')
# {'q': ['hello world'], 'category': ['books & media']}
Java
import java.net.URLEncoder;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
// Encode
String encoded = URLEncoder.encode("hello world & more", StandardCharsets.UTF_8);
// Result: "hello+world+%26+more"
// Decode
String decoded = URLDecoder.decode("hello+world+%26+more", StandardCharsets.UTF_8);
// Result: "hello world & more"
PHP
// URL encode (spaces as +)
urlencode('hello world & more'); // "hello+world+%26+more"
// Raw URL encode (spaces as %20)
rawurlencode('hello world & more'); // "hello%20world%20%26%20more"
// Decode
urldecode('hello+world+%26+more'); // "hello world & more"
rawurldecode('hello%20world%20%26%20more'); // "hello world & more"
// Build query string
http_build_query(['q' => 'hello world', 'page' => 1]);
// "q=hello+world&page=1"
Go
import "net/url"
// Encode a query parameter
url.QueryEscape("hello world & more")
// "hello+world+%26+more"
// Encode a path segment
url.PathEscape("hello world & more")
// "hello%20world%20&%20more"
// Build a URL with parameters
u, _ := url.Parse("https://api.example.com/search")
q := u.Query()
q.Set("q", "hello world")
q.Set("category", "books & media")
u.RawQuery = q.Encode()
// u.String() returns "https://api.example.com/search?category=books+%26+media&q=hello+world" (Encode sorts keys)
UTF-8 and International Characters in URLs
The IRI Standard
Internationalized Resource Identifiers (IRIs, defined in RFC 3987) extend URLs to support Unicode characters. However, when transmitted over the wire, IRIs are converted to URIs using UTF-8 percent encoding.
IRI: https://example.com/café
URI: https://example.com/caf%C3%A9
IRI: https://example.com/search?q=東京 (Tokyo)
URI: https://example.com/search?q=%E6%9D%B1%E4%BA%AC
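The IRI-to-URI conversion for the path, query, and fragment can be approximated in Python as follows. This is a sketch under stated assumptions, not the full RFC 3987 mapping: iri_to_uri is a hypothetical helper, the host is assumed to be ASCII already, and existing % signs are left untouched so that input which is already encoded is not encoded twice.

```python
from urllib.parse import quote, urlsplit, urlunsplit

def iri_to_uri(iri: str) -> str:
    """Hypothetical helper: percent-encode non-ASCII text in path, query, fragment."""
    parts = urlsplit(iri)
    path = quote(parts.path, safe="/%")         # keep path separators and existing escapes
    query = quote(parts.query, safe="=&+%")     # keep query delimiters and existing escapes
    fragment = quote(parts.fragment, safe="%")
    # Assumption: parts.netloc is ASCII; internationalized hosts need Punycode instead.
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))

print(iri_to_uri("https://example.com/caf\u00e9"))
# https://example.com/caf%C3%A9
print(iri_to_uri("https://example.com/search?q=\u6771\u4eac"))
# https://example.com/search?q=%E6%9D%B1%E4%BA%AC
```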
Internationalized Domain Names (IDN)
Domain names with non-ASCII characters use Punycode encoding:
Unicode: https://münchen.de
Punycode: https://xn--mnchen-3ya.de
Unicode: https://例え.jp
Punycode: https://xn--r8jz45g.jp
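Python ships an "idna" codec that performs this host name conversion, which makes for a quick check. Note the assumption: the built-in codec follows the older IDNA 2003 rules, so it may disagree with current browsers on some edge-case names; the German example below is not one of them.

```python
# Host name written with a Unicode escape: m + u-umlaut + nchen.de
host = "m\u00fcnchen.de"

# Encode each label to its ASCII-compatible (Punycode) form.
ascii_host = host.encode("idna").decode("ascii")
print(ascii_host)  # xn--mnchen-3ya.de

# Decoding restores the Unicode form.
print(ascii_host.encode("ascii").decode("idna") == host)  # True
```

Only the host uses Punycode; the path and query of the same URL still use UTF-8 percent encoding.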
How Browsers Handle Non-ASCII URLs
Modern browsers display IRIs (the Unicode versions) in the address bar for readability but send the percent-encoded URI over the network. This is why you might see https://example.com/café in your browser but https://example.com/caf%C3%A9 in the HTTP request.
Common URL Encoding Issues and Solutions
Issue 1: Double Encoding
Double encoding happens when an already-encoded URL is encoded again:
Original: hello world
First encoding: hello%20world
Double encoding: hello%2520world (%25 is the encoding of %)
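Solution: Encode each value exactly once, at the point where the URL is built, and keep track of whether a string is raw or already encoded. When that is not possible, a decode-and-compare check can serve as a heuristic, although it misclassifies raw input that happens to contain a valid percent sequence (for example, the literal text 100%25). In JavaScript:
function safeEncodeURIComponent(str) {
  try {
    // Heuristic: if decoding changes the string, assume it was already encoded
    const decoded = decodeURIComponent(str);
    if (decoded !== str) {
      return str; // Treated as already encoded
    }
  } catch (e) {
    // Decoding failed, so the string contains a malformed % sequence -- encode it fresh
  }
  return encodeURIComponent(str);
}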
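The double-encoding effect is easy to reproduce, and doing so also shows why guessing whether a string is "already encoded" is fragile. The sketch below uses Python's urllib.parse; the sample strings are assumptions chosen for illustration.

```python
from urllib.parse import quote, unquote

raw = "hello world"
once = quote(raw, safe="")
twice = quote(once, safe="")   # the bug: encoding an already-encoded string

print(once)                    # hello%20world
print(twice)                   # hello%2520world

# One decode only undoes one layer; the receiver still sees %20.
print(unquote(twice))          # hello%20world
print(unquote(unquote(twice))) # hello world

# Raw user text can legitimately look encoded. Here the % is literal data,
# so it must be encoded, and a decode-and-compare check would wrongly skip it.
literal = "100%25 sure"
print(quote(literal, safe="")) # 100%2525%20sure
```

The more reliable habit is structural: keep values raw inside the program and encode them once, at the moment the URL is assembled.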
Issue 2: Encoding the Entire URL Instead of Components
// WRONG: Encoding the entire URL
const url = 'https://api.example.com/search?q=hello world';
const encoded = encodeURIComponent(url);
// "https%3A%2F%2Fapi.example.com%2Fsearch%3Fq%3Dhello%20world"
// This is completely broken as a URL!
// CORRECT: Only encode the component that needs encoding
const baseUrl = 'https://api.example.com/search';
const query = encodeURIComponent('hello world');
const correctUrl = `${baseUrl}?q=${query}`;
// "https://api.example.com/search?q=hello%20world"
Issue 3: Not Encoding Special Characters in API Parameters
// Bug: & in the value breaks the query string
const brokenUrl = `https://api.example.com/search?q=Tom & Jerry&page=1`;
// The query string is split into three pieces: "q=Tom ", " Jerry" (a key with no value), and "page=1"
// Fix: Encode the parameter value
const apiUrl = `https://api.example.com/search?q=${encodeURIComponent('Tom & Jerry')}&page=1`;
// Correct: q=Tom%20%26%20Jerry&page=1
Issue 4: File Paths with Spaces on Servers
Spaces in file paths cause issues when constructing URLs:
// File path: /uploads/my report (final).pdf
// WRONG
const brokenUrl = `/uploads/my report (final).pdf`;
// CORRECT
const fileName = 'my report (final).pdf';
const url = `/uploads/${encodeURIComponent(fileName)}`;
// "/uploads/my%20report%20(final).pdf"
Issue 5: Encoding Issues with HTML Forms
HTML forms encode data differently depending on the method:
- GET forms: Parameters are added to the URL using application/x-www-form-urlencoded format (spaces as +)
- POST forms: Body uses application/x-www-form-urlencoded by default
- File uploads: Use multipart/form-data encoding
<!-- GET form - parameters appear in URL -->
<form method="GET" action="/search">
  <input name="q" value="hello world" />
  <!-- Results in: /search?q=hello+world -->
</form>
<!-- POST form with JSON body -->
<form id="myForm">
<input name="name" value="John & Jane" />
</form>
<script>
const form = document.getElementById('myForm');
const data = Object.fromEntries(new FormData(form));
fetch('/api/users', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(data),
});
// JSON encoding handles special characters differently
</script>
URL Encoding in Different Contexts
URLs in HTML
When putting URLs in HTML attributes, you need to handle both URL encoding and HTML entity encoding:
<!-- Escaped form: & inside an HTML attribute is written as &amp; -->
<a href="https://example.com/search?q=hello%20world&amp;page=1">Search</a>
<!-- The browser decodes &amp; back to & before it requests the URL -->
<!-- Unescaped form -->
<a href="https://example.com/search?q=hello%20world&page=1">Search</a>
<!-- Works in practice, but &amp; is the technically correct HTML -->
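When HTML is generated by code, the two layers are applied in order: percent-encode the URL first, then HTML-escape the finished URL for the attribute. A minimal Python sketch, using the standard html.escape function and the example URL from this section:

```python
from html import escape

# Layer 1 is already done: the URL itself is percent-encoded.
url = "https://example.com/search?q=hello%20world&page=1"

# Layer 2: escape for the HTML attribute context (& becomes &amp;).
attr = escape(url, quote=True)
print(attr)  # https://example.com/search?q=hello%20world&amp;page=1

# The escaped value is what goes between the attribute quotes.
print(f'<a href="{attr}">Search</a>')
```

The browser reverses layer 2 when it parses the page, so the request it sends contains a plain & again.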
URLs in JSON
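JSON strings have their own escape sequences, but percent-encoded sequences pass through JSON unchanged because %, letters, and digits need no JSON escaping. Only a literal double quote or backslash inside a URL must be escaped, and / may optionally be written as \/: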
{
"url": "https://example.com/search?q=hello%20world",
"redirect": "https://example.com/path/to/page?ref=other%26site"
}
You can validate and format JSON containing URLs using our JSON Formatter.
URLs in CSS
/* URL encoding in CSS url() function */
.icon {
background-image: url('images/icon%20set/arrow.svg');
}
/* Or use quotes to handle spaces */
.icon {
background-image: url('images/icon set/arrow.svg');
}
URLs in Emails
Email clients may break long URLs across lines. Use URL shorteners or ensure proper encoding:
Full URL: https://example.com/verify?token=abc123&email=user%40example.com
Consider using a shorter URL or a redirect service for email links.
Base64URL Encoding
A related encoding that is important to know about is Base64URL, a URL-safe variant of Base64 used in JWTs and other URL-safe data representations:
// Standard Base64 uses characters that are not URL-safe: +, /, =
// Base64URL replaces them:
// + becomes -
// / becomes _
// = (padding) is removed
function base64UrlEncode(str) {
  // btoa accepts only Latin-1 strings; convert other text to UTF-8 bytes first
  return btoa(str)
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=+$/, '');
}
function base64UrlDecode(str) {
  str = str.replace(/-/g, '+').replace(/_/g, '/');
  // Restore the padding that Base64URL removed
  while (str.length % 4) str += '=';
  return atob(str);
}
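For comparison, the same transformation in Python can lean on the standard base64 module, which already has a URL-safe alphabet. The function names b64url_encode and b64url_decode are assumptions for this sketch; they work on bytes, so non-ASCII text should be encoded to UTF-8 first.

```python
import base64

def b64url_encode(data: bytes) -> str:
    """Hypothetical helper: URL-safe Base64 with the = padding removed."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(text: str) -> bytes:
    """Hypothetical helper: restore padding, then decode URL-safe Base64."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)

# These two bytes encode to "+/8=" in standard Base64.
print(base64.b64encode(b"\xfb\xff").decode("ascii"))  # +/8=
print(b64url_encode(b"\xfb\xff"))                     # -_8
print(b64url_decode("-_8") == b"\xfb\xff")            # True
```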
Learn more about Base64 encoding in our Base64 Encoding Explained guide, or try our Base64 Encoder/Decoder tool.
Data URIs and URL Encoding
Data URIs embed data directly in URLs using the data: scheme:
data:[<mediatype>][;base64],<data>
<!-- Text data URI (URL-encoded) -->
<a href="data:text/plain;charset=utf-8,Hello%20World">Download</a>
<!-- Base64-encoded image -->
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg==" />
<!-- SVG data URI -->
<div style="background-image: url('data:image/svg+xml,%3Csvg xmlns=%22http://www.w3.org/2000/svg%22 viewBox=%220 0 100 100%22%3E%3Ccircle cx=%2250%22 cy=%2250%22 r=%2240%22 fill=%22blue%22/%3E%3C/svg%3E')"></div>
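Both flavors of data URI can be built programmatically. The following Python sketch produces the percent-encoded form used in the text example above and the equivalent Base64 form; the variable names are illustrative.

```python
import base64
from urllib.parse import quote

text = "Hello World"

# Percent-encoded payload: readable, and compact for mostly-ASCII text.
percent_uri = "data:text/plain;charset=utf-8," + quote(text, safe="")
print(percent_uri)  # data:text/plain;charset=utf-8,Hello%20World

# Base64 payload: the usual choice for binary data such as images.
payload = base64.b64encode(text.encode("utf-8")).decode("ascii")
base64_uri = "data:text/plain;charset=utf-8;base64," + payload
print(base64_uri)   # data:text/plain;charset=utf-8;base64,SGVsbG8gV29ybGQ=
```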
URL Encoding Reference Table
Here is a comprehensive reference of commonly encoded characters:
| Character | Decimal | Hex | Encoded | Description |
|---|---|---|---|---|
| (space) | 32 | 20 | %20 | Space character |
| ! | 33 | 21 | %21 | Exclamation mark |
| " | 34 | 22 | %22 | Double quote |
| # | 35 | 23 | %23 | Hash / fragment delimiter |
| $ | 36 | 24 | %24 | Dollar sign |
| % | 37 | 25 | %25 | Percent sign (escape character) |
| & | 38 | 26 | %26 | Ampersand / query delimiter |
| ' | 39 | 27 | %27 | Single quote |
| ( | 40 | 28 | %28 | Left parenthesis |
| ) | 41 | 29 | %29 | Right parenthesis |
| * | 42 | 2A | %2A | Asterisk |
| + | 43 | 2B | %2B | Plus sign |
| , | 44 | 2C | %2C | Comma |
| / | 47 | 2F | %2F | Forward slash / path delimiter |
| : | 58 | 3A | %3A | Colon |
| ; | 59 | 3B | %3B | Semicolon |
| = | 61 | 3D | %3D | Equals sign |
| ? | 63 | 3F | %3F | Question mark / query delimiter |
| @ | 64 | 40 | %40 | At sign |
| [ | 91 | 5B | %5B | Left bracket |
| ] | 93 | 5D | %5D | Right bracket |
Conclusion
URL encoding is a fundamental concept in web development that affects every application that works with URLs. Understanding how percent encoding works, when to use encodeURIComponent versus encodeURI, and how different programming languages handle URL encoding will save you from subtle bugs and security issues.
Key takeaways:
- Use encodeURIComponent for values, encodeURI for complete URLs in JavaScript
- Spaces can be %20 or + depending on context -- %20 is always safe
- Non-ASCII characters are converted to UTF-8 bytes before being percent-encoded
- Watch out for double encoding -- one of the most common URL encoding bugs
- Use URLSearchParams in JavaScript for building query strings safely
- Always encode user input before including it in URLs to prevent injection attacks
For quick URL encoding and decoding, bookmark our URL Encoder/Decoder tool. It handles all the encoding rules correctly and supports both standard URL encoding and Base64URL encoding.
Related Resources
- URL Encoder/Decoder Tool -- Encode and decode URLs instantly
- Base64 Encoding Explained -- Learn about Base64URL encoding
- JSON Formatter -- Format JSON with URL data
- Hash Generator -- Generate hashes of URL strings
- API Security Best Practices -- Secure your API URLs