URL Encoding: Special Characters and Web Security
Complete guide to URL encoding: when to use it, reserved characters, security. With practical examples.
Understanding URL Encoding
URL encoding (also called percent-encoding) is a mechanism for encoding special characters in URLs. Since URLs can only contain a limited set of ASCII characters, any character outside this set, or with special meaning in URLs, must be encoded. This guide explains how URL encoding works, when to use it, and how to implement it correctly in your applications.
Why URL Encoding Exists
The Problem with Special Characters
URLs have a specific syntax where certain characters have special meanings:
- Reserved characters: : / ? # [ ] @ ! $ & ' ( ) * + , ; =
- Unsafe characters: spaces, quotes, angle brackets, and non-ASCII characters
- Control characters: characters below ASCII 32, plus DEL (127)
Without encoding, a URL like https://example.com/search?q=hello world would break because the space character isn't valid in URLs.
How Percent-Encoding Works
Each character is converted to its UTF-8 byte sequence, then each byte is represented as %XX where XX is the hexadecimal value:
Space → %20
& → %26
= → %3D
? → %3F
/ → %2F
# → %23
+ → %2B
% → %25
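The conversion can be reproduced by hand, which makes the rule concrete. The following is a minimal sketch rather than a replacement for the built-in functions: the name `percentEncode` is hypothetical, and it assumes a runtime that provides the standard `TextEncoder` global (modern browsers, recent Node.js).

```javascript
// Hypothetical helper: percent-encode everything except the unreserved characters.
function percentEncode(text) {
  const unreserved = /^[A-Za-z0-9\-_.~]$/;
  let out = "";
  // Iterate by code point so surrogate pairs stay together.
  for (const ch of text) {
    if (unreserved.test(ch)) {
      out += ch;
      continue;
    }
    // Convert the character to UTF-8 bytes, then write each byte as %XX.
    for (const byte of new TextEncoder().encode(ch)) {
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncode("hello world")); // hello%20world
console.log(percentEncode("a&b=c"));       // a%26b%3Dc
console.log(percentEncode("é"));           // %C3%A9
```

Unlike `encodeURIComponent()`, this sketch also encodes `! * ' ( )`, because it leaves only the unreserved set untouched.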
Reserved vs Unreserved Characters
Unreserved Characters (Safe)
These characters can appear in URLs without encoding:
A-Z a-z 0-9 - _ . ~
Reserved Characters
These have special meaning in URLs and must be encoded when used as data:
| Character | Purpose in URL | Encoded |
|---|---|---|
| / | Path separator | %2F |
| ? | Query string start | %3F |
| # | Fragment identifier | %23 |
| & | Parameter separator | %26 |
| = | Key-value separator | %3D |
| @ | User info separator | %40 |
| : | Port/scheme separator | %3A |
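To see why reserved characters must be encoded when they are data, compare how a query string is parsed with and without encoding. This is an illustrative sketch; the example.com URL and the `company` value are made up for the demonstration.

```javascript
const company = "Smith & Sons=UK";

// Unencoded: "&" and "=" are read as query-string syntax, not data.
const broken = new URL(`https://example.com/search?company=${company}`);
console.log(broken.searchParams.get("company")); // "Smith "
console.log([...broken.searchParams.keys()]);    // [ "company", " Sons" ]

// Encoded: the whole value survives as a single parameter.
const fixed = new URL(
  `https://example.com/search?company=${encodeURIComponent(company)}`
);
console.log(fixed.searchParams.get("company"));  // "Smith & Sons=UK"
```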
JavaScript Encoding Functions
encodeURIComponent()
The most commonly used function for encoding query parameters:
const query = "hello world & goodbye";
const encoded = encodeURIComponent(query);
// Result: "hello%20world%20%26%20goodbye"
// Use in URL
const url = `https://api.example.com/search?q=${encoded}`;
This function encodes everything except: A-Z a-z 0-9 - _ . ! ~ * ' ( )
encodeURI()
For encoding complete URLs (preserves URL structure):
const url = "https://example.com/path with spaces/page";
const encoded = encodeURI(url);
// Result: "https://example.com/path%20with%20spaces/page"
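This preserves, in addition to the characters encodeURIComponent() leaves alone: ; / ? : @ & = + $ , #. Square brackets [ ] are not preserved; encodeURI() encodes them as %5B and %5D.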
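The difference between the two functions is easiest to see when the same string is passed to both. A short comparison, using a made-up value that contains several reserved characters:

```javascript
const value = "rock&roll=fun/loud?";

// encodeURI treats reserved characters as URL structure and leaves them alone.
console.log(encodeURI(value));          // rock&roll=fun/loud?

// encodeURIComponent treats them as data and encodes them.
console.log(encodeURIComponent(value)); // rock%26roll%3Dfun%2Floud%3F

// Square brackets are encoded by both functions.
console.log(encodeURI("a[0]"));         // a%5B0%5D
```

This is why `encodeURI()` is unsuitable for individual parameter values: any `&` or `=` inside the value would still be read as query-string syntax.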
Decoding Functions
// Decode component
const decoded = decodeURIComponent("hello%20world");
// Result: "hello world"
// Decode full URL
const url = decodeURI("https://example.com/path%20name");
// Result: "https://example.com/path name"
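Both decoding functions throw a `URIError` when the input is not valid percent-encoding, so untrusted input is usually decoded inside a `try`/`catch`. A minimal sketch, where `tryDecode` is a hypothetical helper name and returning `null` is just one possible way to signal failure:

```javascript
// Hypothetical helper: returns the decoded string, or null if the input is malformed.
function tryDecode(encoded) {
  try {
    return decodeURIComponent(encoded);
  } catch (err) {
    if (err instanceof URIError) return null; // e.g. a stray "%" or invalid UTF-8
    throw err;
  }
}

console.log(tryDecode("hello%20world")); // "hello world"
console.log(tryDecode("100%"));          // null
console.log(tryDecode("%E0%A4%A"));      // null
```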
Common Encoding Scenarios
Query String Parameters
// Building a search URL
function buildSearchUrl(query, filters) {
  const params = new URLSearchParams();
  params.set('q', query);
  params.set('filters', JSON.stringify(filters));
  return `https://api.example.com/search?${params.toString()}`;
}
// URLSearchParams handles encoding automatically
const url = buildSearchUrl("hello world", { type: "article" });
// Result: https://api.example.com/search?q=hello+world&filters=%7B%22type%22%3A%22article%22%7D
Path Segments
// Encoding path segments
const filename = "my file (copy).pdf";
const path = `/files/${encodeURIComponent(filename)}`;
// Result: /files/my%20file%20(copy).pdf
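When a path has several dynamic segments, each one is encoded separately so that a `/` inside a segment is not mistaken for a separator. A small sketch; `buildPath` is a hypothetical helper and the segment values are invented:

```javascript
// Hypothetical helper: encode each segment on its own, then join with "/".
function buildPath(...segments) {
  return "/" + segments.map((s) => encodeURIComponent(s)).join("/");
}

console.log(buildPath("files", "2024 reports", "Q1/Q2 summary.pdf"));
// /files/2024%20reports/Q1%2FQ2%20summary.pdf
```

Encoding the joined path in one call would not work here: `encodeURIComponent()` would turn the real separators into `%2F`, and `encodeURI()` would leave the slash inside the file name untouched.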
Form Data
// application/x-www-form-urlencoded format
const data = {
  name: "John Doe",
  email: "john@example.com",
  message: "Hello & goodbye!"
};
const body = Object.entries(data)
  .map(([key, value]) =>
    `${encodeURIComponent(key)}=${encodeURIComponent(value)}`
  )
  .join('&');
// Result: name=John%20Doe&email=john%40example.com&message=Hello%20%26%20goodbye!
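The same body can also be produced with `URLSearchParams`, which applies the form-encoding rules directly: spaces become `+`, and a few characters that `encodeURIComponent()` leaves alone (such as `!`) are encoded. A sketch using the same sample data under the variable name `form`:

```javascript
const form = {
  name: "John Doe",
  email: "john@example.com",
  message: "Hello & goodbye!"
};

// URLSearchParams serializes with form-encoding rules: spaces become "+".
const formBody = new URLSearchParams(form).toString();
console.log(formBody);
// name=John+Doe&email=john%40example.com&message=Hello+%26+goodbye%21

// Parsing the body back recovers the original values.
const parsed = Object.fromEntries(new URLSearchParams(formBody));
console.log(parsed.message); // "Hello & goodbye!"
```

Either output should be accepted by a parser that follows the form-encoding rules, since it decodes both `+` and `%20` as a space.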
URLSearchParams API
Modern JavaScript provides URLSearchParams for working with query strings:
// Creating parameters
const params = new URLSearchParams();
params.append('name', 'John Doe');
params.append('tags', 'javascript');
params.append('tags', 'web'); // Multiple values
console.log(params.toString());
// name=John+Doe&tags=javascript&tags=web
// Parsing existing query strings
const search = new URLSearchParams('?q=hello&page=1');
console.log(search.get('q')); // "hello"
console.log(search.get('page')); // "1"
// Iterating
for (const [key, value] of params) {
  console.log(`${key}: ${value}`);
}
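`URLSearchParams` is also exposed on parsed `URL` objects through the `searchParams` property, which is convenient for reading repeated keys or editing an existing URL. A brief sketch with a made-up example.com address:

```javascript
const page = new URL("https://example.com/articles?tags=javascript&tags=web&page=1");

console.log(page.searchParams.getAll("tags")); // [ "javascript", "web" ]
console.log(page.searchParams.has("page"));    // true

// Changes to searchParams are written back to the URL, already encoded.
page.searchParams.set("q", "hello world");
page.searchParams.delete("page");
console.log(page.href);
// https://example.com/articles?tags=javascript&tags=web&q=hello+world
```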
Server-Side Encoding
Node.js
const { URLSearchParams } = require('url');
// Using URLSearchParams
const params = new URLSearchParams({ q: 'hello world' });
console.log(params.toString()); // q=hello+world
// Using querystring module
const querystring = require('querystring');
const encoded = querystring.stringify({ q: 'hello world' });
// q=hello%20world
Python
from urllib.parse import quote, urlencode
# Single value
encoded = quote("hello world") # hello%20world
# Query string
params = urlencode({'q': 'hello world', 'page': 1})
# q=hello+world&page=1
PHP
// Single value
$encoded = urlencode("hello world"); // hello+world
$encoded = rawurlencode("hello world"); // hello%20world
// Query string
$params = http_build_query(['q' => 'hello world']);
// q=hello+world
Security Considerations
Double Encoding Issues
Encoding an already-encoded string creates problems:
// Wrong: double encoding
const alreadyEncoded = "hello%20world";
const doubleEncoded = encodeURIComponent(alreadyEncoded);
// Result: "hello%2520world" (wrong!)
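// Solution: keep values raw and encode exactly once, where the URL is built.
// Decoding first "just in case" is unsafe: it corrupts data that contains a
// literal "%", and decodeURIComponent() throws URIError on input such as "100%".
const safe = encodeURIComponent(rawValue);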
URL Injection Prevention
// Dangerous: user input in URL without encoding
const userInput = "javascript:alert('XSS')";
const url = userInput; // Dangerous!
// Safe: validate and encode
function safeUrl(input) {
  // Only allow http/https protocols
  if (!/^https?:\/\//i.test(input)) {
    return null;
  }
  return encodeURI(input);
}
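An alternative to pattern-matching the raw string is to let the standard `URL` parser do the work and then check the parsed protocol. The following is a sketch under that assumption; `parseHttpUrl` is a hypothetical name, and a real application may need further checks (for example an allow-list of hosts).

```javascript
// Hypothetical helper: returns a normalized http(s) URL string, or null.
function parseHttpUrl(input) {
  let parsed;
  try {
    parsed = new URL(input); // throws TypeError on unparseable input
  } catch {
    return null;
  }
  // Compare the parsed protocol instead of pattern-matching the raw string.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return null;
  }
  return parsed.href;
}

console.log(parseHttpUrl("javascript:alert('XSS')"));     // null
console.log(parseHttpUrl("https://example.com/a b?q=1")); // https://example.com/a%20b?q=1
console.log(parseHttpUrl("not a url"));                   // null
```

Because the parser normalizes the URL itself, this approach avoids calling `encodeURI()` on input that may already contain percent-encoded sequences.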
Path Traversal Prevention
// Dangerous: user input in file path
const filename = "../../../etc/passwd";
// Safe: encode and validate
function safeFilename(input) {
  // Remove path separators
  const clean = input.replace(/[\/\\]/g, '');
  return encodeURIComponent(clean);
}
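A stricter variant rejects suspicious names outright instead of rewriting them, and also refuses the special names `.` and `..`. This is only a sketch: `toSafeSegment` is a hypothetical helper, and the server that maps the decoded name to a file still has to validate it.

```javascript
// Hypothetical helper: returns an encoded single path segment, or null if rejected.
function toSafeSegment(name) {
  // Reject rather than rewrite: separators, NUL and other control characters.
  if (/[\/\\\u0000-\u001f]/.test(name)) return null;
  // "." and ".." are valid strings but refer to directories.
  if (name === "" || name === "." || name === "..") return null;
  return encodeURIComponent(name);
}

console.log(toSafeSegment("../../../etc/passwd")); // null
console.log(toSafeSegment(".."));                  // null
console.log(toSafeSegment("my report.pdf"));       // my%20report.pdf
```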
Special Cases
Plus Sign vs %20
Both can represent a space, but in different contexts:
- %20: Standard URL encoding (RFC 3986)
- +: Form encoding (application/x-www-form-urlencoded)
// URLSearchParams uses + for spaces
const params = new URLSearchParams({ q: 'hello world' });
params.toString(); // q=hello+world
// encodeURIComponent uses %20
encodeURIComponent('hello world'); // hello%20world
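The same asymmetry applies when decoding, and it is a common source of stray `+` characters. The sketch below shows the two behaviours side by side; `decodeFormValue` is a hypothetical helper for the case where a single form-encoded value has to be decoded by hand.

```javascript
// decodeURIComponent follows RFC 3986: "+" is an ordinary character.
console.log(decodeURIComponent("hello+world"));             // "hello+world"

// URLSearchParams follows form-encoding rules: "+" means space.
console.log(new URLSearchParams("q=hello+world").get("q")); // "hello world"

// Hypothetical helper for decoding a single form-encoded value by hand.
function decodeFormValue(encoded) {
  return decodeURIComponent(encoded.replace(/\+/g, " "));
}

console.log(decodeFormValue("hello+world")); // "hello world"
console.log(decodeFormValue("1%2B1"));       // "1+1"
```

The replacement runs before decoding, so an encoded plus sign (`%2B`) still comes out as a literal `+`.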
Unicode Characters
Non-ASCII characters are encoded as UTF-8 bytes:
encodeURIComponent('café'); // caf%C3%A9
encodeURIComponent('日本'); // %E6%97%A5%E6%9C%AC
encodeURIComponent('🎉'); // %F0%9F%8E%89
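One edge case is worth knowing: JavaScript strings are UTF-16, and a string that contains half of a surrogate pair (for instance after slicing through an emoji) cannot be converted to UTF-8, so `encodeURIComponent()` throws a `URIError`. A minimal sketch of guarding against that, with `tryEncode` as a hypothetical helper name:

```javascript
// Hypothetical helper: returns the encoded string, or null for ill-formed UTF-16.
function tryEncode(text) {
  try {
    return encodeURIComponent(text);
  } catch (err) {
    if (err instanceof URIError) return null;
    throw err;
  }
}

const emoji = "\u{1F389}";                 // one code point, two UTF-16 code units
console.log(tryEncode(emoji));             // %F0%9F%8E%89
console.log(tryEncode(emoji.slice(0, 1))); // null (lone high surrogate)
```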
Tools and Testing
For testing and debugging URL encoding:
- URL Encoder/Decoder - Encode and decode URLs online
- JSON Formatter - For inspecting encoded JSON in URLs
- Base64 Encoder - Alternative encoding for binary data
Best Practices
Do
- Always encode user input before putting it in URLs
- Use encodeURIComponent() for query parameter values
- Use URLSearchParams for building query strings
- Validate URLs before using them
- Handle decoding errors gracefully
Don't
- Don't encode URLs that are already encoded
- Don't use escape(); it's deprecated
- Don't assume all servers handle encoding the same way
- Don't trust user-provided URLs without validation
Troubleshooting Common Issues
Broken URLs
// Problem: URL doesn't work
const brokenUrl = `https://api.com/search?q=${query}`;
// Solution: encode the parameter
const fixedUrl = `https://api.com/search?q=${encodeURIComponent(query)}`;
Server Receiving Wrong Data
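// Problem: the server receives a literal "+" where a space was intended.
// This happens when the query string is form-encoded (for example by
// URLSearchParams) but the server does not treat "+" as a space.
// Solution: emit %20 explicitly. encodeURIComponent() never produces "+":
const encoded = encodeURIComponent(value);
// With URLSearchParams, a literal "+" is already %2B, so the swap is safe:
const query = new URLSearchParams({ q: value }).toString().replace(/\+/g, '%20');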
Conclusion
URL encoding is essential for building robust web applications. The key points to remember:
- Use encodeURIComponent() for query parameter values
- Use encodeURI() for complete URLs that need to preserve structure
- Use URLSearchParams for building and parsing query strings
- Always validate and encode user input to prevent security issues
- Be aware of the difference between + and %20 for spaces
For more developer resources, explore our free online tools. For official specification details, see RFC 3986 and MDN encodeURIComponent documentation.