URL Encoding: Special Characters and Web Security

THEJORD Team • 1 min read

A complete guide to URL encoding: when to use it, which characters are reserved, and how it affects security, with practical examples.

Understanding URL Encoding

URL encoding (also called percent-encoding) is a mechanism for encoding special characters in URLs. Since URLs can only contain a limited set of ASCII characters, any character outside this set, or one with special meaning in URLs, must be encoded. This guide explains how URL encoding works, when to use it, and how to implement it correctly in your applications.

Why URL Encoding Exists

The Problem with Special Characters

URLs have a specific syntax where certain characters have special meanings:

  • Reserved characters: : / ? # [ ] @ ! $ & ' ( ) * + , ; =
  • Unsafe characters: Spaces, quotes, angle brackets, and non-ASCII characters
  • Control characters: Characters below ASCII 32

Without encoding, a URL like https://example.com/search?q=hello world would break because the space character isn't valid in URLs.

How Percent-Encoding Works

Each character is converted to its UTF-8 byte sequence, then each byte is represented as %XX where XX is the hexadecimal value:

Space → %20
& → %26
= → %3D
? → %3F
/ → %2F
# → %23
+ → %2B
% → %25
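The mapping above can be reproduced by hand. The following is a minimal sketch, assuming a runtime that provides TextEncoder (modern browsers and Node.js). The percentEncode helper is hypothetical and written only for illustration; it encodes every character outside the unreserved set, which is stricter than the built-in functions.

```javascript
// Hypothetical helper: percent-encode every character that is not unreserved.
const UNRESERVED = /^[A-Za-z0-9\-_.~]$/;

function percentEncode(text) {
  const encoder = new TextEncoder(); // TextEncoder always produces UTF-8
  let out = '';
  for (const char of text) { // for...of iterates by code point
    if (UNRESERVED.test(char)) {
      out += char;
      continue;
    }
    // One %XX triplet per UTF-8 byte of the character.
    for (const byte of encoder.encode(char)) {
      out += '%' + byte.toString(16).toUpperCase().padStart(2, '0');
    }
  }
  return out;
}

console.log(percentEncode('hello world')); // hello%20world
console.log(percentEncode('a&b=c'));       // a%26b%3Dc
console.log(percentEncode('100%'));        // 100%25
```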

Reserved vs Unreserved Characters

Unreserved Characters (Safe)

These characters can appear in URLs without encoding:

A-Z a-z 0-9 - _ . ~

Reserved Characters

These have special meaning in URLs and must be encoded when used as data:

Character   Purpose in URL          Encoded
/           Path separator          %2F
?           Query string start      %3F
#           Fragment identifier     %23
&           Parameter separator     %26
=           Key-value separator     %3D
@           User info separator     %40
:           Port/scheme separator   %3A
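To see why reserved characters must be encoded when they carry data, here is a small sketch using the standard URLSearchParams parser. The parameter name title and the sample value are made up for illustration.

```javascript
const value = 'Tom & Jerry = friends';

// Unencoded: "&" is read as a parameter separator, so the value is cut short.
const raw = new URLSearchParams(`title=${value}`);
console.log(raw.get('title')); // "Tom "

// Encoded: the reserved characters survive as data.
const encoded = new URLSearchParams(`title=${encodeURIComponent(value)}`);
console.log(encoded.get('title')); // "Tom & Jerry = friends"
```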

JavaScript Encoding Functions

encodeURIComponent()

The most commonly used function for encoding query parameters:

const query = "hello world & goodbye";
const encoded = encodeURIComponent(query);
// Result: "hello%20world%20%26%20goodbye"

// Use in URL
const url = `https://api.example.com/search?q=${encoded}`;

This function encodes everything except: A-Z a-z 0-9 - _ . ! ~ * ' ( )

encodeURI()

For encoding complete URLs (preserves URL structure):

const url = "https://example.com/path with spaces/page";
const encoded = encodeURI(url);
// Result: "https://example.com/path%20with%20spaces/page"

This preserves: : / ? # @ ! $ & ' ( ) * + , ; = (square brackets are not preserved; encodeURI turns [ and ] into %5B and %5D)
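The difference between the two functions is easiest to see on the same input. This is an illustrative sketch; the /login path and the next parameter are hypothetical names, not part of any real API.

```javascript
const target = 'https://example.com/a b?x=1&y=2';

// encodeURI keeps the delimiters, so the result is still a usable URL.
const asUrl = encodeURI(target);
console.log(asUrl); // https://example.com/a%20b?x=1&y=2

// encodeURIComponent escapes the delimiters, so the whole URL becomes one value.
const asValue = encodeURIComponent(target);
console.log(asValue); // https%3A%2F%2Fexample.com%2Fa%20b%3Fx%3D1%26y%3D2

// Typical use: passing a URL inside another URL's query string.
const loginUrl = `https://example.com/login?next=${asValue}`;
console.log(new URL(loginUrl).searchParams.get('next') === target); // true
```

Using encodeURI for the next value instead would leave ? and & unescaped, and the outer query string would be parsed incorrectly.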

Decoding Functions

// Decode component
const decoded = decodeURIComponent("hello%20world");
// Result: "hello world"

// Decode full URL
const url = decodeURI("https://example.com/path%20name");
// Result: "https://example.com/path name"
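The two decoders are not interchangeable either. As a brief sketch with a made-up sample string: decodeURI leaves escapes for reserved characters alone, while decodeURIComponent decodes everything.

```javascript
const sample = 'a%2Fb%20c';

// decodeURI keeps escapes for reserved characters such as "/" in place.
const partly = decodeURI(sample);
console.log(partly); // a%2Fb c

// decodeURIComponent decodes every valid escape sequence.
const fully = decodeURIComponent(sample);
console.log(fully); // a/b c
```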

Common Encoding Scenarios

Query String Parameters

// Building a search URL
function buildSearchUrl(query, filters) {
  const params = new URLSearchParams();
  params.set('q', query);
  params.set('filters', JSON.stringify(filters));

  return `https://api.example.com/search?${params.toString()}`;
}

// URLSearchParams handles encoding automatically
const url = buildSearchUrl("hello world", { type: "article" });
// Result: https://api.example.com/search?q=hello+world&filters=%7B%22type%22%3A%22article%22%7D

Path Segments

// Encoding path segments
const filename = "my file (copy).pdf";
const path = `/files/${encodeURIComponent(filename)}`;
// Result: /files/my%20file%20(copy).pdf
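When a path is assembled from several pieces, each segment can be encoded separately so that a slash inside a segment is not mistaken for a separator. The buildPath helper below is a hypothetical sketch, not a standard function.

```javascript
// Hypothetical helper: encode each segment, then join with "/".
function buildPath(...segments) {
  return '/' + segments.map((segment) => encodeURIComponent(segment)).join('/');
}

console.log(buildPath('files', 'reports/2024', 'Q1 summary.pdf'));
// /files/reports%2F2024/Q1%20summary.pdf
```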

Form Data

// application/x-www-form-urlencoded format
const data = {
  name: "John Doe",
  email: "john@example.com",
  message: "Hello & goodbye!"
};

const body = Object.entries(data)
  .map(([key, value]) =>
    `${encodeURIComponent(key)}=${encodeURIComponent(value)}`
  )
  .join('&');
// Result: name=John%20Doe&email=john%40example.com&message=Hello%20%26%20goodbye!
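As an alternative sketch, URLSearchParams can serialize the same object. Its output follows the form-encoding rules more strictly: spaces become + and characters such as ! are escaped. The variable names here are illustrative.

```javascript
const formData = {
  name: 'John Doe',
  email: 'john@example.com',
  message: 'Hello & goodbye!'
};

const formBody = new URLSearchParams(formData).toString();
console.log(formBody);
// name=John+Doe&email=john%40example.com&message=Hello+%26+goodbye%21

// Parsing the body back yields the original values.
const parsedBack = Object.fromEntries(new URLSearchParams(formBody));
console.log(parsedBack.message); // Hello & goodbye!
```

Both forms decode to the same values on a server that applies form decoding.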

URLSearchParams API

Modern JavaScript provides URLSearchParams for working with query strings:

// Creating parameters
const params = new URLSearchParams();
params.append('name', 'John Doe');
params.append('tags', 'javascript');
params.append('tags', 'web'); // Multiple values

console.log(params.toString());
// name=John+Doe&tags=javascript&tags=web

// Parsing existing query strings
const search = new URLSearchParams('?q=hello&page=1');
console.log(search.get('q')); // "hello"
console.log(search.get('page')); // "1"

// Iterating
for (const [key, value] of params) {
  console.log(`${key}: ${value}`);
}
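URLSearchParams is also exposed on URL objects through the searchParams property, which makes it possible to edit the query of an existing URL in place. A short sketch, using an example.com address chosen for illustration:

```javascript
const pageUrl = new URL('https://example.com/list?tags=javascript&tags=web&page=1');

// Read every value of a repeated key.
const tags = pageUrl.searchParams.getAll('tags');
console.log(tags); // [ 'javascript', 'web' ]

// Modify the query; the URL's href is kept in sync.
pageUrl.searchParams.set('page', '2');
pageUrl.searchParams.delete('tags');
pageUrl.searchParams.append('q', 'hello world');

console.log(pageUrl.href); // https://example.com/list?page=2&q=hello+world
```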

Server-Side Encoding

Node.js

const { URLSearchParams } = require('url');

// Using URLSearchParams
const params = new URLSearchParams({ q: 'hello world' });
console.log(params.toString()); // q=hello+world

// Using querystring module
const querystring = require('querystring');
const encoded = querystring.stringify({ q: 'hello world' });
// q=hello%20world

Python

from urllib.parse import quote, urlencode

# Single value
encoded = quote("hello world")  # hello%20world

# Query string
params = urlencode({'q': 'hello world', 'page': 1})
# q=hello+world&page=1

PHP

// Single value
$encoded = urlencode("hello world"); // hello+world
$encoded = rawurlencode("hello world"); // hello%20world

// Query string
$params = http_build_query(['q' => 'hello world']);
// q=hello+world

Security Considerations

Double Encoding Issues

Encoding an already-encoded string creates problems:

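// Wrong: double encoding
const alreadyEncoded = "hello%20world";
const doubleEncoded = encodeURIComponent(alreadyEncoded);
// Result: "hello%2520world" (wrong!)

// Workaround when the encoding state is unknown: decode, then encode once.
// Caveats: decodeURIComponent throws URIError on malformed input such as "100%",
// and a raw value that legitimately contains "%20" will be altered.
// Tracking whether a value is raw or encoded is more reliable.
const safe = encodeURIComponent(decodeURIComponent(alreadyEncoded));
// Result: "hello%20world"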

URL Injection Prevention

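// Dangerous: user input in URL without encoding
const userInput = "javascript:alert('XSS')";
const url = userInput; // Dangerous!

// Safe: parse, validate, and normalize
function safeUrl(input) {
  let parsed;
  try {
    parsed = new URL(input); // throws on relative or malformed input
  } catch {
    return null;
  }
  // Only allow http/https protocols
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return null;
  }
  // href is already percent-encoded, so existing escapes are not double-encoded
  return parsed.href;
}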

Path Traversal Prevention

// Dangerous: user input in file path
const filename = "../../../etc/passwd";

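// Safer: strip separators, validate, then encode
function safeFilename(input) {
  // Remove path separators
  const clean = input.replace(/[\/\\]/g, '');
  // Reject empty names and the special directory entries "." and ".."
  if (clean === '' || clean === '.' || clean === '..') {
    return null;
  }
  return encodeURIComponent(clean);
}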

Special Cases

Plus Sign vs %20

Both can represent a space, but in different contexts:

  • %20 - Standard URL encoding (RFC 3986)
  • + - Form encoding (application/x-www-form-urlencoded)

// URLSearchParams uses + for spaces
const params = new URLSearchParams({ q: 'hello world' });
params.toString(); // q=hello+world

// encodeURIComponent uses %20
encodeURIComponent('hello world'); // hello%20world
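The same split applies when decoding. The sketch below shows that decodeURIComponent does not treat + as a space, while URLSearchParams does. decodeFormValue is a hypothetical helper for decoding a single form-encoded value by hand.

```javascript
// decodeURIComponent follows RFC 3986 and leaves "+" untouched.
console.log(decodeURIComponent('hello+world')); // hello+world

// URLSearchParams applies form decoding, where "+" means space.
console.log(new URLSearchParams('q=hello+world').get('q')); // hello world

// Hypothetical helper: convert "+" to space first, then percent-decode.
function decodeFormValue(value) {
  return decodeURIComponent(value.replace(/\+/g, ' '));
}

console.log(decodeFormValue('hello+world')); // hello world
console.log(decodeFormValue('1%2B1+%3D+2')); // 1+1 = 2
```

The replacement has to happen before percent-decoding; otherwise a literal plus sign sent as %2B would also be turned into a space.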

Unicode Characters

Non-ASCII characters are encoded as UTF-8 bytes:

encodeURIComponent('café'); // caf%C3%A9
encodeURIComponent('日本'); // %E6%97%A5%E6%9C%AC
encodeURIComponent('🎉');   // %F0%9F%8E%89
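A short sketch of the round trip, assuming a runtime with TextEncoder: each UTF-8 byte becomes one %XX triplet, and decoding restores the original text. It also shows one edge case: a lone surrogate is not valid Unicode text, so encodeURIComponent throws a URIError.

```javascript
const encodedText = encodeURIComponent('日本');
console.log(encodedText);                     // %E6%97%A5%E6%9C%AC
console.log(decodeURIComponent(encodedText)); // 日本

// Two characters, three UTF-8 bytes each, so six %XX triplets.
const byteCount = new TextEncoder().encode('日本').length;
console.log(byteCount); // 6

// A lone surrogate cannot be encoded.
let failed = false;
try {
  encodeURIComponent('\uD800');
} catch (err) {
  failed = err instanceof URIError;
}
console.log(failed); // true
```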

Best Practices

Do

  • Always encode user input before putting it in URLs
  • Use encodeURIComponent() for query parameter values
  • Use URLSearchParams for building query strings
  • Validate URLs before using them
  • Handle decoding errors gracefully
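For the last point in the list above, one possible approach is to wrap decoding in a try/catch, since decodeURIComponent throws a URIError on malformed input. safeDecode and its fallback parameter are hypothetical names used for this sketch.

```javascript
// Hypothetical helper: return a fallback instead of throwing on bad input.
function safeDecode(value, fallback = null) {
  try {
    return decodeURIComponent(value);
  } catch (err) {
    if (err instanceof URIError) {
      return fallback; // malformed escape sequence
    }
    throw err; // anything else is unexpected, so rethrow
  }
}

console.log(safeDecode('hello%20world')); // hello world
console.log(safeDecode('100%'));          // null
console.log(safeDecode('%E0%A4%A', ''));  // prints an empty string
```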

Don't

  • Don't encode URLs that are already encoded
  • Don't use escape() - it's deprecated
  • Don't assume all servers handle encoding the same way
  • Don't trust user-provided URLs without validation

Troubleshooting Common Issues

Broken URLs

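// Problem: URL doesn't work
const query = "cats & dogs";
const brokenUrl = `https://api.com/search?q=${query}`;
// The raw "&" ends the q parameter early, so the server sees q="cats "

// Solution: encode the parameter
const url = `https://api.com/search?q=${encodeURIComponent(query)}`;
// Result: https://api.com/search?q=cats%20%26%20dogs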

Server Receiving Wrong Data

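// Problem: server receives "+" instead of space
// This happens when the server does not apply form decoding to the query string

// Solution: use %20 explicitly
const value = "hello world";
const encoded = new URLSearchParams({ q: value }).toString().replace(/\+/g, '%20');
// Result: q=hello%20world
// A literal "+" in the data is already serialized as %2B, so the replace is safe.
// encodeURIComponent(value) also works: it never emits "+" for a space.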

Conclusion

URL encoding is essential for building robust web applications. The key points to remember:

  • Use encodeURIComponent() for query parameter values
  • Use encodeURI() for complete URLs that need to preserve structure
  • Use URLSearchParams for building and parsing query strings
  • Always validate and encode user input to prevent security issues
  • Be aware of the difference between + and %20 for spaces

For official specification details, see RFC 3986 and the MDN encodeURIComponent documentation.