
What Is urlencode and How URL Encoding Works

Web applications constantly exchange data through URLs: search queries, filters, tracking parameters, multilingual paths, and API calls. However, a URL can contain only a restricted set of characters defined by technical standards. When a string includes spaces, symbols, or non-Latin characters, it must be transformed into a safe representation before transmission. This conversion guarantees that browsers, servers, and intermediaries interpret the address exactly as intended.

This guide explains the technical logic behind the process, demonstrates how it works with practical examples, and highlights common implementation mistakes that lead to broken links or corrupted parameters.

What URL Encoding Means in Technical Terms

RFC 3986 allows only a limited set of ASCII characters in a URL: “unreserved” characters, which never need conversion, and “reserved” characters, which act as structural delimiters. Any other character, and any reserved character used as literal data, must be percent-encoded before it can safely travel across the web.

The transformation rule is straightforward:

% + two hexadecimal digits

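The two hexadecimal digits are the numeric value of one byte of the original character. A character that occupies several bytes in UTF-8 therefore produces several %XX triplets.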

Why This Transformation Is Necessary

If characters are not converted properly:

  • Spaces interrupt the URL path
  • Query strings may split incorrectly
  • Special symbols may be misinterpreted
  • Servers can respond with 400 or 404 errors
  • Security risks may arise from malformed requests

When applied correctly, the conversion ensures:

  • Stable HTTP communication
  • Accurate parsing of parameters
  • Consistent cross-browser behavior
  • Reliable decoding on the server side

How the Encoding Process Works Step by Step

Technically, this mechanism converts characters into their hexadecimal byte representation based on ASCII or UTF-8 encoding. Each resulting byte is prefixed with a percent sign.
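The steps above can be sketched by hand in a few lines. The helper name percentEncode is hypothetical and the function is only an illustration of the rule; in real code the built-in encodeURIComponent is the usual choice.

```javascript
// Hypothetical helper: percent-encode every byte outside the
// RFC 3986 unreserved set (A-Z a-z 0-9 - _ . ~).
function percentEncode(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes of the input
  let out = "";
  for (const byte of bytes) {
    const ch = String.fromCharCode(byte);
    if (/[A-Za-z0-9\-_.~]/.test(ch)) {
      out += ch; // unreserved ASCII character: keep as-is
    } else {
      // "%" followed by two uppercase hexadecimal digits
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncode("new page.html")); // new%20page.html
```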

Example 1: Space in a File Path

Original URL:

https://example.com/new page.html

A space is not permitted inside a path.

Character value of space:

  • Decimal: 32
  • Hexadecimal: 20

Corrected version:

https://example.com/new%20page.html
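In JavaScript, one way to produce this result is to encode the file name on its own and then append it to the base address. The variable names below are illustrative only.

```javascript
// Encode a single path segment; the "/" separators are added
// outside the encoded value so they keep their structural role.
const fileName = "new page.html";
const pageUrl = "https://example.com/" + encodeURIComponent(fileName);

console.log(pageUrl); // https://example.com/new%20page.html
```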

Example 2: Special Symbols Inside Parameters

Original:

https://example.com/search?q=smart locker & indoor

Here the & is part of the search phrase, not a parameter separator, so it must be encoded.

Proper version:

https://example.com/search?q=smart%20locker%20%26%20indoor

Converted elements:

  • Space → %20
  • & → %26

Important: The structural ampersand separating parameters must remain unchanged.
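A minimal sketch of this distinction: each key and value is encoded individually, and the structural & is added only when the pairs are joined. The helper name buildQuery and the extra page parameter are assumptions made for the example.

```javascript
// Hypothetical helper: build a query string from key/value pairs.
function buildQuery(params) {
  return Object.entries(params)
    .map(([key, value]) => encodeURIComponent(key) + "=" + encodeURIComponent(value))
    .join("&"); // structural separator, added after encoding
}

const searchUrl =
  "https://example.com/search?" +
  buildQuery({ q: "smart locker & indoor", page: "2" });

console.log(searchUrl);
// https://example.com/search?q=smart%20locker%20%26%20indoor&page=2
```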

Categories of Characters in URLs

Recognizing character types helps avoid incorrect transformations.

Unreserved Characters

These characters are safe and can remain unchanged:

| Category | Characters |
|----------|------------|
| Letters  | A–Z a–z    |
| Numbers  | 0–9        |
| Symbols  | - _ . ~    |

Example:

https://example.com/product-123_A

No conversion is required.
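A quick way to check whether a string consists only of unreserved characters is a regular expression over that set. The function name needsEncoding is hypothetical.

```javascript
// Matches strings made up solely of RFC 3986 unreserved characters.
const UNRESERVED_ONLY = /^[A-Za-z0-9\-._~]*$/;

// Hypothetical helper: true when at least one character would be changed
// by a strict percent-encoder.
function needsEncoding(segment) {
  return !UNRESERVED_ONLY.test(segment);
}

console.log(needsEncoding("product-123_A")); // false
console.log(needsEncoding("new page.html")); // true
```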

Reserved Characters

These characters define the structure of a URL:

| Character | Role                     |
|-----------|--------------------------|
| ?         | Starts query string      |
| &         | Separates parameters     |
| =         | Assigns parameter values |
| #         | Fragment reference       |
| /         | Path divider             |
| :         | Scheme separator         |

If these symbols are encoded incorrectly, the address loses its logical structure.

Incorrect:

https://example.com/page%3Fid%3D10

Correct:

https://example.com/page?id=10
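The difference becomes visible when both addresses are parsed with the standard URL class: in the encoded variant, the query string disappears into the path.

```javascript
// Correct form: "?" and "=" keep their structural meaning.
const good = new URL("https://example.com/page?id=10");
console.log(good.pathname);               // /page
console.log(good.searchParams.get("id")); // 10

// Incorrect form: the delimiters were encoded, so they are now path data.
const bad = new URL("https://example.com/page%3Fid%3D10");
console.log(bad.pathname);                // /page%3Fid%3D10
console.log(bad.searchParams.get("id"));  // null
```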

Characters That Must Be Converted

The following symbols should be transformed when used as literal data:

  • Space
  • "
  • < >
  • { }
  • |
  • \
  • ^
  • `
  • %

Example with Percent Sign

Original:

https://example.com/50% discount

Safe version:

https://example.com/50%25%20discount

  • % becomes %25
  • Space becomes %20
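The same result can be reproduced in JavaScript. The sketch also shows why a literal % must be encoded: a decoder expects two hexadecimal digits after every percent sign and rejects the raw text.

```javascript
const label = "50% discount";
const encodedLabel = encodeURIComponent(label);

console.log(encodedLabel);                     // 50%25%20discount
console.log(decodeURIComponent(encodedLabel)); // 50% discount

// Decoding the raw text fails because "% d" is not a valid %XX sequence.
let rawDecodeFailed = false;
try {
  decodeURIComponent(label);
} catch (err) {
  rawDecodeFailed = err instanceof URIError;
}
console.log(rawDecodeFailed); // true
```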

URL Encoding in Web Applications

In practical development, when developers describe a value as "URL-encoded" (or "urlencoded"), they usually mean data that has already been converted into percent-encoded format before being sent in an HTTP request.

You will encounter such transformed values in:

  • HTML form submissions
  • AJAX calls
  • Redirect parameters
  • Tracking URLs
  • REST API requests

Example: Form Data

User enters:

John Smith & Co.

Browser sends:

John+Smith+%26+Co.

Note: In the application/x-www-form-urlencoded format, spaces are replaced with + instead of %20.
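The standard URLSearchParams class serializes values in this form format, so it can be used to reproduce the example. The field name company is an assumption, since the original form field is not named.

```javascript
// Form-style serialization: spaces become "+".
const form = new URLSearchParams({ company: "John Smith & Co." });
const formBody = form.toString();
console.log(formBody); // company=John+Smith+%26+Co.

// Parsing the same format turns "+" back into a space.
const parsedCompany = new URLSearchParams(formBody).get("company");
console.log(parsedCompany); // John Smith & Co.

// encodeURIComponent uses %20 for spaces instead.
const componentStyle = encodeURIComponent("John Smith & Co.");
console.log(componentStyle); // John%20Smith%20%26%20Co.
```

Because the two styles differ, a value encoded with + should be decoded by a form-aware parser; decodeURIComponent leaves + untouched.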

Implementation in Code

Most programming languages provide built-in utilities.

JavaScript Example

Encoding a parameter:

encodeURIComponent("indoor locker & storage")

Result:

indoor%20locker%20%26%20storage

Decoding restores the original string:

decodeURIComponent("indoor%20locker%20%26%20storage")
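One detail worth knowing: encodeURIComponent leaves the characters ! ' ( ) * unchanged. When a stricter, RFC 3986 style output is wanted, a small wrapper can encode them as well. The name encodeStrict is hypothetical.

```javascript
// Hypothetical wrapper: also percent-encode ! ' ( ) *,
// which encodeURIComponent leaves as-is.
function encodeStrict(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeURIComponent("Sale! (today)")); // Sale!%20(today)
console.log(encodeStrict("Sale! (today)"));       // Sale%21%20%28today%29
```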

Handling Unicode and Multilingual URLs

Modern systems rely on UTF-8. Non-ASCII characters are first converted into UTF-8 byte sequences and then expressed in percent format.

Example:

Original:

https://example.com/продукт

Converted:

https://example.com/%D0%BF%D1%80%D0%BE%D0%B4%D1%83%D0%BA%D1%82

Each UTF-8 byte becomes %XX.

This mechanism enables proper handling of international content and multilingual websites.
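The percent-encoded path above corresponds to the Cyrillic word "продукт" (Russian for "product"). A short sketch shows the two steps, UTF-8 bytes first and percent format second:

```javascript
const word = "продукт";

// Step 1: each of the 7 Cyrillic letters occupies 2 bytes in UTF-8.
const utf8Bytes = Array.from(new TextEncoder().encode(word));
console.log(utf8Bytes.length); // 14

// Step 2: every byte is written as %XX.
const encodedWord = encodeURIComponent(word);
console.log(encodedWord);
// %D0%BF%D1%80%D0%BE%D0%B4%D1%83%D0%BA%D1%82

console.log(decodeURIComponent(encodedWord) === word); // true
```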

Quick Reference Table

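| Character | Decimal | Hex | Encoded |
|-----------|---------|-----|---------|
| Space     | 32      | 20  | %20     |
| !         | 33      | 21  | %21     |
| "         | 34      | 22  | %22     |
| #         | 35      | 23  | %23     |
| %         | 37      | 25  | %25     |
| &         | 38      | 26  | %26     |
| =         | 61      | 3D  | %3D     |
| ?         | 63      | 3F  | %3F     |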
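The values in this table can be derived programmatically. The sketch below assumes single ASCII characters only; the helper name referenceRow is hypothetical.

```javascript
// Hypothetical helper: compute one reference row for an ASCII character.
function referenceRow(ch) {
  const decimal = ch.charCodeAt(0);
  const hex = decimal.toString(16).toUpperCase().padStart(2, "0");
  return { char: ch, decimal, hex, encoded: "%" + hex };
}

const referenceRows = [" ", "!", '"', "#", "%", "&", "=", "?"].map(referenceRow);
console.log(referenceRows[0]); // { char: ' ', decimal: 32, hex: '20', encoded: '%20' }
```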

Frequent Implementation Mistakes

Double Conversion

Example:

%20 → %2520

This occurs when already transformed data is processed again.

Consequences:

  • Broken redirects
  • Corrupted parameters
  • Complex debugging
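The effect is easy to reproduce: the % produced by the first pass is itself encoded as %25 by the second pass, and a single decode removes only one layer.

```javascript
const once = encodeURIComponent("a b");
const twice = encodeURIComponent(once); // the "%" itself gets encoded

console.log(once);  // a%20b
console.log(twice); // a%2520b

// One decode step leaves the value still encoded.
console.log(decodeURIComponent(twice)); // a%20b
```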

Encoding the Entire Address

Incorrect:

encodeURIComponent("https://example.com/page?id=5")

Only parameter values should be transformed, not the structural components of the address.
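A short sketch of the difference, using the standard URL class to add the parameter instead of encoding the whole string:

```javascript
// Encoding the whole address turns its delimiters into data.
const wrongUrl = encodeURIComponent("https://example.com/page?id=5");
console.log(wrongUrl); // https%3A%2F%2Fexample.com%2Fpage%3Fid%3D5

// Safer: keep the structure and let the URL API encode only the value.
const target = new URL("https://example.com/page");
target.searchParams.set("id", "5");
console.log(target.href); // https://example.com/page?id=5
```

Encoding a complete address is appropriate in one case: when that address is itself the value of a parameter, such as a redirect target.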

Inconsistent Canonical Handling

If the same page is reachable under both an encoded and an unencoded form of its URL (for example %7E and ~, or %2d and %2D), crawlers and caches may treat them as separate, duplicate URLs. URL generation logic should emit one consistent form.
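One possible normalization step, following the rules in RFC 3986 section 6.2.2, is to decode triplets that represent unreserved characters and to uppercase the hexadecimal digits of all others. The function name normalizePercentEncoding is hypothetical, and this sketch covers only that single step, not full canonicalization.

```javascript
// Hypothetical helper: bring percent-encoded triplets into one consistent form.
function normalizePercentEncoding(url) {
  return url.replace(/%[0-9a-fA-F]{2}/g, (triplet) => {
    const ch = String.fromCharCode(parseInt(triplet.slice(1), 16));
    // Unreserved characters never need encoding; everything else stays
    // encoded, with uppercase hex digits.
    return /[A-Za-z0-9\-._~]/.test(ch) ? ch : triplet.toUpperCase();
  });
}

console.log(normalizePercentEncoding("https://example.com/%7Euser/new%2dpage%2fa"));
// https://example.com/~user/new-page%2Fa
```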

Why Proper Implementation Matters

At its foundation, this mechanism ensures that characters unsafe for direct transmission are represented in a standardized, transport-friendly format. It keeps HTTP communication predictable and machine-readable.

It functions behind the scenes in:

  • Web frameworks
  • Routing systems
  • E-commerce filters
  • API endpoints
  • Analytics tracking

When implemented correctly, it remains invisible to users. When misconfigured, it leads to malformed requests, indexing issues, and unstable application behavior.

A clear understanding of how character transformation works is essential for developers building reliable, scalable web systems.