
What Is URL Encoding and Why It Is Used

Legal Characters in URL Explained

A URL is not just a line of text. It is a structured identifier with rules about which symbols can appear directly, which must be encoded, and which may break parsing if used carelessly. When developers, content managers, and SEO specialists ignore those rules, they get broken links, routing errors, malformed parameters, and pages that behave differently across browsers, servers, and analytics tools.

This article explains which symbols are safe, which require percent-encoding, and how to build clean links that remain readable and technically valid. The focus is practical: real examples, common mistakes, and a clear framework for creating addresses that work reliably.

Which symbols are accepted in a web address

When people ask about characters allowed in url, they usually want a simple answer. The short version is this: letters, digits, and a limited set of separators are generally safe, but the exact rule depends on the part of the URL.

A standard address may contain these components:

  • scheme: https
  • host: example.com
  • path: /blog/url-rules
  • query: ?page=2&sort=asc
  • fragment: #section-1

Each section follows its own syntax. A slash is normal in the path, but the same symbol inside a parameter value may need encoding depending on context.
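As an illustration, Python's standard urllib.parse module splits an address into exactly these components:

```python
from urllib.parse import urlsplit

# Split a full address into its structural parts.
parts = urlsplit("https://example.com/blog/url-rules?page=2&sort=asc#section-1")

print(parts.scheme)    # https
print(parts.netloc)    # example.com
print(parts.path)      # /blog/url-rules
print(parts.query)     # page=2&sort=asc
print(parts.fragment)  # section-1
```

Working with the parsed parts, rather than slicing the raw string, is what lets each section follow its own encoding rules.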

Safe symbols you can usually keep readable

These are commonly treated as valid web address characters when used in the correct place:

  • uppercase and lowercase letters: A-Z, a-z
  • digits: 0-9
  • hyphen: -
  • underscore: _
  • period: .
  • tilde: ~

These symbols are widely accepted because they do not usually conflict with parsing rules.

Practical example

This URL is clean and predictable:

https://example.com/guides/url-format-basics

It uses readable separators, no spaces, and a path that is easy for both users and servers to handle.

Reserved symbols and why context matters

Some symbols have a structural role. These are known as reserved url characters because parsers already assign meaning to them.

Common examples include:

  • : (colon) separates the scheme from the rest of the address
  • / (slash) divides path segments
  • ? (question mark) starts the query string
  • & (ampersand) separates query parameters
  • = (equals sign) assigns parameter values
  • # (hash) starts the fragment
  • @ (at sign) may separate credentials or appear in other contexts

These symbols are not “bad” by themselves. The problem starts when you use them as ordinary text inside a value without encoding them.

Example: same symbol, different role

Compare these two cases:

https://example.com/search?q=red&blue

https://example.com/search?q=red%26blue

In the first line, the ampersand splits parameters. In the second, %26 means the literal & character belongs to the search term.
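The difference becomes concrete when the query string is parsed programmatically, for example with Python's urllib.parse.parse_qs:

```python
from urllib.parse import parse_qs

# Raw ampersand: the parser treats "blue" as a second, valueless parameter.
print(parse_qs("q=red&blue"))    # {'q': ['red']}

# Encoded ampersand: the whole string stays inside the 'q' value.
print(parse_qs("q=red%26blue"))  # {'q': ['red&blue']}
```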

Why percent-encoding exists

Percent-encoding replaces a character with % followed by its hexadecimal byte value. This lets you include symbols that would otherwise confuse the parser.

Examples:

  • space → %20
  • & → %26
  • # → %23
  • % → %25

This is the key to handling special characters in url values correctly.
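These encodings do not need to be typed by hand; Python's urllib.parse.quote produces them directly:

```python
from urllib.parse import quote

# safe="" tells quote to encode everything, including characters
# it would otherwise leave alone (by default it keeps "/").
print(quote(" ", safe=""))  # %20
print(quote("&", safe=""))  # %26
print(quote("#", safe=""))  # %23
print(quote("%", safe=""))  # %25
```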

Symbols that often cause broken links

Some characters are technically possible in limited cases, but they often create operational problems in CMS platforms, routing systems, proxies, or analytics tools. That is why teams talk about invalid url characters even when the strict standard is more nuanced.

High-risk input examples

The following symbols should be reviewed carefully or encoded before use:

  • space
  • quotation marks
  • angle brackets
  • backslash
  • curly braces
  • pipe
  • caret
  • backtick

A raw space inside a path is a classic mistake:

https://example.com/my file.pdf

A safer version is:

https://example.com/my-file.pdf

or, if the original string must be preserved:

https://example.com/my%20file.pdf
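When the original filename must be preserved, it is safer to let a library produce the encoded form than to insert escapes manually. A small Python sketch:

```python
from urllib.parse import quote

filename = "my file.pdf"

# quote() percent-encodes the space; letters, digits, and "." pass through.
encoded = quote(filename)
print(encoded)  # my%20file.pdf

url = "https://example.com/" + encoded
print(url)      # https://example.com/my%20file.pdf
```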

Characters that should not be pasted blindly into slugs

Many systems reject or normalize these as illegal url characters in slug generation:

  1. spaces
  2. repeated punctuation
  3. non-encoded %
  4. raw # inside filenames
  5. raw ? inside resource names

A route may appear to work locally and then fail in production because the reverse proxy, framework, and browser do not normalize the same way.

How to build readable and safe URL slugs

The most stable pattern is to use short lowercase words and join them with hyphens. These are the most common url friendly characters for article slugs, product pages, and category paths.

Recommended slug rules

  • use lowercase letters
  • use digits only when they add meaning
  • separate words with hyphens
  • avoid spaces and mixed separators
  • remove decorative punctuation
  • keep slugs short

Example:

https://example.com/blog/how-routing-works

Less consistent version:

https://example.com/Blog/How_Routing!Works?

The second version introduces case inconsistency, mixed punctuation, and a trailing symbol with parsing meaning.

Better slug conversion examples

  • URL Rules for Beginners → /url-rules-for-beginners
  • Price List: 2026 Edition → /price-list-2026-edition
  • C# vs. C++ Guide → /c-sharp-vs-c-plus-plus-guide
  • Summer Sale 50% Off → /summer-sale-50-percent-off

This is a safer strategy than trying to preserve every original symbol.
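A slug generator along these lines can be sketched in a few lines of Python. This is a minimal example: symbol-to-word mappings such as "C#" → "c-sharp" or "%" → "percent" would need an additional replacement table on top of it.

```python
import re
import unicodedata

def slugify(title: str) -> str:
    """Lowercase, ASCII-fold, and hyphen-join a title (minimal sketch)."""
    # Strip accents and non-ASCII characters so "café" becomes "cafe".
    text = unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    text = text.lower()
    # Collapse every run of non-alphanumeric characters into one hyphen.
    text = re.sub(r"[^a-z0-9]+", "-", text).strip("-")
    return text

print(slugify("URL Rules for Beginners"))   # url-rules-for-beginners
print(slugify("Price List: 2026 Edition"))  # price-list-2026-edition
```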

Reading the difference between valid, reserved, and unsafe

A lot of confusion comes from mixing three categories:

Directly usable symbols

These are often safe as plain text in many contexts:

  • letters
  • digits
  • hyphen
  • underscore
  • period
  • tilde

Structural symbols

These are meaningful to the parser and should only appear literally when you want that structure:

  • :
  • /
  • ?
  • &
  • =
  • #

Input that should be encoded or normalized

These often create trouble in links copied from user-generated content:

  • spaces
  • quotes
  • brackets in some contexts
  • raw percent signs
  • nonstandard punctuation

That distinction helps teams avoid confusion when discussing legal characters in URL policies across development, SEO, and content workflows.

Common technical mistakes in real projects

Many broken links come from ordinary workflow errors rather than deep protocol problems.

1. Mixing path rules with query rules

A slash is normal in a path:

https://example.com/docs/setup/install

But the same slash inside a parameter value may need review depending on how the backend interprets it.
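Python's urllib.parse.quote illustrates the context dependence: by default it treats the slash as safe, which is appropriate for paths, while safe="" encodes it, which is appropriate for parameter values.

```python
from urllib.parse import quote

value = "docs/setup/install"

# Default safe="/" leaves slashes alone -- fine when building a path:
print(quote(value))            # docs/setup/install

# For a parameter VALUE, encode the slash too:
print(quote(value, safe=""))   # docs%2Fsetup%2Finstall
```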

2. Encoding twice

Wrong:

name=red%2520car

If %20 becomes %2520, the percent sign itself was encoded again.
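The mistake is easy to reproduce by running an already-encoded string through the encoder a second time:

```python
from urllib.parse import quote, unquote

value = "red car"
once = quote(value, safe="")   # red%20car
twice = quote(once, safe="")   # red%2520car -- the "%" was encoded again

# Decoding once now yields the wrong string:
print(unquote(twice))  # red%20car, not "red car"
```

The practical rule: encode exactly once, at the point where the raw value enters the URL, and never re-encode data that is already percent-encoded.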

3. Allowing raw user input into URLs

If a title field becomes a path segment without normalization, one article may generate spaces, quotes, and fragments accidentally.

4. Treating all browsers and servers as identical

One platform may silently repair a malformed address. Another may reject it, redirect unexpectedly, or log a different value.

Practical checklist for clean URL creation

Use this workflow when building links manually or generating them in code:

  1. decide which part of the URL you are editing
  2. keep paths human-readable and minimal
  3. use lowercase slugs with hyphens
  4. encode user input before inserting it into parameters
  5. avoid decorative punctuation in paths
  6. test the final URL in browser, server logs, and analytics tools
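Step 4 of the checklist, encoding user input before it reaches the query string, is usually handled by a library function rather than by hand. A sketch with Python's urllib.parse.urlencode:

```python
from urllib.parse import urlencode

# Untrusted values go through urlencode, never raw string concatenation.
params = {"color": "space gray", "note": "50% off & more"}
query = urlencode(params)

print(query)  # color=space+gray&note=50%25+off+%26+more
url = "https://shop.example.com/products/wireless-keyboard?" + query
```

By default urlencode uses "+" for spaces (form-encoding style); passing quote_via=quote produces %20 instead.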

Example of a well-structured address

https://shop.example.com/products/wireless-keyboard?color=space-gray&layout=us

Why it works:

  • the path is readable
  • the query structure is explicit
  • separators are used for their intended roles
  • no ambiguous whitespace or broken punctuation appears in the slug

Conclusion

A reliable URL is built from structure, not guesswork. The safest approach is to keep paths simple, use readable separators, encode data when symbols carry special meaning, and avoid raw input that can confuse parsers. Once you separate safe text symbols from structural delimiters, URL handling becomes much more predictable.

For daily work, the most practical rule is simple: keep paths clean, reserve special symbols for their technical purpose, and encode anything that might be interpreted incorrectly. The result is fewer broken links, cleaner logs, fewer routing bugs, and more stable behavior across browsers, frameworks, and servers.