The seven most common symptoms of broken URL encoding, what causes each, and the practical fix, from %2520 double-encoding to garbled non-English characters.
%20 instead of spaces
Symptom: Users see https://example.com/My%20Document in their address bar where they expected https://example.com/My Document.
Cause: This is normal. Browsers display the encoded form of the URL but render decoded text where they can. %20 in the address bar is correct and expected for spaces in paths.
Fix: No fix needed. The URL is working correctly. If it bothers you visually, you can copy and decode it for readability — but the literal form is the canonical, transmittable form.
%2520 instead of %20
Symptom: The receiving system sees a literal %20 as if it were data, not a space.
Cause: Double-encoding. The string was encoded twice. The percent sign % in %20 got encoded as %25 on the second pass, giving %2520.
How it happens:
Calling encodeURIComponent on a value that already came from encodeURIComponent.
Fix: Find the layer doing the second encoding and remove it. Two strategies:
If you can’t change the upstream code, decode the input once before processing: decodeURIComponent(input).
If our tool gets a double-encoded string, enable “Decode recursively” on the decoder. It’ll keep decoding until no %XX remain — up to 16 rounds.
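To make the mechanism concrete, here is a minimal sketch of how a second encoding pass produces %2520, and of a bounded recursive decode. The helper name decodeRecursively is hypothetical, and the 16-round cap simply mirrors the limit mentioned above; it is not a description of how any particular decoder is implemented.

```javascript
// Encoding an already-encoded value escapes the "%" itself.
const once = encodeURIComponent("My Document"); // "My%20Document"
const twice = encodeURIComponent(once);         // "My%2520Document"

// Hypothetical helper: decode repeatedly until no %XX sequence remains,
// the string stops changing, or the round cap is reached.
function decodeRecursively(input, maxRounds = 16) {
  let current = input;
  for (let round = 0; round < maxRounds; round++) {
    // Stop as soon as there is nothing left that looks percent-encoded.
    if (!/%[0-9A-Fa-f]{2}/.test(current)) break;
    let next;
    try {
      next = decodeURIComponent(current);
    } catch {
      break; // malformed sequence: return what has been decoded so far
    }
    if (next === current) break;
    current = next;
  }
  return current;
}

const recovered = decodeRecursively(twice); // "My Document"
```

Recursive decoding is a repair tool rather than a default: if the original data legitimately contained the text %20, repeated decoding will turn it into a space.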
café, café, or caf%E9Symptom: Special characters render as accented Latin gibberish or get replaced with question marks.
Cause: Character-set mismatch. The original bytes were encoded with one charset, and you’re decoding with another.
Specific diagnoses:
café instead of café — UTF-8 encoded, decoded as ISO-8859-1 (or Windows-1252).café with an unmapped char — UTF-8 encoded, decoded as a single-byte charset that has no character at that position.caf%E9 — the input contained bytes invalid in the chosen charset; the decoder gave up partway.Fix: Change the destination character set on the decoder. Try UTF-8 first (the default — most modern data). If it’s legacy Western European data, try Windows-1252 or ISO-8859-1. For old Russian, try Windows-1251 or KOI8-R. For old Japanese, Shift_JIS or EUC-JP. For old Chinese, GBK or Big5.
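The first and third diagnoses can be reproduced in a few lines. This sketch assumes Node.js, since Buffer is Node-specific, and uses the latin1 encoding as a stand-in for ISO-8859-1; the variable names are illustrative only.

```javascript
// "é" is two bytes in UTF-8 (0xC3 0xA9) but one byte in ISO-8859-1 (0xE9).
const utf8Bytes = Buffer.from("café", "utf8");

// Reading UTF-8 bytes as a single-byte charset turns each byte into its own character.
const misread = utf8Bytes.toString("latin1"); // "cafÃ©"
const correct = utf8Bytes.toString("utf8");   // "café"

// decodeURIComponent only understands UTF-8. A lone %E9 is valid ISO-8859-1
// but not valid UTF-8, so the call throws instead of returning "café".
let decodeError = null;
try {
  decodeURIComponent("caf%E9");
} catch (err) {
  decodeError = err; // a URIError
}

// The UTF-8 form of the same character decodes cleanly.
const decodedUtf8 = decodeURIComponent("caf%C3%A9"); // "café"
```

If a decoder throws or stops partway on input like caf%E9, that is usually a hint that the data was produced with a legacy single-byte charset rather than UTF-8.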
Symptom: You send ?q=rock & roll&lang=en, the server reads q=rock and a separate parameter roll that wasn’t in your data.
Cause: Reserved characters in the value weren’t encoded. The & in your data is being interpreted as a parameter separator. Same problem with literal # ending the query and starting a fragment.
Fix: Encode each parameter value with encodeURIComponent (or equivalent in your language) before assembling the URL:
// Wrong
const url = `?q=${userInput}&lang=en`;
// Right
const url = `?q=${encodeURIComponent(userInput)}&lang=en`;
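To see what the server side makes of each version, the query strings can be parsed with the standard URLSearchParams class. This is a small sketch; the sample input "rock & roll" comes from the symptom above, and the variable names are illustrative.

```javascript
const sampleInput = "rock & roll";

// Unencoded: the "&" inside the value is read as a parameter separator.
const broken = new URLSearchParams(`q=${sampleInput}&lang=en`);
const brokenKeys = [...broken.keys()]; // three keys instead of two
const brokenQ = broken.get("q");       // "rock " (the rest of the value is lost)

// Encoded: the "&" travels as %26 and stays inside the value.
const fixed = new URLSearchParams(`q=${encodeURIComponent(sampleInput)}&lang=en`);
const fixedKeys = [...fixed.keys()];   // ["q", "lang"]
const fixedQ = fixed.get("q");         // "rock & roll"
```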
Symptom: A search for C++ programming returns results as if the user searched for C programming.
Cause: Confusion between the two URL-encoding conventions for spaces.
Form encoding (application/x-www-form-urlencoded): + means space; literal plus is %2B.
RFC 3986 strict: + is a literal plus; space is always %20.
Browsers use form encoding when submitting HTML forms via GET. Most server frameworks expect form encoding in query strings. Path components use RFC 3986 strict.
Fix: Decide which convention applies to your context, then either:
On the encoder side: use the right function. JavaScript’s encodeURIComponent produces RFC 3986-style output (space as %20, literal plus as %2B). URLSearchParams produces form encoding (+ for space, %2B for a literal plus).
On the decoder side: toggle the “Treat + as space” option. On for form-encoded data (the default), off for path components.
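The two conventions are easiest to compare side by side. The sketch below uses only standard JavaScript functions; decodeFormValue is a hypothetical helper that shows what a "treat + as space" option amounts to, not the implementation of any specific tool.

```javascript
const term = "C++ programming";

// RFC 3986 style: space becomes %20, plus becomes %2B.
const strict = encodeURIComponent(term); // "C%2B%2B%20programming"

// Form encoding: space becomes "+", plus still becomes %2B.
const form = new URLSearchParams({ q: term }).toString(); // "q=C%2B%2B+programming"

// The bug: an unescaped "+" read by a form decoder turns into a space.
const lostPluses = new URLSearchParams("q=C++ programming").get("q"); // "C   programming"

// Hypothetical form-style decoder: treat "+" as space, then percent-decode.
function decodeFormValue(value) {
  return decodeURIComponent(value.replace(/\+/g, " "));
}

const roundTrip = decodeFormValue("C%2B%2B+programming"); // "C++ programming"
const strictDecode = decodeURIComponent("C++");           // "C++" (plus left alone)
```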
Symptom: The URL works locally but fails in production with errors like “414 Request-URI Too Large” or just silent truncation.
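Cause: Browser, proxy, or server URL-length limits. Practical limits vary by component and configuration, so a URL that fits locally can exceed a limit somewhere along the production path.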
Fix: If your URL exceeds 2,000 characters, restructure. Use POST instead of GET for large payloads. Send big data in the request body, not the URL. For sharing long URLs, use a URL shortener.
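One way to apply this rule in code is to build the query string first and choose the method based on the resulting length. The sketch below is an assumption-laden illustration: planRequest is a hypothetical helper, the endpoint is a placeholder, and the 2,000-character threshold is the rule of thumb from the fix above rather than a hard limit. It also assumes the server accepts a form-encoded POST body for the same parameters.

```javascript
// Hypothetical helper: use GET when the URL is short enough, otherwise
// move the same parameters into a form-encoded POST body.
function planRequest(endpoint, params, maxUrlLength = 2000) {
  const query = new URLSearchParams(params).toString();
  const url = `${endpoint}?${query}`;
  if (url.length <= maxUrlLength) {
    return { method: "GET", url };
  }
  return {
    method: "POST",
    url: endpoint,
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: query,
  };
}

// Placeholder endpoint for illustration only.
const small = planRequest("https://example.com/search", { q: "rock & roll" });
const large = planRequest("https://example.com/search", { q: "x".repeat(5000) });
// small.method is "GET"; large.method is "POST" with the data in large.body
```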
Symptom: URL behaviour differs between browsers.
Cause: Browsers normalize URLs slightly differently. Common cases:
Some browsers decode %2F back to / in the displayed URL but pass it through differently
Some servers reject encoded slashes (%2F) in paths as a security measure (against path-traversal)
Fix: Test in every browser you support. If a path needs to contain a literal / as data, change the structure (encode it as a query parameter instead). If you’re hitting the encoded-slash-in-path restriction, configure the server to allow it (Apache: AllowEncodedSlashes On; Nginx: special handling needed).
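As a sketch of the restructuring suggested in the fix, the example below carries a value containing a slash first as a path segment and then as a query parameter. The URLs and the parameter name id are placeholders chosen for illustration.

```javascript
const fileId = "reports/2024"; // data that contains a literal slash

// As a path segment, the slash must travel as %2F, which some servers reject.
const inPath = new URL(`https://example.com/files/${encodeURIComponent(fileId)}`);
const pathForm = inPath.pathname; // "/files/reports%2F2024"

// As a query parameter, the same value avoids path handling entirely.
const inQuery = new URL("https://example.com/files");
inQuery.searchParams.set("id", fileId);
const queryForm = inQuery.href;                    // "https://example.com/files?id=reports%2F2024"
const roundTripId = inQuery.searchParams.get("id"); // "reports/2024"
```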
When something doesn’t look right:
1. Look at the raw URL. Open your browser’s DevTools → Network tab → reload. The actual URL that gets sent is in the request line. Compare with what you intended.
2. Decode it. Paste the broken URL into our URL decoder. If the decoded form has extra % signs, you have double-encoding. If the decoded form looks garbled, you have a charset mismatch.
3. Parse the components. Use our URL parser to see how the browser/server splits the URL into protocol, host, path, query, and fragment. The component that’s wrong is where your bug is.
4. Test with a known-good value. Replace your dynamic input with a hardcoded literal that has no special characters. If the URL works, encoding is the bug. If it still fails, the bug is elsewhere.
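Steps 2 and 3 can also be done from a script or the browser console with the standard URL class. The sample URL below is made up for illustration and deliberately contains a double-encoded path.

```javascript
// Illustrative URL with a double-encoded space in the path.
const raw = "https://example.com:8080/docs/My%2520Document?q=rock%20%26%20roll&lang=en#top";
const parsed = new URL(raw);

// Step 3: split the URL into components.
const parts = {
  protocol: parsed.protocol, // "https:"
  host: parsed.host,         // "example.com:8080"
  pathname: parsed.pathname, // "/docs/My%2520Document"
  search: parsed.search,     // "?q=rock%20%26%20roll&lang=en"
  hash: parsed.hash,         // "#top"
};

// Step 2: decode once. A leftover %XX in the result points to double-encoding.
const decodedPath = decodeURIComponent(parsed.pathname); // "/docs/My%20Document"
const looksDoubleEncoded = /%[0-9A-Fa-f]{2}/.test(decodedPath); // true

// Query parameters, already decoded by the parser.
const queryParams = Object.fromEntries(parsed.searchParams); // { q: "rock & roll", lang: "en" }
```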
Double-encoding. Your string was URL-encoded twice. The % in %20 got encoded as %25 on the second pass, giving %2520. Find the upstream code that's applying a second round of encoding, or use the "Decode recursively" option on our decoder to peel off both layers automatically.
Character-set mismatch. The bytes were encoded in a different charset than you're decoding with. Switch the destination charset selector — try Windows-1252 for old Western European data, Shift_JIS for old Japanese, GBK for Simplified Chinese, KOI8-R for old Russian. UTF-8 is the modern default.
The ampersand in your data wasn't encoded. & is a reserved character that separates query parameters; an unencoded & inside a value gets interpreted as the start of a new parameter. Fix: pass each value through encodeURIComponent before assembling the URL.
Two URL-encoding conventions exist. In form-encoded query strings (what HTML form GET produces), + means space. In RFC 3986 paths, + is a literal plus. The decoder's "Treat + as space" option lets you pick — turn it on for query strings, off for path components.
There is no universal limit, but practical limits exist: Apache caps the request line at 8,190 bytes by default, Nginx at 8 KB by default (the size of one large_client_header_buffers buffer), and some CDNs and proxies are configured lower. Older browsers had ~2,000-character limits. For payloads larger than this, use POST with a request body instead of GET with query parameters.