What is JSON

JSON (JavaScript Object Notation) is a lightweight, text-based data format designed for human readability and machine parsing. It emerged in the early 2000s as Douglas Crockford's alternative to XML for data interchange. Today, it's the de facto standard for APIs, configuration files, and any structured data that needs to move between systems.

JSON's dominance comes from its simplicity: minimal syntax, broad language support, and native integration with JavaScript (making it essential for web development). If you're building APIs, consuming REST endpoints, or storing semi-structured data, you'll work with JSON constantly.

JSON Syntax: The 6 Value Types

JSON supports exactly six data types. Mastering these is the foundation of avoiding parsing errors.

1. String

Always wrapped in double quotes. Single quotes break the parser.

{
  "name": "Alice",
  "email": "[email protected]",
  "bio": "Line 1\nLine 2"
}

2. Number

Integer or floating-point. No commas in large numbers, no leading zeros (except 0 itself).

{
  "age": 30,
  "price": 19.99,
  "count": -5,
  "scientific": 1.23e-4
}

3. Boolean

Exactly true or false (lowercase).

4. Null

Represents absence of value. Written as null (lowercase).
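A quick JavaScript sketch (Node.js assumed) of how these literals behave: lowercase true, false, and null round-trip through the parser, while capitalized variants are a syntax error.

```javascript
// Booleans and null must be lowercase literals
const parsed = JSON.parse('{"active": true, "deleted": false, "middleName": null}');
console.log(parsed.active);     // true
console.log(parsed.middleName); // null

// Capitalized variants are rejected with a SyntaxError
let failed = false;
try {
  JSON.parse('{"active": True}');
} catch (e) {
  failed = true;
}
console.log(failed); // true
```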

5. Array

Ordered list of values separated by commas, wrapped in [ and ].

{
  "tags": ["javascript", "api", "json"],
  "matrix": [[1, 2], [3, 4]],
  "mixed": [1, "text", true, null]
}

6. Object

Key-value pairs where keys must be strings. Wrapped in { and }.

{
  "user": {
    "id": 123,
    "active": true,
    "roles": ["admin", "user"]
  }
}

Whitespace Rules

Whitespace (spaces, tabs, newlines) is allowed almost everywhere except inside strings. Use it for readability in configuration, but minify for transmission. Both are equally valid JSON:

{"name":"Alice","age":30}
{
  "name": "Alice",
  "age": 30
}
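Since whitespace outside strings is insignificant, both forms parse to the same data; a small JavaScript sketch (Node.js assumed):

```javascript
const minified = '{"name":"Alice","age":30}';
const pretty = '{\n  "name": "Alice",\n  "age": 30\n}';

const a = JSON.parse(minified);
const b = JSON.parse(pretty);

// Same data either way
console.log(a.name === b.name && a.age === b.age); // true

// JSON.stringify's third argument controls indentation:
// stringify(value) minifies, stringify(value, null, 2) pretty-prints
console.log(JSON.stringify(a) === minified); // true
```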

Character Encoding

JSON text must be encoded in UTF-8 when exchanged between systems (per RFC 8259). Special characters in strings use backslash escapes: \n (newline), \t (tab), \" (quote), \\ (backslash). Unicode characters can also be written in \uXXXX form:

{"text": "Hello\nWorld", "copyright": "\u00A9 2024"}

JSON vs Other Formats: When to Use What

JSON vs XML

XML is verbose: every element needs a closing tag, which adds overhead for nested structures. JSON is more compact and human-readable. XML shines when you need metadata attributes on elements or mixed content, and its schema tooling (XSD) is mature. For APIs, JSON wins. For document markup, XML is still standard.

// XML (verbose)
<user id="123">
  <name>Alice</name>
</user>

// JSON (compact)
{"id": 123, "name": "Alice"}

JSON vs YAML

YAML is human-friendly: it supports comments and uses indentation (whitespace-sensitive), making it ideal for configuration files, though that whitespace sensitivity is a frequent source of subtle errors. JSON is stricter and more universal. Use YAML for config.yaml and Kubernetes manifests; use JSON for APIs.

JSON vs TOML

TOML is designed for configuration files with explicit section headers and intuitive syntax. It's more readable than JSON for humans but less universal. Use TOML for application settings (like Cargo.toml) and JSON for data interchange.

JSON vs Protocol Buffers / MessagePack

Protocol Buffers and MessagePack are binary formats optimized for size and speed, not human readability. Use them for internal RPC systems where bandwidth matters. JSON is the right choice for REST APIs and any system where debuggability is important.

Common JSON Mistakes That Break Parsers

Trailing Commas

A comma after the last item is invalid. Strict parsers, including JavaScript's built-in JSON.parse, reject it:

// INVALID
{"name": "Alice", "age": 30,}

// VALID
{"name": "Alice", "age": 30}

Single Quotes Instead of Double Quotes

JSON requires double quotes. Single quotes are valid JavaScript but not valid JSON:

// INVALID
{'name': 'Alice'}

// VALID
{"name": "Alice"}

Comments

JSON doesn't support comments (unlike JavaScript). Strip comments before parsing, or use JSONC (JSON with Comments) extensions for internal tooling:

// INVALID
{
  // This is a user
  "name": "Alice"
}

// VALID: Remove comments before parsing
{"name": "Alice"}

Unquoted Keys

Object keys must always be strings (quoted):

// INVALID
{name: "Alice"}

// VALID
{"name": "Alice"}

NaN and Undefined

JSON doesn't support NaN, Infinity, or undefined. Convert these to strings or null:

// INVALID
{"result": NaN, "value": undefined}

// VALID
{"result": null, "value": null}
// OR
{"result": "NaN", "value": "undefined"}

Dates

JSON has no native date type. Use ISO 8601 strings or Unix timestamps:

// GOOD
{"created": "2024-04-15T10:30:00Z", "timestamp": 1713180600}

// AVOID
{"created": Mon Apr 15 2024}

Pro tip: Use JSON Formatter & Validator to catch these errors instantly. It validates syntax and shows exactly where parsing fails.

JSON Schema: Validation That Scales

What is JSON Schema

JSON Schema is a declarative language for validating JSON structure and types. Instead of hand-coding validation logic, you define a schema once and apply it across systems. It's essential for API contracts, CI/CD pipelines, and any system handling untrusted input.

A Real Schema Example

Here's a schema for a user object:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "id": {"type": "integer", "minimum": 1},
    "name": {"type": "string", "minLength": 1, "maxLength": 100},
    "email": {"type": "string", "format": "email"},
    "roles": {
      "type": "array",
      "items": {"type": "string", "enum": ["admin", "user", "guest"]},
      "minItems": 1
    },
    "active": {"type": "boolean"},
    "tags": {
      "type": "array",
      "items": {"type": "string"}
    }
  },
  "required": ["id", "name", "email"],
  "additionalProperties": false
}

This schema enforces: id is a positive integer, name is 1-100 characters, email follows email format (note: many validators only check format when explicitly enabled), roles must be one of three values, and only specified properties are allowed.
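In practice you'd use a library such as Ajv (JavaScript) or jsonschema (Python) rather than hand-rolling checks, but a tiny sketch of what a validator enforces for the type, required, and minimum keywords above might look like this (a simplified illustration, not a real JSON Schema implementation):

```javascript
// Minimal illustration of a few JSON Schema keywords: type, required, minimum
function validateUser(data) {
  const errors = [];
  for (const field of ['id', 'name', 'email']) {
    if (!(field in data)) errors.push(`missing required property: ${field}`);
  }
  if ('id' in data && (!Number.isInteger(data.id) || data.id < 1)) {
    errors.push('id must be an integer >= 1');
  }
  if ('name' in data && (typeof data.name !== 'string' || data.name.length < 1)) {
    errors.push('name must be a non-empty string');
  }
  return errors;
}

console.log(validateUser({ id: 123, name: 'Alice', email: 'alice@example.com' })); // []
console.log(validateUser({ id: 0, name: 'Alice' })); // two errors
```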

Why JSON Schema Matters

In production APIs, schema validation prevents invalid data from corrupting your database. Client validation catches errors early; server validation ensures security. Schema is also self-documenting: it's both code and specification. Tools like JSON Editor use schema to provide autocomplete and real-time validation.

Working with Large JSON

Streaming Parsers

For multi-gigabyte JSON files, loading the entire structure into memory fails. Streaming parsers (SAX-style) process the file incrementally, emitting events for each field. Most languages support this: Node.js has JSONStream, Python has ijson, Go has json.Decoder.
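Full SAX-style streaming needs one of those libraries. As a dependency-free sketch of the same idea, here is line-delimited JSON (NDJSON, a different but related format) processed one record at a time, so memory stays proportional to a single record rather than the whole file (the input here is an in-memory string for illustration):

```javascript
// Process newline-delimited JSON (NDJSON) one record at a time
function* parseNdjson(text) {
  for (const line of text.split('\n')) {
    if (line.trim() !== '') yield JSON.parse(line);
  }
}

const ndjson = '{"id": 1}\n{"id": 2}\n{"id": 3}\n';
let count = 0;
for (const record of parseNdjson(ndjson)) {
  count += 1; // handle each record, then let it be garbage-collected
}
console.log(count); // 3
```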

Pagination and Chunking

When serving large datasets via API, paginate results. Return 50-100 items per page with a next token, not 10,000 items in one response:

{
  "data": [{...}, {...}],
  "page": 1,
  "pageSize": 50,
  "total": 10000,
  "nextPage": 2,
  "hasMore": true
}
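A sketch of a server-side helper that produces this envelope (the paginate name is illustrative; field names mirror the example above):

```javascript
// Build one page of a paginated response envelope
function paginate(items, page, pageSize) {
  const start = (page - 1) * pageSize;
  const data = items.slice(start, start + pageSize);
  const hasMore = start + pageSize < items.length;
  return {
    data,
    page,
    pageSize,
    total: items.length,
    nextPage: hasMore ? page + 1 : null,
    hasMore,
  };
}

const items = Array.from({ length: 120 }, (_, i) => ({ id: i + 1 }));
const page2 = paginate(items, 2, 50);
console.log(page2.data.length, page2.nextPage, page2.hasMore); // 50 3 true
```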

Don't Embed Binaries

Never base64-encode images or files into JSON responses. It bloats payload size and breaks streaming. Return a URL or separate file upload instead:

// BAD
{"image": "iVBORw0KGgoAAAANSUhEUgAAAAUA..."}

// GOOD
{"imageUrl": "https://cdn.example.com/image.png"}

Security Considerations

JSON Injection

If user input is inserted directly into JSON without escaping, attackers can inject malicious data. Always use your language's JSON encoder, never concatenate strings:

// VULNERABLE
const json = '{"name": "' + userName + '"}';

// SAFE
const json = JSON.stringify({name: userName});
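To see why this matters, here is the same pattern fed a hostile input: string concatenation lets the attacker smuggle in an extra field, while the native encoder escapes the quotes so the whole payload stays a single string value (Node.js assumed):

```javascript
const userName = 'Alice", "admin": true, "x": "';

// Concatenation: the quote in userName terminates the value early,
// and the attacker's "admin": true becomes a real key
const unsafe = '{"name": "' + userName + '"}';
console.log(JSON.parse(unsafe).admin); // true -- injected!

// JSON.stringify escapes the embedded quotes, so no key is injected
const safe = JSON.stringify({ name: userName });
console.log(JSON.parse(safe).admin); // undefined
console.log(JSON.parse(safe).name === userName); // true
```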

Prototype Pollution

In JavaScript, recursively merging untrusted JSON into an object can pollute the Object prototype through a "__proto__" key, allowing attackers to modify built-in behavior. Always validate against a schema before merging, and filter out dangerous keys:

// VULNERABLE: attacker-controlled "__proto__" keys survive naive copies and merges
Object.assign({}, userInput);

// SAFER
const dangerous = new Set(['__proto__', 'constructor', 'prototype']);
const safe = Object.create(null);
for (const key of Object.keys(userInput)) {
  if (!dangerous.has(key)) {
    safe[key] = userInput[key];
  }
}

Deep Nesting DoS

Deeply nested JSON (arrays or objects nested hundreds of levels deep, e.g. [[[[...]]]]) can exhaust parser stack space or memory. Set depth limits in your parser. Most production parsers allow configuring max nesting depth.
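JavaScript's built-in JSON.parse has no depth option, so one approach is to scan the raw text and reject it before parsing. A rough sketch (the checkDepth name is illustrative; it tracks bracket depth and skips over string contents):

```javascript
// Reject JSON text whose nesting exceeds maxDepth before parsing it
function checkDepth(text, maxDepth) {
  let depth = 0;
  let inString = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (ch === '\\') i++;            // skip the escaped character
      else if (ch === '"') inString = false;
    } else if (ch === '"') inString = true;
    else if (ch === '{' || ch === '[') {
      depth++;
      if (depth > maxDepth) return false;
    } else if (ch === '}' || ch === ']') depth--;
  }
  return true;
}

console.log(checkDepth('{"a": [1, 2]}', 10));                    // true
console.log(checkDepth('['.repeat(500) + ']'.repeat(500), 100)); // false
```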

Large Payloads

Accept a max payload size (e.g., 10 MB) to prevent memory exhaustion attacks. Configure this in your web framework or reverse proxy.
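In an Express app, for example, this is one line of middleware configuration (a config sketch assuming Express; the equivalent nginx directive is client_max_body_size):

```javascript
const express = require('express');
const app = express();

// Reject JSON request bodies larger than 10 MB (responds 413 Payload Too Large)
app.use(express.json({ limit: '10mb' }));
```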

Security best practice: Validate all JSON against a schema, limit payload size, set parser recursion depth, and always use native encoder/decoder functions in your language.

Essential Tools for JSON Work

Tools from onlinedevtools.app, such as the JSON Formatter & Validator and JSON Editor mentioned above, handle common JSON tasks.

Related Learning Resources

Deepen your understanding with these guides:

  • JWT Guide - Learn how JSON Web Tokens work (JWTs are signed, base64url-encoded JSON)
  • Base64 Explained - Understand base64 encoding used in JWT headers and when to encode data

Summary

JSON is powerful precisely because it's simple. Master the six value types, know which mistakes break parsers, use schema for validation, and always prioritize security in production. Whether you're building APIs, parsing data, or storing configuration, JSON's flexibility and ubiquity make it indispensable. Use the tools above to validate, debug, and optimize your JSON workflows.