Boneyard Tools

JSONL versus JSON and CSV for tabular data

How JSON Lines differs from a JSON array and from CSV, and when converting between them is the right move for exports and logs.

What makes JSONL different from a JSON array

A regular JSON file holds one value, usually an array of objects wrapped in brackets and separated by commas. JSONL drops the wrapper: each line is its own standalone JSON object, with no surrounding array and no commas between records. That small change makes JSONL easy to append to, since you can add a line without rewriting the whole file, and easy to stream, since a reader can process one line at a time without loading everything into memory. This is why log pipelines and machine learning datasets so often ship as JSONL rather than as one giant array.
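As a rough illustration of the append and stream properties, here is a minimal Python sketch. The function names append_record and stream_records are hypothetical, chosen for this example, and the sketch assumes every record serializes to a single line.

```python
import json


def append_record(path, record):
    """Append one record as a single JSON line; earlier lines are not rewritten."""
    with open(path, "a", encoding="utf-8") as f:
        # json.dumps without indent never emits a raw newline,
        # so each record stays on exactly one line.
        f.write(json.dumps(record) + "\n")


def stream_records(path):
    """Yield records one at a time instead of loading the whole file."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # tolerate blank lines
                yield json.loads(line)
```

Doing the same append with a JSON array would mean parsing the whole array, adding the item, and writing the whole file back out.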

Why CSV needs a single flat header

CSV is a flat grid of rows and columns, so every record must map onto the same set of columns. JSONL records, by contrast, can each carry a different mix of keys. To bridge that gap, this converter scans every line, collects the full set of keys, and uses their union as one shared header. A record that is missing a column gets an empty cell there, which keeps every row the same width and prevents columns from drifting out of alignment when you open the file in a spreadsheet.
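The union-of-keys step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the converter's actual code: collect_header and to_rows are hypothetical names, columns are ordered by first appearance (the tool may order them differently), and each line is assumed to be a flat JSON object.

```python
import json


def collect_header(jsonl_text):
    """Return (header, records), where header is the union of all keys."""
    header = {}  # a dict keeps first-seen order and ignores repeats
    records = []
    for line in jsonl_text.split("\n"):
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        records.append(record)
        for key in record:
            header.setdefault(key, None)
    return list(header), records


def to_rows(header, records):
    """Build one row per record; a missing key becomes an empty cell."""
    return [[record.get(key, "") for key in header] for record in records]
```

For the two records {"a": 1, "b": 2} and {"b": 3, "c": 4}, the header is a, b, c, and each row has three cells with one of them empty.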

Handling values that break naive CSV

The characters that most often corrupt a CSV are the comma, the double quote and the newline, because each one has structural meaning in the format. Rather than stripping them, the tool escapes them: a field that contains any of these is surrounded by double quotes, and any quote already inside the field is written twice. This is the quoting rule from RFC 4180, and it lets a value such as Hello, world, written out as "Hello, world", sit safely in one cell instead of splitting into two.
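The quoting rule described above fits in a short function. This is a minimal sketch with a hypothetical name, escape_field; treating a bare carriage return as a special character is an assumption added here, since the text names only the comma, the double quote and the newline.

```python
def escape_field(value):
    """Quote a CSV field only when it holds a structural character."""
    text = str(value)
    # "\r" is included as an assumption; the article lists , " and newline.
    if any(ch in text for ch in (",", '"', "\n", "\r")):
        # Double any embedded quote, then wrap the whole field in quotes.
        return '"' + text.replace('"', '""') + '"'
    return text


print(escape_field("Hello, world"))   # "Hello, world"
print(escape_field('She said "hi"'))  # "She said ""hi"""
print(escape_field("plain"))          # plain
```

Fields with no special characters pass through unquoted, which keeps the output compact and readable.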

Round-tripping without losing rows

Switching the converter to CSV to JSONL mode reverses the process. It uses a proper CSV parser that understands quoted fields, doubled quotes, and embedded newlines, so a value that spans lines is read back as a single cell. The first row is treated as the header and every later row becomes one object. What it cannot recover is type: CSV stores everything as text, so a value that started life as the number 42 returns as the string "42". When you need the original types back, keep the JSONL as your source of truth and treat the CSV as a view for spreadsheets.
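The reverse direction can be approximated with Python's standard csv module, which handles quoted fields, doubled quotes, and embedded newlines. This is a sketch rather than the tool's own parser: csv_to_jsonl is a hypothetical name, and it assumes every row has the same number of cells as the header.

```python
import csv
import io
import json


def csv_to_jsonl(csv_text):
    """Convert CSV text to JSONL, using the first row as the header."""
    # newline="" leaves line endings alone so the csv module can handle
    # newlines inside quoted fields itself.
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    # Every value is a string here: CSV carries no type information.
    return "".join(json.dumps(row) + "\n" for row in reader)


sample = 'id,note\r\n42,"line one\nline two"\r\n'
print(csv_to_jsonl(sample), end="")
# {"id": "42", "note": "line one\nline two"}
```

The id comes back as the string "42" rather than the number 42, which is the type loss described above. The multi-line note survives as one value because the embedded newline is escaped inside the JSON string.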

Frequently asked questions

Should I store data as JSONL or a JSON array?

Use JSONL when you append often or stream records one at a time, such as logs or events. Use a JSON array when you need the whole dataset as a single parseable value.

Why did my number come back as text after a round trip?

CSV has no type system, so every cell is text. Converting CSV to JSONL therefore yields string values. Keep the original JSONL if you need numbers and booleans preserved.