How do I convert XML to JSON?

Parse the XML, preserve the root element as the top-level JSON key, map attributes to @ keys, put text in #text when an element also has attributes or child elements, and turn repeated sibling elements into arrays.

How are XML attributes represented in JSON?

By convention, attributes become keys prefixed with @, such as @id or @currency. Some converters use an @attributes object or a different prefix, so pick one convention and keep it consistent.

What happens to text inside an XML element?

If the element contains only text, it can become a plain string. If the element also has attributes or child elements, the text needs a reserved key such as #text.

Why does the same XML element sometimes become an object and sometimes an array?

Many converters emit a single value when an element appears once and an array when it appears multiple times. For stable code, configure known repeatable elements as arrays or normalize values after parsing.

How should XML namespaces be handled in JSON?

The safest default is to keep namespace prefixes in JSON keys and preserve xmlns declarations as attributes. Stripping prefixes makes cleaner keys, but it can merge elements from different namespaces.

Does XML to JSON preserve numbers, booleans, and dates?

Not automatically. XML text and attribute values are strings unless a parser or application layer coerces them. Keep strings on the first pass, then convert only fields whose meaning you know.
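
As a rough illustration of "keep strings first, then convert known fields", the sketch below applies a small table of converters to a parsed record. The field names (sku, quantity, price, shipped, orderDate) and the helper coerce_fields are hypothetical and only serve the example.

```python
from datetime import date
from decimal import Decimal

# Hypothetical field-to-converter table; only fields listed here are coerced.
CONVERTERS = {
    "quantity": int,
    "price": Decimal,
    "shipped": lambda s: s in ("true", "1"),
    "orderDate": date.fromisoformat,
}


def coerce_fields(record, converters=CONVERTERS):
    """Return a copy of record with known fields converted, others left as strings."""
    return {
        key: converters[key](value) if key in converters else value
        for key, value in record.items()
    }


# Every value starts as a string, exactly as it comes out of the XML parser.
raw = {
    "sku": "00123",
    "quantity": "2",
    "price": "9.99",
    "shipped": "true",
    "orderDate": "2026-05-26",
}
typed = coerce_fields(raw)
```

Because sku is not in the table, it stays a string and keeps its leading zeros, which blanket numeric coercion would destroy.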

What happens to CDATA and XML entities?

CDATA becomes text, and predefined ampersand entities are decoded to characters during XML parsing. If exact CDATA boundaries or entity spelling matter, a compact JSON object is not enough.

Can XML to JSON conversion be lossless?

It can be close for data-oriented XML if you preserve attributes, text, arrays, namespaces, and root names. It is often lossy for document-style XML with mixed content, comments, processing instructions, DTDs, or exact whitespace requirements.


May 26, 2026 · Updated May 29, 2026 · 8 min read

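XML to JSON: Attributes, Text Nodes, Arrays, and Namespaces

Convert XML to JSON correctly: how attributes, text nodes, repeated elements, and namespaces map, with conventions, edge cases, and JS / Python code.

Converting XML to JSON sounds mechanical, but you quickly run into the parts of XML that have no direct JSON equivalent: attributes, content that mixes text and elements, elements that appear once or many times, and namespaces. There is no single "correct" mapping, only conventions. This guide covers the standard conventions, the decisions you have to make, and how to convert XML to JSON in JavaScript, Python, and the browser.

The mapping at a glance

Most XML to JSON tools (xmltodict, fast-xml-parser, and the tool on this site) follow the same shape: every element becomes an object, attributes become keys with a special prefix, and text either becomes the value or goes under a reserved key.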

<!-- XML -->
<note id="1" priority="high">
  <to>Ada</to>
  <from>Bob</from>
  <body>Hello &amp; welcome</body>
</note>

// JSON
{
  "note": {
    "@id": "1",
    "@priority": "high",
    "to": "Ada",
    "from": "Bob",
    "body": "Hello & welcome"
  }
}
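
To make the mapping concrete, here is a minimal sketch of a converter built on Python's standard xml.etree.ElementTree. It is not how xmltodict or fast-xml-parser are implemented; the function names element_to_value and xml_to_dict are made up for this example. It assumes data-style XML: whitespace around text is stripped, empty elements become empty strings, and text that follows a child element is ignored.

```python
import json
import xml.etree.ElementTree as ET


def element_to_value(elem):
    """Convert one element to a string, or to a dict using @ and #text keys."""
    node = {"@" + name: value for name, value in elem.attrib.items()}
    for child in elem:
        value = element_to_value(child)
        if child.tag in node:
            # Second or later occurrence: promote the existing entry to a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if not node:
        # Text-only (or empty) element collapses to a plain string.
        return text
    if text:
        node["#text"] = text
    return node


def xml_to_dict(xml_string):
    """Parse XML and keep the root element name as the top-level key."""
    root = ET.fromstring(xml_string)
    return {root.tag: element_to_value(root)}


xml_string = """<note id="1" priority="high">
  <to>Ada</to>
  <from>Bob</from>
  <body>Hello &amp; welcome</body>
</note>"""
result = xml_to_dict(xml_string)
print(json.dumps(result, indent=2))
```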

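Attributes → @-prefixed keys

JSON objects have no concept of attributes, so a near-universal convention is to prefix attribute names with @. That keeps them distinct from child elements and makes the mapping reversible.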

<book id="b1" lang="en"/>
→ { "book": { "@id": "b1", "@lang": "en" } }

Some tools use a different prefix ($, _) or a nested "@attributes" object. Pick one and stick with it: downstream code needs to know where the attributes are.

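Text nodes and mixed content

When an element contains only text, it collapses to a string value. But when an element has both attributes and text, the text needs somewhere to go, and the conventional place is the key #text.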

<price currency="USD">9.99</price>
→ { "price": { "@currency": "USD", "#text": "9.99" } }

<title>Effective TypeScript</title>
→ { "title": "Effective TypeScript" }

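True mixed content (text interleaved with child elements, as in HTML-like markup) is the hardest case: most data-oriented converters concatenate or drop the stray text. If your XML is document-style rather than data-style, expect loss here.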
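
To see why mixed content resists a keyed object, the sketch below uses Python's standard xml.etree.ElementTree to walk a small paragraph. The sample markup is invented for illustration. An ordered list of segments keeps the interleaving; a flattened string keeps the words but loses the markup.

```python
import xml.etree.ElementTree as ET

para = ET.fromstring("<p>Hello <b>bold</b> and <i>italic</i> text.</p>")

# Text before the first child lives on .text; text after each child
# lives on that child's .tail.
segments = [para.text]
for child in para:
    segments.append((child.tag, child.text))
    segments.append(child.tail)

# Concatenating every text fragment in document order drops the element boundaries.
flattened = "".join(para.itertext())
```

A JSON object keyed by element name has no slot for " and " or " text.", which is why data-oriented converters either concatenate such fragments or discard them.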

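The single vs. array problem

This is the trap that breaks more code than any other. An element that appears once becomes a single value (a string or an object); the same element appearing twice becomes an array. So the shape of the JSON depends on the data, not on the schema: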

<tags><tag>a</tag></tags>
→ { "tags": { "tag": "a" } }          // single value

<tags><tag>a</tag><tag>b</tag></tags>
→ { "tags": { "tag": ["a", "b"] } }   // array

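Downstream code that expects tags.tag to always be an array breaks on the single-item case. Two fixes: configure the parser to always treat known repeatable elements as arrays, or normalize after parsing (const arr = [].concat(node.tag ?? [])).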
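
The normalize-after-parsing approach can be sketched in Python as well. The helper name as_list is an assumption for this example, not part of any library.

```python
def as_list(value):
    """Return value unchanged if it is a list, [] if it is None, else [value]."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


# The three shapes a converter may produce for the same schema.
single = {"tags": {"tag": "a"}}
multiple = {"tags": {"tag": ["a", "b"]}}
missing = {"tags": {}}

tags_single = as_list(single["tags"].get("tag"))
tags_multiple = as_list(multiple["tags"].get("tag"))
tags_missing = as_list(missing["tags"].get("tag"))
```

After this step, code can iterate over the result without checking which shape the parser happened to emit.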

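Namespaces

XML namespaces use prefixes (soap:Envelope) that are bound through xmlns declarations. JSON has no namespace concept, so converters usually take one of the following approaches:

Keep the prefix in the key: "soap:Envelope". Simple and reversible, but the key contains a colon, so it needs bracket access (obj["soap:Envelope"]).
Drop the prefix: "Envelope". Cleaner keys, but you lose the namespace, and two namespaces that use the same local name can collide on one key.
Keep xmlns as an attribute: declarations become keys such as "@xmlns:soap", so the binding survives a round trip.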

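For most data tasks, keeping prefixes in keys is the safest default: elements from different namespaces never collide, and together with preserved xmlns declarations no namespace information is lost.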
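
The collision risk of dropping prefixes is easy to reproduce. The sketch below uses Python's standard xml.etree.ElementTree with made-up namespace URIs (urn:example:billing and urn:example:shipping). Note one difference from the text: ElementTree resolves prefixes to URIs, so the qualified keys appear in its "{uri}name" form rather than as "prefix:name".

```python
import xml.etree.ElementTree as ET

xml_string = (
    '<order xmlns:a="urn:example:billing" xmlns:b="urn:example:shipping">'
    "<a:address>1 Main St</a:address>"
    "<b:address>2 Dock Rd</b:address>"
    "</order>"
)
root = ET.fromstring(xml_string)

# ElementTree reports each tag as "{namespace-uri}local-name".
qualified = {child.tag: child.text for child in root}

# Dropping the namespace part leaves one key, so the second value
# silently overwrites the first.
stripped = {child.tag.split("}", 1)[-1]: child.text for child in root}
```

With qualified keys both addresses survive; with stripped keys only the shipping address remains.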

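Entities and CDATA

A correct converter decodes the five predefined entities (&lt;, &gt;, &amp;, &quot;, &apos;) and numeric character references (such as &#169; for ©) into the corresponding characters, and treats <![CDATA[...]]> blocks as literal text.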
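
A quick way to check this behavior is to parse a small sample with Python's standard xml.etree.ElementTree. The sample document is invented for illustration.

```python
import json
import xml.etree.ElementTree as ET

xml_string = (
    "<msg>"
    "<a>Tom &amp; Jerry &lt;3</a>"
    "<b>&#169; 2026</b>"
    "<c><![CDATA[if (x < 1 && y > 2) {}]]></c>"
    "</msg>"
)
root = ET.fromstring(xml_string)

# By the time the text is readable, entities are decoded and the CDATA
# wrapper is gone; only the characters remain.
decoded = {child.tag: child.text for child in root}
print(json.dumps(decoded, ensure_ascii=False, indent=2))
```

Nothing in the resulting values records whether a character came from an entity, a numeric reference, or a CDATA block.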

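Named conventions: BadgerFish, GData, Parker

"@ for attributes, #text for text" is not the only scheme in circulation. When reading XML→JSON output from other systems, you will run into these three named conventions: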

BadgerFish: attributes go under @-prefixed keys; text goes under $; namespace declarations go under @xmlns. Verbose but lossless.
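GData: Google's variant. Attributes become plain properties with no prefix; text goes under $t; the colon in a namespace prefix is replaced by $ (ns:element becomes ns$element); elements that may repeat always become arrays. Predictable in shape.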
Parker: drops attributes entirely; the simplest and most lossy mapping. Useful when you control both ends and only care about element values.

When integrating with a system that already converts XML to JSON, identify which convention it uses before you write parsing code.
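
To make the difference tangible, the sketch below renders one leaf element three ways: the @ / #text mapping used in this article, BadgerFish, and Parker. The function names are invented for this example, and each function handles only a leaf element with attributes and text (no children, no namespaces). GData is left out to keep the sketch short.

```python
import xml.etree.ElementTree as ET


def at_text_style(elem):
    """The @ / #text mapping: text-only elements collapse to a string."""
    attrs = {"@" + name: value for name, value in elem.attrib.items()}
    if not attrs:
        return elem.text
    return {**attrs, "#text": elem.text}


def badgerfish_style(elem):
    """BadgerFish: @ for attributes, and text always sits under "$"."""
    attrs = {"@" + name: value for name, value in elem.attrib.items()}
    return {**attrs, "$": elem.text}


def parker_style(elem):
    """Parker: attributes are ignored, only the text survives."""
    return elem.text


price = ET.fromstring('<price currency="USD">9.99</price>')
outputs = {
    "at_text": at_text_style(price),
    "badgerfish": badgerfish_style(price),
    "parker": parker_style(price),
}
```

The currency survives in the first two outputs and disappears in the Parker one, which is the loss the list above describes.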

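Querying the result with JSONPath

Once the XML is converted, you can address values with JSONPath. Two small adjustments compared with XPath habits:

Attribute keys carry the @ prefix from the mapping, so XPath's @id is $..['@id'] in JSONPath.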
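The single vs. array problem described above means that an XPath such as book/title, which works on either document, has no single child-step equivalent: $..book[*].title matches only when book is an array, and $..book.title only when it is a single object. $..book..title matches both shapes, at the cost of also matching titles nested deeper, so normalizing to arrays first is often simpler.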
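
If no JSONPath library is at hand, the recursive-descent part of a query such as $..['@id'] can be approximated in a few lines of plain Python. This is a sketch of that one operator, not a JSONPath implementation, and the name find_key is made up for the example.

```python
def find_key(node, key):
    """Yield every value stored under key at any depth, in document order."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            yield from find_key(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


# The same schema in its single-object and array shapes.
one_book = {"library": {"book": {"@id": "b1", "title": "A"}}}
two_books = {
    "library": {
        "book": [
            {"@id": "b1", "title": "A"},
            {"@id": "b2", "title": "B"},
        ]
    }
}

ids_one = list(find_key(one_book, "@id"))
ids_two = list(find_key(two_books, "@id"))
titles_two = list(find_key(two_books, "title"))
```

Because the walk descends through both objects and arrays, the caller gets a list either way and does not need to know which shape the converter produced.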

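Converting XML to JSON in code

// JavaScript: in the browser, use DOMParser plus a small tree walker, or a library:
import { XMLParser } from 'fast-xml-parser';
const xmlString = '<note id="1"><to>Ada</to></note>'; // sample input
// Keep attributes and emit them as "@name" keys.
const parser = new XMLParser({ ignoreAttributes: false, attributeNamePrefix: '@' });
const obj = parser.parse(xmlString);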

# Python: xmltodict maps attributes to "@name" and text to "#text"
import json
import xmltodict

xml_string = '<note id="1"><to>Ada</to></note>'  # sample input
doc = xmltodict.parse(xml_string)
print(json.dumps(doc, indent=2))

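Converting XML to JSON online

For a quick conversion, paste your XML into the JSON ⇄ XML converter and click To JSON. It applies the conventions above (@ for attributes, #text for mixed content, arrays for repeated elements) and runs entirely in your browser, so internal data feeds and API payloads never leave your machine.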

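Frequently asked questions

How are XML attributes represented in JSON?

By convention, they become keys prefixed with @ (for example @id), which keeps them distinct from child elements and makes the mapping reversible.

Why does the same element sometimes become an object and sometimes an array?

Because the shape follows the data: one occurrence maps to a single value, several occurrences map to an array. Configure the parser to always treat known repeatable elements as arrays, or normalize after parsing with [].concat(value).

Where does the text go when an element also has attributes?

Under the reserved key #text, because the object is already holding the attributes. A text-only element collapses to a plain string.

How are XML namespaces handled?

JSON has no namespaces. The safest approach is to keep prefixes in keys ("soap:Envelope") and also preserve xmlns declarations as @xmlns:* attributes, so nothing is lost.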

Sources

fast-xml-parser: XML to object parser for JavaScript
xmltodict: XML to dict for Python
Converting Between XML and JSON: the classic reference for the BadgerFish, GData, and Parker conventions

Last reviewed: June 2026.