# Preface

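A business requirement called for merging Word documents, and the solutions I found online were incomplete. After two days of digging, the fix turned out to be a single line of code, so I am writing it down as a bridge between 2023 and 2024. (The code was typed by hand; if you find mistakes, fix them yourself. Please go easy on me.)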

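Existing solutions
Searching online for how to merge documents for output in Java Spring, the solutions boil down to the following:

Use a commercial paid package: Merge Docx java
Use altChunk; the code below is taken from https://soaserele.blogspot.com/2011/07/merge-docx-files-in-java-using-docx4j.html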

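	import java.io.File;
	import java.io.FileInputStream;
	import java.io.FileOutputStream;
	import java.io.IOException;
	import java.io.InputStream;
	import java.io.OutputStream;
	import java.util.Iterator;
	import java.util.List;

	import org.apache.commons.io.IOUtils;
	import org.docx4j.jaxb.Context;
	import org.docx4j.openpackaging.contenttype.ContentType;
	import org.docx4j.openpackaging.exceptions.Docx4JException;
	import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
	import org.docx4j.openpackaging.parts.PartName;
	import org.docx4j.openpackaging.parts.WordprocessingML.AlternativeFormatInputPart;
	import org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart;
	import org.docx4j.relationships.Relationship;
	import org.docx4j.wml.CTAltChunk;

	public class DocxService {
	    private static final String CONTENT_TYPE = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

	    public InputStream mergeDocx(final List<InputStream> streams) throws Docx4JException, IOException {

	        WordprocessingMLPackage target = null;
	        final File generated = File.createTempFile("generated", ".docx");

	        int chunkId = 0;
	        Iterator<InputStream> it = streams.iterator();
	        while (it.hasNext()) {
	            InputStream is = it.next();
	            if (is != null) {
	                if (target == null) {
	                    // Copy first (master) document
	                    try (OutputStream os = new FileOutputStream(generated)) {
	                        os.write(IOUtils.toByteArray(is));
	                    }

	                    target = WordprocessingMLPackage.load(generated);
	                } else {
	                    // Attach the others (Alternative input parts)
	                    insertDocx(target.getMainDocumentPart(), IOUtils.toByteArray(is), chunkId++);
	                }
	            }
	        }

	        if (target != null) {
	            target.save(generated);
	            return new FileInputStream(generated);
	        } else {
	            return null;
	        }
	    }

	    private static void insertDocx(MainDocumentPart main, byte[] bytes, int chunkId) {
	        try {
	            // Store the whole document as a binary part and reference it via w:altChunk
	            AlternativeFormatInputPart afiPart = new AlternativeFormatInputPart(new PartName("/part" + chunkId + ".docx"));
	            afiPart.setContentType(new ContentType(CONTENT_TYPE));
	            afiPart.setBinaryData(bytes);
	            Relationship altChunkRel = main.addTargetPart(afiPart);

	            CTAltChunk chunk = Context.getWmlObjectFactory().createCTAltChunk();
	            chunk.setId(altChunkRel.getId());

	            main.addObject(chunk);
	        } catch (Exception e) {
	            e.printStackTrace();
	        }
	    }
	}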

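Or write the merging code yourself, which requires knowing how to use docx4j and how the XML parts inside an unzipped docx reference one another. I will not go into that here.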
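For readers who have not looked inside a docx before, here is a minimal JDK-only sketch (no docx4j) of the "docx is a zip of XML parts" idea mentioned above. The class `DocxPartsSketch` and its helpers are hypothetical names, and `fakeDocx` builds a stand-in archive with typical part names rather than a valid Word document; to inspect a real file you would pass a `FileInputStream` to `listParts` instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class DocxPartsSketch {

    /** Returns the entry names of a zip stream; for a .docx these are its part names. */
    static List<String> listParts(InputStream docx) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                names.add(e.getName());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }

    /** Builds a stand-in archive with empty entries; it is NOT a valid Word document. */
    static byte[] fakeDocx(String... partNames) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (String name : partNames) {
                zip.putNextEntry(new ZipEntry(name));
                zip.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] archive = fakeDocx("[Content_Types].xml", "word/document.xml",
                "word/_rels/document.xml.rels", "word/media/image1.png");
        for (String part : listParts(new ByteArrayInputStream(archive))) {
            System.out.println(part);
        }
    }
}
```

The part to keep in mind for the rest of this post is `word/_rels/document.xml.rels`: it maps the `rId…` values used inside `word/document.xml` to files such as those under `word/media/`.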

# Merging documents at the WordprocessingMLPackage level with docx4j

Given the specific requirement, I will skip the basics and go straight to the merging part:

	public void mergeFile(WordprocessingMLPackage wordMLP, WordprocessingMLPackage wordMLToP) {
	    try {
	        // Use XPath to fetch the w:body node(s) of the document being merged in
	        List<Object> bodies = wordMLToP.getMainDocumentPart().getJAXBNodesViaXPath("//w:body", false);
	        // Append the content of each body in turn; styles follow the main document by default
	        for (Object bodyObject : bodies) {
	            Body body = (Body) bodyObject;
	            for (Object content : body.getContent()) {
	                wordMLP.getMainDocumentPart().addObject(content);
	            }
	        }
	    } catch (Exception e) {
	        // TechnicalException is not a docx4j class; presumably an application-specific exception
	        throw new TechnicalException(e.getMessage());
	    }
	}

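However, this code only deals with the body. It does not handle duplicate rIds in the docx relationships or resources that fail to be carried over, and the merged-in document does not start on a new page either.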
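To make the rId problem concrete, here is a minimal JDK-only sketch (no docx4j involved). It reduces each document's relationships to a map from relationship id to target, which is an illustrative simplification; `RelIdRemapSketch`, `remap`, and the sample ids and targets are all hypothetical. Both documents typically start numbering at `rId1`, so content copied verbatim would point at the target document's resources unless each source relationship gets a fresh id.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative model only: a rels part reduced to "relationship id -> target". */
final class RelIdRemapSketch {

    /**
     * Registers every source relationship in targetRels under a fresh id
     * and returns the mapping "old source id -> new id in the target".
     */
    static Map<String, String> remap(Map<String, String> targetRels, Map<String, String> sourceRels) {
        Map<String, String> oldToNew = new LinkedHashMap<>();
        int next = 1;
        for (Map.Entry<String, String> rel : sourceRels.entrySet()) {
            // Skip ids that the target document already uses.
            while (targetRels.containsKey("rId" + next)) {
                next++;
            }
            String newId = "rId" + next;
            targetRels.put(newId, rel.getValue());
            oldToNew.put(rel.getKey(), newId);
        }
        return oldToNew;
    }

    public static void main(String[] args) {
        Map<String, String> target = new LinkedHashMap<>();
        target.put("rId1", "media/image1.png");
        target.put("rId2", "styles.xml");

        Map<String, String> source = new LinkedHashMap<>();
        source.put("rId1", "media/logo.svg"); // same id as in target, different resource

        Map<String, String> oldToNew = remap(target, source);
        System.out.println("old -> new: " + oldToNew);
        System.out.println("target rels: " + target);
    }
}
```

Every reference in the copied content then has to be rewritten through the returned mapping. A real merge also has to cope with clashing part names, which this sketch ignores.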

# Adding a page break

	// org.docx4j.wml.ObjectFactory creates the WordprocessingML elements (P, R, Br)
	private static final ObjectFactory objectFactory = new ObjectFactory();

	void addPageBreak(MainDocumentPart dp) {
	    P paragraph = objectFactory.createP();
	    R run = objectFactory.createR();
	    paragraph.getContent().add(run);
	    Br br = objectFactory.createBr();
	    br.setType(org.docx4j.wml.STBrType.PAGE);
	    run.getContent().add(br);
	    // Append the page-break paragraph to the end of the document body
	    dp.addObject(paragraph);
	}

# Rebuilding the image references

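Our documents require an SVG as decoration in front of some headings. The solution currently offered online is the following.

For reference, see
https://stackoverflow.com/questions/23796468/merge-worddocx-documents-with-docx4j-how-to-copy-images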

	// s is the source package being merged in, f is the target package
	List<Object> blips = s.getMainDocumentPart().getJAXBNodesViaXPath("//a:blip", false);
	for (Object el : blips) {
	    try {

	        CTBlip blip = (CTBlip) el;
	        RelationshipsPart parts = s.getMainDocumentPart().getRelationshipsPart();
	        Relationship rel = parts.getRelationshipByID(blip.getEmbed());
	        Part part = parts.getPart(rel);

	        if (part instanceof ImagePngPart)
	            System.out.println(((ImagePngPart) part).getBytes());
	        if (part instanceof ImageJpegPart)
	            System.out.println(((ImageJpegPart) part).getBytes());
	        if (part instanceof ImageBmpPart)
	            System.out.println(((ImageBmpPart) part).getBytes());
	        if (part instanceof ImageGifPart)
	            System.out.println(((ImageGifPart) part).getBytes());
	        if (part instanceof ImageEpsPart)
	            System.out.println(((ImageEpsPart) part).getBytes());
	        if (part instanceof ImageTiffPart)
	            System.out.println(((ImageTiffPart) part).getBytes());

	        // Add the image part to the target and point the blip at the new relationship id
	        Relationship newrel = f.getMainDocumentPart().addTargetPart(part, AddPartBehaviour.RENAME_IF_NAME_EXISTS);

	        blip.setEmbed(newrel.getId());
	        f.getMainDocumentPart().addTargetPart(s.getParts().getParts().get(new PartName("/word/" + rel.getTarget())));

	    } catch (Exception ex) {
	        ex.printStackTrace();
	    }
	}

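The `if` checks in the middle of this code can be deleted. The real problem is that the code only searches the document for a:blip, so it does nothing for SVG references. After running it, all the media resources have been added and the a:blip references rewritten, but the r:embed on asvg:svgBlip in document.xml still carries its old value and still takes effect. As a result, the media reference in the merged content is overridden by the relationship ids of the document it was merged into; that is, the same id in document.xml.rels under the _rels directory ends up being pointed at twice. To solve this, the a:extLst child node under a:blip has to be reset. In other words, add one more line to the code from the original answer: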

	...
	blip.setEmbed(newrel.getId());
	// Drop the a:extLst child, which holds the stale asvg:svgBlip reference
	blip.setExtLst(null);
	...

With that, you get a cleanly merged WordprocessingMLPackage.
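The following JDK-only sketch (plain DOM, no docx4j) illustrates why a search for a:blip alone misses the SVG reference. `SvgBlipSketch` and its methods are hypothetical names, and `SAMPLE` is a hand-written fragment rather than markup taken from a real document; the `uri` value on a:ext is the one commonly seen for the SVG extension and should be treated as an assumption. The fragment carries two separate r:embed attributes, one on a:blip and one on the nested asvg:svgBlip, and `dropExtLst` is a DOM counterpart of the one-line fix.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class SvgBlipSketch {
    static final String A = "http://schemas.openxmlformats.org/drawingml/2006/main";
    static final String R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
    static final String ASVG = "http://schemas.microsoft.com/office/drawing/2016/SVG/main";

    // Hand-written sample fragment (assumption), not copied from a real document.
    static final String SAMPLE =
          "<a:blip xmlns:a='" + A + "' xmlns:r='" + R + "' r:embed='rId4'>"
        + "<a:extLst><a:ext uri='{96DAC541-7B7A-43D3-8B79-37D633B846F1}'>"
        + "<asvg:svgBlip xmlns:asvg='" + ASVG + "' r:embed='rId5'/>"
        + "</a:ext></a:extLst></a:blip>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception ex) {
            throw new IllegalStateException("could not parse sample XML", ex);
        }
    }

    /** Collects the r:embed value of every element with the given namespace and local name. */
    static List<String> embeds(Document doc, String ns, String localName) {
        List<String> ids = new ArrayList<>();
        NodeList nodes = doc.getElementsByTagNameNS(ns, localName);
        for (int i = 0; i < nodes.getLength(); i++) {
            ids.add(((Element) nodes.item(i)).getAttributeNS(R, "embed"));
        }
        return ids;
    }

    /** Removes the a:extLst children that sit directly under each a:blip. */
    static void dropExtLst(Document doc) {
        NodeList blips = doc.getElementsByTagNameNS(A, "blip");
        for (int i = 0; i < blips.getLength(); i++) {
            Node child = blips.item(i).getFirstChild();
            while (child != null) {
                Node next = child.getNextSibling();
                if (A.equals(child.getNamespaceURI()) && "extLst".equals(child.getLocalName())) {
                    blips.item(i).removeChild(child);
                }
                child = next;
            }
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println("a:blip embeds:       " + embeds(doc, A, "blip"));
        System.out.println("asvg:svgBlip embeds: " + embeds(doc, ASVG, "svgBlip"));
        dropExtLst(doc);
        System.out.println("svgBlip after drop:  " + embeds(doc, ASVG, "svgBlip"));
    }
}
```

Since dropping the extension list also drops the SVG reference itself, it is worth checking in your own documents that the image embedded by a:blip is an acceptable rendering.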
