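In "Python | Automatically Generating the Company Weekly Report" (《Python|实现公司周报的自动生成》) I automated the generation of our company weekly report as a Word document. Once it was running for real, footnotes turned out to be a problem. Python's docx library does not actually support footnotes, so here I use the win32com library to extract and insert them.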
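Extracting footnotes
Extracting footnotes involves two parts: the footnote content and the position of each footnote mark.
Extracting the content is fairly simple. The function below returns the text of every footnote: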
```python
import win32com.client as win32


def get_footnotes(filename):
    """Return the text of every footnote in the Word document at filename."""
    footText = []
    word_app = win32.Dispatch('Word.Application')
    doc = word_app.Documents.Open(filename)
    footnotes = doc.Footnotes
    for footnote in footnotes:
        # footnote.Range covers the footnote body, not the mark in the main text
        footText.append(footnote.Range.Text)
    doc.Close()
    word_app.Quit()
    return footText
```
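Extracting the position of the footnote marks is trickier. Take the sentence "唧唧复唧唧¹,木兰²当户织。": the text extracted for it is 唧唧复唧唧\x02,木兰\x02当户织。, with each mark replaced by \x02. I first thought of converting the Word file to XML and locating the nodes there, but Word's XML is nested so deeply that it looked like too much trouble. What I ended up doing is to fetch the text of the sentence that contains each footnote mark, split it into a list, and then walk through the lists to work out the text that each mark follows. The source code is as follows: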
```python
import re

import win32com.client as win32


def get_footindex(filename):
    """Return, once per footnote, the sentence that contains its reference mark."""
    footIndex = []
    word = win32.Dispatch('Word.Application')
    doc = word.Documents.Open(filename)
    for foot_note in doc.Footnotes:
        r = doc.Range(foot_note.Reference.Start, foot_note.Reference.End)
        for s in r.Sentences:
            # Each footnote mark shows up as \x02 in the sentence text
            footIndex.append(s.Text)
    doc.Close()
    word.Quit()
    return footIndex


def get_footindextext(fontIndex):
    """Return the text segment that precedes each footnote mark."""
    fontindexa = []
    fontindexb = []
    num = 0
    for i in fontIndex:
        fontindexa.append(re.split(r'\x02', i))
    if len(fontindexa) > 0:
        fontindexb.append(fontindexa[0][num])
        for j in range(1, len(fontindexa)):
            if fontindexa[j] == fontindexa[j-1]:
                # Same sentence as the previous footnote: move to its next segment
                num += 1
                fontindexb.append(fontindexa[j][num])
            else:
                # New sentence: start again from its first segment
                num = 0
                fontindexb.append(fontindexa[j][num])
    return fontindexb
```
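The splitting step can be tried without Word installed. The snippet below is only a minimal sketch of the idea, not the code used above: `anchors_in_sentence` is a hypothetical helper name, and it works on a single sentence rather than on the repeated list that `get_footindex` returns. It assumes that every footnote mark arrives as a single `\x02` character, as in the example sentence.

```python
from typing import List

FOOTNOTE_MARK = "\x02"  # placeholder character that stands in for a footnote mark


def anchors_in_sentence(sentence: str) -> List[str]:
    """Return the text segment that precedes each footnote mark in one sentence."""
    parts = sentence.split(FOOTNOTE_MARK)
    # The last part comes after the final mark, so it is not an anchor
    return parts[:-1]


sentence = "唧唧复唧唧\x02,木兰\x02当户织。"
print(anchors_in_sentence(sentence))  # → ['唧唧复唧唧', ',木兰']
```

Each returned segment is the text to search for when the footnote is put back, which is why the trailing part after the last mark is dropped.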
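Inserting footnotes
Inserting footnotes was the most troublesome part. I asked the ChatGPT 4.0 model how to insert Word footnotes from Python, but none of the answers worked. Then VBA came to mind: for working with Microsoft Office, VBA is still, to a certain extent, the richest option. Once I had confirmed that VBA can insert a footnote, all that was left was to run the VBA script from Python. The VBA code for inserting a footnote at a given position is as follows: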
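```vba
Sub InsertFootnote()
    Dim rng As Range
    Set rng = ActiveDocument.Content
    With rng.Find
        ' "脚注位置" (footnote position) is a placeholder for the text to search for
        .Text = "脚注位置"
        .Execute
        If rng.Find.Found Then
            ' "脚注内容" (footnote content) is a placeholder for the footnote text
            ActiveDocument.Footnotes.Add Range:=rng, Text:="脚注内容"
        End If
    End With
End Sub
```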
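Since the macro is filled in from Python, the search text and the footnote text end up inside VBA string literals, and a double quote in either one would end the literal early. The following is a minimal sketch of one way to build the macro source with quotes escaped. Both `vba_string_literal` and `build_insert_footnote_macro` are hypothetical helper names, and the sketch assumes both texts are single-line (it does not handle line breaks inside a footnote).

```python
def vba_string_literal(text: str) -> str:
    """Quote text as a VBA string literal, doubling any embedded double quotes."""
    return '"' + text.replace('"', '""') + '"'


def build_insert_footnote_macro(search_text: str, footnote_text: str) -> str:
    """Return the source of the InsertFootnote macro with both values filled in."""
    lines = [
        "Sub InsertFootnote()",
        "    Dim rng As Range",
        "    Set rng = ActiveDocument.Content",
        "    With rng.Find",
        "        .Text = " + vba_string_literal(search_text),
        "        .Execute",
        "        If rng.Find.Found Then",
        "            ActiveDocument.Footnotes.Add Range:=rng, Text:="
        + vba_string_literal(footnote_text),
        "        End If",
        "    End With",
        "End Sub",
    ]
    return "\n".join(lines)


# Example values are made up for illustration
print(build_insert_footnote_macro("木兰", 'See the "Ballad of Mulan".'))
```

Doubling the quote character is how VBA writes a literal quote inside a string, so the generated macro still compiles when the footnote text contains quotation marks.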
Full source code
```python
import win32com.client as win32
import os
import datetime
import re


def get_footindex(filename):
    """Return, once per footnote, the sentence that contains its reference mark."""
    footIndex = []
    word = win32.Dispatch('Word.Application')
    doc = word.Documents.Open(filename)
    for foot_note in doc.Footnotes:
        r = doc.Range(foot_note.Reference.Start, foot_note.Reference.End)
        for s in r.Sentences:
            # Each footnote mark shows up as \x02 in the sentence text
            footIndex.append(s.Text)
    doc.Close()
    word.Quit()
    return footIndex


def get_footindextext(fontIndex):
    """Return the text segment that precedes each footnote mark."""
    fontindexa = []
    fontindexb = []
    num = 0
    for i in fontIndex:
        fontindexa.append(re.split(r'\x02', i))
    if len(fontindexa) > 0:
        fontindexb.append(fontindexa[0][num])
        for j in range(1, len(fontindexa)):
            if fontindexa[j] == fontindexa[j-1]:
                # Same sentence as the previous footnote: move to its next segment
                num += 1
                fontindexb.append(fontindexa[j][num])
            else:
                # New sentence: start again from its first segment
                num = 0
                fontindexb.append(fontindexa[j][num])
    return fontindexb


def get_footnotes(filename):
    """Return the text of every footnote in the Word document at filename."""
    footText = []
    word_app = win32.Dispatch('Word.Application')
    doc = word_app.Documents.Open(filename)
    footnotes = doc.Footnotes
    for footnote in footnotes:
        footText.append(footnote.Range.Text)
    doc.Close()
    word_app.Quit()
    return footText


def insert_footnotes(filepath, searchStr, footnoteStr):
    """Insert footnoteStr as a footnote at the first occurrence of searchStr."""
    word_app = win32.Dispatch("Word.Application")
    doc = word_app.Documents.Open(filepath)
    vba_code_part1 = '''Sub InsertFootnote()
    Dim rng As Range
    Set rng = ActiveDocument.Content
    With rng.Find
        .Text = "{}"
        .Execute
        If rng.Find.Found Then
    '''.format(searchStr)
    vba_code_part2 = '''
            ActiveDocument.Footnotes.Add Range:=rng, Text:="{}"
        End If
    End With
    End Sub
    '''.format(footnoteStr)
    # Requires "Trust access to the VBA project object model" in Word's Trust Center
    vba_module = doc.VBProject.VBComponents.Add(1)  # 1 = vbext_ct_StdModule
    vba_code = vba_code_part1 + vba_code_part2
    vba_module.CodeModule.AddFromString(vba_code)
    word_app.Run("insertfootnote")  # VBA macro names are case-insensitive
    # Remove the temporary module so the .docx stays macro-free, then save
    doc.VBProject.VBComponents.Remove(vba_module)
    doc.Close(-1)  # -1 = wdSaveChanges
    word_app.Quit()


if __name__ == '__main__':
    footnotes = []
    fontindextext = []
    today = datetime.datetime.today().date()
    path = 'D:/xxx/xxx/{}'.format(today)
    report_name = '公司周报{}.docx'.format(today)
    for i in os.listdir(path):
        # Collect footnotes from every source document except the generated report
        if i != report_name:
            fontindextext += get_footindextext(get_footindex(path + '/' + i))
            footnotes += get_footnotes(path + '/' + i)
    if len(fontindextext) > 0:
        filepath = path + '/' + report_name
        for i in range(0, len(fontindextext)):
            print(filepath)
            insert_footnotes(filepath, fontindextext[i], footnotes[i])
```