Thank you for bringing the issue you are facing to the community’s attention.
I don’t aim to give you the definitive answer, but to assist in finding the right path forward.
After checking the issue, it seems to me that the main problem is that Extract XML is meant to parse XML files, not to convert DOC/DOCX into XML.
With that in mind, I believe you would first need to convert the DOC file into DOCX (assuming you are using DOC).
Once you have the DOCX, you essentially have a ZIP archive of XML files, with the main body usually in word/document.xml. So you would need to unzip it and then parse that file as XML.
In a nutshell, you will need the following steps:
Execute Command Node: Convert .doc to .docx (if necessary).
Read Binary File Node: Read the .docx or .doc file.
Execute Command Node: Unzip the .docx file to access XML content.
Read Binary File Node: Read the specific XML file within the unzipped content.
Extract from XML Node: Parse the XML to work with it in n8n.
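To make the unzip and parse steps more concrete, here is a minimal Python sketch of what happens to the file outside n8n. It is only an illustration under a few assumptions: the function names are hypothetical, the sample archive is built in memory just for the demo, and a real DOCX contains many more parts than word/document.xml.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml in DOCX files.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_document_xml(docx_bytes: bytes) -> bytes:
    """Open the DOCX as a ZIP archive and return word/document.xml as bytes."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        return archive.read("word/document.xml")


def extract_paragraphs(document_xml: bytes) -> list[str]:
    """Parse the XML and return the text of each paragraph (w:p element)."""
    root = ET.fromstring(document_xml)
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is spread over one or more w:t elements.
        text = "".join(t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs


def make_sample_docx() -> bytes:
    """Build a tiny DOCX-like archive in memory (demo only, not a full DOCX)."""
    xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        "<w:p><w:r><w:t>Hello</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>from </w:t></w:r><w:r><w:t>n8n</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml.encode("utf-8"))
    return buffer.getvalue()


if __name__ == "__main__":
    sample = make_sample_docx()
    print(extract_paragraphs(read_document_xml(sample)))
```

In n8n, the unzip part corresponds to the Execute Command step and the parsing part to the XML step; the sketch only shows which file inside the archive you are after.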
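This would require access to files on your disk. Also, to convert DOC to DOCX you would need an external command-line tool. Pandoc (see the Pandoc User’s Guide) is a common choice for document conversion, but as far as I know it reads DOCX and not the legacy DOC format, so LibreOffice in headless mode may be the safer option for that step.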
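For the conversion step, here is a rough sketch of how the command could be built and run. It assumes LibreOffice is installed and its soffice binary is on the PATH, which is my assumption and not something from your setup; the function names and paths are hypothetical.

```python
import subprocess
from pathlib import Path


def build_convert_command(doc_path: str, out_dir: str) -> list[str]:
    """Build the LibreOffice command that converts a file to DOCX."""
    return [
        "soffice",            # assumes LibreOffice's binary is on the PATH
        "--headless",         # run without opening a window
        "--convert-to", "docx",
        "--outdir", out_dir,  # folder where the converted file is written
        doc_path,
    ]


def convert_doc_to_docx(doc_path: str, out_dir: str) -> Path:
    """Run the conversion and return the expected path of the DOCX file."""
    subprocess.run(build_convert_command(doc_path, out_dir), check=True)
    # LibreOffice keeps the base name and swaps the extension.
    return Path(out_dir) / (Path(doc_path).stem + ".docx")
```

In n8n, the same command line would go into the Execute Command node, provided the tool is installed on the machine or container where n8n runs.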
Also, I am not sure whether the n8n Compression node would do the trick for unzipping the DOCX file. If it does not, you would need to use an OS command such as unzip.
I hope this gives some clarity and direction for you. If not, please share further insights so we can check and advise accordingly.