TransTools is a suite of tools, free for personal use, that helps clean up the text in Word, Excel, Visio and AutoCAD files. It's aimed at translators but could prove useful for many others.
The program installed easily, with no hassles. It seems to work just about anywhere - there are separate Office add-ins for 2003 and earlier and for 2007 and later, in 32-bit or 64-bit form - and it correctly added itself to our Word and Excel 2013 ribbons.
The Word add-in features a Document Cleaner which tidies up documents produced by OCR or PDF conversion software. This can restore more consistent formatting (fonts, font sizes, spaces, styles), remove unwanted frames, correct problems with tables, reset indentation, remove text highlighting, shading and more. Just choose the actions you need, and click to carry them out.
There are also many smaller corrective options. You can find excessive spaces and replace them in various ways (with spaces, tabs, a new line or paragraph). A "Remove Highlight" tool can strip out highlights in a particular colour. And the Non-breaking Space Checker does what its name suggests, scanning your document to ensure any non-breaking spaces are used correctly.
The Excel add-in has several more spreadsheet-specific tools. The Cell Resize Wizard resizes cells as required to make text fully visible, for instance, while the Translation > Extract... tool extracts all unique text and displays it in a separate workbook. But there are also simpler formatting tools, including options to remove highlighting or excessive spacing.
We didn't try them, but it seems that TransTools for Visio and AutoCAD work like stripped-down versions of the Excel add-in, extracting text from the document, then allowing you to merge it back - with any translations or modifications - later.
Version 3.8 changes (full history):
•TransTools for Word:
◦Added a new option to Unbreaker called “Let me join more paragraphs if they look like separate paragraphs”. When this option is used, Unbreaker is less strict when it decides whether a particular paragraph or line break should be included in the list of uncertain breaks to remove. Normally, you should use this option in documents converted from PDFs in order to discover additional breaks which need to be removed.
◦Added a new feature in Quotation Magic: you can now replace curly (directional) quotation marks with straight quotation marks (" and ') or use straight apostrophe ('). This can be useful if you prepare a document for the CAT tool in order to get better TM leverage.
◦Added new options in Find & Remove Excessive Spaces: you can now remove spaces before and after certain punctuation, e.g., remove an erroneous space before final punctuation (?!.) or after a parenthesis.
◦Added a new feature in Tag Cleaner (Document Cleaner) – if Thick Underline formatting is applied to bold text (which may occur with some PDF conversion tools), Tag Cleaner automatically changes it to Single Underline formatting to avoid unnecessary tags in your CAT tool.
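To make two of these changes more concrete, here is a rough Python sketch of the kind of text clean-up that the Quotation Magic and Find & Remove Excessive Spaces entries describe. It is not how TransTools itself is implemented, and the function names (straighten_quotes, remove_spaces_around_punctuation) are made up for illustration. It also works on plain strings only, whereas the add-in operates on Word documents.

```python
import re

# Hypothetical helper names; this only illustrates the idea on plain strings.

# Map curly (directional) quotation marks to their straight equivalents.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark, also used as an apostrophe
})


def straighten_quotes(text: str) -> str:
    """Replace curly quotes and apostrophes with straight ones."""
    return text.translate(CURLY_TO_STRAIGHT)


def remove_spaces_around_punctuation(text: str) -> str:
    """Drop spaces before ?, ! or . and after an opening parenthesis.

    Deliberately naive: it would also close up cases such as "use .NET",
    so a real tool would want to confirm each change with the user.
    """
    text = re.sub(r"[ \t]+([?!.])", r"\1", text)  # "right ?" becomes "right?"
    text = re.sub(r"\([ \t]+", "(", text)         # "( see" becomes "(see"
    return text


if __name__ == "__main__":
    print(straighten_quotes("\u201cIt\u2019s done\u201d"))
    print(remove_spaces_around_punctuation("Is this right ? ( see note)"))
```

Straightening quotes before importing into a CAT tool means segments that differ only in quote style can match the same translation memory entry, which is the "better TM leverage" the changelog mentions.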