Monday, November 2, 2009

Find and sort acronyms for Microsoft Word document automatically.

We guys work on Information Technology. I always have too many acronyms to deal with when I create a document.  There are many solutions of adding acronym to a word document, such as add as a Bookmark or Hyperlink. But most of them require the author to go through the document and mark all the acronyms manually, which I don’t like.

This script searches through your word document and find all Words >= 3 (configurable) alphabets in Uppercase, then remove duplicate entries and sort by alphabetical order.  This is exactly what I need.  If  you want to try, click Tools->macros->Visual Basic Editor, copy-paste the code below and execute it. There you go!

 

Sub ExtractAcronymsToANewDocument()

Dim oDoc_Source As Document
Dim oDoc_Target As Document
Dim strListSep  As String
Dim strAcronym  As String
Dim oTable      As Table
Dim oRange      As Range

Dim n As Long
Dim strAllFound As String
'use to keep track of founded. Find the list separator from international settings
'In some countries it is comma, in other semicolon

strListSep = Application.International(wdListSeparator)
strAllFound = "#"

Set oDoc_Source = ActiveDocument
'Create new document for acronyms
Set oDoc_Target = Documents.Add

With oDoc_Target
'Make sure document is empty
.Range = ""
'Insert a table with room for acronym and definition
Set oTable = .Tables.Add(Range:=.Range, NumRows:=2, NumColumns:=3)
    With oTable
    'Format the table a bit
    'Insert headings
    .Cell(1, 1).Range.Text = "Acronym"
    .Cell(1, 2).Range.Text = "Definition"
    .Cell(1, 3).Range.Text = "Page"
    'Set row as heading row
    .Rows(1).HeadingFormat = True
    .Rows(1).Range.Font.Bold = True
    .PreferredWidthType = wdPreferredWidthPercent
    .Columns(1).PreferredWidth = 20
    .Columns(2).PreferredWidth = 70
    .Columns(3).PreferredWidth = 10
    End With
End With

With oDoc_Source
Set oRange = .Range
n = 1 'used to count below

    With oRange.Find
    .Text = "<[A-Z]{3" & strListSep & "}>"
    .Forward = True
    .Wrap = wdFindStop
    .Format = False
    .MatchCase = True
    .MatchWildcards = True
    Do While .Execute
    'Continue while found
    strAcronym = oRange
    'Insert in target doc
    'If strAcronym is already in strAllFound, do not add again
    If InStr(1, strAllFound, "#" & strAcronym & "#") = 0 Then
        'Add new row in table from second acronym
        If n > 1 Then oTable.Rows.Add
        'Was not found before
        strAllFound = strAllFound & strAcronym & "#"
        'Insert in column 1 in oTable
        'Compensate for heading row
        With oTable
        .Cell(n + 1, 1).Range.Text = strAcronym
        'Insert page number in column 3
        .Cell(n + 1, 3).Range.Text = oRange.Information(wdActiveEndPageNumber)
        End With
        n = n + 1
        End If
        'If acronym
    Loop
    End With
End With

'Sort the acronyms alphabetically
With Selection
    .Sort ExcludeHeader:=True, FieldNumber:="Column 1", SortFieldType _
    :=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending
    .HomeKey (wdStory)
End With

'Clean up
Set oDoc_Source = Nothing
Set oDoc_Target = Nothing
Set oTable = Nothing
MsgBox "Finished extracting " & n - 1 & " acronymn(s) to a new document."

End Sub