Scripting example - regular expressions and Dictionaries

And here's another one with some regular expression comparisons.

The occasion was a requirement to link some OS X mail servers together. Unfortunately Apple didn't compile LDAP support into Postfix, so we're using output from the Apple Directory to generate text files for inter-branch mail routing.

#!/usr/local/bin/gst -f
"read a text file with LDIF records and generate a Postfix virtuals file
  from the records.
  Require uid and keyword attributes.
   - uid is home directory shortname and mail aliases e.g. jim, jim.bob
   - keyword is one of BRNCH1, BRNCH2, or BRNCH3 and represents branch
LDIF file includes records like the ones below. Just want email accounts.

mail:postfix administrator$ ldapsearch -x -b cn=users,dc=example,dc=com uid apple-keyword
# extended LDIF
# LDAPv3
# base <cn=users,dc=example,dc=com> with scope subtree
# filter: (objectclass=*)
# requesting: uid mail apple-keyword 

# users,
dn: cn=users,dc=example,dc=com

# diradmin, users,
dn: uid=diradmin,cn=users,dc=example,dc=com
uid: diradmin

# fredj, users,
dn: uid=fredj,cn=users,dc=example,dc=com
uid: fredj
uid: fred.jones
apple-keyword: BRNCH1

# bobs, users,
dn: uid=bobs,cn=users,dc=example,dc=com
uid: bobs
apple-keyword: BRNCH3

Virtuals file will be like this:
------------------------------------
<domain>                anything
<alias>@<domain>                <alias>@<branch server>
"

Object subclass: #MailAcct
        instanceVariableNames: 'uidList branchDict branchServer uidName keywordName'
        classVariableNames: ''
        poolDictionaries: ''
        category: 'SG-Scripting' !

MailAcct comment: 
'LDIF parsing; Attributes stored in a dictionary. Note: handling multivalued attributes
would require a dictionary of collections. Have not needed to go there
for the attributes I am handling.' !

!MailAcct methodsFor: 'instance creation'!
init
        "set attribute names, the branch lookup table and an empty alias list"
        uidName := 'uid'.
        keywordName := 'apple-keyword'.
        branchDict := Dictionary new; from: {
            'BRNCH1' -> ''.
            'BRNCH2' -> ''.
            'BRNCH3' -> ''}.
        uidList := OrderedCollection new.
        ! !

!MailAcct methodsFor: 'accessing'!
uidList
    ^uidList !

addUid: aString
    "expecting a String"
    uidList add: aString. !

branchServer
    ^branchServer !

assignBranch: aKey
    "private method"
    branchServer := branchDict at: aKey ifAbsent: [] !

addKeyValue: anAttribute value: aValue
    "either a keyword or a UID"
    (anAttribute = uidName) ifTrue: [self addUid: aValue].
    (anAttribute = keywordName) ifTrue: [
            self assignBranch: aValue]. !

validMailAcct
    "Yes, if have a branchServer for this person"
    ^branchServer isNil not. ! !

Object subclass: #VirtualGen
        instanceVariableNames: 'domain mailAccts'
        classVariableNames: ''
        poolDictionaries: ''
        category: 'SG-Scripting'!

VirtualGen comment: 
'Write out Postfix virtual file for the MailAccts collection' !

!VirtualGen methodsFor: 'instance creation'!
init: theMailAccts
        domain := ''.
        mailAccts := theMailAccts.            !

!VirtualGen methodsFor: 'operation' !
printRecords: aStream
"print the virtual entries. Prints out Postfix virtual file"
        aStream display: domain,'                anything'; nl.
        mailAccts do: [ :mailAcct |
                mailAcct uidList do: [:alias |
                        aStream display: ('%1@%2                %1@%3' bindWith: alias with: domain with: mailAcct branchServer); nl.
                ]
        ]. ! !

| ldifFilePath ldifFile virtualsPath virtualsFile ldapBase mailAccts mailAcct rStream dataFlag matchIndex attrib value virtualGen |
"main program begins here"
    ldifFilePath :=         '/Volumes/doc/osx_ch_export.ldif'.
    virtualsPath :=         '/Volumes/doc/virtual'.
    ldapBase :=                 'cn=users,dc=example,dc=com'.
    virtualsFile := File name: virtualsPath.
    ldifFile := File name: ldifFilePath.
    dataFlag := false.
    mailAccts := OrderedCollection new.
    mailAcct := MailAcct new init.
    ldifFile readStream linesDo: [:line |        
        ( line =~ ldapBase) matched ifTrue: [
                dataFlag := true ].
        dataFlag ifTrue: [
                ( line =~ ':') ifMatched: [:match |
                        matchIndex := match to.
                        attrib := line copyFrom: 1 to: matchIndex - 1.
                        value := line copyFrom: matchIndex + 2.
                        "Transcript showCr: attrib,' : ',value." 
                        mailAcct addKeyValue: attrib value: value ].
                ( line =~ '^$') matched  ifTrue: [
                    "reached blank line: only add to MailAcct collection if a mailAcct"
                    mailAcct validMailAcct ifTrue: [
                            mailAccts add: mailAcct ].
                    mailAcct := MailAcct new init.
                    dataFlag := false ].
        ] ].
    Transcript nl; showCr: 'Output '; nl.
    Transcript showCr: 'mailAccts size: ',mailAccts size printString.
    virtualGen := VirtualGen new init: mailAccts.
    virtualGen printRecords: virtualsFile writeStream.        


is this a beginner smalltalk script? holy crap.
i've had to dig deep into:
info gst

to find any kind of beginner tutorials and luckily they were a lot simpler, but i'm sure your tutorial is a good one, no disrespect =)

there is nothing here in this wiki for beginners.

After I'd done the tutorial that is on the GNU Smalltalk home page, there was quite a gap that I had to fill in to get real things done. Like how to open files, how to use a Dictionary, how to incorporate objects in a server script etc. Once someone has their head around the syntax, I expect they would find the scripts helpful - I would have anyway.

Well, maybe it's better to change the title of the blog entry to "Putting it together in a script: Dictionaries and regular expressions" or something like that. :-)


Instead of Dictionary new; from: {...} just use Dictionary from: {...}.

Indeed, the former first does Dictionary new, creating a new empty dictionary, and then does Dictionary from: {...}, discarding the previous dictionary and creating a new one.
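To make that concrete, here is a minimal sketch that can be run with gst. The server names are made up for illustration (they are not the values from the script above), and the 'BRNCH9' key is just a stand-in for an unknown branch.

```smalltalk
| branchDict |
"Hypothetical server names, for illustration only."
branchDict := Dictionary from: {
    'BRNCH1' -> 'mail1.example.com'.
    'BRNCH2' -> 'mail2.example.com'}.

"Look up a known key, then an unknown one with a nil fallback."
(branchDict at: 'BRNCH1') displayNl.
(branchDict at: 'BRNCH9' ifAbsent: [nil]) printNl.
```

Only one dictionary is ever created this way, and the intent is easier to read than the cascade.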

Only suggestion I have: if you're only interested in the boolean result of #=~ (matched/unmatched), use #~ to do the regular expression match.

Alternatively you can use #ifMatched:ifNotMatched: (there are of course all the other variants as for #ifTrue:ifFalse:). In GST 3.0.x the "matched" block needs one parameter; in 3.1 that parameter is optional.
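A short sketch of both suggestions, assuming a sample line shaped like the LDIF records in the post. The matched block takes one parameter so that it also works on 3.0.x.

```smalltalk
| line |
line := 'apple-keyword: BRNCH1'.

"#~ answers a plain Boolean, so there is no need to send #matched."
(line ~ '^$') ifFalse: [Transcript showCr: 'not a blank line'].

"#=~ answers a match object that can be branched on directly."
(line =~ ':')
    ifMatched: [:match |
        Transcript showCr: (line copyFrom: 1 to: match from - 1)]
    ifNotMatched: [Transcript showCr: 'no attribute on this line'].
```

In the script, the blank-line and ldapBase tests only need the Boolean, so #~ would fit there, while the ':' test uses the match position and would keep #=~.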

Also, you can use % {...} instead of the lengthy #bindWith:with:with: constructs.

Just to document your comment about {...}
brace syntax:

So instead of '%1 %2' bindWith: var1 with: var2,
it would be '%1 %2' % {var1. var2}

Clearer indeed.
