Automating Windows™ from Java® and WindowTreeDom☂

(So anyway there used to be a page on codeproject containing some of these ideas implemented in C#, but I can’t find it any more. I’m sure it exists.)

So let’s say you’re writing your everything-is-a-service web application and you’ve got RESTful and/or SOAPy calls to your middle tier and everything’s going swimmingly, until you discover you need to integrate your app with some ’80s-era RAD Windows application that no-one has the source code to and that embodies some vital bit of business intelligence that you need.

This application was written by IBM on the hard drive platters of angels so isn’t going anywhere any time soon, but for the purposes of this blog post, let’s call that application “Notepad” [*]

[*]: (For those on Windows Metro, Notepad is this application that allows you to write words into a white square on your computer screen and gives you the ability to recall those words later on if you give it what we in the tech industry call a “file name”. This is a bit like a URL, except it doesn’t run on raindrops and ice cream).

Wouldn’t it be nice if your software could automatically interact with that application, wrap it up in a service-oriented bow and then deliver its results to you in a method that some would consider slightly more standardised?

Of course it would.

So let’s have a look at the frontend to Notepad’s data layer, the ‘Save As’ dialog box:

A ‘Save As’ dialog box in Windows XP

The dialog box is made up of windows and controls, which are visible in the Windows™ window hierarchy.

If you fire up the venerable Spy++ from MSDN, you’ll notice that all the objects on your desktop, including the dialog box, are arranged in a treelike structure.

A ‘Save As’ dialog box in Spy++

The Document Object Model (DOM) is also a treelike structure, and has a whole arsenal of utilities that operate on it (XSLT, XPath, that sort of thing).

So let’s represent the window tree as a DOM tree, with each element of the XML tree representing a window in the Windows™ window tree:

...
<window class="Notepad" hwnd="native@0x761476" title="Untitled - Notepad">
  <window class="#32770" hwnd="native@0x2e1528"
    owner="native@0x761476" title="Save As">
    <window class="tooltips_class32" hwnd="native@0x4158a"
      owner="native@0x2e1528" title=""/>
    <window class="tooltips_class32" hwnd="native@0x3159c"
      owner="native@0x2e1528" title=""/>
    <window class="Static" hwnd="native@0x681474" title="Save &amp;in:"/>
    <window class="ComboBox" hwnd="native@0x84141c" title=""/>
    <window class="Static" hwnd="native@0x121534" title=""/>
    <window class="ToolbarWindow32" hwnd="native@0x41580" title=""/>
    <window class="ToolbarWindow32" hwnd="native@0x19154e" title=""/>
    <window class="ListBox" hwnd="native@0x3157c" title=""/>
    <window class="SHELLDLL_DefView" hwnd="native@0x3159a" title="">
      <window class="SysListView32" hwnd="native@0x31596" title="FolderView"/>
    </window>
    <window class="Static" hwnd="native@0x31566" title="File &amp;name:"/>
    <window class="ComboBoxEx32" hwnd="native@0x31574" title="">
      <window class="ComboBox" hwnd="native@0x31578" title="">
        <window class="Edit" hwnd="native@0x31570" title=""/>
      </window>
    </window>
    <window class="Static" hwnd="native@0x3157a" title="Save as &amp;type:"/>
    <window class="ComboBox" hwnd="native@0x31582" title=""/>
    <window class="Button" hwnd="native@0x21568" title="Open as &amp;read-only"/>
    <window class="Button" hwnd="native@0x2156a" title="&amp;Save"/>
    <window class="Button" hwnd="native@0x21564" title="Cancel"/>
    <window class="Button" hwnd="native@0x21562" title="&amp;Help"/>
    <window class="ScrollBar" hwnd="native@0x2155a" title=""/>
    <window class="#32770" hwnd="native@0x2155c" title="">
      <window class="Static" hwnd="native@0x21586" title="&amp;Encoding:"/>
      <window class="ComboBox" hwnd="native@0x2155e" title=""/>
    </window>
    <window class="IME" hwnd="native@0xd1554" owner="native@0x2e1528" title="Default IME">
      <window class="MSCTFIME UI" hwnd="native@0x7b1478"
        owner="native@0xd1554" title="M"/>
    </window>
  </window>
  <window class="Edit" hwnd="native@0x11152c" title=""/>
  <window class="msctls_statusbar32" hwnd="native@0xbe1422" title=""/>
</window>
...
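
If you’re wondering how a flat stream of enumerated windows ends up nested like that, here’s a minimal sketch of the idea (it is not the real WindowTreeDom code, which is further down). It assumes each window arrives as a made-up row of hwnd, parent hwnd, class and title; the class name WindowTreeSketch and its build method are invented for this example. A window whose parent hasn’t been seen yet gets hung off a placeholder element, which is filled in when the parent turns up.

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class WindowTreeSketch {

    /** Creates an empty DOM document, hiding the checked exception. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException("Could not create DOM document", e);
        }
    }

    /** Folds rows of {hwnd, parentHwnd (or null), class, title} into a windows document. */
    static Document build(String[][] rows) {
        final Document doc = newDocument();
        Element root = doc.createElement("windows");
        doc.appendChild(root);

        Map<String, Element> byHwnd = new HashMap<>();
        for (String[] row : rows) {
            // reuse the placeholder if an earlier child already referred to this hwnd
            Element el = byHwnd.computeIfAbsent(row[0], k -> doc.createElement("window"));
            el.setAttribute("hwnd", row[0]);
            el.setAttribute("class", row[2]);
            el.setAttribute("title", row[3]);
            if (row[1] == null) {
                // no parent: a top-level window
                root.appendChild(el);
            } else {
                // parent not seen yet? create a placeholder carrying only its hwnd
                Element parent = byHwnd.computeIfAbsent(row[1], k -> {
                    Element p = doc.createElement("window");
                    p.setAttribute("hwnd", k);
                    return p;
                });
                parent.appendChild(el);
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        // deliberately out of order: the button arrives before its dialog
        Document doc = build(new String[][] {
            { "native@0x2156a", "native@0x2e1528", "Button", "&Save" },
            { "native@0x761476", null, "Notepad", "Untitled - Notepad" },
            { "native@0x2e1528", "native@0x761476", "#32770", "Save As" },
        });
        Element notepad = (Element) doc.getDocumentElement().getFirstChild();
        Element dialog = (Element) notepad.getFirstChild();
        System.out.println(notepad.getAttribute("title") + " > " + dialog.getAttribute("title"));
    }
}
```

The map keyed on hwnd is what makes the ordering not matter: whichever of parent or child shows up first, both end up sharing the same Element.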

Then to find a particular window, we could code up something similar to the following:

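WindowTreeDom wtd = new WindowTreeDom();
Document dom = wtd.getDom();
// the DOM holds the unescaped title, so the query matches "&Save", not "&amp;Save"
// (selectSingleNode assumes XPathAPI is Xalan's org.apache.xpath.XPathAPI)
Element saveButtonEl = (Element) XPathAPI.selectSingleNode(dom, 
    "//window[@class='Notepad']/window[@title='Save As']/window[@title='&Save']");
// BM_CLICK (0x00F5) is the button-click message; this assumes you've added a
// SendMessage mapping to the User32 interface, which the class below doesn't have
User32.INSTANCE.SendMessage(wtd.getHwnd(saveButtonEl), BM_CLICK, 0, 0);

which grabs a handle to the Save button on a Save As dialog for a Notepad window, and sends a BM_CLICK message to the control’s window procedure.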

Which is useful if you want to automatically press the Save button on a Save As dialog on a notepad window without paying half a million dollars for HP LoadRunner.
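
If you’d like to try the XPath side of this without a Windows box handy, the JDK’s own javax.xml.xpath will do. The sketch below runs the same sort of query against a hand-trimmed copy of the XML above; the class name WindowXPathSketch and the SAMPLE constant are invented for this example. Note that the query matches '&Save', because the &amp; only exists in the serialised XML.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class WindowXPathSketch {

    /** A cut-down copy of the window tree shown earlier, wrapped in a windows root element. */
    static final String SAMPLE =
        "<windows>"
      + "<window class=\"Notepad\" hwnd=\"native@0x761476\" title=\"Untitled - Notepad\">"
      + "<window class=\"#32770\" hwnd=\"native@0x2e1528\" owner=\"native@0x761476\" title=\"Save As\">"
      + "<window class=\"Button\" hwnd=\"native@0x2156a\" title=\"&amp;Save\"/>"
      + "<window class=\"Button\" hwnd=\"native@0x21564\" title=\"Cancel\"/>"
      + "</window>"
      + "<window class=\"Edit\" hwnd=\"native@0x11152c\" title=\"\"/>"
      + "</window>"
      + "</windows>";

    /** Parses an XML string into a DOM document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Returns the Save button element of Notepad's Save As dialog, or null if there isn't one. */
    static Element findSaveButton(Document dom) {
        try {
            return (Element) XPathFactory.newInstance().newXPath().evaluate(
                "//window[@class='Notepad']/window[@title='Save As']/window[@title='&Save']",
                dom, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        Element save = findSaveButton(parse(SAMPLE));
        System.out.println(save.getAttribute("hwnd")); // prints native@0x2156a
    }
}
```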

So here’s the code already:

WindowTreeDom.java
package com.randomnoun.common.jna;
 
/* (c) 2013 randomnoun. All Rights Reserved. This work is licensed under a
 * BSD Simplified License. (http://www.randomnoun.com/bsd-simplified.html)
 */
 
import java.util.HashMap;
import java.util.Map;
 
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.TransformerException;
 
import org.apache.log4j.Logger;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
 
import com.sun.jna.Native;
import com.sun.jna.Pointer;
import com.sun.jna.platform.win32.WinUser;
import com.sun.jna.platform.win32.WinDef.DWORD;
import com.sun.jna.platform.win32.WinDef.HWND;
import com.sun.jna.platform.win32.WinUser.WNDENUMPROC;
import com.sun.jna.win32.StdCallLibrary;
import com.sun.jna.win32.W32APIOptions;
 
import com.randomnoun.common.XmlUtil;
 
/** A class to convert the Win32 windows tree into a DOM object
 * 
 * @blog http://www.randomnoun.com/wp/2012/12/26/automating-windows-from-java-and-windowtreedom/
 * @author knoxg
 * @version $Id: WindowTreeDom.java,v 1.4 2013-09-24 02:37:09 knoxg Exp $
 */
public class WindowTreeDom {
 
    /** A revision marker to be used in exception stack traces. */
    public static final String _revision = "$Id: WindowTreeDom.java,v 1.4 2013-09-24 02:37:09 knoxg Exp $";
 
	/** The User32 functions we invoke from this class */
	public interface User32 extends StdCallLibrary {
		User32 INSTANCE = (User32) Native.loadLibrary("user32", User32.class, 
			W32APIOptions.DEFAULT_OPTIONS);
 
		/** GetWindow() command which retrieves a window's owner, if it has one */
		DWORD GW_OWNER = new DWORD(4);
		boolean EnumWindows(WNDENUMPROC lpEnumFunc, Pointer arg);
		boolean EnumChildWindows(HWND hWnd, WNDENUMPROC lpEnumFunc, Pointer data);
		int GetWindowText(HWND hWnd, char[] lpString, int nMaxCount);
		int GetClassName(HWND hWnd, char[] lpClassName, int nMaxCount);
		HWND GetWindow(HWND hWnd, DWORD cmd);
		HWND GetParent(HWND hWnd);
	}
 
	/** JNA interface to USER32.DLL */
	final static User32 lib = User32.INSTANCE;
 
	/** Logger instance for this class */
	static Logger logger = Logger.getLogger(WindowTreeDom.class);
 
	/** WindowTreeDom constructor.
	 * 
	 * @see #getDom()
	 */
	public WindowTreeDom() {
 
	}
 
	/** This callback is invoked for each window found. It generates XML 
	 * {@link org.w3c.dom.Element}s for each window, and attaches them to the supplied 
	 * {@link org.w3c.dom.Document}.
	 * 
	 */
	private static class WindowCallback implements WinUser.WNDENUMPROC {
		Document doc;
		Element documentElement;
		Element topLevelWindow; 
		Map<String, Element> hwndMap = new HashMap<String, Element>();
 
		/** Creates a new window callback
		 * 
		 * @param doc The XML document populated by this callback.
		 * @param topLevelHWND If non-null, the windows being returned should all be
		 *   child windows of this HWND (via EnumChildWindows), otherwise it is
		 *   assumed toplevel windows are returned (via EnumWindows)
		 * @param topLevelWindow If non-null, the document Element within <tt>doc</tt>
		 *   which will contain new child elements.
		 */
		public WindowCallback(Document doc, HWND topLevelHWND, Element topLevelWindow) {
			this.doc = doc;
			this.topLevelWindow = topLevelWindow;
			if (topLevelWindow != null) {
				hwndMap.put(topLevelHWND.getPointer().toString(), topLevelWindow);
			}
			documentElement = doc.getDocumentElement();
		}
 
		public boolean callback(HWND hWnd, Pointer data) {
 
			char[] buffer = new char[512];
			User32.INSTANCE.GetWindowText(hWnd, buffer, 512);
 
			char[] buffer2 = new char[1026];
			int classLen = User32.INSTANCE.GetClassName(hWnd, buffer2, 1026);
 
			String windowTitle = Native.toString(buffer);
			String className = Native.toString(buffer2);
 
			HWND parent = User32.INSTANCE.GetParent(hWnd);
			HWND owner = User32.INSTANCE.GetWindow(hWnd, User32.GW_OWNER);
 
		    // check if this has already been created in the DOM
			Element el = hwndMap.get(hWnd.getPointer().toString());
			if (el==null) {
				el = doc.createElement("window");
			} else {
				el.removeAttribute("pwindow");
			}
			el.setAttribute("hwnd", hWnd.getPointer().toString());
			if (owner!=null) {
				el.setAttribute("owner", owner.getPointer().toString());
			}
			el.setAttribute("title", windowTitle);
			el.setAttribute("class", className);
 
			hwndMap.put(hWnd.getPointer().toString(), el);
			if (topLevelWindow==null) { 
				// this is a real top level element, so enumerate its children
				WindowCallback childDommer = new WindowCallback(doc, hWnd, el);
				// this code relies on being able to enum child windows whilst enumming toplevel windows
				lib.EnumChildWindows (hWnd, childDommer, new Pointer(0));
				try {
					childDommer.checkForOrphanedWindows();
				} catch (TransformerException e) {
					logger.error("Problem serialising orphaned windows to XML", e);
				}
			}
 
			if (parent==null) {
				documentElement.appendChild(el);
				if (topLevelWindow!=null) {
					// have seen VMDragDetectWndClass'es here, presumably a vmware thing
					// (note that this window won't be in the parent callback's hwndMap)
					try {
						logger.warn("Toplevel child window found: " + XmlUtil.getXmlString(el, true));
					} catch (TransformerException e) {
						logger.error("Toplevel child window found, problem serialising toplevel windows to XML", e);
					}
				}
 
			} else {
				Element parentEl = hwndMap.get(parent.getPointer().toString());
				if (parentEl==null) {
					// throw new IllegalStateException("Unknown parent window '" + parent.getPointer().toString() + "'");
					// it appears that we can get IME child windows being returned 
					// by EnumWindows, even though they're not top-level
					parentEl = doc.createElement("window");
					parentEl.setAttribute("pwindow", "true");
					parentEl.setAttribute("hwnd", parent.getPointer().toString());
					hwndMap.put(parent.getPointer().toString(), parentEl); 
				}
				parentEl.appendChild(el);
			}
 
			return true;
		}
 
		/** Lists any window nodes that were generated via enumeration, whose 
		 * parent nodes were not generated.
		 * 
		 * @throws TransformerException
		 */
		public void checkForOrphanedWindows() throws TransformerException {
			for (Element e : hwndMap.values()) {
				if (!e.getAttribute("pwindow").equals("")) {
					// the desktop window isn't in the enumeration
					logger.warn("Parent window found that was not in enumeration: " + XmlUtil.getXmlString(e, true));
					// throw new IllegalStateException("Window found without parent window");
				}
			}
		}
 
	}
 
	/** Generate an XML document from the Win32 window tree */
	public Document getDom() throws ParserConfigurationException, TransformerException {
		DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
		DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
		Document doc = docBuilder.newDocument();
		Element topElement = doc.createElement("windows");
		doc.appendChild(topElement);
 
		WindowCallback dommer = new WindowCallback(doc, null, null);
		lib.EnumWindows (dommer, new Pointer(0));
		dommer.checkForOrphanedWindows();
 
		return doc;
	}
 
	/** Return the hwnd of an element, as a JNA HWND parsed from its 'hwnd' attribute 
	 * 
	 * @param windowEl a window element returned from getDom()
	 * 
	 * @return the hwnd of the element.
	 */
	public HWND getHwnd(Element windowEl) {
        String hwndString = windowEl.getAttribute("hwnd");
        if (hwndString.startsWith("native@0x")) {
        	return new HWND(new Pointer(Long.parseLong(hwndString.substring(9), 16)));
        } else {
        	throw new IllegalStateException("Could not determine HWND of window element: found '" + hwndString + "'");
        }
	}
 
}
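
getHwnd() leans on the hwnd attribute being whatever JNA’s Pointer.toString() produced, which in the XML above looks like native@0x2156a. Here’s that parsing step on its own as a rough sketch, with a formatter for going the other way. HwndStrings and its two methods are names made up for this example and aren’t part of WindowTreeDom or JNA.

```java
class HwndStrings {

    /** The prefix seen on the hwnd attributes in the XML above. */
    static final String PREFIX = "native@0x";

    /** Parses a string such as "native@0x2156a" into the numeric handle value. */
    static long parse(String hwndString) {
        if (hwndString == null || !hwndString.startsWith(PREFIX)) {
            throw new IllegalArgumentException(
                "Expected '" + PREFIX + "<hex>' but found '" + hwndString + "'");
        }
        // parseUnsignedLong copes with values that have the top bit set
        return Long.parseUnsignedLong(hwndString.substring(PREFIX.length()), 16);
    }

    /** Formats a numeric handle value back into the same string form. */
    static String format(long handle) {
        return PREFIX + Long.toHexString(handle);
    }

    public static void main(String[] args) {
        long handle = parse("native@0x2156a");
        System.out.println(handle == 0x2156aL);   // prints true
        System.out.println(format(handle));       // prints native@0x2156a
    }
}
```

Keeping the handle as text in the DOM means it survives serialisation, and you only turn it back into a pointer at the moment you need to call into user32.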

and here’s a unit test:

WindowTreeDomTest.java
package com.randomnoun.common.jna;
 
/* (c) 2013 randomnoun. All Rights Reserved. This work is licensed under a
 * BSD Simplified License. (http://www.randomnoun.com/bsd-simplified.html)
 */
 
import java.io.IOException;
 
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.TransformerException;
 
import junit.framework.TestCase;
 
import org.apache.log4j.Logger;
import org.w3c.dom.Document;
 
import com.randomnoun.common.XmlUtil;
import com.randomnoun.common.log4j.Log4jCliConfiguration;
 
/** Unit test for WindowTreeDom 
 *
 * @blog http://www.randomnoun.com/wp/2012/12/26/automating-windows-from-java-and-windowtreedom/
 **/
public class WindowTreeDomTest extends TestCase {
 
	Logger logger = Logger.getLogger(WindowTreeDomTest.class);
 
	public void testWindowTreeDom() throws ParserConfigurationException, TransformerException, IOException {
		if (System.getProperty("os.name").startsWith("Windows")) {
			WindowTreeDom wtd = new WindowTreeDom();
			Document d = wtd.getDom();
			logger.info(XmlUtil.getXmlString(d, true));
		} else {
			logger.info("Not running tests on operating system '" + System.getProperty("os.name") + "'");
		}
	}
 
	public static void main(String args[]) throws ParserConfigurationException, IOException, TransformerException {
		Log4jCliConfiguration lcc = new Log4jCliConfiguration();
		lcc.init("", null);
		WindowTreeDomTest wtdt = new WindowTreeDomTest();
		wtdt.testWindowTreeDom();
	}
 
}

You’ll need to add JNA and whatever XML toolkit you’re using to your project’s dependencies, which in my pom.xml looks like the following fragment:

<dependency>
   <groupId>net.java.dev.jna</groupId>
   <artifactId>jna</artifactId>
   <version>3.2.4</version>
   <type>jar</type>
   <scope>compile</scope>
</dependency>

Hooray.

Update 2013-09-25: This code is in the com.randomnoun.common:common-public maven artifact:

common-public
com.randomnoun.common:common-public

Update 2021-01-29: and github:

common-public
git@github.com:randomnoun/common-public.git
