Archive for February, 2011

In a recent discussion on cf-talk the question was asked how to improve the performance of ColdFusion when working with very large XML documents. One of the solutions proposed was to use StAX and that got me thinking. StAX is a stream processor works very different from what you may be used to from other XML processors. Instead of viewing an XML document as a whole and elements in context to their parents, children and siblings, it just treats the whole document as a sequence of items. Each of these elements can be of type elementstart, elementend, comment, entity etc. The way you work with this is you iterate through all the items in your document and process them one by one. Working that way is sufficiently different to make it necessary to rewrite all your processing from scratch if you want to switch from the built-in processor to StAX which makes it a solution that is not so attractive.

But what if we combine a preprocessing step in StAX to split the large XML document into smaller pieces with the regular processing in ColdFusion? StAX is Java so it is easy to integrate it into ColdFusion and to test this I wrote a sample implementation to test if this would help. It has some limitations such as only handling elements, element text and attributes, but it seems to work just fine (and the code is open for improvement). With this I benchmarked some XML files I downloaded from internet with the following results:

Source file Source size Split on Records Time
http://www.ins.cwi.nl/projects/xmark/Assets/standard.gz 111 MB regions 1 24274 ms
http://www.ins.cwi.nl/projects/xmark/Assets/standard.gz 111 MB mailbox 21750 146999 ms
ftp://ftp.nlm.nih.gov/nlmdata/sample/medline/medsamp2011h.xml.zip 164 MB 30000 30000 472043 ms

As you can see how you are splitting a document has a significant impact. I presume this is mostly due to the impact the write operations have on my laptop with a slow 5400 rpm harddisk. On the other hand in the best case scenario the parsing speed is over 4 MB per second. Memory consumption stayed under 200 MB for the whole server so it looks like there are some scenario’s where this might be useful.

Code for xmlSplitter.cfc, tested on CF 9.01, 64-bit with StAX 1.2.0 and Java 1.6u24 64-bit.

Product:               Seapine TestTrack Pro
Vulnerable versions:   2010.x, 2011.x
Vulnerability:         predictable session cookies
Vendor informed:       2010-09-07
Fix available:         no

Info:
TestTrack Pro is an issue tracking application from Seapine

Vulnerability:
TestTrack Pro offers a SOAP interface which works as follows:
- connect with username and password to retrieve a list of available
  projects: getProjectList(username, password);
- connect with username and passsword to retrieve a session login cookie
  on a project: projectLogon(project, username, password);
- query the system to retrieve project data using the session login
  cookie to authenticate: getRecordListForTable (cookie, .....);
- log off the session: databaseLogoff(cookie).

The session login cookies generated by the server are predictable. Below
is a log file from the connections showing the date and time of a log
entry, and then the cookie used for authentication:
"09/07/10","11:18:19","1246111"
"09/07/10","11:18:22","1246115"
"09/07/10","11:18:44","1246123"
"09/07/10","11:18:46","1246127"
"09/07/10","11:18:51","1246132"
"09/07/10","11:18:53","1246139"
"09/07/10","11:19:16","1246144"
"09/07/10","11:19:18","1246151"
"09/07/10","11:19:33","1246156"
"09/07/10","11:19:35","1246163"
"09/07/10","11:19:51","1246167"
"09/07/10","11:19:53","1246175"

The absolute value of the session cookie is related to the server
uptime, starting near 0 when the server is just started and increasing
monotonic afterwards.

History:
2010-09-07 Seapine was informed and assigned case number 121426
2010-09-08 Seapine confirmed the issue as a known issue and scheduled a
           fix in 'an upcoming 2011.0.x maintenance release'.
2010-12-20 TestTrack 2011.1 was released without a fix.
2010-12-24 Seapine was asked to publish a security bulletin detailing
           risks and mitigations despite no fix being availale
2011-02-02 Seapine was informed this issue would be publicly disclosed
2011-02-13 Submitted to bugtrack and published on my blog