Archive for the ‘Uncategorized’ Category

In a recent discussion on cf-talk the question was asked how to improve the performance of ColdFusion when working with very large XML documents. One of the solutions proposed was to use StAX, and that got me thinking. StAX is a stream processor that works very differently from what you may be used to from other XML processors. Instead of viewing an XML document as a whole, with elements in context to their parents, children and siblings, it treats the whole document as a sequence of items. Each of these items can be of type elementstart, elementend, comment, entity etc. You work with it by iterating through all the items in your document and processing them one by one. That is sufficiently different to make it necessary to rewrite all your processing from scratch if you want to switch from the built-in processor to StAX, which makes it a less attractive solution.
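To make the "sequence of items" idea concrete, here is a minimal Java sketch of a StAX cursor loop, using the `javax.xml.stream` API that ships with the JDK. It is an illustration only: the class name `StaxEventWalk`, the method `walk` and the sample document are made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: lists the items StAX reports for a document, in order.
class StaxEventWalk {

    static List<String> walk(String xml) {
        List<String> items = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Report adjacent text as a single CHARACTERS item.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                // There is no tree: we only ever see the current item.
                while (reader.hasNext()) {
                    switch (reader.next()) {
                        case XMLStreamConstants.START_ELEMENT:
                            items.add("start:" + reader.getLocalName());
                            break;
                        case XMLStreamConstants.END_ELEMENT:
                            items.add("end:" + reader.getLocalName());
                            break;
                        case XMLStreamConstants.CHARACTERS:
                            if (!reader.isWhiteSpace()) {
                                items.add("text:" + reader.getText());
                            }
                            break;
                        case XMLStreamConstants.COMMENT:
                            items.add("comment");
                            break;
                        default:
                            // Other item types are ignored in this sketch.
                            break;
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return items;
    }

    public static void main(String[] args) {
        for (String item : walk("<mailbox><mail>hi</mail><!-- note --></mailbox>")) {
            System.out.println(item);
        }
    }
}
```

Note that nothing in the loop knows about parents or siblings. Any such context has to be tracked by your own code, which is why existing DOM-style processing cannot be reused as-is.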

But what if we combine a preprocessing step in StAX, which splits the large XML document into smaller pieces, with the regular processing in ColdFusion? StAX is Java, so it is easy to integrate into ColdFusion, and I wrote a sample implementation to test whether this would help. It has some limitations, such as only handling elements, element text and attributes, but it seems to work just fine (and the code is open for improvement). With this I benchmarked some XML files I downloaded from the internet, with the following results:

Source file | Source size | Split on | Records | Time
http://www.ins.cwi.nl/projects/xmark/Assets/standard.gz | 111 MB | regions | 1 | 24274 ms
http://www.ins.cwi.nl/projects/xmark/Assets/standard.gz | 111 MB | mailbox | 21750 | 146999 ms
ftp://ftp.nlm.nih.gov/nlmdata/sample/medline/medsamp2011h.xml.zip | 164 MB | 30000 | 30000 | 472043 ms

As you can see, how you split a document has a significant impact on the run time. I presume this is mostly due to the write operations on my laptop with its slow 5400 rpm hard disk. On the other hand, in the best case the parsing speed is over 4 MB per second (111 MB in about 24 seconds). Memory consumption stayed under 200 MB for the whole server, so it looks like there are some scenarios where this might be useful.

Code for xmlSplitter.cfc, tested on CF 9.01, 64-bit with StAX 1.2.0 and Java 1.6u24 64-bit.

Product:               Seapine TestTrack Pro
Vulnerable versions:   2010.x, 2011.x
Vulnerability:         predictable session cookies
Vendor informed:       2010-09-07
Fix available:         no

Info:
TestTrack Pro is an issue tracking application from Seapine

Vulnerability:
TestTrack Pro offers a SOAP interface which works as follows:
- connect with username and password to retrieve a list of available
  projects: getProjectList(username, password);
- connect with username and password to retrieve a session login cookie
  on a project: projectLogon(project, username, password);
- query the system to retrieve project data using the session login
  cookie to authenticate: getRecordListForTable (cookie, .....);
- log off the session: databaseLogoff(cookie).

The session login cookies generated by the server are predictable. Below
is a log file from the connections showing the date and time of a log
entry, and then the cookie used for authentication:
"09/07/10","11:18:19","1246111"
"09/07/10","11:18:22","1246115"
"09/07/10","11:18:44","1246123"
"09/07/10","11:18:46","1246127"
"09/07/10","11:18:51","1246132"
"09/07/10","11:18:53","1246139"
"09/07/10","11:19:16","1246144"
"09/07/10","11:19:18","1246151"
"09/07/10","11:19:33","1246156"
"09/07/10","11:19:35","1246163"
"09/07/10","11:19:51","1246167"
"09/07/10","11:19:53","1246175"

The absolute value of the session cookie is related to the server
uptime, starting near 0 when the server is just started and increasing
monotonically afterwards.
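As a rough illustration of how small the search space is, the Java sketch below takes only the cookie values from the log above and computes the gap between consecutive cookies. The class and method names are hypothetical, and the sketch does not talk to a TestTrack server.

```java
import java.util.Arrays;

// Hypothetical helper: measures how far apart consecutive session cookies are.
class CookieGapSketch {

    // Cookie values copied from the log above, in order of appearance.
    static final long[] OBSERVED = {
        1246111L, 1246115L, 1246123L, 1246127L, 1246132L, 1246139L,
        1246144L, 1246151L, 1246156L, 1246163L, 1246167L, 1246175L
    };

    // Difference between each cookie and the one issued before it.
    static long[] gaps(long[] cookies) {
        long[] result = new long[cookies.length - 1];
        for (int i = 1; i < cookies.length; i++) {
            result[i - 1] = cookies[i] - cookies[i - 1];
        }
        return result;
    }

    static long maxGap(long[] cookies) {
        long max = 0;
        for (long gap : gaps(cookies)) {
            max = Math.max(max, gap);
        }
        return max;
    }

    static boolean strictlyIncreasing(long[] cookies) {
        for (long gap : gaps(cookies)) {
            if (gap <= 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("gaps: " + Arrays.toString(gaps(OBSERVED)));
        System.out.println("largest gap: " + maxGap(OBSERVED));
        System.out.println("strictly increasing: " + strictlyIncreasing(OBSERVED));
    }
}
```

In this sample the gaps stay in the single digits, so someone who holds one valid cookie would only have to try a handful of nearby values to hit a cookie issued to another session shortly before or after.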

History:
2010-09-07 Seapine was informed and assigned case number 121426
2010-09-08 Seapine confirmed the issue as a known issue and scheduled a
           fix in 'an upcoming 2011.0.x maintenance release'.
2010-12-20 TestTrack 2011.1 was released without a fix.
2010-12-24 Seapine was asked to publish a security bulletin detailing
           risks and mitigations despite no fix being available
2011-02-02 Seapine was informed this issue would be publicly disclosed
2011-02-13 Submitted to Bugtraq and published on my blog

[Screenshot: forumclient001]

It doesn’t look like much yet, but the alpha 2 is available for download. Server communication stuff should mostly work, except for the gazillion bugs and missing functions in the Jive webservices. UI is modelled after the Thunderbird NNTP client, so you download a list of forums, then you subscribe to certain forums and then messages for those forums will be downloaded into a local SQLite database so you can even read them offline.

This Sunday at long last my first real server “Spike” died. Spike wasn’t really my server, but it was the first server that wasn’t just for my entertainment and for which I carried final responsibility. Purchased in February 2001, it entered service as a shared hosting server for a not-for-profit in March 2001. In the 8 years and 1 month it ran, the only real problem it had was a worn-down CPU fan causing an overheated CPU, until Sunday morning the primary hard disk died and it was retired from service.

With its demise I am truly saying goodbye to an era (or perhaps to a relic): Spike ran trusty old Windows NT4 SP6a with ColdFusion Enterprise 4.5.2. With it gone, the oldest production machine I have access to is a Windows 2003 system (7 years younger than NT4) with CF 7 (6 years younger than CF 4.5). Spike itself is replaced with a machine with Windows 2003 and CF 8.0.1. And the contrast between how it worked then and how it works now is quite profound, at least in the area of security configuration; CFML itself is sufficiently backward compatible to just drop it on the new server.

2001, the year Spike was configured, was just before the height of the ‘hackable internet’. A few months after it was taken into production we saw the release of Code Red, followed shortly by Nimda. At that time, a large part of the servers connected to the internet was vulnerable to attacks. (Nowadays vulnerabilities are more of a client problem, or arguably a user problem, than a server problem.) And that showed in the way Spike was built. It took me several weeks to come up with a stable and secure configuration, with all sorts of weird constraints. To build Spike I followed the NSA guidelines for configuring a Windows NT system, which for instance meant I wasn’t supposed to install any graphics driver, because no driver was NSA certified. And the way I ended up running ColdFusion, with Sandbox Security configured to impersonate OS accounts, once even earned me the comment from a Macromedia engineer that I was the only one in the world with that configuration in production. But the result was there: even with the onslaught of Code Red and Nimda it took 7 months before there was a Windows patch that was applicable to the hardened configuration.

Contrast this with how I threw a new server online. Windows installation was a default installation, after which I had to add components instead of removing them. When installing IIS I had to add file types and extensions, instead of removing them. When configuring Sandbox Security for ColdFusion I could easily find anything I wanted on the subject, because there are dozens of people blogging about it. Obviously some of the ease of installing a new server is due to more experience on my side, but I think it is hard to deny that the “secure by default” mindset has made inroads.

Just a quick update on the two avenues I am pursuing to get the new Adobe forums to open up and share their content.

Adobe is going to follow up on the email headers with Jive to see what they can implement. Hopefully this will soon result in the References and / or In-Reply-To headers being added to the outgoing email notifications, so email clients can thread them properly. It should be almost trivial to implement, because judging by the definition of the ForumMessage object each forum message knows its own ID to put in the Message-ID header and also knows its ParentMessageID to put in the References header.
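A minimal Java sketch of that mapping might look like the following. It only assumes what the ForumMessage definition suggests (a message ID and an optional ParentMessageID). The class name, the method names and the use of forums.adobe.com as the domain part of the Message-ID are assumptions for illustration, not how Jive actually builds its notifications.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: derives email threading headers from forum message IDs.
class ThreadingHeadersSketch {

    // Message-IDs have the form <local-part@domain>; the domain is an assumption.
    static String messageId(long id, String domain) {
        return "<" + id + "@" + domain + ">";
    }

    // parentMessageId is null for the first message of a thread.
    static Map<String, String> headersFor(long id, Long parentMessageId, String domain) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Message-ID", messageId(id, domain));
        if (parentMessageId != null) {
            String parent = messageId(parentMessageId, domain);
            headers.put("In-Reply-To", parent);
            // Only the direct parent is known here, so References holds just that one ID.
            headers.put("References", parent);
        }
        return headers;
    }

    public static void main(String[] args) {
        Map<String, String> reply = headersFor(1002L, 1001L, "forums.adobe.com");
        for (Map.Entry<String, String> header : reply.entrySet()) {
            System.out.println(header.getKey() + ": " + header.getValue());
        }
    }
}
```

Ideally References would list the whole chain of ancestors, but most email clients can rebuild a thread as long as each notification points at its direct parent.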

The other avenue is getting access through the webservice APIs. I have been looking at the login sequence of the Adobe site and it looks like the master cookie to determine whether you are logged in is the AUID cookie for the .adobe.com domain. Subsites may add more cookies and eventually maintain state through their own cookies, but if you have the AUID cookie you can get in. So I have reduced the dozens of requests involved in a browser logging in to the forums (many of them image requests) to two requests a client must make. I have included the minimal wget testscript to simulate the login here.

REM parameters
REM percent signs in the URL-encoded values are doubled so cmd.exe does not expand them as script parameters
SET USERNAME=spam@vandieten.net
SET PASSWORD=sihtyrt
SET RETURNURL=http%%3A%%2F%%2Fforums.adobe.com%%2Flogin.jspa
SET LOGINURL="https://www.adobe.com/cfusion/entitlement/index.cfm?loc=en&e=ca&returnurl=http%%3A%%2F%%2Fforums%%2Eadobe%%2Ecom%%2Flogin%%2Ejspa"
SET FORUMURL=http://forums.adobe.com/index.jspa

REM remove old cookie and index
del adobe.txt
del forumindex.html

REM login: post the credentials and save the cookies that come back
wget -S --no-check-certificate --save-cookies=adobe.txt --post-data "returnURL=%RETURNURL%&up_login=yes&ignore_email_validation=yes&up_username=%USERNAME%&has_pwd=true&up_password=%PASSWORD%" %LOGINURL%
REM get forum index using the saved cookies
wget -S --load-cookies=adobe.txt -O forumindex.html "%FORUMURL%"

I need to tweak this a bit to do proper login validation, but it looks like just ignoring all the authentication mechanisms in the API and running these requests instead and then maintaining the collected cookies with all webservice calls will do the trick. Even though the entire login sequence will end up taking close to 15 seconds due to the response times of the Adobe sites, that is progress!

Every year we all go on a trip together for a few days with everybody from Prisma IT. This year the trip is to Iceland. This is all of us having lunch at Schiphol airport before boarding.

Prisma IT having lunch at Schiphol airport

So for the next two days we will all be out of office. Well, not really all of us, Richard is staying in the Netherlands to answer the phone …

Today Google released a first beta of Google Chrome, a new web browser. Just installing it already showed how Google Chrome is different from other browsers. Since I am always logged on to my laptop as a normal user, I right-clicked the installer, chose the “Run as” option and installed it under the Administrator account. The result was surprising: it didn’t work. There were no error messages during installation, but I couldn’t get any website to open. After some searching I figured out that Google Chrome installs itself in “%USERPROFILE%\Local Settings\Google\”, so you have to install it as the user that is going to use it. After installing it as myself I could see pages and the fun could start.

One of the touted differences between Chrome and current browsers is that Chrome runs each tab in a separate process. The processes are independent, so one process will not affect another, even when it crashes. (Reportedly the current IE 8 beta has the same feature, but I haven’t tried that yet.) So when you look in the Windows Task Manager you see separate processes for all tabs. Chrome’s own Task Manager goes a little deeper, since it shows which process belongs to which tab, but the really good stuff is in the “Stats for Nerds”, which shows detailed information. I have included screenshots of all three below.

[Screenshots: Windows Task Manager view, Google Chrome Task Manager, Stats for Nerds]

The most remarkable thing other than that is how unremarkable Chrome is. It is very unobtrusive, with very little chrome and lots of room for content, it feels extremely responsive and it just works. It didn’t ask if I wanted to install the Flash Player, it just worked. I don’t even know if it is embedded or if Chrome borrows the Firefox plugin or something. But since it runs in its own process the memory consumption becomes visible. So I guess pretty soon people will start to look a whole lot closer at the memory consumption of Flex applications.

All in all the first impression is very good. After the installation hurdle it just works and I really like the security / process / memory model (it feels like Unix and specifically like the model of PostgreSQL). And I mean really like it. Now I know I am a bit peculiar in my security habits / preferences (how many others will have run into the installation issues I did?), but I really take such issues into consideration when I choose tools. And so far Google Chrome is scoring big points.

It is hard to believe anybody has missed the 2 big browser announcements of today. First, Microsoft announced that “IE8 will, by default, interpret web content in the most standards compliant way it can”. Second, The Web Standards Project has released the Acid3 test.

In the torrent of messages about these 2 big announcements I found a link to a service that I hadn’t heard of before: browsershots. It allows you to select a number of browsers on different platforms and generates screenshots of the way they render a page. And of course somebody has submitted the Acid3 test to browsershots, so we can now all see how the different browsers score.

So at long last I have a blog. Thought about it a while, played with it a while, and finally made the jump. What you can expect here is mostly related to my work and IT interest (ColdFusion, LiveCycle, Flex, FMS, PostgreSQL and networking), with the occasional ‘human interest’ story. Probably over the next week or so I will also make some changes to this blog to tie up the remaining loose ends:

  • import my old content (customtags etc.)
  • make my own layout
  • get comments without login working

And in case somebody is wondering, “it could be bunnies” is a quote from the episode Once More With Feeling of Buffy the Vampire Slayer. Because not only is Buffy the Vampire Slayer one of the most awesome TV shows ever made, it is also a constant reminder to think outside the box.