Posts tagged ‘Adobe’

I have been promising myself (and others) to write about the ColdFusion UUID implementation for quite a while now, and I feel like I have procrastinated long enough. So at long last, here is the definitive guide to ColdFusion UUIDs, based on many years of experience and a few conversations with the ColdFusion engineering team over beer at MAX.

What is a UUID

A UUID is a Universally Unique Identifier, which is just a fancy name for a 128-bit integer. While 128 bits allow for a really large number of values, that number is not infinite, so a UUID is not truly unique; a conflict is just so rare that we normally presume it is. This 128-bit integer is typically represented as a hexadecimal string split into 5 groups by hyphens in the pattern 8-4-4-4-12. A UUID is typically generated by one of 5 different algorithms, each identified by a version number:

  1. MAC address based
  2. DCE based
  3. MD5 hash based
  4. Random
  5. SHA-1 hash based

Each of these versions offers different guarantees for uniqueness and randomness. For ColdFusion developers the important versions are 1 and 4.
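To make the representation concrete, here is a minimal Java sketch that parses a UUID string and reports its group lengths and version number. The class name `UuidInspect` and the sample value are assumptions chosen purely for illustration; only `java.util.UUID` itself is standard.

```java
import java.util.Arrays;
import java.util.UUID;

final class UuidInspect {
    /** Length of each hyphen-separated group in the canonical string form. */
    static int[] groupLengths(UUID id) {
        String[] groups = id.toString().split("-");
        int[] lengths = new int[groups.length];
        for (int i = 0; i < groups.length; i++) {
            lengths[i] = groups[i].length();
        }
        return lengths;
    }

    public static void main(String[] args) {
        // Arbitrary sample value, used only for illustration.
        UUID id = UUID.fromString("3f2504e0-4f89-11d3-9a0c-0305e82c3301");
        // The 128 bits are stored as two 64-bit halves.
        System.out.println(Long.toHexString(id.getMostSignificantBits()));
        System.out.println(Long.toHexString(id.getLeastSignificantBits()));
        System.out.println(Arrays.toString(groupLengths(id)));
        // The version is the first hex digit of the third group.
        System.out.println("version " + id.version());
    }
}
```

The version number is embedded in the UUID itself, so you can always tell which algorithm produced a given value.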

MAC address based UUIDs

The algorithm for a MAC address based UUID is based on 3 different components:

  1. timestamp
  2. clock sequence
  3. node identifier

The timestamp is a 60-bit integer counting the number of 100 nanosecond increments since the beginning of the Gregorian calendar in 1582. The clock sequence is an initially random number used to prevent duplicate UUIDs when the time is reset backwards for instance through an NTP client. The node identifier is a supposedly unique identification for the node on which the UUID is generated. Since this node identifier is typically the MAC address of one of the NICs of the system this version is commonly referred to as a MAC based UUID.

From this algorithm a few things stand out:

  1. The timestamp will overflow in stardate 3400 or something and from that moment on the generated UUIDs may conflict with earlier generated UUIDs. But since I doubt anybody was generating UUIDs in 1582 it is safe to assume the first actual conflicts from that will occur a few hundred years later.
  2. The UUID is only as unique as the MAC address is. While MAC addresses are supposedly unique anybody who has run a somewhat larger network like a campus network will know that in reality they are not.
  3. It is impossible to generate more than 10 million version 1 UUIDs per second per node due to the 100 nanosecond timestamp resolution.
  4. MAC based UUIDs are actually quite predictable.

The MAC based algorithm is the algorithm used in ColdFusion.
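The three components described above can be packed into the 128 bits as in the following sketch. This is an illustration of the standard version 1 field layout, not ColdFusion's actual code; the class name `Version1Layout`, its method names, and the sample clock sequence and node values are all assumptions. Reading the fields back relies on the `timestamp()`, `clockSequence()` and `node()` methods of `java.util.UUID`, which only work on version 1 values.

```java
import java.time.Instant;
import java.util.UUID;

final class Version1Layout {
    /** 100 ns ticks between 15 October 1582 and 1 January 1970 (both UTC). */
    static final long GREGORIAN_TO_UNIX_OFFSET = 0x01B21DD213814000L;

    /** Converts an Instant to the count of 100 ns ticks since 15 October 1582. */
    static long toTimestamp(Instant instant) {
        long ticksSinceUnixEpoch =
                instant.getEpochSecond() * 10_000_000L + instant.getNano() / 100;
        return ticksSinceUnixEpoch + GREGORIAN_TO_UNIX_OFFSET;
    }

    /** Packs a 60-bit timestamp, 14-bit clock sequence and 48-bit node into a version 1 UUID. */
    static UUID pack(long timestamp, int clockSequence, long node) {
        long timeLow = timestamp & 0xFFFFFFFFL;
        long timeMid = (timestamp >>> 32) & 0xFFFFL;
        long timeHigh = (timestamp >>> 48) & 0x0FFFL;
        // 0x1000 sets the version nibble to 1.
        long msb = (timeLow << 32) | (timeMid << 16) | 0x1000L | timeHigh;
        // The top two bits "10" mark the standard variant.
        long lsb = 0x8000000000000000L
                | ((long) (clockSequence & 0x3FFF) << 48)
                | (node & 0xFFFFFFFFFFFFL);
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) {
        // Hypothetical clock sequence and node identifier.
        UUID id = pack(toTimestamp(Instant.now()), 0x1234, 0x001122334455L);
        System.out.println(id);
        System.out.println("ticks since 1582: " + id.timestamp());
        System.out.println("node: " + Long.toHexString(id.node()));
    }
}
```

Because the timestamp and node sit in fixed positions, anyone holding one of these UUIDs can read back when and where it was generated, which is why they are so predictable.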

Random UUIDs

Random UUIDs are generated mostly at random. The 4 bits of the version number and the 2 bits of the variant are fixed, but the other 122 bits are generated from a random source. This means:

  1. Version 4 UUIDs are unpredictable.
  2. Version 4 UUIDs are more likely to conflict than version 1 UUIDs. Still for all practical purposes they are unique.
  3. The quality and speed of the generation of version 4 UUIDs depends on your entropy source.

Amongst others, the randomUUID() method of java.util.UUID is an implementation of a version 4 UUID generator.
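As a rough illustration of "for all practical purposes unique", the sketch below checks the fixed bits of a value from `UUID.randomUUID()` and estimates the chance of a collision with the usual birthday approximation. The class name `Version4Check` and its methods are assumptions, and the estimate assumes a perfect entropy source for all 122 random bits.

```java
import java.util.UUID;

final class Version4Check {
    /** Bits left over after the 4 version bits and 2 variant bits. */
    static final int RANDOM_BITS = 128 - 4 - 2;

    /** True when the fixed bits mark the value as a standard version 4 UUID. */
    static boolean isRandomUuid(UUID id) {
        return id.version() == 4 && id.variant() == 2;
    }

    /**
     * Birthday approximation of the probability that at least two of n
     * random UUIDs collide: 1 - exp(-n^2 / 2^(RANDOM_BITS + 1)).
     */
    static double collisionProbability(double n) {
        return -Math.expm1(-(n * n) / Math.pow(2, RANDOM_BITS + 1));
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        System.out.println(id + " random: " + isRandomUuid(id));
        System.out.println("p(collision) for a billion UUIDs: " + collisionProbability(1e9));
    }
}
```

Under these assumptions you would need on the order of 2.7 * 10^18 random UUIDs before the chance of a single collision reaches one half.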

UUIDs in ColdFusion

UUIDs are generated in ColdFusion through the createUUID() function. This function generates UUIDs using the version 1 algorithm (MAC address based). The one thing that makes these UUIDs stand out is their non-standard string representation. Instead of being split into 5 groups with the pattern 8-4-4-4-12, they are split into 4 groups with the pattern 8-4-4-16. I have been told this was an unintentional deviation that was not discovered until after shipping, and that backward compatibility was then deemed more important than conforming to the string representation used by others.

The ColdFusion createUUID() function gets interesting with the rewrite to Java in ColdFusion MX. At that time Java had no API to find the MAC address of a NIC in the system, so on Windows a little bit of native code in NeoUUID.dll was used to find the MAC address and on other platforms a MAC address was faked. A native Java deployment on Windows (EAR/WAR file) would fall back to a faked MAC address as well. In addition the timestamp resolution of the Sun JVMs was rather limited (10 milliseconds on Windows, 1 millisecond on other platforms). Since you can generate only one UUID per clock tick, the theoretical limit for the number of UUIDs generated per second was 100 on Windows (64 on multi-core systems) and 1000 on other platforms.
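The limits follow directly from "one UUID per clock tick", as this small worked calculation shows. The class and method names are made up for the example, and the 15.625 millisecond tick is my assumption for the clock resolution that would explain the figure of 64 on multi-core Windows systems.

```java
final class TickLimits {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    /** Upper bound on UUIDs per second when each clock tick yields at most one UUID. */
    static long maxUuidsPerSecond(long tickNanos) {
        return NANOS_PER_SECOND / tickNanos;
    }

    public static void main(String[] args) {
        System.out.println(maxUuidsPerSecond(10_000_000L)); // 10 ms tick
        System.out.println(maxUuidsPerSecond(15_625_000L)); // 15.625 ms tick (assumed)
        System.out.println(maxUuidsPerSecond(1_000_000L));  // 1 ms tick
        System.out.println(maxUuidsPerSecond(100L));        // 100 ns, the version 1 timestamp unit
    }
}
```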

A particular problem in this version was a bug in the Sun JVM where using createUUID() would cause the system clock to move forward a little bit. Under heavy use the clock would move forward up to 12 seconds per minute. Then when the time was resynchronized with the NTP server and the server clock went back a minute or so, the generation of UUIDs was stalled until the system was back in the future. Very much the intended behavior of a UUID generation algorithm that values uniqueness over everything else, but still an unpleasant surprise.
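The stalling behavior can be sketched as a timestamp source that refuses to hand out a tick that is not later than the last one it issued. This is a hypothetical illustration of the principle only, not the ColdFusion implementation; `MonotonicTicks`, `tryNext` and the injectable clock are names I made up so the effect of a clock jumping backwards can be shown.

```java
import java.util.OptionalLong;
import java.util.function.LongSupplier;

final class MonotonicTicks {
    private final LongSupplier clock;
    private long lastIssued = Long.MIN_VALUE;

    MonotonicTicks(LongSupplier clock) {
        this.clock = clock;
    }

    /**
     * Returns the current tick if it is later than every tick issued so far,
     * or empty while the clock has not yet passed the last issued tick.
     */
    synchronized OptionalLong tryNext() {
        long now = clock.getAsLong();
        if (now <= lastIssued) {
            return OptionalLong.empty(); // caller has to wait for the clock to catch up
        }
        lastIssued = now;
        return OptionalLong.of(now);
    }

    public static void main(String[] args) {
        MonotonicTicks ticks = new MonotonicTicks(System::currentTimeMillis);
        System.out.println(ticks.tryNext().isPresent()); // first call gets a tick
    }
}
```

A generator built this way never repeats a timestamp, but if the clock is set back by a minute it produces nothing for a minute.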

With the arrival of ColdFusion 9 createUUID() got a speed boost. The implementation was rewritten from using a millisecond time API to use a new Java API that provides timestamps with a nanosecond resolution. That means the theoretical limit of 100 or 1000 UUIDs per second got increased to 10 million per second. The practical limit is still a bit lower because the clock tick is not really 1 nanosecond, but the speed improvement is still very significant. The speed of createUUID() now actually varies depending on the clock speed of the hardware you use to run the test.

GUIDs in ColdFusion

In addition to a UUID datatype ColdFusion also has a GUID datatype. This is another 128-bit integer that is unfortunately incompatible with ColdFusion UUIDs because it uses the 8-4-4-4-12 string representation. On the other hand it has the huge benefit that it is compatible with the way the rest of the world represents UUIDs, so we can natively exchange them with Java, databases etc. instead of having to serialize them to a string. I have written previously about the performance benefits you can reap if you use a native uniqueidentifier datatype in MS SQL Server instead of a string representation.

What ColdFusion does not have is a native function to generate GUIDs. Typically this is solved by generating GUIDs from UUIDs by just inserting another hyphen, or by falling back to the java.util.UUID class. Just remember that when you use the ColdFusion createUUID() function you get better uniqueness guarantees since it is a version 1 UUID, while when using java.util.UUID you get better performance since it is a version 4 UUID (if you have sufficient entropy).
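Inserting the extra hyphen is a one-liner in any language. Below is a minimal Java sketch of the conversion in both directions; the class `CfUuidFormat`, its method names and the sample value are assumptions for illustration. From ColdFusion you could call such a helper on the JVM, or do the same with string functions in CFML.

```java
import java.util.UUID;

final class CfUuidFormat {
    /** Index of the fourth hyphen in the 8-4-4-4-12 form. */
    private static final int FOURTH_HYPHEN = 23;

    /** Converts the 35-character 8-4-4-16 form to the standard 8-4-4-4-12 form. */
    static String toGuid(String cfUuid) {
        if (cfUuid.length() != 35) {
            throw new IllegalArgumentException("Expected 35 characters: " + cfUuid);
        }
        return cfUuid.substring(0, FOURTH_HYPHEN) + "-" + cfUuid.substring(FOURTH_HYPHEN);
    }

    /** Converts the standard 8-4-4-4-12 form to the 8-4-4-16 form. */
    static String toCfUuid(String guid) {
        if (guid.length() != 36 || guid.charAt(FOURTH_HYPHEN) != '-') {
            throw new IllegalArgumentException("Expected 8-4-4-4-12 format: " + guid);
        }
        return guid.substring(0, FOURTH_HYPHEN) + guid.substring(FOURTH_HYPHEN + 1);
    }

    public static void main(String[] args) {
        // Arbitrary sample value in the 8-4-4-16 layout.
        String guid = toGuid("3F2504E0-4F89-11D3-9A0C0305E82C3301");
        System.out.println(guid);
        // Once converted, the value parses as a regular java.util.UUID.
        System.out.println(UUID.fromString(guid).version());
    }
}
```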

Last week I had the second Flex 4 Crash Course session at the Adobe office in Amsterdam as an introduction to Flex for people with no previous Flex experience. (Although there were some familiar faces in the audience.) The training material was provided by Adobe and I am not allowed to publish all the originals, but I can share the slides with all the links to external resources.

flex4_crash_course_slides_extract

It has been quiet for a while, but with good reason. Over the last 2 months I have been travelling a lot. And by a lot I mean Scotland, India, England, USA and England again, all for business. And the way it goes is that by the time you get back to the hotel from your appointments you have loads of email from the office waiting for you. With all that travel I had a grand total of one day off in India and 2 in the USA, which I spent away from the computer.

But now that I am back I have started to tie up some of the loose ends for the ForumClient for the Adobe forums. First and foremost, I have made it portable so it can now also access the Jive forums. The main reason for that is that it offers me another server to run tests against so that I can more easily determine whether issues are between the keyboard and the chair or if they are real server issues. Unfortunately with the number of bugs in the server software and the lack of documentation this is a real necessity. Most interesting for users is probably that forums, threads and messages now have right-click menus to mark as (un)read and that I squashed most of the bugs in the counts of unread messages. And I have started some work on getting messages to display better by adding some CSS to the message display.

Last but not least, at MAX I sat down with some people from Adobe and we had a good discussion on some possible future directions. One of those is a Flex version for mobile users (try the current forums on a mobile to see how badly that is needed) which Adobe would need to support by publishing a cross domain policy file. Second, we had some discussion on the consequences of making this Open Source. I have decided that I will be publishing the source code for this client at some time in the future. No definite timeline, but it won’t be before a new version of ClearSpace has been deployed for the Adobe forums.

Download version 0.1.0 and give it a try.

I have uploaded version 0.0.5 of ForumClient. I really didn’t want to do a release just yet (it is halfway a SOAP to REST rewrite), but the previous version had expired and people couldn’t use it anymore. More soon.

forumclient001

It doesn’t look like much yet, but the alpha 2 is available for download. Server communication stuff should mostly work, except for the gazillion bugs and missing functions in the Jive webservices. UI is modelled after the Thunderbird NNTP client, so you download a list of forums, then you subscribe to certain forums and then messages for those forums will be downloaded into a local SQLite database so you can even read them offline.

Since I am in Bangalore for a training I dropped in on the “Adobe Flash Platform Tools Preview” this evening. The agenda promised short sessions on Flash Builder 4, Flash Catalyst and LiveCycle ES, followed by food and networking. The Flash Builder (previously known as Flex Builder) session was solid. It showed Data Centric Development, where services are defined in Flash Builder and can then easily be wired into the UI because all the code for the services and value objects is generated. (See Raghu’s blog for the demo screencast). Next up was Flash Catalyst, showing a design - development workflow where a .psd file was transformed to a .fxp, which was then imported in Flash Builder to wire the data in through the new services management. Last was LiveCycle ES. Unfortunately, but understandably considering the audience, this was all about data management and not process management. What was new for me was that apparently this now wires directly into Hibernate so you don’t need to write any server side code anymore, you can have everything generated.

The Q&A focused mainly on the designer - developer workflow with Flash Catalyst. The main question that was repeated several times in different words was whether this workflow put any additional constraints on the designer. And each time the answer was that good development on the design side, including the judicious use of layers, was all it took. I think this reflected the audience of architects and project managers; from developers I would expect more technically oriented questions.

Afterward the food and networking were great. Not just the Flash team from Adobe was there, but also people from the LiveCycle team and the ColdFusion team, so I got an opportunity to thank some people in person for fixes and new features I am not allowed to mention yet. And I also met up with some of the people we do business with in India.

As promised, here are my hacks to get past the quirks in the webservice API for the Adobe forums. I am deliberately not publishing the full application; the lack of local caching in it makes it more of a DoS tool than a forum client.

Authentication

The PermissionsService authenticate method doesn’t work since the Adobe forums do not use the standard Jive login methods, but a custom Adobe SSO login method. To get a login on the forums from AIR, replay what a browser does when logging in to the forums. First visit the index page of the forums to GET a few cookies, then POST the credentials to the Adobe authentication server, then GET the index page of the forums again to allow the forums to do a callback to the Adobe authentication server to collect the user profile. So all in all it takes 3 HTTP requests to log on.
I have included the code snippet that my AIR app uses to log in to the forums below for those who wish to experiment with it. Be warned that the full sequence takes on average 15 seconds.

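// Imports needed by the snippets in this post
import mx.rpc.events.FaultEvent;
import mx.rpc.events.ResultEvent;
import mx.rpc.http.HTTPService;

// General variables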
private const _AdobeAuthURL:String = "http://www.adobe.com/cfusion/entitlement/index.cfm?loc=en&e=ca&returnurl=http%3A%2F%2Fforums%2Eadobe%2Ecom%2Flogin%2Ejspa";
private const _ForumRootURL:String = "http://forums.adobe.com/index.jspa";
 
private function startLoginSequence():void {
	writeLog("Starting login ");
	getForumRoot(preAuthResult);
}
private function getForumRoot(resultFn:Function):void {
	// just GET the forum root to collect cookies
	var forumRootService:HTTPService = new HTTPService();
	forumRootService.method = "GET";
	forumRootService.url = _ForumRootURL;
	forumRootService.useProxy = false;
	forumRootService.resultFormat = "text";
	forumRootService.addEventListener(FaultEvent.FAULT, faultHandler);
	forumRootService.addEventListener(ResultEvent.RESULT, resultFn);
	forumRootService.send();
}
private function preAuthResult(event:Event):void {
	// We have now collected the forum cookies, log in to the Adobe ID
	login();
}
private function login():void {
	// login does a login request to the main Adobe site
 
	// credentials Object with all name value pairs
	var credentials:Object = new Object();
	credentials['returnURL'] = 'http://forums.adobe.com%2Flogin.jspa';
	credentials['up_login'] = 'yes';
	credentials['ignore_email_validation'] = 'yes';
	credentials['up_username'] = "spam@vandieten.net";
	credentials['has_pwd'] = "true";
	credentials['up_password'] = "sihtyrt";
 
	// login Service
	var loginService:HTTPService = new HTTPService();
	loginService.method = "POST";
	loginService.url = _AdobeAuthURL;
	loginService.useProxy = false;
	loginService.resultFormat = "text";
	loginService.addEventListener(FaultEvent.FAULT, faultHandler);
	loginService.addEventListener(ResultEvent.RESULT, loginResult);
	loginService.send(credentials);
}
private function loginResult(event:ResultEvent):void {
	// check if we are really logged in: the returned page then contains a "Sign out" link
	var loggedInTestString:String = '<a href="http://www.adobe.com/cfusion/membership/logout.cfm">Sign out</a>';
	var page:String = String(event.result);
	if (page.indexOf(loggedInTestString) != -1) {
		// Login success
		// Do a new HTTP request to the forums to propagate the login from Adobe to Jive
		getForumRoot(postAuthResult);
	} else {
		// Login failure
		throw new Error("Username password combination incorrect.");
	}
}
private function postAuthResult(event:ResultEvent):void {
	// Fully logged in, ready to use the API
	writeLog("Login complete");
 
	// extractUserDetails(event.result);
}

Once we are logged in we can use all of the APIs to get the actual useful information from the forums. The FlexBuilder WSDL import tool works quite well with the APIs; all cases where it failed turned out to be mistakes in my programming. To get started, call getRecursiveCommunities of the CommunityService to get a list of all the communities (forums) available. Depending on how busy the forums are, internet bandwidth, traffic and the position of the moon this can take between 25 seconds and 5 minutes. From the list of forums you get you can drill down into the list of threads (ForumService getThreadsByCommunityID: up to 50 seconds to get the thread list of the Dreamweaver forum, the busiest forum with about 50K threads) and then into the list of messages per thread (ForumService getMessagesByThreadID: usually less than a second). When you get the messages you will also get the users and all things you need to display a tree of who responded to whom.

Getting your own userID

Apart from the login methods in the API there appears to be another problem in the webservice API (or maybe just something I haven’t figured out). Searching for a user, either by username or by email address, has never returned any results for me. (I do in fact get a 500 Internal Server Error when I try to use the UserService getUserByUsername.) But the user details are needed for all API methods to post messages. The workaround I eventually implemented is to just do a string lookup in the HTML of the forum start page (the commented out call to extractUserDetails in the previous code snippet):

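public function extractUserDetails(pageObj:Object):void {
	var page:String = String(pageObj);
	// The numeric user ID is the argument of this JavaScript call in the page
	var userIDPreString:String = 'quickuserprofile.getUserProfileTooltip(';
	var userIDPostString:String = ')';
	var preIndex:int = page.indexOf(userIDPreString);
	if (preIndex == -1) {
		// Marker not found: not logged in, or the page layout changed
		return;
	}
	var userIDStartIndex:int = preIndex + userIDPreString.length;
	var userIDEndIndex:int = page.indexOf(userIDPostString, userIDStartIndex);
	// int() parses the digits; the "as" operator does not convert a String to an int
	_userID = int(page.substring(userIDStartIndex, userIDEndIndex));
}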

Obviously depending on visiting some sites in a certain order to pick up cookies along the way and scraping generated HTML is rather fragile. The smallest change in the HTML or the authentication will break any application that uses this. Ideally Adobe would implement some authentication webservices on its own site to facilitate logging in from an application.

With this the basic services to list forums, retrieve threads, read messages and post your own messages can be accessed without further surprises. I intend to continue working on my ForumClient, but it will be a while. I will need a proper design for the application: I am thinking about modules to support different forum software APIs, storing configuration data in XML, message caching in a SQLite database per configured site etc. Then I need to develop the whole thing. And in the meantime real life is catching up and I am going on a training tour of Europe, so I will have little time for the next three weeks. Drop me a line if you are interested in helping out, or don’t wait for me and just get started on your own client :)

I managed to get a proof of concept AIR client for the Adobe forums running. It can log in, browse communities, threads, and messages and can even reply. But since it doesn’t do much with local storage yet it doesn’t remember which messages are already read. I’ll post some more on the hacks to log in to the forums soon, but for now I have screenshots.

[Four ForumClient screenshots]

Just a quick update on the two avenues I am pursuing to get the new Adobe forums to open up and share their content.

Adobe is going to follow up on the email headers with Jive to see what they can implement. Hopefully this will result in getting the References and / or In-Reply-To headers added to the outgoing email notifications soon so email clients can thread them properly. It should be almost trivial to implement that, because judging by the definition of the ForumMessage object each forum message knows its own ID to put in the Message-ID header and also knows its ParentMessageID to put in the References header.
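A sketch of how little it would take: given a message ID and a parent message ID, the three threading headers can be derived as below. Everything here is hypothetical, including the class name, the `<id@domain>` format of the Message-ID and the example domain; I do not know how Jive actually builds its outgoing email.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ThreadingHeaders {
    /** Formats a numeric forum message ID as an email Message-ID (format assumed). */
    static String messageId(long id, String domain) {
        return "<" + id + "@" + domain + ">";
    }

    /**
     * Builds the threading headers for a notification email.
     * parentId is null for the first message of a thread.
     */
    static Map<String, String> forMessage(long id, Long parentId, String domain) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Message-ID", messageId(id, domain));
        if (parentId != null) {
            String parent = messageId(parentId, domain);
            headers.put("In-Reply-To", parent);
            headers.put("References", parent);
        }
        return headers;
    }

    public static void main(String[] args) {
        // Hypothetical IDs and domain.
        forMessage(42L, 41L, "forums.example.invalid")
                .forEach((name, value) -> System.out.println(name + ": " + value));
    }
}
```

With these headers in place a mail client can rebuild the reply tree without looking at subjects or dates.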

The other avenue is getting access through the webservice APIs. I have been looking at the login sequence of the Adobe site and it looks like the master cookie to determine whether you are logged in is the AUID cookie for the .adobe.com domain. Subsites may add more cookies and eventually maintain state through their own cookies, but if you have the AUID cookie you can get in. So I have reduced the dozens of requests involved in a browser logging in to the forums (many of them image requests) to two requests a client must make. I have included the minimal wget test script to simulate the login here.

REM parameters
REM Note: when saved as a .bat file, double every percent sign in the
REM URL-encoded values (%%3A instead of %3A), otherwise cmd.exe expands
REM sequences like %3 and %2 as script arguments.
SET USERNAME=spam@vandieten.net
SET PASSWORD=sihtyrt
SET RETURNURL=http%3A%2F%2Fforums.adobe.com%2Flogin.jspa
SET LOGINURL="https://www.adobe.com/cfusion/entitlement/index.cfm?loc=en&e=ca&returnurl=http%3A%2F%2Fforums%2Eadobe%2Ecom%2Flogin%2Ejspa"
SET FORUMURL=http://forums.adobe.com/index.jspa

REM remove old cookie and index
del adobe.txt
del forumindex.html

REM login
wget -S --no-check-certificate --save-cookies=adobe.txt --post-data "returnURL=%RETURNURL%&up_login=yes&ignore_email_validation=yes&up_username=%USERNAME%&has_pwd=true&up_password=%PASSWORD%" %LOGINURL%
REM get forum index
wget -S --load-cookies=adobe.txt -O forumindex.html "%FORUMURL%"

I need to tweak this a bit to do proper login validation, but it looks like just ignoring all the authentication mechanisms in the API and running these requests instead and then maintaining the collected cookies with all webservice calls will do the trick. Even though the entire login sequence will end up taking close to 15 seconds due to the response times of the Adobe sites, that is progress!

Two weeks ago Adobe informed us we could preview and test the new version of the online community forums. At first glance these forums offer an exciting new look with many new features. A single login system where we can use our Adobe ID to log on to the forums, RSS support, an improved search and room for attachments are great new features.

But when I started digging at what features the forums offered the excitement pretty soon turned into a feeling of unease. The old forums were an open system, accessible through NNTP, which provided two way interaction that allowed people to run their own interface and program against the forums through the NNTP ‘API’. Yet the new forums appeared to be a closed system following the old ‘thou shalt watch me the way I want’ ways. So the questions that pretty soon bubbled up were not of excitement. Is this all? How am I going to work around this layout? How am I going to work with this workflow? And then the feeling of unease turned into a feeling of disappointment.
The new forums are state of the art. The web interface is so much better than the previous one. And yet they are a disappointment. Because they may be state of the art, but they do nothing new. Nothing we haven’t seen before. Nothing to inspire us. Adobe is merely catching up, not leading the way. And in the process Adobe is throwing away the one open interface I had to program against the forums.

You see, I am an avid NNTP user. When I start my client, it automatically goes out on the web to collect all new messages I follow. That is currently 27 newsgroups (on the Adobe, Macromedia, prerelease and other servers) with on average 40 messages per day each. And when these messages are collected, they are sorted, filtered, grouped and prioritized for me. Messages get filtered out if I don’t like the sender or subject, people that post a direct response to me get priority, threads I have hidden before won’t resurface, etc. All of that takes maybe 20 seconds, and then a thousand messages have been reduced to two dozen priority messages from people who are waiting for me and have responded to me, a bunch of messages I may find interesting, and the rest that I don’t find interesting is gone. Now how is that going to work with these new forums? What is the closed system of these new forums offering that allows me to automate my workflow to the same level as NNTP offers? And I know that many other NNTP users struggle with the same questions.

Now I have done some research into possible solutions. And surprisingly enough the answer is really, really simple. It took me two hours to build a system that converted the email feed from the new forums to NNTP. All it takes is a subscription on all forums and a small script to convert email headers to NNTP headers. That means you have to replace the header “To: John Doe <JohnDoe@example.invalid>” with the header “Newsgroups: general.english_discussions”. And then I write that converted email to the spool folder of the Apache James NNTP server and I have an operational email to NNTP conversion. Sure, it doesn’t have avatars, author profiles, ‘mark as answer’ functionality, but as an NNTP user I really couldn’t care less about that. And the beauty of it is that this is two-way access because all I need to do is click “Reply to user” instead of “Reply to newsgroup” and the email will go to the Reply by email functionality for the forums.
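The header conversion described above can be sketched in a few lines. This is not my original script, just a minimal Java illustration of the idea; the class name `MailToNews`, the line-based message representation and the sample addresses are assumptions. Writing the result to the spool folder of the NNTP server is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MailToNews {
    /**
     * Replaces the "To:" header of an email (given as a list of lines) with a
     * "Newsgroups:" header. Lines after the first blank line are body text
     * and are copied unchanged.
     */
    static List<String> convert(List<String> messageLines, String newsgroup) {
        List<String> out = new ArrayList<>();
        boolean inHeaders = true;
        boolean droppingTo = false;
        for (String line : messageLines) {
            if (inHeaders && line.isEmpty()) {
                inHeaders = false; // a blank line ends the header section
            } else if (inHeaders) {
                boolean continuation = line.startsWith(" ") || line.startsWith("\t");
                if (droppingTo && continuation) {
                    continue; // folded continuation of the removed To: header
                }
                droppingTo = line.regionMatches(true, 0, "To:", 0, 3);
                if (droppingTo) {
                    out.add("Newsgroups: " + newsgroup);
                    continue;
                }
            }
            out.add(line);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> mail = Arrays.asList(
                "From: forums@example.invalid",
                "To: John Doe <JohnDoe@example.invalid>",
                "Subject: Example",
                "",
                "Message body");
        convert(mail, "general.english_discussions").forEach(System.out::println);
    }
}
```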

Now to get the last bit of functionality implemented, I just need Adobe to fix their last bug. Email has all these hidden header fields that tell email clients how different messages relate to each other. And Adobe is filling them out incorrectly, so I cannot sort messages correctly. So if Adobe could please fix the References and / or In-Reply-To headers in the outgoing email that would make me really happy. And if I am really happy, I might just open up my NNTP server for others.

And it gets better. Because the new forums could easily become inspiring. Because as it turns out, the new Adobe forums have a webservice API. And that API is enabled. And we can program against that API using Flex or more likely AIR (because of the local storage and to bypass any cross domain problems) and build our own interfaces for the forums and customize them. Write plugins for them, for instance to easily link to documentation or copy examples. Or people could write their own automation rules again, so they can sort, filter and prioritize messages the same way they can now through the NNTP ‘API’. I have a bunch of ideas for things I could do (which I probably won’t if Adobe fixes the email bugs), and I am sure there are many more people with even better ideas.
But for that to happen Adobe first needs to do something. The forums may have a webservice API right now, but it doesn’t have any documentation. Sure, there is some documentation from the vendor of the forum, but the implementation of the Adobe ID based login system of the new forums is different. And without a login no API access.

I really think Adobe has an opportunity to make these forums great. The number of forums that truly integrate web, email and NNTP is very limited. Adobe could be the first one to not only do that, but also allow unlimited AIR extensions through its webservices and empower the community to build its own tools. Or is Adobe, the company that is preaching the gospel of the Rich Internet Applications, really forcing all users to go back to a web interface?