Archive for the ‘ColdFusion’ Category

This was written in response to a forum question, but I figured it might be useful for more people.

There are many ways to deploy ColdFusion code to a server. Probably the most prevalent, especially considering shared hosting, is using FTP to upload CFML templates to the server.  Tools such as DreamWeaver and CFBuilder allow you to do so right from your IDE. Another way to do it is to run some Ant script or batch file and extract the sources straight from source control to the server. With a little bit of effort you get much more control and much more reproducible results.  At Prisma IT we prefer to go a step further and deliver the ColdFusion applications we build as Enterprise ARchives (EARs) to our clients.  This allows us even more control, especially when we don’t have any.

Let me explain that a bit. We have several clients where we do all their development, but final deployment is done on the clients infrastructure. If we are lucky, we may have read-only access to the User Acceptance Testing servers, but sometimes we don’t even have that. In those cases deploying an application is completely up to the client (or their hosting partner). That leaves us no wiggle room to deal with stuff that could go wrong during a deployment. With EAR files we eliminate a huge number of risks from the process. An EAR file is a full application, so there is no risk that some files get forgotten. And we have the MD5 to prove it. Since it gets deployed to its own temporary folder, there is no chance of any old files remaining on the server ans slipping in to the server (the cfclasses folder is famous for that).

The one thing you need to solve for this is how to configure your application. If the client had to log in to the ColdFusion Administrator after deploying the application to configure datasources, mappings etc., it would be just as easy to do something wrong. So what we do instead is to have the client place a properties file on the class path with a bunch of configuration settings. Standard ones, such as the IP address of the outgoing mail server and the folder for logfiles, and application specific ones such as the location on the SAN where all the documents are stored. Then in the onApplicationStart() the application parses that and configures itself. Each of these settings is checked when it is loaded into the application, so if there is a path configured, a directoryExists() wil make sure it actually exists.

The added benefit is that it becomes very easy to move an application around. Once you have written your properties files for test, QA and production, they stay the same. You just move an EAR file with a release from one environment to the next and it configures itself as soon as it starts. The EAR files themselves get generated by Ant on our build server to make sure they are completely reproducible. And to protect our intellectual property and deter others from mucking around in them, they only contain compiled source code. And since an EAR is a standard format, it works on different JEE servers too. (Mostly JRun and occasionally JBoss for us.)

As any solution, this process has downsides. Working with compiled EAR files is obviously not a good idea if you push small changes to a live server three times a day. It is a very ‘heavy’ process, because in each EAR you are packaging ColdFusion as well (100+ MB). And building EAR files without a ColdFusion Administrator does not just mean the client can not mess the configuration up anymore, it also means you can not fix the configuration anymore either. But all in all, it is serving us well.

Since I am in Bangalore for a training I dropped in on the “Adobe Flash Platform Tools Preview” this evening. The agenda promised short sessions on Flash Builder 4, Flash Catalyst and LiveCycle ES, followed by food and networking. The Flash Builder (previously known as Flex Builder) session was solid. It showed Data Centric Development, where services are defined in Flash Builder and can then easily be wired into the UI because all the code for the services and value objects is generated. (See Raghu’s blog for the demo screencast). Next up was Flash Catalyst, showing a design - development workflow where a .psd file was transformed to a .fxp, which was then imported in Flash Builder to wire the data in through the new services management. Last was LiveCycle ES. Unfortunately, but understandably considering the audience, this was all about data management and not process management. What was new for me was that apparently this now wires directly into Hibernate so you don’t need to write any server side code anymore, you can have everything generated.

The Q&A focused mainly on the designer - developer workflow with Flash catalyst. The main question that was repeated several times in different words was whether this workflow put any additional constraints on the designer. And each time the answer was that good development on the design side, including the judicious use of layers, was all it took. I think this reflected the audience of architects and project managers, from developers I would expect more technical oriented questions.

Afterward the food and networking were great. Not just the Flash team from Adobe was there, but also people from the LiveCycle team and the ColdFusion team., so I got an opportunity to thank some people in person for fixes and new features I am not allowed to mention yet. And I also met up with some of the people we do business with in India.

Today was the CF Insider Workshop in Brussels. Line-up was the same as last week in Amsterdam, Claude Englebert from Adobe and Simon Slooten and me from Prisma IT. Mark van Hedel was supposed to join us for the ColdFusion and AJAX session, but he broke his leg last week so he couldn’t be there and Simon took over his session. The event in Amsterdam was pretty well attended with 20+ attendees, in Brussels we had a few less (but they have another event in French tomorrow). But the good news is that after years of a slumber Adobe is finally talking about ColdFusion in Europe again. Some of the attendents were even Adobe employees from other departments who wanted to know what that ColdFusion thing was about.

I have uploaded my slides from the CF Insider workshop to the Prisma IT website:

What’s new in ColdFusion 8

The ColdFusion Server Monitor

Adobe and Prisma IT are organizing the Dutch and Flemish versions of the ColdFusion Insider Workshop that is touring Europe. First we will have the Dutch one on January 22th in the Adobe office in Amsterdam and then on Januari 27th the Flemish on in Brussels. There will be speakers from Adobe and Prisma IT. Attendance is free if you get an entrance voucher by filling out the registration form.

I will be doing a presentation on the ColdFusion Server Monitor and a more general one on how ColdFusion makes life easier with all the cool new features in ColdFusion 8. If you have a good suggestion I may be persuaded to add your topic to my “OMG, nobody has questions, how do I fill the awkward silence” emergency slides.

At Prisma IT we have an application that has been under constant development for 2 years. Over that time it has seen lots of feature creep and changing requirements, and no refactoring. As you can imagine that doesn’t make for pretty code or an optimal database model. Due to a retargeting of that application and a much more focused requirements process we are finally getting ready to start reworking that application. Over the past month I have been working on and off (mostly in the train on my way to clients) on putting more structure in the long list with things we (I) really would like to change about the technology of the application. I have also been running some experiments on the code to see what the impact of certain changes would be. Amongst others this includes the experiments with changing the datatype used in MS SQL Server to store ColdFusion UUIDs.

One of the last things I wanted was to get a feel for the cost/benefit ratio of a conversion from MS SQL Server to PostgreSQL. Continue reading ‘Converting a CF application from MS SQL Server to PostgreSQL?’ »

Lately I have posted a number of posts on the state of shared hosting security. Unfortunately we have to conclude that only by taking drastic measures a hoster can begin to protect his customers from eachother. And even when the hoster locks dow everything to the best of his abilities, which disables lots of useful functionality, there are some significant holes left. Therefore I have compiled this wishlist of things I would like to see improved in Centaur to make shared hosting security a bit better and easier to deal with.

Documentation

Setting up shared hosting is largely a process of trial-and-error. There is absolutely no documentation available that tells us which tags will blow up when we Sandbox the temp folder. That cfdocument will error out if the fonts folder is not allowed in the Sanbox. There is nothing to warn us that cfdump will break if we disable createObject(JAVA). What we need is better documentation and preferably a whitepaper that shows us how to configure a secure shared hosting environment step by step.

A locked down application scope

Frameworks are storing more and more data in the application scope. The application scope is currently completely unprotected: everybody can see everything. We need a way to lock down the application scope so we can only access it from just one Sandbox.

Fine grained Java permissions

The biggest useful feature we need to disable in order to protect servers is access to Java objects. We need much more control over which actions a Java class can perform before we can open up access to Java. For starters, we need knobs to disable many actions from the Runtime class. We need to be able to disable the loading of arbitrary classes by users (what is the point in disabling createObject(COM) if users can load their own version of JIntegra?), we need to stop users from killing the server with exitVM, halt etc., we need to stop them from spawning other processes etc. Then we need to have all access to SecurityPermission disabled so that users can’t simply reset the security and we need to have ReflectPermission disabled to keep our private stuff private.

Java provides all the API we need for that, we just don’t have any knobs in ColdFusion to switch them on for our Sandboxes

Sane default settings for Sandboxes

If you create a new Sandbox from scratch it has access to many things it shouldn’t. A secure by default configuration would for instance disable all datasources and leave it up to the administrator to enable them on an as needed basis. Same goes for tags like cfexecute and cfobject.

On the other hand the default Sandboxes do not have access to to for instance the /WEB-INF/cftags/ folder. Without that access cfinterface may not work. What is the harm in giving Sandboxes Read and Execute access to that folder by default? Isn’t the code in those folders written by and trusted by Adobe?

There are many more issues that would help ColdFusion becomming a better hosted platform, and I will address a few more in the future, but these are the 4 most important issues in the area of shared hosting security that Adobe needs to address. All of these have been submitted to Adobe, but I am sure a few more votes and ideas won’t hurt.

So far in our attempts to get access to the templates and data of other hosted customers we have primarily focused on accessing the templates of other users in order to get access to their data through them. A more direct approach is to try to access the database directly.

Registered datasources

To access a registered datasource all you need is its name. More often then not, names are easy to guess. Depending on how the datasource is registered in the CF Administrator you may need a username and password as well. If the username and password to the datasource are registered in the ColdFusion Administrator ColdFusion will write them to neo-datasource.xml. They are encrypted there, but with a reversible algorithm (ColdFusion needs to be able to decrypt them in order to authenticate). The way to decrypt these passwords is well known. So if you have read access to neo-datasource.xml through an incorrect Sandbox configuration you have all usernames and passwords registered there.
If the username and password of the datasource are not registered in the ColdFusion Administrator chances are they are stored in the application scope somewhere and we have already seen that the application scope is insecure.

The first line of defense against this is to configure Sandbox Security to only allow access to specific datasources to each Sandbox. This will effectively lock down access from ColdFusion to the datasource, even if you know the full connection string.

Database to database connections

But as good as the protection for datasources is in ColdFusion, it only protects against connections from ColdFusion’s cfquery and cfstoredproc. Just as the trick to work around a disabled cfexecute is to create Java / COM / .NET objects, the trick to work around a disabled ColdFusion datasource connection is to start the connection from somewhere else. And the way to do that is to connect from one database to another. The protocol to do so, SQL/MED, is not widely implemented in databases, but most databases offer proprietary ways to do so. For instance, in MS SQL Server you would use:

SELECT *
FROM OpenDataSource(
  ’SQLOLEDB‘
  ,Data Source=server.asdf.com;User ID=yourusername;Password=yourpassword‘
  ).somedatabase.dbo.sometable;

In PostgreSQL that would be:

SELECT *
FROM dblink(
  'dbname=postgres'
  , 'SELECT ID, value FROM tables'
  ) AS t1(
  ID INT
  , value TEXT
  );

The one thing that makes this more difficult is that you usually need some sort of superuser privilege on the database server to be allowed to do this.

MS Access

MS Access is a file based database that is pretty popular in shared hosting. While people frequently question performance, scalability and security, it is still an attractive offering because it allows users to download the database, edit it offline and then upload an entirely new database with new data.

MS Access offers a very simple feature to access tables in other MS Access databases without the superuser privileges you would need on a more full-featured database server. You just specify the path to the data file you want to access directly in the query:

SELECT *
FROM table IN 'h:\sites\database.mdb'

I was reading a little discussion on whether the maxrows attribute of cfquery was pushed down all the way into the database like a TOP or LIMIT statement would, or whether the full query statement was executed and only a limited number of rows was retrieved. And while there was a lot of debate, there was no proof, not even a reference to some standard that said how it should work (which is something entirely different from how it does work).

Since I am personally convinced that the full query statement is executed and only a limited number of rows is retrieved I set out to prove so. In order to prove so, we need to have a query statement that returns rows to ColdFusion, but has a side-effect as well that is registered in the database. The the results of that side effect should be visible in ColdFusion. My first idea was to do a select with a RAISE NOTICE call per record, and then see if the number of print notices was equal to the maxrows in the cfquery. Only the output of a notice isn’t visible in ColdFusion.

The new idea was to write a query statement that not just inserts a record, but returns a record at the same time. That is easily done with the RETURNING clause in PostgreSQL.

INSERT INTO x (y)
VALUES (z)
RETURNING *

With some searching I figured out that MS SQL Server offers similar functionality

INSERT INTO x (y)
OUTPUT INSERTED.*
VALUES (z)

So here is the script to set up the database tables and values:

create table src (ID INTEGER);
create table tgt (ID INTEGER);
 
insert into src (ID) values(1);
insert into src (ID) values(2);
insert into src (ID) values(3);
insert into src (ID) values(4);
insert into src (ID) values(5);
insert into src (ID) values(6);
insert into src (ID) values(7);
insert into src (ID) values(8);
insert into src (ID) values(9);
insert into src (ID) values(10);

And the test script to run from ColdFusion:

<!--- baseline --->
<cfquery datasource="pgtest">
DELETE FROM tgt;
</cfquery>
<cfquery datasource="pgtest" result="debug">
INSERT INTO tgt (ID) SELECT ID FROM src RETURNING *;
</cfquery>
<cfquery datasource="pgtest" name="result">
SELECT COUNT(*) AS result FROM tgt;
</cfquery>
<cfdump var="#debug#">
<cfdump var="#result.result#">
 
<!--- LIMIT/TOP --->
<cfquery datasource="pgtest">
DELETE FROM tgt;
</cfquery>
<cfquery datasource="pgtest" result="debug">
INSERT INTO tgt (ID) SELECT ID FROM src LIMIT 5 RETURNING *;
</cfquery>
<cfquery datasource="pgtest" name="result">
SELECT COUNT(*) AS result FROM tgt;
</cfquery>
<cfdump var="#debug#">
<cfdump var="#result.result#">
 
<!--- maxrows --->
<cfquery datasource="pgtest">
DELETE FROM tgt;
</cfquery>
<cfquery datasource="pgtest" result="debug" maxrows="5">
INSERT INTO tgt (ID) SELECT ID FROM src RETURNING *;
</cfquery>
<cfquery datasource="pgtest" name="result">
SELECT COUNT(*) AS result FROM tgt;
</cfquery>
<cfdump var="#debug#">
<cfdump var="#result.result#">

Results

The results are very clear:

Query PostgreSQL MS SQL Server
cfquery recordcount database count cfquery recordcount database count
Baseline 10 10 10 10
LIMIT / TOP 5 5 5 5
maxrows 5 10 5 10

So using a maxrows attribute in your query does not prevent you database from executing the query statement completely. The limit in the number of records is only enforced in the JDBC driver (or in ColdFusion) and the database runs the full query.

I would be interested in seeing results from other databases, although I don’t expect them to be different.

In the previous part I have shown that the datatype you use in MS SQL Server has major consequences for the database size. Using an NVARCHAR(35) instead of a UNIQUEIDENTIFIER can triple the size of your tables, and a VARCHAR(35) is somewhere in between. Conventional wisdom relates the performance inversely to the size of a database, with a factor between O(ln(n)) for indexed access and O(n) for table access. To test that, I created an artificial benchmark for my databases with several queries that mainly exercise range access to the tables that saw the highest size ratio from the datatype conversion and access more data then fits in RAM. (Due to the string format difference the values used in the queries differed slightly.) Then I ran those queries from the MS SQL Management Studio and checked the wall clock time.

Datatype Storage requirements (B) Database size (GB) Run time (s)
UNIQUEIDENTIFIER 16 29.6 22
VARCHAR(35) 35 32.5 35
NVARCHAR(35) 70 38.6 47

The benchmark proved conventional wisdom right (unsurprisingly I might add, because it was designed to prove conventional wisdom right).

Querying from ColdFusion

If we add ColdFusion to the mix, the equation changes. Crazy as it may sound, the typeless ColdFusion language has 2 incompatible 128-bit integer datatypes. First there is the UUID, a 128-bit integer with a 35-byte string representation. It has a terribly slow generator function CreateUUID(), but it is incompatible with the rest of the world. Then there is the GUID, a 128-bit integer with a 36-byte string representation. There is no function in ColdFusion to generate GUIDs, but you can either insert an extra - in a UUID or generate them from java.util.UUID. This is the string representation that the rest of the world, including MS SQL Server and Java, uses. Since neither the database nor ColdFusion can cast automatically, we need to make sure we use the right string representation on both sides.

The next question is whether or not to use cfqueryparam. In my opinion not using cfqueryparam should only be done in very specific circumstances where you have a reason not to use it (and comments in the code should explain that reason for the next programmer).

Since we use cfqueryparam we need to decide is whether to use Unicode for our string parameters or not. This is an option in the ColdFusion administrator that enables or disables this for the entire datasource. Most applications I work with need some form of i18n support so we need unicode and have to enable this.

Lastly we need to pick a database driver. ColdFusion ships with a MS SQL Server driver from DataDirect, but there are other alternatives. I have tested the drivers from Microsoft as well during this round.

Test 1

The benchmark I am using for this first round is extremely simple, just a primary key lookup of a row of the table that had the largest relative size reduction between NVARCHAR and UNIQUEIDENTIFIER. This is run for each database datatype, for each driver, once with cfqueryparam and once without cfqueryparam.

Query 1
	SELECT
		*
	FROM
		table
	WHERE
		primary_key= '#searchValue#'
Query 2
	SELECT
		*
	FROM
		table
	WHERE
		primary_key = <cfqueryparam cfsqltype="cf_sql_varchar" value="#searchValue#">

The query without cfqueryparam is the query that is most similar to a query as we would run it from the MS SQL Management Studio. It is only included to provide a reference since I couldn’t believe the results at first, I would not want this in any production code. The table below shows the results, averaged over all runs and all drivers.

Datatype Query 1 (ms) Query 2 (ms)
UNIQUEIDENTIFIER 5 6
VARCHAR(35) 5 39700
NVARCHAR(35) 6 6

As you can see the differences are minimal, except for the case where we have a query sending a cfqueryparam with Unicode hint. This is the same behavior I have shown before: a character set mismatch forces MS SQL Server to change the execution plan to do a table scan instead of an index lookup. The effect is just much bigger here because this is a real sized dataset and not a small testcase.

Test 2

After seeing this I dropped the VARCHAR database from the tests. That allowed me to increase the number of iterations for each test and add a few more queries and still finish this year. So for the following test I had the following queries:

Query 1
	SELECT
		*
	FROM
		questionScore
	WHERE
		questionScoreID = '#searchValue#'
Query 2
	SELECT
		*
	FROM
		questionScore
	WHERE
		questionScoreID = N'#searchValue#'
Query 3
	SELECT
		*
	FROM
		questionScore
	WHERE
		questionScoreID = <cfqueryparam cfsqltype="cf_sql_varchar" value="#searchValue#">
Query 4
	SELECT
		*
	FROM
		questionScore
	WHERE
		questionScoreID = <cfqueryparam cfsqltype="cf_sql_uniqueidentifier" value="#searchValue#">

The way I ended up with Query 4 is that I had no idea what the proper cfsqltype for a uniqueidentifier was, so I just tried something and it didn’t generate an error. (It doesn’t really work either because it doesn’t do any validation on the client side either.) The results over 100 queries provided an interesting view.

With unicode hint for datasource Without unicode hint for datsource
Datatype Driver Query 1 (ms) Query 2 (ms) Query 3 (ms) Query 4 (ms) Query 1 (ms) Query 2 (ms) Query 3 (ms) Query 4 (ms)
UNIQUEIDENTIFIER A 266 485 969 1360 265 500 637 1406
B 250 500 953 1391 250 547 984 1437
NVARCHAR(35) A 297 562 1141 1578 313 579 1110 1594
B 297 579 1079 1625 296 579 1093 1640

As you can see there is a consistent pattern where queries without cfqueryparam are faster then queries with cfqueryparam, and also that cf_sql_varchar is faster then cf_sql_uniqueidentifier. There is a small difference between the speed of a UNIQUEIDENTIFIER and a NVARCHAR and queries with a Unicode hint appear to be marginally faster. The differences between the drivers are well within the margin of error for this test.

But the most important thing is that no combination shows the awful behavior the combination of a VARCHAR field and Unicode hints show.

Preliminary conclusions

So with this batch of tests I can now indicate an order of preference for the datatype to store 128-bit integers in:

  1. If you are building a new application, use the GUID datatype on the ColdFusion side and the UNIQUEIDENTIFIER datatype on the MS SQL Server  side. This minimizes the size of your database, maximizes the speed and gives you a string representation that is compatible with the rest of the world.
  2. If you have an existing application where you use UUIDs and store them in a VARCHAR field, it may be worthwhile to convert that VARCHAR field. Conversion to an NVARCHAR should be quick and easy, conversion to a UNIQUEIDENTIFIER gives the best performance but will require a lot more work.

A still need to figure out where the speed differences in the last table originate: the MS SQL Server or ColdFusion. I have run some initial tests that suggest that the performance difference is actually in the driver and not the database, but since the differences are a lot smaller I need more tests and SQL profiles to figure that out. That may take a while, but when I do I’ll post back.

So far we have seen that to secure our shared hosting customers from each other we need to place them in a Sandbox limiting file system access, datasource access and disabling access to cfexecute, cfobject and CreateObject(JAVA/COM/.NET) . Typically I add cfregistry to the list of disallowed tags as well. While not strictly necessary for security (you are running ColdFusion with a non-privileged account, aren’t you?), the registry is a shared resource that is easily fragmented and has a limited size, so I just want to keep customers out.

cfdump

The problem we have now is that apart from the functionality we intended to disable we have inadvertently disabled some more functionality, including the cfdump tag. In the directory /WEB-INF/cftags/ are a number of .cfm templates that implement tags and one of them is dump.cfm. But since this tag is included much like a customtag would be included, it runs with the same permissions that a normal customtag would be running. And those permissions disable the use of CreateObject(JAVA), causing cfdump to throw an error when some datatypes are dumped.

The workaround (go ask Adobe for a solution) is to replace /WEB-INF/cftags/dump.cfm with a custom dump template. I have made the one I use available as a download. It isn’t pretty, it doesn’t dump Java objects (but shared hosting customers can’t create them anyway), but it does the job.

cfinterface

cfinterface is another one of the tags implemented in CFML in /WEB-INF/cftags/. If you try to use a component implementing an interface you will notice that this throws an error. Apparently we need to grant read and execute permissions on that directory to every Sandbox (you can adapt the script I posted before for that). I just ran into this this weekend when I updated my dump template to show interface information as well, so I guess that shows how much I use shared hosting myself.

The ColdFusion Administrator and Admin API

In ColdFusion 8 Adobe introduced multiple user accounts and permission delegation for the ColdFusion Administrator and the Admin API.  I have my doubts about delegating access to the ColdFusion Administrator in a shared hosting environment since the delegation is based on functionality and not on Sandboxes. So that means you can delegate access to see certain features, such as the logfiles, but not certain features for a specific Sandbox, i.e. the logfiles of a specific customer. Since pretty much every feature in the Administrator I can delegate has security implications, I can not delegate anything.

With the  Admin API the situation is even worse. Internally the Admin API has been implemented in CFML and it suffers from the same problem as cfdump: it uses cfobject and createobject(JAVA) functionality that we can not enable without giving users the ability to upgrade themselves to root administrator. The only way we can use them is in our own management scripts and as a backend invoked through some facade we need to write first.

Application scope variables

ColdFusion piggybacks its application scope variables on the Java application scope. This allows for an integration in the application scope between for instance ColdFusion and JSP code: variables you write in one are available in the other. The downside is that on the Java side the application scope is considered an instance wide scope and all data is accessible. That means that whatever customer A puts in the application scope is accessible to customer B. Put the following in your Application.cfc to see how that works:

<cfcomponent>
 
	<cfset this.name = "" />
 
	<cffunction name="onRequestStart">
		<cfdump var="#application#" />
		<cfabort />
	</cffunction>
 
</cfcomponent>

This sharing of the application scope is by design and as a result it is unsafe to store data in the application scope on a shared host.