Sunday, July 25, 2010

Restoring a Clonezilla Image using VirtualBox

Ubuntu 10.04 has been out for a few months, and I'm still on 9.10. I have had some success upgrading in the past, but I still prefer doing fresh installs. I guess it comes from my Windows days, when an occasional fresh install was good for the computer's soul. However, this time I'm also starting a new project at work doing .NET instead of Java, and I really wanted the ability to "come back" to my old setup. Basically, I wanted to convert my host machine to a virtual one, what's called P2V (Physical to Virtual). I tried VMware Converter but didn't get very far. With some advice from several co-workers, though, I did come up with a method that worked, and it was fairly easy.

The basic steps are:

  • Use Clonezilla to save a disk image to an external USB drive. This essentially clones my host machine so I can restore it later. My hard drive is around 120GB, so I put the image on my 500GB external USB drive. This took about 1.5 hours.
  • Create a new virtual machine on another external USB drive. The nice thing about using Clonezilla is that for this step you can use either VMware or VirtualBox; I used VirtualBox. I couldn't create the virtual machine on the laptop itself because there wasn't enough free space, and I couldn't use the same USB drive because Clonezilla needs it unmounted while restoring. So instead I used a second external USB drive.
  • Start the new virtual machine and boot up Clonezilla to begin restoring the image. You need to change the mode because the default view doesn't work very well when restoring. So at the Clonezilla menu, choose "Other modes of Clonezilla live". Then choose "Clonezilla live (Safe graphic settings, vga=normal)".
  • When you get to the point where Clonezilla asks for the external USB drive that contains the Clonezilla image, remember to attach the USB drive in VirtualBox. To do this, go to the Devices menu of your virtual machine, select USB Devices, and check the appropriate USB drive. Restoring my 120GB image took about 24 hours, so make sure to do it when you have time.
  • Once Clonezilla has finished restoring the image, you're ready to power off the virtual machine, remove the Clonezilla CD, and restart.
I still had a few adjustments to make in order to get it to work. When I first started my virtual machine, it complained about not having PAE (Physical Address Extension) enabled. I had enabled PAE in Ubuntu about a month ago so I could use all 4GB of RAM. Fixing it was easy: under the machine's Settings, go to System, click on the Processor tab, check the "Enable PAE/NX" checkbox, and restart.

Once it booted up, it complained about my graphics configuration. I tried selecting "Reconfigure Graphics", but that didn't work. Instead I was able to get past it by selecting "Run in low graphics for one session". This allowed me to finish booting, at which point I installed the VirtualBox Guest Additions, which seemed to solve the graphics issue.

That is all there is to it. It was all rather easy. Now I can install Ubuntu 10.04 and have the ability to go back to my previous development environment. I could also see lots of different use cases for this. Combined with the ability to clone virtual machines, all your virtual needs are met.

Upgrading to Maven 3

I've been playing around with Maven 3 lately on our legacy Maven 2 multi-module project via mvnsh. As advertised, Maven 3 is backwards compatible with Maven 2; in fact, almost everything worked out of the box after switching. In this post, I'm going to highlight the required and currently optional items I changed, so you can start preparing to migrate your own project to Maven 3. But first, what's so special about Maven 3 and why would you upgrade? Polyglot Maven, mvnsh, and improved performance (50%-400% faster) are just a few of the reasons. And since it's so easy to migrate to Maven 3, you really don't have any excuses.

Currently, I build our project using Maven 2.2.1. This article was tested with mvnsh 0.10, which includes Maven 3.0-alpha-6. The current release of Maven 3 is 3.0-beta-1, and Maven 3.1 is due out in Q1 of 2011.

Profiles.xml no longer supported
I haven't really figured out the reasoning, but it doesn't much matter: Maven 3.0 no longer supports an external profiles.xml. Instead, you place your profiles in your ~/.m2/settings.xml. Some of our database processes and integration tests require properties from our profiles.xml; it was simple to solve by just moving the profiles into my settings.xml, and everything worked.
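
To illustrate, here's a minimal sketch of a profile living in ~/.m2/settings.xml; the profile id and property are made-up placeholders rather than the actual values from our project.

<settings>
    <profiles>
        <profile>
            <id>integration-tests</id>
            <properties>
                <!-- placeholder property; ours drive database processes and integration tests -->
                <database.url>jdbc:postgresql://localhost/testdb</database.url>
            </properties>
        </profile>
    </profiles>
    <activeProfiles>
        <activeProfile>integration-tests</activeProfile>
    </activeProfiles>
</settings>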

Upgrade GMaven Plugin
We depend pretty heavily on the gmaven plugin for testing, simple Groovy scripts, and some Ant calls. In order to build some modules I had to upgrade gmaven. The version we had been using was 1.0-rc-3; our projects built perfectly after changing it to org.codehaus.gmaven:gmaven-plugin:1.2.
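
The change itself is just the plugin version in the POM; roughly (any existing configuration and executions stay as they were):

<plugin>
    <groupId>org.codehaus.gmaven</groupId>
    <artifactId>gmaven-plugin</artifactId>
    <!-- was 1.0-rc-3; 1.2 builds cleanly under Maven 3 -->
    <version>1.2</version>
</plugin>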

${pom.version} changing to ${project.version}
Here Maven 3 kindly warned me that uses of the Maven property pom.version may no longer be supported in future versions and should be changed to project.version. My modules still built, but I thought it was nice of Maven to point out the potential change.
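
The fix is a simple property rename in the POM; for example (the dependency coordinates here are just placeholders):

<dependency>
    <groupId>com.example</groupId>
    <artifactId>example-core</artifactId>
    <!-- was <version>${pom.version}</version>, which Maven 3 now warns about -->
    <version>${project.version}</version>
</dependency>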

Version and Scope Issues
We had a couple of places where we needed to define a version explicitly, and another place where we shouldn't have defined a scope. Both issues prevented Maven 3.0 from building our modules, but fixing them was easy. In the first case, we had defined a plugin's version in the pluginManagement section, but Maven 3 also required it where the plugin was used in the reporting section. I'm not exactly sure about the reasoning here; ideally you would only define your plugin versions in pluginManagement, but oh well.
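
As a sketch of the first case (the reporting plugin shown here is just an example, not necessarily the one that bit us), Maven 3 wanted the version repeated where the plugin is used in the reporting section, even though pluginManagement already declares it:

<reporting>
    <plugins>
        <plugin>
            <artifactId>maven-javadoc-plugin</artifactId>
            <!-- already declared in pluginManagement, but Maven 3 required it here too -->
            <version>2.7</version>
        </plugin>
    </plugins>
</reporting>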

We had some WAR projects using Jetty. In the Jetty plugin definition we had a dependency on Geronimo with a scope of provided. Maven 3 complained about it, and since the scope isn't really necessary there, simply removing it fixed the issue.
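
It looked something like the following sketch; the plugin and Geronimo coordinates are illustrative placeholders rather than copies from our POM, and the commented-out scope is the line we removed:

<plugin>
    <groupId>org.mortbay.jetty</groupId>
    <artifactId>maven-jetty-plugin</artifactId>
    <dependencies>
        <dependency>
            <groupId>org.apache.geronimo.specs</groupId>
            <artifactId>geronimo-servlet_2.5_spec</artifactId>
            <version>1.2</version>
            <!-- <scope>provided</scope>  Maven 3 complained about this, so it had to go -->
        </dependency>
    </dependencies>
</plugin>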

modelVersion
Maven 3.0 kept warning about uses of ${modelVersion}, which should now be ${project.modelVersion}. I was still able to build, though, so my guess is that the value of modelVersion (currently 4.0.0) will most likely change when Maven 3.1 comes out.

Weird Surefire Output
This wasn't necessarily an issue with the Surefire plugin, but I wanted to comment on its output when tests fail. Below is a screenshot of the output when you have failed tests. At first I thought it was a Maven 3 issue, but I built the same project using the same commands with Maven 2.2.1 and got the same output for the test failures. Hopefully this kind of thing can be cleaned up, because I could imagine a lot of people getting confused by it.

Failed test output

That's essentially it. Happily, there really wasn't much that needed changing, which goes to show the great lengths the Maven team has gone to in order to ensure backwards compatibility. Finally, here are the Compatibility Notes the Maven team has provided on migrating Maven 2 projects to Maven 3.

Monday, July 19, 2010

My First Groovy DSL

It's no secret I'm a Groovy homer. I love it. One of the things that makes Groovy so fun is its syntax. Being able to get the contents of a file by just saying new File("/home/james/test.log").text is refreshing compared to its Java counterpart. Another thing that makes Groovy enjoyable is its support for Domain Specific Languages (DSLs); MarkupBuilder is a great example. With Groovy, you can create simple or very complex DSLs for your own purposes. To my knowledge there are a couple of ways to create your own DSL: extending BuilderSupport or using methodMissing/propertyMissing. In my opinion, extending BuilderSupport is more involved, while methodMissing/propertyMissing is the poor man's way of creating a DSL.

Up to this point, though, I had never actually come across a good use case for creating a DSL, until this past week. We have a large set of automated tests that run against our REST services. Since our application is now multi-tenant, all of our tests need a valid organization (tenant). In our case, an organization contains multiple roles and locations. Each test has different requirements on the types of organizations it needs: some might need 2 unique organizations, while another might need an organization with at least 2 roles and 2 locations. It was this use case where I thought a Groovy DSL would fit perfectly.

My end goal was to have something like this:

def orgs = OrganizationService.getOrganizations().withRoles().withLocations()

This would return a list of organizations that have at least 1 role and 1 location. The nice thing about this DSL is that it scales: if we add new lists of information to an organization, we won't have to update the class. Also, and this is an important feature, the method name suffixes Roles and Locations correlate to the named JSON arrays of the organization. So my JSON looks something like this:

{"organizations": {"name": "James", "roles": ["R1", "R2"], "locations": ["Tulsa", "Omaha"]}}

When writing my DSL I decided to go the poor man's way and use the methodMissing approach combined with the @Delegate annotation. Here it is:

import net.sf.json.JSONArray

class OrganizationFilterArray {
    @Delegate private JSONArray array
    
    OrganizationFilterArray(array) {
        this.array = array
    }
      
    def methodMissing(String name, args) {        
        if (name.startsWith("with")) {
            def length = (args.length == 0) ? 1 : args[0]
            def arrayName = name[4..5].toLowerCase() + name[6..-1]
            
            return filterByLength(arrayName, length)
        } else {
            throw new MissingMethodException(name, this.class, args)
        }
    }
    
    private filterByLength(listName, length) {        
        def filteredArray = array.findAll {
            it."$listName"?.size() >= length
        }
        
        return new OrganizationFilterArray(filteredArray)
    }
}

I could have just as easily extended JSONArray, since it's not final, but I was initially following the @Delegate guide and thought it was an interesting alternative. The big key here is how methodMissing supports an almost unlimited number of ways to filter an organization. Everything else, I think, is pretty self-explanatory: when Groovy comes across a method that doesn't exist, like withRoles(), it calls my methodMissing method, and from there I filter out all the organizations that don't fit the criteria. Eventually this class could be refactored to support more than just the size of an array. Note that I did have to upgrade the gmaven plugin version to 1.0 to get it to work in our Maven project.

I knew from the beginning I wasn't going to use BuilderSupport. It did take me some time to figure out how I was going to support both the filtered (getOrganizations().withRoles()) and non-filtered (getOrganizations()) versions. That is when I decided to extend List or JSONArray (in the end, delegating to JSONArray), as both method calls had to return my custom List/JSONArray. Overall, I'm pretty happy with the outcome and with how long it took me. It was pretty trivial and very fun, thanks to Groovy.

Wednesday, July 14, 2010

Tip: Debugging External Java Dependencies

Ever spent time debugging 3rd party Java libraries? Decompiling is usually the first step; walking through the code that way can be tedious, but it's the typical first line of defense. But what if you want to deploy a slightly modified version? In the past, I've checked out the project and built it with my modifications. Since most open source projects don't support "virgin builds", this has a success rate of about 10%. Fortunately, there is a better way. I'm just disappointed I didn't think of it.

In our project we deploy a wiki that is based on JSPWiki, using Maven WAR overlays. In the version we are using, there isn't any way to configure the wiki files directory outside of a properties file inside the WAR. In order to point JSPWiki to a different directory, you would basically have to unzip the WAR, update the file, and then zip the WAR back up (#fail). So someone on our team discovered we could override this behaviour by providing our own implementation of the same class.
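
For context, the overlay itself is just a WAR-type dependency that the maven-war-plugin merges into our own WAR at package time; here's a rough sketch, with illustrative placeholder coordinates for the JSPWiki artifact:

<dependency>
    <!-- the upstream JSPWiki WAR that gets overlaid onto our webapp -->
    <groupId>com.ecyrd.jspwiki</groupId>
    <artifactId>JSPWiki</artifactId>
    <version>2.8.4</version>
    <type>war</type>
</dependency>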

To be more specific, the class in question is com.ecyrd.jspwiki.PropertyReader. It's included in the JSPWiki.jar file under /WEB-INF/lib. Its default behaviour is not suitable for our needs, so we take an original copy of PropertyReader.java and place it under our Maven project's /src/main/java directory in the same package, com.ecyrd.jspwiki. Once the project builds, we have our own version of PropertyReader.class under /WEB-INF/classes, which is important because the webapp ClassLoader looks in /WEB-INF/classes before it looks in /WEB-INF/lib. This means our class is used instead of the one provided by JSPWiki in /WEB-INF/lib/JSPWiki.jar.

Now I know what you're thinking: that's a horrible idea, James. And for the most part I agree, but it's not my fault this ability doesn't already exist in JSPWiki. So if you want to keep your conscience clean, go ahead and continue unpacking and repacking that WAR; I'll be happy getting important things done. Obviously, this practice is the exception and not the rule, and you should provide the patch back to the 3rd party as an improvement for all to enjoy. And before you ask why you can't just extend the real PropertyReader and override the necessary methods, which I agree would be more ideal: it's not possible, because the modified class is the first one on the classpath, so you'd basically be extending yourself.

This technique has actually helped me debug environment-specific issues twice. It saved me a huge amount of time by not having to build the external library. In fact, if you check out the exact version of the source, you can even perform remote debugging with breakpoints.

So next time you need to debug an external 3rd party library, consider using this technique before attempting to build it.

Tuesday, July 13, 2010

Avatar Maven

Today I gave a quick presentation to some coworkers about Maven. It's a broad topic, so I kept it fairly limited. Most of my audience was very familiar with Maven, so I tried not to bore them with stuff they already knew. I tried to make it a little engaging by comparing the Avatar, master of all four elements, to Maven, master of the build (it's a stretch, I know). It's a quick presentation (15 slides) providing some helpful Maven tips, what's coming in Maven 3, and mvnsh. Hope you like it.

Friday, July 9, 2010

Sharing Resources in Maven

Today I needed to figure out the best way to share resources across multiple Maven modules. We had previously done it 2 different ways, neither of which I thought was very good. The first way was using a relative path to reach across to another module's resource directory (usually not a good practice in Maven). It went something like this:


    
<build>
    <resources>
        <resource>
            <directory>../module1/src/main/resources</directory>
        </resource>
    </resources>
</build>

The second way was using the infamous Maven assembly plugin. I typically avoid the assembly plugin like I avoid writing assembly, and I prefer to avoid 100 extra lines of XML for something so trivial. Luckily, the Sonatype guys apparently knew this and came up with a more efficient way of sharing resources: the maven-remote-resources-plugin. It requires a lot less XML lifting and is nicely integrated into the Maven lifecycle. I did run into one small issue getting it to work: by default it only copies **/*.txt files from src/main/resources. For several minutes I couldn't figure out why it wasn't working, until I added an includes entry for **/*.xml. Then it worked perfectly. Here is the end result:

Creating a resource bundle
Add the following to the POM of the module that will create the resource bundle.
      
<plugin>
    <artifactId>maven-remote-resources-plugin</artifactId>
    <version>1.1</version>
    <executions>
        <execution>
            <goals>
                <goal>bundle</goal>
            </goals>
            <configuration>
                <includes>
                    <include>**/*.xml</include>
                </includes>
            </configuration>
        </execution>
    </executions>
</plugin>


You should now see the following message in your mvn output when running mvn clean install.

[remote-resources:bundle {execution: default}]

This produces a /target/classes/META-INF/maven/remote-resources.xml file which contains references to the resource files. For example,

    <remoteResource>test.xml</remoteResource>

Consuming the resource bundle
Add the following to the POM which needs to consume the new resource bundle.
      
<plugin>
    <artifactId>maven-remote-resources-plugin</artifactId>
    <version>1.1</version>
    <executions>
        <execution>
            <goals>
                <goal>process</goal>
            </goals>
            <configuration>
                <resourceBundles>
                    <resourceBundle>com.lorenzen:lorenzen-core:${pom.version}</resourceBundle>
                </resourceBundles>
            </configuration>
        </execution>
    </executions>
</plugin>


You should now see the following message in your mvn output when running mvn clean install.

[remote-resources:process {execution: default}]

You should now be able to look in your second module's /target/classes directory and see test.xml.

Thursday, July 1, 2010

RSS, Lucene, and REST

Sorry for the horrible title. I struggled trying to come up with a worthy title, but after a few minutes I decided to not let perfection get in the way of good.

My team recently worked on a new feature I am pretty excited about: adding support for RSS/Atom to our application. I know what you're thinking: so what? But it's not really the what I'm excited about; it's the how. What really excites me is how the story was defined and implemented.

Approach
We had a simple requirement from a newer customer to provide an RSS feed for newly created items. This actually wasn't the first time we'd had this requirement: we prototyped a similar capability a long time ago using OpenESB and the RSS BC, but for multiple reasons it just didn't work out.

So our first decision had to answer how we were going to implement it... again, but better. Before the sprint began, a few of us got together and hashed out a potential solution: how about we use the Search REST service, which is backed by Lucene, to support advanced searches and return RSS?

Why does this excite me so much? To understand that, I need to explain our application at a high level. It's a completely JavaScript-based application using ExtJS (now Sencha), backed by REST services using Jersey. Consequently, we have a lot of REST services. Right now those REST services support returning XML or JSON using a custom response builder we created internally.

I'm excited because this single user story could hugely improve the entire system:

  1. If we modified the Search Service to return RSS, then all our REST Services could support RSS.
  2. The REST Service would now support Advanced searches. Previously, it only really supported basic keyword searches.
  3. Any search a user performs could now be subscribed to via RSS.
Implementation
I'm not going to go into every detail of how it was done; I wasn't even the one who implemented it (that was Matt White, and he did a fantastic job). We did have one major hurdle to overcome: how to index items so that advanced searches like Status=New were possible.

Previously this wasn't possible given how we were indexing our items. We were basically indexing each item by building up one large String containing all of the item's information, like the following:
import org.apache.lucene.document.Document
import org.apache.lucene.document.Field

Document createDocument(item) {
    Document doc = new Document()
    
    doc.add(new Field("content",
        getContent(item),
        Field.Store.NO,
        Field.Index.ANALYZED))
        
    return doc
}

String getContent(item) {
    def s = new StringBuilder()
    
    s.append(item.getTitle()).append(" ")
    s.append(item.getStatus()).append(" ")
    s.append(item.getPriority()).append(" ")
    s.append(item.getDescription()).append(" ")
    
    return s.toString()
}

The problem with this is that a search for "New" would return not only items with a status of New, but also any item that merely contained the word New. The solution was to add another Field to the Document:
doc.add(new Field("Status",
    item.getStatus(),
    Field.Store.NO,
    Field.Index.NOT_ANALYZED));

Now the Search service can support advanced searches like Status:"New". You should put the value in quotes in case it contains spaces (e.g. Status:"In Progress"). And since Lucene is so powerful, it also means the following search works: Status:"New" AND Priority:"High" AND "Hurricane". Now users have the freedom to subscribe to a nearly limitless number of RSS feeds based on advanced searches.

Start to Finish
I think there were several reasons why this story was a success in my eyes. Most important were the two really smart co-workers who worked on it: Matt White and Chuck Hinson. All three of us knew of this user story ahead of time, and we were able to discuss it technically days before backlog selection. This allowed us to brainstorm some ideas. Once we narrowed them down, we each spent some more time separately looking into the code to find out the level of difficulty and whether advanced searches like Status:New would be possible. Overall, I'd say we spent 3-4 hours on the preliminary work together, and I think doing that preliminary work really enabled us to give a proper WAG for the story.

I really can't speak to how the development went (I was at Disney World for 10 days with the family), but I was really impressed with the tests Matt wrote. He wrote a number of unit tests making sure advanced searches worked and basic searches still worked. On top of that, he wrote an overall functional test using HttpBuilder that exercised the REST service just as our JavaScript client would.

Finally, once the main work was finished, we uploaded a diff file to our internal instance of Review Board. From there I was able to perform a peer review where we found a minor bug in the changes.

Summary
I am sure it's not an original idea, but I thought it was a fun User Story that hopefully will provide a lot of value beyond what was originally estimated. Ideally, this might help others who are in similar situations.