Java, Strings, and Words — oh my

My problem seemed easy: I wanted to capitalize the words in a sentence, so that “FRANK BURNS EATS WORMS” becomes “Frank Burns Eats Worms”. I wrote a little method that did that, and life was good.
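
That little method isn't shown here. A minimal sketch of what such a first attempt might look like, assuming words are separated by single spaces and everything between two spaces counts as one word (the class and method names are hypothetical, not the original code):

```java
import java.util.Locale;
import java.util.StringJoiner;

// Hypothetical reconstruction; the original method is not shown in the post.
class NaiveCapitalizer {

    // Upper-case the first character of each space-separated chunk and
    // lower-case the rest. Punctuation is treated as part of the word.
    static String capitalizeWords(String sentence) {
        StringJoiner out = new StringJoiner(" ");
        // limit -1 keeps empty chunks, so runs of spaces survive the round trip
        for (String word : sentence.split(" ", -1)) {
            if (word.isEmpty()) {
                out.add(word);
                continue;
            }
            out.add(Character.toUpperCase(word.charAt(0))
                    + word.substring(1).toLowerCase(Locale.ROOT));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "Frank Burns Eats Worms"
        System.out.println(capitalizeWords("FRANK BURNS EATS WORMS"));
    }
}
```

Anything glued to a word by punctuation rather than a space is never seen as a word of its own, which is the weak spot of this kind of approach.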

But then I realized that I also had strings like “M.A. HOSTETLER (COMPUTER GEEK)”, and they would come out as “M.a. Hostetler (COMPUTER GEEK)”. Hm, not quite what I wanted. I googled around and found nothing. A few more generic searches later, I found WordUtils in commons-lang. Yeah, I already had commons-lang in my project since it was a dependency for something else! Well, why not try it! Surely the Apache Commons people were smarter than me!

Nope, nada. They seemed to be as smart as I am on this issue. Well, maybe a little bit smarter, because the capitalizeFully method can take an array of characters as word delimiters. So I got a game plan in my head that seemed easy and, after many tries, got something that worked. Not optimal, mind you, but working.

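    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    import org.apache.commons.lang.WordUtils; // commons-lang 2.x package name

    // Both methods go inside whatever utility class holds your string helpers.
    public static String capWords(String str) {
        return capWords(str, " ");
    }

    public static String capWords(String str, String delimiter) {
        // split() expects a regex, so quote the delimiter to treat it literally.
        String[] strList = str.split(Pattern.quote(delimiter));
        List<String> newStrs = new ArrayList<String>();

        // \W matches any non-word character (anything outside [A-Za-z0-9_]).
        Pattern nonAlpha = Pattern.compile("\\W");

        for (String myStr : strList) {
            Matcher match = nonAlpha.matcher(myStr);
            String mark;

            if (match.find()) {
                // Use the first punctuation character in this chunk as its word delimiter.
                int idx = match.start();
                mark = myStr.substring(idx, idx + 1);
            } else {
                // No punctuation found: fall back to the outer delimiter.
                mark = delimiter;
            }
            newStrs.add(WordUtils.capitalizeFully(myStr, mark.toCharArray()));
        }
        // The original called a strJoin helper that is not shown; String.join (Java 8+) does the same job.
        return String.join(delimiter, newStrs);
    }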

Now I can correctly get “M.A. Hostetler (Computer Geek)”, just like I want it to be.

The new method lets the WordUtils class do its thing; I just feed it different delimiters. Of course, the delimiters are hard-coded, and I probably didn’t even need to treat the space as a special case, but it’s the most common delimiter. Using the space as the first delimiter and the punctuation, etc., after that stops us from dividing too much too soon.
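
For comparison, here is a minimal sketch of the variant without the space special case, using only the JDK. It is not the method above and does not call WordUtils. The class name is hypothetical, and it assumes that every character that is not a letter or digit starts a new word:

```java
// Hypothetical single-pass alternative; assumes any non-alphanumeric
// character is a word boundary.
class BoundaryCapitalizer {

    static String capWords(String str) {
        StringBuilder out = new StringBuilder(str.length());
        boolean startOfWord = true;
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                // First character of a word is upper-cased, the rest lower-cased.
                out.append(startOfWord ? Character.toUpperCase(c)
                                       : Character.toLowerCase(c));
                startOfWord = false;
            } else {
                // Spaces and punctuation are copied through and end the current word.
                out.append(c);
                startOfWord = true;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "M.A. Hostetler (Computer Geek)"
        System.out.println(capWords("M.A. HOSTETLER (COMPUTER GEEK)"));
    }
}
```

The trade-off is that every punctuation mark now divides words, so an input like “DON'T” comes out as “Don'T”.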

This just proves Joel Spolsky’s statement: strings are hard.
