Generating permalink from string in Ruby

May 14
2007 18:27 (Ruby on Rails) · Русский (9,476 views)

If you are creating Ruby on Rails application like a blog, you most probably want to generate URLs using post titles. It’s good practice, because search engines like keywords in URL, and it looks more human-readable. Just compare: http://example.com/posts/10 and http://example.com/posts/generating-permalinks-from-string (yeah, it’s long, but self-descriptive). Anyways, this is small post about converting a title to a permalink.

First thing I love in Ruby is an ability to extend classes with my own methods. I could just add method to_permalink to any string and then everywhere I could write something like @post.title.to_permalink. It’s amazing!

Here is my version of to_permalink method:

class String
  def to_permalink
    result = strip_tags
    # Preserve escaped octets.
    result.gsub!(/-+/, '-')
    result.gsub!(/%([a-f0-9]{2})/i, '--\1--')
    # Remove percent signs that are not part of an octet.
    result.gsub!('%', '-')
    # Restore octets.
    result.gsub!(/--([a-f0-9]{2})--/i, '%\1')

    result.gsub!(/&.+?;/, '-') # kill entities
    result.gsub!(/[^%a-z0-9_-]+/i, '-')
    result.gsub!(/-+/, '-')
    result.gsub!(/(^-+|-+$)/, '')
    return result.downcase
  end

  private
 
    def strip_tags
      return clone if blank?
      if index('<')
        text = ''
        tokenizer = HTML::Tokenizer.new(self)

        while token = tokenizer.next
          node = HTML::Node.parse(nil, 0, 0, token, false)
          # result is only the content of any Text nodes
          text << node.to_s if node.class == HTML::Text
        end
        # strip any comments, and if they have a newline at the end (ie. line with
        # only a comment) strip that too
        text.gsub(/<!--(.*?)-->[\n]?/m, '')
      else
        clone # already plain text
      end
    end
end

How it’s working? First thing you would see is a private method strip_tags. Yes, I know about ActionView::Helpers::TextHelper::strip_tags, and this is almost 100% copy of Rails version (the only difference is that my version always returns clone of the original string). I just don’t want to rely on the Rails library.

Then my method replaces all special characters with dashes (only octets like %A0 would be kept), and trims dashed from the beginning and the end of the string. Finally full string will be lowercased.

Of course, in your application you should check collisions (several posts which have the same title should have unique permalinks, for example you could append numbers starting from 1: hello, hello-1, hello-2, etc). This is not my goal to cover all difficulties you could face, it’s small post, do you remember?

Just for your pleasure, here are the RSpec tests for this method:

describe 'String.to_permalink from extensions.rb' do
  it 'should replace all punctuation marks and spaces with dashes' do
    "!.@#$\%^&*()Test case\n\t".to_permalink.should == 'test-case'
  end

  it 'should preserve _ symbol' do
    "Test_case".to_permalink.should == 'test_case'
  end
 
  it 'should preserve escaped octets and remove redundant %' do
    'Test%%20case'.to_permalink.should == 'test-%20case'
  end

  it 'should strip HTML tags' do
    '<a href="http://example.com">Test</a> <b>case</b>'.to_permalink.should == 'test-case'
  end

  it 'should strip HTML entities and insert dashes' do
    'Test&nbsp;case'.to_permalink.should == 'test-case'
  end

  it 'should trim beginning and ending dashes' do
    '-. Test case .-'.to_permalink.should == 'test-case'
  end

  it 'should not use ---aa--- as octet' do
    'b---aa---b'.to_permalink.should == 'b-aa-b'
  end
 
  it 'should replace % with -' do
    'Hello%world'.to_permalink.should == 'hello-world'
  end

  it 'should not modify original string' do
    s = 'Hello,&nbsp;<b>world</b>%20'
    s.to_permalink.should == 'hello-world%20'
    s.should == 'Hello,&nbsp;<b>world</b>%20'

    s = 'Hello'
    s.to_permalink.should == 'hello'
    s.should == 'Hello'
  end
end

It’s funny, right?

5 Responses to 'Generating permalink from string in Ruby'

Subscribe to comments with RSS or TrackBack to 'Generating permalink from string in Ruby'.

1
demjan
said on 2007-05-18 at 8.59 pm

Да. ))
Интересно сколько займет ресурсов и как будет выглядеть реализация на php в свете недавнего обсуждения на харахабарачтототам

2
said on 2007-05-18 at 9.30 pm

Да ровно столько же! Все эти войны Something vs. Something another — чистой воды треп людей, которым нечем заняться. При грамотном подходе код в любом языке будет лаконичным и эффективным. Вопрос в компактности и читабельности — это да, целиком ложится на совесть языка.

Вон в комментах к статье о подзапросах я приводил пример кода пхп. Да, задача не повседневная, но классическая, когда надо обработать выборку. И ну никак всего этого уродства не обойти. Можно размазать по коду, это да. Или вынести в отдельный файл и никому не показывать. Но оно останется. Потому что ПХП — это набор знаков препинания, разделенных словами.

3
demjan
said on 2007-05-19 at 8.57 am

Аналога этому в не обьектном языке не напишешь:

class String
  def to_permalink
  end
end
'string'.to_permalink

Только вроде такого

function to_permalink ($name) {
}
to_permalink($post->title);
4
said on 2007-05-19 at 12.41 pm

Ну почему же. В C# 3.0 можно будет писать extension методы:

public static class Extensions
{
    public static String ToPermalink(this String s)
    {
    }
}

'Something'.ToPermalink();

Это не аналогия открытости классов руби, а просто синтаксический сахар. Но результаты в общем-то неразличимы невооруженным взглядом :-)

5
kelyar
said on 2007-06-04 at 11.29 am

а почему просто не переопределить to_param модели?

Post a comment

You can use simple HTML-formatting tags (like <a>, <ul> and others). To format your code sample use <code lang="php">$a = "hello";</code> (allowed languages are ruby, php, yaml, html, csharp, javascript). Also you could use <code>$a = "hello";</code> and its syntax would not be highlighted. If you are not using <code> tag, replace < sign with &lt;.

Submit Comment

 
Copyright © 2005 - 2008, Dmytro Shteflyuk