Martin @ Blog

software development and life.

Archive for the ‘php’ Category

HTML Purifier

Kore Nordmann explains why, in his opinion, one shouldn’t use BBCode for comments and forums. I think he has a point, but it only holds when the BBCode is processed with regular expressions, as he explains in another article. Strictly speaking, you are not parsing BBCode at all when you use regular expressions, because that is only pattern matching. He explains why it makes no difference to use HTML syntax instead of BBCode syntax. Obviously, he has a very good point, because the BBCode syntax is not well defined, while HTML syntax, especially for the things that are normally allowed in blog comments or on forums, is well defined and known by many people.
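To make the pattern-matching point concrete, here is a minimal sketch of the kind of regexp-based converter under discussion. The function name naiveBbcodeToHtml and its two patterns are my own illustration, not code from Kore Nordmann's article.

```php
<?php
// Hypothetical naive converter: it matches patterns, it does not parse.
function naiveBbcodeToHtml(string $text): string
{
    // [b]...[/b] becomes <b>...</b>
    $text = preg_replace('~\[b\](.*?)\[/b\]~s', '<b>$1</b>', $text);

    // [url=...]...[/url] becomes a link; the target is copied through unchecked.
    $text = preg_replace('~\[url=([^\]]+)\](.*?)\[/url\]~s', '<a href="$1">$2</a>', $text);

    return $text;
}

// The intended case works.
echo naiveBbcodeToHtml('[b]hello[/b]'), "\n";

// A javascript: URL ends up in the href untouched.
echo naiveBbcodeToHtml('[url=javascript:alert(1)]click[/url]'), "\n";

// Raw HTML is not BBCode, so the patterns never see it and it passes straight through.
echo naiveBbcodeToHtml('<script>alert(1)</script>'), "\n";
```

The last two calls show where the false sense of security comes from: the converter only rewrites what it recognises and says nothing about everything else in the input.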

An interesting observation is that, despite the good explanation of the problem with BBCode (a false sense of security when parsing it with regexps), people demonstrate in the comments that they really don’t understand it. For example, one comment states that it is almost impossible to block all disallowed HTML using blacklists… Obviously, one shouldn’t use blacklists, but whitelists. By default, all < and > should be replaced by &lt; and &gt;.
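A rough sketch of what "escape by default, then whitelist" could look like in PHP. The function name escapeWithWhitelist and the default tag list are assumptions of mine. The sketch only restores exact, attribute-free tags and does not check that tags are balanced, so it is an illustration of the idea rather than a complete sanitizer.

```php
<?php
// Hypothetical helper: escape everything, then re-enable a small whitelist.
function escapeWithWhitelist(string $comment, array $allowedTags = ['b', 'i', 'em', 'strong']): string
{
    // Step 1: escape by default, so < and > become &lt; and &gt;.
    $escaped = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

    // Step 2: restore only exact, attribute-free tags from the whitelist.
    foreach ($allowedTags as $tag) {
        $escaped = str_replace(
            ["&lt;{$tag}&gt;", "&lt;/{$tag}&gt;"],
            ["<{$tag}>", "</{$tag}>"],
            $escaped
        );
    }

    return $escaped;
}

// The whitelisted <b> survives, the <script> element stays escaped.
echo escapeWithWhitelist('<b>hi</b> <script>x</script>'), "\n";

// A tag with attributes is not an exact match, so it stays escaped as well.
echo escapeWithWhitelist('<b onclick="x">hi</b>'), "\n";
```

Anything that is not explicitly allowed stays harmless text, which is the opposite of a blacklist. Allowing attributes such as href safely, or repairing unbalanced tags, needs a real parser.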

HTML Purifier is a library that parses HTML and uses a whitelist to allow only certain HTML tags and attributes. Why develop something like this from scratch when a library is already available?

Accessing properties in PHP objects

Today, I stumbled upon a weblog post by Jeff Moore on the way properties of objects should be accessed in PHP. Coincidentally, I thought a little about this problem myself last week, because I’m working on a small project which uses a large number of data objects. Jeff Moore argues that you should not use $object->set($name, $value) or $object->get($name) to modify properties, because it does not add anything. I agree completely with that (and I’ve never used this technique myself). He recommends accessing properties directly or using setXxx($value) and getXxx() to access properties (where Xxx is the name of the property).
An interesting discussion arises in the comments, where some people argue for getter and setter methods, while others defend accessing the properties directly. I’m not sure which side I am on, but I think it depends on the purpose of your class.
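To keep the three styles apart, here is a small side-by-side sketch. The Person class and its property names are made up for illustration and do not come from Jeff Moore's post.

```php
<?php
// Hypothetical class, only meant to contrast the three access styles.
class Person
{
    // Style 1: direct access to a public property.
    public $name;

    // Style 2: dedicated setXxx()/getXxx() methods around a private property.
    private $email;

    // Style 3: generic get()/set() that simply forward to an array.
    private $data = [];

    public function getEmail()
    {
        return $this->email;
    }

    public function setEmail($email)
    {
        // A dedicated setter has room for validation.
        if (strpos($email, '@') === false) {
            throw new InvalidArgumentException('Not an e-mail address: ' . $email);
        }
        $this->email = $email;
    }

    public function get($name)
    {
        return isset($this->data[$name]) ? $this->data[$name] : null;
    }

    public function set($name, $value)
    {
        $this->data[$name] = $value;
    }
}

$person = new Person();
$person->name = 'Martin';                 // style 1
$person->setEmail('martin@example.org');  // style 2
$person->set('language', 'php');          // style 3

echo $person->name, ' ', $person->getEmail(), ' ', $person->get('language'), "\n";
```

In this sketch the generic set() and get() do nothing that a public property would not do, while being harder to read and to document. setEmail(), on the other hand, at least has a place to put a check.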

For example, in a hobby project I’m currently working on, I have quite a number of data objects (in fact models) which are generated dynamically using some kind of object-relational mapper. The properties of the objects are the fields of the database table the object represents. I think in such a case, it is valid to access the properties directly. Other languages and frameworks (e.g. Ruby on Rails) use a similar strategy. I think it is also valid to use this technique because, since PHP 5, the language provides magic methods (__set and __get) which enable the developer to override property access when necessary. This way, it is possible to modify the implementation without breaking the API of the class, and as such keep the objects loosely coupled. For classes which are more behavioural (and not a representation of data), I think it makes more sense to use setter and getter methods, because you hide the implementation completely.
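A minimal sketch of how __get and __set can keep property-style access while leaving room for overrides later. The Model and User classes, the $fields array and the getXxx/setXxx lookup convention are assumptions of mine, not the code of the project mentioned above.

```php
<?php
// Hypothetical base class: table fields live in an array and are exposed as properties.
class Model
{
    protected $fields = [];

    public function __construct(array $fields)
    {
        $this->fields = $fields;
    }

    public function __get($name)
    {
        // If a subclass defines getXxx(), use it instead of the raw field.
        $getter = 'get' . ucfirst($name);
        if (method_exists($this, $getter)) {
            return $this->$getter();
        }
        if (!array_key_exists($name, $this->fields)) {
            throw new OutOfRangeException('Unknown field: ' . $name);
        }
        return $this->fields[$name];
    }

    public function __set($name, $value)
    {
        // If a subclass defines setXxx(), route the assignment through it.
        $setter = 'set' . ucfirst($name);
        if (method_exists($this, $setter)) {
            $this->$setter($value);
            return;
        }
        $this->fields[$name] = $value;
    }
}

// Added later: normalise e-mail addresses without touching any calling code.
class User extends Model
{
    protected function setEmail($email)
    {
        $this->fields['email'] = strtolower(trim($email));
    }
}

$user = new User(['name' => 'Martin', 'email' => '']);
$user->name = 'Kore';                      // stored as-is
$user->email = '  Someone@Example.ORG ';   // goes through setEmail()

echo $user->name, ' ', $user->email, "\n";
```

Code that uses the model keeps writing $user->email = ... in both versions. Only the class changes, which is what keeps the API stable.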
